Bug ID 722013: MCPD restarts on all secondary blades post config-sync involving APM customization group

Last Modified: Nov 07, 2022

Bug Tracker

Affected Product:  See more info
BIG-IP APM(all modules)

Known Affected Versions:
12.1.0, 12.1.0 HF1, 12.1.0 HF2, 12.1.1, 12.1.1 HF1, 12.1.1 HF2, 12.1.2, 12.1.2 HF1, 12.1.2 HF2, 12.1.3, 12.1.3.1, 12.1.3.2, 12.1.3.3, 12.1.3.4, 12.1.3.5, 12.1.3.6, 13.0.0, 13.0.0 HF1, 13.0.0 HF2, 13.0.0 HF3, 13.0.1, 13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 14.0.0, 14.0.0.1, 14.0.0.2, 14.0.0.3, 14.0.0.4

Fixed In:
14.1.0, 14.0.0.5, 13.1.1.2, 12.1.3.7

Opened: May 25, 2018
Severity: 2-Critical

Symptoms

After performing a series of specific config-sync operations between multi-blade systems, the MCPD daemon restarts on all secondary blades of one of the systems. Each affected blade will log an error message similar to the following example: -- err mcpd[5038]: 01070711:3: Caught runtime exception, Failed to collect files (snapshot path () req (7853) pmgr (7853) not valid msg (create_if { customization_group { customization_group_name "/Common/Test-A_secure_access_client_customization" customization_group_app_id "" customization_group_config_source 0 customization_group_cache_path "/config/filestore/files_d/Common_d/customization_group_d/:Common:Test-A_secure_access_client_customization_66596_1" customization_group_system_path "" customization_group_is_system 0 customization_group_access 3 customization_group_is_dynamic 0 customization_group_checksum "SHA1:62:fd61541c1097d460e42c50904684def2794ba70d" customization_group_create_time 1524643393 customization_group_last_update_time 1524643393 customization_group_created_by "admin" customization_group_last_updated_by "admin" customization_group_size 62 customization_group_mode 33188 customization_group_update_revision 1

Impact

While the MCPD daemon restarts, the blade is offline, and many other BIG-IP daemons need to restart along with MCPD. -- If the source_system is Active, the load capacity of the system temporarily decreases proportionally to the number of blades restarting. Additionally, if the systems use the min-up-members or HA-Group features, losing a certain number of secondary blades may trigger a failover. -- If the source_system is Standby, some of the previously mirrored information may become lost as TMM on secondary blades restarts along with MCPD. Should this system later become the Active system, traffic may be impacted by the lack of previously mirrored information.

Conditions

This issue occurs when all of the following conditions are met: - You have configured a device-group consisting of multi-blade systems (such as VIPRION devices or vCMP guests). - Systems are provisioned for APM. - The device-group is configured for incremental manual synchronizations. - On one system (for example, source_system), you create an APM configuration object (for example, a connectivity profile) that causes the automatic creation of an associated customization group. - You synchronize the configuration from the source_system to the device-group. - On the source_system, you create a new configuration object of any kind (for example, an LTM node). - Instead of synchronizing the newest configuration to the device-group, you synchronize the configuration from another device in the device-group to the source_system (which effectively undoes your recent change). - The MCPD daemon restarts on all secondary blades of the source_system.

Workaround

None.

Fix Information

The MCPD daemon no longer restarts on secondary blades after config-sync of APM.

Behavior Change