Last Modified: Nov 07, 2022
Affected Product:
BIG-IP ASM, FPS
Known Affected Versions:
11.6.0, 11.6.0 HF1, 11.6.0 HF2, 11.6.0 HF3, 11.6.0 HF4
Fixed In:
12.0.0, 11.6.0 HF5
Opened: Feb 19, 2015
Severity: 3-Major
Related Article:
K16697
The mcpd daemon of a secondary blade reports failure and is restarted, causing the blade to be offline and not handle traffic for a few minutes.
During the mcpd restart, the blade is offline and not handling traffic for a few minutes. There is no impact to traffic handled by the primary blade.
A multi-blade device (cluster) is part of a trust domain, and one of the other devices in the trust domain is rebooted. The mcpd failure may occur anywhere from a few minutes to 24 hours after that reboot. The failure should happen only once and should not recur until the next time a device in the trust domain is rebooted.
The mcpd failure is caused by an inconsistency between the primary and secondary blades that arises after a different device in the trust domain is rebooted. The workaround is therefore to check for and fix the inconsistency after every reboot of any device in the trust domain. This is not needed when only one of the blades is rebooted.

After any reboot of a device in the trust domain, perform the following actions:

1. Check for inconsistency:

On each blade of each cluster in the trust domain, run the following command:

    tmsh -c 'list security datasync device-stats /Common/datasync-device-*/*cs-asm-dosl7* table'

You should see an object for each of the devices (clusters) in the trust domain. For example, if two multi-blade devices, vcmp1 and vcmp2, each with 2 blades, are joined in the trust domain:

    [root@vcmp1:/S2-green-S:Active:In Sync (Sync Only)] config # tmsh -c 'list security datasync device-stats /Common/datasync-device-*/*cs-asm-dosl7* table'
    security datasync device-stats datasync-device-vcmp1.qa.com/datasync-device-vcmp1.qa.com-cs-asm-dosl7-stats { table cs-asm-dosl7 }
    security datasync device-stats datasync-device-vcmp2.qa.com/datasync-device-vcmp2.qa.com-cs-asm-dosl7-stats { table cs-asm-dosl7 }

This shows both vcmp1 and vcmp2, so the state is good and no further action is needed on this device.

However, in the faulty state, the secondary blade of vcmp2 will show:

    [root@vcmp2:/S2-green-S:Active:In Sync (Sync Only)] config # tmsh -c 'list security datasync device-stats /Common/datasync-device-*/*cs-asm-dosl7* table'
    security datasync device-stats datasync-device-vcmp1.qa.com/datasync-device-vcmp1.qa.com-cs-asm-dosl7-stats { table cs-asm-dosl7 }

The vcmp2 device is missing. This means that the state is inconsistent, and an mcpd failure may happen sometime within 24 hours.

2. Fix the inconsistency if needed:

To fix the state, force a sync of the datasync device groups from vcmp1 (if vcmp2 had the faulty state).
If vcmp2 had the inconsistency, run the following commands on vcmp1:

    tmsh modify cm device-group datasync-global-dg devices modify { vcmp1.qa.com { set-sync-leader } }

Wait a few seconds, then run:

    tmsh modify cm device-group datasync-device-vcmp1.qa.com-dg devices modify { vcmp1.qa.com { set-sync-leader } }
    tmsh modify cm device-group datasync-device-vcmp2.qa.com-dg devices modify { vcmp1.qa.com { set-sync-leader } }

Wait a few more seconds, then check the state again using the instructions in step #1:

    tmsh -c 'list security datasync device-stats /Common/datasync-device-*/*cs-asm-dosl7* table'

All blades should be good now.

Repeat steps #1 and #2 on each of the blades, in each of the clusters that are part of the trust domain, whenever a device is rebooted.
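Because the check must be repeated on every blade after every reboot, it may help to script it. The following is a minimal sketch, not an F5-supplied tool. The function names missing_devices and sync_commands are hypothetical, and the sketch assumes the object naming pattern seen in the example output (datasync-device-<hostname>-cs-asm-dosl7-stats) and one per-device device group named datasync-device-<hostname>-dg, as in the vcmp1/vcmp2 example. The second function only prints the commands so they can be reviewed before being run by hand.

```shell
#!/bin/bash

# Hypothetical helper: reads the output of the step #1 tmsh command on stdin
# and prints every expected device hostname that has no device-stats object.
# Arguments: the hostnames of all devices expected in the trust domain.
missing_devices() {
    local output device
    output=$(cat)
    for device in "$@"; do
        # Assumed naming pattern, taken from the example output above.
        if ! printf '%s\n' "$output" |
            grep -qF -- "datasync-device-${device}-cs-asm-dosl7-stats"; then
            printf '%s\n' "$device"
        fi
    done
}

# Hypothetical helper: prints (does not execute) the step #2 commands.
# First argument: the hostname of the healthy device to sync from.
# Remaining arguments: the hostnames of all devices in the trust domain.
sync_commands() {
    local leader=$1 device
    shift
    printf 'tmsh modify cm device-group datasync-global-dg devices modify { %s { set-sync-leader } }\n' "$leader"
    for device in "$@"; do
        printf 'tmsh modify cm device-group datasync-device-%s-dg devices modify { %s { set-sync-leader } }\n' "$device" "$leader"
    done
}
```

On a blade, the check could be run by piping the step #1 command into the function, for example: tmsh -c 'list security datasync device-stats /Common/datasync-device-*/*cs-asm-dosl7* table' | missing_devices vcmp1.qa.com vcmp2.qa.com. Empty output means every expected device is present. If a device is reported missing, sync_commands vcmp1.qa.com vcmp1.qa.com vcmp2.qa.com prints the commands from step #2; the pause of a few seconds after the first command still has to be observed manually.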
The mcpd daemon of a secondary blade in a cluster no longer fails and restarts when the cluster is part of a trust domain and one of the other devices in the trust domain is rebooted.