Bug ID 544171: bigd loses connection to mcpd on debug data dump

Last Modified: Mar 21, 2019

Affected Product:
BIG-IP LTM (all modules)

Known Affected Versions:
11.4.1, 11.5.0, 11.5.1, 11.5.1 HF1, 11.5.1 HF10, 11.5.1 HF11, 11.5.1 HF2, 11.5.1 HF3, 11.5.1 HF4, 11.5.1 HF5, 11.5.1 HF6, 11.5.1 HF7, 11.5.1 HF8, 11.5.1 HF9, 11.5.2, 11.5.2 HF1, 11.5.3, 11.5.3 HF1, 11.5.3 HF2, 11.5.4, 11.5.4 HF1, 11.5.4 HF2, 11.5.4 HF3, 11.5.4 HF4, 11.5.5, 11.5.6, 11.5.7, 11.5.8, 11.5.9, 11.6.0, 11.6.0 HF1, 11.6.0 HF2, 11.6.0 HF3, 11.6.0 HF4, 11.6.0 HF5, 11.6.0 HF6, 11.6.0 HF7, 11.6.0 HF8, 11.6.1, 11.6.1 HF1, 11.6.1 HF2, 11.6.2, 11.6.2 HF1, 11.6.3, 11.6.3.1, 11.6.3.2, 11.6.3.3, 11.6.3.4, 11.6.4, 12.0.0, 12.0.0 HF1, 12.0.0 HF2, 12.0.0 HF3, 12.0.0 HF4, 12.1.0, 12.1.0 HF1, 12.1.0 HF2, 12.1.1, 12.1.1 HF1, 12.1.1 HF2, 12.1.2, 12.1.2 HF1, 12.1.2 HF2, 12.1.3, 12.1.3.1, 12.1.3.2, 12.1.3.3, 12.1.3.4, 12.1.3.5, 12.1.3.6, 12.1.3.7, 12.1.4, 13.0.0, 13.0.0 HF1, 13.0.0 HF2, 13.0.0 HF3, 13.0.1

Fixed In:
13.1.0

Opened: Sep 05, 2015
Severity: 4-Minor

Symptoms

On large configurations (more than 1,000 pool members), 'bigd' may lose its connection to 'mcpd' when a debug data dump is triggered manually through 'kill -USR1 <bigd-pid>'. While the dump logs those instances, the 'bigd' process can appear 'stuck', causing the 'sod' daemon to kill and restart it. In this case, it is possible that only a partial diagnostic data dump is performed before the 'bigd' process is restarted.

Impact

The 'bigd' diagnostic dump may be incomplete, because the process is terminated before all logging information is written.

Conditions

-- 'bigd' is running with a large configuration (more than a thousand pool members).
-- A diagnostic dump is triggered manually by signaling the 'bigd' process (such as through 'kill -USR1 <bigd-pid>').
-- The 'sod' daemon finds the 'bigd' process unresponsive, and therefore terminates and restarts 'bigd'.

Workaround

Turn off the 'bigd' heartbeat monitoring before manually initiating a 'bigd' diagnostic dump, or run a smaller representative configuration before triggering the diagnostic dump. Note: Make sure to turn the 'bigd' heartbeat monitoring back on afterward.
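
The first workaround can be sketched as a small shell wrapper. This is an illustration under stated assumptions, not a vendor-supplied procedure: the 'tmsh modify sys daemon-ha bigd heartbeat disabled|enabled' syntax is assumed and should be verified against the tmsh reference for your version, and the function names 'run' and 'bigd_dump', the 'DRY_RUN' variable, and the wait time are hypothetical. The wait exists because the dump keeps writing after the signal is sent; the right duration depends on the configuration size.

```shell
#!/bin/bash
# Sketch only: wrap a manual bigd diagnostic dump with heartbeat monitoring off.
# ASSUMPTION: the tmsh daemon-ha syntax below; verify it on your BIG-IP version.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: bigd_dump <bigd-pid> [wait-seconds]
bigd_dump() {
    local pid="$1"
    local wait_secs="${2:-60}"   # hypothetical default; tune for your configuration
    local rc

    if [ -z "$pid" ]; then
        echo "usage: bigd_dump <bigd-pid> [wait-seconds]" >&2
        return 2
    fi

    # Stop sod from treating a busy bigd as unresponsive.
    run tmsh modify sys daemon-ha bigd heartbeat disabled || return 1

    # Trigger the diagnostic dump, then give it time to finish writing.
    run kill -USR1 "$pid"
    rc=$?
    run sleep "$wait_secs"

    # Always turn heartbeat monitoring back on, even if the signal failed.
    run tmsh modify sys daemon-ha bigd heartbeat enabled

    return "$rc"
}
```

Running 'DRY_RUN=1 bigd_dump <bigd-pid> 60' first prints the four commands in order without touching the system, which makes it easy to review them before a real run.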

Fix Information

The 'bigd' connection to 'mcpd' remains intact on large configurations (more than 1,000 pool members) when a debug data dump is triggered through a manual 'kill -USR1 <bigd-pid>', such that the diagnostic data dump is always complete.

Behavior Change