Bug ID 539130: bigd may crash due to a heartbeat timeout

Last Modified: Feb 13, 2019

Bug Tracker

Affected Product:  See more info
BIG-IP LTM(all modules)

Known Affected Versions:
10.2.4, 11.4.1, 11.5.1, 11.5.1 HF1, 11.5.1 HF10, 11.5.1 HF11, 11.5.1 HF2, 11.5.1 HF3, 11.5.1 HF4, 11.5.1 HF5, 11.5.1 HF6, 11.5.1 HF7, 11.5.1 HF8, 11.5.1 HF9, 11.5.2, 11.5.2 HF1, 11.5.3, 11.5.3 HF1, 11.5.3 HF2, 11.6.0, 11.6.0 HF1, 11.6.0 HF2, 11.6.0 HF3, 11.6.0 HF4, 11.6.0 HF5, 11.6.0 HF6, 11.6.0 HF7, 11.6.0 HF8, 12.0.0

Fixed In:
12.1.0, 12.0.0 HF1, 11.6.1, 11.5.4, 11.4.1 HF10, 10.2.4 HF13

Opened: Aug 11, 2015
Severity: 2-Critical
Related AskF5 Article:
K70695033

Symptoms

bigd crashes and generates a core file. The system logs entries in /var/log/ltm that are similar to the following: sod[5853]: 01140029:5: HA daemon_heartbeat bigd fails action is restart. This issue is more likely to occur if /var/log/ltm contains entries similar to the following: info bigd[5947]: reap_child: child process PID = 9198 exited with signal = 9.

Impact

bigd crashes and generates a core file. Monitoring is interrupted.

Conditions

External monitors that run for a long time and are killed by the next iteration of the monitor. For example, the LTM external monitor 'sample_monitor' contains logic to kill a running monitor if it runs too long.

Workaround

None.

Fix Information

External monitors that run for a long time and are killed by the next iteration of the monitor now recover without bigd crashing and generating a core file.

Behavior Change

bigd now logs child process exit messages in /var/log/bigdlog (so bigd.debug must be enabled) rather than in /var/log/ltm. This allows the logging to be controllable. Successful command exits are also logged for completeness since this the log messages only appears when debugging is enabled.