Bug ID 539130: bigd may crash due to a heartbeat timeout

Last Modified: Nov 22, 2021

Affected Product(s):
BIG-IP LTM (all modules)

Known Affected Versions:
12.0.0, 11.6.0, 11.5.3, 11.4.1, 10.2.4

Fixed In:
12.1.0, 12.0.0 HF1, 11.6.1, 11.5.4, 11.4.1 HF10, 10.2.4 HF13

Opened: Aug 11, 2015

Severity: 2-Critical

Related Article: K70695033

Symptoms

bigd crashes and generates a core file. The system logs entries in /var/log/ltm that are similar to the following:

sod[5853]: 01140029:5: HA daemon_heartbeat bigd fails action is restart.

This issue is more likely to occur if /var/log/ltm contains entries similar to the following:

info bigd[5947]: reap_child: child process PID = 9198 exited with signal = 9.
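To check a log for the two kinds of entries quoted above, a small script along the following lines may help. This is only a sketch: the patterns are derived solely from the two example entries in this article, the function name scan_ltm_log is hypothetical, and real log lines may carry additional prefixes (timestamp, hostname), which is why the patterns are searched for rather than anchored.

```python
import re

# Patterns assumed from the two sample entries in this article.
HEARTBEAT_RE = re.compile(
    r"sod\[\d+\]: 01140029:5: HA daemon_heartbeat bigd fails action is restart"
)
REAP_SIGKILL_RE = re.compile(
    r"bigd\[\d+\]: reap_child: child process PID = (\d+) exited with signal = 9"
)


def scan_ltm_log(lines):
    """Return (heartbeat_failure_count, pids_of_children_killed_with_signal_9)."""
    heartbeat_failures = 0
    killed_pids = []
    for line in lines:
        if HEARTBEAT_RE.search(line):
            heartbeat_failures += 1
        match = REAP_SIGKILL_RE.search(line)
        if match:
            killed_pids.append(int(match.group(1)))
    return heartbeat_failures, killed_pids
```

The function accepts any iterable of lines, so an open file handle for /var/log/ltm can be passed directly. A non-empty list of killed PIDs suggests the condition that makes this issue more likely is present.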

Impact

bigd crashes and generates a core file. Monitoring is interrupted.

Conditions

This issue occurs when external monitors run for a long time and are killed by the next iteration of the monitor. For example, the LTM external monitor 'sample_monitor' contains logic to kill a running monitor instance if it runs too long.
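The kill-the-previous-run pattern described above can be illustrated roughly as follows. This is a hypothetical sketch, not the actual 'sample_monitor' script: the function name, the use of a PID file, and the choice of SIGKILL are assumptions made to match the "exited with signal = 9" entries mentioned under Symptoms.

```python
import os
import signal


def kill_previous_and_register(pidfile):
    """Kill the monitor instance recorded in pidfile (if any), then record our PID.

    Hypothetical helper: a new monitor iteration calls this first, so a
    previous iteration that is still running is terminated with SIGKILL.
    """
    try:
        with open(pidfile) as f:
            old_pid = int(f.read().strip())
        os.kill(old_pid, signal.SIGKILL)  # parent sees "exited with signal = 9"
    except (FileNotFoundError, ValueError, ProcessLookupError):
        # No earlier run, unreadable PID file, or the process already exited.
        pass
    with open(pidfile, "w") as f:
        f.write(str(os.getpid()))
```

When a monitor's check regularly outlasts its interval, each new iteration kills its predecessor in this way, which is the condition under which the bigd crash was observed on affected versions.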

Workaround

None.

Fix Information

External monitors that run for a long time and are killed by the next iteration of the monitor now recover without bigd crashing and generating a core file.

Behavior Change

bigd now logs child process exit messages in /var/log/bigdlog rather than in /var/log/ltm, so bigd.debug must be enabled to see them. This makes the logging controllable. Successful command exits are also logged for completeness, since these log messages appear only when debugging is enabled.

Guides & references

K10134038: F5 Bug Tracker Filter Names and Tips