Bug ID 1091785: DBDaemon restarts unexpectedly and/or fails to restart under heavy load

Last Modified: Sep 17, 2022

Affected Product:
BIG-IP LTM (all modules)

Known Affected Versions:
13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 13.1.1.2, 13.1.1.3, 13.1.1.4, 13.1.1.5, 13.1.3, 13.1.3.1, 13.1.3.2, 13.1.3.3, 13.1.3.4, 13.1.3.5, 13.1.3.6, 13.1.4, 13.1.4.1, 13.1.5, 14.0.0, 14.0.0.1, 14.0.0.2, 14.0.0.3, 14.0.0.4, 14.0.0.5, 14.0.1, 14.0.1.1, 14.1.0, 14.1.0.1, 14.1.0.2, 14.1.0.3, 14.1.0.5, 14.1.0.6, 14.1.2, 14.1.2.1, 14.1.2.2, 14.1.2.3, 14.1.2.4, 14.1.2.5, 14.1.2.6, 14.1.2.7, 14.1.2.8, 14.1.3, 14.1.3.1, 14.1.4, 14.1.4.1, 14.1.4.2, 14.1.4.3, 14.1.4.4, 14.1.4.5, 14.1.4.6, 14.1.5, 14.1.5.1, 15.0.0, 15.0.1, 15.0.1.1, 15.0.1.2, 15.0.1.3, 15.0.1.4, 15.1.0, 15.1.0.1, 15.1.0.2, 15.1.0.3, 15.1.0.4, 15.1.0.5, 15.1.1, 15.1.2, 15.1.2.1, 15.1.3, 15.1.3.1, 15.1.4, 15.1.4.1, 15.1.5, 15.1.5.1, 15.1.6, 15.1.6.1, 16.0.0, 16.0.0.1, 16.0.1, 16.0.1.1, 16.0.1.2, 16.1.0, 16.1.1, 16.1.2, 16.1.2.1, 16.1.2.2, 16.1.3, 16.1.3.1, 17.0.0, 17.0.0.1

Opened: Mar 30, 2022
Severity: 3-Major

Symptoms

While under heavy load, the Database monitor daemon (DBDaemon) may:
- Restart for no apparent reason
- Restart repeatedly in rapid succession
- Log the following error while attempting to restart: java.net.BindException: Address already in use (Bind failed)
- Fail to start (remain down) after several attempts, leaving database monitors disabled and marking monitored resources Down
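For readers unfamiliar with the logged error, "Address already in use" is the generic operating-system error returned when a process tries to bind a listening socket to a port that another socket still holds. The following is a minimal Python sketch of that generic condition only; it is not DBDaemon code, and the helper names (try_bind, second_bind_errno) are hypothetical. It makes no claim about why the port is still held in this bug.

```python
import errno
import socket
from typing import Optional


def try_bind(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Bind and listen on a TCP port; raises OSError if the bind fails."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def second_bind_errno() -> Optional[int]:
    """Bind the same port twice and return the errno of the second attempt.

    Returns None if the second bind unexpectedly succeeds.
    """
    first = try_bind(0)  # port 0 asks the OS for any free port
    port = first.getsockname()[1]
    try:
        second = try_bind(port)  # same port while the first socket is open
    except OSError as exc:
        return exc.errno
    else:
        second.close()
        return None
    finally:
        first.close()


if __name__ == "__main__":
    code = second_bind_errno()
    print("second bind failed with EADDRINUSE:", code == errno.EADDRINUSE)
```

Java reports the same operating-system condition as java.net.BindException.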

Impact

DBDaemon may restart for no apparent reason, or fail to start (remain down) after several attempts, leaving database monitors disabled and monitored resources marked Down.

Conditions

- Configure one or more GTM database monitors with short probe-timeout, interval and timeout values (e.g., 2, 5 and 16 respectively)
- Configure a large number (e.g., 2,000) of GTM (or possibly LTM) database monitor instances (each instance is a combination of one of the above monitors and a pool member)
- Optionally: configure the GTM database monitors with debug yes and count 0 (debug yes makes diagnosis easier; count 0 is assumed to generate more stress/concurrency to aid reproduction; vary as needed)
- Watch for DBDaemon restarts (either through changes in the PID returned by ps, or by watching for "Starting" messages in the DBDaemon logs)
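The last step above (watching for PID changes or "Starting" log messages) can be sketched in Python as follows. This is a minimal illustration, not an F5 tool: the function names are hypothetical, the PID samples are assumed to be collected separately (for example by polling ps), and the log lines are assumed to be read from wherever the DBDaemon logs live on the system under test. Only the "Starting" marker comes from the report; the sample log text in the usage section is invented for illustration.

```python
from typing import Iterable, List, Optional


def count_restarts_from_pids(pid_samples: Iterable[Optional[int]]) -> int:
    """Count PID changes across successive samples.

    Each sample is the daemon's PID at one polling instant, or None if
    the daemon was not running at that instant.
    """
    restarts = 0
    last_pid: Optional[int] = None
    for pid in pid_samples:
        if pid is None:
            continue  # daemon down at this sample; wait for it to come back
        if last_pid is not None and pid != last_pid:
            restarts += 1  # a different PID means a new process was started
        last_pid = pid
    return restarts


def find_start_messages(log_lines: Iterable[str],
                        marker: str = "Starting") -> List[str]:
    """Return the log lines that contain the start-up marker."""
    return [line.rstrip("\n") for line in log_lines if marker in line]


if __name__ == "__main__":
    # Hypothetical samples: the daemon restarts twice, with one down period.
    samples = [4100, 4100, None, 5230, 5230, 6011]
    print(count_restarts_from_pids(samples))

    # Hypothetical log lines; the real DBDaemon log format may differ.
    lines = ["t1 Starting daemon\n", "t2 probe ok\n", "t3 Starting daemon\n"]
    print(len(find_start_messages(lines)))
```

Counting PID changes catches restarts even when logging is sparse, while the log scan also records start attempts that fail before a stable process exists, so using both together gives a fuller picture.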

Workaround

None

Fix Information

None

Behavior Change