Bug ID 710028: LTM SQL monitors may stop monitoring if multiple monitors querying same database

Last Modified: Mar 12, 2019

Affected Product:
BIG-IP LTM (all modules)

Known Affected Versions:
11.5.1, 11.5.1 HF1, 11.5.1 HF10, 11.5.1 HF11, 11.5.1 HF2, 11.5.1 HF3, 11.5.1 HF4, 11.5.1 HF5, 11.5.1 HF6, 11.5.1 HF7, 11.5.1 HF8, 11.5.1 HF9, 11.5.2, 11.5.2 HF1, 11.5.3, 11.5.3 HF1, 11.5.3 HF2, 11.5.4, 11.5.4 HF1, 11.5.4 HF2, 11.5.4 HF3, 11.5.4 HF4, 11.5.5, 11.5.6, 11.5.7, 11.5.8, 11.6.0, 11.6.0 HF1, 11.6.0 HF2, 11.6.0 HF3, 11.6.0 HF4, 11.6.0 HF5, 11.6.0 HF6, 11.6.0 HF7, 11.6.0 HF8, 11.6.1, 11.6.1 HF1, 11.6.1 HF2, 11.6.2, 11.6.2 HF1, 11.6.3, 11.6.3.1, 11.6.3.2, 12.0.0, 12.0.0 HF1, 12.0.0 HF2, 12.0.0 HF3, 12.0.0 HF4, 12.1.0, 12.1.0 HF1, 12.1.0 HF2, 12.1.1, 12.1.1 HF1, 12.1.1 HF2, 12.1.2, 12.1.2 HF1, 12.1.2 HF2, 12.1.3, 12.1.3.1, 12.1.3.2, 12.1.3.3, 12.1.3.4, 12.1.3.5, 12.1.3.6, 12.1.3.7, 13.0.0, 13.0.0 HF1, 13.0.0 HF2, 13.0.0 HF3, 13.0.1, 13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 13.1.1.1, 13.1.1.2, 13.1.1.3, 14.0.0, 14.0.0.1, 14.0.0.2, 14.0.0.3, 14.0.0.4

Fixed In:
14.1.0, 13.1.1.4, 12.1.4, 11.6.3.3, 11.5.9

Opened: Mar 13, 2018
Severity: 3-Major

Symptoms

When using an SQL monitor to monitor the health of SQL database pool members, one of the health monitors may stop actively monitoring one or more pool members.

When this problem occurs, the following error messages may be logged in /var/log/DBDaemon-0.log:

[if debug = yes in monitor configuration]:
    Using cached DB connection for connection string '<connection string>'

then multiple, periodic instances of the following message, referencing the same connection string:
    Abandoning hung SQL query: '<query string>' for: '<connection string>'

or:
    <connection string>(<thread-number>): Hung SQL query; abandoning
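To check whether a system is showing this symptom, the DBDaemon log can be scanned for the two "hung query" message forms quoted above. The following is a minimal Python sketch, not an F5-provided tool. The function name count_hung_queries and the sample connection strings are hypothetical, and the patterns assume each message appears on a single line in exactly the form shown above; any timestamp or prefix that the real log adds would need to be accounted for.

```python
import re
from collections import Counter

# Patterns derived from the two message forms quoted in the Symptoms text.
# Assumption: each message appears on one line, in the form shown there.
ABANDON_RE = re.compile(
    r"Abandoning hung SQL query: '(?P<query>.*)' for: '(?P<conn>.*)'"
)
HUNG_RE = re.compile(
    r"^\s*(?P<conn>.+?)\((?P<thread>[^()]*)\): Hung SQL query; abandoning"
)


def count_hung_queries(lines):
    """Count hung-query messages per connection string."""
    counts = Counter()
    for line in lines:
        match = ABANDON_RE.search(line) or HUNG_RE.search(line)
        if match:
            counts[match.group("conn")] += 1
    return counts


if __name__ == "__main__":
    # Path taken from the Symptoms text; adjust if the log lives elsewhere.
    with open("/var/log/DBDaemon-0.log", errors="replace") as log:
        for conn, total in count_hung_queries(log).most_common():
            print(f"{total:6d}  {conn}")
```

A connection string with a repeatedly increasing count is a candidate for the issue described here. A single occurrence may only reflect a transient slow query.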

Impact

When this problem occurs, the affected pool members are reported down, even though the database is actually up and responding correctly to traffic.

Conditions

This may occur when all of the following conditions are met:
-- Using one of the following LTM monitors: mssql, mysql, oracle, postgresql.
-- Configuring multiple pool members for the same node (server).
-- Configuring multiple SQL monitors that query the same server and database.

And when one or both of the following conditions are met:
Either:
-- The SQL monitor is configured with a non-zero 'count' value.
Or:
-- An error occurs while querying a SQL database, such as [recorded in the DBDaemon log]:
   java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.

Workaround

When this problem occurs, successful monitoring can be temporarily restored by disabling then re-enabling monitoring of affected pool members. To avoid one possible trigger for this issue (and thus reduce the likelihood of this issue occurring), configure the 'count' parameter in the SQL monitor configuration to a value of '0'.

Fix Information

LTM SQL monitors continue monitoring when multiple monitors query the same server and database.
