Bug ID 1025089: Pool members marked down by database monitor due to stale cached connection

Last Modified: Aug 03, 2021

Affected Product:
BIG-IP GTM, LTM (all modules)

Known Affected Versions:
12.1.0, 12.1.0 HF1, 12.1.0 HF2, 12.1.1, 12.1.1 HF1, 12.1.1 HF2, 12.1.2, 12.1.2 HF1, 12.1.2 HF2, 12.1.3, 12.1.3.1, 12.1.3.2, 12.1.3.3, 12.1.3.4, 12.1.3.5, 12.1.3.6, 12.1.3.7, 12.1.4, 12.1.4.1, 12.1.5, 12.1.5.1, 12.1.5.2, 12.1.5.3, 12.1.6, 13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 13.1.1.2, 13.1.1.3, 13.1.1.4, 13.1.1.5, 13.1.3, 13.1.3.1, 13.1.3.2, 13.1.3.3, 13.1.3.4, 13.1.3.5, 13.1.3.6, 13.1.4, 13.1.4.1, 14.1.0, 14.1.0.1, 14.1.0.2, 14.1.0.3, 14.1.0.5, 14.1.0.6, 14.1.2, 14.1.2.1, 14.1.2.2, 14.1.2.3, 14.1.2.4, 14.1.2.5, 14.1.2.6, 14.1.2.7, 14.1.2.8, 14.1.3, 14.1.3.1, 14.1.4, 14.1.4.1, 14.1.4.2, 14.1.4.3, 15.0.0, 15.0.1, 15.0.1.1, 15.0.1.2, 15.0.1.3, 15.0.1.4, 15.1.0, 15.1.0.1, 15.1.0.2, 15.1.0.3, 15.1.0.4, 15.1.0.5, 15.1.1, 15.1.2, 15.1.2.1, 15.1.3, 15.1.3.1, 16.0.0, 16.0.0.1, 16.0.1, 16.0.1.1, 16.0.1.2, 16.1.0

Opened: Jun 11, 2021
Severity: 3-Major

Symptoms

By default, BIG-IP database monitors (mssql, mysql, oracle, postgresql) keep a connection to the database server open between monitor probes, which avoids the overhead of establishing a new network connection for each query operation. If this cached connection times out or is dropped by the database server, the probe that discovers this marks the connection as "stale", and a new connection is established only on the following scheduled monitor probe. In the meantime, because of the lost connection, the monitored pool member may be marked DOWN until that following probe runs. This is more likely to occur when a database monitor is used to monitor a GTM pool member rather than an LTM pool member, due to differences in how monitors are configured for GTM versus LTM.
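
The sequence described above can be illustrated with a small simulation. This is only a sketch of the behavior as described in this article, not BIG-IP code: the class names FakeDbServer and CachingMonitor and the function run_scenario are hypothetical, and the model assumes a single failed probe marks the member DOWN.

```python
class FakeDbServer:
    """Hypothetical stand-in for a database server that can drop connections."""

    def __init__(self):
        self._next_id = 0
        self._live = set()

    def connect(self):
        # Open a new connection and return its identifier.
        self._next_id += 1
        self._live.add(self._next_id)
        return self._next_id

    def drop_all(self):
        # Simulate the server timing out or dropping every open connection.
        self._live.clear()

    def query(self, conn_id):
        # A query succeeds only on a connection the server still considers open.
        return conn_id in self._live


class CachingMonitor:
    """Hypothetical monitor that reuses one cached connection between probes."""

    def __init__(self, server):
        self.server = server
        self.cached_conn = None

    def probe(self):
        if self.cached_conn is None:
            self.cached_conn = self.server.connect()
        ok = self.server.query(self.cached_conn)
        if not ok:
            # The cached connection is stale: discard it so that the
            # following probe opens a new one.
            self.cached_conn = None
        return ok


def run_scenario():
    server = FakeDbServer()
    monitor = CachingMonitor(server)
    results = [monitor.probe()]      # opens and caches a connection
    server.drop_all()                # server drops the idle connection
    results.append(monitor.probe())  # stale connection: probe fails
    results.append(monitor.probe())  # new connection: probe succeeds
    # Assumption: a single failed probe marks the member DOWN.
    return ["UP" if ok else "DOWN" for ok in results]


print(run_scenario())  # ['UP', 'DOWN', 'UP']
```

In this model the member is DOWN for exactly one probe interval, which matches the temporary outage described in the Impact section.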

Impact

GTM or LTM pool members monitored by a database monitor may be marked DOWN temporarily if the network connection to the database server is dropped or times out.

Conditions

This may occur under the following conditions:
-- GTM or LTM pool members are monitored by a database monitor, configured such that a single probe failure will mark the member DOWN. (Such a configuration may be more common for GTM monitors.)
-- Either the database server times out or drops the connection for some reason, or no database monitor probes are sent to the database server within a 5-minute interval.

Workaround

To work around this issue, perform one of the following actions:
-- Configure the database (mssql, mysql, oracle, postgresql) monitor with a "count" value of "1". This prevents the caching and reuse of network connections to the database server between probes, so there is no cached connection to time out or be dropped. However, the overhead of establishing the network connection to the database server will be incurred for each probe.
-- Configure the database monitor "interval" and "timeout" values (for an LTM monitor), or the "interval", "timeout", "probe-attempts", "probe-interval" and "probe-timeout" values (for a GTM monitor) such that multiple failed monitor probes are required before the monitored member is marked DOWN.
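
The effect of the second workaround can be estimated with a rough calculation. The sketch below is a simplified model, not the BIG-IP algorithm: the function names are hypothetical, it assumes one probe is sent per "interval", and it approximates the timeout window by counting consecutive failed probes. Actual GTM behavior also depends on the "probe-attempts", "probe-interval" and "probe-timeout" values, which this model does not cover.

```python
def probes_within_timeout(interval, timeout):
    """Rough number of probes sent during one timeout window.

    Assumption: one probe per interval, so roughly timeout // interval
    probes must fail before the timeout elapses.
    """
    if interval <= 0 or timeout <= 0:
        raise ValueError("interval and timeout must be positive")
    return max(1, timeout // interval)


def member_states(probe_results, failures_to_mark_down):
    """Member state after each probe, marking DOWN only after the given
    number of consecutive failed probes (simplified model)."""
    states = []
    consecutive = 0
    for ok in probe_results:
        consecutive = 0 if ok else consecutive + 1
        states.append("DOWN" if consecutive >= failures_to_mark_down else "UP")
    return states


# One probe fails on a stale connection, the next one reconnects.
probes = [True, False, True]

# Single-failure configuration: the member goes DOWN for one interval.
print(member_states(probes, 1))  # ['UP', 'DOWN', 'UP']

# interval=5, timeout=16: about three failures are needed, so one stale
# connection does not mark the member DOWN.
tolerated = probes_within_timeout(5, 16)
print(tolerated)                         # 3
print(member_states(probes, tolerated))  # ['UP', 'UP', 'UP']
```

Under these assumptions, any configuration in which the timeout spans at least two probe intervals absorbs the single failed probe caused by a stale connection.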

Fix Information

None

Behavior Change