Bug ID 888341: HA Group failover may fail to complete Active/Standby state transition

Last Modified: Jul 02, 2020

Affected Product:
BIG-IP TMOS (all modules)

Known Affected Versions:
11.6.0, 11.6.0 HF1, 11.6.0 HF2, 11.6.0 HF3, 11.6.0 HF4, 11.6.0 HF5, 11.6.0 HF6, 11.6.0 HF7, 11.6.0 HF8, 11.6.1, 11.6.1 HF1, 11.6.1 HF2, 11.6.2, 11.6.2 HF1, 11.6.3, 11.6.4, 11.6.5, 12.0.0, 12.0.0 HF1, 12.0.0 HF2, 12.0.0 HF3, 12.0.0 HF4, 12.1.0, 12.1.0 HF1, 12.1.0 HF2, 12.1.1, 12.1.1 HF1, 12.1.1 HF2, 12.1.2, 12.1.2 HF1, 12.1.2 HF2, 12.1.3, 12.1.4, 12.1.5, 13.0.0, 13.0.0 HF1, 13.0.0 HF2, 13.0.0 HF3, 13.0.1, 13.1.0, 13.1.1, 13.1.3, 14.0.0, 14.0.1, 14.1.0, 14.1.2, 15.0.0, 15.0.1, 15.1.0

Opened: Mar 09, 2020
Severity: 2-Critical


After a long uptime interval (that is, when the sod process has been running uninterrupted for a long time), high availability (HA) Group failover may not complete even though an HA Group score change has occurred. As a result, a BIG-IP unit with a lower HA Group score may remain the Active device.

Note: The uptime required to encounter this issue depends on the number of traffic groups: the more traffic groups, the shorter the uptime. For example:
-- For 1 floating traffic group, after approximately 2485 days.
-- For 2 floating traffic groups, after approximately 1242 days.
-- For 4 floating traffic groups, after approximately 621 days.
-- For 8 floating traffic groups, after approximately 310 days.
-- For 9 floating traffic groups, after approximately 276 days.

Note: You can confirm sod process uptime in tmsh:
# tmsh show /sys service sod
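The example figures above are consistent with dividing the single-group figure (about 2485 days) by the number of floating traffic groups and rounding down. That relationship is inferred from the listed examples, not stated in this report, so the following Python sketch should be read as an approximation. The function name and constant are hypothetical.

```python
# Approximate uptime (in days) before this issue may appear, per the
# examples in this report. ASSUMPTION: the threshold scales as
# 2485 / (number of floating traffic groups), rounded down. This is
# inferred from the listed examples, not documented behavior.
BASE_DAYS_SINGLE_GROUP = 2485  # figure given for 1 floating traffic group


def estimated_days_to_issue(floating_traffic_groups: int) -> int:
    """Return the approximate sod uptime, in days, at which the issue may occur."""
    if floating_traffic_groups < 1:
        raise ValueError("at least one floating traffic group is required")
    return BASE_DAYS_SINGLE_GROUP // floating_traffic_groups


if __name__ == "__main__":
    for groups in (1, 2, 4, 8, 9):
        days = estimated_days_to_issue(groups)
        print(f"{groups} floating traffic group(s): about {days} days")
```

For the group counts listed in this report (1, 2, 4, 8, 9), the sketch reproduces the stated figures; values for other group counts are extrapolations.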


The HA Group Active/Standby state transition may not complete despite an HA Group score change.


-- HA Group failover mode is configured.

Note: Only HA Group failover is affected. The following failover mechanisms are not affected:
   o VLAN failsafe failover.
   o Gateway failsafe failover.
   o Failover triggered by loss of network failover heartbeat packets.
   o Failover caused by system failsafe (i.e., the TMM process was terminated on the Active unit).


There is no workaround. The only option is to reboot all BIG-IP units in the device group at regular intervals. The required interval depends directly on the number of traffic groups.
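As a rough planning aid, the reboot interval can be turned into a calendar date. The Python sketch below assumes the threshold is about 2485 days divided by the number of floating traffic groups (inferred from the example figures in this report, not documented behavior) and subtracts a safety margin. The function name, the margin parameter, and its default of 30 days are hypothetical; the sod start date must be obtained separately, for example from the tmsh output.

```python
from datetime import date, timedelta

# ASSUMPTION: threshold of about 2485 days for one floating traffic group,
# scaling inversely with the number of groups (inferred from this report).
BASE_DAYS_SINGLE_GROUP = 2485


def latest_reboot_date(sod_start: date,
                       floating_traffic_groups: int,
                       margin_days: int = 30) -> date:
    """Return a suggested latest reboot date, leaving a safety margin."""
    if floating_traffic_groups < 1:
        raise ValueError("at least one floating traffic group is required")
    if margin_days < 0:
        raise ValueError("margin_days must not be negative")
    threshold_days = BASE_DAYS_SINGLE_GROUP // floating_traffic_groups
    # Never schedule before the start date, even with a large margin.
    usable_days = max(threshold_days - margin_days, 0)
    return sod_start + timedelta(days=usable_days)


if __name__ == "__main__":
    # Hypothetical example: sod started on 2020-01-01, 8 traffic groups.
    print(latest_reboot_date(date(2020, 1, 1), 8))
```

The margin exists because the figures in this report are approximate; choose a value that suits the maintenance schedule of the device group.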

Fix Information


Behavior Change