Bug ID 1026273: HA failover connectivity using the cluster management address does not work on VIPRION platforms

Last Modified: Nov 22, 2021

Affected Product:
BIG-IP Install/Upgrade, TMOS(all modules)

Known Affected Versions:
16.1.0, 16.0.1.2, 16.0.1.1, 16.0.1, 16.0.0.1, 16.0.0, 15.1.4, 15.1.3.1, 15.1.3, 15.1.2.1, 15.1.2, 14.1.4.4, 14.1.4.3, 14.1.4.2, 14.1.4.1, 14.1.4, 14.1.3.1, 14.1.3, 14.1.2.8, 14.1.2.7, 13.1.4.1, 13.1.4, 13.1.3.6, 13.1.3.5

Opened: Jun 16, 2021
Severity: 3-Major

Symptoms

Upon upgrade to an affected version, failover communication via the management port does not work. You may still see packets passing back and forth, but the listener on the receiving end is not configured, so the channel is not up. You may see the following symptoms:

-- Running 'tmsh show cm failover-status' shows a status of 'Error' on the management network.

-- Running 'tmctl' commands reports the disconnected state. Example:

    $ tmctl -l sod_tg_conn_stat -s entry_key,last_msg,status
    entry_key                     last_msg   status
    ----------------------------- ---------- ------
    10.76.7.8->10.76.7.9:1026     0          0      <--- Notice there is no 'last message' and 'status' is 0, which means disconnected.
    10.76.7.8->17.1.90.2:1026     1623681404 1

-- In the output of 'netstat -pan | grep 1026', the management IP is not shown listening on port 1026. Example (notice that the management IP from the above example, 10.76.7.9, is not listed):

    # netstat -pan | grep 1026
    udp 0 0 10.10.10.10:1026 0.0.0.0:* 6035/sod

-- Listing /var/run/ contents shows that the chmand.pid file is missing:

    # ls /var/run/chmand.pid
    ls: cannot access /var/run/chmand.pid: No such file or directory
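
The tmctl and netstat checks described in the symptoms can be scripted when several devices need to be inspected. The following is a minimal Python sketch, not an F5-provided tool. It assumes the command output has already been captured as text and that the column layout matches the examples in this article. The function names (parse_conn_stat, disconnected, is_listening) and the sample constants are hypothetical, introduced only for illustration.

```python
# Sample output copied from the Symptoms examples in this article.
SAMPLE_TMCTL = """\
entry_key                     last_msg   status
----------------------------- ---------- ------
10.76.7.8->10.76.7.9:1026     0          0
10.76.7.8->17.1.90.2:1026     1623681404 1
"""

SAMPLE_NETSTAT = """\
udp 0 0 10.10.10.10:1026 0.0.0.0:* 6035/sod
"""


def parse_conn_stat(text):
    """Parse 'tmctl -l sod_tg_conn_stat -s entry_key,last_msg,status' output.

    Returns a list of (entry_key, last_msg, status) tuples. Assumes the
    three-column layout shown in this article.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with "src->dst:port"; header and rule lines do not.
        if len(fields) < 3 or "->" not in fields[0]:
            continue
        try:
            last_msg, status = int(fields[1]), int(fields[2])
        except ValueError:
            continue  # not a data row in the expected format
        rows.append((fields[0], last_msg, status))
    return rows


def disconnected(rows):
    """Return the entry keys whose status is 0 (disconnected)."""
    return [key for key, _last_msg, status in rows if status == 0]


def is_listening(netstat_text, ip, port=1026):
    """Return True if 'netstat -pan' output shows a UDP socket bound to ip:port."""
    target = f"{ip}:{port}"
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local address, foreign address, ...
        if len(fields) >= 4 and fields[0].startswith("udp") and fields[3] == target:
            return True
    return False


if __name__ == "__main__":
    rows = parse_conn_stat(SAMPLE_TMCTL)
    print("Disconnected channels:", disconnected(rows))
    print("10.76.7.9 listening on 1026:", is_listening(SAMPLE_NETSTAT, "10.76.7.9"))
```

With the sample data, the management channel 10.76.7.8->10.76.7.9:1026 is reported as disconnected, and 10.76.7.9 is not found listening on port 1026, which matches the symptoms described above.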

Impact

If only the management address is configured for failover, or if there are communication issues over the self IP (such as misconfigured port lockdown settings), the devices may behave unexpectedly, for example by both going active.

Conditions

-- Running on VIPRION platforms.
-- Only the cluster management IP address is configured; no cluster member IP addresses are configured.
-- A software version where ID810821 is fixed is installed (see https://cdn.f5.com/product/bugtracker/ID810821.html).
-- The management IP is configured in the failover configuration.

Workaround

-- Configure a cluster member IP address on each individual blade in addition to the cluster management IP address.

Fix Information

None

Behavior Change