Bug ID 1922489: Confd in the partition_manager container on blades fails to rejoin the HA cluster, causing other daemons to hang

Last Modified: Jul 05, 2025

Affected Product(s):
F5OS Velos (all modules)

Known Affected Versions:
F5OS-C 1.8.0, F5OS-C 1.8.1

Opened: Apr 09, 2025

Severity: 3-Major

Symptoms

The confd instance running on a blade can occasionally disconnect, usually because of control plane networking issues. If the disconnect happens while a database write is in progress, the blade confd instance may fail to rejoin the HA cluster. When this happens, the blade status in "show system redundancy nodes" shows as "offline/reconnecting" or some status other than "replica/services running".

Impact

Platform services on the blade hang and cannot service tenant requests, and dataplane traffic is interrupted.

Conditions

A network problem interrupts a database operation on the blade confd instance.

Workaround

If this condition is detected, running "cluster reboot all" at the partition CLI, or "cluster reboot node (affected node)" for the affected blade, should clear it.
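
The detection step can be sketched in a few lines of Python. This is only an illustration, not an F5-provided tool: it assumes the caller has already extracted each blade's status string from "show system redundancy nodes" into a dictionary (this article does not specify the output layout, so no parsing is shown), and the node names in the sample are hypothetical. The only values taken from this article are the healthy status "replica/services running" and the example unhealthy status "offline/reconnecting".

```python
# Healthy blade status as quoted in the Symptoms section of this article.
HEALTHY_BLADE_STATUS = "replica/services running"


def find_affected_blades(blade_statuses):
    """Return the sorted names of blades whose status is not the healthy one.

    blade_statuses: dict mapping a blade node name to the status string the
    caller read from "show system redundancy nodes". Only blade entries
    should be passed in; other node types may legitimately report a
    different status (assumption, not confirmed by this article).
    """
    return sorted(
        name
        for name, status in blade_statuses.items()
        # Normalize whitespace and case so cosmetic differences do not
        # cause a healthy blade to be flagged.
        if status.strip().lower() != HEALTHY_BLADE_STATUS
    )


if __name__ == "__main__":
    # Hypothetical sample data; the node names are made up for illustration.
    sample = {
        "blade-1": "replica/services running",
        "blade-2": "offline/reconnecting",
    }
    for name in find_affected_blades(sample):
        print(f"{name}: candidate for the cluster reboot workaround")
```

Any blade the function returns is a candidate for the "cluster reboot node (affected node)" workaround described above; confirm the status at the partition CLI before rebooting.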

Fix Information

None

Behavior Change

Guides & references

K10134038: F5 Bug Tracker Filter Names and Tips