Bug ID 1135181: Controller rolling upgrade may cause blades to reboot into partition "none", deleting tenant data

Last Modified: Oct 06, 2022

Affected Product:
F5OS Velos (all modules)

Known Affected Versions:
1.5.0

Opened: Aug 09, 2022
Severity: 2-Critical

Symptoms

System controller components can hang during a controller rolling upgrade, causing partitions to fail to start correctly, along with other incorrect operation. Partition instance state may show as "failed", "offline", or "running" rather than the normal "running-active" or "running-standby". If switchd hangs during the rolling upgrade, failure messages appear when blades reboot.
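
As a rough illustration of the state check described above, the following Python sketch flags partition instances whose state is not one of the two normal values. The function name and the input format (a mapping of made-up instance labels to state strings) are assumptions for illustration only; this is not an F5OS API, and collecting the states from the system is left out.

```python
# Normal states named in this bug report.
HEALTHY_STATES = {"running-active", "running-standby"}


def find_suspect_instances(states):
    """Return the entries whose state is not one of the normal states.

    `states` is a hypothetical mapping of instance label -> state string.
    """
    return {name: s for name, s in states.items() if s not in HEALTHY_STATES}


if __name__ == "__main__":
    sample = {
        "instance-1": "running-active",
        "instance-2": "running",  # plain "running" is a symptom, not a normal state
    }
    print(find_suspect_instances(sample))
```

The comparison is an exact match rather than a prefix match, because plain "running" is itself one of the symptom states.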

Impact

Tenant instance data (the virtual disk image) may be deleted from the blades if the blades are rebooted while this issue is occurring.

Conditions

Performing a system controller rolling upgrade to version F5OS-C 1.5.0 from an earlier version.

Workaround

The problem can be avoided by performing an out-of-service upgrade, using the "out-of-service true" option with the system controller "system image set-version" command. If a VELOS chassis has already undergone a rolling upgrade to F5OS-C 1.5.0, reboot both system controllers to return them to a stable state. If blades in a partition were affected, reboot those blades after rebooting the system controllers. Deleted tenant instance data cannot be recovered; it must be recreated and/or restored from a UCS backup.
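
The ordering in the workaround above can be sketched as a small Python helper that returns the steps as plain text. The function, its parameters, and the blade labels are hypothetical and only restate the sequence in this report; the sketch does not run any F5OS command.

```python
def recovery_steps(rolling_upgrade_done, affected_blades):
    """Return the workaround steps, in order, as plain strings.

    rolling_upgrade_done: True if the chassis already went through a
        rolling upgrade to F5OS-C 1.5.0.
    affected_blades: hypothetical labels of blades that were affected.
    """
    if not rolling_upgrade_done:
        # Avoidance path: upgrade out of service instead of rolling.
        return ['upgrade with "system image set-version" and "out-of-service true"']

    steps = ["reboot both system controllers"]
    # Affected blades are rebooted only after the system controllers.
    steps += [f"reboot blade {blade}" for blade in affected_blades]
    if affected_blades:
        steps.append(
            "recreate lost tenant instance data and/or restore it from a UCS backup"
        )
    return steps


if __name__ == "__main__":
    for step in recovery_steps(True, ["blade-1", "blade-2"]):
        print(step)
```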

Fix Information

None

Behavior Change