Last Modified: May 24, 2023
Known Affected Versions:
Opened: Feb 22, 2023 Severity: 1-Blocking
After the system reboots during a live upgrade, tenants that were previously running might not return to a running state. This happens because the HSM board driver is stuck in SAFE_STATE instead of OPERATIONAL_STATE. On some systems, the driver changes to OPERATIONAL_STATE after a delay (approximately 10 minutes), but this time can vary depending on when the reset/link failure is detected in the hardware. On other systems, the driver remains stuck in SAFE_STATE indefinitely.
Running tenants go to a pending state when this issue occurs during a live upgrade.
Live upgrade or reboot of an rSeries FIPS system running F5OS-A. You may observe log messages like the following in dmesg:
[ 964.105021] liquidsec_pf_vf_driver 0000:ca:00.0: We might have a link issue... resetting
[ 964.113688] liquidsec_pf_vf_driver 0000:ca:00.0: RESETTING FIRMWARE... CAUTION
Check the driver state in /proc/cavium_n3fips/driver_state, as shown below:
[appliance]# cat /proc/cavium_n3fips/driver_state
HSM 0:OPERATIONAL_STATE
If the driver has changed to OPERATIONAL_STATE, run "docker restart fips-support-pod" to help the tenants recover. If the driver state is still "HSM 0:SAFE_STATE", you may need to power cycle the system, although this does not guarantee recovery.
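The check above can be scripted so that you do not have to poll the state file by hand. The following is a minimal sketch, not an F5-supplied tool: the function names hsm_state and wait_for_operational are hypothetical, and treating any SAFE_STATE line as "not ready" when more than one HSM is listed is an assumption. Only the state file path and the two state strings come from this article.

```shell
#!/bin/sh
# Hypothetical helpers (not part of F5OS): poll the HSM driver state file.

# hsm_state [FILE]
# Prints "safe", "operational", or "unknown" for the given state file.
# Assumption: if any listed HSM reports SAFE_STATE, the result is "safe".
hsm_state() {
    state_file="${1:-/proc/cavium_n3fips/driver_state}"
    if grep -q 'SAFE_STATE' "$state_file" 2>/dev/null; then
        echo safe
    elif grep -q 'OPERATIONAL_STATE' "$state_file" 2>/dev/null; then
        echo operational
    else
        echo unknown
    fi
}

# wait_for_operational FILE TIMEOUT_SECONDS INTERVAL_SECONDS
# Returns 0 as soon as the driver reports operational, or 1 once the
# timeout has elapsed without the driver leaving SAFE_STATE.
wait_for_operational() {
    state_file="$1"
    timeout="$2"
    interval="$3"
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        if [ "$(hsm_state "$state_file")" = operational ]; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1
}
```

For example, `wait_for_operational /proc/cavium_n3fips/driver_state 900 30 && docker restart fips-support-pod` would wait up to 15 minutes, checking every 30 seconds, and restart the pod only if the driver recovers. The 900-second timeout is an illustrative choice based on the approximate 10-minute delay described above. If the command returns non-zero, the driver is still in SAFE_STATE and the power cycle step applies.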