Bug ID 886693: System may become unresponsive after upgrading

Last Modified: Aug 05, 2020

Affected Product:
BIG-IP Install/Upgrade, TMOS(all modules)

Known Affected Versions:
13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 13.1.1.2, 13.1.1.3, 13.1.1.4, 13.1.1.5, 13.1.3, 13.1.3.1, 13.1.3.2, 13.1.3.3, 13.1.3.4, 14.0.0, 14.0.0.1, 14.0.0.2, 14.0.0.3, 14.0.0.4, 14.0.0.5, 14.0.1, 14.0.1.1, 14.1.0, 14.1.0.1, 14.1.0.2, 14.1.0.3, 14.1.0.5, 14.1.0.6, 14.1.2, 14.1.2.1, 14.1.2.2, 14.1.2.3, 14.1.2.4, 14.1.2.5, 14.1.2.6, 15.0.0, 15.0.1, 15.0.1.1, 15.0.1.2, 15.0.1.3, 15.0.1.4, 15.1.0, 15.1.0.1, 15.1.0.2, 15.1.0.3, 15.1.0.4, 16.0.0

Opened: Mar 02, 2020
Severity: 2-Critical

Symptoms

After upgrading, the system encounters numerous issues:
-- Memory exhaustion (RAM plus swap), with no particular process consuming excessive memory.
-- High CPU usage, with most cycles going to I/O wait.
-- The system is unresponsive: it is difficult to log in and slow to accept commands.
-- Provisioning is incomplete; only a small amount of memory is assigned to the 'host' category.
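
To confirm the memory and I/O wait symptoms on a system that still accepts commands, the standard Linux counters in /proc/stat and /proc/meminfo can be sampled. The following is a minimal sketch, not an F5 tool: the function names are hypothetical, and it assumes you can capture the text of those two files (taking /proc/stat twice, a few seconds apart).

```python
def cpu_times(proc_stat_text):
    """Return the aggregate CPU counters from /proc/stat text as a list of ints."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly "cpu"; per-core lines are "cpu0", "cpu1", ...
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_fraction(before_text, after_text):
    """Fraction of CPU time spent in I/O wait between two /proc/stat samples."""
    before = cpu_times(before_text)
    after = cpu_times(after_text)
    deltas = [a - b for a, b in zip(after, before)]
    # Sum user..steal only; guest time is already included in user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return deltas[4] / total  # index 4 is the iowait counter


def swap_used_fraction(meminfo_text):
    """Fraction of swap in use, computed from /proc/meminfo text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    total = values.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - values.get("SwapFree", 0)) / total
```

An I/O wait fraction and a swap usage fraction that are both close to 1.0 would be consistent with the symptoms listed above, but they do not by themselves identify this bug.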

Impact

-- The system is down, too busy to process traffic.
-- Logging in over SSH may be difficult, so serial console access might be required.

Conditions

-- The configuration works in the previous release, but does not work properly in the release you are upgrading to.
-- The device is upgraded and the configuration is rolled forward.
-- There may be other conditions preventing the configuration from loading successfully after an upgrade.

Exact conditions that trigger this issue are unknown. In the environment in which it occurred, a datagroup had been deleted, but an iRule was still referencing it. See https://cdn.f5.com/product/bugtracker/ID688629.html
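
The one known trigger, an iRule that still references a deleted datagroup, can be screened for before upgrading. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the iRule source and the list of deleted datagroup names are supplied by you, and the check is a plain textual search rather than a Tcl parser, so it will also flag names that appear only in comments.

```python
import re


def stale_datagroup_references(irule_text, deleted_names):
    """Return the deleted datagroup names that still appear in the iRule text."""
    stale = []
    for name in deleted_names:
        # Match the name as a whole token: a path prefix such as /Common/ is
        # allowed in front, but the name must not be part of a longer identifier.
        pattern = r"(?<![\w.-])" + re.escape(name) + r"(?![\w.-])"
        if re.search(pattern, irule_text):
            stale.append(name)
    return stale
```

For example, passing the text of an iRule that contains `class match [IP::client_addr] equals blocked_ips` together with the names `["blocked_ips", "allowed_hosts"]` returns `["blocked_ips"]`. An empty result does not guarantee that the configuration will load after the upgrade, since other conditions may also trigger the issue.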

Workaround

Reboot to an unaffected, pre-upgrade volume.
-- If the system is responsive enough, use 'tmsh reboot volume <N>' on BIG-IP Virtual Edition (VE) or switchboot to select an unaffected volume.
-- If the system is completely unresponsive, physically power cycle a physical appliance or reboot a VE from an applicable management panel, then manually select an unaffected volume from the GRUB menu.

Note: This requires that you have console access, or even physical access to the BIG-IP device if you are unable to SSH in to the unit. On a physical device, a non-responsive system might require that you flip the power switch.

For more information, see:
-- K9296: Changing the default boot image location on VIPRION platforms :: https://support.f5.com/csp/article/K9296
-- K5658: Overview of the switchboot utility :: https://support.f5.com/csp/article/K5658
-- K10452: Overview of the GRUB 0.97 configuration file :: https://support.f5.com/csp/article/K10452

Fix Information

None

Behavior Change