Bug ID 886693: System may become unresponsive after upgrading

Last Modified: Oct 20, 2021

Affected Product:
BIG-IP Install/Upgrade, TMOS(all modules)

Known Affected Versions:
13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 13.1.1.2, 13.1.1.3, 13.1.1.4, 13.1.1.5, 13.1.3, 13.1.3.1, 13.1.3.2, 13.1.3.3, 13.1.3.4, 13.1.3.5, 13.1.3.6, 14.0.0, 14.0.0.1, 14.0.0.2, 14.0.0.3, 14.0.0.4, 14.0.0.5, 14.0.1, 14.0.1.1, 14.1.0, 14.1.0.1, 14.1.0.2, 14.1.0.3, 14.1.0.5, 14.1.0.6, 14.1.2, 14.1.2.1, 14.1.2.2, 14.1.2.3, 14.1.2.4, 14.1.2.5, 14.1.2.6, 14.1.2.7, 14.1.2.8, 14.1.3, 14.1.3.1, 15.0.0, 15.0.1, 15.0.1.1, 15.0.1.2, 15.0.1.3, 15.0.1.4, 15.1.0, 15.1.0.1, 15.1.0.2, 15.1.0.3, 15.1.0.4, 15.1.0.5, 15.1.1, 15.1.2, 15.1.2.1, 16.0.0, 16.0.0.1, 16.0.1, 16.0.1.1

Fixed In:
16.1.0, 16.0.1.2, 15.1.3, 14.1.4, 13.1.4
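As a rough illustration of how the "Fixed In" list might be read, the sketch below checks whether a given version string is at or beyond the fix on its release branch. The helper names (`parse`, `has_fix`) are hypothetical, and the branch-matching rule is an assumption: a fix listed for a major.minor branch is taken to apply to later releases on that same branch, and anything at or above the highest listed fix is taken to include it.

```python
# Fixed-in versions copied from the "Fixed In" list above.
FIXED_IN = ["16.1.0", "16.0.1.2", "15.1.3", "14.1.4", "13.1.4"]


def parse(version):
    """Turn a dotted version such as '14.1.2.8' into a tuple of ints,
    padded with zeros to four parts so that '14.1.4' == '14.1.4.0'."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (4 - len(parts)))


def has_fix(version):
    """Return True if `version` appears to include the fix.

    Assumption (not stated in the bug report): a fix on branch X.Y applies
    to every later release on X.Y, and every release at or above the
    highest listed fix also includes it.
    """
    v = parse(version)
    fixes = [parse(f) for f in FIXED_IN]
    if v >= max(fixes):
        return True
    # Otherwise the version must be on a branch that has a listed fix.
    return any(v[:2] == f[:2] and v >= f for f in fixes)
```

A result of `False` only means that no fix was found for that version under these assumptions; whether a version is actually affected should still be confirmed against the Known Affected Versions list. Branches with no listed fix, such as 14.0.x and 15.0.x, always return `False` here.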

Opened: Mar 02, 2020
Severity: 2-Critical

Symptoms

After upgrading, the system encounters numerous issues:
-- Memory exhaustion (RAM plus swap) with no particular process consuming excessive memory.
-- High CPU usage with most cycles going to I/O wait.
-- System is unresponsive, difficult to log in to, and slow to accept commands.
-- Provisioning is incomplete; only a small amount of memory is assigned to the 'host' category.
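The first two symptoms can be quantified from standard Linux counters. The sketch below is a minimal example, assuming Python 3 and the usual layouts of /proc/meminfo and /proc/stat; the function names are hypothetical and nothing here is an F5-provided tool. Both functions take the file contents as text so they can be run against saved output from an affected system.

```python
def memory_used_fraction(meminfo_text):
    """Fraction of RAM plus swap in use, from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            fields[key.strip()] = int(rest.split()[0])  # values are in kB
    total = fields["MemTotal"] + fields["SwapTotal"]
    # MemAvailable is missing on older kernels; fall back to MemFree.
    free = fields.get("MemAvailable", fields["MemFree"]) + fields["SwapFree"]
    return (total - free) / total


def iowait_fraction(stat_before, stat_after):
    """Share of CPU time spent in I/O wait between two /proc/stat samples."""

    def cpu_fields(text):
        for line in text.splitlines():
            parts = line.split()
            if parts and parts[0] == "cpu":  # aggregate line, not cpu0, cpu1...
                return [int(x) for x in parts[1:]]
        raise ValueError("no aggregate 'cpu' line found")

    before, after = cpu_fields(stat_before), cpu_fields(stat_after)
    deltas = [a - b for a, b in zip(after, before)]
    # Fields are user, nice, system, idle, iowait, irq, softirq, steal;
    # guest time is already included in user, so only the first eight count.
    total = sum(deltas[:8])
    return deltas[4] / total if total else 0.0
```

Values close to 1.0 from `memory_used_fraction`, together with a high `iowait_fraction` measured over a few seconds, would be consistent with the symptoms listed above, although they are not specific to this bug.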

Impact

-- System down, too busy to process traffic.
-- Difficulty logging in over SSH might require serial console access.

Conditions

-- The configuration loads in the previous release, but does not load successfully on the first boot into the release you are upgrading to.
-- Device is upgraded and the configuration is rolled forward.
-- There may be other conditions preventing the configuration from loading successfully after an upgrade.

The exact conditions that trigger this issue are unknown and may vary. In the environment in which it occurs, a datagroup is deleted but an iRule still references it. See: https://cdn.f5.com/product/bugtracker/ID688629.html

Workaround

Reboot to an unaffected, pre-upgrade volume.
-- If the system is responsive enough, use 'tmsh reboot volume <N>' or switchboot to select an unaffected volume.
-- If the system is completely unresponsive, power cycle a physical appliance or reboot a BIG-IP Virtual Edition (VE) from an applicable management panel, and then manually select an unaffected volume from the GRUB menu.

Note: This requires console access, or physical access to the BIG-IP device if you are unable to SSH in to the unit. On a physical device, a non-responsive system might require that you flip the power switch.

For more information, see:
-- K9296: Changing the default boot image location on VIPRION platforms :: https://support.f5.com/csp/article/K9296
-- K5658: Overview of the switchboot utility :: https://support.f5.com/csp/article/K5658
-- K10452: Overview of the GRUB 0.97 configuration file :: https://support.f5.com/csp/article/K10452

Fix Information

The system should now remain responsive if the configuration fails to load during an upgrade on the following platforms:
-- BIG-IP 2000s / 2200s
-- BIG-IP 4000s / 4200v
-- BIG-IP i850 / i2600 / i2800
-- BIG-IP Virtual Edition (VE)

Behavior Change