Bug ID 1092913: Tenant CPU pinning can fail when a blade is moved to a new partition and the blade was previously running a deployed tenant

Last Modified: Aug 21, 2022

Affected Product:
F5OS Velos (all modules)

Fixed In:
1.5.0

Opened: Mar 31, 2022
Severity: 2-Critical

Symptoms

Newly deployed tenants on a blade that was just moved to a new partition may not perform optimally, and the confd tenant status may show that CPU allocation to the tenant failed.

Impact

One or more new tenants deployed on the blade in the new partition may fail to be assigned to the appropriate CPUs, which can degrade the performance of those tenants.

Conditions

A blade is running a deployed tenant, and the blade is moved into a new partition
or
The tenant on the blade is deleted, and less than about 2 minutes after that, the blade is moved to a new partition

Workaround

The underlying problem is that, before the blade moved to the new partition, it never had the chance to release the CPUs assigned to tenants that had been deployed on it. The best approach is to avoid the problem altogether by ensuring that no tenants are deployed on a blade before moving it to a new partition. Changing a tenant from the deployed state to any other state (provisioned, configured, deleted) is not synchronous, so wait at least 2 minutes after the change before moving the blade to a new partition.

If the problem has already occurred, it is still possible to clean up the tenant CPU allocator database. The simplest steps are:

1. If any tenants are deployed on the blade in the new partition, set them to provisioned, and wait more than 2 minutes.
2. Manually remove the tenant CPU database file on the blade: "rm /opt/f5/cpumgr/cpu_users"
3. Reboot the blade (this recreates the file, with all tenant records cleared).
4. Redeploy any tenants from step 1.
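The wait-then-remove part of the cleanup (the wait in step 1 and the file removal in step 2) could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name clear_cpu_allocator_db and the 150-second settle time are hypothetical choices, not part of F5OS, and the script assumes it is run with sufficient privileges on the affected blade. Setting tenants to provisioned, rebooting the blade, and redeploying the tenants are not covered and still need to be done by hand.

```python
import os
import time

# Path taken from the workaround text above.
CPU_USERS_PATH = "/opt/f5/cpumgr/cpu_users"

# Assumption: 150 s is simply a value above the "more than 2 minutes"
# mentioned in the workaround; it is not an F5-documented setting.
SETTLE_SECONDS = 150


def clear_cpu_allocator_db(path=CPU_USERS_PATH,
                           settle_seconds=SETTLE_SECONDS,
                           sleep=time.sleep):
    """Wait for tenant state changes to settle, then remove the file.

    Returns True if the file was removed, False if it was already absent.
    """
    # Tenant state changes are not synchronous, so wait before touching
    # the allocator database.
    sleep(settle_seconds)
    try:
        os.remove(path)
    except FileNotFoundError:
        return False
    return True
```

Calling clear_cpu_allocator_db() after the tenants have been set to provisioned performs the wait and the removal in one go. The blade must still be rebooted afterwards so that the file is recreated with all tenant records cleared.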

Fix Information

None

Behavior Change