Bug ID 788577: BFD sessions may be reset after CMP state change

Last Modified: Oct 16, 2023

Affected Product(s):
BIG-IP TMOS(all modules)

Known Affected Versions:
11.5.0, 11.5.1, 11.5.2, 11.5.3, 11.5.4, 11.5.5, 11.5.6, 11.5.7, 11.5.8, 11.5.9, 11.5.10, 11.6.0, 11.6.1, 11.6.2, 11.6.3, 11.6.3.1, 11.6.3.2, 11.6.3.3, 11.6.3.4, 11.6.4, 11.6.5, 11.6.5.1, 12.0.0, 12.0.0 HF1, 12.1.0 HF1, 12.0.0 HF2, 12.1.0 HF2, 12.0.0 HF3, 12.0.0 HF4, 12.1.1 HF1, 12.1.1 HF2, 12.1.2 HF1, 12.1.2 HF2, 12.1.0, 12.1.1, 12.1.2, 12.1.3, 12.1.3.1, 12.1.3.2, 12.1.3.3, 12.1.3.4, 12.1.3.5, 12.1.3.6, 12.1.3.7, 12.1.4, 12.1.4.1, 12.1.5, 12.1.5.1, 13.0.0, 13.0.0 HF1, 13.0.0 HF2, 13.0.0 HF3, 13.0.1, 13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 13.1.1.2, 13.1.1.3, 13.1.1.4, 13.1.1.5, 13.1.3, 13.1.3.1, 13.1.3.2, 13.1.3.3, 13.1.3.4, 14.0.0, 14.0.0.1, 14.0.0.2, 14.0.0.3, 14.0.0.4, 14.0.0.5, 14.0.1, 14.0.1.1, 14.1.0, 14.1.0.1, 14.1.0.2, 14.1.0.3, 14.1.0.5, 14.1.0.6, 14.1.2, 14.1.2.1, 14.1.2.2, 14.1.2.3, 14.1.2.4, 14.1.2.5, 14.1.2.6, 14.1.2.7, 14.1.2.8, 14.1.3, 15.0.0, 15.0.1, 15.0.1.1, 15.0.1.2, 15.0.1.3, 15.0.1.4, 15.1.0, 15.1.0.1, 15.1.0.2, 15.1.0.3, 15.1.0.4, 15.1.0.5

Fixed In:
16.1.0, 16.0.1, 15.1.1, 14.1.3.1, 13.1.3.5, 12.1.5.2, 11.6.5.2

Opened: May 31, 2019

Severity: 3-Major

Symptoms

A CMP (Clustered Multiprocessing) state change occurs when the state of the BIG-IP system changes. This happens in the following instances: - Blade reset. - Booting up or shutting down. - Running 'bigstart restart'. - Setting a blade state from/to primary/secondary. During these events, Bidirectional Forwarding Detection (BFD) session processing ownership might be migrating from old, processing TMMs to new, selected TMMs. This process is rapid and could lead to contest between several TMMs over who should be the next BFD processing owner. It might also lead to a situation where the BFD session is deleted and immediately recreated. This problem occurs rarely and only on a chassis with more than one blade.

Impact

When the BFD session is recreated, it marks corresponding routing protocol DOWN if it's configured. The protocol might be BGP, OSPF, or any other routing protocols that support BFD. This causes the routing protocol to withdraw dynamic routes learnt by the configured protocol, making it impossible to advertise dynamic routes of affected routing protocols from the BIG-IP system to the configured peers. This can lead to unexpected routing decisions on the BIG-IP system or other devices in the routing mesh. In most cases, unexpected routing decision are from networks learnt by affected routing protocols when the routing process on the BIG-IP system become unreachable. However, this state is short-lived, because the peering will be recreated shortly after the routing protocol restarts. The peering time depends on the routing configuration and responsiveness of other routing devices connected to the BIG-IP system. It's the usual routing convergence period, which includes setting the peering and exchanging routing information and routes.

Conditions

-- VIPRION chassis with more than one blade. -- CMP hash of affected VLAN is changed from the Default value, for example, to Source Address. -- BFD peering is configured. -- CMP state change is occurred on one of the blades. -- BFD connection is redistributed to the processing group (TMMs) on the blade that experienced the CMP state change and the contest between the old TMM owner and the new TMM owner occurs.

Workaround

There are two workarounds, although the latter is probably impractical: -- Change CMP hash of affected VLAN to the Default value. -- Maintain a chassis with a single blade only. Disable or shut down all blades except one.

Fix Information

BFD session is no longer reset during CMP state change.

Behavior Change

Guides & references

K10134038: F5 Bug Tracker Filter Names and Tips