Bug ID 794385: BGP sessions may be reset after CMP state change

Last Modified: Nov 20, 2019

Affected Product:
BIG-IP TMOS (all modules)

Known Affected Versions:
13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 13.1.1.1, 13.1.1.2, 13.1.1.3, 13.1.1.4, 13.1.1.5, 13.1.3, 13.1.3.1, 13.1.3.2, 14.0.0, 14.0.0.1, 14.0.0.2, 14.0.0.3, 14.0.0.4, 14.0.0.5, 14.0.1, 14.0.1.1, 14.1.0, 14.1.0.1, 14.1.0.2, 14.1.0.3, 14.1.0.4, 14.1.0.5, 14.1.0.6, 14.1.2, 14.1.2.1, 14.1.2.2, 15.0.0, 15.0.1

Opened: Jun 17, 2019
Severity: 3-Major

Symptoms

A CMP (Clustered Multiprocessing) state change occurs when the state of the BIG-IP system changes. This happens in the following instances:
- Blade reset.
- Booting up or shutting down.
- Running 'bigstart restart'.
- Changing a blade's state between primary and secondary.

During these events, there is a small chance that an ingress ACK packet belonging to a previously established BGP connection is disaggregated to the new processing group (TMMs), and that the selected TMM is ready to process traffic in general but not yet ready to process traffic for existing connections. In this case, the connection is reset instead of being processed.
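
The following is a toy model of the symptom, not the actual BIG-IP disaggregation algorithm. It assumes a simple source-address hash taken modulo the number of active TMMs. The TMM names, the peer address, and the functions pick_tmm and handle_ack are hypothetical and exist only for illustration. The sketch shows how a change in the set of active TMMs can steer an ACK for an established connection to a TMM that holds no state for it.

```python
import ipaddress


def pick_tmm(src_ip, active_tmms):
    """Toy source-address hash (an assumption): peer IP modulo TMM count."""
    key = int(ipaddress.ip_address(src_ip))
    return active_tmms[key % len(active_tmms)]


def handle_ack(src_ip, active_tmms, connection_table):
    """Return the selected TMM and what happens to the packet.

    connection_table maps a TMM name to the set of peers whose
    established connections that TMM currently knows about.
    """
    tmm = pick_tmm(src_ip, active_tmms)
    if src_ip in connection_table.get(tmm, set()):
        return tmm, "processed"
    # The selected TMM has no state for this connection, so it resets it.
    return tmm, "reset"


peer = "10.0.0.7"  # hypothetical BGP peer address

# Before the CMP state change: four TMMs, the session lives on tmm3.
before = ["tmm0", "tmm1", "tmm2", "tmm3"]
table = {pick_tmm(peer, before): {peer}}
print(handle_ack(peer, before, table))  # ('tmm3', 'processed')

# After the CMP state change: six TMMs, same peer, same hash input.
after = ["tmm0", "tmm1", "tmm2", "tmm3", "tmm4", "tmm5"]
print(handle_ack(peer, after, table))   # ('tmm5', 'reset')
```

In this model the hash input never changes; only the size of the processing group does, which is enough to move the flow to a different TMM.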

Impact

The affected BGP peering is reset and dynamic routes learnt by the configured protocol are withdrawn, making it impossible to advertise dynamic routes of the affected routing protocols from the BIG-IP system to the configured peers. This can lead to unexpected routing decisions on the BIG-IP system or on other devices in the routing mesh. In most cases, the unexpected routing decisions involve networks learnt through the affected routing protocols while the routing process on the BIG-IP system is unreachable.

However, this state is short-lived, because the peering is recreated shortly after the routing protocol restarts. The time needed to re-establish the peering depends on the routing configuration and on the responsiveness of the other routing devices connected to the BIG-IP system. This is the usual routing convergence period, which includes establishing the peering and exchanging routing information and routes.

Conditions

-- VIPRION chassis with more than one blade.
-- The CMP hash of the affected VLAN is changed from the Default value, for example, to Source Address.
-- BGP peering is configured.
-- A CMP state change occurs on one of the blades.
-- A BGP ingress ACK packet is disaggregated to a TMM that is either the wrong TMM or not yet ready to process packets of an already established connection.

Workaround

There is no workaround. However, the issue has never been seen in configurations where the CMP hash of the affected VLAN is set back to the Default value.

Fix Information

None

Behavior Change