Bug ID 930825: System should reboot (rather than restart services) when it sees a large number of HSB XLMAC errors

Last Modified: Sep 16, 2020

Affected Product:
BIG-IP TMOS (all modules)

Known Affected Versions:
12.1.0, 12.1.0 HF1, 12.1.0 HF2, 12.1.1, 12.1.1 HF1, 12.1.1 HF2, 12.1.2, 12.1.2 HF1, 12.1.2 HF2, 12.1.3, 12.1.3.1, 12.1.3.2, 12.1.3.3, 12.1.3.4, 12.1.3.5, 12.1.3.6, 12.1.3.7, 12.1.4, 12.1.4.1, 12.1.5, 12.1.5.1, 12.1.5.2, 13.0.0, 13.0.0 HF1, 13.0.0 HF2, 13.0.0 HF3, 13.0.1, 13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 13.1.1.2, 13.1.1.3, 13.1.1.4, 13.1.1.5, 13.1.3, 13.1.3.1, 13.1.3.2, 13.1.3.3, 13.1.3.4, 14.0.0, 14.0.0.1, 14.0.0.2, 14.0.0.3, 14.0.0.4, 14.0.0.5, 14.0.1, 14.0.1.1, 14.1.0, 14.1.0.1, 14.1.0.2, 14.1.0.3, 14.1.0.5, 14.1.0.6, 14.1.2, 14.1.2.1, 14.1.2.2, 14.1.2.3, 14.1.2.4, 14.1.2.5, 14.1.2.6, 14.1.2.7, 15.0.0, 15.0.1, 15.0.1.1, 15.0.1.2, 15.0.1.3, 15.0.1.4, 15.1.0, 15.1.0.1, 15.1.0.2, 15.1.0.3, 15.1.0.4, 15.1.0.5, 16.0.0, 16.0.0.1

Opened: Jul 24, 2020
Severity: 3-Major

Symptoms

The following symptoms may be seen when the HSB experiences a large number of XLMAC errors and cannot recover from them. After XLMAC recovery attempts fail, the current behavior is to fail over to the peer unit, then go offline and down links. This can be seen in the TMM logs:

-- notice The number of the HSB XLMAC recovery operation 11 or fcs failover count 0 reached threshold 11 on bus: 3.
-- notice HA failover action is triggered due to XLMAC/FCS errors on HSB1 on bus 3.
-- notice HSBE2 1 disable XLMAC TX/RX at runtime.
-- notice HA failover action is cleared.

These messages are followed by a failover event.
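To check whether a unit has hit this condition, the log messages quoted above can be searched for programmatically. The following is a minimal sketch, not an F5-provided tool: the function name `scan_tmm_log` and the returned dictionary layout are hypothetical, and the patterns are derived only from the sample messages in this report, so they may need adjusting if the wording differs on other versions.

```python
import re

# Patterns derived from the sample TMM log messages quoted in this report.
THRESHOLD_RE = re.compile(
    r"HSB XLMAC recovery operation (\d+) or fcs failover count (\d+) "
    r"reached threshold (\d+) on bus: (\d+)"
)
TRIGGER_RE = re.compile(
    r"HA failover action is triggered due to XLMAC/FCS errors "
    r"on HSB(\d+) on bus (\d+)"
)


def scan_tmm_log(lines):
    """Return a list of XLMAC-related events found in an iterable of log lines.

    Hypothetical helper for illustration; the event dictionaries are an
    assumed layout, not a format defined by BIG-IP.
    """
    events = []
    for line in lines:
        match = THRESHOLD_RE.search(line)
        if match:
            recoveries, fcs_count, threshold, bus = map(int, match.groups())
            events.append({
                "kind": "threshold_reached",
                "recoveries": recoveries,
                "fcs_failover_count": fcs_count,
                "threshold": threshold,
                "bus": bus,
            })
            continue
        match = TRIGGER_RE.search(line)
        if match:
            events.append({
                "kind": "failover_triggered",
                "hsb": int(match.group(1)),
                "bus": int(match.group(2)),
            })
    return events
```

The function accepts any iterable of strings, so it can be given an open log file or a list of lines copied from a support case. Seeing a "threshold_reached" event together with a "failover_triggered" event on the same bus matches the sequence described in this report.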

Impact

The BIG-IP system fails over.

Conditions

It is unknown under what conditions the XLMAC errors occur.

Workaround

Change the default high availability (HA) action for switchboard-failsafe so that the system reboots instead of going offline and downing links.

Fix Information

None

Behavior Change