Bug ID 727467: Some iSeries appliances can experience traffic disruption when the HA peer is upgraded from 12.1.3 and earlier to 13.1.0 or later.

Last Modified: May 07, 2019

BIG-IP All (all modules)

Known Affected Versions:
12.1.0, 12.1.0 HF1, 12.1.0 HF2, 12.1.1, 12.1.1 HF1, 12.1.1 HF2, 12.1.2, 12.1.2 HF1, 12.1.2 HF2, 12.1.3, 12.1.4, 13.0.0, 13.0.0 HF1, 13.0.0 HF2, 13.0.0 HF3, 13.0.1, 13.1.0, 13.1.1, 14.0.0

Fixed In:

Opened: Jul 10, 2018
Severity: 3-Major


-- CPU core 0 can be seen utilizing 100% CPU.
-- Other even cores may show a 40% increase in CPU usage.
-- Pool monitors are seen flapping in /var/log/ltm.
-- System posts the following messages:
   + In /var/log/ltm:
     - err tmm4[21025]: 01340004:3: HA Connection detected dissimilar peer: local npgs 1, remote npgs 1, local npus 8, remote npus 8, local pg 0, remote pg 0, local pu 4, remote pu 0. Connection will be aborted.
   + In /var/log/tmm:
     - notice DAGLIB: Invalid table size 12
     - notice DAG: Failed to consume DAG data
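The log signatures listed above can be checked for with a short script. The following is a minimal sketch, not an F5-provided tool: the function names `count_matches` and `report_symptoms` are hypothetical, and the search strings are copied from the messages quoted above. It only counts matching lines; it does not confirm that this bug is the cause.

```shell
#!/bin/sh
# Hypothetical helpers: count log lines that contain the message
# fragments quoted in the symptoms above.

count_matches() {
    # $1 = literal string to search for, $2 = log file path.
    # grep -c prints the number of matching lines; -F disables regex matching.
    if [ -r "$2" ]; then
        grep -cF -- "$1" "$2"
    else
        # Unreadable or missing file: report zero matches.
        echo 0
    fi
}

report_symptoms() {
    # Default paths are the log files named in the symptoms above.
    ltm_log=${1:-/var/log/ltm}
    tmm_log=${2:-/var/log/tmm}
    echo "dissimilar_peer=$(count_matches 'HA Connection detected dissimilar peer' "$ltm_log")"
    echo "invalid_table_size=$(count_matches 'DAGLIB: Invalid table size' "$tmm_log")"
    echo "dag_consume_failed=$(count_matches 'DAG: Failed to consume DAG data' "$tmm_log")"
}
```

Calling `report_symptoms` with no arguments reads /var/log/ltm and /var/log/tmm; pass two paths to scan copies of the logs instead.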


- High CPU usage.
- Traffic disruption.


-- Active unit on a pre- release.
-- Standby peer upgraded to a 13.1.0 or later release.
-- Device is an iSeries device (i5600 or later).

Important: This issue may also affect iSeries HA peers on the same software version if the devices do not share the same model number.

Note: Although this also occurs when upgrading to and 13.0.x, the issue is not as severe.


Minimize impact on affected active devices by keeping the upgraded post-13.1.0 unit offline as long as possible before going directly to Active.

For example, on a 12.1.3 unit to be upgraded (pre-upgrade):
-- Run the following command: tmsh run sys failover offline persist
-- Run the following command: tmsh save sys config
-- Upgrade to
-- Unit comes back up on as 'Forced Offline' and does not communicate with the active unit running 12.1.3 at all.
-- Set up HA group and make sure the 12.1.3 Active unit's HA score is lower than
-- To cause the unit to go directly to Active and take over traffic, run the following command on the unit running tmsh run sys failover online

At this point, the 12.1.3 unit starts to show symptoms of this issue. However, because it is no longer processing traffic, there is no cause for concern.
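The command steps in the workaround above can be wrapped so they can be rehearsed before a maintenance window. This is a minimal sketch under stated assumptions: `run_step`, `pre_upgrade_offline`, `post_upgrade_online`, and the `DRY_RUN` variable are hypothetical names introduced here, and only the three tmsh commands come from the workaround text. The upgrade itself and the HA group setup are not scripted.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1, print each command instead of
# executing it, so the sequence can be reviewed first.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Run on the unit that is about to be upgraded (pre-upgrade).
pre_upgrade_offline() {
    # Force the unit offline persistently, then save the configuration
    # so the offline state is part of the saved config.
    run_step tmsh run sys failover offline persist &&
    run_step tmsh save sys config
}

# Run on the upgraded unit once it is ready to take over traffic.
post_upgrade_online() {
    run_step tmsh run sys failover online
}
```

For example, `DRY_RUN=1 pre_upgrade_offline` prints the two pre-upgrade commands without changing the unit's failover state.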

This release introduces a new bigdb variable, DAG.OverrideTableSize. To prevent the issue on an upgraded post-13.1.0 unit, set DAG.OverrideTableSize to 3.

To return the system to typical CPU usage, you must set the db variable and then restart tmm by running the following command: bigstart restart tmm (Restarting tmm is required for and newer 13.1.1.x releases.)

Note: Because the restart is occurring on the Standby unit, no traffic is disrupted while tmm restarts.
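The fix described above (set the db variable, then restart tmm) might be applied as follows. This is a sketch, not a verified procedure: it assumes the variable is set with the generic `tmsh modify sys db <name> value <value>` form and that the name is accepted as spelled in this article. Confirm the exact key on the target release before running it. `run_step`, `apply_dag_override`, and `DRY_RUN` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1, print each command instead of
# executing it.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Run on the upgraded Standby unit only.
apply_dag_override() {
    # Assumption: db key spelled as in the article text.
    run_step tmsh modify sys db DAG.OverrideTableSize value 3 &&
    # Restart tmm so the new table size takes effect.
    run_step bigstart restart tmm
}
```

Running `DRY_RUN=1 apply_dag_override` first shows the two commands without touching the unit.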

