Last Modified: Sep 13, 2023
Known Affected Versions:
12.1.3, 184.108.40.206, 220.127.116.11, 18.104.22.168, 22.214.171.124, 126.96.36.199, 188.8.131.52, 184.108.40.206, 12.1.4, 220.127.116.11, 12.1.5, 18.104.22.168, 22.214.171.124, 126.96.36.199, 12.1.6, 13.0.0, 13.0.0 HF1, 13.0.0 HF2, 13.0.0 HF3, 13.0.1, 13.1.0, 188.8.131.52, 184.108.40.206, 220.127.116.11, 18.104.22.168, 22.214.171.124, 126.96.36.199, 188.8.131.52, 184.108.40.206, 13.1.1, 14.0.0, 220.127.116.11, 18.104.22.168, 22.214.171.124, 126.96.36.199
14.1.0, 188.8.131.52, 184.108.40.206
Opened: Jul 10, 2018
Severity: 3-Major
-- CPU core 0 can be seen utilizing 100% CPU.
-- Other even cores may show a 40% increase in CPU usage.
-- Pool monitors are seen flapping in /var/log/ltm.
-- System posts the following messages:
   + In /var/log/ltm:
     - err tmm4: 01340004:3: high availability (HA) Connection detected dissimilar peer: local npgs 1, remote npgs 1, local npus 8, remote npus 8, local pg 0, remote pg 0, local pu 4, remote pu 0. Connection will be aborted.
   + In /var/log/tmm:
     - notice DAGLIB: Invalid table size 12
     - notice DAG: Failed to consume DAG data
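As a rough way to check for the messages listed above, the following shell sketch counts matching lines in the two log files. The function name scan_dag_symptoms is hypothetical, and the search patterns are simply substrings of the messages quoted in this entry; adjust them if your log lines differ.

```shell
#!/bin/sh
# Hypothetical helper: count log lines matching the symptom messages quoted above.
# Arguments default to the log paths named in this entry.
scan_dag_symptoms() {
    ltm_log="${1:-/var/log/ltm}"
    tmm_log="${2:-/var/log/tmm}"
    # First output line: number of "dissimilar peer" errors in the ltm log.
    grep -c 'detected dissimilar peer' "$ltm_log" || true
    # Second output line: number of DAG-related notices in the tmm log.
    grep -cE 'DAGLIB: Invalid table size|DAG: Failed to consume DAG data' "$tmm_log" || true
}
```

Calling scan_dag_symptoms with no arguments reads /var/log/ltm and /var/log/tmm; nonzero counts suggest the unit is showing the symptoms described here.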
- High CPU usage.
- Traffic disruption.
-- Active unit on a pre-220.127.116.11 release.
-- Standby peer upgraded to a 13.1.0 or later release.
-- Device is an iSeries device (i5600 or later).

Important: This issue may also affect iSeries high availability (HA) peers on the same software version if the devices do not share the same model number.

Note: Although this also occurs when upgrading to 18.104.22.168 and 13.0.x, the issue is not as severe.
Minimize impact on affected active devices by keeping the upgraded post-13.1.0 unit offline as long as possible before going directly to Active.

For example, on a 12.1.3 unit to be upgraded (pre-upgrade):
-- Run the following command: tmsh run sys failover offline persist
-- Run the following command: tmsh save sys config
-- Upgrade to 22.214.171.124.
-- Unit comes back up on 126.96.36.199 as 'Forced Offline' and does not communicate with the active unit running 12.1.3 at all.
-- Set up a high availability (HA) group and make sure the 12.1.3 Active unit's high availability (HA) score is lower than that of the unit running 188.8.131.52.
-- To cause the 184.108.40.206 unit to go directly to Active and take over traffic, run the following command on the unit running 220.127.116.11: tmsh run sys failover online

At this point, the 12.1.3 unit starts to show symptoms of this issue; however, because it is no longer processing traffic, there is no cause for concern.
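The command portion of the steps above can be sketched as a small shell script. The three tmsh commands are taken from this entry; the run wrapper, the DRY_RUN variable, and the function names pre_upgrade and take_over are hypothetical additions for illustration. The upgrade itself and the HA group setup are not scripted here.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on the 12.1.3 unit before upgrading it: force it offline
# (persisting across reboot) and save the configuration.
pre_upgrade() {
    run tmsh run sys failover offline persist
    run tmsh save sys config
}

# Run on the upgraded unit after the HA group is set up, so that it
# goes directly to Active and takes over traffic.
take_over() {
    run tmsh run sys failover online
}
```

Setting DRY_RUN=1 before calling either function prints the commands without running them, which may be useful for reviewing the sequence before a maintenance window.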
This release introduces a new bigdb variable, DAG.OverrideTableSize. To prevent the issue on an upgraded post-13.1.0 unit, set DAG.OverrideTableSize to 3.

To return the system to typical CPU usage, you must set the db variable and then restart tmm by running the following command: bigstart restart tmm
(Restarting tmm is required for 18.104.22.168 and newer 13.1.1.x releases.)

Note: Because the restart is occurring on the Standby unit, no traffic is disrupted while tmm restarts.
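A minimal sketch of applying the fix on the upgraded Standby unit follows. It assumes the usual tmsh form for setting a db variable (tmsh modify sys db NAME value VALUE) and uses the variable name exactly as spelled in this entry; confirm the spelling your release accepts before relying on it. The run wrapper, DRY_RUN, and the function name apply_dag_override are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set the db variable described above, then restart tmm so the
# new value takes effect. Intended for the Standby unit only.
apply_dag_override() {
    # Assumption: standard "modify sys db" syntax; name spelled as in this entry.
    run tmsh modify sys db DAG.OverrideTableSize value 3
    run bigstart restart tmm
}
```

Running this on the Standby unit keeps the tmm restart from disrupting traffic, as noted above.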