Last Modified: Oct 19, 2025
Affected Product(s):
BIG_IP_NEXT(SPK) SPK
Known Affected Versions:
1.7.8
Fixed In:
1.7.9
Opened: Jul 03, 2024 Severity: 2-Critical
The DB and Sentinel pods might enter an erroneous state in which the Sentinels point to a non-master DB as the master DB. This happens only after a specific sequence of events.
TMM is not able to establish communication with the Redis DB, which causes a disruption to traffic flow.
The following sequence of events causes the above scenario to surface:
1. A failover is performed to make DB-2 the master and DB-0/DB-1 the replicas.
2. The Sentinels are scaled down to 0, pod DB-2 is deleted, and the Sentinels are then scaled up to 3. Alternatively, all 3 Sentinel pods and DB-2 are deleted together.
3. After step 2, during bootup, the init script in each pod must finish querying the master DB status from Sentinel within 5 seconds.
The mitigation is to scale down both the DB and Sentinel pods to 0 and then scale them up to 3 using the steps below:
1. Scale down DB pods to 0:
   oc scale statefulset/f5-dssm-db --replicas=0 -n <namespace>
2. Scale down Sentinel to 0:
   oc scale statefulset/f5-dssm-sentinel --replicas=0 -n <namespace>
3. Scale up DB to 3:
   oc scale statefulset/f5-dssm-db --replicas=3 -n <namespace>
4. Scale up Sentinel to 3:
   oc scale statefulset/f5-dssm-sentinel --replicas=3 -n <namespace>
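The four scaling steps above could be wrapped in a small script along the following lines. This is only a sketch: the `oc scale` commands and StatefulSet names are taken from the mitigation, while the function names (`run_oc`, `wait_for_replicas`, `restart_dssm`), the `DRY_RUN` switch, the 300-second timeout, and the choice to wait for each StatefulSet to settle before moving on are illustrative assumptions, not part of the documented procedure.

```shell
#!/usr/bin/env bash
# Sketch of the scale-down/scale-up mitigation. Only the four `oc scale`
# commands come from the advisory; the waiting logic, DRY_RUN switch and
# timeout are assumptions added for illustration.

# Run an oc command, or only print it when DRY_RUN=1.
run_oc() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "oc $*"
  else
    oc "$@"
  fi
}

# Poll one StatefulSet status field until it equals the wanted count.
# Skipped entirely in dry-run mode.
wait_for_replicas() {
  local sts="$1" want="$2" ns="$3" field="$4" timeout="${5:-300}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    return 0
  fi
  local waited=0 have
  while [ "$waited" -lt "$timeout" ]; do
    have="$(oc get "statefulset/$sts" -n "$ns" -o "jsonpath={.status.$field}")"
    # An empty result (field omitted from status) is treated as 0.
    if [ "${have:-0}" = "$want" ]; then
      return 0
    fi
    sleep 5
    waited=$((waited + 5))
  done
  echo "timed out waiting for $sts to reach $want replicas" >&2
  return 1
}

# Apply the mitigation in the documented order: DB down, Sentinel down,
# DB up, Sentinel up.
restart_dssm() {
  local ns="$1"
  run_oc scale statefulset/f5-dssm-db --replicas=0 -n "$ns" || return 1
  wait_for_replicas f5-dssm-db 0 "$ns" replicas || return 1
  run_oc scale statefulset/f5-dssm-sentinel --replicas=0 -n "$ns" || return 1
  wait_for_replicas f5-dssm-sentinel 0 "$ns" replicas || return 1
  run_oc scale statefulset/f5-dssm-db --replicas=3 -n "$ns" || return 1
  wait_for_replicas f5-dssm-db 3 "$ns" readyReplicas || return 1
  run_oc scale statefulset/f5-dssm-sentinel --replicas=3 -n "$ns" || return 1
  wait_for_replicas f5-dssm-sentinel 3 "$ns" readyReplicas || return 1
}

# Run only when a namespace is passed on the command line.
if [ "$#" -ge 1 ]; then
  restart_dssm "$1"
fi
```

Saved under a hypothetical name such as restart-dssm.sh, it would be invoked with the namespace as its only argument. Setting DRY_RUN=1 first prints the four commands without touching the cluster, which is a cheap way to confirm the namespace before running the mitigation for real.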
The fix is implemented in two parts:
1. The TMM code is enhanced to disconnect and try to reconnect to the Redis DB if it connects to a READONLY DB.
2. The init bootup script is enhanced to handle this scenario gracefully.