Last Modified: Jan 13, 2023
Opened: Dec 07, 2022
Following a controller rolling upgrade, one or both of the chassis partition controller instances may fail to start completely. To check, run the "show partitions" command. In the normal state, one controller instance shows "running-active" and the other shows "running-standby". Any other status (running, offline, failed, or no status) indicates that the database is not operating correctly.
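The status rule above can be sketched as a small check. This is only an illustration: the function name is hypothetical, and it assumes the two status strings have already been read from the "show partitions" output by some other means, since the exact output layout is not described here.

```python
from collections import Counter

# The only healthy combination described above: one active, one standby.
EXPECTED = Counter(["running-active", "running-standby"])


def partition_controllers_healthy(statuses):
    """Return True only if the statuses are exactly one "running-active"
    and one "running-standby".

    `statuses` is a list of status strings (or None for "no status"),
    one per controller instance. How they are collected is left to the caller.
    """
    normalized = [(s or "").strip().lower() for s in statuses]
    return Counter(normalized) == EXPECTED
```

For example, `partition_controllers_healthy(["running-active", "running"])` is False, because "running" alone is not a normal final state.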
One or both instances of the chassis partition control plane are not operating. This will prevent the chassis partition rolling upgrade, and may stop tenant traffic.
At database startup, a chassis partition can hang while retrieving the database primary key. The presence of this defect is confirmed by observing this message at the end of the partition devel.log file: ERR> 6-Jan-2023::17:51:49.205 partition1 confd: confd encryptedStrings command timed out after 300000 ms inactivity
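The log check can be sketched as follows, assuming the contents of the partition devel.log file are already available as text. The function name and the choice to inspect the last 20 non-empty lines are assumptions made for illustration; the timestamp, partition name, and timeout value will differ between systems, so only the fixed part of the message is matched.

```python
import re

# Fixed part of the message; the millisecond value is allowed to vary.
TIMEOUT_PATTERN = re.compile(
    r"confd encryptedStrings command timed out after \d+ ms inactivity"
)


def log_shows_key_retrieval_hang(log_text, tail_lines=20):
    """Return True if the timeout message appears near the end of the log.

    `tail_lines` (an assumed default) limits the search to the last
    non-empty lines, since the message is expected at the end of the file.
    """
    lines = [ln for ln in log_text.splitlines() if ln.strip()]
    return any(TIMEOUT_PATTERN.search(ln) for ln in lines[-tail_lines:])
```

An older occurrence of the message followed by many later log lines is deliberately ignored, since that partition may have recovered since.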
If the chassis partition is in this state, it can be recovered by disabling the partition, waiting for both instances to transition to "disabled", and then re-enabling it. The error state is unlikely to occur unless the partition startup happens during a controller failover.