Bug ID 1399105

Bug ID 1399105: Etcd scale-up operation fails after PXE boot of VELOS system controller

Last Modified: Jan 27, 2026

Affected Product(s):
F5OS F5OS-C (all modules)

Known Affected Versions:
F5OS-C 1.6.1, F5OS-C 1.6.2, F5OS-C 1.6.4

Fixed In:
F5OS-C 1.8.0

Opened: Nov 15, 2023

Severity: 3-Major

Symptoms

When PXE booting a controller, the Kubernetes cluster may not form properly. This happens during the etcd scale-up operation, and the following log message will be seen repeatedly in the active controller's /var/log/openshift.log file:

"Failed to scale ETCD instances"

In addition, the following log message will be seen repeatedly in the PXE booted controller's /var/log/messages file:

etcdmain: rejected connection from "100.65.3.51:50788" (error "tls: \"100.65.3.51\" does not match any of DNSNames [\"etcd3.chassis.local\"] (lookup etcd3.chassis.local on 100.65.3.52:53: no such host)", ServerName "", IPAddresses ["100.65.3.50"], DNSNames ["etcd3.chassis.local"]
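To confirm that a controller is hitting this issue, the two log signatures above can be searched for programmatically. The following is a minimal sketch, not an F5-provided tool: the function names are hypothetical, and the patterns are derived only from the two messages quoted above, so other F5OS-C versions may word them differently.

```python
import re

# Substring of the message quoted from the active controller's
# /var/log/openshift.log.
SCALE_FAILURE = "Failed to scale ETCD instances"

# Pattern for the TLS rejection quoted from the PXE booted controller's
# /var/log/messages. Captures the peer IP address that was rejected.
TLS_REJECTION = re.compile(
    r'rejected connection from "([\d.]+):\d+".*does not match any of DNSNames'
)


def count_scale_failures(openshift_log_text):
    """Count lines that report a failed etcd scale-up attempt."""
    return sum(
        1 for line in openshift_log_text.splitlines() if SCALE_FAILURE in line
    )


def rejected_peers(messages_log_text):
    """Return the set of peer IPs whose connections etcd rejected."""
    peers = set()
    for line in messages_log_text.splitlines():
        match = TLS_REJECTION.search(line)
        if match:
            peers.add(match.group(1))
    return peers
```

Both functions take the log contents as a string, so they can be run against a copy of the log files on any host where Python 3 is available. A non-zero failure count together with a rejected peer in the 100.65.3.x range is consistent with the symptoms described here, but it is not proof of this specific bug.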

Impact

The system cannot add the PXE booted controller to the Kubernetes cluster, and the cluster fails to become healthy.

Conditions

-- Chassis-based system/F5OS-C.
-- PXE booting one of the controllers.

Workaround

Manually edit the /etc/hosts file on the standby controller (i.e., the controller that is being PXE booted).

-- If the active controller is controller-1, ensure the following lines are present:

100.65.3.51 controller-1.chassis.local etcd3.chassis.local
100.65.3.52 controller-2.chassis.local

-- If the active controller is controller-2, ensure the following lines are present:

100.65.3.51 controller-1.chassis.local
100.65.3.52 controller-2.chassis.local etcd3.chassis.local

-- Note: There may already be entries for 100.65.3.51 and 100.65.3.52 in /etc/hosts; replace those entries with the lines above.

An in-progress etcd scale-up operation will continue to fail after the /etc/hosts file is edited. After three failures, the PXE booted controller is removed from the cluster entirely. At that point, a message similar to the following will be observed in the active controller's /var/log/openshift.log:

"Failed to scale ETCD instances after 3 retries.Removing new controllers from the cluster"

The controller will then be added back, and the next run of the etcd scale-up operation should succeed. This part of the process may take up to an hour.
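The /etc/hosts edit described above can be sketched as a small text transformation. This is an illustrative sketch only: the function names are hypothetical, it assumes Python 3 is available wherever the file contents are processed, and it only computes the new file text. Review the result before writing it to /etc/hosts on the controller.

```python
# Controller addresses as given in the workaround text.
CONTROLLER_IPS = {1: "100.65.3.51", 2: "100.65.3.52"}


def desired_hosts_lines(active_controller):
    """Build the two required lines; etcd3 is aliased to the active controller."""
    if active_controller not in CONTROLLER_IPS:
        raise ValueError("active_controller must be 1 or 2")
    lines = {}
    for number, ip in CONTROLLER_IPS.items():
        names = ["controller-%d.chassis.local" % number]
        if number == active_controller:
            names.append("etcd3.chassis.local")
        lines[ip] = "%s %s" % (ip, " ".join(names))
    return lines


def patch_hosts(hosts_text, active_controller):
    """Replace any existing entries for the controller IPs with the required lines."""
    wanted = desired_hosts_lines(active_controller)
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments when reading the address field.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] in wanted:
            continue  # drop the old entry; the required line is appended below
        kept.append(line)
    return "\n".join(kept + list(wanted.values())) + "\n"
```

For example, calling patch_hosts with the current file contents and an active controller of 1 returns text in which 100.65.3.51 carries the etcd3.chassis.local alias and 100.65.3.52 does not. All other lines are kept unchanged, and applying the function a second time produces the same text.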

Fix Information

None
