Bug ID 1185369: F5OS rSeries appliances will not launch tenants after upgrade to F5OS-A 1.3.0

Last Modified: May 29, 2024

Affected Product(s):
F5OS F5OS (all modules)

Known Affected Versions:
F5OS-A 1.3.0

Fixed In:
F5OS-A 1.4.0, F5OS-A 1.3.1

Opened: Nov 02, 2022

Severity: 1-Blocking

Symptoms

After an upgrade to F5OS-A 1.3.0, the system will not be able to deploy tenants. Even if the system software is reverted to the previous version, the issue remains.

The system may report a tenant status such as the following:
- Tenant deployment failed
- Server is not responding
- 0/1 nodes are available: 1 node(s) didn't match Pod's node affinity/selector.

The "show cluster cluster-status" command will report that the cluster is not ready:

cluster cluster-status summary-status "1 Appliance is NOT ready, K3S cluster is NOT ready."

There will be error messages in /var/log/messages that mention "x509: certificate signed by unknown authority", for instance:

k3s: E1102 16:50:48.340717 44106 kuberuntime_manager.go:790] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to setup network for sandbox \"5ba7aa29305335ce0b6a87b48a570b292f90e1f42f2a2b4ae4fff90a96a55df7\": Multus: [kube-system/klipper-lb-8cht8]: error getting pod: Get \"https://[100.75.0.1]:443/api/v1/namespaces/kube-system/pods/klipper-lb-8cht8?timeout=1m0s\": x509: certificate signed by unknown authority" pod="kube-system/klipper-lb-8cht8"
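The log symptom described above can be checked with a small shell helper. This is a minimal sketch, not an F5-provided tool: the function name has_x509_error is hypothetical, and the default path assumes the log is readable at /var/log/messages from a root shell.

```shell
# Hypothetical helper (not part of F5OS): succeeds (exit status 0) when the
# given log file contains the certificate error string quoted in this article.
# Usage: has_x509_error [LOG_FILE]   (defaults to /var/log/messages)
has_x509_error() {
    # -q: print nothing, report only via exit status
    # -F: treat the pattern as a fixed string, not a regular expression
    grep -qF -- 'x509: certificate signed by unknown authority' "${1:-/var/log/messages}"
}
```

For example, running `has_x509_error && echo "certificate error found"` from a root shell prints the message only when the string is present. A match on its own does not prove the system is affected; it should be read together with the cluster status.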

Impact

The system is unable to deploy tenants. Even if the system is reverted to the previous software version, the issue remains and the system will be unable to launch tenants.

Conditions

- F5OS rSeries appliance
- System upgraded to F5OS-A 1.3.0 for the first time

Workaround

Once a system is affected, the fix is to reinstall the Kubernetes cluster. This procedure will take about 10 minutes and will not affect the configuration or data of the tenants.

1. Log in to the rSeries appliance CLI with the root account.

2. To identify whether the setup is in an error state, check for the string "x509: certificate signed by unknown authority" in /var/log/messages, or confirm that the K3S cluster is not healthy and running.

3. Change all deployed tenants to a provisioned state.

4. Stop the appliance_orchestration_manager service by running the following command:

systemctl stop appliance_orchestration_manager_container

5. Uninstall K3S by running the following commands:

k3s-uninstall.sh
rm /var/omd/* /tmp/omd/tokens/* /tmp/omd/appliance-ansible-host

6. Start the appliance_orchestration_manager service by running the following command:

systemctl start appliance_orchestration_manager_container

7. Wait about 10 minutes.

8. From the F5OS CLI (log in as admin), check the cluster status:

show cluster install-status ; show cluster cluster-status

The cluster-status should be "K3S cluster is initialized and ready for use".

From a root shell, check that "kubectl get pods -A" shows running containers in both the "kube-system" and "kubevirt" namespaces.
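The final "kubectl get pods -A" check can be scripted rather than read by eye. The following is a minimal sketch under stated assumptions: the function name namespaces_have_running_pods is hypothetical, and it assumes the default kubectl column layout, in which the namespace is the first field and STATUS is the fourth.

```shell
# Hypothetical helper (not part of F5OS): reads `kubectl get pods -A` output
# on stdin and succeeds only if both the kube-system and kubevirt namespaces
# have at least one pod whose STATUS column is "Running".
# Assumes the default columns: NAMESPACE NAME READY STATUS RESTARTS AGE.
namespaces_have_running_pods() {
    awk '
        $4 == "Running" { seen[$1] = 1 }   # remember namespaces with a running pod
        END { exit !(seen["kube-system"] && seen["kubevirt"]) }
    '
}
```

From a root shell this could be used as `kubectl get pods -A | namespaces_have_running_pods && echo "both namespaces have running pods"`. It checks only for at least one running pod per namespace, so the full listing should still be reviewed for pods stuck in other states.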

Fix Information

N/A

Behavior Change

Guides & references

K10134038: F5 Bug Tracker Filter Names and Tips