Last Modified: May 28, 2023
Affected Product(s):
F5OS F5OS
Known Affected Versions:
F5OS-A 1.3.0, F5OS-A 1.3.1, F5OS-A 1.3.2
Fixed In:
F5OS-A 1.4.0
Opened: Nov 09, 2022 Severity: 3-Major
On r2x00/r4x00-based systems, tenant launch gets stuck with an error in the ConfD tenant status leaf:

"error adding container to network \"sriov-net3-tenant1\": SRIOV-CNI failed to load netconf: LoadConf(): failed to get VF information: \"lstat /sys/bus/pci/devices/0000:ec:00.7/physfn/net: no such file or directory\""

The VFs (SR-IOV virtual functions) are not listed under a PF (SR-IOV physical function) when you run the following command.

Command: `ip link show <PF>`

The PF can be `x557_1`, `x557_2`, `x557_3`, `x557_4`, `sfp_5`, `sfp_6`, `sfp_7`, or `sfp_8`.

For example, the faulty PF (x557_4 in this case) has no VFs listed, unlike the healthy PF (x557_1 in this case):

    # ip link show x557_4
    18: x557_4: <NO-CARRIER,BROADCAST,MULTICAST,PROMISC,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
        link/ether 14:a9:d0:01:56:8a brd ff:ff:ff:ff:ff:ff
    # ip link show x557_1
    15: x557_1: <NO-CARRIER,BROADCAST,MULTICAST,PROMISC,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
        link/ether 14:a9:d0:01:56:87 brd ff:ff:ff:ff:ff:ff
        vf 2 MAC 00:00:00:00:11:02, spoof checking on, link-state auto, trust off
        vf 2 MAC 00:00:00:00:11:02, spoof checking on, link-state auto, trust off
        vf 2 MAC 00:00:00:00:11:02, spoof checking on, link-state auto, trust off
        vf 2 MAC 00:00:00:00:11:02, spoof checking on, link-state auto, trust off
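The per-PF comparison above can be repeated for several PFs at once. The following is a minimal sketch, not an F5-provided tool: the function names `count_vfs` and `report_pf_vfs` are hypothetical, and the sketch assumes that each VF appears in the `ip link show <PF>` output as a line beginning with `vf <number>`, as in the example output.

```shell
# Count the "vf N ..." lines in `ip link show <PF>` output read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status clean while still printing "0".
count_vfs() {
    grep -c '^[[:space:]]*vf [0-9]' || true
}

# For each PF name given as an argument, print how many VFs it lists,
# or "not found" if `ip link show` cannot find the interface at all.
report_pf_vfs() {
    local pf out
    for pf in "$@"; do
        if out=$(ip link show "$pf" 2>/dev/null); then
            echo "$pf: $(printf '%s\n' "$out" | count_vfs) VF(s)"
        else
            echo "$pf: not found"
        fi
    done
}
```

For example, `report_pf_vfs x557_1 x557_2 x557_3 x557_4 sfp_5 sfp_6 sfp_7 sfp_8` prints one line per PF. A PF reported with 0 VFs or as not found matches the symptom described here.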
Tenant launch fails, and you cannot connect to the tenant through its console or over its management connection.
On r4x00 or r2x00-based systems:
1. The ConfD tenant status leaf reports "LoadConf(): failed to get VF information".
2. The VFs were not created under one or more PFs.
3. One of the entries "x557_1", "x557_2", "x557_3", "x557_4", "sfp_5", "sfp_6", "sfp_7", "sfp_8" is missing from the "/sys/class/net" directory.

For example, if x557_4 is the faulty PF (SR-IOV physical function), `/sys/class/net` does not list x557_4:

    [root@appliance-1 ~]# ls /sys/class/net/x557_4
    ls: cannot access /sys/class/net/x557_4: No such file or directory
    [root@appliance-1 ~]#
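The third condition can be checked for all eight PF names in one pass. This is a minimal sketch under stated assumptions: the function name `missing_pfs` and its optional directory argument are hypothetical and exist only for illustration; the PF names and the `/sys/class/net` path come from this article.

```shell
# PF names listed in this article for r2x00/r4x00 systems.
PF_NAMES="x557_1 x557_2 x557_3 x557_4 sfp_5 sfp_6 sfp_7 sfp_8"

# Print each expected PF that has no entry in the given directory.
# The directory argument is illustrative; it defaults to /sys/class/net.
missing_pfs() {
    local net_dir="${1:-/sys/class/net}"
    local pf
    for pf in $PF_NAMES; do
        if [ ! -e "$net_dir/$pf" ]; then
            echo "$pf"
        fi
    done
}
```

Running `missing_pfs` with no arguments on the host prints the names of any PFs absent from `/sys/class/net`. No output means all eight entries are present.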
Workaround #1
===============
1. Move the tenant(s)' running-state in ConfD to provisioned.
2. Run the "/usr/omd/scripts/config_ice_vfs.sh" script once "/sys/class/net" starts to show the missing PF from the list above.
3. Run "kubectl rollout restart daemonset kube-sriov-device-plugin-amd64 -n kube-system".
4. Move the tenant(s)' running-state in ConfD to deployed.

Workaround #2 (only when second step takes too long)
==================================================
1. If the PF is still not detected in "/sys/class/net" 20 minutes into the second step of Workaround #1, reboot the host to trigger device probing.
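The wait in the second step of Workaround #1, together with the 20-minute limit from Workaround #2, could be scripted roughly as follows. This is a sketch, not an F5-supplied procedure: the function name `wait_for_pf`, the 10-second polling interval, and the optional directory argument are assumptions made for illustration, and the sketch reads "starts to show" as "the PF entry appears in /sys/class/net". The ConfD steps (1 and 4) are not scripted here.

```shell
# Poll until the named PF appears in the net directory.
# Arguments: PF name, timeout in seconds (default 1200 = 20 minutes),
# polling interval in seconds (default 10, an assumed value),
# and the directory to watch (default /sys/class/net).
# Returns 0 once the PF is present, 1 if the timeout is reached first.
wait_for_pf() {
    local pf="$1"
    local timeout="${2:-1200}"
    local interval="${3:-10}"
    local net_dir="${4:-/sys/class/net}"
    local waited=0
    while [ ! -e "$net_dir/$pf" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example use on the host, after moving the tenant(s) to provisioned:
# if wait_for_pf x557_4; then
#     /usr/omd/scripts/config_ice_vfs.sh
#     kubectl rollout restart daemonset kube-sriov-device-plugin-amd64 -n kube-system
# else
#     echo "x557_4 not detected after 20 minutes; see Workaround #2" >&2
# fi
```

Returning a status code instead of rebooting automatically leaves the Workaround #2 decision with the operator.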
The workarounds should fix the tenants' statuses and move them to a running state.