Last Modified: Apr 05, 2023
Affected Product(s):
F5OS (all modules)
Known Affected Versions:
F5OS-A 1.1.0, F5OS-A 1.1.1, F5OS-A 1.3.0, F5OS-A 1.3.1, F5OS-A 1.3.2
Fixed In:
F5OS-A 1.4.0
Opened: Nov 03, 2022 Severity: 2-Critical
A memory leak exists in the ImageAgent process on the F5OS-A host hypervisor layer of rSeries devices. This process manages the software images on the system. When the process grows larger than approximately 2 GB, it may create enough memory pressure to affect scheduling of tenant vCPUs, causing various tenant symptoms that indicate lower performance. These may include (list is not exhaustive):
- dropping sporadic packets
- tmm reporting "Clock advanced" in /var/log/ltm logs
- cores of tenant daemons
- unexpected restart of tenants
- restart of F5OS-A processes
- sluggish manageability of tenant or rSeries host

When the hypervisor layer is nearly out of memory, the Linux kernel may trigger the out-of-memory killer (oom-killer), which may terminate processes, including those that are tenants. If this happens, oom-killer logs showing ImageAgent with a high rss value (~500,000 pages or more) will be present in the host qkview logs, in either of:
- Files > Log > messages, in the iHealth view of the rSeries host qkview
- qkview/subpackages/host-qkview/qkview/filesystem/var/log/messages

For example:

kernel: xxxx invoked oom-killer: ...
kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
kernel: [ 4321] 0 4321 696934 512846 1111 126261 0 ImageAgent

This indicates that ImageAgent uses 512846 4 KB pages of resident memory and 126261 4 KB pages of swap, so about 2 GB resident and 0.5 GB of swap. Typically the process is very small.
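The page-count arithmetic above can be sketched in a few lines of Python, for use when reading a messages log offline. This is a minimal sketch, not an F5 tool: the field order is taken from the header line in the sample log, the 4 KB page size is the one stated in this article, and the names `parse_oom_row` and `pages_to_gib` are hypothetical.

```python
import re

PAGE_SIZE_KIB = 4  # page size as stated in this article (assumption for other systems)

# Field order follows the header line in the sample log:
# [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
OOM_ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+(?P<swapents>\d+)\s+"
    r"(?P<oom_score_adj>-?\d+)\s+(?P<name>\S+)"
)


def pages_to_gib(pages):
    """Convert a count of 4 KB pages to GiB."""
    return pages * PAGE_SIZE_KIB / (1024 * 1024)


def parse_oom_row(line):
    """Return name, pid, resident GiB and swap GiB for one oom-killer task row.

    Returns None for lines that are not task rows (for example the header).
    """
    match = OOM_ROW.search(line)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "pid": int(match.group("pid")),
        "rss_gib": pages_to_gib(int(match.group("rss"))),
        "swap_gib": pages_to_gib(int(match.group("swapents"))),
    }


# The task row from the sample log in this article.
row = parse_oom_row("kernel: [ 4321] 0 4321 696934 512846 1111 126261 0 ImageAgent")
print(row["name"], round(row["rss_gib"], 2), round(row["swap_gib"], 2))
# prints: ImageAgent 1.96 0.48
```

Applied to every line of the messages file, the same function can be used to pick out rows whose name is ImageAgent and whose resident size is unusually large.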
Poor performance or unstable tenants, with possible restarts, including of the rSeries host.
rSeries host running affected F5OS-A version before 1.4.0. Install of software using PXE.
While there is no workaround, the issue can be mitigated. If the leaking ImageAgent process is restarted before it grows too large, it should be possible to avoid symptoms. It is probably best to restart it before it reaches 1 GB of resident memory use (RES or RSS, depending on the utility).

On iHealth you can view this in a host qkview: under Commands, open the system_image_agent folder and click on top. Look at the value under the RES column for the row with a command of /confd/bin/ImageAgent.

Restarting the process should not affect traffic service. To restart the system image agent, log into the rSeries host system as root and run:

docker restart system_image_agent

(N.B. underscores, not hyphens)

After this, there will be various log messages from image-agent in /var/F5/system/log/platform.log:

image-agent[10]: priority="Notice" version=1.0 msgid=0x2001000000000001 msg="Image Agent starting". <---
image-agent[10]: priority="Info" version=1.0 msgid=0x6602000000000005 msg="DB is not ready".
image-agent[10]: priority="Info" version=1.0 msgid=0x6602000000000005 msg="DB is not ready".
image-agent[10]: priority="Info" version=1.0 msgid=0x6602000000000006 msg="DB state monitor started".
image-agent[10]: priority="Info" version=1.0 msgid=0x2005000000000001 msg="Image file added" FILE="BIGIP-15.1.6.1-0.0.10.ALL-F5OS.qcow2.zip.bundle".
image-agent[10]: priority="Info" version=1.0 msgid=0x6602000000000003 msg="DB state is now Active". <---
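The RES check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes the captured top output uses the usual column header containing PID and RES, and that RES is shown in KiB unless it carries a scaling suffix such as m or g. The sample text, the 1 GB limit constant, and the function names are made up for the example and do not come from a real qkview.

```python
RES_LIMIT_KIB = 1024 * 1024  # the 1 GB guideline from this article, in KiB

# Scaling suffixes top may append to memory columns (assumed; plain numbers are KiB).
_SUFFIX_KIB = {"k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}


def res_to_kib(res):
    """Convert a top RES value such as '524288', '800m' or '1.2g' to KiB."""
    res = res.strip().lower()
    if res and res[-1] in _SUFFIX_KIB:
        return float(res[:-1]) * _SUFFIX_KIB[res[-1]]
    return float(res)


def find_res_kib(top_output, command="/confd/bin/ImageAgent"):
    """Return the RES value (KiB) for the first row matching command, or None."""
    res_index = None
    for line in top_output.splitlines():
        fields = line.split()
        if res_index is None:
            # Locate the RES column from the header row.
            if "PID" in fields and "RES" in fields:
                res_index = fields.index("RES")
            continue
        if len(fields) > res_index and command in line:
            return res_to_kib(fields[res_index])
    return None


# Hypothetical top output, for illustration only.
SAMPLE_TOP = """top - 10:00:00 up 30 days
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
   10 root      20   0 2787736   1.2g   9120 S   0.3  3.9  12:34.56 /confd/bin/ImageAgent
"""

res_kib = find_res_kib(SAMPLE_TOP)
if res_kib is not None and res_kib >= RES_LIMIT_KIB:
    print("ImageAgent RES is above 1 GB; consider: docker restart system_image_agent")
```

The sketch only reports the condition and leaves the restart itself to the operator, since the restart command must be run as root on the rSeries host.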
None