Bug ID 557098: Correlation is continuously restarted with 'An instance with pid xxxx is already running' error in the ltm log

Last Modified: Nov 07, 2022


Affected Product:
BIG-IP ASM(all modules)

Known Affected Versions:
11.6.0, 11.6.0 HF1, 11.6.0 HF2, 11.6.0 HF3, 11.6.0 HF4, 11.6.0 HF5, 11.6.0 HF6, 11.6.0 HF7, 11.6.0 HF8, 11.6.1, 11.6.1 HF1, 11.6.1 HF2, 12.0.0, 12.0.0 HF1, 12.0.0 HF2, 12.0.0 HF3, 12.0.0 HF4

Fixed In:
12.1.0, 11.6.2

Opened: Nov 09, 2015
Severity: 4-Minor
Related Article:
K80251813

Symptoms

Correlation is continuously restarted with these errors:

/var/log/asm:
----------
ASM subsystem error (asm_start,F5::NwdUtils::Nwd::log_failure): Watchdog detected failure for process. Process name: correlation, Failure: Insufficient number of threads (required: 1, found: 0).
----------

/var/log/ltm:
----------
err correlation[xxxx]: 01560000:3: An instance with pid yyyyy is already running
----------
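A quick way to check a system for both messages is to count matching lines in the two logs. The following is a minimal sketch, not an F5-supplied tool: the function name `count_correlation_errors` is hypothetical, and it assumes each message appears on a single log line in the form quoted above.

```shell
# Count log lines matching the two messages described in this bug.
# Arguments: path to the ASM log, path to the LTM log.
# Assumption: each message sits on one line, as quoted in this article.
count_correlation_errors() {
    local asm_log="$1" ltm_log="$2" asm_hits ltm_hits

    # grep -c prints the number of matching lines; a missing file yields
    # empty output, which is treated as 0 below.
    asm_hits=$(grep -c 'Process name: correlation, Failure: Insufficient number of threads' "$asm_log" 2>/dev/null)
    ltm_hits=$(grep -c 'correlation\[.*An instance with pid .* is already running' "$ltm_log" 2>/dev/null)
    asm_hits=${asm_hits:-0}
    ltm_hits=${ltm_hits:-0}

    echo "asm=$asm_hits ltm=$ltm_hits"

    # Non-zero exit status when either message was found.
    [ "$asm_hits" -eq 0 ] && [ "$ltm_hits" -eq 0 ]
}
```

On an affected system this would be called as `count_correlation_errors /var/log/asm /var/log/ltm`; counts that keep growing between runs are consistent with the restart loop described here.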

Impact

Correlation is continuously restarted.

Conditions

ASM is provisioned.

Workaround

Here is a workaround. All commands should be executed as the root user, on the CLI of the affected BIG-IP system:

1) Create a backup of the file '/usr/share/ts/config/asm_processes.yaml' as '/usr/share/ts/config/asm_processes.yaml.orig':
# cp /usr/share/ts/config/asm_processes.yaml /usr/share/ts/config/asm_processes.yaml.orig

2) Patch the yaml file (Note: The spaces in the command are significant.):
# perl -pi -e 's/correlation:/correlation:\n pid_file: \/shared\/tmp\/correlation.pid/' /usr/share/ts/config/asm_processes.yaml

3) Diff the original file against the patched file:
# diff -C 1 /usr/share/ts/config/asm_processes.yaml.orig /usr/share/ts/config/asm_processes.yaml

4) Validate that the diff generated in the previous step is exactly as follows (spaces are significant, timestamps will differ):
*** /usr/share/ts/config/asm_processes.yaml.orig 2015-11-09 10:12:06.000000000 -0500
--- /usr/share/ts/config/asm_processes.yaml 2015-11-09 10:12:14.000000000 -0500
***************
*** 33,34 ****
--- 33,35 ----
correlation:
+ pid_file: /shared/tmp/correlation.pid
exec_method: system

5) Restart ASM:
# bigstart restart asm

6) Make sure that the BIG-IP system is Active and that there are no 'Watchdog detected failure for process' errors in '/var/log/asm'.

7) Monitor the logs for these errors, which should no longer appear:
-----------------------------------------
/var/log/asm:
----------
ASM subsystem error (asm_start,F5::NwdUtils::Nwd::log_failure): Watchdog detected failure for process. Process name: correlation, Failure: Insufficient number of threads (required: 1, found: 0).
----------
/var/log/ltm:
----------
An instance with pid xxxx is already running
----------
-----------------------------------------

Note that step (5) is disruptive and will cause the BIG-IP system to go 'Offline' for a short period of time.

If you need to revert the workaround:
-----------------------------------------
# mv /usr/share/ts/config/asm_processes.yaml.orig /usr/share/ts/config/asm_processes.yaml
# bigstart restart asm
-----------------------------------------
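As a complement to the diff check in step 4, it can help to confirm that the pid_file entry occurs exactly once, because running the perl command from step 2 a second time inserts the line again. The following is a minimal sketch under that assumption; the function name `check_pid_file_entry` is hypothetical and not part of any F5 tooling.

```shell
# Report whether a copy of asm_processes.yaml contains the pid_file entry
# added by the workaround: "missing", "present" (exactly once), or "duplicated".
# Argument: path to the yaml file to inspect.
check_pid_file_entry() {
    local yaml="$1" count

    # -F matches the string literally; -c prints the number of matching lines.
    count=$(grep -F -c 'pid_file: /shared/tmp/correlation.pid' "$yaml" 2>/dev/null)
    count=${count:-0}

    case "$count" in
        0) echo "missing" ;;
        1) echo "present" ;;
        *) echo "duplicated" ;;
    esac
}
```

For example, `check_pid_file_entry /usr/share/ts/config/asm_processes.yaml` should print "present" after step 2 and "missing" after the workaround is reverted. It only counts occurrences and does not validate indentation, so it does not replace the diff comparison in step 4.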

Fix Information

We have resolved an issue with the correlation daemon startup sequence so that it does not go into a restart loop.

Behavior Change