Bug ID 969553: DNS cache returns SERVFAIL

Last Modified: Jul 27, 2021


Affected Product:
BIG-IP DNS, GTM, LTM (all modules)

Known Affected Versions:
13.1.0, 13.1.0.1, 13.1.0.2, 13.1.0.3, 13.1.0.4, 13.1.0.5, 13.1.0.6, 13.1.0.7, 13.1.0.8, 13.1.1, 13.1.1.2, 13.1.1.3, 13.1.1.4, 13.1.1.5, 13.1.3, 13.1.3.1, 13.1.3.2, 13.1.3.3, 13.1.3.4, 13.1.3.5, 13.1.3.6, 13.1.4, 13.1.4.1, 14.1.0, 14.1.0.1, 14.1.0.2, 14.1.0.3, 14.1.0.5, 14.1.0.6, 14.1.2, 14.1.2.1, 14.1.2.2, 14.1.2.3, 14.1.2.4, 14.1.2.5, 14.1.2.6, 14.1.2.7, 14.1.2.8, 14.1.3, 14.1.3.1, 14.1.4, 14.1.4.1, 14.1.4.2, 14.1.4.3, 15.1.0, 15.1.0.1, 15.1.0.2, 15.1.0.3, 15.1.0.4, 15.1.0.5, 15.1.1, 15.1.2, 15.1.2.1, 15.1.3, 15.1.3.1, 16.1.0

Opened: Dec 04, 2020
Severity: 3-Major

Symptoms

- A DNS cache (or net dns-resolver) returns SERVFAIL responses to clients, despite the BIG-IP system receiving a good (albeit delayed) response from upstream servers.
- When this happens, the BIG-IP system can be seen rejecting the responses from the upstream servers with ICMP errors (Destination unreachable - Port unreachable).
- If the db key dnscacheresolver.loglevel is set to debug5, the following error message is visible in the /var/log/ltm file when this issue occurs:

  debug tmm[13147]: DNScache: request example.com. has exceeded the maximum number of glue fetches 17 to a single delegation point

If a net dns-resolver is used with an HTTP explicit proxy, the symptom can appear as a "503 Service Unavailable" response caused by the DNS lookup failure.
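
The debug5 message described above can be searched for programmatically when reviewing a copy of /var/log/ltm. The following is only a minimal sketch in Python: the function name find_glue_fetch_errors and the regular expression are illustrative assumptions built from the single sample message quoted in this article, not an F5-provided tool, and the exact log prefix (timestamp, hostname, tmm instance name) may differ on a given system.

```python
import re

# Assumed pattern, derived only from the sample message in this article.
# "tmm\d*" allows for numbered tmm instances; the PID and domain vary per line.
GLUE_FETCH_RE = re.compile(
    r"tmm\d*\[\d+\]: DNScache: request (?P<name>\S+) has exceeded "
    r"the maximum number of glue fetches (?P<count>\d+) "
    r"to a single delegation point"
)


def find_glue_fetch_errors(lines):
    """Return a list of (domain, fetch_count) tuples, one per matching line.

    `lines` is any iterable of strings, for example an open log file.
    Hypothetical helper for illustration only.
    """
    hits = []
    for line in lines:
        match = GLUE_FETCH_RE.search(line)
        if match:
            hits.append((match.group("name"), int(match.group("count"))))
    return hits
```

Passing an open file handle for the log (for example, one opened on a copy of /var/log/ltm) yields the domains whose lookups hit the glue-fetch limit, which can help confirm whether SERVFAIL responses line up with this issue.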

Impact

Clients of the BIG-IP DNS cache do not receive an answer. As a result, application failures may occur.

Conditions

This issue occurs when the following conditions are met:
- A DNS cache is in use on the BIG-IP system.
- The DNS cache is configured with a forward-zone that uses multiple servers to perform resolutions.
- The RTT of the servers fluctuates. For example, the servers are generally fast to reply for most domains, but take extra time to reply for a given domain.
- 'Randomize Query Character Case' is enabled in the DNS cache.

If the requests for the domain take a long time to resolve, BIG-IP may reply with SERVFAIL.

Workaround

You can work around this issue by changing 'Randomize Query Character Case' to 'No' in the DNS cache or net dns-resolver settings.

Fix Information

None

Behavior Change