I have implemented a DNSSEC monitoring solution which does a number of checks and reports and logs the results. Among others it queries 18.104.22.168 (Cloudflare DNS). This has been running for a couple of months now, so I have a fair amount of data.
Sporadically I get errors where RRSIGs cannot be verified, even for the domain desec.io! I also get errors for another DNS provider when using their DNSSEC solution. Over time the frequency of these errors seems to be increasing.
For example from a report dated 2021.01.12 20:00:58 +0100:
: ERROR: DNSSEC verification failed for 2 TXT records using DNS resolver 22.214.171.124! For desec.io TXT (#2033)
· · ·▷ desec.io. TXT "v=spf1 a mx -all"
· · ·▷ desec.io. TXT "google-site-verification=kHvNl9DPVIQMSbpPgc-j_hZrNTYFxgEcICtgtJaogXA"
· · ·▷ desec.io. RRSIG TXT 8 2 900 20210121000000 20201231000000 32110 desec.io. K41jLast0ud+gc1cicxYmEj7NFjlMA7ayOVuMKu2aaxWaJHdnwBlM2mr OoNsXVdkQAJvqPlIhFXI7uREDQDqXr6EWwktLAE6/Xbhjz3oHYuRticL e/czTnqkD34hxOYtfWQ6cICB979XqKHIwfrt5GzNqxnX1LSGoD/jbteM ZwE=
The same queries to 126.96.36.199 (Google Public DNS) and to the local validating DNS resolver, unbound(8) on OpenBSD 6.8, made at virtually the same time do not result in these errors. That leads me to conclude that Cloudflare DNS is broken w.r.t. DNSSEC.
Unfortunately I have not found any way to contact their support. If someone knows a way to contact them, please let me know.
Anyway, I wanted to let people know that at this time I cannot recommend using Cloudflare DNS when using DNSSEC, or even in general, as more and more domains are DNSSEC secured and may exhibit similar problems leading to outages. Otherwise it has proven to be very fast and reliable, so a deficiency like this is very unfortunate.
The signature that you show above is valid and is the one currently served by our nameservers. You can check it out with dig +dnssec TXT desec.io @ns1.desec.io, and to make sure it’s valid, the same command with @220.127.116.11, as Google Public DNS uses DNSSEC validation. (But only until the next signature rotation, which is scheduled for tomorrow, Thursday around noon UTC.)
Could you provide more information as to why your test says the signature verification fails?
A wild guess could be that other DNS responses that are needed to validate the answer got lost in transit and validation fails due to that missing information.
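One check that needs no further DNS responses is whether the report time falls inside the RRSIG's validity window. The following is a minimal Python sketch of that check, using the RRSIG line and report timestamp quoted in the first post. The function names are made up for illustration, and the parsing assumes dig output produced with +nottl +noclass, as shown above. It only checks the time window, not the cryptographic signature.

```python
from datetime import datetime, timedelta, timezone


def parse_rrsig_time(stamp: str) -> datetime:
    """Parse an RRSIG timestamp in YYYYMMDDHHmmSS presentation format (UTC)."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def rrsig_window(rrsig_line: str):
    """Return (inception, expiration) from an RRSIG line printed by dig.

    Assumes +nottl +noclass output, i.e. "<owner> RRSIG <rdata...>".
    RRSIG rdata order: type covered, algorithm, labels, original TTL,
    expiration, inception, key tag, signer name, signature.
    """
    fields = rrsig_line.split()
    i = fields.index("RRSIG")
    expiration = parse_rrsig_time(fields[i + 5])
    inception = parse_rrsig_time(fields[i + 6])
    return inception, expiration


def within_window(rrsig_line: str, when: datetime) -> bool:
    """True if 'when' lies between signature inception and expiration."""
    inception, expiration = rrsig_window(rrsig_line)
    return inception <= when <= expiration


# RRSIG line from the report (signature data shortened here, it is not needed).
RRSIG_LINE = "desec.io. RRSIG TXT 8 2 900 20210121000000 20201231000000 32110 desec.io. K41jLast0ud+"

# Report time from the first post: 2021.01.12 20:00:58 +0100
REPORT_TIME = datetime(2021, 1, 12, 20, 0, 58, tzinfo=timezone(timedelta(hours=1)))

print(within_window(RRSIG_LINE, REPORT_TIME))
```

If this prints True, an expired or not-yet-valid signature can be ruled out as the cause, which leaves missing responses or a resolver-side problem as candidates.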
My mechanism uses dig(1) to do the actual DNS queries. To be precise: dig 9.10.8-P1 on OpenBSD 6.8 stable
It goes through a number of steps which I’ll leave out for brevity, at some point querying several RRsets using different resolvers.
Basically it uses several resolvers to get a validated answer, e.g. (using the same example RRset as in my original post):
/usr/bin/dig +dnssec +noall +answer +comments +nottl +noclass +rrcomments @'18.104.22.168' 'desec.io' 'TXT'
If that fails to yield an answer for the targeted resolver, i.e. if no TXT records are returned, then a recheck is done with validation disabled (+cd, "checking disabled"):
/usr/bin/dig +dnssec +noall +answer +comments +nottl +noclass +rrcomments +cd @'22.214.171.124' 'desec.io' 'TXT'
If the recheck succeeds then the mechanism concludes that the RRSIG validation must have failed. The output of this last command is included in the failure report, thus the 2 TXT records and the RRSIG record.
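The check-then-recheck logic described above could be sketched roughly as follows in Python. This is an illustration, not the monitoring tool's actual code: the function names and result labels are hypothetical. It also deviates from the description in one respect, as an assumption about what would be useful: it reads the status field from the header that +comments prints, so that a missing response (no header at all) is reported separately from a SERVFAIL, and the +cd recheck is only done after a SERVFAIL.

```python
import re
import subprocess

# Same dig options as in the commands above.
DIG_ARGS = ["+dnssec", "+noall", "+answer", "+comments",
            "+nottl", "+noclass", "+rrcomments"]


def run_dig(resolver, name, rrtype, extra=()):
    """Run dig against one resolver and return its stdout as text."""
    cmd = ["/usr/bin/dig", *DIG_ARGS, *extra, f"@{resolver}", name, rrtype]
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout


def parse_reply(output, rrtype):
    """Return (status, records) from dig output.

    status is the header status (e.g. NOERROR, SERVFAIL), or None if dig
    printed no reply header, which is what happens on a timeout.
    records are the answer lines whose type column equals rrtype.
    """
    match = re.search(r"status: ([A-Z]+)", output)
    status = match.group(1) if match else None
    records = []
    for line in output.splitlines():
        fields = line.split()
        # With +nottl +noclass the type is the second column.
        if not line.startswith(";") and len(fields) > 1 and fields[1] == rrtype:
            records.append(line)
    return status, records


def classify(resolver, name, rrtype, run=run_dig):
    """Classify one lookup; 'run' is injectable so the logic can be tested offline."""
    status, records = parse_reply(run(resolver, name, rrtype), rrtype)
    if records:
        return "ok"
    if status is None:
        return "no-response"          # lost packets or timeout, not a validation result
    if status != "SERVFAIL":
        return f"no-data ({status})"  # e.g. NOERROR with empty answer, NXDOMAIN
    # SERVFAIL: repeat with checking disabled to see whether the data exists.
    _, cd_records = parse_reply(run(resolver, name, rrtype, extra=("+cd",)), rrtype)
    if cd_records:
        return "validation-failure-suspected"
    return "resolver-failure"         # SERVFAIL even with +cd


# Usage (performs real queries, so it is left commented out):
# print(classify("<resolver address>", "desec.io", "TXT"))
```

Keeping "no-response" and "resolver-failure" apart from "validation-failure-suspected" means a lost packet or a SERVFAIL that persists with +cd is not logged as a failed RRSIG verification.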
Your suggestion that lost responses might be to blame is interesting. I will need to check my code to see how it would react in that case, and I may need to add some more debug code to see if that might be the issue. Still, this only happens with the Cloudflare resolver.
I did some testing using tshark (on the server) and Wireshark (to analyze the captured packets). I filtered the capture using port == 53 && host == 188.8.131.52 to capture all DNS traffic to and from the Cloudflare resolver.
I do see lost UDP packets but that seems to be handled internally by dig(1). (I see a resend of the query after ≈5s when no response is received.)
While I did not capture any more problems with desec.io queries, I did see the issues with another domain. In these cases the Cloudflare resolver responds with a server failure, SERVFAIL (dns.flags.rcode == 2), sometimes even when the query is sent with the +cd option and for RRsets where I would expect either a validated answer or a validated negative answer (based on the NSEC3 mechanism). The DNS hoster does have some known problems with their DNSSEC implementation, e.g. CAA records are incorrectly signed, which they explained with “unsupported”.
That being said, I would have put the blame on the wonky DNSSEC implementation of the DNS hoster if not for the fact that I occasionally see the same thing with other domains like
My best guess right now is some weird caching problem at 18.104.22.168 but that guess is not really based on any provable facts.