I’m getting my X.509 certificates from Let’s Encrypt and I use the DNS-01 challenge method for authentication. This has been working nicely for a few years, but I recently switched my ACME client to lego and I have occasional issues with authentication.
Here’s what I’m observing:
lego sends the CSR, receives a challenge and publishes the response via the deSEC API.
lego repeatedly queries the authoritative nameservers (ns1.desec.io and ns2.desec.org) until they respond with a NOERROR code.
lego asks Let’s Encrypt to verify authentication.
Let’s Encrypt fails with the following error: acme: error: 400 :: urn:ietf:params:acme:error:dns :: DNS problem: NXDOMAIN looking up TXT for _acme-challenge.example.com - check that a DNS record exists for this domain
Apparently, lego gets a valid reply from the nameservers but Let’s Encrypt does not.
I guess this has to do with record propagation in deSEC’s anycast network: Authoritative nameservers near me may have the new record while nameservers near Let’s Encrypt may not.
My questions:
What would be the best way to handle this? lego does not support configuring a delay between querying the authoritative nameservers and triggering the CA to complete authorization.
How long does global record propagation usually take?
Can I somehow check if propagation of a newly added record has completed? I didn’t find anything in the API docs.
The proper way is to wait a bit longer, or retry automatically until it works.
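To illustrate the "retry automatically" option, here is a minimal Python sketch. It is not lego's behaviour or API. `request_validation` is a hypothetical callable that stands for one complete issuance attempt by your ACME client and returns True on success; the attempt count and the 70 second wait are assumed values.

```python
import time
from typing import Callable


def retry_until_valid(
    request_validation: Callable[[], bool],
    attempts: int = 5,               # assumed upper bound, adjust to taste
    wait_seconds: float = 70.0,      # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call request_validation until it succeeds.

    Returns the number of the attempt that succeeded (starting at 1) and
    raises RuntimeError if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if request_validation():
            return attempt
        # Only pause when another attempt will follow.
        if attempt < attempts:
            sleep(wait_seconds)
    raise RuntimeError(f"validation still failing after {attempts} attempts")
```

Keep the number of attempts small, since every failed validation also counts against the CA's rate limits.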
Normally, it is just a few seconds. However, due to this issue, ad-hoc notifications to our global secondaries currently are ineffective, and our fall-back mechanism triggers all updates instead. The fall-back mechanism checks freshness once a minute, and depending on when exactly your update happened, it may be discovered right away, or after approximately one minute. There should be only very few cases which take longer.
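As a rough model of that timing, the following Python sketch assumes the freshness check fires at fixed ticks, exactly once per `check_interval` seconds. That is a simplification for illustration, not a description of how the fall-back is actually scheduled. The function names and the 10 second safety margin are hypothetical.

```python
def fallback_delay(update_time: float, check_interval: float = 60.0) -> float:
    """Seconds until the next periodic check notices an update.

    update_time is measured in seconds on the same clock as the checks,
    which are assumed to fire at 0, check_interval, 2 * check_interval, ...
    """
    return (-update_time) % check_interval


def suggested_wait(check_interval: float = 60.0, margin: float = 10.0) -> float:
    """Worst-case delay plus an assumed safety margin, in seconds."""
    return check_interval + margin
```

Under these assumptions, an update made one second after a check waits 59 seconds, while one made just before a check is picked up almost immediately. Waiting the full interval plus a margin covers both.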
This is currently not possible, but it seems like a reasonable feature! We’d appreciate a feature request on our GitHub.
Here’s another (not very elegant) workaround: You could delegate your _acme-challenge subdomain to another DNS provider which doesn’t have an anycast deployment. In such a setup, you will see the same state as Let’s Encrypt, so you can proceed with validation immediately once you observe that the challenge has been published.
Thank you, I think that helps.
If I can currently rely on one minute, I’ll just set lego’s propagation polling interval to slightly more than one minute. Seems like a reasonable and simple workaround to me.