Erratic 502 Bad Gateway responses on update.dedyn.io

Just a quick update: The mitigations put in place by the deSEC team yesterday have lowered the frequency of the 502 responses to ≈1/day (down from ≈18/day) in my testing. Tests were performed using cron(1) */5 without any additional delays.

I have some suggestions to further improve this on the client side:

Premise: Dynamic IPs typically change once a day or less. The exception might be after an outage or when the Internet router is restarted.

Depending on your client, either react only to changes (as would likely be the case for built-in clients on Internet routers) or use a client that polls less frequently, updates only when a public IP has actually changed, and randomises the exact update time.

Detailed suggestions

The details of how to accomplish this vary depending on your operating system, so the next points have to stay somewhat vague.

  • Polling every 5 minutes, or less often, is probably sufficient in most cases. If you know that your public IPs tend to change during the night, i.e. at a time when you are likely asleep and won’t care about a short outage, then use a longer interval.

  • If you use cron(1) to trigger the checks, randomise the time. For example, ~/5 * * * * picks a random offset within the 5-minute interval, so the job might run at …:07:00, …:12:00, … instead of …:05:00, …:10:00, etc. Not every cron implementation supports the ~ syntax, so see your crontab(5) man page for details. Even better, use something like ~/11 * * * * to get a longer, randomised interval that also shifts.

  • Additionally, add a random number of seconds to delay execution when using cron(1). cron(1) executes its jobs more or less exactly on a full minute, and since most hosts use NTP servers to synchronise their clocks, this translates to high server load at these times. Use e.g. sleep(1) with a random integer generated using jot(1), shuf(1), awk(1), or whatever method is available on your platform to get a random integer in the range of, say, 10 to 50 for use as the parameter for sleep(1).
    Some examples:

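    • sleep $(awk -v min=10 -v max=50 -v pid=$$ 'BEGIN{srand(); srand(srand()+pid); print int(min+rand()*(max-min+1))}') (the shell’s PID is mixed into the seed because a bare srand() seeds from the clock in whole seconds, so hosts started by cron in the same second could all pick the same delay)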
    • sleep $(jot -r 1 10 50)
    • sleep $(shuf -i 10-50 -n 1)
    • And if you are using something other than a *NIX shell (Python, PHP, perl, …) you have even more options.
  • Then have your client compare the current public IPs to the ones currently set in the DNS records. Only call the IP Update API if there actually was a change. And verify that your method to determine a public IP actually yielded a valid IP before using the result. The same goes for the DNS query. Don’t trigger an update based on faulty or incomplete data!

  • Make sure you check the result of the update request. If it is not good or the HTTP status was not 200 then something went wrong and you need to retry the update. Don’t retry too often! Use the same judicious approach to retries that you used for the original update attempt. Ideally back off using growing delays between retries (up to a maximum delay).

  • It takes time for the results of a successful update request to propagate. There is a delay before the update server has modified the authoritative nameservers. Then there is the DNS TTL of 60 seconds, which allows caches to keep serving the stale data. So after a successful update, don’t check again for a while. A 2-minute delay is probably sufficient, but a 10-minute delay would not cause much grief either. The longer the delay, the less traffic and workload you generate.
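
To make the last few points more concrete, here is a rough POSIX sh sketch of one polling cycle: validate both lookups, skip the update when nothing changed, check the result, and back off between retries. It is only an illustration under some assumptions of mine. get_public_ip, get_dns_ip and send_update are hypothetical hooks that you would have to write for your own setup (they are not part of any deSEC tooling), the check only handles IPv4, a successful response is assumed to be HTTP 200 with a body starting with "good", and the defaults (60 s initial delay, 3600 s maximum, 5 tries) are arbitrary.

```shell
#!/bin/sh
# Sketch only. get_public_ip, get_dns_ip and send_update are hypothetical
# hooks that you define yourself; send_update must print "<http_status> <body>".

# is_ipv4 ADDR: succeed only if ADDR is four dot-separated numbers in 0-255.
is_ipv4() {
    case $1 in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;  # empty, stray characters, misplaced dots
    esac
    old_ifs=$IFS; IFS=.
    set -- $1          # split into octets (only digits and dots are left here)
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ ${#octet} -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# update_ok STATUS BODY: assumption: success is HTTP 200 plus a "good..." body.
update_ok() {
    [ "$1" = 200 ] || return 1
    case $2 in
        good*) return 0 ;;
        *)     return 1 ;;
    esac
}

# next_delay CURRENT MAX: double the delay, but never exceed MAX.
next_delay() {
    doubled=$(( $1 * 2 ))
    if [ "$doubled" -gt "$2" ]; then echo "$2"; else echo "$doubled"; fi
}

# update_with_backoff IP: retry with growing delays, give up after MAX_TRIES.
update_with_backoff() {
    delay=${RETRY_START:-60}
    tries=0
    while :; do
        response=$(send_update "$1")
        # Split "<status> <body>" at the first space.
        if update_ok "${response%% *}" "${response#* }"; then
            return 0
        fi
        tries=$(( tries + 1 ))
        if [ "$tries" -ge "${MAX_TRIES:-5}" ]; then
            return 1
        fi
        sleep "$delay"
        delay=$(next_delay "$delay" "${RETRY_MAX:-3600}")
    done
}

# run_once: one polling cycle; never updates based on faulty or unchanged data.
run_once() {
    current=$(get_public_ip) || return 1
    published=$(get_dns_ip) || return 1
    if ! is_ipv4 "$current" || ! is_ipv4 "$published"; then
        echo "skipping: a lookup returned invalid data" >&2
        return 1
    fi
    if [ "$current" = "$published" ]; then
        return 0       # unchanged, so do not contact the update server at all
    fi
    update_with_backoff "$current"
}
```

You would define the three hooks (for example as thin wrappers around whatever HTTP and DNS tools you already use), then call run_once from your cron job after the random sleep. Giving up after a fixed number of tries is my own addition; with cron the next scheduled run picks the work up again anyway.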
