API rate limits in the context of Terraform/OpenTofu/Ansible/etc. automation

Hello,

I’m in the process of automating DNS with the Terraform module provided by Valodim/desec. My question is not about the Terraform module itself, but about the API rate limiting that any automation would run into.

I started by defining resources for my domain and 6 deSEC API tokens using the module above. Each token has a policy attached with at least 1 rule in addition to the imposed default rule. For my setup, that makes 1 domain + 6 tokens + 6 default policies + 8 custom policies >= 21 managed resources. I wrote >= because I also plan to add RRsets as Terraform resources.

When I run terraform plan it first refreshes the current state by issuing API requests. Due to this, I run into API rate limits within seconds every time I run terraform plan or terraform apply, which is quite often now that I’m in the process of automating DNS.

$ time tofu plan
module.desec.desec_token.token["traefik-proxy-ogd"]: Refreshing state... [id=76d946a2-6387-42b2-9dde-25511cf501af]
module.desec.desec_token.token["ddclient-ogd"]: Refreshing state... [id=2a531e71-2432-4fcb-9b7e-01baf576f105]
module.desec.desec_token.token["ddclient-home"]: Refreshing state... [id=f738da25-a38f-4d4c-aa48-fc9f869ba59c]
module.desec.desec_token.token["traefik-proxy-home"]: Refreshing state... [id=eb663c10-23ce-48c7-b96c-bc46cba33e5b]
module.desec.desec_token.token["pve-home"]: Refreshing state... [id=53f2d2ec-b906-4c76-a4e7-b24ab6578a2b]
module.desec.desec_token.token["pve-ogd"]: Refreshing state... [id=7871e57b-ccdf-4cfb-90da-8aa940640782]
module.desec.desec_domain.trs: Refreshing state... [id=therightstuff.de]
module.desec.desec_token_policy.default["ddclient-ogd"]: Refreshing state... [id=4c9cf4db-8eb6-488e-ace0-a409e29b3da1]
module.desec.desec_token_policy.default["traefik-proxy-home"]: Refreshing state... [id=6865b1e8-1b36-4c61-b8a3-53a9e5100610]
module.desec.desec_token_policy.default["traefik-proxy-ogd"]: Refreshing state... [id=014c372a-16a9-470f-9fb0-8a4c307eb973]
module.desec.desec_token_policy.writable["pve-home.0"]: Refreshing state... [id=f140a16e-2f58-414d-aa3e-99ca7c99af05]
module.desec.desec_token_policy.writable["ddclient-ogd.1"]: Refreshing state... [id=a0e608a2-16c2-4c41-9061-3768baac1b26]
module.desec.desec_token_policy.default["ddclient-home"]: Refreshing state... [id=87b896a0-8266-4d89-8ffe-6df3ea3d9353]
module.desec.desec_token_policy.default["pve-home"]: Refreshing state... [id=208e4769-1af3-46bb-b275-d6546850a3eb]
module.desec.desec_token_policy.default["pve-ogd"]: Refreshing state... [id=364906ff-7e33-4549-b189-f3aab514792f]
module.desec.desec_token_policy.writable["traefik-proxy-home.0"]: Refreshing state... [id=9b27e4e0-675c-41ab-8ce3-26b1c24b21bd]
module.desec.desec_token_policy.writable["ddclient-home.1"]: Refreshing state... [id=ac986144-7131-473a-a7bf-110ed9a2296e]
module.desec.desec_token_policy.writable["traefik-proxy-ogd.0"]: Refreshing state... [id=252b6fca-3cdc-444b-bb6a-943fdf070cd9]
module.desec.desec_token_policy.writable["ddclient-ogd.0"]: Refreshing state... [id=e041f0af-625e-4deb-a9c3-619210056d6f]
module.desec.desec_token_policy.writable["pve-ogd.0"]: Refreshing state... [id=2163fc33-f9de-4237-9079-6a92c5428a24]
module.desec.desec_token_policy.writable["ddclient-home.0"]: Refreshing state... [id=502b570e-d6c0-453e-a4c6-83a90c7f534c]

No changes. Your infrastructure matches the configuration.

OpenTofu has compared your real infrastructure against your configuration and found no differences, so no changes are needed.
tofu plan  0.57s user 0.27s system 1% cpu 1:01.51 total

Developing with delays of at least a minute is neither fun nor productive.

I’m not a Terraform expert, but I found no option to run my desec module only on demand. Putting count = 0 at the module level makes Terraform want to destroy all desec resources:

module "desec" {
  source = "./modules/desec"
  count = 0 # This will destroy all resources after taking >=1 minute to refresh.
}

@Valodim Did you manage to work around this?

I’m looking forward to ideas on how to solve this. Is there maybe a process to get the API limits raised?

Thanks,

Alex

There is currently no procedure to raise limits per account. However, we are happy to raise the limits globally if we see a use case aligned with deSEC’s mission that otherwise cannot be supported and if we have enough resources to support it.

Please let us know what requests you’re sending and why the bulk API doesn’t provide a solution for your use case.

Please also refer to the docs for additional info on the rate limits: Rate Limits — deSEC DNS API documentation

Thanks,

Nils


Please let us know what requests you’re sending

I think it would be nice if the limit for GET requests were raised. As far as I can tell, Terraform refreshes the state of all resources at startup and issues a GET for every declared resource.

I’m not even touching RRset resources yet. Just the 6 tokens + their attached policies cause a minute of delay. I think that’s hitting the account_management_passive limits with these URLs:

The URLs above are a bit of guesswork because I use the Terraform module, which internally uses another library to talk to the API.

I understand that resource changes might be more expensive and hence have lower limits.

and why the bulk API doesn’t provide a solution for your use case

Regarding bulk requests, I found this comment by the Terraform module author:

However, Terraform doesn’t offer any mechanisms for batching, so this would require a hack in the provider to do.

Any insights on the topic? Is automation one of deSEC’s goals?

I’ve submitted a commit to increase the account_management_passive rate limit from 10/min to 50/min. A new hourly limit will be added which will be equivalent to the previous effective hourly limit (600/h).

Note that there is also a per-user rate limit of 2000/day for all requests.
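
To put the numbers above into perspective, here is a rough back-of-the-envelope sketch in Python. It assumes one GET per managed resource on every refresh (as described earlier in the thread) and that only the 20 token and token policy lookups count against account_management_passive. Both are assumptions rather than confirmed behavior, and the waiting model is deliberately simplified: each batch of allowed requests goes through immediately, then the client waits one full window.

```python
def min_wait_seconds(requests: int, limit: int, window_s: int = 60) -> int:
    """Lower bound on waiting time when `requests` calls are sent back-to-back
    against a limit of `limit` calls per `window_s` seconds (simplified model)."""
    if requests <= 0:
        return 0
    # Every full batch of `limit` calls before the last one costs one window.
    return ((requests - 1) // limit) * window_s


def full_refreshes(budget: int, requests_per_refresh: int) -> int:
    """How many complete state refreshes fit into a request budget."""
    return budget // requests_per_refresh


# Resource counts taken from the first post; the split by limit scope is an assumption.
TOKEN_AND_POLICY_GETS = 6 + 6 + 8          # tokens + default policies + custom policies
ALL_GETS = TOKEN_AND_POLICY_GETS + 1       # plus the domain itself

old_wait = min_wait_seconds(TOKEN_AND_POLICY_GETS, limit=10)  # 10/min
new_wait = min_wait_seconds(TOKEN_AND_POLICY_GETS, limit=50)  # 50/min
per_hour = full_refreshes(600, TOKEN_AND_POLICY_GETS)         # 600/h limit
per_day = full_refreshes(2000, ALL_GETS)                      # 2000/day, all requests

print(f"wait at 10/min: {old_wait}s, wait at 50/min: {new_wait}s")
print(f"refreshes per hour: {per_hour}, refreshes per day: {per_day}")
```

Under these assumptions, a refresh of this size no longer has to wait at 50/min, while the hourly and daily limits still cap how many plan or apply runs fit into a working session. Adding RRsets as resources would increase the per-refresh request count accordingly.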

We’re not willing to fully bear the costs of that limitation, hence the rate limits.

Note that writing a record to a domain causes the domain to get re-signed. Creating three records in a row will cause the domain to get re-signed three times. That’s quite expensive, and doesn’t scale if everyone does it. Hence, we offer bulk requests to circumvent this problem.
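
As an illustration of the bulk approach mentioned above, here is a minimal Python sketch that packs three record changes into a single PATCH to the domain's rrsets/ endpoint instead of sending three separate writes. The domain, token, and record values are placeholders, and the request is only constructed, not sent; please check the bulk operations section of the deSEC API documentation for the exact semantics before relying on it.

```python
import json
import urllib.request

API_BASE = "https://desec.io/api/v1"
DOMAIN = "example.com"       # placeholder domain
TOKEN = "your-api-token"     # placeholder token

# Three RRsets that would otherwise be written with three separate requests.
rrsets = [
    {"subname": "www", "type": "A", "ttl": 3600, "records": ["192.0.2.1"]},
    {"subname": "www", "type": "AAAA", "ttl": 3600, "records": ["2001:db8::1"]},
    {"subname": "", "type": "TXT", "ttl": 3600, "records": ['"hello"']},
]


def build_bulk_request(domain: str, token: str, rrsets: list) -> urllib.request.Request:
    """Build (but do not send) one PATCH request carrying all RRsets as a JSON array."""
    return urllib.request.Request(
        url=f"{API_BASE}/domains/{domain}/rrsets/",
        data=json.dumps(rrsets).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


request = build_bulk_request(DOMAIN, TOKEN, rrsets)
print(request.get_method(), request.full_url)
print(len(json.loads(request.data)), "RRsets in one request")
# To actually send it: urllib.request.urlopen(request)
```

Whether a Terraform provider can group its writes this way is a separate question, as noted in the quoted comment from the module author.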

Sure. It’s worth distinguishing what exactly is to be automated:

Our mission is to improve Internet security by making DNSSEC more automated in the domain industry overall. We develop automation standards (e.g., RFC 9615, RFC 9859, and soon-to-be RFCs here and here), implement them to showcase their feasibility, and convince other players to do the same. For example, we’ve convinced Cloudflare to implement RFC 9615.

Other things can be automated, too, such as DNS record management. We make a reasonable effort there, because we want to be a decent DNS provider. However, remember that it’s free, and our focus lies elsewhere – automating enterprise use cases is not a goal (unless specifically related to DNS security).

Of course, that’s not to say that we don’t appreciate enterprise users; just managing expectations.

As a community-driven platform, we’ll be happy to take on contributions for improvements. For example, if you can afford to spend the engineering hours to switch our signing pipeline from domain signing to incremental record signing, that would solve the performance issues described above, and we’d very much appreciate such a contribution (and would then relax rate limits).

Thank you for understanding!

Stay secure,
Peter
