
Nabeel Sulieman

Traefik: Adding Cloud-Based Cert Storage

2019-11-16

Traefik is great. It's easy to set up and has sensible default settings that let you get up and running really fast. With just a couple of steps, you can deploy it as your Kubernetes Ingress controller, and it will automatically create and manage Let's Encrypt certificates right out of the box.

There is one downside to the Let's Encrypt functionality in Traefik: the community version only supports storing certificates on disk, so multiple instances can't easily share them. Therefore, if you want to use Traefik's Let's Encrypt functionality, you have to either upgrade to the enterprise edition or run a single instance of the service, which makes it a single point of failure.

For example, in my case, I have Traefik set up to store the certificates on a mounted storage volume. If the node were to go down, that volume would have to be detached from the node and re-attached to a new node before Traefik could be started up again. Moving volumes, at least in DigitalOcean's Kubernetes service, is not instantaneous. The result is more downtime than we really should have.

Interestingly, the older version of Traefik did support storing certs in etcd, Consul and a few other options. But even after spending hours carefully following the instructions, I was never able to get that working. Then Traefik 2 came out and dropped support for that altogether, presumably because it was unreliable and hard to get working correctly.

I got to thinking about how annoying it was to set up my own Consul deployment. And for what? A few kilobytes of certificate data? That seems like overkill. It would be much easier if I could just use an existing cloud storage solution for this purpose.

Why not use AWS or Azure to store the certificates? Then I wouldn't have to worry about managing the storage service myself. Latency isn't an issue here, since this isn't a data application that needs that type of performance. Most of the time you're loading the certs into memory once and using them to handle requests. You only need to renew a certificate once every few months.

For this reason, I've decided to look into updating Traefik so that it uses Azure Table Storage. Why Azure Table Storage? Mostly because I personally find Table Storage easier to use than DynamoDB. I also happen to work at Microsoft, so there's that.

But the goal is to make this modification clear enough that anyone can further extend it to any other type of storage.
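
To make the idea of a swappable backend a bit more concrete, here is a minimal sketch in Go (the language Traefik is written in). Everything in it is hypothetical: CertStore, MemoryStore and CachedStore are names I made up for illustration, not Traefik's actual interfaces. The in-memory store stands in for a cloud backend such as Table Storage, and the caching wrapper reflects the point above that certs are mostly loaded once and then served from memory.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotFound is returned when no certificate is stored for a domain.
var ErrNotFound = errors.New("certificate not found")

// Certificate holds the PEM-encoded material for one domain.
type Certificate struct {
	Domain  string
	CertPEM []byte
	KeyPEM  []byte
}

// CertStore is a hypothetical storage interface (not Traefik's real one).
// Any backend (disk, Azure Table Storage, DynamoDB, ...) would implement it.
type CertStore interface {
	Load(domain string) (Certificate, error)
	Save(cert Certificate) error
}

// MemoryStore is an in-memory stand-in for a remote cloud backend.
type MemoryStore struct {
	mu    sync.RWMutex
	certs map[string]Certificate
}

func NewMemoryStore() *MemoryStore {
	return &MemoryStore{certs: make(map[string]Certificate)}
}

func (s *MemoryStore) Load(domain string) (Certificate, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	cert, ok := s.certs[domain]
	if !ok {
		return Certificate{}, ErrNotFound
	}
	return cert, nil
}

func (s *MemoryStore) Save(cert Certificate) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.certs[cert.Domain] = cert
	return nil
}

// CachedStore wraps a (possibly slow) backend and keeps every certificate
// it has loaded or saved in memory, so repeated loads skip the backend.
type CachedStore struct {
	backend CertStore
	mu      sync.Mutex
	cache   map[string]Certificate
}

func NewCachedStore(backend CertStore) *CachedStore {
	return &CachedStore{backend: backend, cache: make(map[string]Certificate)}
}

func (c *CachedStore) Load(domain string) (Certificate, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cert, ok := c.cache[domain]; ok {
		return cert, nil
	}
	cert, err := c.backend.Load(domain)
	if err != nil {
		return Certificate{}, err
	}
	c.cache[domain] = cert
	return cert, nil
}

func (c *CachedStore) Save(cert Certificate) error {
	// Write to the backend first so the cache never holds unsaved data.
	if err := c.backend.Save(cert); err != nil {
		return err
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cache[cert.Domain] = cert
	return nil
}

func main() {
	store := NewCachedStore(NewMemoryStore())
	_ = store.Save(Certificate{Domain: "example.com", CertPEM: []byte("cert"), KeyPEM: []byte("key")})
	cert, err := store.Load("example.com")
	fmt.Println(cert.Domain, err)
}
```

With a shape like this, supporting another storage service would only mean writing one more type with Load and Save methods. The real change to Traefik may well end up looking different.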

Stay tuned for updates...