Moving from Docker Compose files to a solution like Nomad + Consul + Vault + JuiceFS
For a long time I managed most of the apps in my homelab and on my server with docker-compose files, since that was easy to use and made it simple to back up the data. Updates were easy as well: pull the latest image and restart the container. But a homelab is not a homelab if you don't try new things, so I decided to try something new. I had also wanted to try Ansible for a long time, so I used it to automate the setup of my new server.
Spoiler: After setting up my homelab I migrated my "production" server to the "Hashistack" as well. You are currently looking at the result of that migration.
Warning: It took me a week, on and off, to read the documentation for Nomad, Consul, Vault and JuiceFS and to prepare the Ansible playbook the way I wanted it to be.
I strongly recommend you take your time.
What my "needs" were
- I want orchestration for my applications and I want to be able to scale them if needed.
- I also want to be able to easily back up my data and restore it if needed.
- I also want to be able to easily update my applications.
- I want to be able to easily add new applications.
Why I chose Nomad, Consul, Vault and not Kubernetes, Docker Swarm, ...
I chose Nomad because it is easy to use and very flexible. Compared to Kubernetes, the learning curve is not as steep and it is easier to set up. Consul handles service discovery and the service mesh. Vault stores secrets and certificates. All three tools work very well together. Some people say Kubernetes complements them, so I still have the option to switch to Kubernetes if I want to. Docker Swarm is not as flexible as Nomad, and I don't want to use Docker Swarm in production.
Why I chose Ansible and not Saltstack, Puppet, Chef, ...
I chose Ansible because it is easy to use and agentless. I don't want to install an agent on my server. While I might use a bastion host in the future, I don't want a master server that becomes a single point of failure. The only requirements are Python 3 and SSH, which are already installed on most Linux distributions. Ansible also integrates very well with Terraform, which I might use in the future.
Why I chose Traefik and not Nginx, ...
I chose Traefik because it is easy to use, and it integrates very well with Nomad and Consul. It is also very easy to configure and it supports Let's Encrypt out of the box. Traefik also supports TCP and UDP load balancing which is a big plus for me.
Everything else
I chose JuiceFS as a distributed file system for my data. To support JuiceFS I chose Minio as an S3-compatible object storage and Redis as a metadata store. Without a distributed file system it would be hard to scale the applications, and Nomad would not be able to schedule applications that need their data on different nodes. The cluster transport is secured using WireGuard. I chose Docker as the container runtime since it is easy to use and I already have experience with it.
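To make the relationship between the three storage pieces concrete, here is a minimal Python sketch that builds the `juicefs format` and `juicefs mount` calls for a Minio bucket and a Redis metadata store. It is not taken from my role; the function name, hostnames, volume name and mount point are hypothetical placeholders, and the commands are only built, not executed.

```python
def juicefs_commands(meta_url, name, bucket_url, access_key, secret_key, mount_point):
    """Build (but do not run) the JuiceFS CLI calls for a Minio + Redis backed volume."""
    # 'format' creates the volume: object data goes to Minio, metadata to Redis.
    format_cmd = [
        "juicefs", "format",
        "--storage", "minio",
        "--bucket", bucket_url,
        "--access-key", access_key,
        "--secret-key", secret_key,
        meta_url,
        name,
    ]
    # 'mount' only needs the metadata URL; the storage settings are read from it.
    mount_cmd = ["juicefs", "mount", "--background", meta_url, mount_point]
    return format_cmd, mount_cmd


# Hypothetical values, for illustration only.
format_cmd, mount_cmd = juicefs_commands(
    meta_url="redis://redis.example.internal:6379/1",
    name="homelab",
    bucket_url="http://minio.example.internal:9000/juicefs",
    access_key="CHANGE_ME",
    secret_key="CHANGE_ME",
    mount_point="/mnt/jfs",
)
print(" ".join(format_cmd))
print(" ".join(mount_cmd))
```

Each node that should run stateful jobs mounts the same volume, which is what lets Nomad place a job on any of them.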
Backups
With this setup we need to back up the following things:
- Minio data: a basic backup of the data directory, or alternatively a mirror
- Redis data: enable RDB and AOF and configure snapshots
- Nomad state: create a snapshot using the CLI
- Consul state: create a snapshot using the CLI
- Vault needs no separate backup, since Vault uses Consul's KV store as its storage backend
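The two snapshot steps in the list can be scripted. Below is a minimal Python sketch, assuming the `nomad` and `consul` binaries are on the PATH and that the addresses and tokens they need are already set in the environment. The function names and the file naming scheme are hypothetical.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def snapshot_commands(backup_dir: Path, stamp: str) -> list[list[str]]:
    """Build the CLI calls for a Nomad and a Consul snapshot (nothing is executed here)."""
    return [
        ["nomad", "operator", "snapshot", "save", str(backup_dir / f"nomad-{stamp}.snap")],
        ["consul", "snapshot", "save", str(backup_dir / f"consul-{stamp}.snap")],
    ]


def run_snapshots(backup_dir: Path) -> None:
    """Create the backup directory and run both snapshot commands, stopping at the first failure."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    for cmd in snapshot_commands(backup_dir, stamp):
        subprocess.run(cmd, check=True)
```

Calling `run_snapshots(Path("/var/backups/hashistack"))` (a hypothetical path) from a cron job or a periodic Nomad job would produce one timestamped snapshot file per tool. The Consul snapshot is the one that also covers the Vault data.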
Ansible roles
While I won't share my playbooks, since I think everyone should write their own to be able to truly understand what is happening, I will share the roles I used and what they do.
- ssh_hardening - Hardens the SSH server and copies the Ansible SSH key to the server.
- devsec.hardening.os_hardening - Hardens the system.
- hifis.unattended_upgrades - Installs and configures unattended upgrades.
- geerlingguy.pip - Installs pip.
- geerlingguy.docker - Installs Docker.
- ansible-consul - Installs and configures Consul.
- ansible-vault - Installs and configures Vault.
- ansible-nomad - Installs and configures Nomad.
- ansible_role_wireguard - Installs and configures WireGuard.
I wrote my own role for JuiceFS + Minio + Redis.
Things to consider
If you use Ansible to render, for example, Nomad job templates, you need to properly escape '{{' and '}}' in your templates. Otherwise, the expressions will be rendered by Ansible and not by Nomad. This is especially important for jobs with secrets, where the Vault lookups must reach Nomad untouched.
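One way to do the escaping is Jinja's `{% raw %}` block, which Ansible passes through unrendered. The following Python sketch only illustrates what the escaped text looks like; the helper name, the secret path and the variable name are hypothetical. In a real playbook you would write the raw tags directly into the template file.

```python
def wrap_raw(nomad_template: str) -> str:
    """Wrap text in Jinja raw tags so Ansible leaves the inner '{{ }}' for Nomad."""
    if "endraw" in nomad_template:
        # A nested endraw tag would close the raw block too early.
        raise ValueError("text already contains an endraw tag")
    return "{% raw %}" + nomad_template + "{% endraw %}"


# Hypothetical line from a Nomad template block that reads a secret from Vault.
secret_line = 'DB_PASSWORD={{ with secret "kv/data/app" }}{{ .Data.data.password }}{{ end }}'
print(wrap_raw(secret_line))
```

Anything outside the raw block is still rendered by Ansible, so playbook variables and Nomad template expressions can live in the same file.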
Low latency is important for almost all the services. If you have high latency between your nodes you might experience problems.
Retrospective
I am very happy with the result. I learned a lot about Ansible, Nomad, Consul, Vault and Traefik, as well as JuiceFS, Minio and Redis. I am also very happy with the performance of the cluster. I can deploy new applications in seconds and scale them if needed, and I can easily back up my data and restore it if needed.
While overkill for a homelab and even for my very small production environment, I am very happy with the result. Ansible is invaluable for me and I will use it for all my future projects and work, since it is easy to use and makes provisioning and configuration management a breeze.