The other day, I was setting up GitLab Runner to autoscale on AWS to run our CI/CD pipelines. The setup was going to be thus:
- Run a single long-lived EC2 instance (preferably t2.micro) and spin up new ones as needed.
- Do not run any tests on GitLab’s shared runners
- Allow developers to deploy Review Apps
- Deploy Review Apps via Kubernetes
To achieve all of this, the first step was to find a way to use a single EC2 instance and autoscale from there if a lot of pipelines start. GitLab has a documented way to set this up: autoscaling GitLab Runner on AWS EC2.
And so I started. I created a reasonably small EC2 instance and set up Docker, Docker Machine, and GitLab Runner on it. I then configured the Runner to play nice with autoscaling.
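The exact config is not reproduced here, but a minimal sketch of what an autoscaling config.toml can look like follows. Treat every value as a placeholder: the runner name, token, Docker image, and instance type are assumptions for illustration, not the settings I actually used.

```toml
# Sketch of /etc/gitlab-runner/config.toml for the docker+machine executor.
# All values below are hypothetical placeholders.

concurrent = 4                      # max jobs across all runners on this host

[[runners]]
  name = "aws-autoscale-runner"     # hypothetical name
  url = "https://gitlab.com/"
  token = "RUNNER_TOKEN"            # placeholder, issued when registering the runner
  executor = "docker+machine"       # lets the runner create machines on demand

  [runners.docker]
    image = "alpine:latest"         # default image for jobs that do not set one

  [runners.machine]
    MachineDriver = "amazonec2"     # Docker Machine driver for AWS EC2
    MachineName = "gitlab-runner-%s"  # %s is required; it becomes a unique id
    MachineOptions = [
      "amazonec2-instance-type=t2.micro",  # placeholder instance type
    ]
```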
Docker Machine gives nightmares
At first, the region I selected was ap-south-1b, which is where I had created my instance, with a security group making sure it was well isolated. Docker Machine started throwing errors saying “Invalid region specified”. Very puzzling, because I had made no typo! A quick internet search suggested that the exact string ap-south-1b is not recognized by Docker Machine: it names an availability zone, not a region, and each region has to be added in Docker Machine separately. The only way for me was to use ap-south-1, which meant I had to move my instance to ap-south-1a, so I did that.
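For reference, the amazonec2 driver takes the region and the availability zone as two separate options, with the zone given as a single letter. Here is a sketch of how that looks in MachineOptions, assuming you wanted the machines in ap-south-1b. This is not what I ended up doing, since I moved my instance instead.

```toml
[[runners]]
  [runners.machine]
    MachineDriver = "amazonec2"
    MachineOptions = [
      "amazonec2-region=ap-south-1",  # region only, without the zone letter
      "amazonec2-zone=b",             # zone letter on its own (assumed target: ap-south-1b)
    ]
```

As far as I know, the zone falls back to a when the option is left out.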
GitLab Runner config faults out
The next error thrown was requestConcurrency meet, which meant I was not allowed to start so many instances, and I had to change a few params.
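The settings that usually govern how many jobs and machines a runner may handle are concurrent, limit, and request_concurrency. Below is a rough sketch of where they live in config.toml. The numbers are made up for illustration and are not the values from my setup.

```toml
# Hypothetical values, shown only to illustrate where each setting goes.

concurrent = 10            # global cap on jobs running at the same time

[[runners]]
  limit = 10               # cap on jobs (and so machines) for this runner; 0 means no limit
  request_concurrency = 2  # how many job requests to GitLab may be in flight at once
```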
The next error thrown said “Failed to update executor docker+machine for XXXXXXXX No free machines that can process builds”. A lot of searching helped me understand that there is an issue with the way IdleCount works, and I changed that setting.
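For context, IdleCount sits in the [runners.machine] section next to IdleTime. This is a sketch with placeholder values only; the number I actually settled on is not shown here.

```toml
[[runners]]
  [runners.machine]
    IdleCount = 1     # placeholder: machines kept idle and ready for new jobs
    IdleTime = 1800   # placeholder: seconds an idle machine lives before removal
```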
You cannot debug!
The next error thrown, and this one is damn amazing, said the runner limit was met without there being any queued jobs! At this point, I was running the runner from the CLI and not as a service (sudo gitlab-runner --debug run). The problem turned out to be that this runner limit check message is reported as an error when your log level is DEBUG. So, if you do the normal thing, sudo gitlab-runner run, things will work just fine. It took me hours to find out. Here is a link to the discussion: link.
At the end of the day, I got my pipelines running on EC2 and they are indeed autoscaling.