Incus Fleeting Plugin for GitLab Autoscaler

By Paul Kramme

Published on: May 4, 2024 | Reading Time: 4 min | Last Modified: May 4, 2024

Incus Fleeting Plugin

Over the last few days I’ve built a new fleeting plugin for Incus, the fork of Canonical’s LXD. With the plugin it’s now possible to use Incus hypervisors with the GitLab autoscaler runner. The autoscaler runner creates and destroys VMs based on configured limits, and is mostly cloud and hypervisor agnostic. You can find the project here:
github.com/pkramme/fleeting-plugin-incus

GitLab’s autoscaler runner

GitLab introduced a new way to create autoscaling CI/CD infrastructure with the “fleeting” library and their new autoscaler runner. Officially, GitLab supports the big three cloud providers: AWS, Google Cloud and Azure. The fleeting system is supposed to replace the long-abandoned docker-machine runner, which previously was the only way to let a gitlab-runner create and destroy VMs based on the number of jobs and within certain limits.

Unfortunately, until now the problem of private CI/CD infrastructure wasn’t really solved. The old docker-machine driver was technically able to create VMs on private infrastructure with docker-machine specific plugins. The new fleeting system unofficially supports other systems as well, such as Kubernetes, but I wasn’t able to find plugins for any private on-prem hypervisor backends.

A plugin is a standalone binary, which is loaded by the gitlab-runner process. The implementation follows a client-server model, where the plugin imports a specific library for communicating with the fleeting system, in this case the runner. The runner, through the fleeting library, can then request new instances, ask for connection details or destroy existing VMs.

Why autoscaling outside of clouds?

So why build something new? Scaling for load isn’t the only thing the autoscaler does. Without the new autoscaler or the old docker-machine runners, some workloads cannot be safely executed. In particular, building Docker images with DinD inside GitLab CI requires the --privileged flag, which gives the CI job effectively complete control over the host system. If you run these kinds of jobs on a GitLab instance with multiple privilege levels without using different runners per level and project, you are asking for trouble. Even then, jobs running after a runner was compromised will obviously be unsafe, too.

Previously you could either

  1. set up many runners
  2. use docker-machine with its drawbacks
  3. use kaniko
  4. roll the dice

Now you can use the autoscaler with its plugins to create a new VM for every job and be reasonably safe from supply chain attacks or bad actors. And with this plugin you can do this on private infrastructure.

Other benefits

  • Compared to one docker based runner
    • Better resource isolation through the usage of VMs, no priority inversions or memory stalls affecting other jobs
    • Better security through the usage of VMs instead of Linux namespace features
  • Compared to public clouds
    • Depending on the hardware used, much better performance in all regards
    • Better security through dedicated and/or on-prem infrastructure, no data leaks, no side-channel attacks, etc.
    • Much better cost effectiveness, including not being affected by “run-away” usage costs
    • Possibly better compliance with local regulations, such as GDPR
  • Compared to k8s based runners
    • All default runner features work and migration is possible without any code changes to any pipelines
    • DinD style workloads are possible

Installation and Configuration

The installation is simple, because fleeting plugins are just binaries that are executed by gitlab-runner.

git clone https://github.com/pkramme/fleeting-plugin-incus
cd fleeting-plugin-incus/cmd/fleeting-plugin-incus
go build
install fleeting-plugin-incus /usr/local/bin/fleeting-plugin-incus

The plugin also needs a VM image it can start. Create a base VM with incus.

# generate SSH keys
ssh-keygen -t ed25519

# create a base VM
incus launch images:ubuntu/22.04 runner-base --vm
incus exec runner-base bash

Then, prepare the VM.

# inside the VM
apt-get update && apt-get install openssh-server docker.io nano -y

# deploy your generated ssh public key inside VM
mkdir -p ~/.ssh/
nano ~/.ssh/authorized_keys

# shutdown
poweroff
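
If you’d rather not paste the key by hand in nano, the key deployment step can be scripted. This is only a minimal sketch and not part of the plugin: install_authorized_key is a hypothetical helper name, and it assumes you have already copied your public key file into the VM. It applies the conventional 700/600 permissions to ~/.ssh and authorized_keys and skips the key if it is already present.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to <home>/.ssh/authorized_keys,
# creating the directory and file with the conventional permissions.
install_authorized_key() {
    pubkey_file="$1"            # path to the .pub file
    home_dir="${2:-$HOME}"      # target home directory, defaults to $HOME

    [ -r "$pubkey_file" ] || { echo "cannot read $pubkey_file" >&2; return 1; }

    mkdir -p "$home_dir/.ssh"
    chmod 700 "$home_dir/.ssh"

    touch "$home_dir/.ssh/authorized_keys"
    # Only append the key if that exact line is not already present.
    if ! grep -qxF -f "$pubkey_file" "$home_dir/.ssh/authorized_keys"; then
        cat "$pubkey_file" >> "$home_dir/.ssh/authorized_keys"
    fi
    chmod 600 "$home_dir/.ssh/authorized_keys"
}
```

Inside the VM you would call it as install_authorized_key /root/id_ed25519.pub /root (the path of the copied key is an assumption) and then power off as before.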

Now, publish the VM image.

incus publish runner-base --alias runner-base --reuse

After that, configure the plugin according to the GitLab documentation. The plugin supports some configuration options. You need to set the base image to the alias you published above.

    [runners.autoscaler.plugin_config]
      incus_image = "runner-base"
      incus_instance_key_path = "/root/.ssh/id_ed25519"
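
To show where this table sits, here is a sketch of a complete runner entry, written to a scratch file so it never touches your real config.toml. The layout follows the autoscaler examples in the GitLab documentation, but several values are assumptions on my part: the plugin name (taken to be the binary name installed earlier), the docker-autoscaler executor, the root username (because the key was deployed for root in the base VM), and all limits, which are purely illustrative. URL and token are placeholders.

```shell
#!/bin/sh
# Write an example runner config to a scratch file for review.
# Nothing here touches /etc/gitlab-runner/config.toml.
out="${CONFIG_SKETCH:-./config.toml.example}"

cat > "$out" <<'EOF'
concurrent = 4                          # illustrative value

[[runners]]
  name = "incus-autoscaler"             # placeholder
  url = "https://gitlab.example.com"    # placeholder
  token = "REPLACE_ME"                  # placeholder
  executor = "docker-autoscaler"        # assumption: Docker jobs inside each VM

  [runners.docker]
    image = "alpine:latest"             # illustrative default job image

  [runners.autoscaler]
    plugin = "fleeting-plugin-incus"    # assumption: binary name as installed
    capacity_per_instance = 1           # one job per VM at a time
    max_use_count = 1                   # discard each VM after a single job
    max_instances = 4                   # illustrative upper limit

    [runners.autoscaler.plugin_config]
      incus_image = "runner-base"
      incus_instance_key_path = "/root/.ssh/id_ed25519"

    [runners.autoscaler.connector_config]
      username = "root"                 # assumption: key was deployed for root

    [[runners.autoscaler.policy]]
      idle_count = 1                    # illustrative: keep one warm VM
      idle_time = "20m0s"
EOF

echo "wrote $out"
```

Setting max_use_count to 1 is what gives you a fresh VM for every job; raise it only if you are fine with jobs sharing a VM.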

After a systemctl restart gitlab-runner you should see something like this:

root@gitlab-runner-2:~# incus ls runner -f compact
        NAME          STATE            IPV4                                 IPV6                             TYPE        SNAPSHOTS
  runner-kelflllsrb  RUNNING  172.17.0.1 (docker0)     fd42:7bc9:1310:295c:216:3eff:fe02:9cd7 (enp5s0)  VIRTUAL-MACHINE  0
                              10.199.247.164 (enp5s0)
  runner-mhuwwvpfrd  RUNNING  172.17.0.1 (docker0)     fd42:7bc9:1310:295c:216:3eff:fe5c:b388 (enp5s0)  VIRTUAL-MACHINE  0
                              10.199.247.45 (enp5s0)
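
If you want to keep an eye on how many VMs the autoscaler currently has running, the same listing can be consumed by a script. This is a small sketch: count_running is a hypothetical helper that reads name,state pairs as CSV on stdin, which is what incus ls prints with the n and s columns in csv format.

```shell
#!/bin/sh
# Count instances in the RUNNING state from "name,state" CSV on stdin.
count_running() {
    awk -F, '$2 == "RUNNING" { n++ } END { print n + 0 }'
}

# Usage on the runner host (requires incus, so left commented out here):
#   incus ls runner -c ns -f csv | count_running
```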

Additionally, I suggest you configure docker registry pull-through caches and S3 for caching.
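
For the pull-through cache, one option is to point the Docker daemon in the base VM at a registry mirror before you publish the image. The sketch below only writes an example daemon.json to a scratch file. The mirror URL is a placeholder for whatever cache you run, and note that Docker's registry-mirrors setting only applies to images pulled from Docker Hub.

```shell
#!/bin/sh
# Write a Docker daemon config that points at a pull-through cache.
# The mirror URL is a placeholder; the target defaults to a scratch file.
mirror="${REGISTRY_MIRROR:-https://registry-mirror.example.internal}"
target="${DAEMON_JSON:-./daemon.json.example}"

cat > "$target" <<EOF
{
  "registry-mirrors": ["$mirror"]
}
EOF

echo "wrote $target"
```

After reviewing it, you would place the file at /etc/docker/daemon.json inside the base VM before powering it off and publishing the image.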