Using Flux for GitOps

Bare-metal Kubernetes Homelab

In this article, we’re going to install Flux, a popular tool for GitOps. We’re going to discuss reconciliation, where the need for GitOps comes from, and common repository structures for Flux.

So what’s GitOps?

To understand what GitOps is, let’s go back in time to a place before GitOps. Back then, there were development teams and sysadmin teams. Developers put bugs in software and created new features, and the admins’ job was to somehow deploy and monitor the application. This had all the usual problems we discussed in the previous blog post: manual processes are prone to errors. That’s why tools like Ansible or Terraform exist—to declaratively describe the state of the system and automate these tasks.

Now you have YAML files describing the state of the cluster. This is nice—and definitely an improvement!—but how do you know which change caused something to break? How can you roll back to a previous version in case something went horribly wrong?

Enter GitOps: The idea is to just put your infrastructure files in Git. That way, you can version control your infrastructure. You can adopt a software-like development model, where you test out changes in branches and do pull requests to merge them into your production environment. You know who in your team changed what (git blame), and in case something breaks, you can git revert the bad change.
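
As a rough illustration of that last point (the commit hash is of course a placeholder), rolling back a bad infrastructure change is just the ordinary Git workflow:

$ git log --oneline           # find the commit that introduced the bad change
$ git revert <bad-commit>     # create a new commit that undoes it
$ git push                    # the GitOps tooling then rolls the cluster back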

Kubernetes plays a special role in this conversation due to its API-driven design: you already describe the state of the cluster, so it seems like a natural fit. The Git repository acts as a single source of truth, while tools like Flux ensure that the cluster’s actual state matches the desired state stored in Git.

For those new to K8s and GitOps: Imagine recreating a cluster from scratch. If you have tens of applications running on the cluster, modifying them all manually can be overwhelming. A properly set-up Flux repository makes this process almost effortless because you can just apply the repository you created.

A last advantage I want to highlight is automatic updates: If you’re running containerized workloads right now, how many of them are up to date? Are you aware of critical CVEs? If you have your entire cluster state in a repository, you can use automatic scanners, like Renovate Bot, to open pull requests that update your containers to the newest version.

Flux: GitOps on Kubernetes

At the time of writing, there are two main tools for doing GitOps on Kubernetes: Flux and ArgoCD. Both projects work fantastically well, and I highly recommend both. For this project though, we’re going to go with Flux.

Challenge time

If you don’t mind deviating from the beaten path, do try out ArgoCD with the rest of these posts and let me know how it goes!

Getting started with Flux

To get started with Flux, the first thing we need to do is to install the Flux CLI. In my case, I can use winget. If you’re using another OS, check out the docs for specific instructions.

> winget install -e --id FluxCD.Flux
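
If you’re on macOS or Linux instead, the docs offer Homebrew and an official install script; something along these lines should work (do check the docs for the current instructions):

$ brew install fluxcd/tap/flux                      # via Homebrew
$ curl -s https://fluxcd.io/install.sh | sudo bash  # or the official install script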

Next, we’re going to bootstrap the cluster. What this means is installing the Flux controllers into our K8s cluster, pushing the manifests to our Git Repository, and then reconciling the state between the cluster and the repository.

Let’s run a quick sanity check to see if our cluster will support Flux:

$ flux check --pre

All good? Let’s get started bootstrapping the cluster then!

For the rest of this post I’m assuming you’re using GitHub. If that happens not to be the case, check out the docs for other Git providers.

To get started with GitHub, create a private repo. Then, create a Personal Access Token (PAT) so that Flux can read from and write to the repository. Visit the Fine-grained personal access tokens page to create one:

  • Give it a useful name.
  • Choose the Only select repositories setting, and pick the repository you just created.
  • Give the token Read and write access to Administration and Contents.

Best Practice: Principle of Least Privilege

You’re creating something here that’s effectively a password to your GitHub Account. Please scope the PAT down as much as possible.

Next, we can bootstrap flux like this:

$ export GITHUB_TOKEN=<gh-token>
$ flux bootstrap github \
    --token-auth \
    --owner=<my-github-username> \
    --repository=<my-repository-name> \
    --branch=main \
    --path=clusters/<my-cluster> \
    --personal

When this is done: Congratulations, Flux is running in your cluster!
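
If you’d like to double-check that everything came up healthy before moving on, one way to do it looks like this:

$ flux check                       # verifies the Flux controllers and CRDs in the cluster
$ kubectl get pods -n flux-system  # all Flux controller pods should be Running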

Intermission: Kubernetes YAML manifests

Before we set up Flux further, we should discuss Kubernetes manifests. Feel free to skip ahead if you’re already familiar.

As we discussed before, Kubernetes works declaratively. This means you somehow describe the state of the cluster, and the control plane takes care of actually getting to that state. So let’s shed some light on the “somehow”.

To quote from the docs:

Kubernetes objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster. Specifically, they can describe:

  • What containerized applications are running (and on which nodes)
  • The resources available to those applications
  • The policies around how those applications behave, such as restart policies, upgrades, and fault-tolerance

A Kubernetes object is a “record of intent”—once you create the object, the Kubernetes system will constantly work to ensure that the object exists. By creating an object, you’re effectively telling the Kubernetes system what you want your cluster’s workload to look like; this is your cluster’s desired state.

These objects are created using the API. There are various clients that can interface with the API. The most useful one for us right now, however, is kubectl, the official Kubernetes CLI.

For (almost) all objects, the API has two nested fields: the spec, which defines the desired state, and the status, which reflects the current state¹ (for a deeper understanding of the API conventions, check out the contributor documentation). We see the status field when we use kubectl to get or describe a resource.

Similarly, to create or update an object, we need to provide the spec field. With kubectl, you can either give it all required values as command-line parameters (e.g., kubectl create secret generic my-secret --from-literal=key1=supersecret --from-literal=key2=topsecret) or define everything in a YAML file (called a manifest), which kubectl translates to JSON in the background:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

This YAML manifest describes a Deployment resource. It gives it some metadata (a name) and then specifies it in the spec section. If we were to kubectl apply -f this, kubectl would transform it to JSON and call the appropriate API endpoints to create the resource.
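
For example, assuming the manifest above is saved as nginx-deployment.yaml, creating the Deployment and then inspecting its spec and status could look like this:

$ kubectl apply -f nginx-deployment.yaml            # create (or update) the Deployment
$ kubectl get deployment nginx-deployment -o yaml   # shows both .spec and .status
$ kubectl describe deployment nginx-deployment      # human-readable view, including events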

Kustomizations are a special type of manifest because they allow you to layer different YAML files over each other in a structured way. They provide a way to customize and reuse Kubernetes manifests without modifying the original YAML files directly: you define a “base” configuration and then apply environment-specific or cluster-specific changes as overlays, which avoids duplicating YAML files for different setups.
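
To make the base/overlay idea concrete, here is a minimal sketch (the file paths and the nginx names are purely illustrative): a base kustomization.yaml lists the raw manifests, and an overlay pulls in that base and overrides a few values without touching the original files.

# base/kustomization.yaml: the shared default configuration
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml

# overlays/staging/kustomization.yaml: reuses the base and tweaks it for staging
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
images:
  - name: nginx              # pin a different image tag in this environment only
    newTag: "1.25.3"
replicas:
  - name: nginx-deployment   # run fewer replicas here
    count: 1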

Flux repository structures

Let’s now look at the repository that Flux created for us. We can see that it’s almost empty, just having a clusters/<your cluster name> folder with a couple of files. So how would we actually deploy something into the cluster? Luckily, the documentation mentions a couple of ways:

  • Monorepo: All Kubernetes manifests are stored in a single Git repository. Different environments (e.g., staging, production) are defined as separate overlays in the same branch. The separation of apps and infrastructure allows reconciliation in a specific order. The folder structure looks something like this:
    ├── apps
    │   ├── base
    │   ├── production
    │   └── staging
    ├── infrastructure
    │   ├── base
    │   ├── production
    │   └── staging
    └── clusters
        ├── production
        └── staging
    
  • Repo per environment: Each environment has its own repository. This allows you to limit who has access to which environment (typically moot in homelab scenarios). You promote changes between environments using pull requests.
  • Repo per team: Here, each team gets its own repository, of which it is the sole owner. Again, this is probably overkill for a homelab.
  • Repo per app: You can also have one repository per app that you deploy. This makes sense in larger organizations where teams own their workloads.

I’d strongly suggest going down the Monorepo route for a homelab, unless you have a very good reason to deviate from that. It boils down to this: the added flexibility you get from the other models doesn’t outweigh the management overhead of implementing them. This, however, might change should you choose to share your homelab with, e.g., your friends or partner(s). In that case you may want to isolate parts and only give access to certain repositories.

Implementing the Monorepo

If you haven’t already, clone your private Git repository. Create the following folder structure:

├── apps
│   ├── base
│   └── <your cluster>
├── infrastructure
│   ├── configs
│   └── controllers
└── clusters
    └── <your cluster>

From a high-level point of view, we’re going to build three main Kustomizations:

  1. Infra Controllers are tools that are immediately needed to run the cluster. Examples would be our logging infrastructure, our monitoring tools, our storage driver or the load balancer.
  2. Infra Configs is the configuration of some of those tools. For example, our load balancer needs to know which CIDR blocks to operate in, and does this through custom resources.
  3. Apps finally contains all applications that are running in our cluster. This Kustomization has a base layer of a “good” default for each app, as well as a folder per cluster, so that you can overwrite some values. A simple example: You have two clusters, one for dev and one for prod. You can use the <your dev cluster> and <your prod cluster> folders to overwrite the specific container images used, the URLs or the container limits, while keeping the rest identical.

Of course, the implicit assumption is that your infrastructure doesn’t change between your different clusters. If you plan on having more than one cluster with different configurations, you can adapt the folder structure from /apps to /infrastructure as well.
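
As a sketch of where this is heading once you add a first app (podinfo is just a placeholder name), the apps tree could end up looking like this:

apps
├── base
│   └── podinfo
│       ├── kustomization.yaml       # lists deployment.yaml, service.yaml, ...
│       ├── deployment.yaml
│       └── service.yaml
└── <your cluster>
    ├── kustomization.yaml           # resources: ../base/podinfo, plus patches
    └── podinfo-patch.yaml           # cluster-specific overrides (image tag, URL, limits, ...)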

To get started, in the clusters/<your cluster> folder, create infrastructure.yml:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infra-controllers
  namespace: flux-system
spec:
  interval: 1h
  retryInterval: 1m
  timeout: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/controllers
  prune: true
  wait: true
---

Here we tell Flux about the existence of a Kustomization resource. Its name is infra-controllers and it lives in the flux-system namespace. In order of appearance, we set the following fields:

  1. interval — How often the Kustomization is fetched from the source and reconciled. In our case, a value of one hour is fine; you can set this to a lower value if required, but don’t set it below 60 seconds. The interval needs to be parseable as a Go duration (e.g. 60s, 10m, 1h).
  2. retryInterval — If reconciliation fails, the interval at which to try again.
  3. timeout — A timeout for building, applying and health checks during the reconciliation process.
  4. sourceRef — From where Flux can fetch the Kustomization. Since we’re using a monorepo, we reference the flux repository we created earlier.
  5. path — The path to the Kustomization within the repository.
  6. prune — Determines if objects that were created by this Kustomization, but are now missing, should be deleted or kept.
  7. wait — Halts until all reconciled resources have successful health checks.

To get to the previously discussed structure, add another resource to the file:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infra-configs
  namespace: flux-system
spec:
  dependsOn:
    - name: infra-controllers
  interval: 1h
  retryInterval: 1m
  timeout: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/configs
  prune: true

In the same folder, create an apps.yml file with this content:

# =============== apps.yml ===============
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m0s
  dependsOn:
    - name: infra-configs
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/<your cluster>
  prune: true
  wait: true
  timeout: 5m0s

For both new files, note the dependsOn field. I also expect apps to change more frequently, which is why I have reduced the interval to 10 minutes.

Finally, create a kustomization.yml with the content below in each of the /infrastructure/configs, /infrastructure/controllers, /apps/base and /apps/<your cluster> folders (the last one matters because the apps Kustomization points at ./apps/<your cluster>, and Git doesn’t track empty directories):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources: []

Commit your changes and push them. You can then either take a nice walk and wait for Flux to recognize the changes automatically, or force a reconciliation with flux reconcile source git flux-system.

Let’s see what changed!

# Check which sources are available
> flux get sources git 
NAME            REVISION                SUSPENDED       READY   MESSAGE
flux-system     main@sha1:XXXXXXXX      False           True    ...

# Check if our new kustomizations were picked up
> flux get kustomization
NAME                    REVISION                SUSPENDED       READY   MESSAGE
apps                    main@sha1:XXXXXXXX      False           True    ...
flux-system             main@sha1:XXXXXXXX      False           True    ...
infra-configs           main@sha1:XXXXXXXX      False           True    ...
infra-controllers       main@sha1:XXXXXXXX      False           True    ...

You should see that Flux has successfully picked up on our new kustomizations and successfully applied them to the cluster. Right now this is somewhat pointless, because they’re empty. This is going to change in the next article though, when we deploy MetalLB. See you then!

Footnotes

  1. Note that I show the objects in YAML notation while the API technically deals in JSON.

Next article

Setting up MetalLB