In this article, we’re going to install Flux, a popular tool for GitOps. We’re going to discuss reconciliation, where the need for GitOps comes from, and common repository structures for Flux.
So what’s GitOps?
To understand what GitOps is, let’s go back in time to a place before GitOps. Back then, there were
development teams and sysadmin teams. Developers put bugs in software and created new features, and the admin’s job was to somehow deploy and monitor the application. This had all the usual problems we discussed in
the previous blog post: manual processes are prone to errors. That’s why tools like Ansible or Terraform
exist—to declaratively describe the state of the system and automate these tasks.
Now you have YAML files describing the state of the cluster. This is nice—and definitely an improvement!—but how do you know which change caused something to break? How can you roll back to a previous version in case something went horribly wrong?
Enter GitOps: The idea is to just put your infrastructure files in Git. That way, you can version
control your infrastructure. You can adopt a software-like development model, where you test out changes in
branches and do pull requests to merge them into your production environment. You know who in your team
changed what (git blame), and in case something breaks, you can git revert the bad change.
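As a rough sketch of what that rollback workflow could look like (the commit hash is a placeholder):
# find the commit that introduced the bad change
git log --oneline
# create a new commit that undoes it, keeping the history intact
git revert <bad-commit-sha>
# push; the GitOps tooling reconciles the cluster back to the previous state
git push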
Kubernetes plays a special role in this conversation due to its API-driven design: you already describe the state of the cluster, so it seems like a natural fit. The Git repository acts as a single source of truth, while tools like Flux ensure that the cluster’s actual state matches the desired state stored in Git.
One last advantage I want to highlight is automatic updates: if you’re running containerized workloads right now, how many of them are up to date? Are you aware of critical CVEs? If you have your entire cluster state in a repository, you can use automatic scanners, like Renovate Bot, to open pull requests that automatically update your containers to the newest version.
Flux: GitOps on Kubernetes
At the time of writing, there are two main tools for doing GitOps on Kubernetes: Flux and ArgoCD. Both projects work fantastically well, and I highly recommend both. For this project though, we’re going to go with Flux.
Challenge time
If you don’t mind deviating from the beaten path, do try out ArgoCD with the rest of these posts and let me know how it goes!
Getting started with Flux
To get started with Flux, the first thing we need to do is to install the Flux CLI. In my case, I can use winget. If you’re using another OS, check out the docs for specific instructions.
> winget install -e --id FluxCD.Flux
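If you’re on macOS or Linux instead, the Flux docs list (at the time of writing) a Homebrew formula and an install script; treat these as a starting point rather than gospel:
# macOS/Linux via Homebrew
brew install fluxcd/tap/flux
# alternatively, via the official install script
curl -s https://fluxcd.io/install.sh | sudo bash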
Next, we’re going to bootstrap the cluster. What this means is installing the Flux controllers into our K8s cluster, pushing the manifests to our Git Repository, and then reconciling the state between the cluster and the repository.
Let’s run a quick sanity check to see if our cluster will support Flux:
$ flux check --pre
All good? Let’s get started bootstrapping the cluster then!
To get started with GitHub, create a private repo. Then, it’s necessary to create a Personal Access Token so that Flux can read from and write to the repository. Visit the Fine-grained personal access tokens page to create one:
- Give it a useful name.
- Choose the Only select repositories setting, and pick the repository you just created.
- Give the token read and write access to Administration and Contents.
Best Practice: Principle of Least Privilege
You’re creating something here that’s effectively a password to your GitHub account. Please scope the PAT down as much as possible.
Next, we can bootstrap Flux like this:
$ export GITHUB_TOKEN=<gh-token>
$ flux bootstrap github \
--token-auth \
--owner=<my-github-username> \
--repository=<my-repository-name> \
--branch=main \
--path=clusters/<my-cluster> \
--personal
When this is done: Congratulations, Flux is running in your cluster!
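Before moving on, a quick way to double-check that the controllers actually came up:
# the Flux controllers live in the flux-system namespace
kubectl get pods -n flux-system
# Flux's own post-install health check
flux check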
Intermission: Kubernetes YAML manifests
Before we set up Flux further, we should discuss Kubernetes manifests. Feel free to skip ahead if you’re already familiar.
As we discussed before, Kubernetes works declaratively. This means you somehow describe the state of the cluster, and the control plane takes care of actually getting to that state. So let’s shed some light on the “somehow”.
To quote from the docs:
Kubernetes objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster. Specifically, they can describe:
- What containerized applications are running (and on which nodes)
- The resources available to those applications
- The policies around how those applications behave, such as restart policies, upgrades, and fault-tolerance
A Kubernetes object is a “record of intent”—once you create the object, the Kubernetes system will constantly work to ensure that the object exists. By creating an object, you’re effectively telling the Kubernetes system what you want your cluster’s workload to look like; this is your cluster’s desired state.
These objects are created using the API. There are various clients that can interface with the API. The most useful one for us right now, however, is kubectl, the official Kubernetes CLI.
For (almost) all objects, the API has two nested fields: the spec, which defines the desired state, and the status, which reflects the current state¹ (for a deeper understanding of the API conventions, check out the contributor documentation).
We see the status field when we use kubectl to get or describe a resource. Similarly, to create or update an object, we need to provide the spec field. In terms of kubectl, you can either give it all required values as command-line parameters (e.g., kubectl create secret generic my-secret --from-literal=key1=supersecret --from-literal=key2=topsecret) or define it all in a YAML file (called a manifest), which kubectl translates to JSON in the background:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
This YAML manifest describes a Deployment resource. It gives it some metadata (a name) and then specifies it in the spec section. If we were to kubectl apply -f this, kubectl would transform it to JSON and call the appropriate API endpoints to create the resource.
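As a small example (the file name is made up), applying the manifest and reading the object back could look like this:
# send the manifest to the API server; kubectl converts the YAML to JSON for us
kubectl apply -f nginx-deployment.yaml
# read the object back, including the status field populated by the control plane
kubectl get deployment nginx-deployment -o yaml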
Kustomizations are a special type of manifest because they allow you to layer different YAML files over each other in a structured way. They provide a way to customize and reuse Kubernetes manifests without modifying the original YAML files directly: you define a “base” configuration and then apply environment-specific or cluster-specific changes using overlays, which avoids duplicating YAML files for different setups.
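As a minimal, hypothetical sketch of that layering (the file names and the staging overlay are assumptions for illustration), a base lists the shared manifests and an overlay pulls them in and overrides only what differs:
# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml

# staging/kustomization.yaml (overlay): reuse the base, override only the image tag
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base
images:
  - name: nginx
    newTag: "1.27.0"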
Flux repository structures
Let’s now look at the repository that Flux created for us. We can see that it’s almost empty, just having a clusters/<your cluster name> folder with a couple of files. So how would we actually deploy something into the cluster? Luckily, the documentation mentions a couple of ways:
- Monorepo: All Kubernetes manifests are stored in a single Git repository. Different environments (e.g., staging, production) are defined as separate overlays in the same branch. The separation of apps and infrastructure allows reconciliation in a specific order. The folder structure looks something like this:
├── apps
│   ├── base
│   ├── production
│   └── staging
├── infrastructure
│   ├── base
│   ├── production
│   └── staging
└── clusters
    ├── production
    └── staging
- Repo per environment: Each environment has its own repository. This allows you to limit who has access to which environment (typically moot in homelab scenarios). You promote changes using pull requests.
- Repo per team: Here, each team gets its own repository, of which they are the sole owner. Again, this is probably overkill for a homelab.
- Repo per app: You can also have one repository per app that you deploy. This makes sense in larger organizations where teams own their workloads.
I’d strongly suggest going down the Monorepo route for a homelab, unless you have a very good reason to deviate from that. It boils down to this: the added flexibility you get from the other models doesn’t outweigh the management overhead of implementing them. This might change, however, should you choose to share your homelab with, e.g., your friends or partner(s); in that case you may want to isolate parts and only give access to certain repositories.
Implementing the Monorepo
If you haven’t already, clone your private Git repository. Create the following folder structure:
├── apps
│   ├── base
│   └── <your cluster>
├── infrastructure
│   ├── configs
│   └── controllers
└── clusters
    └── <your cluster>
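If you prefer doing this from the shell, a sketch along these lines creates the skeleton (the cluster name is a placeholder):
# replace "my-cluster" with your actual cluster name
CLUSTER=my-cluster
mkdir -p apps/base "apps/$CLUSTER" infrastructure/configs infrastructure/controllers "clusters/$CLUSTER"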
From a high-level point of view, we’re going to build three main Kustomizations:
- Infra Controllers are tools that are immediately needed to run the cluster. Examples would be our logging infrastructure, our monitoring tools, our storage driver or the load balancer.
- Infra Configs is the configuration of some of those tools. For example, our load balancer needs to know which CIDR blocks to operate in, and it does this with custom resources.
- Apps finally contains all applications that are running in our cluster. This Kustomization has a base layer of a “good” default for each app, as well as a folder per cluster, so that you can overwrite some values. A simple example: you have two clusters, one for dev and one for prod. You can use the <your dev cluster> and <your prod cluster> folders to overwrite the specific container images used, the URLs or the container limits, while keeping the rest identical.
Note that infrastructure doesn’t change between your different clusters. If you plan on having more than one cluster with different configurations, you can adapt the folder structure from /apps to /infrastructure as well.
To get started, create infrastructure.yml in the clusters/<your cluster> folder:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infra-controllers
  namespace: flux-system
spec:
  interval: 1h
  retryInterval: 1m
  timeout: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/controllers
  prune: true
  wait: true
---
Here we tell Flux about the existence of a Kustomization resource. Its name is infra-controllers and it lives in the flux-system namespace. In order of appearance, we set the following fields:
- interval — How often the Kustomization is fetched from the source and reconciled. In our case, a value of one hour is fine; you can set this to a lower value if required, but don’t set it below 60 seconds. The interval needs to be parseable as a Go duration.
- retryInterval — If reconciliation fails, the interval at which to try again.
- timeout — A timeout for building, applying and health checks during the reconciliation process.
- sourceRef — From where Flux can fetch the Kustomization. Since we’re using a monorepo, we reference the Flux repository we created earlier.
- path — The path to the Kustomization within the repository.
- prune — Determines if objects that were created by this Kustomization, but are now missing, should be deleted or kept.
- wait — Halts until all reconciled resources have successful health checks.
To get to the previously discussed structure, add another resource to the file:
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infra-configs
  namespace: flux-system
spec:
  dependsOn:
    - name: infra-controllers
  interval: 1h
  retryInterval: 1m
  timeout: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/configs
  prune: true
In the same folder, create an apps.yml file with this content:
# =============== apps.yml ===============
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m0s
  dependsOn:
    - name: infra-configs
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/<your cluster>
  prune: true
  wait: true
  timeout: 5m0s
For both new files, note the dependsOn field. I also expect apps to change more frequently, which is why I have reduced the interval to 10 minutes.
Finally, create three kustomization.yml files in the /infrastructure/{configs, controllers} and /apps/base folders respectively:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources: []
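These are plain Kustomize manifests (note the different apiVersion compared to the Flux Kustomizations above), and they are intentionally empty for now. Once manifests land in these folders, the files simply list them; a purely hypothetical example for /infrastructure/controllers/kustomization.yml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # hypothetical entries added in later articles, each pointing at a subfolder with its own manifests
  - metallb
  - cert-manager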
Commit your changes and push them. You can then either take a nice walk and wait for Flux to recognize the changes automatically, or force a reconciliation with flux reconcile source git flux-system.
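If you only want to nudge a single Kustomization rather than the whole source, flux can also reconcile it directly, for example the apps Kustomization we just defined:
# re-fetch the Git source and re-apply just the apps Kustomization
flux reconcile kustomization apps --with-source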
Let’s see what changed!
# Check which sources are available
> flux get sources git
NAME          REVISION            SUSPENDED  READY  MESSAGE
flux-system   main@sha1:XXXXXXXX  False      True   ...
# Check if our new kustomizations were picked up
> flux get kustomization
NAME               REVISION            SUSPENDED  READY  MESSAGE
apps               main@sha1:XXXXXXXX  False      True   ...
flux-system        main@sha1:XXXXXXXX  False      True   ...
infra-configs      main@sha1:XXXXXXXX  False      True   ...
infra-controllers  main@sha1:XXXXXXXX  False      True   ...
You should see that Flux has successfully picked up on our new kustomizations and successfully applied them to the cluster. Right now this is somewhat pointless, because they’re empty. This is going to change in the next article though, when we deploy MetalLB. See you then!
Footnotes
1. Note that I show the objects in YAML notation while the API technically deals in JSON.