Using Flux for GitOps

Bare-metal Kubernetes Homelab

In this article, we’re going to install Flux, a popular tool for GitOps. We’re going to discuss reconciliation, where the need for GitOps comes from, and common repository structures for Flux.

So what’s GitOps?

To understand what GitOps is, let’s go back in time to a place before GitOps. Back then, there were development teams and sysadmin teams. Developers put bugs in software and created new features, and the admins’ job was to somehow deploy and monitor the application. This had all the usual problems we discussed in the previous blog post: manual processes are prone to errors. That’s why tools like Ansible or Terraform exist—to declaratively describe the state of the system and automate these tasks.

Now you have YAML files describing the state of the cluster. This is nice—and definitely an improvement!—but how do you know which change caused something to break? How can you roll back to a previous version in case something went horribly wrong?

Enter GitOps: The idea is to just put your infrastructure files in Git. That way, you can version control your infrastructure. You can adopt a software-like development model, where you test out changes in branches and do pull requests to merge them into your production environment. You know who in your team changed what (git blame), and in case something breaks, you can git revert the bad change.
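
As a rough illustration of that last point (the commit hash is of course a placeholder), rolling back a bad infrastructure change is just the ordinary Git workflow:

$ git log --oneline           # find the commit that introduced the bad change
$ git revert <bad-commit>     # create a new commit that undoes it
$ git push                    # the GitOps tooling then rolls the cluster back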

Kubernetes plays a special role in this conversation due to its API-driven design: you already describe the state of the cluster, so it seems like a natural fit. The Git repository acts as a single source of truth, while tools like Flux ensure that the cluster’s actual state matches the desired state stored in Git.

For those new to K8s and GitOps: Imagine recreating a cluster from scratch. If you have tens of applications running on the cluster, modifying them all manually can be overwhelming. A properly set-up Flux repository makes this process almost effortless because you can just apply the repository you created.

A last advantage I want to highlight is automatic updates: If you’re running containerized workloads right now, how many of them are up to date? Are you aware of critical CVEs? If you have your entire cluster state in a repository, you can use automatic scanners, like Renovate Bot, to open pull requests that update your containers to the newest version.

Flux: GitOps on Kubernetes

At the time of writing, there are two main tools for doing GitOps on Kubernetes: Flux and ArgoCD. Both projects work fantastically well, and I highly recommend both. For this project though, we’re going to go with Flux.

Challenge time

If you don’t mind deviating from the beaten path, do try out ArgoCD with the rest of these posts and let me know how it goes!

Getting started with Flux

To get started with Flux, the first thing we need to do is to install the Flux CLI. In my case, I can use winget. If you’re using another OS, check out the docs for specific instructions.

> winget install -e --id FluxCD.Flux
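
If you’re on macOS or Linux instead, the docs offer Homebrew and an official install script; something along these lines should work (do check the docs for the current instructions):

$ brew install fluxcd/tap/flux                      # via Homebrew
$ curl -s https://fluxcd.io/install.sh | sudo bash  # or the official install script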

Next, we’re going to bootstrap the cluster. What this means is installing the Flux controllers into our K8s cluster, pushing the manifests to our Git Repository, and then reconciling the state between the cluster and the repository.

Let’s run a quick sanity check to see if our cluster will support Flux:

$ flux check --pre

All good? Let’s get started bootstrapping the cluster then!

For the rest of this post I’m assuming you’re using GitHub. If that happens not to be the case, check out the docs for other Git providers.

To get started with GitHub, create a private repo. Then, create a Personal Access Token (PAT) so that Flux can read from and write to the repository. Visit the Fine-grained personal access tokens page to create one:

  • Give it a useful name.
  • Choose the Only select repositories setting, and pick the repository you just created.
  • Give the token Read and write access to Administration and Contents.

Best Practice: Principle of Least Privilege

You’re creating something here that’s effectively a password to your GitHub Account. Please scope the PAT down as much as possible.

Next, we can bootstrap flux like this:

$ export GITHUB_TOKEN=<gh-token>
$ flux bootstrap github \
    --token-auth \
    --owner=<my-github-username> \
    --repository=<my-repository-name> \
    --branch=main \
    --path=clusters/<my-cluster> \
    --personal

When this is done: Congratulations, Flux is running in your cluster!
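
If you’d like to double-check that everything came up healthy before moving on, one way to do it looks like this:

$ flux check                       # verifies the Flux controllers and CRDs in the cluster
$ kubectl get pods -n flux-system  # all Flux controller pods should be Running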

Intermission: Kubernetes YAML manifests

Before we set up Flux further, we should discuss Kubernetes manifests. Feel free to skip ahead if you’re already familiar.

As we discussed before, Kubernetes works declaratively. This means you somehow describe the state of the cluster, and the control plane takes care of actually getting to that state. So let’s shed some light on the “somehow”.

To quote from the docs:

Kubernetes objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster. Specifically, they can describe:

  • What containerized applications are running (and on which nodes)
  • The resources available to those applications
  • The policies around how those applications behave, such as restart policies, upgrades, and fault-tolerance

A Kubernetes object is a “record of intent”—once you create the object, the Kubernetes system will constantly work to ensure that the object exists. By creating an object, you’re effectively telling the Kubernetes system what you want your cluster’s workload to look like; this is your cluster’s desired state.

These objects are created using the API. There are various clients that can interface with the API. The most useful one for us right now, however, is kubectl, the official Kubernetes CLI.

For (almost) all objects, the API has two nested fields: the spec, which defines the desired state, and the status, which reflects the current state¹ (for a deeper understanding of the API conventions, check out the contributor documentation). We see the status field when we use kubectl to get or describe a resource.

Similarly, to create or update an object, we need to provide the spec field. With kubectl, you can either give it all required values as command-line parameters (e.g., kubectl create secret generic my-secret --from-literal=key1=supersecret --from-literal=key2=topsecret) or define everything in a YAML file (called a manifest), which kubectl translates to JSON in the background:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

This YAML manifest describes a Deployment resource. It gives it some metadata (a name) and then specifies it in the spec section. If we were to kubectl apply -f this, kubectl would transform it to JSON and call the appropriate API endpoints to create the resource.
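
For example, assuming the manifest above is saved as nginx-deployment.yaml, creating the Deployment and then inspecting its spec and status could look like this:

$ kubectl apply -f nginx-deployment.yaml            # create (or update) the Deployment
$ kubectl get deployment nginx-deployment -o yaml   # shows both .spec and .status
$ kubectl describe deployment nginx-deployment      # human-readable view, including events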

Kustomizations are a special type of manifest because they allow you to layer different YAML files over each other in a structured way. They provide a way to customize and reuse Kubernetes manifests without modifying the original YAML files directly: you define a “base” configuration and then apply environment-specific or cluster-specific changes as overlays, which avoids duplicating YAML files for different setups.
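
To make the base/overlay idea concrete, here is a minimal sketch (the file paths and the nginx names are purely illustrative): a base kustomization.yaml lists the raw manifests, and an overlay pulls in that base and overrides a few values without touching the original files.

# base/kustomization.yaml: the shared default configuration
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml

# overlays/staging/kustomization.yaml: reuses the base and tweaks it for staging
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
images:
  - name: nginx              # pin a different image tag in this environment only
    newTag: "1.25.3"
replicas:
  - name: nginx-deployment   # run fewer replicas here
    count: 1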

Flux repository structures

Let’s now look at the repository that Flux created for us. We can see that it’s almost empty, just having a clusters/<your cluster name> folder with a couple of files. So how would we actually deploy something into the cluster? Luckily, the documentation mentions a couple of ways:

  • Monorepo: All Kubernetes manifests are stored in a single Git repository. Different environments (e.g., staging, production) are defined as separate overlays in the same branch. The separation of apps and infrastructure allows reconciliation in a specific order. The folder structure looks something like this:
    ├── apps
    │   ├── base
    │   ├── production
    │   └── staging
    ├── infrastructure
    │   ├── base
    │   ├── production
    │   └── staging
    └── clusters
        ├── production
        └── staging
    
  • Repo per environment: Each environment has its own repository. This allows you to limit who has access to which environment (typically moot in homelab scenarios). You promote changes between environments using pull requests.
  • Repo per team: Here, each team gets its own repository, of which it is the sole owner. Again, this is probably overkill for a homelab.
  • Repo per app: You can also have one repository per app that you deploy. This makes sense in larger organizations where teams own their workloads.

I’d strongly suggest going down the Monorepo route for a homelab, unless you have a very good reason to deviate from that. It boils down to this: the added flexibility you get from the other models doesn’t outweigh the management overhead of implementing them. This, however, might change should you choose to share your homelab with, e.g., your friends or partner(s). In that case you may want to isolate parts and only give access to certain repositories.

Implementing the Monorepo

If you haven’t already, clone your private Git repository. Create the following folder structure:

├── apps
│   ├── base
│   └── <your cluster>
├── infrastructure
│   ├── configs
│   └── controllers
└── clusters
    └── <your cluster>

From a high-level point of view, we’re going to build three main Kustomizations:

  1. Infra Controllers are tools that are immediately needed to run the cluster. Examples would be our logging infrastructure, our monitoring tools, our storage driver or the load balancer.
  2. Infra Configs is the configuration of some of those tools. For example, our load balancer needs to know which CIDR blocks to operate in, and does this through custom resources.
  3. Apps finally contains all applications that are running in our cluster. This Kustomization has a base layer of a “good” default for each app, as well as a folder per cluster, so that you can overwrite some values. A simple example: You have two clusters, one for dev and one for prod. You can use the <your dev cluster> and <your prod cluster> folders to overwrite the specific container images used, the URLs or the container limits, while keeping the rest identical.

Of course, the implicit assumption is that your infrastructure doesn’t change between your different clusters. If you plan on having more than one cluster with different configurations, you can adapt the folder structure from /apps to /infrastructure as well.
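
As a sketch of where this is heading once you add a first app (podinfo is just a placeholder name), the apps tree could end up looking like this:

apps
├── base
│   └── podinfo
│       ├── kustomization.yaml       # lists deployment.yaml, service.yaml, ...
│       ├── deployment.yaml
│       └── service.yaml
└── <your cluster>
    ├── kustomization.yaml           # resources: ../base/podinfo, plus patches
    └── podinfo-patch.yaml           # cluster-specific overrides (image tag, URL, limits, ...)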

To get started, in the clusters/<your cluster> folder, create infrastructure.yml:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infra-controllers
  namespace: flux-system
spec:
  interval: 1h
  retryInterval: 1m
  timeout: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/controllers
  prune: true
  wait: true
---

Here we tell Flux about the existence of a Kustomization resource. Its name is infra-controllers and it lives in the flux-system namespace. In order of appearance, we set the following fields:

  1. interval — How often the Kustomization is fetched from the source and reconciled. In our case, a value of one hour is fine; you can set this to a lower value if required, but don’t set it below 60 seconds. The interval needs to be parseable as a Go duration (e.g. 60s, 10m, 1h).
  2. retryInterval — If reconciliation fails, the interval at which to try again.
  3. timeout — A timeout for building, applying and health checks during the reconciliation process.
  4. sourceRef — From where Flux can fetch the Kustomization. Since we’re using a monorepo, we reference the flux repository we created earlier.
  5. path — The path to the Kustomization within the repository.
  6. prune — Determines if objects that were created by this Kustomization, but are now missing, should be deleted or kept.
  7. wait — Halts until all reconciled resources have successful health checks.

To get to the previously discussed structure, add another resource to the file:

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infra-configs
  namespace: flux-system
spec:
  dependsOn:
    - name: infra-controllers
  interval: 1h
  retryInterval: 1m
  timeout: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure/configs
  prune: true

In the same folder, create an apps.yml file with this content:

# =============== apps.yml ===============
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m0s
  dependsOn:
    - name: infra-configs
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/<your cluster>
  prune: true
  wait: true
  timeout: 5m0s

For both new files, note the dependsOn field. I also expect apps to change more frequently, which is why I have reduced the interval to 10 minutes.

Finally, create a kustomization.yml with the content below in each of the /infrastructure/configs, /infrastructure/controllers, /apps/base and /apps/<your cluster> folders (the last one matters because the apps Kustomization points at ./apps/<your cluster>, and Git doesn’t track empty directories):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources: []

Commit your changes and push them. You can then either take a nice walk and wait for Flux to recognize the changes automatically, or force a reconciliation with flux reconcile source git flux-system.

Let’s see what changed!

# Check which sources are available
> flux get sources git 
NAME            REVISION                SUSPENDED       READY   MESSAGE
flux-system     main@sha1:XXXXXXXX      False           True    ...

# Check if our new kustomizations were picked up
> flux get kustomization
NAME                    REVISION                SUSPENDED       READY   MESSAGE
apps                    main@sha1:XXXXXXXX      False           True    ...
flux-system             main@sha1:XXXXXXXX      False           True    ...
infra-configs           main@sha1:XXXXXXXX      False           True    ...
infra-controllers       main@sha1:XXXXXXXX      False           True    ...

You should see that Flux has successfully picked up on our new kustomizations and successfully applied them to the cluster. Right now this is somewhat pointless, because they’re empty. This is going to change in the next article though, when we deploy MetalLB. See you then!

Footnotes

  1. Note that I show the objects in YAML notation while the API technically deals in JSON.

Next article

Setting up MetalLB