Deploying bare-metal Kubernetes with Kubespray

Bare-metal Kubernetes Homelab

In this article, we’re going to use Kubespray to deploy Kubernetes on our bare-metal CoreOS cluster. We’ll walk through the configuration options, and we’ll also cover the CRI (container runtime interface) as well as the CNI (container networking interface).

First things first: What is Ansible?

One recurring theme in this series is going to be automation. Processes requiring manual intervention are prone to errors—think of the last time you accidentally burned your toast or scrambled your eggs too long. Imagine you’re managing ten web servers hosting various web applications under subdomains of example.com. Each server requires a certificate that needs to be updated and deployed every three months before expiration. While the task is straightforward (copy the new certificate and restart the web server daemon), mistakes can easily creep in. Did you copy the correct certificate? Did you actually restart the service? Which server did you just finish updating? Add to that the challenge of continuity: if your experienced IT admin leaves, are the replacements prepared to handle this consistently and correctly?

Enter Ansible, a powerful tool for automation. With Ansible, you describe the desired state of your systems (e.g., “file: present”, “service: started”) declaratively, and it handles applying that configuration (typically over SSH). This means you can rebuild or reconfigure servers in minutes with minimal effort.
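
As a minimal sketch of what such a declarative description looks like, here is a hypothetical playbook (the webservers group, file paths, and service name are all illustrative, not part of anything we set up later) that ensures a certificate file is present and nginx is running:

- hosts: webservers
  become: true
  tasks:
    - name: Deploy the TLS certificate
      ansible.builtin.copy:
        src: files/example.com.crt        # hypothetical path on the control machine
        dest: /etc/ssl/certs/example.com.crt
        mode: "0644"

    - name: Ensure nginx is running and starts on boot
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

If the certificate already matches and the service is already running, Ansible changes nothing; that idempotency is what makes re-running the playbook safe.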

For this article, we don’t need to get into the weeds of Ansible, but it does help to know some important terms:

  • Inventory: The list of hosts (and host groups) that Ansible manages.
  • Roles: Reusable bundles of tasks, templates, and variables that describe one job a server can have (e.g., “webserver”).
  • Playbooks: YAML files that map hosts or groups to roles and tasks, describing the desired end state.

For example, you might have twenty servers: fifteen of them are web servers and seven form a PostgreSQL cluster (with two servers assuming both roles). In this scenario, you might define the following roles with their corresponding playbooks (a short playbook sketch follows the list):

  • Webserver Role: Configure and maintain servers hosting web applications.

    • Deploy updated SSL/TLS certificates.
    • Install and configure the web server software (e.g., Nginx, Apache).
    • Ensure the webserver service is running and properly configured.
    • Set up monitoring and logging for HTTP traffic.
  • Database Role: Manage the servers in the PostgreSQL cluster.

    • Set up primary and replica configurations for high availability.
    • Back up the database to a secure storage location.
    • Apply performance tuning configurations or updates.
    • Monitor replication health and repair issues proactively.
  • Monitoring Role: Deploys monitoring agents, such as Prometheus exporters.
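
To make the mapping concrete, here is a minimal sketch of a top-level playbook for this hypothetical fleet. All group and role names are illustrative and not part of Kubespray:

# hypothetical site.yml tying the example fleet to the roles above
- hosts: webservers
  become: true
  roles:
    - webserver
    - monitoring

- hosts: postgres
  become: true
  roles:
    - database
    - monitoring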

Kubespray is a collection of Ansible Playbooks to deploy Kubernetes

The goal of this blog series is to have a running Kubernetes cluster. There are a couple of ways to get there, but the approach I’ve always found the nicest and easiest is Kubespray.

If you’re interested in looking under the hood, consider Kubernetes the hard way, a series of labs that deploy K8s without any scripts. You can also bootstrap clusters with kubeadm, which I personally haven’t done.

Kubespray is a collection of Ansible playbooks to deploy a production-ready cluster. So with that out of the way, let’s get started!

Create a virtual environment

You can also follow along using Kubespray’s documentation.

Ansible is written in Python, and because Python’s dependency management is … great, the first thing we need to do is create a virtual environment and install Ansible as well as some other dependencies:

Although not strictly necessary, if you’re on Windows, I recommend using WSL for this part.
$ git clone git@github.com:kubernetes-sigs/kubespray.git
$ cd kubespray
$ python3 -m venv .venv
$ source .venv/bin/activate # Different if on powershell
$ pip install -U -r requirements.txt

Create an Inventory

The next step is to create an inventory for your nodes. Friendly reminder: At this point your cluster should be running CoreOS and be reachable over the network from the machine you’re working on. Kubespray comes with an example inventory that we can adapt to our needs. To copy it, run:

$ cp -rfp inventory/sample inventory/your-cluster

We end up with a folder structure like this (shortened for the sake of brevity):

inventory/your-cluster
├── group_vars
│   ├── all
│   │   ├── all.yml
│   │   ├── containerd.yml
│   │   ├── coreos.yml
│   │   └── etcd.yml
│   ├── etcd.yml
│   └── k8s_cluster
│       ├── addons.yml
│       ├── k8s-cluster.yml
│       ├── k8s-net-calico.yml
│       ├── k8s-net-kube-router.yml
│       └── k8s-net-macvlan.yml
└── inventory.ini

The inventory.ini is the inventory of our cluster. We’ll edit it in a bit. Let’s take the opportunity to read through the documentation on the folder structure:

The inventory is composed of 3 groups:

  • kube_node: list of kubernetes nodes where the pods will run.
  • kube_control_plane: list of servers where kubernetes control plane components (apiserver, scheduler, controller) will run.
  • etcd: list of servers to compose the etcd server. You should have at least 3 servers for failover purpose.

So what are the individual components? Kubernetes is split into a control plane and nodes. The control plane manages the cluster’s overall state, schedules workloads, and handles communications within the cluster. It consists of components like:

  • API Server: Acts as the gateway for administrative commands and the cluster’s state.
  • Scheduler: Assigns workloads (Pods) to nodes based on resource availability and other constraints.
  • Controller Manager: Ensures that the cluster’s state matches the desired state, managing tasks like replication and node health.
  • etcd: A distributed key-value store that serves as Kubernetes’ backing store for all cluster data.

The nodes are the worker machines where your application workloads (Pods) run. Each node includes:

  • kubelet: Ensures that containers are running in Pods as expected.
  • kube-proxy: Manages networking for Pods, including service discovery and load balancing.
  • Container Runtime: Runs the actual containers (e.g., containerd, CRI-O, or Docker).
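
Once the cluster is deployed at the end of this article, you can see most of these components running as Pods in the kube-system namespace (Kubespray runs etcd as a host-level service by default, so it won’t appear there):

$ kubectl get pods -n kube-system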

Basically, Kubespray allows us to choose which of our servers will assume which roles. It’s possible for the same servers to be part of the control plane, act as worker nodes, and run etcd all at once (in fact, that’s what we’re deploying today), but you can also separate the roles.

So let’s create our inventory:

[all]
node1 ansible_host=10.100.0.1 etcd_member_name=k8s-01 ansible_user=ansible
node2 ansible_host=10.100.0.2 etcd_member_name=k8s-02 ansible_user=ansible
node3 ansible_host=10.100.0.3 etcd_member_name=k8s-03 ansible_user=ansible

[kube_control_plane]
node1
node2
node3

[etcd]
node1
node2
node3

[kube_node]
node1
node2
node3

[calico_rr]

[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr
Alternatively, you can use YAML as the format for this file.

First, we define the nodes that we want Ansible to manage: we give each node a name and set its IP (in my case, all of the nodes live in the 10.100.0.0/24 CIDR range). Since these three nodes are going to run etcd, we set their etcd_member_name (if I had an additional node not running etcd, this could just be set to an empty string). Finally, ansible_user is the user that Ansible will use to SSH into the machine.

Double-check that this user exists (i.e., you set it up in your Ignition file). You should be able to run ssh THIS-USER@your-machine.

Next, we split the hosts into groups: the kube_control_plane group will host the control plane, the etcd group will house etcd, and the kube_node group will provide the worker nodes. There’s one last special group, calico_rr (Calico route reflectors). Unless you have special networking requirements, leave it empty.
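
Before configuring anything else, it’s worth a quick sanity check that Ansible can actually SSH into every node in the inventory. One minimal way to do that (assuming your key lives at ~/.ssh/id_rsa) is the raw module, which doesn’t require Python on the targets:

$ ansible -i inventory/your-cluster/inventory.ini all -m raw -a "hostname" \
    --private-key=~/.ssh/id_rsa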

Configuring Kubespray variables

Next, it’s time to configure some variables to make our cluster behave in the way we want. For that, Kubespray comes with the group_vars directory.

Settings for all nodes: all.yml

The all/all.yml file contains settings that matter for the entire process. Check out the file yourself (in particular if you’re behind a proxy or are using RHEL). One thing you definitely should set is the upstream DNS servers:

upstream_dns_servers:
  - 8.8.8.8
  - 8.8.4.4

The Google DNS servers are a sane default; you can also use Cloudflare’s 1.1.1.1 or Quad9 if you feel more comfortable with that. It really doesn’t matter much.

Container runtimes (CRI)

Container runtimes are the underlying systems responsible for running containers on Kubernetes nodes. Historically, Docker was the dominant runtime, but Kubernetes introduced the Container Runtime Interface (CRI) to standardize how runtimes integrate with Kubernetes. This led to a shift toward CRI-compliant runtimes, and today, the three main options are containerd, CRI-O, and Docker (via the cri-dockerd CRI plugin).

  • containerd: Originally developed as part of Docker, containerd has become a standalone, lightweight, and efficient runtime. It is the default runtime for many Kubernetes distributions due to its close integration with Kubernetes and low resource overhead.

  • CRI-O: Built specifically for Kubernetes, CRI-O is a lightweight runtime focused on simplicity and security. It supports Open Container Initiative (OCI) standards and is often used in environments prioritizing Kubernetes-native solutions, like OpenShift.

  • Docker: While Docker brought containerization to the mainstream, its complexity and lack of native CRI compliance led Kubernetes to deprecate direct Docker support (dockershim) in version 1.20 and remove it in 1.24. Users who prefer Docker now rely on the cri-dockerd plugin or migrate to CRI-compliant alternatives.

To choose the container runtime, edit the k8s-cluster.yml:

## Container runtime
## docker for docker, crio for cri-o and containerd for containerd.
## Default: containerd
container_manager: containerd
Personally, containerd has worked best for me. With cri-o I occasionally had really annoying and obscure errors. Your mileage may vary.

You can then configure the chosen container runtime in its associated YAML file, for example containerd.yml. This can be useful if you have a private registry or want to change the log format. For my cluster, I left the default values.
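
As a rough sketch of what a private-registry mirror could look like — the exact variable names differ between Kubespray versions, so treat this as an assumption and cross-check the commented examples in your copy of containerd.yml:

containerd_registries_mirrors:            # name assumed from recent sample files
  - prefix: registry.example.com          # hypothetical private registry
    mirrors:
      - host: https://registry.example.com
        capabilities: ["pull", "resolve"]
        skip_verify: false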

CoreOS Settings: coreos.yml

You can use the all/coreos.yml to turn off auto-upgrades. I’d advise against turning them off.

etcd settings: etcd.yml

You can also change the etcd deployment type. If you don’t know what you’re doing, don’t change it.
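
For reference, the variable in question looks like this (the default shown here is an assumption based on recent Kubespray versions, so verify it in your copy of the file):

# 'host' runs etcd as a systemd service directly on the nodes;
# 'kubeadm' runs it as static Pods instead.
etcd_deployment_type: host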

Container Networking Interface (CNI)

The Container Networking Interface (CNI) is a standardized framework for configuring networking in containerized environments. Kubernetes relies on CNI plugins to provide connectivity between Pods and manage networking policies. The choice of CNI plugin can impact performance, scalability, and available features.

Common CNI Plugins

  • Calico: A versatile CNI plugin that supports a wide range of features, including network policy enforcement, IP address management, and optional support for BGP (Border Gateway Protocol). Calico can operate in pure Layer 3 mode, making it highly efficient for large-scale deployments.

  • Flannel: A simple, easy-to-configure plugin that provides basic networking through overlays like VXLAN. It lacks advanced features like network policy support but is lightweight and ideal for small to medium-sized clusters.

  • Weave Net: Known for its simplicity and ease of setup, Weave Net offers automatic IP allocation, encrypted connections, and limited network policy support. It’s a good choice for developers or smaller clusters.

    Be aware that Weaveworks has apparently gone bankrupt, and there is no longer any support for their CNI plugin.
  • Cilium: A modern CNI plugin focused on security and observability. It uses eBPF (extended Berkeley Packet Filter) to enable high-performance networking, advanced policy management, and detailed telemetry.

Kubespray supports those and then some. Unless you have specific requirements, Calico is a good choice, and is what we’ll be using in this project.

That leaves us with the last file we should (or at least could) edit, k8s-cluster.yml:

# Choose network plugin (cilium, calico, kube-ovn, weave or flannel. Use cni for generic cni plugin)
# Can also be set to 'cloud', which lets the cloud provider setup appropriate routing
kube_network_plugin: calico

While we’re at it, we can also configure the cluster-internal CIDRs, if needed:

# Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.233.0.0/18

# internal network. When used, it will assign IP
# addresses from this range to individual pods.
# This network must be unused in your network infrastructure!
kube_pods_subnet: 10.233.64.0/18

Dual-stack operations

You can choose to enable IPv6 for the cluster and optionally run everything in dual-stack mode. This is out-of-scope for this blog series.

Addons

Optionally, Kubespray can install a number of addons into the cluster. You can fine-tune this in addons.yml. While this is a perfectly fine way to handle things, I personally prefer to do it with Flux (see the next article for details). That’s why I have everything disabled in this file.
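
For reference, disabling an addon is just a matter of flipping its flag in addons.yml. A few common ones look roughly like this (flag names taken from the sample file, so double-check them against your Kubespray version):

helm_enabled: false
metrics_server_enabled: false
ingress_nginx_enabled: false
cert_manager_enabled: false
dashboard_enabled: false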

Deploying the cluster

Once we have configured everything to our liking, we can now deploy the cluster to our nodes by running:

$ ansible-playbook -i inventory/your-cluster/inventory.ini cluster.yml \
    -b -v --private-key=~/.ssh/id_rsa  

Lean back. Get a coffee. Walk your cute labradoodle. This is going to take a while.

Getting the admin.conf for the cluster

Once everything is done, we’re almost ready to connect to our cluster! The last thing we need is the admin credentials. The easiest way to get them is to SSH into one of your nodes and copy the admin.conf from /etc/kubernetes/:

foo@node-1:$ sudo cp /etc/kubernetes/admin.conf ~
foo@node-1:$ sudo chown foo:foo ~/admin.conf
user@workstation:$ scp foo@node-1:/home/foo/admin.conf ~/.kube/config

Edit the config so that the server field points to one of your nodes:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: "" #<CENSORED>
    server: https://10.100.0.1:6443 # <------- CHANGE THIS
  name: cluster.local
contexts:
# ...

Finally, we can test the connection to the cluster by running:

$ kubectl get nodes
NAME    STATUS   ROLES           AGE  VERSION
node1   Ready    control-plane   1d   v1.30.4
node2   Ready    control-plane   1d   v1.30.4
node3   Ready    control-plane   1d   v1.30.4

Closing remarks

Ooof, that was a long way to get a cluster up and running! Congratulations! There are a couple of considerations to keep in mind at this stage:

  • Encryption at rest: Kubernetes has the concept of Secrets: data that should be stored securely (for example, API tokens or TLS certificates). Kubernetes can be configured not to store these in plaintext but to encrypt them at rest (Kubespray uses the secretbox algorithm by default):

    kube_encrypt_secret_data: true
    
  • Cluster hardening: You can follow the Kubespray documentation on hardening your cluster.

  • High Availability: Right now, your admin.conf hardcodes the IP or hostname of one node. While the control plane itself is highly available, if that node goes down you lose access until you edit the config. You can check out the docs on HA to change that.

  • Authentication and Authorization: Right now, there’s one user that can sign into the cluster, which is the super mega admin. Consider using other authentication methods, such as OpenID Connect.

  • Removing nodes: You might want to remove nodes (or replace them). To add new nodes, you can simply re-run cluster.yml. To remove nodes, use the remove-node.yml playbook (see the example below).
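
For instance, draining and removing a single node could look roughly like this, mirroring the cluster.yml invocation above (the node name must match the one in your inventory):

$ ansible-playbook -i inventory/your-cluster/inventory.ini remove-node.yml \
    -b -v --private-key=~/.ssh/id_rsa -e node=node3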

In the next article in the series, we’ll set up Flux to have a working GitOps pipeline!
