Deploying a Local Kubernetes Cluster on Ubuntu Servers#

When I set up my local Kubernetes cluster for the first time, I had to go through a lot of trial and error to get it right. There were some blog posts on setting up Kubernetes, but they were either outdated or lacked the explanation I needed. As I went through the tedious process of learning everything, I wrote down the steps in my personal notebook. I figured I might as well reorganize these notes and share them with you. This is a summary of how to deploy a Kubernetes cluster locally.

I will be demonstrating the setup with two virtual machines running Ubuntu 24.04 Server. The first virtual machine will be the control plane node, and the second will be the worker node. I will be installing Kubernetes v1.30. For the container runtime, I will be installing Containerd 1.7.12. For the CNI plugin, I will use Flannel. Ensure each machine also has an adequate amount of RAM and CPU cores to run the cluster; as a baseline, kubeadm expects at least 2 CPUs and 2 GB of RAM on the control plane node.

Preparing Host Machines#

You may find that installation guides on the internet, including the official ones, are inconsistent with each other when it comes to configuring the host machines. Some settings are mandatory on one type of setup but not on another. This stems from the fact that different Kubernetes components depend on different kernel modules and system settings. Even modules that are not strictly needed may, depending on the version of Kubernetes you are installing, be required by the kubeadm preflight checks, which raise an error if they are not detected. I think it is important to stay informed about changes in the evolving Kubernetes ecosystem. The following are the most crucial settings that I have found to work for setting up a current or recent version of Kubernetes. Unless you are certain that you need to deviate from these steps, I recommend applying each of them.

0. Update Packages#

First and foremost, you may want to update the package index and upgrade the installed packages to their latest versions.

Command: update apt#
sudo apt update && sudo apt dist-upgrade

1. Disable Memory Swap#

Memory swap allows your system to use disk space as an extension of physical RAM by moving less frequently accessed memory pages to disk. According to the official documentation and the developer notes on swap support, Kubernetes did not support swap memory on Linux until the alpha feature introduced in version 1.22. This was due to the inherent difficulty of guaranteeing and accounting for pod memory utilization when swap is involved. For now, unless you have a compelling reason not to, I recommend following what most do and disabling swap on all Kubernetes nodes' host systems.

You can disable swap by running the following command.

Command: disable swap and remove swap entry from /etc/fstab#
sudo swapoff -a

sudo sed -i '/\sswap\s/s/^/#/' /etc/fstab

The first command disables all swap spaces currently in use. The second command comments out any swap entries in the /etc/fstab file, which prevents swap spaces from being automatically mounted at boot. After running the above commands, you can execute cat /proc/swaps to verify that there is no active swap.
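If you want to double-check, the following commands should show no swap devices in /proc/swaps and 0B of swap in the free output.

Command: verify swap is disabled#
cat /proc/swaps

# the Swap line should report 0B:
free -h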

2. Configuring Hosts for Cluster Networking#

To facilitate networking within your Kubernetes cluster, you need to configure certain settings on your host machines. Kubernetes delegates network management to CNI (Container Network Interface) plugins, so we must ensure that all nodes in the cluster are properly configured to allow the chosen CNI plugin to manage networking. These configurations may vary depending on your choice of CNI plugin. In this case, I’ll be using Flannel.

First, we need to manually preload a kernel module called br_netfilter on all nodes. The br_netfilter module allows bridged network traffic to be seen and filtered by iptables, which is important for inter-pod communication across nodes in the same cluster.

Command: enable br_netfilter loading at boot time and load br_netfilter#
echo "br_netfilter" | sudo tee -a /etc/modules-load.d/k8s.conf

sudo modprobe br_netfilter

# check if br_netfilter module is loaded
lsmod | grep br_netfilter

The /etc/modules-load.d/ directory is used by systemd to load kernel modules at boot time. You may verify that the module is loaded by running lsmod | grep br_netfilter.

Second, you need to enable IP forwarding by setting net.ipv4.ip_forward = 1, and allow bridged packets to be processed by iptables so that cluster networking functions correctly.

Command: configure IP forwarding and enable iptables#
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

sudo sysctl --system

# verify the settings:
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
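If the settings were applied correctly, the verification command should report all three values as 1, similar to the following.

sysctl verification output#
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1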

Installing a Container Runtime#

A container runtime is a crucial part of Kubernetes: it takes care of lower-level container operations, such as creating or stopping containers, whereas Kubernetes is a higher-level orchestration system that gives the container runtime instructions. Because there is a standardized interface between them (the Container Runtime Interface, or CRI), Kubernetes can work with any container runtime that implements it. Popular container runtimes supported by Kubernetes include Containerd, Docker Engine, and CRI-O. You need to install a container runtime on each node within the cluster; I will be using Containerd.

1. Loading Dependent Modules for Container Runtime#

The container runtime you choose may require certain kernel modules to be loaded, although the runtime may also load them automatically. One module Containerd uses is overlay. The overlay kernel module is a filesystem feature that enables the creation of layered filesystems, which facilitates storage management in container environments. You can manually load the overlay module by running the following commands.

Command: enable overlay loading at boot time and load overlay module#
echo "overlay" | sudo tee -a /etc/modules-load.d/k8s.conf

sudo modprobe overlay

# check if overlay module is loaded
lsmod | grep overlay

This block appends the line overlay to the file /etc/modules-load.d/k8s.conf, which tells the system to load the overlay module at boot time, and then loads overlay immediately with modprobe.

2. Installing Containerd#

Installing Containerd is straightforward. You can install it from Ubuntu's official repository by running the following command.

Command: install Containerd#
sudo apt install containerd

By executing systemctl status containerd, you can verify that the service is running. The output should look like the following.

● containerd.service - containerd container runtime
     Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; preset: enabled)
     Active: active (running) since Fri 2024-07-19 07:31:40 UTC; 4s ago
       Docs: https://containerd.io
    Process: 5649 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 5658 (containerd)
      Tasks: 18
     Memory: 14.8M (peak: 16.9M)
        CPU: 133ms
     CGroup: /system.slice/containerd.service
             └─5658 /usr/bin/containerd

As you can see from the Process attribute, Containerd actually loaded overlay before starting its main service.

3. Configuring the Systemd Cgroup Driver#

Both the kubelet and the container runtime utilize cgroups to manage resources for pods and containers, e.g., setting CPU limits for workloads. Both must use the same cgroup driver. According to the official documentation, if your system uses systemd as its init system and employs cgroup v2, you should use the systemd cgroup driver instead of cgroupfs. If you create a cluster with kubeadm, the kubelet uses the systemd cgroup driver by default. That means you need to configure the container runtime, in my case Containerd, to use the systemd cgroup driver as well.
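If you are unsure which cgroup version your host uses, you can check the filesystem type mounted at /sys/fs/cgroup; an output of cgroup2fs indicates cgroup v2 (which Ubuntu 24.04 uses by default).

Command: check the cgroup version#
stat -fc %T /sys/fs/cgroup/
# cgroup2fs -> cgroup v2, tmpfs -> cgroup v1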

Command: configure Containerd to use systemd cgroup driver#
sudo mkdir -p /etc/containerd
# init a custom config file with default settings:
containerd config default | sudo tee /etc/containerd/config.toml

# modify the values in the configuration file:
sudo nano /etc/containerd/config.toml
# set the value of SystemdCgroup to true:
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]

    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
      ...
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
        ...
        SystemdCgroup = true  # set to true

# restart
sudo systemctl enable --now containerd
sudo systemctl restart containerd

The commands above initialize a configuration file with default settings and then modify it to enable SystemdCgroup. The default path of the custom Containerd configuration file is /etc/containerd/config.toml. You should restart Containerd after making the change.
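Before moving on, a quick check that the setting is in place:

Command: verify the cgroup driver setting#
grep -n 'SystemdCgroup' /etc/containerd/config.toml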

4*. Replacing Sandbox Image#

Due to my restricted internet access, which prevents me from reaching registry.k8s.io, I can't start my cluster with the default sandbox image when running kubeadm init. As a result, I need to replace the default sandbox image URL in the Containerd configuration file, changing the sandbox_image setting from registry.k8s.io/pause:3.8 to an alternative source - in my case, a mirrored image on aliyuncs. You may skip this step if you don't have similar restrictions.

/etc/containerd/config.toml sandbox_image#
# modify the sandbox_image entry to point at the mirror:
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.8"

Alternatively, you can use the following command to replace the default sandbox image URL in the configuration file.

Command: replace sandbox image URL#
sudo sed -i 's/registry.k8s.io/registry.aliyuncs.com\/google_containers/' /etc/containerd/config.toml

sudo systemctl restart containerd

You should restart Containerd for the change to take effect.
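You can confirm that the configuration now points at the mirror by grepping for the sandbox_image entry.

Command: verify the sandbox image setting#
grep 'sandbox_image' /etc/containerd/config.toml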

Installing Kubernetes Packages#

Now, we can install the Kubernetes packages.

1. Downloading the Kubernetes GPG Key#

The package manager will use this GPG key to verify the signatures on Kubernetes packages during installation. We should download the GPG key and place it in the /etc/apt/keyrings directory.
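On Ubuntu 24.04 the /etc/apt/keyrings directory should already exist; on older releases you may need to create it before downloading the key.

Command: create the keyrings directory (if missing)#
sudo mkdir -p -m 755 /etc/apt/keyrings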

Command: download the Kubernetes GPG key#
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

sudo chmod 644 /etc/apt/keyrings/kubernetes-apt-keyring.gpg

2. Adding a Kubernetes Apt Repository#

The following adds the official Kubernetes apt repository to your system's package sources. Ensure you set the signed-by option to the path of the key we downloaded in the previous step.

Command: add Kubernetes apt repository#
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

sudo chmod 644 /etc/apt/sources.list.d/kubernetes.list
/etc/apt/sources.list.d/kubernetes.list#
deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /

3. Install kubectl, kubelet, and kubeadm#

Install the necessary components of Kubernetes, which are kubectl, kubelet, and kubeadm.

Command: install Kubernetes packages#
sudo apt update

sudo apt install -y kubelet kubeadm kubectl
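Optionally, you can pin these packages so that a routine apt upgrade does not unexpectedly move the cluster to a newer Kubernetes version; the official installation guide recommends holding them.

Command: hold Kubernetes package versions#
sudo apt-mark hold kubelet kubeadm kubectl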

Setting Up Controller Nodes#

Warning

Everything up to this section should be installed and configured on all nodes. The setups in this section apply only to controller nodes; do not run these commands on worker nodes.

Now we will start setting up the controller node, which is the first node in the cluster. This type of node is responsible for managing the entire cluster by hosting the control plane components. Instructions are sent from controller nodes to worker nodes, which in turn execute the application workloads.

We will be setting up 1 controller node.

1. Initializing the Kubernetes Control Plane#

The command sudo kubeadm init initializes a Kubernetes control plane node, which will establish the necessary system components.

Command: initialize the control plane node#
sudo kubeadm init --control-plane-endpoint=192.168.1.179 --node-name=kube-controller --pod-network-cidr=10.244.0.0/16 --image-repository='registry.cn-hangzhou.aliyuncs.com/google_containers'
  • --control-plane-endpoint: Set this to the IP address of the control plane node.

  • --node-name: Set this to the name or hostname of the node.

  • --pod-network-cidr: Use this to set the CIDR range for the pod network. If you’re using Flannel for cluster networking, use 10.244.0.0/16.

  • --image-repository: Use this flag to pull component images from a different registry if you have access restrictions to the default Kubernetes registry.

Now we enable the kubelet service.

Command: enable kubelet service#
sudo systemctl enable --now kubelet

The following commands allow us to manage the Kubernetes cluster as a regular user.

Command: copy kubeconfig to user’s home directory#
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

The second line copies an initialized config file that contains the necessary credentials and cluster information to access the Kubernetes API server. The third line ensures that the current user can read and modify the config file without needing sudo privileges.
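At this point you can do a quick sanity check that kubectl can reach the API server as the regular user.

Command: verify API server access#
kubectl cluster-info

# the controller node will show NotReady until a CNI plugin is installed:
kubectl get nodes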

2.a Installing a CNI Plugin#

Kubernetes requires a network plugin to enable communication between pods across different nodes. We will be using Flannel, which is a popular choice for Kubernetes networking. To install Flannel in Kubernetes, you simply need to apply the Flannel configuration file.

Command: apply Flannel configuration directly#
  kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

After applying the configuration, we can check if the system pods are running properly. Ensure the two coredns pods are not in a pending state. If they are, you may need to check the logs of these pods to identify any issues. If you encounter problems similar to what I experienced, refer to the next subsection for troubleshooting steps.
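If the coredns pods stay in a Pending or otherwise unhealthy state, describing them is usually the fastest way to find the cause; the pod name below is just an example taken from the listing that follows.

Command: inspect a problematic pod (pod name is an example)#
kubectl -n kube-system describe pod coredns-7c445c467-9cxtt

# once the pod has started at least once, its logs are also useful:
kubectl -n kube-system logs coredns-7c445c467-9cxtt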

kubectl get pods -n kube-system (healthy)#
user@test-kube-master:~$ kubectl get pods -n kube-system
NAME                                      READY   STATUS    RESTARTS   AGE
coredns-7c445c467-9cxtt                   1/1     Running   0          15h
coredns-7c445c467-tkxk6                   1/1     Running   0          15h
etcd-kube-controller                      1/1     Running   0          15h
kube-apiserver-kube-controller            1/1     Running   0          15h
kube-controller-manager-kube-controller   1/1     Running   0          15h
kube-proxy-wr8p2                          1/1     Running   0          15h
kube-scheduler-kube-controller            1/1     Running   0          15h

2.b Configuring Kubernetes to Pull Flannel from a Private Registry#

Again, the above should work fine for most of us. However, if you're in a situation similar to mine, where the server nodes cannot access docker.io, you may need to pull the Flannel images from a mirror registry or from a private registry hosted on a node with VPN access that can pull images freely. I'll be demonstrating the latter approach. I also think this subsection is worth including, because not being able to pull from a private image registry straight away is a common problem that many people run into when setting up a Kubernetes cluster.

kubectl describe pod kube-flannel-ds-4cv7k -n kube-flannel (what the error looks like)#
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  2m10s               default-scheduler  Successfully assigned kube-flannel/kube-flannel-ds-4cv7k to kube-controller
  Warning  Failed     92s                 kubelet            Failed to pull image "docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel1": rpc error: code = DeadlineExceeded desc = failed to pull and unpack image "docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel1": failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/0b/0b2af0d159717fee418c744b3ec2d6ad0478aa9180987a06a790aca4adbe3b56/data?verify=1721409422-hLd%!F(MISSING)trChaBB5UnqjMq67%!B(MISSING)tNVVTs%!D(MISSING)": dial tcp 103.200.31.172:443: i/o timeout
  Warning  Failed     41s (x2 over 92s)   kubelet            Error: ErrImagePull
  Warning  Failed     41s                 kubelet            Failed to pull image "docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel1": rpc error: code = DeadlineExceeded desc = failed to pull and unpack image "docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel1": failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/0b/0b2af0d159717fee418c744b3ec2d6ad0478aa9180987a06a790aca4adbe3b56/data?verify=1721409472-8RcRSX%!B(MISSING)miU91pM5g1odJQzJPiZE%!D(MISSING)": dial tcp 103.200.31.172:443: i/o timeout
  Normal   BackOff    26s (x2 over 91s)   kubelet            Back-off pulling image "docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel1"
  Warning  Failed     26s (x2 over 91s)   kubelet            Error: ImagePullBackOff
  Normal   Pulling    11s (x3 over 2m9s)  kubelet            Pulling image "docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel1"

Check out my other post on setting up a private Docker registry. I’ll assume you already have a private Docker registry, and your Kubernetes cluster is able to reach the hosting server. Ensure you’ve imported the self-signed certificate, allowing you to successfully pull images from the private Docker registry. You can verify it by running openssl s_client -connect registry-server:5000 -showcerts </dev/null.

To proceed, log on to any machine in the same network that has unrestricted access to the public registry and pull the Flannel images from it. Next, tag and push the Flannel images to your private registry. You can accomplish this by running the following commands.

Command: tagging and pushing Flannel images#
# pull the Flannel images from the public registry:
sudo docker pull flannel/flannel:v0.25.4
sudo docker pull flannel/flannel-cni-plugin:v1.4.1-flannel1

# tag the pulled images and include the private registry address:
sudo docker image tag flannel/flannel:v0.25.4 registry-server:5000/flannel/flannel:v0.25.4
sudo docker image tag flannel/flannel-cni-plugin:v1.4.1-flannel1  registry-server:5000/flannel/flannel-cni-plugin:v1.4.1-flannel1

# push the tagged images to the private registry:
sudo docker image push registry-server:5000/flannel/flannel:v0.25.4
sudo docker image push registry-server:5000/flannel/flannel-cni-plugin:v1.4.1-flannel1

After pushing the Flannel images to your private registry, we need to download the default Flannel configuration file and modify it to use the private registry for pulling the Flannel images.
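You can fetch the default manifest from the same release URL we applied earlier, saving it locally so it can be edited.

Command: download the default Flannel configuration file#
curl -fsSLo kube-flannel.yml https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml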

a. Cut the namespace creation part from the configuration file and save it in a separate file:

kube-flannel-namespace.yaml#
apiVersion: v1
kind: Namespace
metadata:
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
  name: kube-flannel

b. Create the namespace and create a Kubernetes Secret in the kube-flannel namespace for accessing the private registry:

Command: create a namespace and secret for Flannel#
# create the namespace manifest file:
sudo nano kube-flannel-namespace.yaml
# copy the manifest above into the editor and save it.

# create a namespace for flannel:
kubectl apply -f kube-flannel-namespace.yaml

# create a Kubernetes Secret for accessing the private registry
# (the later steps assume the secret is named private-docker-registry):
kubectl create secret docker-registry private-docker-registry \
--namespace kube-flannel \
--docker-server=<your-registry-server> \
--docker-username=<username to log into the docker registry> \
--docker-password=<password to log into the docker registry>

# see the secret:
kubectl -n kube-flannel get secret private-docker-registry --output=yaml

c. Modify the image fields to pull from the private registry, and add an imagePullSecrets entry referencing the Secret that holds our private registry login information:

modified-kube-flannel.yaml (partial)#
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
    ...
spec:
  ...
  template:
    ...
    spec:
      containers:
      - args:
        ...
        name: kube-flannel
        image: registry-server:5000/flannel/flannel:v0.25.4
        ...
      initContainers:
      - args:
        ...
        name: install-cni-plugin
        image: registry-server:5000/flannel/flannel-cni-plugin:v1.4.1-flannel1
        ...
      - args:
        ...
        name: install-cni
        image: registry-server:5000/flannel/flannel:v0.25.4
        ...
      imagePullSecrets:
        - name: private-docker-registry
      ...

You can now apply the modified Flannel configuration with the following command.

Command: apply modified Flannel configuration#
kubectl apply -f modified-kube-flannel.yaml

Afterwards, check if the system pods (especially the coredns pods) are running properly with kubectl get pods -n kube-system.

Setting Up a Kubernetes Cluster On Worker Nodes#

Warning

The setups in this section apply only to worker nodes. Make sure you have completed the universal setups (Preparing Host Machines, Installing a Container Runtime, and Installing Kubernetes Packages).

Assuming that you have completed all previous steps correctly, you should now be able to join the worker nodes to the cluster. If you are using a private image registry, ensure the related setup is completed on the worker nodes as well.

Now on your controller node, run the following command.

Command: create new join command for worker nodes (on controller nodes)#
sudo kubeadm token create --print-join-command

This command generates a new token and provides the complete join command that can be used to add new worker nodes to the existing cluster.

Copy the output command and run it with sudo on each worker node. The command will look similar to this:

Command: join worker nodes to the cluster (on worker nodes)#
sudo kubeadm join 192.168.1.179:6443 --token djbije.06tw1yghp5sqpcml --discovery-token-ca-cert-hash sha256:ed972e37c2dfec7b3a0daefd04948622bc5829deeb6e9d4cf7fe2c28c277008b

You should get a message in the terminal that says this node has joined the cluster. To add more worker nodes, simply repeat these steps on each new worker node.
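Keep in mind that kubeadm join tokens expire after 24 hours by default, so you may need to create a new one later. You can list the existing tokens and their expiry on the controller node.

Command: list bootstrap tokens (on controller node)#
sudo kubeadm token list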

Final Status Check#

Finally, we want to verify that the node has indeed joined the cluster by running kubectl get nodes on the controller node.

Command: check the status of the nodes#
  kubectl get nodes
  kubectl describe nodes

If the output shows the nodes in the Ready state, they have successfully joined the cluster.

kubectl get nodes#
user@test-kube-master:~$ kubectl get nodes
NAME                 STATUS   ROLES           AGE   VERSION
kube-controller      Ready    control-plane   18h   v1.30.3
test-kube-worker-1   Ready    <none>          11s   v1.30.3

That is it, we have built a Kubernetes cluster.

Tearing Down a Kubernetes Cluster#

In this section, I will show you how to uninstall components of Kubernetes and remove the cluster. This can also be helpful if you did not install the cluster correctly and want to start over.

1. Removing a Node from a Cluster#

You may want to remove an entire node from the cluster. Run the following on the controller node.

Command: remove a worker node (controller Node)#
kubectl drain test-kube-worker-1 --ignore-daemonsets --delete-emptydir-data
kubectl delete node test-kube-worker-1
# kubectl delete node test-kube-worker-1 --force --grace-period=0

The first command evicts all regular pods from the node; --ignore-daemonsets lets the drain proceed past DaemonSet-managed pods (such as kube-proxy and Flannel), and --delete-emptydir-data also removes any emptyDir data on the node. The second command tells Kubernetes to delete the node object and remove the node from the cluster. If you want to force delete the node, you may add the flags --force --grace-period=0 to the second command.

Then log into the worker node and run kubeadm reset.

Command: reset a node (worker Node)#
sudo kubeadm reset

It stops the services that kubeadm set up and removes the cluster-related files that kubeadm created on the node.
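Note that kubeadm reset does not clean up everything; it prints a reminder that CNI configuration and iptables rules must be removed manually if you want a completely clean slate. A minimal additional cleanup on the worker node could look like the following.

Command: additional cleanup on the worker node#
sudo rm -rf /etc/cni/net.d

# only if you are sure no other services rely on these rules:
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X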

2. Resetting The Entire Cluster#

After removing all worker nodes, you may also want to reset the entire cluster for a fresh start. Run the following on the controller node.

Command: reset the controller node#
sudo kubeadm reset
sudo rm -rf /etc/kubernetes && sudo rm -rf $HOME/.kube && sudo rm -f /etc/cni/net.d/*flannel*

Just as we did on the worker node, we also need to remove the leftover files that Kubernetes created, remove the configuration we copied to $HOME/.kube, and delete the Flannel configuration files. If you are not using Flannel, remove the configuration files of whichever CNI plugin you are using instead. This ensures you don't have conflicting configurations when you reinstall the cluster.

If you no longer need Kubernetes, you can also remove the Kubernetes packages from the system.

Command: remove Kubernetes packages#
sudo apt-get remove kubectl kubeadm kubelet
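If you also want to drop the packages' configuration files and any dependencies that are no longer needed, you can purge and autoremove instead (optional).

Command: purge Kubernetes packages and clean up dependencies#
sudo apt-get purge kubectl kubeadm kubelet

sudo apt-get autoremove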