Discover sidecar containers in Kubernetes
Ep #6: What are sidecar containers in Kubernetes, and how can they enhance your applications?
Introduction
This article provides an introduction to the concept of sidecar containers in Kubernetes but also discusses some advanced use cases that even an experienced developer might not be aware of.
We will first give some definitions, then provide some resources to learn more about multi-container pods, and finally delve into a complex manifest that will teach us some lessons about logging in Kubernetes.
In this article, we are interested in sidecar containers as in the sidecar pattern used in service meshes like Istio, rather than in SidecarContainers, the specialised type of init container that was introduced a couple of weeks ago. I discussed this new Kubernetes feature in Kubernetes adds SidecarContainers feature.
The only requirement to fully replicate the setup provided below is that you have a Kubernetes cluster either locally on your laptop or remotely on the cloud and that you already know how to run a Pod in Kubernetes. That is enough to fully appreciate what I have to teach you.
What is a sidecar container?
You might have used Kubernetes for a while and never found the need for more than a single container in a Pod. Well, I have to tell you, there is so much more that you can do with a single Pod than just running your application.
To explain the concept of sidecar containers properly, we first have to discuss cgroups and namespaces in Linux.
From Wikipedia, we can read about cgroups:
cgroups (abbreviated from control groups) is a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, etc.) of a collection of processes.
and about namespaces:
Namespaces are a feature of the Linux kernel that partitions kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources.
If you have more than a single container in a Pod, each container will have its own cgroup but it is also able to share namespaces with all the other containers in the same Pod.
By default, containers share the network namespace and so they can reach each other at localhost. We will see this feature in action later in this article, together with an example of how two or more containers can share other namespaces like the process namespace.
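One way to observe namespace sharing on a Linux host, or from inside a container, is to compare the identifiers that the kernel exposes under /proc/<pid>/ns. The sketch below is only an illustration: the helper name `ns_id` is hypothetical and not part of any tool. Two processes that print the same identifier for a namespace type are in the same namespace.

```shell
# ns_id PID TYPE: print the namespace identifier of a process.
# TYPE is one of the entries under /proc/<pid>/ns, e.g. net, pid, mnt, ipc.
# The helper name is an assumption made for this example.
ns_id() {
  readlink "/proc/$1/ns/$2"
}

# Compare the network namespace of the current shell with that of a fresh
# child process (the readlink process itself): both lines should show the
# same identifier.
ns_id "$$" net
ns_id self net
```

Inside a multi-container Pod, running the same check in two containers should print the same identifier for `net`, while `mnt` should differ because each container keeps its own root file system.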
Use cases for sidecar containers
There are quite a few different use cases for the sidecar pattern:
By sharing the network namespace, you can add HTTPS support to an insecure legacy web service that serves traffic only on localhost
By sharing the PID namespace, you can add a monitoring endpoint to a legacy application on a separate HTTP port
By sharing a volume (containers in a Pod do not share a mount namespace, but they can mount the same volume), you can sync a remote config file to the local filesystem of an application and then trigger a config reload via Linux signals
What all those use cases have in common is that you can add new functionality to legacy applications by using sidecar containers instead of changing the source code of the original application. The caveat is that you will end up with a distributed system that is more complex to debug, but with the advantage that you can offload those extra functionalities to a third-party application in a completely different programming language from the main application.
More about these use cases and the related Kubernetes manifests can be found in Designing Distributed Systems.
Disclaimer: The previous link is an affiliate marketing link from the Amazon Affiliate program.
A complex use case
The manifest provided below is quite an artificial use case that I wouldn't suggest in production. It can be used mostly for education purposes since there are better ways to achieve the same results and more efficient container images.
I've provided it here since it can teach us quite a few lessons:
what is a sidecar container in Kubernetes?
how to generate sample logs for Nginx?
what is the default behaviour for logs in Kubernetes?
how do you redirect logs from stdout/stderr to the file system?
how to share process namespaces between containers in a Pod?
how to use volumes and volume mounts to write into a Kubernetes node from a Pod?
We first provide the K8s manifest so that you can refer to it in the following notes:
    apiVersion: v1
    kind: Pod
    metadata:
      name: sample-logs
      namespace: default
      labels:
        app: sample-logs
    spec:
      # Lets every container in the Pod see the processes of the others.
      shareProcessNamespace: true
      containers:
        - name: client
          image: curlimages/curl:7.84.0
          args:
            - /bin/sh
            - -c
            # ">" folds the lines below into a single line,
            # so every command has to end with ";".
            - >
              i=0;
              while true;
              do
              echo "$(date) INFO $i";
              curl -s http://localhost:80 > /dev/null 2>&1;
              i=$((i+1));
              sleep 1;
              done
        - name: nginx
          image: nginx:1.23.1-alpine
          ports:
            - containerPort: 80
        - name: tail-logs
          image: busybox:1.28
          args: [/bin/sh, -c, "more /proc/$(ps aux | grep 'nginx: master' | head -n 1 | cut -d ' ' -f4)/fd/1 | tee /var/log/sample-logs"]
          volumeMounts:
            - name: varlog
              mountPath: /var/log
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
            type: "Directory"
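To try the manifest, the usual kubectl workflow applies. The helpers below are a minimal sketch, assuming the manifest was saved as sample-logs.yaml (a hypothetical file name) and that kubectl already points at your cluster. The function names are made up for this example.

```shell
# Assumption: the manifest above was saved under this (hypothetical) file name.
MANIFEST="sample-logs.yaml"
POD="sample-logs"

# Create the Pod (or update it if it already exists).
deploy_pod() {
  kubectl apply -f "$MANIFEST"
}

# Block until the Pod reports the Ready condition, or give up after 2 minutes.
wait_for_pod() {
  kubectl wait --for=condition=Ready "pod/$POD" --timeout=120s
}

# Print the logs of one container: client, nginx or tail-logs.
container_logs() {
  kubectl logs "$POD" -c "$1"
}

# Remove the Pod when you are done.
delete_pod() {
  kubectl delete pod "$POD"
}
```

A typical session would be `deploy_pod && wait_for_pod`, followed by `container_logs tail-logs` to look at the output of the last container, and `delete_pod` at the end.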
A few notes about the above manifest:
A single Pod, with no Deployment or Service, since we are not exposing the main application to an end user. We are only interested in generating some sample logs.
One main container with a sample application. Here we use an Nginx web server, but you can easily swap this image with anything else. If I were running this workload in production, I would probably suggest a more secure image like cgr.dev/chainguard/nginx from Chainguard. Since that could have impacted the setup of the other containers, we stick with the `nginx:1.23.1-alpine` image here.
One sidecar container that acts as a client for our main application. The client is implemented with a shell script that curls the web service endpoint `http://localhost:80` once per second. Here we use the image `curlimages/curl` since we only need the `curl` command to implement our client. As you can see, nothing in the manifest explicitly enables the sharing of the network namespace: containers in a Pod share it by default. You could have implemented the same functionality in a separate manifest with a CronJob object. That would be the preferred way to implement a client, especially if you wanted to provide external access to an end user but also create some synthetic traffic. As we said already, this is just for educational purposes.
Another sidecar container that reads the standard output of the main application and writes the results into a file at `/var/log/sample-logs`. Here we use the base image `busybox` since we need more than a single command-line tool. There is quite a lot to unpack in this container. First, we grep for the PID of the Nginx master process with the command `ps aux | grep 'nginx: master' | head -n 1 | cut -d ' ' -f4`. Then we read the stdout of that process at `/proc/<pid>/fd/1`. Finally, we send those logs both to the stdout of the sidecar container itself and to disk with the command `tee /var/log/sample-logs`. The only reason why we can grep the PID of the Nginx web server from another container is that we enabled sharing the PID namespace with the config `shareProcessNamespace: true`.
Finally, we create a `hostPath` volume of type `Directory` at the host path `/var/log` and mount it into the last sidecar container. This allows the last container to write to the Kubernetes cluster node when it writes to its local file system.
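The PID lookup in the tail-logs container relies on `cut -d ' ' -f4`, which counts every single space as a separator, so the field that holds the PID shifts with the amount of padding that `ps` prints. A whitespace-tolerant variant can be sketched with awk. This is an illustration rather than a drop-in replacement: it assumes the BusyBox `ps` column layout (PID, USER, TIME, COMMAND), and both the function names and the sample process list are hypothetical.

```shell
# Read `ps` output on stdin and print the PID of the first process whose
# command starts with "nginx: master".
# Assumption: BusyBox-style columns, i.e. PID USER TIME COMMAND.
nginx_master_pid() {
  awk '$4 == "nginx:" && $5 == "master" { print $1; exit }'
}

# Hypothetical process list, shaped like what a container in the Pod might see.
sample_ps_output() {
  cat <<'EOF'
PID   USER     TIME  COMMAND
    1 65535     0:00 /pause
    7 100       0:00 /bin/sh -c i=0; while true; do sleep 1; done
   14 root      0:00 nginx: master process nginx -g daemon off;
   45 101       0:00 nginx: worker process
   52 root      0:00 grep nginx: master
EOF
}

sample_ps_output | nginx_master_pid   # prints 14
```

Because the match is on the command columns rather than on the whole line, the `grep` process in the sample list is not picked up, and a single-digit PID works as well.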
A couple of notes about why we have to have quite a complicated second sidecar container:
By default, Kubernetes redirects the stdout of a container to `/var/log/containers/<container_id>` on the cluster node where the Pod is running, without the container needing to be aware of this. More about how logging works in Kubernetes at Logging Architecture. So in theory we wouldn't have needed the second sidecar container if we were happy to have the logs at the default location.
By default, the Nginx web server writes the access logs to `/var/log/nginx/access.log` and the error logs to `/var/log/nginx/error.log`. In the image we used here, though, the Dockerfile redirects those files to `/dev/stdout` and `/dev/stderr` respectively, with the command `ln -sf /dev/stdout /var/log/nginx/access.log && ln -sf /dev/stderr /var/log/nginx/error.log`. This is quite a standard practice so that the logs are automatically collected by the Kubernetes logging architecture and written to disk at the default location `/var/log/containers/<container_id>`. While this might sound convenient, it also gets in the way if we want to read only the access logs and not the error logs. By using this sidecar container, we can tail only the access logs and send them to a custom location. This last point is the only simple way we can split access logs from error logs.
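The symlink trick and the `tee` step can be reproduced outside Kubernetes. The following is a minimal sketch in a temporary directory. The file names, the function name and the sample log line are invented for this example, and nothing here touches a real Nginx installation.

```shell
# Work in a throwaway directory (hypothetical file names throughout).
demo_dir="$(mktemp -d)"

# Same idea as the image's Dockerfile: the "log file" is a symlink to stdout.
ln -sf /dev/stdout "$demo_dir/access.log"

# Stand-in for Nginx appending one line to its access log.
write_access_log() {
  echo 'GET / HTTP/1.1 200' >> "$demo_dir/access.log"
}

# The line comes out on stdout; tee forwards it and also saves a copy on disk.
write_access_log | tee "$demo_dir/sample-logs"
```

After running it, `$demo_dir/sample-logs` contains the same line that was printed, which mirrors what the tail-logs sidecar does with the access log.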
Conclusion
As mentioned before, the Kubernetes manifest provided above is meant for educational purposes only and is not a production-ready use case.
I hope that you learned a few lessons from this use case.