Docker inside Airflow when running via Docker Compose

Srujan Deshpande
Published in Towards Dev
3 min read · Jan 4, 2022

Logos of Airflow and Docker

Apache Airflow is a workflow management system that is remarkably easy to use and get started with.

A simple way to get started with Airflow is to run it in Docker. The official Airflow docs include a tutorial and even provide a docker-compose file as a starting point.

Using Airflow in a container

Most tasks work out of the box inside a container. The exception is DockerOperator, which requires Docker to be available inside that container. Other tasks of yours may also call Docker directly.
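As an illustration of a task that uses Docker directly, here is a minimal sketch of a Python callable that shells out to the docker CLI. The function names (`docker_run_argv`, `run_in_docker`) and the image in the comment are hypothetical and not from the original setup. The call only succeeds once the docker client is installed in the Airflow container and a Docker daemon is reachable from it.

```python
import subprocess
from typing import List, Sequence


def docker_run_argv(image: str, command: Sequence[str]) -> List[str]:
    """Build the argument list for a throwaway `docker run`."""
    # --rm removes the container once the command exits.
    return ["docker", "run", "--rm", image, *command]


def run_in_docker(image: str, command: Sequence[str]) -> str:
    """Run `command` in a fresh container and return its stdout.

    Raises FileNotFoundError if the docker client is not installed, and
    subprocess.CalledProcessError if docker exits with a non-zero status.
    """
    result = subprocess.run(
        docker_run_argv(image, command),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Example call (hypothetical image; needs a reachable Docker daemon):
#   run_in_docker("alpine:3.15", ["echo", "hello"])
```

Without the setup described in the rest of this article, a callable like this would typically fail inside the stock Airflow container because the `docker` executable is not present.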

A perpetually spinning top from the movie Inception
Who knows what dind will spin up?

Installing Docker inside the Airflow container is possible, but running docker-in-docker (dind) adds unnecessary complexity. A better solution is to use the host machine’s Docker daemon and control it from the Airflow container.

Setting up Docker in Airflow

It’s fairly simple to use the host machine’s Docker daemon inside Airflow. There are two main changes we have to make:

  1. Install the Docker CLI client in the Airflow image
  2. Mount the Docker socket into the container

Installing the Docker client in the Airflow image

To install the Docker client, we need to create our own Dockerfile. I’ll be using the apache/airflow:2.2.3 image, but you are free to use any other version or to integrate the steps into an existing Dockerfile. The installation method is adapted from the approach linked here.

Dockerfile

Mounting the Docker socket into the container

When running on a Linux machine, the default location of the Docker socket is /var/run/docker.sock. We can mount this socket into the container directly, so that the container uses the host’s Docker daemon instead of running its own.
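To confirm from inside the container that the socket really is mounted and usable, a small check like the following can help. This is a sketch using only the Python standard library. The function name `docker_socket_status` and its return strings are hypothetical. It checks only the file type and permissions, not whether a daemon is actually answering on the other end.

```python
import os
import stat

DOCKER_SOCKET = "/var/run/docker.sock"  # default location on Linux


def docker_socket_status(path: str = DOCKER_SOCKET) -> str:
    """Return 'missing', 'not a socket', 'no access' or 'ok' for `path`."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if not stat.S_ISSOCK(mode):
        return "not a socket"
    # Talking to the daemon requires read and write access to the socket.
    if not os.access(path, os.R_OK | os.W_OK):
        return "no access"
    return "ok"
```

A result of "no access" usually points to the user and group settings of the container rather than to the mount itself.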

A couple of things to note about the docker-compose file (line numbers refer to that file):

  1. The image is built on line 4 (the build entry). Alternatively, you can pre-build the image and reference it on line 5 (the image entry) instead.
  2. The volume on line 7 mounts the host’s Docker socket into the container. Any docker command run inside the container is then executed by the host’s Docker daemon, as if it had been run on the host.
  3. Docker requires the user to be root or a member of the docker group. Here, the user is set to root on line 8. You can replace this with a non-root user, as long as that user is in the docker group and the group is specified as well, e.g. 1001:998 (the user and group IDs will differ on your machine).
  4. The volume, image, and user can instead be set directly on the airflow-worker service in docker-compose.yml, which keeps the changes from affecting the other Airflow containers.
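For the non-root option in note 3, the UID:GID pair can be derived on the host rather than looked up by hand. The sketch below assumes that the group owning the socket is the docker group, which is typical on Linux but worth verifying on your machine. The function name `compose_user_value` is hypothetical.

```python
import os
from typing import Optional

DOCKER_SOCKET = "/var/run/docker.sock"  # default location on Linux


def compose_user_value(socket_path: str = DOCKER_SOCKET,
                       uid: Optional[int] = None) -> str:
    """Return a 'UID:GID' string for the compose `user` setting.

    UID defaults to the current user. GID is the group that owns the
    socket, assumed here to be the docker group.
    """
    if uid is None:
        uid = os.getuid()
    gid = os.stat(socket_path).st_gid
    return f"{uid}:{gid}"
```

Run on the host, `compose_user_value()` returns a string in the same shape as the 1001:998 example above, with the IDs of your own machine.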

Conclusion

Mounting the host’s Docker socket into the Airflow container is a simple way to run Airflow in containers while keeping Docker-based tasks, such as those using DockerOperator, fully functional.
