A Beginner’s Guide to Docker

Purpose of this guide

This guide is intended to provide an overview of what Docker is, how it’s used, and the basics of running Docker containers. It will not go in depth on creating a Docker image, or on the more nuanced aspects of using Docker. For a more in-depth introduction, you can read through the official Docker docs.

Overview

Docker is a tool for containerizing code. You can think of it as a lightweight virtual machine. Docker works by defining an image, which includes everything you need to run your code. You start with a base image, which is a pre-made Docker image, then install your dependencies on top. Python? Java? Fortran libraries? Almost anything you can install on a normal computer, you can install in a Docker image. There are plenty of base images available. You can start with something as basic as Arch Linux, or as complicated as a Windows base image with Python already installed.

Once you have created your Docker image, it can be uploaded to LASP’s internal registry for other people or machines to use. Every machine runs the Docker image in the same way. The same image can be used for local development, for running tests in Jenkins or GitHub Actions, or for running production code in AWS Lambdas. It creates a standard environment, so new developers can get started quickly, and so everyone can keep their local environments clean. Docker also makes it possible to archive the entire environment, not just the code. Code is only useful as long as people can run it. Finally, unlike many virtual machines, Docker is lightweight enough to be run only when needed, and updated frequently.

Basics of Docker

If you’ve used virtual machines in the past, the basic uses of Docker will be familiar to you. A few terms are defined below. For a more in-depth explanation, see the official Docker overview.

Docker Image: The Docker image contains all the information needed to run the Docker container. This includes the operating system's user-space files (the container shares the kernel of the machine it runs on), the file system, and dependencies.

Docker Container: A Docker container is a specific instance of a Docker image. A Docker container is used to run commands within the environment defined by the Docker image.

Dockerfile: The Dockerfile is what defines a Docker image. It contains the instructions for building a Docker image, including the base image to use, the installation steps to run, the directories to create, and so on.

Docker Compose: A Docker Compose file is an optional file that defines how to run Docker images. It is useful if you run multiple images in tandem, attach volumes or networks to the containers, or find yourself repeating the same commands for creating containers and want to automate that.

Docker Registry: A registry is a place to store and retrieve Docker images. This is one way to share already-built Docker images. LASP has a private registry, the LASP Docker registry.

So, you define a Docker image using a Dockerfile, and can optionally describe how to run it in a Docker Compose file. Running the image produces a Docker container, which runs your code inside the environment the image defines. An image can be pushed up to a registry, where anyone with access can pull the image and run the container themselves without needing access to the Dockerfile.
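As a rough illustration of what a Compose file can look like, here is a minimal sketch. It assumes a Dockerfile in the same directory; the service name tutorial and the ./src and /app/src paths are hypothetical placeholders, not part of any existing LASP setup.

```yaml
# compose.yaml (illustrative sketch; names and paths are placeholders)
services:
  tutorial:                    # hypothetical service name
    build:
      context: .               # build from the current directory
      dockerfile: Dockerfile   # which Dockerfile to use
    image: docker_tutorial:latest
    platform: linux/amd64      # same role as --platform on the command line
    volumes:
      - ./src:/app/src         # hypothetical host folder mounted into the container
    stdin_open: true           # equivalent of -i
    tty: true                  # equivalent of -t
```

With a file like this in place, docker compose build builds the image and docker compose run --rm tutorial starts an interactive container from it, so the flags do not have to be retyped each time.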

Getting Started

This section will outline some basic commands and use cases for Docker. First, you need to install Docker on your computer. Next, start by creating a Dockerfile. The example Dockerfile below starts from an Alpine image and installs Python. Traditionally, Dockerfiles are named Dockerfile, although you can add a prefix if needed (e.g., dev.Dockerfile). The docker build command looks in the current directory for a file named Dockerfile by default, but you can specify a different file through command line arguments or through your Docker Compose file.

Generally, each Docker image should be as small as possible, and each Dockerfile should do only one thing. If you need two very similar Docker images, you can use multi-stage builds to share the common steps between them. You can orchestrate multiple Docker containers that depend on each other using Docker Compose.
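To make the multi-stage idea concrete, here is a minimal sketch of one Dockerfile that produces two similar images from a shared base stage. The stage names (base, dev, prod) and the app.py file are hypothetical and chosen only for illustration.

```dockerfile
# Shared base stage: everything both images need
FROM alpine AS base
RUN apk add --no-cache python3

# Development image: the base plus some interactive tooling
FROM base AS dev
RUN apk add --no-cache bash git
CMD ["bash"]

# Production image: the base plus a (hypothetical) application file
FROM base AS prod
COPY app.py /app/app.py
CMD ["python3", "/app/app.py"]
```

The --target flag of docker build selects which stage to build, for example docker build --target dev -t docker_tutorial:dev . for the development variant. Without --target, the last stage in the file is built.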

To start, your Dockerfile should specify the base image using FROM <image>. Then, you can set up the environment by using RUN instructions to run shell commands. Finally, you can finish the Dockerfile with a CMD instruction. This is an optional instruction that sets the command to run when a container starts from the finished image.

Here is our example Dockerfile:

# Starting with alpine as our base image
FROM alpine

# Install python
RUN apk add --update --no-cache python3 && ln -sf python3 /usr/bin/python
RUN python3 -m ensurepip
RUN pip3 install --no-cache-dir --upgrade pip setuptools

In the same folder, we run the build command to build our image:

docker build --platform linux/amd64 -f Dockerfile -t docker_tutorial:latest .

The --platform linux/amd64 flag is optional unless you are on a Mac with Apple silicon (for example, an M1 chip). The -f flag gives the name of the Dockerfile; in this case it is also optional, since Dockerfile is the default value. The -t flag names and tags the image so that you can keep track of the images and containers on your system. latest is the tag conventionally used for the most recent version of a Docker image. Another useful flag is --no-cache, which forces a clean rebuild. The docker build reference documentation has the full list of flags.

Now that we have built the image, we can see all the Docker images that are built on our system by running the docker images command:

$ docker images
REPOSITORY                       TAG       IMAGE ID       CREATED         SIZE
docker_tutorial                  latest    71736be7c555   5 minutes ago   91.9MB

Info: If you prefer to use a GUI, the Docker Desktop application can also be used to view, run, and delete docker images.

If we wanted, we could now push that image up to a registry by using the docker push command. Alternatively, instead of building the image, you could pull an existing image using the docker pull command.
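As a sketch of what pushing and pulling can look like, the shell functions below wrap the docker tag, docker push, and docker pull commands. The hostname registry.example.com is a placeholder, not the real LASP registry address, and the function names are hypothetical.

```shell
# Placeholder registry host; replace with your registry's hostname.
REGISTRY="registry.example.com"

# Tag a local image with the registry prefix, then upload it.
# Usage: push_image docker_tutorial:latest
push_image() {
    docker tag "$1" "$REGISTRY/$1"
    docker push "$REGISTRY/$1"
}

# Download an image that has already been pushed to the registry.
# Usage: pull_image docker_tutorial:latest
pull_image() {
    docker pull "$REGISTRY/$1"
}
```

You will usually need to run docker login with the registry hostname once before pushing to or pulling from a private registry.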

Now that we have an image locally, we can run a container from that image using the docker run command:

docker run --platform linux/amd64 -it --name tutorial docker_tutorial:latest

Once again, the --platform flag is optional unless you are on an Apple silicon Mac. The -it flag combines -i (keep standard input open) and -t (allocate a terminal), which together let you interact with the container from the command line. The --name flag gives the container a name. Another key flag to know is -d, which runs the container in detached mode, so it runs in the background without attaching to your terminal. You can see all currently running Docker containers with docker ps, and all existing Docker containers, including stopped ones, with docker ps -a.
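The detached workflow can be sketched as follows. The steps are wrapped in a function so they can be rerun as a unit. The container name tutorial_bg and the sleep 300 command are arbitrary choices for illustration, and the --platform flag is left out for brevity.

```shell
# Start a container in the background, inspect it, then clean it up.
run_in_background() {
    # -d detaches; "sleep 300" keeps the container alive for five minutes
    docker run -d --name tutorial_bg docker_tutorial:latest sleep 300
    docker ps                   # lists running containers only
    docker logs tutorial_bg     # shows any output the container has produced so far
    docker stop tutorial_bg     # stops the container
    docker rm tutorial_bg       # removes the stopped container
}
```

Calling run_in_background after building the image runs these five commands in order. While the container is running, docker ps should list tutorial_bg.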

Running the docker run command will start your container and connect to it, so you can run commands interactively. If you run which python in this container, you should see that Python is successfully installed. Press Ctrl+D (or type exit) to leave the shell, which also stops the container.

With that, you have successfully run the Docker container! This is a good way to debug and run code inside a container for development purposes. If you want the container to execute code automatically when it starts, you can use the CMD instruction. For example, this can be used to run tests or the main application for a Lambda container.

To do this, add a line with a CMD at the bottom of your Dockerfile:

CMD echo "Hello world"

Once you rebuild the image, you can run it without the interactive session:

docker run --platform linux/amd64 docker_tutorial:latest

This starts the container, executes the command given in CMD, and then exits. You can confirm that the container has exited with docker ps -a. CMD is how most Docker containers that run code without human intervention work. For an example of a system where this is in use, you can read the documentation on the TIM tests in Docker.
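As a sketch of the "run tests" use case, the Dockerfile below copies a test directory into the image and runs it with Python's built-in unittest module when the container starts. The tests/ directory is a hypothetical layout assumed to sit next to the Dockerfile. This is one possible arrangement, not the setup used by any particular LASP project.

```dockerfile
FROM alpine
RUN apk add --no-cache python3

WORKDIR /app
# Hypothetical layout: a tests/ directory next to the Dockerfile
COPY tests/ ./tests/

# Exec form (a JSON array) runs the command directly, without a shell
CMD ["python3", "-m", "unittest", "discover", "-s", "tests"]
```

Any command given after the image name in docker run replaces the CMD, so the same image can still be opened interactively with docker run -it followed by the image name and sh.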

Docker Cheat Sheet

Here is a list of Docker commands that might be useful to have as a shorthand:

# build locally
docker build --platform linux/amd64 -f <filename> -t <name>:latest .

# Run in interactive mode
docker run --platform linux/amd64 -it --name <container name> <image name>:latest

# Log in to a Docker registry
docker login <registry_hostname>

# View docker images
docker images

# View docker containers
docker ps -a

# Remove stopped containers
docker container prune

# Remove dangling images (run after container prune)
docker image prune

Acronyms

  • apk = Alpine Package Keeper

  • amd64 = 64-bit Advanced Micro Devices

  • AWS = Amazon Web Services

  • pip = Pip Installs Packages

  • ps = Process Status

  • tty = TeleTYpe (terminal)

Credit: Content taken from a Confluence guide written by Maxine Hartnett