Working with Docker
In this section, we will learn to create and use our own Docker container.
Note
Prerequisites:
1. Have Docker installed on your laptop
2. Create a Docker Hub account
Building Images From a Dockerfile
- Dockerfiles are a reproducible and well-documented method of developing your own container.
- They store the whole procedure of how an image is built.
Dockerfile Format
A Dockerfile has two types of lines:
- Instructions, each followed by its arguments
- Comments, which start with #
A basic Dockerfile looks like this:
# Comment
INSTRUCTION arguments
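To make the comment/instruction distinction concrete, here is a minimal sketch that writes a tiny, hypothetical Dockerfile to a scratch directory and prints only its instruction keywords. The helper name list_instructions is made up for illustration, and it assumes every instruction fits on one line (no backslash continuations).

```shell
# Write a tiny, hypothetical Dockerfile into a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/Dockerfile" <<'EOF'
# Base image
FROM ubuntu:18.04
# Refresh the package index
RUN apt-get update
EOF

# Print the instruction keyword of every line that is neither a comment nor blank.
# Assumption: each instruction sits on a single line.
list_instructions() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | awk '{ print $1 }'
}

list_instructions "$workdir/Dockerfile"
```

For the file above, this should print FROM and RUN, one per line; the two comment lines are skipped.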
General Steps
- Choose a base operating system
- Install dependencies and other useful packages
- Install scientific application
- Set any environment variable that might be useful
Install a tool in a Docker container
Let’s build an image that will run fastqc, a popular bioinformatics tool for checking the quality of DNA sequencing reads.
To make it easier, follow along with the sample Dockerfile.
$ cat Dockerfile
# Choose a base operating system
FROM ubuntu:18.04
# Update and install necessary packages
RUN apt-get update && apt-get upgrade -y \
&& apt-get install -y wget unzip default-jdk libfindbin-libs-perl
# Install the application
RUN wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.8.zip \
&& unzip fastqc_v0.11.8.zip \
&& rm fastqc_v0.11.8.zip \
&& chmod 755 /FastQC/fastqc
# Use environment variable to add executable to PATH
ENV PATH="/FastQC:$PATH"
From your current working directory, preferably a clean one, copy the contents of this file into a new file called Dockerfile and save it.
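A clean directory matters because docker build sends the whole directory (the build context) to the Docker daemon. If the directory cannot be kept clean, a .dockerignore file next to the Dockerfile can exclude files from the context. The patterns below are hypothetical examples, not part of the original tutorial; adjust them to whatever large files you have.

```shell
# Hypothetical patterns; adjust them to the files that sit next to your Dockerfile.
cat > .dockerignore <<'EOF'
# Sequencing data does not need to be part of the build context
*.fastq
*.fastq.gz
# Version-control metadata
.git
EOF

# Show the patterns that will be excluded from the build context.
grep -v '^#' .dockerignore
```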
Build
Replace username with your Docker Hub username (the sample output below uses reshg).
$ docker build -t username/fastqc:0.11.8 .
Sending build context to Docker daemon 251.4MB
Step 1/4 : FROM ubuntu:18.04
---> 775349758637
. . .
Successfully built b5d705fbdfa1
Successfully tagged reshg/fastqc:0.11.8
Check images
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
reshg/fastqc 0.11.8 b5d705fbdfa1 3 hours ago 708MB
Run
We use the docker run command to run containers from an image. We pass a command to run in the container. Similar to running other programs on Unix systems, we can run containers in the foreground (attached) or in the background.
$ docker run --rm username/fastqc:0.11.8 which fastqc
/FastQC/fastqc
Unpacking the ‘docker run’ command
docker run | Run a container |
--rm | Remove the container when the process completes |
username/fastqc:0.11.8 | The name of the image |
which fastqc | The command to run inside the container |
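To run the tool on real data, the container needs access to files on the host. One common approach is a bind mount with the -v flag. The sketch below is an illustration under assumptions: the mount point /data, the file name sample.fastq, and the helper run_fastqc are all hypothetical. It prints the command by default (a dry run); set DOCKER=docker to actually execute it. It relies on POSIX sh or bash word splitting of $DOCKER.

```shell
# Dry run by default: prints the command. Set DOCKER=docker to execute it.
DOCKER="${DOCKER:-echo docker}"
IMAGE="username/fastqc:0.11.8"   # replace username with your Docker Hub username

# Run fastqc on one file from the current directory.
# -v mounts the current directory at /data inside the container
# (/data is an arbitrary choice for this sketch).
run_fastqc() {
    $DOCKER run --rm -v "$PWD:/data" "$IMAGE" fastqc "/data/$1"
}

run_fastqc sample.fastq   # sample.fastq is a hypothetical input file
```

Because the directory is mounted rather than copied, any report that fastqc writes under /data should appear in the current directory on the host.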
Push Image to Docker Hub
$ docker push username/fastqc:0.11.8
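Pushing requires that you are logged in to Docker Hub and that the image name starts with your own username. A minimal sketch of the two steps follows; the variable DOCKERHUB_USER and the helper publish_image are hypothetical names. As a dry run, it prints the commands by default; set DOCKER=docker to execute them.

```shell
# Dry run by default: prints the commands. Set DOCKER=docker to execute them.
DOCKER="${DOCKER:-echo docker}"
DOCKERHUB_USER="${DOCKERHUB_USER:-username}"   # hypothetical variable name
IMAGE="$DOCKERHUB_USER/fastqc:0.11.8"

# Log in (docker login prompts for a password or access token), then push.
publish_image() {
    $DOCKER login -u "$DOCKERHUB_USER" && $DOCKER push "$IMAGE"
}

publish_image
```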
Alternatively, you can do this interactively
Note
The preferred way to build a Docker image is with a Dockerfile. For testing, however, working inside the container is sometimes helpful.
Open a base Docker image
$ docker run -it ubuntu /bin/bash
Do not pass --rm here: the container must still exist after you exit so that it can be committed.
Unpacking the interactive ‘docker run’ command
docker run | Run a container |
-it | Connect your terminal to the container (interactive mode with a pseudo-terminal) |
ubuntu | The name of the image |
/bin/bash | The shell to start |
Install your tool in the image
root@ded8d40f1a1e:/#
# install dependencies
$ apt-get update && apt-get upgrade -y
$ apt-get install -y wget unzip default-jdk libfindbin-libs-perl
# install FastQC
$ wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.8.zip
$ unzip fastqc_v0.11.8.zip
$ rm fastqc_v0.11.8.zip
# make fastqc executable
$ chmod 755 /FastQC/fastqc
# add fastqc to the system path by linking to /bin
$ ln -s /FastQC/fastqc /bin
$ exit
Commit your image
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9f0d7afff313 ubuntu "/bin/bash" 9 minutes ago Exited (0) 18 seconds ago affectionate_einstein
# Grab the CONTAINER ID of the ubuntu container created a few minutes ago.
$ docker commit 9f0d7afff313 username/fastqc:0.11.8
sha256:738f35b39c5711f722cc6d9b550215454f2a7ea765c73667355d383a8a9285bf
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
reshg/fastqc 0.11.8 738f35b39c57 12 seconds ago 718MB
Push your image to Docker Hub
$ docker push username/fastqc:0.11.8
The push refers to repository [docker.io/reshg/fastqc]
6750c6c8d397: Pushing [=========> ] 124.3MB/654.2MB
Running a Container in Detached Mode
We can also run a container in the background (detached mode). We do so using the -d flag:
$ docker run -d ubuntu sleep infinity
f406f6b0c34d4bba552a7106e951a5d667dcbfddcb429e2d42b0ac7a10a919fc
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f406f6b0c34d ubuntu "sleep infinity" 6 seconds ago Up 5 seconds romantic_wilson
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f406f6b0c34d ubuntu "sleep infinity" 15 seconds ago Up 14 seconds romantic_wilson
b7c50065ea75 ubuntu "/bin/bash" 21 hours ago Up 21 hours charming_robinson
a197e85bee14 reshg/fastqc:latest "/bin/bash" 21 hours ago Exited (0) 21 hours ago reverent_williamson
4eb4cf433d32 reshg/fastqc:latest "which fastqc" 21 hours ago Exited (1) 21 hours ago stoic_dhawan
1eb1de6ac64c reshg/fastqc "which fastqc" 21 hours ago Exited (1) 21 hours ago upbeat_mcnulty
Note: The docker ps command only shows running containers; it does not show containers that have exited. To see all containers on the system, use docker ps -a.
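A detached container keeps running until it is stopped, so it helps to know how to interact with it and clean it up. The sketch below gives the container a name with --name so that later commands do not need the generated ID. The name sleeper and the helper detached_demo are hypothetical. By default it only prints the commands (a dry run); set DOCKER=docker to execute them.

```shell
# Dry run by default: prints the commands. Set DOCKER=docker to execute them.
DOCKER="${DOCKER:-echo docker}"
NAME="sleeper"   # hypothetical container name

detached_demo() {
    # Start the container in the background under a fixed name.
    $DOCKER run -d --name "$NAME" ubuntu sleep infinity
    # Run a one-off command inside the running container.
    $DOCKER exec "$NAME" cat /etc/os-release
    # Stop the container, then remove it so it no longer shows up in docker ps -a.
    $DOCKER stop "$NAME"
    $DOCKER rm "$NAME"
}

detached_demo
```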
Summary
A Dockerfile allows you to transparently document all the dependencies and steps needed to build a software tool. You can then run this tool as a Docker container for full reproducibility.