Docker

READ THIS FIRST! On HP workstation equipment, you’ll likely need to enable Intel VT-x (Virtualization Technology) in the BIOS to allow virtualisation software to run. I use an HP EliteDesk 800 G2 Mini and the factory BIOS setting has VT-x disabled by default. If in any doubt, check your BIOS now, or you WILL encounter problems later and could waste hours troubleshooting.

When it comes to containers, Docker is the most popular technology out there. So, why do we need containers? How do they differ from VMs? Very briefly, a VM (as in a VMware ESXi Virtual Machine) is an entirely self-contained installation of a complete operating system that sits on top of a “bare metal hypervisor” layer, which in turn sits on top of the physical hardware. Unlike installing an OS directly on the physical hardware, the bare metal hypervisor layer allows multiple installations of multiple OSes to co-exist on the same hardware, entirely separate apart from the physical resources they share. There may be a cluster of ESXi hosts, managed by a central vCenter server that allows better distribution of multiple VMs across multiple physical hosts for the purposes of Distributed Resource Scheduling and High Availability. This is a more efficient use of physical servers, but it is still quite wasteful of resources and storage since many Windows VMs, for example, would be running the same binaries and storing the same information many times over.

Containers, on the other hand, are logically separated applications and their dependencies, each residing in an isolated “container”, “zone” or “jail” (the name depends on the underlying UNIX/Linux based OS) on top of a single instance of that OS. They share common components of the underlying OS, which is a more efficient use of space and physical resources since only one instance of the operating system is running on the physical machine. This also reduces the overhead of patching and, to some extent, monitoring: where an application would otherwise be hosted on multiple, entirely separate full stack VMs, with containers only the parts of the stack with unique/incompatible dependencies are separated into their own container. This means that a container may be very small indeed compared to a VM, and containers typically restart in the time it takes to start a daemon or service, rather than the time it takes to boot an entire OS.

So in summary, a container is a more intelligent, more efficient way of implementing the various layers in a full stack application that won’t otherwise co-exist on the same OS due to their individual dependencies for slightly different versions of surrounding binaries and/or libraries.

On VMware ESXi, there is no Operating System layer (shown in Orange); the Hypervisor layer (shown in Blue) runs directly atop the Server hardware (shown in Grey), hence the term bare-metal hypervisor. VMware Workstation or Oracle VirtualBox provides similar full OS VM separation, but within a software hypervisor running as an application in its own right atop an underlying desktop OS such as Windows or Linux. Docker is similar to a software hypervisor, but rather than store multiple similar full OS/App stacks, it provides separation further up the stack, above the OS layer, such that just the unique requirements of the app/daemon/microservice are hosted in any given container and nothing more, in an effort to be as efficient as possible.

REASONS FOR ADOPTING CONTAINERISATION.

In most information technology departments, there’ll be a team of developers who code and build apps using combinations of their own code, databases, messaging, web servers and programming languages, each of which may have different dependencies in terms of binaries/libraries and versions of those binaries/libraries. This is referred to as the Matrix from Hell, as each developer will be building, knocking down and rebuilding their own development environment, which likely runs on their own laptop. There’ll be a development environment too, the intention of which is to mirror the production environment, although there are no guarantees of that. There may be other Staging or Pre-Production environments too, again with no guaranteed consistency despite everybody’s best efforts. The problems arise when deploying an app from a Development environment to the Production environment, only to find it doesn’t work as intended.

The solution to this problem is to put each component of the application you ultimately want to ship into production/cloud into its own container.

All the components in the application running on a single Linux OS that has Docker installed can be placed in their own container.
Once an application component is contained within its own container, all the dependencies that component has (other linux packages and libraries) will also be contained within the same container. So each component has exclusive access to just the packages and libraries it needs, without the potential to interfere with and break adjacent components on the same underlying host operating system.

In order to isolate the dependencies and libraries in this way, a typical Docker container will have its own Processes, Network interfaces and Mount points, just sharing the same underlying kernel.

Two containers with their own processes, network interfaces and mounts share the kernel of the OS that Docker is running on.

Docker containers were originally built on LXC, a low-level set of tools for the Linux kernel’s container features (such as namespaces and cgroups) that is tricky to set up and maintain. Docker was born to provide higher level tools that make the process of setting up containers easier.

Since Docker containers share only the kernel of the underlying OS and have their own processes, network interfaces and mounts, it is possible to run entirely different Linux distributions inside each container. The only part of the docker host’s OS that a container uses is its kernel!

If you want, entirely different Linux distributions that are able to run on the same kernel can be run in docker containers! However, since the kernel is very small, this is arguably as wasteful as simply using VMware ESXi to host individual VMs each running a different Linux distro.

Since the underlying kernel is the shared component, only OSes that are capable of running on the same kernel can exist on the same docker host. Windows could not run in the scenario above, and would need to be run on a Windows Server based Docker host instead. This is not a restriction for the VMware ESXi bare-metal hypervisor of course, where Windows and Linux can co-exist on the same physical host since their kernels are contained within the VM, along with everything else.

HOW IS DOCKER CONTAINERISATION DONE?

The good news is that containers are nothing new. They’ve been about for over a decade and most software vendors make their operating systems, databases, services and tools available in container format via Docker Hub.

Some of the official container images available on Docker Hub.

Once you identify the images you need and you install docker on your host, bringing up an application stack for the component you want, is as easy as running a docker command.

docker run ansible  #downloads and runs a container running ansible (check Docker Hub for the exact image name)
docker run mongo    #downloads and runs a container running mongodb
docker run redis    #downloads and runs a container running redis
docker run node     #downloads and runs a container running node.js

A docker image can be run multiple times, for example in a scenario where you want multiple instances of node.js (you’d need a load balancer in front of the docker host(s)). In the event of a node.js container going down, a new instance of that container can be launched from the same image; docker will do this automatically if the container was started with a restart policy.

So, the traditional scenario, where a developer puts together a bunch of components and builds an application, then hands it to operations, only for it to not install or run as expected because the platform is different in some way, is eliminated by ensuring all the dependencies of each component are contained in its own container, so it runs the same way wherever it lands. The docker image can subsequently be deployed on any docker host, on prem or in the cloud.

It worked first time!

INSTALLING DOCKER

My everyday desktop machines all run Linux Mint for its ease of use and its propensity to just work when it comes to my various desktop computing requirements. You’d likely not run it on your servers though, instead choosing Debian or Ubuntu (which Mint is actually based on, but not guaranteed to be exactly the same as). Your server Linux distro choice should be based on support, and by that I mean support for any problems that arise and support in terms of software vendors and, in our case, docker image availability.

So, since I’m blogging on this same Mint machine, I’m going to install Docker via the Software Manager for immediacy. I will however cover installation on Ubuntu below.

Quickest and most reliable way of installing a working Docker on Mint, is to use the Software Manager.

The first step is to get docker. Start here.

Take the Docker for Linux option, unless you’re running it on a Windows or Mac Desktop machine

Follow the instructions here to install docker. The steps are also shown below.

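#INSTALLING DOCKER ON UBUNTU
#install the prerequisites needed to add an HTTPS apt repository
sudo apt-get update
sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
#add Docker's GPG key and check its fingerprint (note: apt-key is deprecated on newer releases)
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
#add the stable Docker repository for this Ubuntu release
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
#install Docker, then test it
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
sudo docker run hello-world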
sudo docker run hello-world will download the hello-world image, run it as a container and produce the output shown.
Always precede your ‘docker’ command with sudo. It needs root level privileges to communicate with the docker daemon.
sudo docker run -it ubuntu bash opens a root shell on container ‘ade951cb999’

From the ubuntu bash container, issuing a few Linux commands quickly shows that the container has no reach outside of itself but shares stats returned by the kernel, such as disk space for the /dev/sda2 partition (the docker host’s root partition), CPU, memory and swap utilisation.

The hostname is different, there is only one process running, ‘bash’ (excluding the ps -ef command itself), it can see how much physical disk space is used on the docker host (67% /dev/sda2), it has its own set of directories (/home shows as being empty) and the output from top shows only the 1 running process.

Standard linux commands being run inside the ubuntu bash container
The CPU, Memory and Swap statistics are the same as the Docker host that the container is running on since they share the same kernel.
/etc/issue says Ubuntu despite the Docker host being Linux Mint
The Docker host running Linux Mint.
cat /etc/*release* reveals more information about the operating system running in the container.

To display the version of docker that’s been installed, use sudo docker version

sudo docker version

DOCKER IMAGES AND COMMANDS

Remember, you probably need to precede docker commands with sudo.

Here are some initial commands before I stop to make an important point…

#DOCKER COMMANDS
docker run nginx  #checks for a local copy of the nginx container image, if there isn't one, it'll go out to docker hub
                  # and download it from there.  For each subsequent command, it'll use the local copy.
docker ps         #lists all running containers
docker ps -a      #lists all containers present, running or otherwise
docker start <container-id> or <container-name>   #(re)start a non-running container
docker stop <container-id> or <container-name>    #stop a running container
docker rm <container-id> or <container-name>      #remove container
docker images     #shows all docker images downloaded from docker hub on the local system
docker rmi nginx    #removes the docker image from the system (make sure none are running)
docker pull ubuntu  #pulls ubuntu image to local system but doesn't run it until docker run command is issued
Containers are only running while the command executed inside them is running. Once the process for that command completes, the container stops, thus handing back any and all resources to the docker host. This is an important distinction from VMs, which stay running and consuming system resources irrespective. Note also the final column of the docker ps output, a randomly assigned “name” for the container.

Taking the 1st, 2nd and 3rd columns from the sudo docker ps -a command above for closer inspection, you can see that there is a container ID, the docker image, and the command run within that docker image.

Earlier we executed the command sudo docker run -it ubuntu bash and the docker host checked for a local copy of the ubuntu image, didn’t find one, so downloaded one from docker hub. It then started the container and ran the bash command within it, and thus we were left at a bash prompt on our container running ubuntu. As soon as we typed exit, the bash terminal closed, and since there were no running processes remaining in that container, the container was subsequently shut down.

Another container, docker/whalesay, was also downloaded and ran the command cowsay Hello-World! before exiting and, unlike ubuntu bash, dropped us straight back at our own prompt on the docker host. This is because once the cowsay Hello-World! command had executed, there was no further need for the container, so it was shut down by the docker host.

docker exec mystifying_hofstadter cat /etc/hosts    #execute a command on an existing container
docker start <container-id> or <container-name> #starts an existing non-running container
docker stop <container-id> or <container-name> #stops a running container that's been STARTed

So, docker run <image-name> <command> will get a docker image, start it and execute a command within it, downloading the image from docker hub if required. But once the image is stored locally and that container exists on our docker host system, albeit in an exited state, what if we want to run the command again? Well, docker run <image-name> <command> would create another duplicate container and execute the command in that. We want to execute the same command in the existing container. For that we use the docker start command followed by the docker exec command, and optionally finish up with the docker stop command.
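As a rough illustration of that start, exec, stop sequence, here is a minimal Python sketch that drives the docker CLI through the standard library’s subprocess module. The function name rerun_in_existing and its runner parameter are my own inventions for the example, not part of Docker. It assumes the container’s main process stays running after docker start (as bash does in a container created with -it) and that your user can talk to the docker daemon without sudo.

```python
import subprocess


def rerun_in_existing(container, command, runner=subprocess.run):
    """Start an existing container, run a command in it, then stop it.

    `runner` is a hypothetical hook (defaults to subprocess.run) so the
    sequence can be inspected or dry-run without a docker daemon.
    """
    steps = [
        ["docker", "start", container],           # restart the exited container
        ["docker", "exec", container, *command],  # run the command inside it
        ["docker", "stop", container],            # optional: stop it again
    ]
    for argv in steps:
        runner(argv, check=True)  # check=True raises if a step fails
    return steps
```

Calling rerun_in_existing("mystifying_hofstadter", ["cat", "/etc/hosts"]) would issue the same three commands you would otherwise type by hand.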

Before using docker exec to execute a command on an existing container, you’ll need to docker start it first.

DETACH and ATTACH (background and foreground)

If you’re running a container that produces some kind of output to the foreground but you want to run it and return to the prompt instead, like backgrounding a linux command, you can docker run -d <image> to run it in detached mode. To bring that running container to the foreground, you can docker attach <container>, where <container> is the id or name of the container.

docker run -d kodekloud/simple-webapp   #runs a container in the background if it would otherwise steal foreground
docker attach <container-id>            #bring detached container (running in the background) to the foreground

If you have a docker image like redis that, when run with docker run, will stay running in the foreground, there is no way to background it (detach it) without hitting CTRL-C to kill the process, then re-running it with docker run -d so that it runs in detached mode. However, if you run your containers with docker run -it then you can use the key sequence CTRL-P CTRL-Q to detach without killing the container first. Reattach using the docker attach <container> command. According to the docker run --help page, -i runs in interactive mode (keeping stdin open) and -t allocates a pseudo tty (terminal) to the running container.

DOCKER COMMANDS AND HELP SYSTEM

Docker has a very nicely thought out help system. Simply type docker and all the management commands and docker commands are listed along with descriptions. Type docker <command> --help and you’ll see more information on that particular command.

docker commands. Use docker <command> --help to dig deeper.

RUN TAG

If we run a container e.g. redis with docker run redis, we can see in the output that the version of redis is Redis version=5.0.8

The version TAG is Redis version=5.0.8 in our redis container.

If we wanted to run a different version of redis, say version 4.0, then we can do so by specifying the TAG, separated by a colon e.g. docker run redis:4.0

Run a different version of the redis container by specifying the TAG in the docker run redis:4.0 command

In fact, if you specify no TAG, then what you’re actually doing is specifying the :latest tag, which is the default if no tag is specified. To see all the Tags supported by the container, go to docker hub, search for the container and you’ll see the View Available Tags link underneath the command.
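To make the “no tag means :latest” rule concrete, here is a small Python sketch that splits an image reference into its name and tag. The function name split_image_reference is hypothetical and this is only a simplified model of how a reference reads; it does not handle digest references (the @sha256:... form).

```python
from typing import Tuple


def split_image_reference(reference: str) -> Tuple[str, str]:
    """Split 'name[:tag]' into (name, tag), defaulting the tag to 'latest'."""
    # Only treat a colon as the tag separator if it comes after the last '/',
    # so a registry port such as 'localhost:5000/redis' is not mistaken for a tag.
    last_slash = reference.rfind("/")
    colon = reference.rfind(":")
    if colon > last_slash:
        return reference[:colon], reference[colon + 1:]
    return reference, "latest"


print(split_image_reference("redis"))      # ('redis', 'latest')
print(split_image_reference("redis:4.0"))  # ('redis', '4.0')
```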

RUN -STDIN

As already mentioned above, you may have a simple shell script that prompts for user input, then produces an output, e.g.

#!/bin/bash
echo "What is your name?"
read varname
echo "Hello $varname. It's nice to meet you."
exit
hello.sh needs to prompt the user for input before producing an output.

If this simple program were containerised with docker, when run, it needs to prompt the user for input before it can produce an output. So, the command needed to run this container would be docker run -i -t <image>. The -i runs the container in interactive mode, keeping stdin open so you can type your input, and the -t allocates a pseudo terminal so the prompt and output appear as they would in a normal terminal session.

PORT MAPPING

Before talking about port mapping, I’ll first cover how to see the internal ip address assigned to the container and the port the container is listening on. The output of docker ps will display a PORTS column, showing what port the container is listening on, then use docker inspect <container-name> to see the IP Address.

Display the port using docker ps and use docker inspect to display the internal IP address.

The internal IP address is not visible outside of the docker host. In order for users to connect to the port on the container, they must first connect to a port on the docker host itself, that is then mapped to the port on the container i.e.

Here we see port 80 on the docker host is mapped to port 5000 on the container running a web app.

To map a local port on the docker host to a listening port on the container, we use the docker run -p 80:5000 <image-name> command. The -p stands for publish and creates a firewall rule allowing the flow of traffic through the specified ports. By default a container doesn’t publish any of its ports.
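As a toy model of what -p records, the Python sketch below keeps a table of host ports and the container port each one forwards to. The publish function and the container names web1, web2 and web3 are made up for illustration, and only the simple host:container form is handled (the real option accepts more forms). It also shows why two containers listening on the same internal port need different host ports.

```python
def publish(port_table, spec, container):
    """Record a '-p host:container' style mapping in a toy port table."""
    host_port, container_port = (int(part) for part in spec.split(":"))
    if host_port in port_table:
        # A host port can only be handed to one container at a time.
        raise ValueError(f"host port {host_port} is already allocated")
    port_table[host_port] = (container, container_port)
    return port_table


table = {}
publish(table, "80:5000", "web1")
publish(table, "8000:5000", "web2")  # same container port, different host port
print(table[80])    # ('web1', 5000)
print(table[8000])  # ('web2', 5000)
```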

Users can connect to the IP and Port on the Docker host, and be tunnelled through to the container.

VOLUME MAPPING AND PERSISTENT DATA

If you’re running a container that stores data, any changes that occur are written inside that container, e.g. a mysql container will store its tablespace files in /var/lib/mysql inside the container. If that container is removed, the data goes with it.

A MySQL database will write data to its internal file system. But how does that work?

docker run -v /opt/datadir:/var/lib/mysql mysql mounts the directory /opt/datadir on the docker host into /var/lib/mysql on the mysql container. Any data that is subsequently written to /var/lib/mysql by the mysql container will land in /opt/datadir on the docker host, and as such will be retained in the event that the mysql container is destroyed by docker rm <container-name>.

CONTAINER INFORMATION

As already mentioned, the docker inspect command returns many details about the container in JSON format: ID, Name, Path, Status, IP Address and many other details.

LOGS AND STDOUT

So, you can run the docker run -it redis command and see the standard output, but if you have an existing container that you start with docker start <container-name> and then attach to using docker attach <container-name>, you won’t see the output the container has already produced. This is because attaching only shows output written from that point onwards; anything printed earlier, such as startup messages, is not replayed. In order to view the stdout of the container, use the docker logs <container-name> command and you’ll see the same output that you would if you used the docker run -it redis command. Bear in mind that using docker run redis would create a new container from the redis image, not start an existing redis container.

Starting and attaching to a container that produces stdout will not display the stdout
Using the docker logs <container-name> command to view the stdout on that container.

ENVIRONMENT VARIABLES

Consider the following python code web-page.py to create a web server that serves a web page with a defined background colour and a message. If the python program has been packed up into a docker image called my-web-page, then you’d run it using the docker run my-web-page command, connect to it from the web browser on the docker host on port 8080 to view the page.

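import os
from flask import Flask, render_template

app = Flask(__name__)

# background colour used by the hello.html template
color = "red"

@app.route("/")
def main():
    print(color)
    return render_template('hello.html', color=color)

if __name__ == "__main__":
    # listen on all interfaces so the page is reachable from outside the container
    app.run(host="0.0.0.0", port=8080)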

The python program has the variable color = "red" hard-coded within it, but you want to be able to pass in values for that variable from the docker host when you run the container. To do this, you can move the value outside of the python code by replacing the line color = "red" with color = os.environ.get('APP_COLOR')

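import os
from flask import Flask, render_template

app = Flask(__name__)

# background colour is now read from the APP_COLOR environment variable
color = os.environ.get('APP_COLOR')

@app.route("/")
def main():
    print(color)
    return render_template('hello.html', color=color)

if __name__ == "__main__":
    # listen on all interfaces so the page is reachable from outside the container
    app.run(host="0.0.0.0", port=8080)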

On the docker host, you can create and export a variable, export APP_COLOR=blue; python web-page.py, then refresh the page and the colour will change since its value is being read from an external variable on the docker host.
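One thing worth knowing is that os.environ.get returns None when the variable has not been set. A minimal sketch of one way to keep the original red as a fallback is shown below; the get_color helper is my own name for the example, not something from Flask or Docker.

```python
import os


def get_color(environ=os.environ, default="red"):
    """Return APP_COLOR from the environment, or the default if it is unset."""
    return environ.get("APP_COLOR", default)


print(get_color({"APP_COLOR": "blue"}))  # blue
print(get_color({}))                     # red
```

Passing a plain dictionary in place of os.environ, as above, is just a convenient way to try the function without exporting anything.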

To run a docker image and pass in a variable, you can use the command

docker run -e APP_COLOR=orange my-web-page

to pass the variable APP_COLOR=orange into the container created from the image my-web-page before the container is started.

To find the environment variables set on a container, you can use the docker inspect <container-name> command. In the JSON output, under the “Config” section, in the “Env” list, you’ll see the “APP_COLOR=blue” variable, along with some other variables too.
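If you would rather not scroll through the JSON by eye, a few lines of Python can pull the variables out. The sketch below works on a hand-written, heavily trimmed sample in the shape of docker inspect output (a list with one object per container); the sample values and the env_from_inspect name are illustrative only.

```python
import json

# Hand-written, trimmed sample; real docker inspect output has many more fields.
SAMPLE = """
[{"Name": "/my-web-page",
  "Config": {"Env": ["APP_COLOR=blue", "PATH=/usr/local/bin:/usr/bin"]}}]
"""


def env_from_inspect(inspect_json):
    """Turn the Config.Env list of 'KEY=value' strings into a dictionary."""
    container = json.loads(inspect_json)[0]
    env_list = container["Config"]["Env"]
    # Split on the first '=' only, since values may themselves contain '='.
    return dict(item.split("=", 1) for item in env_list)


print(env_from_inspect(SAMPLE)["APP_COLOR"])  # blue
```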

docker inspect will show the variables passed in from the docker host

CREATING A DOCKER IMAGE

So, you now have a good idea of how to interact with docker images and docker containers running on your Linux system. We’ve even seen some code that can be containerised, but we’ve not elaborated on how you get from a python program or shell script to a docker container image. Let’s cover that important topic next.

Firstly, let’s ask “Why would you want to dockerize a program?”. There are two answers to that question. The first is that you cannot find what you want on docker hub already, so you want to make it yourself. The second is that you have a program on your development laptop/environment and want to containerise it for ease of shipping to operations teams or docker hub and deployment on production infrastructure.

So, take the above example of a web server and web page python script called web-page.py that uses the python flask module. If I were to build a system to serve that page, I’d follow these steps.

  1. Install the Ubuntu OS
  2. Perform a full update of all packages using apt-get update && apt-get dist-upgrade
  3. Install python3.x and any dependencies using apt-get install python3
  4. Install python3.x script module dependencies using the python pip package manager
  5. Create/Copy the python source code into /opt directory
  6. Run the web server using the flask command.

DOCKER FILE

A Dockerfile is basically the sequence of commands required to perform the steps listed above. It is written in an INSTRUCTION, Argument format. Everything on the left in CAPS is an Instruction, and everything that follows it is an argument.

It looks like this…

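#Dockerfile for cyberfella/my-web-page
#note: newer Ubuntu base images may refuse system-wide pip installs; pinning an older tag in FROM is one way round it
FROM ubuntu

RUN apt-get update
RUN apt-get install -y python3 python3-pip

RUN pip3 install flask
RUN pip3 install flask-mysql

COPY . /opt/source-code

ENTRYPOINT FLASK_APP=/opt/source-code/web-page.py flask run --host=0.0.0.0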

Docker has a “layered architecture”. This means that when you run docker build (shown below), it builds the docker image as a stack of layers, one for each line in the Dockerfile, with each layer containing only the parts of the application added by that line. Think of the Dockerfile as the recipe, the docker build command as the chef and the docker image as the dish (and I suppose docker hub as the restaurant if we take this food analogy all the way to its logical conclusion!).

The first line contains a FROM instruction. This specifies the base OS or another docker image.

To build the docker image, run docker build . -t cyberfella/my-web-page from the directory that contains the Dockerfile. The argument to docker build is the directory to build from, not the Dockerfile itself. This will create a local docker image called cyberfella/my-web-page.

Note that if your Dockerfile lives in a directory called MyApp, then you need to specify that directory in the docker build command, e.g. sudo docker build MyApp -t cyberfella/my-web-page

docker build adds each layer based on the actions of each instruction in the Dockerfile, to create the Docker image. Each layer only stores the changes from the previous layer, which is reflected in the size of the image file.
docker build output clearly shows the activity based off each line in the Dockerfile

If the docker build process fails during the processing of one of the layers, then once you have fixed the issue, the build process will start again at the failed layer, since all previous layers are stored in the docker host’s cache. This means docker is very quick to rebuild an image, compared to the first time the image is built.

Re-running docker build will use the contents of the cache for layers that were previously successfully built. This makes the docker build process faster over time as small changes are all that are implemented with each subsequent run of docker build.
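The caching behaviour can be pictured with a toy model in Python: compare the instructions from the previous build with the current ones, and everything before the first difference can come from the cache. This is a deliberate simplification (the first_rebuilt_step name is mine, and the real build cache also takes into account the contents of files brought in by COPY), but it captures why a change near the bottom of a Dockerfile rebuilds quickly.

```python
def first_rebuilt_step(previous, current):
    """Index of the first instruction that cannot be served from the cache."""
    for index, (old, new) in enumerate(zip(previous, current)):
        if old != new:
            return index
    # No differences in the common prefix: only extra trailing steps are new.
    return min(len(previous), len(current))


previous = ["FROM ubuntu", "RUN apt-get update", "RUN pip install flask", "COPY . /opt/source-code"]
current = ["FROM ubuntu", "RUN apt-get update", "RUN pip install flask flask-mysql", "COPY . /opt/source-code"]

step = first_rebuilt_step(previous, current)
print(current[:step])  # served from the cache
print(current[step:])  # rebuilt, because everything after a changed layer depends on it
```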

To push the image to Docker Hub, use the command docker push cyberfella/my-web-page

If docker push fails, you’ve probably not used docker login to log in to your repository on docker hub!

Note that cyberfella is my Docker Hub login name. You will need to register on Docker Hub first so that there’s a repository to push to and you’ll need to log into your repository from the docker host using the docker login command. You can also link your Docker Hub repository with your GitHub repository and automate new Docker image builds when updated code is pushed to GitHub!

You can see the size of each layer in the image by using the command docker history <image-name>

docker history <image-name> displays the amount of data being added to the image in each layer of the docker build process

COMMANDS ARGUMENTS AND ENTRYPOINTS

If you recall, the last Instruction line in a Dockerfile is typically CMD (or ENTRYPOINT, as in the Dockerfile above), and the Argument that follows it is the name of the command or daemon that you want your container to execute. These arguments can take the form of the command you’d ordinarily type in a Shell, or can be specified in JSON format.

The sleep 5 command can be specified as sleep 5, or as ["sleep", "5"] in JSON format, whereby the first element is the executable and the second element the argument passed into the executable.
Building a Dockerfile that consists of two Instruction and Argument lines, FROM ubuntu and CMD ["sleep", "5"], took around a second. This is because the cache already contained the build artifacts from a prior successful FROM ubuntu instruction.

When we run the cyberfella/ubuntusleeper container, it will sleep for 5 seconds and then exit, just as if we ran the command sleep 5 from our Ubuntu terminal. Remember, if we wanted to run our ubuntusleeper container for 10 seconds, we don’t need to rebuild it, we can optionally pass in the amount of time we want as a parameter, e.g. sudo docker run cyberfella/ubuntusleeper sleep 10 passing in the executable and argument as parameters that will override the default parameters set in the Dockerfile when the image was built.

This still looks a little untidy. The image name ubuntusleeper implies that the container will sleep, so we don’t want to have to pass in sleep 10 as parameters going forward. We’d prefer to just have to enter sudo docker run cyberfella/ubuntusleeper 10. This is where the ENTRYPOINT Instruction comes in. If we add the Instruction ENTRYPOINT ["sleep"] to our Dockerfile, then any argument passed in will be passed to that executable by default.

After appending ENTRYPOINT ["sleep"], the re-build of the docker image and pushing it to docker hub takes mere seconds.

This works great until the image is run without any parameters, at which point it’ll error with a missing operand error. To overcome this, edit the Dockerfile to contain the ENTRYPOINT instruction first and the CMD instruction after, but remove the sleep command from the CMD instruction since it will default to using it from the ENTRYPOINT instruction.

FROM ubuntu, ENTRYPOINT ["sleep"] and CMD ["5"] are all you need in your Dockerfile in order to pass in no parameters or just the parameter for the sleep duration in seconds. This only works when ENTRYPOINT and CMD are written in JSON format.
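The way ENTRYPOINT, CMD and the arguments on the docker run command line combine can be summed up in a couple of lines of Python. This is only a model of the JSON-format behaviour described above, and effective_command is a name I’ve made up for it: the entrypoint always comes first, followed by the run-time arguments if any were given, otherwise the CMD defaults.

```python
def effective_command(entrypoint, cmd, run_args):
    """Model the command a container ends up executing (JSON format only)."""
    # Arguments given on the docker run command line replace CMD entirely.
    return list(entrypoint) + list(run_args if run_args else cmd)


print(effective_command(["sleep"], ["5"], []))              # ['sleep', '5']
print(effective_command(["sleep"], ["5"], ["10"]))          # ['sleep', '10']
print(effective_command([], ["sleep", "5"], ["sleep", "10"]))  # ['sleep', '10']
print(effective_command(["sleep"], [], []))                 # ['sleep'], hence the missing operand error
```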

Lastly, if you wanted to override the ENTRYPOINT when running the docker image and replace sleep with, say, sleep2.0, then this can be done by specifying the new entrypoint on the docker run command, e.g. sudo docker run --entrypoint sleep2.0 cyberfella/ubuntusleeper 5

DOCKER NETWORKING

When you install Docker, it creates three networks: bridge, none and host.

Bridge is the default network a container gets attached to. To specify a different network for the container to get attached to, use the --network= parameter in the docker run command, e.g. sudo docker run --network=host cyberfella/ubuntusleeper

The bridge network is the default network that containers get attached to on the docker host. They can all talk to one another, but they cannot be reached from the outside world unless their ports are published (see PORT MAPPING above in this post for how to map ports on the docker host to ports on the containers).
The host network removes any network isolation from the docker host. This means that the IP address is the same as the docker host. This also means that you can’t host similar containers serving over the same port and that you cannot ship containers to other docker hosts so easily, without causing some disruption.

The none network, simply put, means the container is not connected to any network.

USER DEFINED NETWORKS

It is possible to create more than the default bridge network, so that groups of containers can talk to one another, but you can introduce some isolation between different groups of containers on the same docker host.

User defined networks allow only certain containers to be able to communicate with one another on a given docker host.

The command to set up this extra bridge network on the docker host, would be…

sudo docker network create --driver bridge --subnet 182.18.0.0/16 custom-isolated-network

To list all networks on the docker host, use the docker network ls command.

Remember the docker inspect command can be used to view IP addresses assigned to docker containers? To view the network settings for a specific container, use the command sudo docker inspect blissful_wozniak

Note that a docker container is only connected to the network and has an IP address when it is running. There is no IP address information for non-running containers in an exited state.
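To pick the IP address out of the docker inspect output programmatically, something like the Python sketch below would do. The two samples are hand-written and heavily trimmed to the shape of the network section of docker inspect output; I’m assuming an exited container reports an empty address string, and container_ips is a hypothetical helper name.

```python
import json

# Hand-written, trimmed samples; real docker inspect output has many more fields.
RUNNING = '[{"NetworkSettings": {"Networks": {"bridge": {"IPAddress": "172.17.0.2"}}}}]'
EXITED = '[{"NetworkSettings": {"Networks": {"bridge": {"IPAddress": ""}}}}]'


def container_ips(inspect_json):
    """Map each network the container is attached to onto its IP address."""
    container = json.loads(inspect_json)[0]
    networks = container["NetworkSettings"]["Networks"]
    return {name: settings["IPAddress"] for name, settings in networks.items()}


print(container_ips(RUNNING))  # {'bridge': '172.17.0.2'}
print(container_ips(EXITED))   # {'bridge': ''}
```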

On the left, the network information for a non-running container, compared to a running container on the right.
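As a minimal sketch of the above, the --format option of docker inspect can pull out just the IP address rather than the whole JSON document. The helper name container_ip is my own invention, and the example assumes your user can run docker without sudo. For a non-running container it simply prints nothing.

```shell
# Print the IP address of a container on each network it is attached to.
# container_ip is a hypothetical helper name, not a docker command.
container_ip() {
  docker inspect \
    --format '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' \
    "$1"
}

# Usage (assumes a container called blissful_wozniak exists):
#   container_ip blissful_wozniak
```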

EMBEDDED DNS

Containers on the same user defined network can contact one another using their names. For example, if you have a MySQL Server and a Web Server that need to communicate with one another, it is not ideal to configure IP addresses in a mysql.connect(), since there is no guarantee that a container will get the same IP address from the docker host each time it is started.

The docker host creates a separate namespace for each container and uses virtual ethernet pairs to connect the namespaces together.

Containers should be configured to communicate using their names.
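To make that concrete, here is a rough sketch that puts a database and a web container on the same user defined network. The network name app-net and the container names db and web are just examples, cyberfella/webapp is the image used elsewhere in this post, and MYSQL_ROOT_PASSWORD is the variable the official mysql image reads for the root password.

```shell
# Create a user defined bridge network and start two containers on it.
# app-net, db and web are example names chosen for this sketch.
start_named_pair() {
  docker network create --driver bridge app-net &&
  docker run -d --name db --network app-net \
    -e MYSQL_ROOT_PASSWORD=example mysql &&
  docker run -d --name web --network app-net cyberfella/webapp
}

# Once both are running, the web container can reach the database
# using the hostname "db" instead of an IP address.
```

The point of the design is that the application config only ever contains the name db, so it keeps working when the container comes back with a different IP address.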

DOCKER STORAGE

So, where does the docker host store all its files? In /var/lib/docker

The docker host’s filesystem and containers directories.
Recall how the build process is a layered architecture, where each build stores only the changes made within each layer.

If we turn the build process upside down so we’re starting at the bottom layer with a base image from docker hub, then working our way up to a newly built container, we will call these layers the Image Layers. When we run that container with the docker run command, docker host creates a new Container Layer on top where it stores the changes occurring on that running container. This container layer exists for the life of the container: it survives a stop and a start, but is destroyed once the container is removed. There is no persistent storage by default.

COPY ON WRITE MECHANISM

The difference between our Image Layers and Container Layers on a running container, is that the Image Layers are Read Only or “Baked in”. Only the Container Layer is Read/Write. If we attempt to modify the code of the script specified in our Entrypoint, then we are not modifying the code in the Image Layer. We are taking a copy of that code from the Image Layer/Entrypoint and modifying that copy.

Remember, when the container is removed, its container layer is destroyed and there is no persistent storage by default. Running a container and modifying its entrypoint code will not change the underlying container image contents.

This is known as the Copy on Write mechanism, i.e. at the point that you write a change, a copy is made and the changes are written to that copy.

At the point at which the container is removed, the container layer and its data are destroyed. The image layers are untouched, so the next container run from the built image starts again from the immutable image layers only.

If we want a container to write data that is persistent, i.e. that survives after the container is destroyed, then we need to create a volume outside of the container and mount it inside the container.

VOLUME MOUNTING

Create a volume called data_volume using the docker volume create data_volume command on the docker host that will run your container.

A folder called data_volume will then be created in the /var/lib/docker/volumes/ directory on the docker host.

To mount the volume in your container, run the container with the command docker run -v data_volume:/var/lib/mysql mysql (where mysql is the image name in this example and /var/lib/mysql is the directory on the container that you wish to be the mount point for the data_volume volume on the docker host).

If you run a container and specify a non-existent volume, e.g. docker run -v data_volume2:/var/lib/data ubuntu then rather than error, it will actually create a volume called data_volume2 and mount it into the specified directory, in our case /var/lib/data on a container running using the ubuntu image.
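A quick way to convince yourself that a volume outlives the containers that use it is sketched below. One throwaway container writes a file into the volume, and a second, entirely separate container reads it back. The function name, the /data mount point and the file name note.txt are arbitrary choices for this example.

```shell
# Write a file from one container and read it back from another.
# --rm removes each container as soon as it exits, so only the
# volume can be carrying the data between the two runs.
volume_roundtrip() {
  docker volume create data_volume &&
  docker run --rm -v data_volume:/data ubuntu \
    sh -c 'echo "written by the first container" > /data/note.txt' &&
  docker run --rm -v data_volume:/data ubuntu cat /data/note.txt
}

# Usage:
#   volume_roundtrip
```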

MOUNTING EXTERNAL STORAGE (BIND MOUNTING)

If the docker host has mounted some external storage into, let’s say, /data/mysql on the docker host, and we wish to mount that on our running container, then we use the docker run -v /data/mysql:/var/lib/mysql mysql command to accomplish this, specifying the mount point on the docker host rather than the volume name, followed by the mount point on the container and the image name just as before.

The -v switch is the old method. The newer standard is to use --mount instead. An example of bind mounting the newer --mount way is shown below.

docker run --mount type=bind,source=/var/san-mysqldata-vol01,target=/var/lib/mysql mysql
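The same --mount syntax also covers named volumes by using type=volume in place of type=bind. The sketch below assumes the data_volume volume from earlier. The function name is made up, and MYSQL_ROOT_PASSWORD is the variable the official mysql image reads for the root password.

```shell
# Start a mysql container with a named volume attached via --mount.
# $1 is the name to give the container.
run_mysql_with_volume() {
  docker run -d --name "$1" \
    --mount type=volume,source=data_volume,target=/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD=example mysql
}

# Usage:
#   run_mysql_with_volume mydb
```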

The docker storage drivers implement this layered architecture and the copy on write mechanism on the docker host. Commonly supported storage drivers are aufs, zfs, btrfs, device mapper, overlay and overlay2.

DOCKER COMPOSE

We’ve covered how to run docker containers using the docker run command, and how we can run multiple containers built from different images, but if we want to run an application built from multiple components/containers, a better way to do that is to use “docker compose”, specifying the requirements in a YAML file.

A docker-compose.yml file for your app stack might look like this,

#docker-compose.yml
services:
  web:
    image: "cyberfella/webapp"
  database:
    image: "mongodb"
  messaging:
    image: "redis:alpine"
  orchestration:
    image: "ansible"

The stack is then brought up with the command docker compose up
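For day to day use, a handful of compose subcommands cover most of the lifecycle of a stack. The wrappers below are only a sketch with made-up function names. They assume the docker compose plugin is installed (older installs use the separate docker-compose binary instead) and that you run them from the directory containing docker-compose.yml.

```shell
# Start the whole stack in the background.
stack_up()     { docker compose up -d; }

# List the containers that belong to this stack.
stack_status() { docker compose ps; }

# Show logs for the whole stack, or for the named services only.
stack_logs()   { docker compose logs "$@"; }

# Stop and remove the containers and networks of the stack.
stack_down()   { docker compose down; }

# Usage:
#   stack_up && stack_status
#   stack_logs web
#   stack_down
```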

Let’s look at a more comprehensive example, often used to demonstrate docker. Consider the following voting application architecture.

Imagine if we were going to run this manually, using docker run commands and the images were already built/available on docker hub, we’d start each container like this…

docker run -d --name=vote -p 5000:80 voting-app
docker run -d --name=redis redis
docker run -d --name=worker worker
docker run -d --name=db postgres:9.4
docker run -d --name=result -p 5001:80 result-app

But our application does not work because despite running all these required detached containers and port-forwards for our app, they are not linked together. In fact, on our docker host, we may have multiple instances of redis containers running. The console will display errors akin to “waiting for host redis” or “waiting for db” etc.

So how do we associate the containers for our app with one another, amongst other containers potentially running on our docker host?

Going back to our architecture, the python voting app is dependent on the redis service, but the voting app container cannot resolve a container with the name redis, so in our command to start the voting app container, we’d use a --link option to make the voting app container aware of the container named redis with the hostname of redis. This is why we used the --name=redis redis option in our docker run commands above.

# Use --link <container-name>:<host-name> option
  docker run -d --name=vote -p 5000:80 --link redis:redis voting-app 

This actually creates an entry in the hosts file on the voting app container for the redis host. So what other links would we need here?

Well our result-app container needs to know of the db container in order to display the results…

docker run -d --name=result -p 5001:80 --link db:db result-app

Our .NET worker container needs to know of the db and redis containers…

docker run -d --name=worker --link db:db --link redis:redis worker

Our redis and db containers don’t need to make contact with anything since all communications to these are inbound in our app.

# redis and db are started first, since --link needs the target container to exist
docker run -d --name=redis redis
docker run -d --name=db postgres:9.4
docker run -d --name=vote -p 5000:80 --link redis:redis voting-app # --link <container-name>:<host-name>
docker run -d --name=result -p 5001:80 --link db:db result-app
docker run -d --name=worker --link db:db --link redis:redis worker

With our docker run commands ready, we can now think about putting together our docker-compose.yml file for our app.

#DOCKER COMPOSE EQUIVALENT TO DOCKER RUN COMMANDS
redis:
  image:  redis
db:
  image: postgres:9.4
vote:
  image: voting-app
  ports:
    - 5000:80
  links:
    - redis
result:
  image: result-app
  ports:
    - 5001:80
  links:
    - db
worker:
  image: worker
  links:
    - redis
    - db    

Underneath each section for each of our containers, we’ve specified the image we wish to use. Two of our containers are just images available from docker hub but the other three are using our own in-house application code. Instead of referencing the built image of these in-house containers, the code can be referenced in our docker-compose.yml file instead. The docker host will build our container using the Dockerfile and surrounding code in the directory that contains all the elements required to build our container image using the lines build: ./vote and build: ./result and build: ./worker instead of image: voting-app etc. i.e.

#DOCKER COMPOSE EQUIVALENT
redis:
  image:  redis
db:
  image: postgres:9.4
vote:
  build: ./vote
  ports:
    - 5000:80
  links:
    - redis
result:
  build: ./result
  ports:
    - 5001:80
  links:
    - db
worker:
  build: ./worker
  links:
    - redis
    - db

There are currently 3 different formats of docker-compose files, since docker compose has evolved over time. Let’s look at the differences between v1.0 and v2.0

docker-compose version 1 connects all containers to the default bridged network, whereas docker-compose version 2 creates a new, separate dedicated bridge network on the docker host, and connects each container to that – presumably for better isolation and better security.

version: "2" must be specified at the top of the docker-compose.yml file (quoted, since docker compose expects the version to be a string) and all version 1 elements move under a section called services: Because of this isolated bridge network, containers can reach one another by service name automatically, so the links: lines are also no longer required.

Version 2 also supports a depends_on: feature, to control the start up order of containers.

#DOCKER COMPOSE V2 EQUIVALENT TO DOCKER RUN COMMANDS
version: "2"
services:
  redis:  
    image: redis
  db:
    image: postgres:9.4
  vote:
    image: voting-app
    ports:
      - 5000:80
    depends_on:
      - redis
  result:
    build: ./result
    ports:
      - 5001:80
  worker:
    build: ./worker

Version 3 looks like Version 2 but has version: "3" at the top (obviously). It is also used by Docker Swarm and Docker Stacks, which we will come on to later.

Going back to Docker Compose Networks, what if we wanted to separate our traffic on the front end App and Results servers from our back end Redis, Worker and DB?

We’d need to create a front end network for the voting and results containers to connect to, and connect all components to a separate back end network.

Our version 2 docker-compose.yml file would need to have a map of networks added, i.e.

#DOCKER COMPOSE V2 FRONT-END AND BACK-END NETWORKS
version: "2"
services:
  redis:
    image: redis
    
  db:
    image: postgres:9.4
    
  vote:
    image: voting-app
    
  result:
    image: result
    
networks:
  front-end:
  back-end:

then each of our containers needs a networks: section that defines which networks to connect to, e.g.

#DOCKER COMPOSE V2 FRONT-END AND BACK-END NETWORKS
version: "2"
services:
  redis:
    image: redis
    networks:
      - back-end

  db:
    image: postgres:9.4
    networks:
      - back-end

  vote:
    image: voting-app
    networks:
      - front-end
      - back-end

  result:
    image: result
    networks:
      - front-end
      - back-end

networks:
  front-end:
  back-end:
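For comparison, here is roughly what the compose file above saves you from doing by hand, shown for just the redis and vote containers. This is a sketch rather than what docker compose literally runs (compose also prefixes the network names with the project name). The function name is made up. Since docker run attaches a container to a single network here, the second network is added afterwards with docker network connect.

```shell
# Create the two networks, then attach redis to the back end only
# and vote to both the front end and the back end.
split_networks() {
  docker network create front-end &&
  docker network create back-end &&
  docker run -d --name=redis --network back-end redis &&
  docker run -d --name=vote --network front-end -p 5000:80 voting-app &&
  docker network connect back-end vote
}

# Usage:
#   split_networks
#   docker network inspect back-end   # lists redis and vote as members
```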

DOCKER REGISTRY

PUBLIC REGISTRY

The docker registry is the place where docker images come from. We run a docker container with a command like docker run nginx and then magically, if the image doesn’t already exist locally, docker host goes and downloads the latest nginx image from docker hub. A full image name has the form registry/account/repository:tag. The account is the docker hub user or organisation, the repository is where the image lives, and the tag denotes the version of the image. Each missing part falls back to a default: if no registry is specified, docker assumes docker.io (docker hub); if no account is specified, it assumes library, the account that holds the official images; and if no tag is specified, it assumes :latest. So docker run nginx actually pulls docker.io/library/nginx:latest, whereas an image from a user account, such as cyberfella/webapp, resolves to docker.io/cyberfella/webapp:latest

There are other registries too where docker images can be found, such as Google’s Kubernetes registry, e.g. gcr.io/kubernetes-e2e-test-images/dnsutils for performing end-to-end tests on the cluster. These are publicly accessible images that anybody can download.

PRIVATE REGISTRY

If you have images that should not be made available publicly, you can create a private registry in house. Alternatively AWS, Azure or GCP provide a private registry when you create an account on their cloud platform.

To obtain a container image from a private registry, you must first log into the private registry with your credentials using the docker login command

docker login private-registry-name.io 

You need to log into a registry before you can push or pull an image.

When you use a Cloud account provider like AWS or Azure, a private registry is created when you open an account. If you wanted to deploy an on-premise private docker registry however, you could do so using the command docker run -d -p 5000:5000 --name registry registry:2

To create a private registry, tag an image and push it there, and then pull it back from the local or a network private registry, use the commands shown below.

docker login private-registry.io                    #logs in to private image registry (prompts for creds)
docker run -d -p 5000:5000 --name registry registry:2   #creates on-premise private docker registry
docker image tag my-image localhost:5000/my-image   #Tag the container image prior to pushing to local private repo
docker push localhost:5000/my-image                 #push image to local private repo
docker pull localhost:5000/my-image                 #pull image from local private repo
docker pull 192.168.56.100:5000/my-image            #pull from private repo on network

DOCKER ENGINE

When you install docker engine on a host, you’re actually installing three components, the Docker Command Line Interface, the REST Application Programming Interface and the Docker Daemon.

An important point is that the Docker CLI need not reside on the docker host in order to communicate with the Docker Daemon. If using a docker CLI on a laptop to talk to a remote Docker Host for example, use the following command…

docker -H=remote-docker-engine:2375                 #first part of a command to execute on a remote docker host
docker -H=10.123.2.1:2375 run nginx                 #runs nginx on a remote docker host
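If you talk to the same remote docker host a lot, a small wrapper saves retyping the -H option. This is only a sketch: remote_docker is a made-up name, it assumes the daemon is listening on TCP port 2375 as in the commands above, and that port is conventionally the unencrypted one, so keep it to a trusted network.

```shell
# Run any docker command against a remote docker host.
# $1 is the host name or IP, the rest is the docker command to run.
remote_docker() {
  host="$1"
  shift
  docker -H "tcp://${host}:2375" "$@"
}

# Usage:
#   remote_docker 10.123.2.1 run nginx
#   remote_docker 10.123.2.1 ps
```

Setting the DOCKER_HOST environment variable, e.g. export DOCKER_HOST=tcp://10.123.2.1:2375, has the same effect for every docker command in that shell session.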

CONTAINERIZATION

How does Docker work under the hood?

Docker utilises namespaces to isolate workspaces. Process ID, Network, Unix Timesharing (UTS), InterProcess Communication and Mount are each created in their own namespace, thereby providing isolation between containers.

NAMESPACE PID

When a linux system boots up, the root process PID 1 is the first process to start, followed by other services with incremental, unique process ID’s. Because of this, each container must think it is a unique system, originating from a root process with a PID of 1.

Each container thinks that it is its own system originating from a root PID of 1, just like any other Linux system.

Since the processes are all running on the same host system and Process ID’s must be unique, we cannot have more than one process with an ID of 1, 2, 3 etc. This is where process namespaces come into play. If we were to look, we’d see the same process running on both the container and the docker host, but with different PID’s.

Notice how there is one container running a /bin/bash process. On the docker host, only one of the /bin/bash processes is running as the root user. When I stop the container, that is the process that dies.
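You can see both sides of the PID namespace for yourself with something like the sketch below. The function names are made up, and container_ps assumes the image includes a ps binary, which not every minimal image does.

```shell
# PID of the container's main process as the docker host sees it.
host_pid() {
  docker inspect --format '{{.State.Pid}}' "$1"
}

# Process list as seen from inside the container's own PID namespace,
# where the main process appears as PID 1.
container_ps() {
  docker exec "$1" ps -ef
}

# Usage (assumes a running container called blissful_wozniak):
#   host_pid blissful_wozniak
#   container_ps blissful_wozniak
```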

By default, there is no restriction on how many system resources a container can use.

There is a way to restrict how much CPU and Memory a container can use. Docker uses cgroups to do this. To limit the amount of CPU and/or Memory a container uses, use the following commands in docker run.

docker run --cpus=.5 ubuntu                         #ensures the ubuntu container doesn't use more than 50% of one CPU on the docker host
docker run --memory=100m ubuntu                     #limits the amount of ram to 100M
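Both limits can be combined in one docker run, and docker inspect will confirm what was actually applied. In this sketch the function names are made up, and sleep 3600 is only there to keep the ubuntu container alive long enough to inspect it.

```shell
# Start a long-running container limited to half a CPU and 100M of RAM.
# $1 is the name to give the container.
run_limited() {
  docker run -d --name "$1" --cpus=.5 --memory=100m ubuntu sleep 3600
}

# Show the limits recorded for the container: CPU in billionths of
# a CPU, followed by memory in bytes.
show_limits() {
  docker inspect \
    --format '{{.HostConfig.NanoCpus}} {{.HostConfig.Memory}}' "$1"
}

# Usage:
#   run_limited capped && show_limits capped
```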

Remember, Docker containers use the underlying kernel, so for linux containers you need to run docker on a linux host. That’s not to say the docker host can’t be a VM running on a hypervisor on a Windows physical host. If you want to play around with Docker, but only have access to a Windows machine, then you can install Oracle Virtualbox on Windows and create a Linux Ubuntu, Debian or CentOS VM. Then, install docker on the Linux VM and try out all the different commands for yourself. Docker also provides the Docker Toolbox, which puts all the pieces of software you might want in order to try it out together in one place for your convenience, instead of you having to download them from many different places. This is for older versions of Windows that don’t meet the requirements of the new Docker Desktop.

Download Docker Toolbox from https://docs.docker.com/toolbox/toolbox_install_windows/

The latest option for Windows 10 Enterprise or Professional or Windows Server 2016, is Docker Desktop for Windows, which removes the need for Oracle Virtualbox and uses Windows Hyper-V virtualisation instead.

Docker Desktop uses Hyper-V and the default option is Linux containers, however, Docker has announced that you can now package Windows containers for use on a Windows host, using Docker Desktop.

To use Windows containers, you must switch to Windows Containers mode in Docker Desktop in the menu.

Unlike Linux, in Windows there are two types of Windows Container. The first is the Windows Server container. The second is Hyper-V isolation, whereby a highly optimised VM guarantees total kernel isolation from the Windows host.

Windows Server containers and Hyper-V Isolation Containers. The latter guarantees Kernel isolation from the host.

Windows containers can be deployed to Windows Server Core or Nano Server.

CONTAINER ORCHESTRATION

Orchestration solutions such as Docker Swarm and Kubernetes (Google’s K8s) allow for the deployment and scaling of many docker containers across a cluster of many docker hosts, with advanced networking and sharing of cluster resources and storage between them. Docker Swarm is easy to set up but lacks some of the more advanced features now available on other solutions like Mesos and Kubernetes. Kubernetes is one of the highest ranked projects on GitHub.

DOCKER SWARM

To create a Docker Swarm Cluster, run the following commands on your designated Manager and Workers respectively. The docker swarm init command will generate a token code to use on the workers in that cluster.

docker swarm init --advertise-addr 192.168.1.12     #Initialize Swarm Manager
docker swarm join --token SWMTKN-1-<token-code> <token-mgr-ip>:2377   #Join a worker to the Swarm
docker swarm join-token manager                     #Print the command for adding a manager to a Swarm
docker service create --replicas=3 <image-name>      #Deploy a docker image to a Swarm

Just as with the docker run command, the docker service command supports many of the same options e.g.

docker service create --replicas=3 --network frontend <image>   #Deploy a docker image to a Swarm, connected to front end network
docker service create --replicas=3 -p 8080:80 mywebserver   #Deploy mywebserver image with port forwarder
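Once a service is deployed, you will usually want to check on it and change the number of replicas. The sketch below wraps the relevant commands. The function names and the service name web are made up for the example, mywebserver is the image name used above, and everything is run on a Swarm manager.

```shell
# Deploy the service under a known name so it can be referred to later.
deploy_web() {
  docker service create --name web --replicas=3 -p 8080:80 mywebserver
}

# Change the number of replicas of the web service to $1.
scale_web() {
  docker service scale "web=$1"
}

# List the nodes in the Swarm and the services running on it.
swarm_status() {
  docker node ls && docker service ls
}

# Usage:
#   deploy_web
#   scale_web 5
#   swarm_status
```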

KUBERNETES

There is a more in-depth review of Kubernetes here, where the journey into the topic of containerisation continues.
