A basic webserver

Docker containers are small OS images in themselves that one can deploy and run without worrying about dependencies or interoperability. All the dependencies are packed into the same container image, and the Docker runtime takes care of the interoperability. You are not tied to a single language or framework: you can write code in Python, Go, Java, Node.js, or any of your favorite languages and pack it in a container.

Consider a simple example of a Go-based webserver:

Go
$ cat src/main.go
package main

import (
    "fmt"
    "log"
    "net/http"
    "os"
)

const homepageEndPoint = "/"

// StartWebServer registers the homepage handler and starts the webserver
func StartWebServer() {
    http.HandleFunc(homepageEndPoint, handleHomepage)
    port := os.Getenv("PORT")
    if len(port) == 0 {
        panic("Environment variable PORT is not set")
    }

    log.Printf("Starting web server to listen on endpoints [%s] and port %s",
        homepageEndPoint, port)
    if err := http.ListenAndServe(":"+port, nil); err != nil {
        panic(err)
    }
}

func handleHomepage(w http.ResponseWriter, r *http.Request) {
    urlPath := r.URL.Path
    log.Printf("Web request received on url path %s", urlPath)
    msg := "Hello world"
    _, err := w.Write([]byte(msg))
    if err != nil {
        fmt.Printf("Failed to write response, err: %v\n", err)
    }
}

func main() {
    StartWebServer()
}

We can build this using

Bash
$ go build -v -o bin/server src/*.go
...

and run it using

Bash
$ PORT=8080 ./bin/server
...

You can test it at http://localhost:8080

Building the Docker image

Now, to package it into a Docker container, we will write a Dockerfile. We will use Alpine Linux as the base since it is a small image. How small? Let’s check.

First, pull (download) the image.

Bash
$ docker pull alpine

Now, check its size using this long command

Bash
$ docker image inspect alpine --format='{{.Size}}' | numfmt --to=iec-i
5.4Mi

That’s the base Linux image we will use. We want the base image to be small: as a rule of thumb, up to 100MB is acceptable, while larger images usually cause slow start times and other problems.
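
To make the numfmt step less magical: docker image inspect prints the size in plain bytes, and --to=iec-i turns that into 1024-based units such as Ki and Mi. The sketch below does a similar conversion in Go. The formatIEC name is hypothetical, and its rounding is not guaranteed to match numfmt exactly.

```go
package main

import "fmt"

// formatIEC (hypothetical helper) renders a byte count using binary,
// 1024-based units, similar in spirit to `numfmt --to=iec-i`.
// Rounding may differ from numfmt.
func formatIEC(n uint64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%d", n)
	}
	suffixes := []string{"Ki", "Mi", "Gi", "Ti", "Pi", "Ei"}
	value := float64(n) / unit
	i := 0
	// Keep dividing by 1024 until the value fits the current suffix.
	for value >= unit && i < len(suffixes)-1 {
		value /= unit
		i++
	}
	return fmt.Sprintf("%.1f%s", value, suffixes[i])
}

func main() {
	fmt.Println(formatIEC(1536))              // 1.5Ki
	fmt.Println(formatIEC(100 * 1024 * 1024)) // 100.0Mi
}
```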

Also, it is a good idea to pin to a particular version (tag) of the image. You can see all the versions at Docker Hub. We will compile our code right inside the image that will be used for running it, to avoid portability issues.

Now, write the Dockerfile.

Dockerfile
# Pull the image and call it base
FROM alpine:3.11 as base
# Copy the code
COPY src /codebase/src
# Build the binary
RUN cd /codebase && go build -v -o bin/server src/*.go

To build it, we will use the docker build command.

We will use Docker BuildKit since it’s the newer, fancier way of building Docker images.

Bash
# -f specifies the Docker file
# -t specifies the tag of the built image
$ DOCKER_BUILDKIT=1 docker build -f Dockerfile -t my_hello_world_server .
...
> [3/3] RUN go build -v -o /codebase/bin/server src/*.go:
#6 0.282 /bin/sh: go: not found

The command failed. Oops, we forgot to install the Go build toolchain. There are two ways to fix this: either we install the Go toolchain explicitly, or we just use an image someone else has built for us. Let’s do the latter and change our Dockerfile to the following

Dockerfile
# Start from the official Go image, which already contains the build toolchain
FROM golang:1.13.7-alpine3.11
# Copy the code
COPY src /codebase/src
RUN ls /codebase/src/main.go
# Build the binary
RUN cd /codebase && go build -v -o /codebase/bin/server ./src/main.go
# Set the env which will be available at runtime
ENV PORT=8080
# Specify the run command for the binary
CMD ["sh", "-c", "/codebase/bin/server"]

Now build it with

$ DOCKER_BUILDKIT=1 docker build -f Dockerfile -t my_hello_world_server .

Run the container

Run the container using

Bash
# Run the container tagged my_hello_world_server with the name my_hello_world_server and forward the 8080 port to the container's 8080 port.
docker run --name my_hello_world_server -p 127.0.0.1:8080:8080 -it my_hello_world_server

Check it out at http://localhost:8080/

If you kill the container and try to start it again, you will see

docker: Error response from daemon: Conflict. The container name "/my_hello_world_server" is already in use by container "e22e524035e3d939e431c1672945f7f962daecaa1c6368bb66a8ec2e6d408cbc". You have to remove (or rename) that container to be able to reuse that name.

To deal with that, just delete the stopped container with docker rm my_hello_world_server

Or run with

Bash
docker rm my_hello_world_server; docker run --name my_hello_world_server -p 127.0.0.1:8080:8080 -it my_hello_world_server

Or, as someone pointed out, start the container with --rm to remove it on exit

Bash
docker run --rm --name my_hello_world_server -p 127.0.0.1:8080:8080 -it my_hello_world_server

Optimizing Container Image Size

There is one problem, though: our Docker container image is big. Check its size with

Bash
$ docker image inspect my_hello_world_server --format='{{.Size}}' | numfmt --to=iec-i
350Mi

Wait, what? 350MB for just a hello world web server?

Our binary is small, and this indicates that something else is going on.

Bash
$ du -shc bin/server
7.0M bin/server
7.0M total

Let’s check the size of the base image

Bash
$ docker image inspect golang:1.13.7-alpine3.11 --format='{{.Size}}' | numfmt --to=iec-i
343Mi

So, the base image that we need for building the binary is enormous. But we don’t need the Go build toolchain at the time of execution. There are two options. We can build the binary on our machine, outside the Docker container, and copy it in. But that’s frowned upon: one has to be careful to build it for the right architecture. When you build and run it inside the same container architecture, you get that portability guarantee for free.

The alternative approach is to do what’s called a multi-stage build. We will build the binary in one Docker stage and then copy only that binary over to the next stage.

So, let’s write Dockerfile2

Dockerfile
# Pull the Go image and call this stage stage1
FROM golang:1.13.7-alpine3.11 as stage1
# Copy the code
COPY src /codebase/src
RUN ls /codebase/src/main.go
# Build the binary
RUN cd /codebase && go build -v -o /codebase/bin/server ./src/main.go

FROM alpine:3.11 as stage2
# We will copy the final binary from the previous stage to this stage
COPY --from=stage1 /codebase/bin/server /server
ENV PORT=8080
# Specify the run command for the binary
CMD ["sh", "-c", "/server"]

Build and verify that it works

Bash
$ DOCKER_BUILDKIT=1 docker build -f Dockerfile2 -t my_hello_world_server2 .
$ docker rm my_hello_world_server2; docker run --name my_hello_world_server2 -p 127.0.0.1:8080:8080 -it my_hello_world_server2
...

And check its size

Bash
$ docker image inspect my_hello_world_server2 --format='{{.Size}}' | numfmt --to=iec-i
13Mi

Remember, 5.4MB was the base image, and 7MB is our new web server binary, so this is about the smallest we can get with this base image.

A build-time optimization

Right now, your build step is taking less than a second. Let’s see what happens when we end up having a lot of unrelated files.

Bash
# Write a 1GB file
$ dd if=/dev/zero of=src/testfile bs=1024 count=1024000
# Now build it again

It takes about 47 seconds on my machine at the “Transfer context…” stage. What happens is that the build runs on the Docker daemon (server), and everything in the directory we specified as “.” while building is transferred to the daemon as the build context. Files that are not copied into the final stage do not end up in the image, so the final image size is still the same, but the build time becomes significant. To avoid this problem, add src/testfile to the .dockerignore file.

Bash
$ echo src/testfile >> .dockerignore
...

Now, try building again, and your build times will be back to normal. It is best to exclude big directories like .git or bin from the Docker build context to keep the builds fast.

Persistence

Docker containers don’t persist anything by default: whatever a container writes to its own filesystem is gone once the container is removed. The idea is to run stateless machines that are completely clean and that connect to stateful storage, like object storage or a SQL database, to store data. So, Docker cleanly separates storage from execution.

Look forward to part 2 of this post which talks about deploying Docker images to Google Cloud Run.