Docker: Placing limits on container memory using cgroups

Containers themselves are lightweight, but by default a container has access to all the memory resources of the Docker host.

Internally, Docker uses cgroups to limit memory resources, and in its simplest form this is exposed as the “-m” and “--memory-swap” flags when bringing up a container.

sudo docker run -it -m 8m --memory-swap 8m alpine:latest /bin/sh

However, you first need to ensure that the Docker host has cgroup memory and swap accounting enabled, so this article walks through each step and then validates the limits using a real application.

cgroup memory limits

In order to set maximum limits on memory, you need to have cgroup memory and swap accounting enabled in the kernel. On RPM-based systems (RHEL), this is already enabled, but on Debian/Ubuntu it must be enabled via GRUB kernel parameters. To check, run “docker info” and look for warnings about swap limit support.

$ docker info | grep swap
WARNING: No swap limit support

If you see the message above, or “WARNING: Your kernel does not support cgroup swap limit.”, then modify GRUB_CMDLINE_LINUX_DEFAULT in “/etc/default/grub” to include the following kernel parameters (alongside any existing values on that line):

GRUB_CMDLINE_LINUX_DEFAULT="cgroup_enable=memory swapaccount=1"

Then update grub (sudo update-grub) and reboot.
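Put together, the change on a Debian/Ubuntu host looks roughly like the sequence below (a minimal sketch, assuming GRUB2):

# edit the GRUB_CMDLINE_LINUX_DEFAULT line as shown above, using your editor of choice
sudo vi /etc/default/grub

# regenerate the grub configuration and reboot so the kernel picks up the parameters
sudo update-grub
sudo reboot

# after the reboot, docker info should no longer print the swap limit warning
docker info | grep -i warning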

Validate cgroup settings

To validate that these settings have taken effect, open one console to follow the systemd journal and look for messages containing “cgroup”.

journalctl -f | grep cgroup

And in another console, spin up a very small alpine container.

$ sudo docker run -it -m 8m --memory-swap 8m alpine:latest /bin/sh

If you see an immediate warning like the ones below, either from the console running docker or from journalctl, then the cgroup limits are not in place.

# if you see this from the docker console, cgroup limits not in place
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.

# if output from journalctl, cgroup limits not in place
level=warning msg="Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap."

If you do not see these messages, then you should have a container with an 8Mb limit on memory usage. The “free” utility does not reflect cgroup limits, so cat the following file to check the memory limit from inside the container.

$ sudo docker run -it -m 8m --memory-swap 8m alpine:latest /bin/cat /sys/fs/cgroup/memory/memory.limit_in_bytes
8388608

$ sudo docker run -it -m 10m --memory-swap 10m alpine:latest /bin/cat /sys/fs/cgroup/memory/memory.limit_in_bytes
10485760
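Note that the path above assumes the host is using cgroup v1 for the memory controller. If your host has moved to the unified cgroup v2 hierarchy, “memory.limit_in_bytes” will not exist; the equivalent file inside the container is “/sys/fs/cgroup/memory.max”, which should report the same byte value.

$ sudo docker run -it -m 8m --memory-swap 8m alpine:latest /bin/cat /sys/fs/cgroup/memory.max
8388608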

Program to exercise memory limits

We’ve seen the “memory.limit_in_bytes” value mirror the amount of memory we requested as a ceiling. But let’s put that to the test with a sample program running in the container that allocates memory until it fails.

I have a github project, golang-memtest, that allocates 1Mb chunks of memory at a time until it reaches a maximum amount specified by the user.  We are going to use this program to test the memory constraints.

Grab the project from github:

git clone https://github.com/fabianlee/golang-memtest.git

# ensure you have the make utility
sudo apt-get install make -y

# use Dockerfile to build image
make docker-build
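If the build succeeds, the image should show up in the local image list under the fabianlee/golang-memtest name used in the runs below:

# verify the image was built
sudo docker images | grep golang-memtest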

Now run the golang-memtest binary in a Docker container that has no limits on memory. Here we ask it to allocate 3Mb, which matches its default.

$ sudo docker run -it --rm fabianlee/golang-memtest:1.0.0 3

Alloc = 0 MiB	TotalAlloc = 0 MiB	Sys = 66 MiB	NumGC = 0
Asked to allocate 3Mb

Alloc = 1 MiB	TotalAlloc = 1 MiB	Sys = 68 MiB	NumGC = 0
Alloc = 2 MiB	TotalAlloc = 2 MiB	Sys = 68 MiB	NumGC = 0
Alloc = 3 MiB	TotalAlloc = 3 MiB	Sys = 68 MiB	NumGC = 0

Alloc = 3 MiB	TotalAlloc = 3 MiB	Sys = 68 MiB	NumGC = 0
SUCCESS allocating 3Mb

As expected, this succeeds. It allocated 3Mb in the container, and without limits, the Docker container had access to all the memory resources of the host machine.
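If you are curious what “no limit” looks like from inside the container, inspect the same cgroup file without the -m flag; on a cgroup v1 host this typically reports an effectively unlimited value (a very large number such as 9223372036854771712) rather than the physical RAM of the host.

$ sudo docker run -it --rm alpine:latest /bin/cat /sys/fs/cgroup/memory/memory.limit_in_bytes
9223372036854771712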

But now let’s run it with an 8Mb ceiling on container memory, and allocate 4Mb using the program.

$ sudo docker run -it --rm -m 8m --memory-swap 8m fabianlee/golang-memtest:1.0.0 4
Alloc = 0 MiB	TotalAlloc = 0 MiB	Sys = 66 MiB	NumGC = 0
Asked to allocate 4Mb

Alloc = 1 MiB	TotalAlloc = 1 MiB	Sys = 68 MiB	NumGC = 0
Alloc = 2 MiB	TotalAlloc = 2 MiB	Sys = 68 MiB	NumGC = 0
Alloc = 3 MiB	TotalAlloc = 3 MiB	Sys = 68 MiB	NumGC = 0
Alloc = 4 MiB	TotalAlloc = 4 MiB	Sys = 68 MiB	NumGC = 1

Alloc = 4 MiB	TotalAlloc = 4 MiB	Sys = 68 MiB	NumGC = 1
SUCCESS allocating 4Mb

Once again it is successful, as it should be: we had a ceiling of 8Mb and allocated only 4Mb.

But now let’s exceed the amount of memory available by allocating 10Mb.

$ sudo docker run -it --rm -m 8m --memory-swap 8m fabianlee/golang-memtest:1.0.0 10
Alloc = 0 MiB TotalAlloc = 0 MiB Sys = 66 MiB NumGC = 0
Asked to allocate 10Mb

Alloc = 1 MiB TotalAlloc = 1 MiB Sys = 66 MiB NumGC = 0
Alloc = 2 MiB TotalAlloc = 2 MiB Sys = 68 MiB NumGC = 0
Alloc = 3 MiB TotalAlloc = 3 MiB Sys = 68 MiB NumGC = 0
Alloc = 4 MiB TotalAlloc = 4 MiB Sys = 68 MiB NumGC = 1
Alloc = 5 MiB TotalAlloc = 5 MiB Sys = 68 MiB NumGC = 1
Alloc = 6 MiB TotalAlloc = 6 MiB Sys = 68 MiB NumGC = 1

We got a very abrupt exit from the program while allocating the 7th Mb; the Go runtime and other process overhead consume the rest of the 8Mb ceiling, so the limit is reached well before the requested 10Mb. If you look through the last lines of the systemd journal with journalctl, you can see the kernel reporting that the memory limit has been reached and the process being killed.

$ journalctl -n 100 --no-pager | grep memory -A5
...
Jan 18 16:23:22 docker1804 kernel: memory: usage 8192kB, limit 8192kB, failcnt 0
Jan 18 16:23:22 docker1804 kernel: memory+swap: usage 8192kB, limit 8192kB, failcnt 46
...
Jan 18 16:23:22 docker1804 kernel: Memory cgroup out of memory: Kill process 2808 (golang-memtest) score 1053 or sacrifice child
Jan 18 16:23:22 docker1804 kernel: Killed process 2808 (golang-memtest) total-vm:104576kB, anon-rss:7156kB, file-rss:1396kB, shmem-rss:0kB
...

When this happens, the program running in the container does not get a chance to clean up, nor does it receive a catchable OS-level signal (e.g. SIGTERM or SIGQUIT). The OOM killer in the kernel abruptly terminates the process with SIGKILL.
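Docker itself also records the OOM kill. If you re-run the failing case without the --rm flag (here using an example container name of “memtest”), the container state should show an exit code of 137 (128 + SIGKILL) and the OOMKilled flag set to true:

$ sudo docker run -it -m 8m --memory-swap 8m --name memtest fabianlee/golang-memtest:1.0.0 10

$ sudo docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' memtest
137 true

# cleanup
$ sudo docker rm memtest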

The OOM killer also exposes this event in dmesg.

$ dmesg | grep -Ei "killed process"
[ 1015.802037] Killed process 2808 (golang-memtest) total-vm:104576kB, anon-rss:7156kB, file-rss:1396kB, shmem-rss:0kB

 

REFERENCES

docker, runtime constraints on resources

docker, container has access to all host resources

kernel.org, cgroups reference

stackoverflow, no swap limit support fix

docker, your kernel does not support swap limit capabilities

stackexchange, swap limits

linuxhint, cgroups and cpu.cfs_period_us and cpu.cfs_quota_us

stackoverflow, definitions of cpu.cfs_period_us and cpu.cfs_quota_us

thegeekdiary, quotas on xfs

linuxtechi, good explanation of xfs quotas

held.org.il, examples of project based quotas, /etc/projects and /etc/projid files

scriptthe.net, hard quotas on directory

stackoverflow, setting GRUB_CMDLINE_LINUX_DEFAULT to support xfs quota

xfs_quota man page, shows how to init configuration without config files

invent.life, create loopback device with ext4

man mount page

github, cadvisor container resource monitor

stackoverflow, warning on cgroup swap limit

github, swap memory limit and accounting, grub config

tc man page, network traffic control

github magnific0, wondershaper uses tc under the hood

chandanduttachowdhury, network limiting with tc

cybercracking, tc examples; adding delay to every package, variance, packet loss

github, docker image of fabianlee/golang-memtest

 

NOTES

Check current and peak memory usage inside the container

cat /sys/fs/cgroup/memory/memory.usage_in_bytes
cat /sys/fs/cgroup/memory/memory.max_usage_in_bytes
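
As a quick cross-check from the host side, “docker stats” shows live memory usage against the configured limit for each running container:

# shows a MEM USAGE / LIMIT column for each running container
sudo docker stats --no-stream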