A Look at How Often Docker Images are Updated

In our last blog, we reported on operating systems usage on Docker Hub, focusing on official base images.

Most users do not build their container images from scratch; they build on top of these base images, for example by extending an image such as library/alpine:latest with their own application content. Whenever one of these base operating system images is updated, images built on top are typically rebuilt in order to inherit the fixes included in the base image. In this blog, we look at base image updates: how often they happen, what changes they contain, and how that impacts end users.

If you want to check the update history of a particular image, the Navigator makes that simple.

For example, take debian:latest, currently the most popular base operating system among official images.

Here you can see the date that the image was created and when it was analyzed by Anchore.
The Anchore service continually polls Docker Hub. When an update to a repository is detected, the list of tags and images is retrieved, and any new images are downloaded and analyzed. Anchore maintains a database of image and tag history, so previous versions of a tag may be inspected at any point in time. Clicking on the Previous Image button navigates to the image that was previously tagged library/debian:latest. The Next Image button is disabled because, at the time the screenshot was taken, this was the latest image tagged as library/debian:latest.

By clicking on Previous Image in the top left, you can explore the Navigator’s analysis of older images of the same tag. In this case, the previous version is only a week old, but for the most part, you will see that Debian is updated every two to five weeks.

Putting these dates onto a timeline, we see that debian:latest is updated roughly every month. Looking at the update frequency of other popular official operating system images, once a month is just about average. While this might seem ideal, what really matters is the timeliness and the content of each update. For example, if a new critical vulnerability is discovered the day after a scheduled image update, users should not have to wait another month for a fix. Users can certainly update these images with fixes themselves, and in fact this should be part of the due diligence performed when creating images. However, the content published in public registries should be secure off the shelf.
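To make the "roughly every month" observation concrete, the gaps between successive image creation dates can be computed directly. The following is a minimal sketch, not the method used by the Navigator; the function name update_gaps and the dates are hypothetical and chosen purely for illustration.

```python
from datetime import date
from statistics import mean


def update_gaps(created_dates):
    """Return the number of days between consecutive image updates."""
    ordered = sorted(created_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical creation dates for successive images of one tag (not real data).
example_dates = [
    date(2020, 1, 6),
    date(2020, 2, 3),
    date(2020, 3, 9),
    date(2020, 3, 16),
]

gaps = update_gaps(example_dates)  # days between each pair of updates
average_gap = mean(gaps)           # average update interval in days
print(gaps, round(average_gap, 1))
```

Looking at the individual gaps rather than only the average shows whether a tag is updated on a steady cadence or in irregular bursts.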

This timeline compares the update frequency of some major operating system repositories. None of these repos has a fixed update schedule. Ubuntu and Debian are fairly consistent, while the rest vary widely. For example, CentOS now sticks to about one update a month, but previously had large gaps between updates, up to three months long. On the flip side, Oracle Linux has clusters where multiple updates come out in a short time period. Interestingly, four of the repos have each had eight or nine updates over the last year. Is that the number at which exposure to security issues is balanced against pushing too many updates? Something else to consider is that having more packages means that there are more things to keep up to date, so lightweight operating systems like Alpine and BusyBox do not need as much maintenance. However, this doesn’t explain why CentOS and Fedora are updated infrequently, as they are both much larger than Debian and Ubuntu.

Moving on to popular non-OS images, the difference in update frequency is striking. NGINX, the repo with the fewest updates here, has more updates than Oracle Linux, which had the most updates of all the operating systems. Calling back to the fact that more complexity means more maintenance, this increase makes sense. In future blogs, we will dig into what is changing between image updates.

Because many of the application images are built on top of official base operating system images, in theory they should be rebuilt when the underlying base image is updated. Sadly, that is often not the case: we see base operating system images updated with a fix while the application image is not rebuilt for several weeks, and in some cases it is rebuilt on top of an older base operating system image.
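One way to check whether an application image has picked up the current base image is to compare layers: an image built on a base carries that base's layers as the leading entries of its own layer list. The sketch below assumes the two layer lists have already been obtained (for example, from the RootFS.Layers field reported by docker image inspect); the function name is_built_on and the digests are hypothetical placeholders.

```python
def is_built_on(app_layers, base_layers):
    """Return True if base_layers are the leading layers of app_layers."""
    if not base_layers:
        return False  # an empty base tells us nothing
    return app_layers[:len(base_layers)] == base_layers


# Hypothetical layer digests, shortened for readability (not real images).
old_base = ["sha256:aaa1"]
new_base = ["sha256:bbb2"]
app_image = ["sha256:aaa1", "sha256:ccc3", "sha256:ddd4"]

print(is_built_on(app_image, old_base))  # app still sits on the old base
print(is_built_on(app_image, new_base))  # app has not picked up the new base
```

If the check fails against the newest base image but succeeds against an older one, the application image is still waiting for a rebuild.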

While all official images should follow Docker Hub best practices and should therefore be well maintained, it is clear from our historical data that many images are updated infrequently and carry security vulnerabilities for many weeks.

If you are trying to choose a non-official image, it is important that you look into its update history, since many images on Docker Hub are one-offs that were built by an engineer to ‘scratch an itch’, pushed to Docker Hub, and never maintained. While such an image may seem to have exactly what you are looking for, it’s important to note that you are in effect adopting the image, and you are then responsible for its care and feeding!

One last interesting piece of information is that there are a few days (10/21, 1/17, 2/28, 4/25, …) on which many of the repos push updates at the same time. In many cases this occurs the day after their base image, debian:latest, was updated. This supports the idea that these images update more frequently because they have to keep up with updates of their base image.
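The "day after the base image" pattern can be checked mechanically by matching each application update against the base image's update dates. This is a rough sketch under stated assumptions: the function name updates_following_base, the within_days parameter, and all dates are hypothetical and are not taken from the timeline.

```python
from datetime import date, timedelta


def updates_following_base(base_dates, app_dates, within_days=1):
    """Return app update dates that fall 1..within_days days after a base update."""
    window = timedelta(days=within_days)
    following = []
    for app_day in sorted(app_dates):
        # Keep the app update if some base update precedes it within the window.
        if any(timedelta(0) < app_day - base_day <= window for base_day in base_dates):
            following.append(app_day)
    return following


# Hypothetical dates for illustration only.
base_updates = [date(2020, 3, 2), date(2020, 4, 6)]
app_updates = [date(2020, 3, 3), date(2020, 3, 20), date(2020, 4, 7)]

followers = updates_following_base(base_updates, app_updates)
print(followers)
```

Widening within_days gives a feel for how quickly application images tend to follow a base update, rather than only counting next-day rebuilds.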

As we alluded to earlier, the content of an image update is just as important as, if not more important than, its timing. In the next blog, we’ll dig into a more detailed timeline of updates, starting with the disclosure of a vulnerability and following it through to when the operating system vendor patched it, when that patch was included in a container image, and when an application image pulled in the update.