How Many CVEs?

For most users, analyzing or auditing container images means running a CVE scan, and while that is certainly required, it should be just the first step. Anchore supports creating policies that can be used to assess the compliance of your containers. These policy checks could cover security, starting with the ubiquitous CVE scan but then going further to analyze the configuration of key security components. For example, you could have the latest version of the Apache web server but have configured the wrong set of TLS cipher suites, leading to insecure communication. Outside of security, policies could cover application-specific configurations to comply with best practices or to enable consistency and predictability.

Today there are many tools that can perform CVE scans of a container image; however, when we speak to users, we often hear that either they do not perform these scans or, if they do, they do not gate container deployments based on the results. When we asked these users why they didn’t stop their deployments based on the CVE scanner’s results, we were told: “If we did, then we’d not deploy any containers – they all fail!”

This is a common issue. For example, if you look at the official images on Docker Hub or Docker Store for CentOS, Debian, Oracle or Ubuntu, they all appear to have high or critical vulnerabilities, many of which are unfixed; some are CVEs that have been unresolved for a year or more.

We have covered this topic previously with respect to CentOS, where we saw many vulnerabilities reported by other tools in the CentOS image that were not accurate, and similar issues with Oracle and RHEL images.

It has been pointed out that Debian, which, as we discussed in our previous blog, is the most popular operating system used on Docker Hub, seems to have the most vulnerabilities. Regardless of the CVE scanner used, the Debian image looks insecure, with many unpatched vulnerabilities. But looks can be deceptive.

We will take a look at the Debian image and discuss the results found by various scanners, explain the differences in results and show how you can remove the noise and get a clear view of the security of your containers.

Which Package is Vulnerable?

Let’s start by looking at a vulnerability reported in the latest Debian image: CVE-2017-12424, which describes a vulnerability in the shadow project, the source of the tools and libraries for maintaining the password database.

Looking at the output of most of the CVE scanners you will see output similar to the following:

Here we can see that shadow version 4.4-4.1 is installed and is vulnerable to the critical severity CVE-2017-12424. But if you look for the shadow package in your image, you will not find it.

root@debian:/# dpkg -s shadow
dpkg-query: package 'shadow' is not installed and no information is available

So if you try to upgrade that individual package you’ll receive an error.

Debian reports CVEs against source packages rather than against binary packages, so in this example the source package is shadow while the binary package is called passwd.

You can look up the source package for a given binary package either using the dpkg utility or using apt-get source.

root@debian:/# dpkg -s passwd | grep Source
Source: shadow

This example using the shadow package is rather straightforward; I would expect most readers of this article to quickly work out the mapping to the passwd binary package. In many other cases, however, things are not so simple. For example, many non-kernel binaries are built from the linux source package, which leads to some tools reporting kernel CVEs in a container image that includes no kernel.
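The mapping can also be built in bulk. A minimal sketch, assuming you have captured `dpkg-query -W -f='${Package} ${source:Package}\n'` output from the image (the sample data below is illustrative, not a full Debian manifest):

```python
# Build a binary -> source package map from dpkg-query output.
# In a real container you would capture this with:
#   dpkg-query -W -f='${Package} ${source:Package}\n'
# The sample below is illustrative output from a Debian image.
from collections import defaultdict

sample = """\
passwd shadow
login shadow
mount util-linux
bsdutils util-linux
libuuid1 util-linux
"""

def binary_to_source(dpkg_output: str) -> dict:
    """Map each binary package to its source package."""
    mapping = {}
    for line in dpkg_output.splitlines():
        binary, source = line.split()
        mapping[binary] = source
    return mapping

mapping = binary_to_source(sample)
print(mapping["passwd"])  # shadow

# Invert the map to see every binary built from one source package.
by_source = defaultdict(list)
for binary, source in mapping.items():
    by_source[source].append(binary)
print(sorted(by_source["util-linux"]))  # ['bsdutils', 'libuuid1', 'mount']
```

Inverting the map makes the one-to-many relationship obvious: a single CVE against one source package can fan out to several binary packages.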

For this reason, both the Anchore Cloud and the Anchore Engine report on the binary package, not the underlying source package.

How Many Vulnerabilities?

Analyzing the same image with different scanners often results in very different numbers of reported vulnerabilities. In some cases, as we described in a previous blog, this may be a result of the scanner not taking into account the backporting of fixes, or not using the distribution’s own security vulnerability feed. In other cases, the mapping of a CVE to source and binary packages causes confusion. For example, looking at the current debian:latest image using Anchore’s scanner, we can see eight packages shown as being vulnerable to the issue described in CVE-2016-2779.

This CVE was reported against the util-linux source package, and the binary packages listed in this report are all built from it. A tool, such as Anchore, that reports on binary packages will report all eight packages against that CVE, while a tool that reports on source packages may report only one. Whether one or all of these binary packages are actually affected by the vulnerability described in the CVE requires digging deeper into the bug reports and mailing list traffic. Ideally, the distributions would provide more binary-specific details in their vulnerability data to assist in this mapping.
A more interesting question is: if even one of these packages is vulnerable, why are fixes still not available after nearly 18 months? We need to dig deeper.

Is it Really Vulnerable?

As we saw in the previous example, we often see unfixed CVEs in images. For example, in the current debian:latest image we see 50 vulnerabilities that have no fixes, 12 of which are rated as High severity.

Even if we count only unique CVEs, rather than packages, we still see three High, three Medium, one Low and 15 Negligible CVEs.
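The gap between these two counts comes from the deduplication step: scanner findings are per (package, CVE) pair, so collapsing them by CVE ID gives the number of distinct issues. A sketch with illustrative findings data (field names are our own, not any particular scanner's output format):

```python
# Scanner findings are per (package, CVE) pair, so one CVE against a source
# package with many binary packages inflates the count. Deduplicating by CVE
# ID gives the number of distinct issues. The sample data is illustrative.
from collections import Counter

findings = [
    {"cve": "CVE-2016-2779", "severity": "High", "package": "util-linux"},
    {"cve": "CVE-2016-2779", "severity": "High", "package": "mount"},
    {"cve": "CVE-2016-2779", "severity": "High", "package": "bsdutils"},
    {"cve": "CVE-2017-12424", "severity": "High", "package": "passwd"},
]

# Collapse findings to one entry per CVE, then tally by severity.
unique = {f["cve"]: f["severity"] for f in findings}
by_severity = Counter(unique.values())

print(len(findings), "package findings ->", len(unique), "unique CVEs")
print(by_severity)  # Counter({'High': 2})
```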

Why is that number so high? Since this is the latest official Debian image, there is certainly more to this, especially given that the Debian security team is renowned for its focus and responsiveness.

We can start by looking at the CVE in Debian’s security tracker: CVE-2016-2779. Here you can see that the current stable version of Debian, stretch, is classified as being vulnerable, however, looking in the notes section we see the following:

The security team notes that no Debian Security Advisory will be issued for this vulnerability (no-dsa). You can read more about this in the Debian Security FAQ here. The concept is that while a source package may be vulnerable, the way it is compiled or deployed may mitigate the issue. In some cases this is because the package is built with specific compile-time options that don’t trigger the security issue; in other cases it is due to the environment in which it is run. In this specific case, it was decided that the best approach was to address the underlying issue in the Linux kernel so that no version of this package could trigger the vulnerability.

Based on this data, we should not be concerned about this high severity CVE in our scan results. There is certainly an argument to be made that, given this fact, this CVE should perhaps not be reported in the Debian vulnerability feed at all, which is the approach that Red Hat-based distributions such as CentOS, Oracle and RHEL take. I have not looked into the history of this decision but can imagine strong arguments on both sides. Presumably anticipating this issue, the Debian team includes metadata in their vulnerability feed that records the no-DSA decision and commentary. This is data we can then use as part of our analysis of the image.
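That metadata can be consumed programmatically. The Debian security tracker publishes a JSON export of per-CVE, per-release status; the sketch below mirrors its general shape (nested package → CVE → releases, with a "nodsa" annotation), though the exact field names should be checked against the current export before relying on them:

```python
# Filter open CVEs for a Debian release, skipping entries the security team
# has tagged no-dsa. The nested structure mirrors the security tracker's
# JSON export; treat the field names as an assumption about that format.
tracker = {
    "util-linux": {
        "CVE-2016-2779": {
            "releases": {
                "stretch": {
                    "status": "open",
                    "urgency": "high",
                    "nodsa": "Addressed via a fix in the Linux kernel",
                }
            }
        }
    },
}

def actionable_cves(data: dict, release: str) -> list:
    """Return (package, CVE) pairs open for `release` and not marked no-dsa."""
    result = []
    for package, cves in data.items():
        for cve_id, info in cves.items():
            rel = info.get("releases", {}).get(release, {})
            if rel.get("status") == "open" and "nodsa" not in rel:
                result.append((package, cve_id))
    return result

# The only open CVE in the sample is no-dsa, so nothing is actionable.
print(actionable_cves(tracker, "stretch"))  # []
```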

Looking at the current debian:latest image using Anchore Cloud, we can view the image’s policy status to see if it PASSES or FAILS based on the default image policy.

Here we can see that the image has failed; scrolling down, we can see the 12 High severity vulnerabilities that led to this result. You can read more about image policies here.

Anchore includes support for whitelists, which allow certain policy checks, such as select CVEs, to be suppressed. As we saw earlier in this blog, a CVE may be present in a package but not be exploitable given that package’s configuration. Using the Anchore policy editor, a user can create and manage whitelists to filter out these false positives.

Whitelists can be created and managed in the policy editor or from the image’s policy view, but in the case of Debian security advisories it is easier to create a whitelist and upload it into the Anchore Cloud.

We have published a simple utility that creates whitelists based on the data published in the Debian security tracker. You can clone this utility from our public GitHub repository.

git clone

Running the debian-whitelist.py utility will create a JSON document for each of the current Debian releases: Wheezy, Jessie, Stretch and Buster (which right now has no whitelisted CVEs).
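The whitelist schema itself is Anchore-specific, but the idea the utility implements is simple: collect the no-dsa CVE IDs for each release and write one document per release. A hypothetical sketch of that step (the JSON field names here are invented for illustration, not Anchore's actual whitelist format):

```python
# Hypothetical sketch of what a debian-whitelist generator does: gather the
# CVE IDs marked no-dsa for each release and emit one whitelist document per
# release. The JSON schema below is illustrative, not Anchore's real format.
import json

nodsa_by_release = {
    "stretch": ["CVE-2016-2779", "CVE-2017-12424"],
    "buster": [],  # no whitelisted CVEs for Buster at the time of writing
}

for release, cves in nodsa_by_release.items():
    doc = {
        "name": f"debian-{release}-nodsa",
        "items": [{"trigger": "cve", "id": cve} for cve in cves],
    }
    with open(f"whitelist-{release}.json", "w") as fh:
        json.dump(doc, fh, indent=2)

with open("whitelist-stretch.json") as fh:
    print(json.load(fh)["name"])  # debian-stretch-nodsa
```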

1. Create a free account on the Anchore Cloud

2. Open the Policy editor by selecting the menu icon on the left navigation menu.

3. Expand the Whitelist editor

4. For each whitelist press the “Upload Whitelist Item” button and upload the JSON document. The whitelists will be named based on the version of Debian and the date. These names can be edited to be more user friendly.

You will now have whitelists for each Debian version.

Next, we need to use the Mapping Editor to define what whitelist is used for a specific image.

5. Expand the Mapping Editor

6. Select “Create New Mapping” to create a new mapping.

7. Give your mapping a name, e.g. “Debian latest”

8. Specify library/debian as the repository name

9. Specify latest as the tag

10. Select the whitelist you created from the dropdown.

11. Select “Save All” to save the whitelist and policy mapping.

Now when you view the Debian image you will see that a user-defined policy has been used and that the image passes.

The whitelisted CVEs can be viewed by checking the Show Whitelisted entries checkbox.

Currently, Anchore only applies the whitelist to the policy view and not to the list of CVEs presented in the security tab which shows all CVEs present in the image.

By using the default policy we are just performing basic CVE policy checks on the image but using the policy editor you can create policies that do much more.

Anchore Cloud 2.0

Today Anchore announced the release of Anchore Cloud 2.0, which builds on top of Anchore’s open source Engine to provide a suite of tools that allow organizations to perform detailed analysis of container images and apply user-defined policies, ensuring that containers meet the organization’s security requirements and operational best practices.

Anchore released the Anchore Navigator back in October 2017 and since then thousands of users have used the service to search for container images, perform analysis on these images and sign up to receive notifications when images were updated.

The Anchore Cloud 2.0 release adds a number of exciting new features for all users and a new paid tier which offers support and added features for subscribers.

Graphical Policy Editor

The new graphical policy editor allows all users to define their own custom policies and map which policies are used with which images. These policies can include checks for security vulnerabilities, package whitelists and blacklists, configuration files, secrets in images, manifest changes, exposed ports and many other user-defined checks. The policy editor supports CVE whitelisting, allowing a curated set of CVEs to be excluded from security vulnerability reporting.
Using the policy mapping feature, organizations can set up multiple different policies that will be used on different images based on use case. For example, the policy applied to a web-facing service may have different security and operational best practices rules than a database backend service.

Anchore policy editor view

Private Repositories

Subscribers can configure the Anchore Cloud to scan and analyze images in private repositories stored on DockerHub and Amazon EC2 Container Registry (ECR).

Once configured, the service checks for changes to the repository approximately every 15 minutes. When a change is detected, for example a new image being added to the repository or a tag being changed, Anchore will download any new images, perform deep inspection and evaluate the images against the policies defined by the user.

Anchore registries editor view

Notifications

Previously, the Anchore Cloud allowed users to subscribe to a tag and be notified when that tag was updated – for example, when a new debian:latest image was pushed to Docker Hub.

For subscribers, the Anchore Cloud can now alert you by email when CVEs have been added or removed from your image and when the policy status of your image has changed, for example, an image that previously passed is now failing policy evaluation.

Example of Anchore notification

On-Premises Integration

Anchore Cloud supports integration with Anchore’s open-source Engine for on-premises deployments, allowing the policies defined on Anchore Cloud service to be applied to images created and stored on-premises.

Anchore Cloud supports integration with CI/CD platforms such as Jenkins, allowing containers built in the cloud or on-premises to be scanned as part of the CI/CD workflow ensuring proper compliance prior to production deployment.

More Than Just Security Updates

In our last blog, we talked about how quickly different repos respond to updates to their base images. Any changes made to a base image need to be picked up by the application images built on top of it, so updates to popular base images spread far and, as we saw in the last blog, quickly.

The only type of update we have covered so far in this series of blogs is security updates. However, that is only one part of the picture; package updates may contain non-security bug fixes and new features. To gain some insight into what is being changed in these updates, we have broken down exactly what packages change for a few of the more popular operating system images.

One interesting time to look at package differences is when the operating system gets updated to a new version.

 

Centos 7.4 overview in Anchore

Looking at the overview tab for library/centos:latest just after it was updated to version 7.4, the Navigator shows, in the chart on the right-hand side, that this update brought many changes. Shown below is a breakdown of which packages have been updated since last September. Only a portion of the packages is shown; you can find the rest via the link below.

Focusing on just the most recent update, we see that 80 of the 145 packages were updated. The image from Sep 13th was CentOS 7.3, while the one from Sep 14th is CentOS 7.4. Looking into some of the changes: bash, like many others, received backports of bug fixes. Other packages were new additions, such as elfutils-default-yama-scope, while one, pygobject3-base, was removed from the image. In terms of CVE/security updates, this was an uneventful update; a quick check of the security tab of both versions (7.3, 7.4) shows that there were no changes in CVEs between the two.
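This kind of breakdown falls out of a straightforward diff of two package manifests. A minimal sketch, with illustrative package versions standing in for the real CentOS manifests:

```python
# Diff two package manifests to see what an image update actually changed:
# additions, removals, and version bumps. Versions below are illustrative.
def diff_manifests(old: dict, new: dict):
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    updated = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, updated

centos73 = {"bash": "4.2.46-21", "pygobject3-base": "3.22.0-1",
            "glibc": "2.17-157"}
centos74 = {"bash": "4.2.46-28", "glibc": "2.17-196",
            "elfutils-default-yama-scope": "0.168-8"}

added, removed, updated = diff_manifests(centos73, centos74)
print(added)    # ['elfutils-default-yama-scope']
print(removed)  # ['pygobject3-base']
print(updated)  # ['bash', 'glibc']
```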

Click the button below to access the full spreadsheet with all package updates for 6 popular operating systems.

View the Full Spreadsheet

In the spreadsheet, you’ll see Alpine stands out in terms of image size and reduced package count. Having more packages means having more packages to maintain. Even if Alpine were to update almost all of its 11 packages, as it did on May 25th, there would not be as many changes as in a standard Debian update, such as the one on June 7th, where 25 of 81 packages were updated. There is a trend towards lightweight images, and the appeal of simpler updates might be one reason behind it. Among public repositories, Alpine is growing its share of usage as a base image. Other base operating systems are beginning to include slim versions of their containers: Debian, for example, has a slim version of each of its releases, as do Oracle and Red Hat.

Comparing the sizes of the two Debian tags included in the spreadsheet, stretch and stretch-slim, we see that the slim version is about half the size of the original: 95 MB vs 53 MB. The trend holds across releases too; Debian Stretch (Debian 9) images are around 90 MB while Jessie (Debian 8) images are around 120 MB, and Ubuntu 16.04 is around 120 MB while 17.04 is around 90 MB. One repository not slimming its images is CentOS. It does not currently provide slim versions, even though Red Hat Enterprise Linux, on which CentOS is based, has a slim image known as RHEL Atomic.

Part of slimming down containers is removing packages that are not necessary. In some instances, packages are included that are arguably not required in the context of a container, such as device-mapper or dracut. This harkens back to a previous blog, where we discussed how containers are often being used as micro-VMs rather than microservices. The packages listed above, among others, lend themselves to running a full machine rather than just a single application. Removing these extra packages is not as simple as it initially appears. For example, in the CentOS 7 image, dracut, which builds initramfs images to boot an OS, is pulled in as a requirement by kmod, which provides infrastructure for kernel module loading, which is in turn pulled in by systemd. We see many similar examples in the traditional Linux vendors’ images, where the package management system was designed before the advent of containers. This is a topic we will revisit in a future blog.

Even though smaller base images require less maintenance and storage, having fewer packages means less functionality. Most application images built on top of Alpine require that users add many more packages onto the base image so application images are often significantly larger than the 4MB base image. But having the choice to add to an image rather than working out how to remove packages certainly simplifies maintenance for the developer and allows for a reduced attack surface. In our next blog, we will look at some popular application images that are available based on multiple distributions to see how the image size, bug count and update frequencies compare.

To Update or Not to Update

In the previous blog, we presented our analysis of image update frequency for official DockerHub images and the implications for application images built on top of these base images. It was pointed out in a Reddit reply by /u/AskOnDock29 that users can update the operating system packages in the images themselves, independently of the official image, and so the frequency, or infrequency, of base image updates is not a concern since it is easily manageable by end users. This Redditor is indeed correct: users can update operating system packages when building on top of an official or other base image. Whether this happens in reality is an interesting question that we will get to shortly.

When the Anchore Navigator downloads images from Docker Hub, we derive the Dockerfile from metadata contained in the image. The Anchore Navigator’s Build Summary pane on the overview tab displays this information by showing the commands run in each layer of the Dockerfile. Using library/mysql as an example, we see that new files and packages are added to the image; however, the base packages are not updated.

This is the view that the Navigator gives of the mysql Dockerfile. The derived Dockerfile is not identical to the Dockerfile used to construct the image, since metadata such as the names of copied files or the image that was used as a base are lost during the build. But the derived Dockerfile does include the commands used and the image metadata. In this example, searching through each layer, we do not find package update instructions.

Running some quick analysis against our dataset: out of the 22,413 non-official images tagged as ‘latest’ since September of last year, 6,099 (27%) included package update commands in their Dockerfiles. Grouping by repository instead of image, 80 out of 559 (14%) non-official repositories had update commands at some point over the last year. This does not mean that the remaining images have outdated packages or known CVEs, since their base images may be up to date, and there are other ways to include the latest packages, for example starting from scratch and adding files and packages manually. Anchore’s dataset includes file and package manifests for all of these images, so we can verify the package set to look for updates without needing to analyze the Dockerfile.
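Detecting update commands in a derived Dockerfile amounts to pattern matching on its RUN instructions. A rough sketch (the patterns cover common apt/yum/apk upgrade forms and deliberately ignore `apt-get update`, which only refreshes package indexes; real Dockerfiles may hide updates in scripts, so treat this as a heuristic):

```python
# Heuristic check for package-upgrade commands in a derived Dockerfile.
# `apt-get update` alone is excluded: it refreshes indexes without
# upgrading any installed package.
import re

UPDATE_RE = re.compile(
    r"\b(apt-get\s+(upgrade|dist-upgrade)|yum\s+(update|upgrade)|apk\s+upgrade)\b"
)

def has_update_command(dockerfile: str) -> bool:
    return any(UPDATE_RE.search(line) for line in dockerfile.splitlines())

# Illustrative Dockerfiles, not the real library/mysql one.
mysql_like = """\
FROM debian:stretch
RUN apt-get update && apt-get install -y mysql-server
"""
patched = "FROM debian:stretch\nRUN apt-get update && apt-get dist-upgrade -y\n"

print(has_update_command(mysql_like))  # False -- install only, no upgrade
print(has_update_command(patched))     # True
```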

So should a user upgrade operating system packages when building their images? Ideally, no. Docker’s best practices recommend that users should not run apt-get upgrade or dist-upgrade within their Dockerfile but should instead contact the maintainer of the parent image. Minimizing package changes within the image also helps to improve the reproducibility of the build.

If there are no upgrade/update commands or manual package management, then there are two ways to keep the base image up to date: the user can update it manually, or the maintainers can push a new image whenever the base image is updated, which the user then uses when rebuilding their image. As previously covered, the second method is preferable. Since the majority of repositories do not use upgrade commands, and therefore depend on the maintainer or user updating the base image, it is interesting to see how well repositories handle this responsibility and keep up to date with base image updates.

Starting with about 10,000 public community images, we see that a new, updated image is pushed, on average, close to a week (six days and 20 hours) after an updated operating system image is made available. We have excluded the first image of each repository from the analysis, since our focus is the frequency and timing of updates. The analysis does include many images that are small side projects, built for a single purpose and not actively updated; at the same time, images that are updated nightly are also included, so there is a balance.
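The lag metric behind these numbers is simply the time between a base-image push and the downstream rebuild, averaged over all pairs. A sketch with illustrative timestamps (a real analysis would pull these from image metadata):

```python
# Compute the average lag between a base-image push and the downstream
# rebuild. Timestamps are illustrative stand-ins for real image metadata.
from datetime import datetime, timedelta

pairs = [
    ("2017-07-05T12:00", "2017-07-05T12:50"),  # a fast repo: ~50 minutes
    ("2017-06-20T00:00", "2017-06-26T20:00"),  # a slower community repo
]

def parse(ts: str) -> datetime:
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M")

lags = [parse(child) - parse(base) for base, child in pairs]
average = sum(lags, timedelta()) / len(lags)
print(average)  # 3 days, 10:25:00 for this sample
```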

However, looking at a number of the most popular community images offers a view into repositories that have a following and need to be maintained more actively than other images.

Here, the average time to update is just a little over five days, a good bit lower than the average for all of the images.

As mentioned earlier, no official images include update commands, so updates have to come in the form of image updates. Given their elevated standard and visibility, these updates need to be timely; an update to the base image should be responded to quickly. We see that this is, in fact, the case: the average update time across the official images is about one day and 10 hours.

Taking a similar look at popular images, it is worth noting that none of them takes longer than three days to respond, and there is no real correlation between popularity and update times among these images. A repo like library/node takes nearly three days on average, while library/mysql takes a little over half a day. There is certainly a correlation on a larger scale – more popular images have quicker update times – but there is quite a bit of variance along the way.

To see why these updates matter, we’ll go through the life cycle of a security flaw, RHSA-2017:1481. This flaw affected the glibc package in Red Hat Enterprise Linux (RHEL) and could allow a user to escalate their privileges. Because CentOS is compiled from RHEL sources, any image built on top of RHEL or CentOS carried this flaw. To focus on just one, we will look at jboss/wildfly, which is built on top of CentOS. Knowledge of the flaw was made public on June 19th of this year, and a fix was published by Red Hat almost immediately, with the fix for CentOS being made available on June 20th. The CentOS image was then updated to include the fix 15 days later, on July 5th.

Using the security tab for that image, you can see that RHSA-2017:1481 is not present.

However, clicking on the previous image button will take you to the image that was pushed on June 5th, which was affected by the glibc flaw.

The maintainers of the jboss/wildfly image have a really good update schedule, so a new image incorporating the fix was made available within only 50 minutes of the updated CentOS image being released. However, since the parent image was only updated after 15 days, the wildfly image was vulnerable during that period.

There are a number of key points to take away from this analysis:

  • Choose your base images carefully. Ensure that the base image you are using is well maintained. If not consider maintaining your own base image or pick a different base image to use.
  • Just because an image is official does not mean that it is frequently updated or necessarily the best image to build from. You may find other repos with images better suited to your needs.
  • Keep track of updates to the base images that you use. One method for tracking updates and receiving notifications is covered in this blog.

The frequency of updates is not the only metric to consider when looking at images; you need to know what has changed. Was the image just rebuilt based on a schedule? Or were files and packages changed? Users often focus solely on CVE (security) updates but do not consider other package updates that include bugfixes. In our next blog, we will take a deeper look into what changes in an image update.

A Look at How Often Docker Images are Updated

In our last blog, we reported on operating systems usage on Docker Hub, focusing on official base images.

Most users do not build their container image from scratch, they build on top of these base images. For example, extending an image such as library/alpine:latest with their own application content. Whenever one of these base operating system images is updated, images built on top are typically rebuilt in order to inherit the fixes included in the base image. In this blog, we will be looking at the update frequency of base images: frequency of updates, changes made and how that impacts end users.

If you want to check on the updated history of a particular image the Navigator makes that simple.

For example, let’s look at debian:latest, currently the most popular base operating system among official images.

Here you can see the date that the image was created and when it was analyzed by Anchore.
The Anchore service continually polls Docker Hub; when an update to a repository is detected, the list of tags and images is retrieved and any new images are downloaded and analyzed. Anchore maintains a database with image and tag history, so previous versions of a tag may be inspected at any point in time. Clicking on the Previous Image button navigates to the image that was previously tagged library/debian:latest. The Next Image button is disabled because, at the time the screenshot was taken, this was the latest image tagged as library/debian:latest.

By clicking on Previous Image in the top left, you can explore the Navigator’s analysis of older images of the same tag. In this case, the previous version is only a week old, but for the most part, you will see that Debian is updated every two to five weeks.

Putting these dates onto a timeline, we see that debian:latest is updated roughly every month. Looking at the update frequency of other popular official operating system images, once a month is about average. While this might seem ideal, what really matters is the timeliness and content of each update. For example, if a new critical vulnerability is discovered the day after the scheduled image update, a user should not have to wait another month for a fix. Users can certainly update these images with fixes themselves; in fact, this should be part of the due diligence performed in creating images. However, the content published in public registries should be secure off the shelf.
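The cadence figures come from a simple calculation over the tag's push history: the gaps, in days, between consecutive pushes. A sketch with illustrative dates standing in for the real debian:latest history shown in the Navigator:

```python
# Turn a tag's image-creation dates into an update cadence: the gaps
# between consecutive pushes. Dates are illustrative stand-ins for the
# debian:latest history.
from datetime import date

pushes = [date(2017, 5, 8), date(2017, 6, 7), date(2017, 6, 14), date(2017, 7, 2)]

gaps = [(b - a).days for a, b in zip(pushes, pushes[1:])]
print(gaps)                   # [30, 7, 18]
print(sum(gaps) / len(gaps))  # mean days between updates
```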

This timeline compares the update frequency of some major operating system repositories. None of these repos has a fixed update schedule. Ubuntu and Debian are pretty consistent, while the rest are quite varied. For example, CentOS now sticks to roughly one update a month, but previously had large gaps, up to three months long, between updates. On the flip side, Oracle Linux has clusters where multiple updates come out in a short period. What is interesting is that four of them have had eight or nine updates over the last year. Is that the number at which exposure to security issues is balanced against pushing too many updates? Something else to consider is that having more packages means there are more things to keep up to date, so lightweight operating systems like Alpine and BusyBox do not need as much maintenance. However, this doesn’t explain why CentOS and Fedora are updated infrequently, as they are both much larger than Debian and Ubuntu.

Moving on to popular non-OS images, the difference in update frequency is striking. NGINX, the repo with the fewest updates here, has more updates than Oracle Linux, which had the most updates of all the operating systems. Calling back to the fact that more complexity means more maintenance, this increase makes sense. In future blogs, we will dig into what is changing between image updates.

Because many of the application images are built on top of official base operating system images, in theory they should be rebuilt whenever the underlying base image is updated. Sadly, that is often not the case: we see a base operating system image updated with a fix, but the application image may not be rebuilt for several weeks, and in some cases it is rebuilt on top of an older base operating system image.

While all official images should follow Docker Hub best practices and should therefore be well maintained, it is clear from our historical data that many images are updated infrequently and carry security vulnerabilities for many weeks.

If you are trying to choose a non-official image, it is important that you look into its update history, since many images on Docker Hub are one-offs that were built by an engineer to ‘scratch an itch’, pushed to Docker Hub and never maintained. While such an image may seem to have exactly what you are looking for, it’s important to note that you are in effect adopting the image, and you are then responsible for its care and feeding!

One last interesting piece of information is that there are a few days (10/21, 1/17, 2/28, 4/25, …) where many of the repos push updates at the same time. In many cases this occurs the day after their base image, debian:latest was updated. This backs up the idea that these images update more frequently because they have to keep up with updates of their base image.

As we alluded to earlier the content of an image update is just as, if not more, important than the timing of an update. In the next blog, we’ll dig into a more detailed timeline of updates, starting with the disclosure of a vulnerability, when the operating system vendor patched it, when that patch was included in a container image and when an application image pulled in the update.

Just Because They Pushed Doesn’t Mean You Need to Pull

While that may sound like advice your mother gave you after you got into a fight at school we are actually talking about Docker images.

Yesterday we started to notice a lot of activity on our worker nodes on anchore.io which were analyzing a large number of images that were updated on Docker Hub.

The Anchore service monitors Docker Hub looking for changes made to our customers’ private images, official images and thousands of other tags of popular images on Docker Hub.

We poll Docker Hub and when images are updated our workers pull down the new images and perform analysis and policy evaluations. Users can also subscribe to images to get notifications when images they use are updated.

Since yesterday we’ve seen over a thousand images updated, including official operating system images such as Alpine, CentOS, Debian, Oracle, and Ubuntu.

What was odd was that, looking at these images, we saw no changes in files or package manifests. As part of Anchore’s analysis we examine every file in the image down to the checksum level, along with all the package data; this allows us to perform policy checks that go beyond the usual CVE checks you see with most tools.
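The kind of check described above – detecting content changes at the checksum level – can be sketched in a few lines of Python. This is a simplified illustration with toy data, not Anchore’s actual implementation:

```python
import hashlib

def file_digests(files):
    """Map each file path to the SHA-256 hex digest of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}

def diff_manifests(old, new):
    """Classify files as added, removed, or changed between two digest maps."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, changed

# Two snapshots of an image's filesystem (toy data)
v1 = file_digests({"/etc/os-release": b"debian 9", "/usr/bin/app": b"build 1"})
v2 = file_digests({"/etc/os-release": b"debian 9", "/usr/bin/app": b"build 2"})

print(diff_manifests(v1, v2))  # only /usr/bin/app changed
print(diff_manifests(v1, v1))  # identical snapshots -> ([], [], [])
```

When both snapshots produce the same digest map, the image content genuinely has not changed, no matter what the image metadata says.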

We show a brief changelog summary on the overview page for an image, showing how many files and packages were added, removed or changed.

What had us scratching our heads yesterday was the high number of images with no apparent changes. The image metadata, such as the ID and digest, had changed, but the underlying content was the same.

Digging deeper, it appears that while the actual content of the images has not changed, the manifests have been updated. This seems to have been driven by a change to the bashbrew utility, which is used to build official images. Bashbrew now defaults to the manifest list format, which allows for multi-arch images, so even an image built for only a single architecture will now use a manifest list.
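To see why the published digest changes even though nothing inside the image did, note that an image digest is the hash of its manifest document, not of the layers directly. Wrapping an unchanged manifest in a manifest list produces a new document and therefore a new digest. A toy illustration in Python – the JSON here is heavily simplified relative to the real Docker/OCI media types:

```python
import hashlib
import json

def digest_of(document):
    """Content-addressed digest of a JSON document, roughly as a registry computes it."""
    canonical = json.dumps(document, sort_keys=True, separators=(",", ":")).encode()
    return "sha256:" + hashlib.sha256(canonical).hexdigest()

layers = ["sha256:layer-one", "sha256:layer-two"]  # unchanged image content (fake digests)

# Old style: a single-architecture image manifest
single = {"schemaVersion": 2, "layers": layers}

# New style: the same manifest wrapped in a manifest list
manifest_list = {
    "schemaVersion": 2,
    "manifests": [{"digest": digest_of(single),
                   "platform": {"architecture": "amd64"}}],
}

# The published digest differs even though the layers are identical
print(digest_of(single) != digest_of(manifest_list))  # True
```

The layers, and hence the content a scanner would analyze, are byte-for-byte identical; only the wrapper document – and so the digest – is new.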

We will continue to dig into this but in the meantime, we’d recommend that you look to see what, if anything, changed in an image before you rebuild all your application images on top of a new base image.

Introducing the Anchore Engine

Today Anchore announced a new open source project that allows users to install a local copy of the powerful container analysis and policy engine that powers the Anchore Navigator service.

The Anchore Engine is an open source project that provides a centralized service for inspection, analysis and certification of container images. The Anchore Engine is provided as a Docker container image that can be run standalone or on an orchestration platform such as Kubernetes, Docker Swarm, Rancher or Amazon ECS.

Using the Anchore Engine, container images can be downloaded from Docker V2 compatible container registries, analyzed and evaluated against user-defined policies. The Anchore Engine can integrate with Anchore’s Navigator service allowing you to define policies and whitelists using a graphical editor that is automatically synchronized to the Anchore Engine.

The Anchore Engine can be integrated into CI/CD pipelines such as Jenkins, adding image scanning that includes not just CVE-based security scans but also policy-based scans covering security, compliance and operational best practices.

The Anchore Engine can be accessed directly through a RESTful API or via the Anchore CLI. Adding an image to be analyzed is a simple one-line command:

$ anchore-cli image add docker.io/library/nginx:latest

The Anchore Engine will now download the image from the registry and perform deep inspection, collecting data on packages, files, software artifacts and image metadata.

Once analyzed we can retrieve information about the image. For example, retrieving a list of packages:

$ anchore-cli image content docker.io/library/nginx:latest os

Will return a list of operating system (os) packages found in the image. In addition to operating system packages, we can retrieve details about files, Ruby Gems and Node.js NPMs.

$ anchore-cli image content docker.io/library/rails:latest gem
Package Version Location
actioncable 5.0.1 /usr/local/bundle/specifications/actioncable-5.0.1.gemspec
actionmailer 5.0.1 /usr/local/bundle/specifications/actionmailer-5.0.1.gemspec
actionpack 5.0.1 /usr/local/bundle/specifications/actionpack-5.0.1.gemspec
actionview 5.0.1 /usr/local/bundle/specifications/actionview-5.0.1.gemspec
activejob 5.0.1 /usr/local/bundle/specifications/activejob-5.0.1.gemspec
activemodel 5.0.1 /usr/local/bundle/specifications/activemodel-5.0.1.gemspec
activerecord 5.0.1 /usr/local/bundle/specifications/activerecord-5.0.1.gemspec
activesupport 5.0.1 /usr/local/bundle/specifications/activesupport-5.0.1.gemspec
arel 7.1.4 /usr/local/bundle/specifications/arel-7.1.4.gemspec

And if we want to see the security vulnerabilities in an image, we can run the following command:

$ anchore-cli image vuln docker.io/library/ubuntu:latest os
Vulnerability ID Package Severity Fix Vulnerability URL
CVE-2013-4235 login-1:4.2-3.1ubuntu5.3 Low None http://people.ubuntu.com/~ubuntu-security/cve/CVE-2013-4235
CVE-2013-4235 passwd-1:4.2-3.1ubuntu5.3 Low None http://people.ubuntu.com/~ubuntu-security/cve/CVE-2013-4235
CVE-2015-5180 libc-bin-2.23-0ubuntu9 Low None http://people.ubuntu.com/~ubuntu-security/cve/CVE-2015-5180
CVE-2015-5180 libc6-2.23-0ubuntu9 Low None http://people.ubuntu.com/~ubuntu-security/cve/CVE-2015-5180
CVE-2015-5180 multiarch-support-2.23-0ubuntu9 Low None http://people.ubuntu.com/~ubuntu-security/cve/CVE-2015-5180

As with the content sub-command, we pass a parameter for the type of content we want to analyze – in this case, os for operating system packages. Future releases will add support for non-package vulnerability data.

Next, we can evaluate the image against a policy that was defined either manually on the command line or using the Anchore Navigator:

$ anchore-cli evaluate check registry.example.com/webapps/frontend:latest
Image Digest: sha256:86774cefad82967f97f3eeeef88c1b6262f9b42bc96f2ad61d6f3fdf54475ac3
Full Tag: registry.example.com/webapps/frontend:latest
Status: pass
Last Eval: 2017-09-09T18:30:22
Policy ID: 715a6056-87ab-49fb-abef-f4b4198c67bf

Here we can see that the image passed. To see the details of the evaluation you can add the --detail parameter. For example:

$ anchore-cli evaluate check registry.example.com/webapps/broker:latest --detail
Image Digest: sha256:7f97f3eeeef88c1b6262f9b42bc96f2ad61d6f3fdf54475ac354475ac
Full Tag: registry.example.com/webapps/broker:latest
Status: fail
Last Eval: 2017-09-09T17:30:22
Policy ID: 715a6056-87ab-49fb-abef-f4b4198c67bf

Gate                   Trigger              Detail                                                          Status        
DOCKERFILECHECK        NOHEALTHCHECK        Dockerfile does not contain any HEALTHCHECK instructions        warn
ANCHORESEC             VULNHIGH             HIGH Vulnerability found in package - libmount1 (CVE-2016-2779 - https://security-tracker.debian.org/tracker/CVE-2016-2779)                    stop          
ANCHORESEC             VULNHIGH             HIGH Vulnerability found in package - libncurses5 (CVE-2017-10684 - https://security-tracker.debian.org/tracker/CVE-2017-10684)                stop          
ANCHORESEC             VULNHIGH             HIGH Vulnerability found in package - libncurses5 (CVE-2017-10685 - https://security-tracker.debian.org/tracker/CVE-2017-10685)                stop

Here you can see that the broker image failed the policy evaluation due to 3 high severity vulnerabilities.

We can subscribe to an image to receive webhook notifications when the image is updated, when new security vulnerabilities are found, or when the image’s policy status changes – for example, going from fail to pass.

$ anchore-cli subscription activate image tag_update registry.example.com/webapps/broker:latest
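The subscription types above boil down to comparing the previously observed state of a tag with the newly observed one and firing an event for each difference. A rough sketch of that decision logic – the field names and event names here are hypothetical, not Anchore’s actual code:

```python
def notification_events(old, new):
    """Compare two observed states of a tag and return the events to notify on.
    Each state has an image_id, a policy_status ('pass'/'fail') and a set of CVEs."""
    events = []
    if new["image_id"] != old["image_id"]:
        events.append("tag_update")      # a new image was pushed to this tag
    if new["policy_status"] != old["policy_status"]:
        events.append("policy_eval")     # e.g. the image went from fail to pass
    if new["cves"] - old["cves"]:
        events.append("cves_added")      # new vulnerabilities matched
    if old["cves"] - new["cves"]:
        events.append("cves_removed")    # a feed update cleared a vulnerability
    return events

old = {"image_id": "a1b2c3", "policy_status": "fail", "cves": {"CVE-2016-2779"}}
new = {"image_id": "d4e5f6", "policy_status": "pass", "cves": set()}
print(notification_events(old, new))
```

In this toy run a new image was pushed, its policy status flipped to pass, and the open CVE went away, so all three corresponding events fire.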

A Breakdown of Operating Systems of Docker Hub

While containers are thought of as “micro-services” or applications, if you open up the image you will see more than just an application – more often than not, you’ll see an entire operating system image along with the application. If you dig into the image you will find that certain parts of the operating system are missing, such as the kernel and hardware-specific modules, and often, but sadly not always, the package list is reduced. If you are deploying a pre-packaged container built by a third party, you may not even know what operating system was used to build the container, let alone what packages are inside.

As part of the analysis that Anchore performs on the container, it identifies the underlying operating system. To check this out go to the Anchore Navigator and search for the image that you wish to inspect. Halfway down on the overview tab you’ll see the operating system name and version listed. For example, searching for library/nginx:latest will show that it is built on top of Debian 9, Stretch.

nginx

Let’s take a look at what operating systems are used on Docker Hub:

  • Which operating system gets used the most?
  • How has the choice of operating system changed over time?
  • Are there different usage patterns for official images compared to public images?

To get our toes wet, here is the breakdown of what operating systems official images are being built on.

It is clear that Debian is the most popular, with Alpine taking second place and a number of others each taking a smaller share. Raspbian doesn’t appear in this chart because it is not used as a base OS by any official images, but it will also be analyzed: when we look at public images’ usage of operating systems, we will see that Raspbian gets used a fair bit. These make up the 7 most popular operating systems amongst Docker repositories; all others together take up a little less than 2% of the share, so they will be excluded to keep things uncluttered. A notable exclusion here is Red Hat Enterprise Linux. Its license agreement prohibits redistribution, which is likely why we see CentOS but no RHEL in the list of official images; however, our data shows many public RHEL images from users.

The repositories that are included in our dataset are those that have been analyzed by Anchore. This means all official repos, the most popular (based on a combination of pulls and stars) public community repos, and user-requested images. Right now Anchore is pulling data only from Docker Hub, but soon we will be expanding to include images on Amazon EC2 Container Registry (Amazon ECR).

From these repositories, we looked at only the latest tag so that the information was pulled from tags that were being consistently updated. Also, different repositories have different update schedules; where one will push updates every other month, another might update every week. If we counted each update, it would skew results towards operating systems that have a couple of repositories that update multiple times a day. For this reason, we only counted a repository’s use of an operating system on its latest tag once, unless it switched to a different operating system later on.
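The counting rule described above – one count per repository’s latest tag, plus another count only if the repository later switches operating systems – can be sketched as follows. The repo names and histories are toy data; the real dataset comes from Anchore’s analysis records:

```python
def os_usage(histories):
    """Count each repo's base OS once, counting again only when the repo switches OS.
    histories maps a repo name to the ordered list of base OSes seen on its latest tag."""
    counts = {}
    for repo, oses in histories.items():
        previous = None
        for os_name in oses:
            if os_name != previous:  # repeated updates on the same OS don't count again
                counts[os_name] = counts.get(os_name, 0) + 1
            previous = os_name
    return counts

histories = {
    "libs/frontend": ["debian", "debian", "debian"],  # frequent updates, same OS
    "libs/cache":    ["debian", "alpine"],            # switched base OS once
    "libs/tool":     ["unknown"],                     # static binary, no OS detected
}
print(os_usage(histories))
```

This is what keeps a repository that rebuilds on Debian every day from drowning out one that rebuilds on Alpine once a month.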

Something else to note is the “Unknown” on the chart. If you look at library/swarm:latest, for example, you will see that the operating system is listed as “Unknown.” What this means is that swarm doesn’t have a standard operating system install and so the system cannot recognize what it is built on top of. Images like these are often statically compiled binaries, and so don’t require anything extra beyond what’s needed to run the application. With Docker’s recent improvements to multi-stage builds, binaries might see a rise in the near future as developers become more familiar with the process and greatly decrease their file size.

Image size is often used as a criterion for the selection of base images, so we performed some quick analysis to see the average size of official images broken down by operating system distribution.

To get some context, here are the sizes of the images of popular operating systems.

The difference in image size is striking: the range goes from BusyBox at 1MB all the way up to Fedora at 230MB. It’s interesting to see the clustering happening. Alpine and BusyBox are lightweight and right near 0MB, then the midweights like Debian and Ubuntu are around 100MB, and the largest are heavyweights such as CentOS and Oracle Linux up by 200MB.

Shown here is the size of official images split by the underlying OS they use. Do note that the OS image itself is not excluded from the average, so for lesser-used operating systems, the average is brought down.

You can see that as application images are built on top of these base images their size grows as dependencies are added. For example, adding required runtimes such as Python or Java.

The pie chart above showing official OS distribution only covers the creation of images in the last three months, but our data extends further back.

Taking a look at the distribution of operating systems over the course of the past year, we see that Debian has always held its popularity among official repositories. It had a peak of over 80% back in February, and since then appears to have been ever so slowly tailing off. It looks like Alpine is gradually growing, but it is difficult to see any sure trends due to the fluctuation of the data, especially during the summer months which are traditionally slower. We will continue to monitor and report on this trend.

Digging more into Debian’s two-thirds share, we can look at the distribution of versions of Debian. Debian 8, Jessie, has held near 100% of the share until July amongst official images, with only a small number of images being built on Wheezy (7) and Stretch (9). This, of course, makes sense as Debian 9 was only released halfway through June, and has since been adopted by more than a third of images and growing. Before its stable release, a few repositories were using the unstable release, presumably favoring new features enough to make the jump ahead of everyone else.

Docker Hub official repositories make up only a small number of the total repositories on Docker Hub. They follow best practices, are often base images that users build their own apps on top of, and are updated frequently. These standards don’t apply to community images, though the most popular ones – those that we analyze – come close to meeting that mark. Despite that, there are quite a few differences in operating system usage between community and official images.

Debian still holds the largest share, but only just. Both Alpine and Ubuntu see their percentage nearly double, with Raspbian emerging and taking a small share, focused on IoT use cases. Ubuntu’s popularity might be explained by the fact that it is the Linux distribution most commonly used by individual users, and people like to work with something they are familiar with, especially as they learn a new technology. For Alpine, it’s possible that community repos are quicker to change technology, and the appeal of Alpine’s security and tiny size is pulling more developers towards it.

To counter that willingness to change, Stretch doesn’t see as much adoption amongst community images as official ones, getting about half as much usage. What is interesting, however, is that unstable Stretch received more usage here than among official images, which may come from some users experimenting with it to see new features.

The graph of community operating system usage over time is much more interesting than the graph for official images, as there are a few trends to see. At the end of 2016, the distribution of operating systems was much more spread. Although Debian was leading then, it had a smaller share than it does now. Starting in February we saw a reduction in the usage of Ubuntu, and now it only has half of the usage of the leaders. Alpine started growing shortly after to take Ubuntu’s place at the top, joining Debian. The other four operating systems all have steadily tailed off, as developers choose to use one of the main three. Going forward, it will be interesting to see if Ubuntu’s recent uptick will continue at the expense of Debian.

Official images are typically smaller than public images since they are used as a foundation to build an application image. However, Alpine contradicts this trend, and public images using it are half the size of official images on average.

In our next blog, we will dig deeper into updates – looking at how frequently images are updated and the relationship between operating system patches, base image patches and updates to end-user images.

Scanning for Malicious Content

Ivan Akulov just published a rather worrying blog entitled Malicious Packages in NPM in which he documents a recent discovery of several malicious NPM packages that were copies of existing packages with similar names which, while they contained the same functionality, also included malicious code that would collect and exfiltrate environment variables from your system in the hope of finding sensitive information such as authentication tokens.

In the past, a developer would either write a software library or purchase one from a software vendor. Today you can pick a free, open source library off the shelf from one of many different registries, each catering to a different community: NuGet for .NET developers, CPAN for Perl developers, RubyGems.org for Ruby developers, npmjs.org for Node.js developers, PyPI for Python developers, maven.org for Java developers, and so on.

This move to open source and community-focused development has helped drive the rapid pace of innovation that we’ve seen over the last 10 to 15 years. But as this story shows us, free software doesn’t come without a cost! Just because a piece of software is free that doesn’t mean that you shouldn’t perform the same level of due diligence in assessing the software as you would if you had to pay for it: where is the software coming from? how well is it maintained? how is it licensed? This process should not discourage the adoption of open source however it should ensure that you know what open source components you have, where they came from and how to support them internally.

The best approach is to start this process as early in the development cycle as possible, putting in place a process to screen software and libraries before they enter your ‘supply chain’. There are many tools that can help in this regard, and the newer generation of tools from other vendors are designed with this new open source software paradigm in mind.

But no matter what tools and policies you have in place, there will always be something that slips through the cracks, so it’s good to have a final check to ensure that the software you deploy meets your compliance and operational best practices. And this is where Anchore comes in.

One of the policy rules that Anchore supports is the ability to blacklist certain packages, not just operating system packages but also software libraries such as Ruby Gems or Node.js NPMs.

So inspired by Ivan’s blog let’s add a policy check that blacklists these NPMs which will allow us to see if any of our images include these modules.

Once logged in, launch the policy editor from the icon on the navigation bar.

For simplicity, we’ll just edit the default policy however you can create custom policies that can be mapped to images based on their registry, repository, and tag.

Pressing the icon expands the list of policy items.

We will create a new rule by pressing the button.

In the Gate field select NPM Checks and in the Trigger field select NPM Package Match (Name).

Then in the Parameters field select NPM Name-only match.

We now need to enter the modules that we are looking for.

Paste the following into the field and press the save button:

babelcli, crossenv, cross-env.js, d3.js, fabric-js, ffmepg, gruntcli, http-proxy.js, jquery.js, mariadb, mongose, mssql.js, mssql-node, mysqljs, nodecaffe, nodefabric, node-fabric, nodeffmpeg, nodemailer-js, nodemailer.js, nodemssql, node-opencv, node-opensl, node-openssl, noderequest, nodesass, nodesqlite, node-sqlite, node-tkinter, opencv.js, openssl.js, proxy.js, shadowsock, smb, sqlite.js, sqliter, sqlserver, tkinter

Under Action, select WARN to indicate that the presence of these packages will raise a warning rather than fail or stop the image.

Finally, click the button to save the policy.
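Under the hood, a name-only match like the rule configured above is simply an exact comparison of installed NPM package names against the blacklist. A minimal sketch in Python, using only a subset of the list for brevity (this is an illustration, not the Navigator’s implementation):

```python
# Subset of the typo-squatted package names from Ivan Akulov's write-up
BLACKLIST = {"babelcli", "crossenv", "d3.js", "mongose", "node-fabric"}

def flag_blacklisted(installed):
    """Return the installed NPM package names that exactly match the blacklist."""
    return sorted(name for name in installed if name in BLACKLIST)

# The legitimate packages pass; the typo-squatted copies are flagged
print(flag_blacklisted(["express", "cross-env", "crossenv", "mongoose", "mongose"]))
```

Note that the check is deliberately exact: the legitimate cross-env and mongoose packages differ from the malicious crossenv and mongose by a single character, which is precisely what made the attack effective.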

Next, from the Anchore Navigator home page search for an image that you wish to check. Once you have found the image navigate to the Image Policy tab to see if any warnings have been raised based on our new policy.

One of the great features of the Navigator is that it keeps historic data about tags and images so that you can navigate back through a tag’s history to look at previous versions. Perhaps the image you have deployed today does not include one of the trojan modules, but an older version of this tag may have included a vulnerable component. This ability to look back may prove valuable in reviewing previous deployments, either for audit purposes or when performing a post-mortem as part of incident response.

The Case of the Missing Vulnerability

In our latest release we extended one of the most popular features of the Anchore Navigator: tag notifications. Previously, users could subscribe to a tag and receive a notification when a new image was pushed with that tag. For example, if you used the Debian image as the base image for your containers, you could subscribe to receive a notification when a new release was pushed.

In addition to tag update notifications, the Navigator can now send notifications when we detect changes to the policy status of your image, for example, if your image is now failing its policy check, or when CVEs change on your image.

Seeing a CVE change notification is common, but usually you expect to see “CVE Added” – this email is different.

Here you can see that I subscribed to library/python:latest, that the current image ID carrying that tag is 968120d8…, and, in the body of the notification, that one medium severity CVE has been removed.

When the Anchore Navigator first analyzed image ID 968120d8… a list of packages was retrieved. The Anchore service regularly pulls down vulnerability data from sources such as operating system distributors and the National Vulnerability Database (NVD). We match this data against the package manifest to identify vulnerabilities in the image.

The most common change we see is a new vulnerability being reported against a specific package. The actual workflows vary from distribution to distribution. It is common to see a vulnerability of unknown severity added to an image when the vulnerability is first disclosed; then, once the vulnerability has been triaged, it moves from unknown severity to a specific severity such as Critical, High, Medium, Low or Negligible.

In some cases, as more in-depth analysis occurs, a distributor or the upstream vulnerability database provider may change their assessment of not just the severity but also the version number of the vulnerable package. For example, it may initially be thought that version 2.x of package foo is vulnerable to a CVE, but on further analysis it may be found that only version 2.1 is vulnerable.

In this example, the vulnerability was analyzed and it was found that the current version of ImageMagick (version 8:6.8.9.9-5+deb8u9) in Debian Jessie is not vulnerable to this issue, and so the associated feed was updated by the Debian security team. Anchore picked up the change to this feed, which triggered the notifications.
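The effect of such a feed update can be modelled with a tiny matcher. Here versions are tuples standing in for real Debian version strings, and the feed entry shape is invented for illustration – real feeds use per-distribution formats and version comparison rules:

```python
def is_vulnerable(installed, feed_entry):
    """Flag a package unless the feed says this exact version is not affected,
    or the installed version is at or past the first fixed version."""
    if installed in feed_entry.get("not_affected", set()):
        return False
    fixed_in = feed_entry.get("fixed_in")
    return fixed_in is None or installed < fixed_in

imagemagick = (6, 8, 9, 9)  # toy stand-in for 8:6.8.9.9-5+deb8u9

# Initial feed entry: vulnerability disclosed, no fix available -> flagged
before = {"fixed_in": None}
# After triage: the security team marks this version as not affected
after = {"fixed_in": None, "not_affected": {(6, 8, 9, 9)}}

print(is_vulnerable(imagemagick, before))  # True  -> CVE listed against the image
print(is_vulnerable(imagemagick, after))   # False -> "CVE Removed" notification
```

Re-evaluating the stored package manifest against the updated feed is what lets the CVE disappear from the image without the image itself changing at all.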

Sadly, seeing vulnerabilities removed from an image is not very common; you are more likely to see new vulnerabilities added or severities increased. That is why it’s important not just to check an image once but to keep a constant eye on its status – which is where the Anchore Navigator’s notifications feature can help.