Preparing for a critical vulnerability

One morning, you wake up and see a tweet announcing some new critical vulnerability. The immediate response is often panic. It sounds bad; it probably affects everyone, and nobody knows for certain what to do next. Eventually the panic subsides, but we still have a problem that needs to be dealt with. So the question to ask is: what can we do?

Don’t panic

A vague statement about a situation that will supposedly affect everyone sounds like a problem we can't possibly prepare for. Waiting around is generally the worst option in situations like this, and there are some things we can do in the meantime to help us out.

One of the biggest challenges in modern infrastructure is just understanding what you have. This sounds silly, but it’s a tough problem because of how most software is built now. You depend on a few open-source projects, those projects depend on a few other open-source projects, and those rely on even more open-source projects. Before you know it, you have 300 open-source projects instead of the six you thought you installed.

Our goal is to create an inventory of all our software. With an accurate, up-to-date inventory, you don’t have to wonder whether you’re running some random application, like CUPS; you know beyond a reasonable doubt. Knowing what software you do (or don’t) have deployed brings an amazing amount of peace of mind.

This is not new

This was the same story during the Log4j and xz emergencies. What induced panic wasn’t the vulnerabilities themselves but the scramble to find where those libraries were deployed. In many instances, we watched engineers manually connecting to servers over SSH and dissecting container images by hand.

These security emergencies will never end, but they all play out similarly. There is a gap between when something is announced and when good, actionable guidance appears. The security community needs time to come to a better understanding of the issue; only then can we figure out the best way to deal with it. That could mean updating packages, adjusting firewall rules, or changing a configuration option.

While we wait for that guidance, what if we spent the time going through our software inventory? When Log4Shell happened, almost everyone spent the first few days or weeks (or months) just figuring out if they had Log4j anywhere. If you have an inventory, those first few days can be spent putting together a plan for applying the guidance you know is coming. It’s a much nicer way to spend the time than frantically searching!

The inventory

Creating an inventory sounds like it shouldn’t be that hard. How much software do you REALLY have? It turns out to be genuinely hard, because there are nearly infinite ways to build and assemble any given application. Do you have OpenSSL as an operating system package? Or is it a library in a random directory? Maybe it’s statically compiled into the binary. Maybe a copy is downloaded off the internet when the application starts. Maybe it’s all of these … at the same time.

This complexity is taken to a new level when you consider how many computers, servers, containers, and apps are deployed. The scale means automation is the only way we can do this. Humans cannot handcraft an inventory. They are too slow and make too many mistakes for this work, but robots are great at it!

But the automation we have today isn’t perfect. It’s early days for many of these scanners and inventory formats (such as the Software Bill of Materials, or SBOM). We need to understand what blind spots our inventories may have. For example, some scanners do a great job finding operating system packages but aren’t as good at finding Java archives (jar files). This is part of what makes the current inventory process difficult. The tooling is improving at an impressive rate, so don’t write anything off as too incomplete; it will keep getting better.

Enter the SBOM

Now that we have mentioned SBOMs, we should briefly explain how they fit into this inventory universe. An SBOM does nothing by itself; it’s just a file format for capturing information, such as a software inventory.

Anchore developers have written plenty over the years about what an SBOM is, but here is the tl;dr:

An SBOM is a detailed list of all software project components, libraries, and dependencies. It serves as a comprehensive inventory that helps understand the software’s structure and the origins of its components.

An SBOM in your project enhances security by quickly identifying and mitigating vulnerabilities in third-party components. Additionally, it ensures compliance with regulatory standards and provides transparency, which is essential for maintaining trust with stakeholders and users.

An example

To explain what all this looks like and some of the difficulties, let’s go over an example using the eclipse-temurin Java runtime container image. It would be very common for a developer to build on top of this image. It also shows many of the challenges in trying to pin down a software inventory.

The Dockerfile we’re going to reference can be found on GitHub, and the container image can be found on Docker Hub.

The first observation is that this container uses Ubuntu as the underlying container image.

This is great: Ubuntu has a very nice packaging system, and it’s no trouble to see what’s installed. We can easily do this with Syft.

bress@anchore   ~ syft ubuntu:24.04
  Parsed image sha256:61b2756d6fa9d6242fafd5b29f674404779be561db2d0bd932aa3640ae67b9e1
  Cataloged contents 74f92a6b3589aa5cac6028719aaac83de4037bad4371ae79ba362834389035aa
   ├──  Packages                        [91 packages]
   ├──  File digests                    [2,259 files]
   ├──  File metadata                   [2,259 locations]
   └──  Executables                     [722 executables]
NAME                 VERSION                      TYPE
apt                  2.7.14build2                 deb
base-files           13ubuntu10.1                 deb
base-passwd          3.6.3build1                  deb
bash                 5.2.21-2ubuntu4              deb
bsdutils             1:2.39.3-9ubuntu6.1          deb
coreutils            9.4-3ubuntu6                 deb
dash                 0.5.12-6ubuntu5              deb
debconf              1.5.86ubuntu1                deb
debianutils          5.17build1                   deb
diffutils            1:3.10-1build1               deb
dpkg                 1.22.6ubuntu6.1              deb
e2fsprogs            1.47.0-2.4~exp1ubuntu4.1     deb
findutils            4.9.0-5build1                deb
gcc-14-base          14-20240412-0ubuntu1         deb
gpgv                 2.4.4-2ubuntu17              deb
grep                 3.11-4build1                 deb
gzip                 1.12-1ubuntu3                deb
hostname             3.23+nmu2ubuntu2             deb
init-system-helpers  1.66ubuntu1                  deb

There has been nothing exciting so far. But if we look a little deeper at the eclipse-temurin Dockerfile, we see that it installs the Java JDK using wget. That’s not something we’ll find just by looking at Ubuntu packages.

If we scan this image with Syft, we can see a few different types of packages installed.

bress@anchore ~ syft eclipse-temurin:8u422-b05-jre-noble
  Parsed image sha256:d2c2442dea2a2b1164bd6dd39af673db2215ff680910aff7417432b00a3c8e4d
  Cataloged contents 805b45dee2c503f1cca36e1ecc6e8625538592e2db32cc04e317a246fb86d0fc
   ├──  Packages                        [142 packages]
   ├──  File digests                    [3,856 files]
   ├──  File metadata                   [3,856 locations]
   └──  Executables                     [809 executables]
NAME                 VERSION                            TYPE

hostname             3.23+nmu2ubuntu2                   deb
init-system-helpers  1.66ubuntu1                        deb
jaccess              UNKNOWN                            java-archive
jce                  1.8.0_422                          java-archive
jfr                  1.8.0_422                          java-archive
jsse                 1.8.0_422                          java-archive
libacl1              2.3.2-1build1                      deb
libapt-pkg6.0t64     2.7.14build2                       deb
libassuan0           2.5.6-1build1                      deb
libattr1             1:2.5.2-1build1                    deb
libaudit-common      1:3.1.2-2.1build1                  deb

The JDK and JRE are binaries in the image, as are some Java archives. This is a gotcha to watch for when you’re building an inventory: many inventories and scanners only look for packages from known package managers, not binaries and other files installed on the system. In a perfect world, our SBOM tells us details about everything in the image, not just one package type.

At this point, you can imagine a developer adding more things to the container: code they wrote, Java Archives, data files, and maybe even a few more binary files, probably installed with wget or curl.

What next

This sounds pretty daunting, but it’s not that hard to start building an inventory. You don’t need a fancy system. The easiest way is to pick an open source SBOM generator, like Syft, and put the SBOMs in a directory. It’s not perfect, but even searching through those files is faster than manually hunting down every version of CUPS in your infrastructure.
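As a rough sketch of that workflow (the image name is hypothetical, and the grep pattern assumes Syft’s JSON output), generating an SBOM per image and then searching the directory can be as simple as:

mkdir -p sboms
syft -o syft-json registry.example.com/web-app:latest > sboms/web-app.json
grep -il '"name": *"cups' sboms/*.json

The second command simply lists which SBOM files mention a CUPS package, which is usually enough to know where to start looking.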

Once you have an initial inventory, you can investigate more complete solutions. There are countless open-source projects, products (such as Anchore Enterprise), and services that can help here. Just don’t expect to go from zero to a complete inventory overnight; big projects need patience.

It’s like the old proverb that the best time to plant a tree was twenty years ago; the second best time is now. The best time to start an inventory system was a decade ago; the second best time is now.

If you’d like to discuss any topics raised in this post, join us on this discourse thread.

We don’t know how to fix the xz problem, but we can detect it

A very impressive and scary attack against the xz library was uncovered on Friday, which made for an interesting weekend for many of us.

There has been a lot written about the technical aspects of this attack, and more will be uncovered over the next few days and weeks; it’s likely we’re not done learning new details. This doesn’t appear to affect as many organizations as Log4Shell did, but it’s still a pretty big deal, especially considering what this sort of attack means for the larger open source ecosystem. Explaining those details isn’t the point of this blog post, though. There’s another angle of this story that’s not getting much attention: how can we solve problems like this (we can’t), and what can we do going forward?

The unsolvable problem

Sometimes reality can be harsh, but the painful truth about this sort of attack is that there is no solution. Many projects and organizations are happy to explain how they keep you safe, or how you can prevent supply chain attacks by doing this one simple thing. However, the industry as it stands today cannot prevent an attack mounted by a motivated and well-resourced threat actor. If we want an analogy, preventing an attack like xz is the equivalent of the pre-crime of dystopian science fiction. The idea behind pre-crime is to use data or technology to predict when a crime is going to happen, then stop it before it does. As one can probably imagine, this creates a number of problems for any society that adopts it.

If an open source maintainer turns malicious, we lack the tools and knowledge to prevent this sort of attack; you can’t actually stop such behavior until after it happens. There may simply be no way to stop something like this before it happens.

HOWEVER, that doesn’t mean we are helpless. We can take a page out of the observability industry’s playbook. Seeing problems as they happen, or after they happen, and then using that knowledge to improve the future is a problem we can solve. And it’s a solution we can measure. If you have a solid inventory of your software, looking for affected versions of xz becomes simple and effective.

Today and Tomorrow

Of course looking for a vulnerable version of xz, specifically versions 5.6.0 and 5.6.1, is something we should all be doing. If you’ve not gone through the software you’re running you should go do this right now. See below for instructions on how to use Syft and Anchore Enterprise to accomplish this.

Finding two specific versions of xz is a very important task right now, but there’s also what happens tomorrow. We’re all very worried about these two versions today, but we should prepare for what comes next. It’s very possible other versions of xz will turn out to contain questionable code that needs to be removed or downgraded. Other libraries could have problems too (everyone is looking for similar attacks now). We don’t really know what’s coming next. The worst part of being in the middle of attacks like this is the unknowns. But there are some things we do know: if you have an accurate inventory of your software, figuring out what you need to update becomes trivial.

Creating an inventory

If you’re running Anchore Enterprise, the good news is you already have an inventory of your software. You can create a report that looks for images affected by CVE-2024-3094.

Reports like this are created directly in Anchore Enterprise. Another feature of Anchore Enterprise lets you query all of your SBOMs for instances of specified software, by package name, via an API call. This is useful for understanding the location, ubiquity, and version spread of that software across your environment.

The package names in question are liblzma5 and xz-libs, which cover the common naming across rpm, dpkg, and apk based Linux distributions.

See the Anchore Enterprise API Browser for more information about the API, and the Anchore Enterprise Documentation for more details on reporting, vulnerability scanning, and other functions of the system.

If you’re using Syft, it’s a little more complicated, but still a very solvable problem. The first thing to do is generate SBOMs for the software you’re using; in this example, we’ll create SBOMs for container images.
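The exact commands depend on your environment; a minimal sketch, using a couple of hypothetical image names, might look like this:

mkdir -p sboms
for image in registry.example.com/team/frontend:1.4.2 registry.example.com/team/api:2.0.1; do
  # write one JSON SBOM per image, named after the image
  syft -o syft-json "$image" > "sboms/$(echo "$image" | tr '/:' '_').json"
done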

It’s important to keep the SBOMs you just created somewhere safe. If we find out in a few days or weeks that other versions of xz shouldn’t be trusted, or that a different open source library has a similar problem, we can just run a query against those files to understand whether and how we are affected.

Now that we have a directory full of SBOMs, we can run a query like the ones below to figure out which SBOMs contain a version of xz.
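Here is a rough sketch, assuming the directory of Syft JSON SBOMs from the previous step; the grep pattern casts a wide net because package names vary by distribution, and the file passed to Grype is just the hypothetical one generated above:

# list the SBOM files that mention an xz-related package
grep -lE '"name": *"(xz|liblzma)' sboms/*.json

# Grype can also scan a saved SBOM directly and report known vulnerabilities such as CVE-2024-3094
grype sbom:./sboms/registry.example.com_team_api_2.0.1.json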

While the example looks for xz, if the need to quickly look for other packages arises in the near future, it’s easy to adjust your query. It’s even easier if you store the SBOMs in some sort of searchable database.

What now?

There’s no doubt it’s going to be a very interesting couple of weeks for many of us. Anyone who has been watching the news is probably wondering what wild supply chain story will happen next. The pace of solving open source security problems hasn’t kept up with the growth of open source. There will be no quick fixes. The only way out of this one is a lot of hard questions and even more hard work.

But in the meantime, we can focus on understanding and defending what we have. Sometimes the best defense is a good defense.

Want a primer on software supply chain security? Get our free white paper here.

Open Source is Bigger Than You Can Imagine

If we pay attention to the news lately, we hear that supply chain security is the most important topic ever and that we need to start doing something about it right now. But the term “supply chain security” isn’t well defined. The real challenge is understanding open source. Open source is in everything now. There is no supply chain problem; there is an understanding-open-source problem.

Log4Shell was our Keyser Söze moment

There’s a scene in the movie “The Usual Suspects” where the detective realizes everything he was just told has been a lie. His entire world changed in an instant. It was a plot twist not even the audience saw coming. Humans love plot twists and surprises in our stories, but not in real life. Log4Shell was a plot twist in real life, and it was not a fun time.

Open source didn’t take over the world overnight. It took decades. It was a silent takeover that only the developers knew about. Until Log4Shell. When Log4Shell happened everyone started looking for Log4j and they found it, everywhere they looked. But while finding Log4j, we also found a lot more open source. And I mean A LOT more. Open source was in everything, both the software acquired from other vendors and the software built in house. Everything from what’s running on our phones to what’s running the toaster. It’s all full of open source software.

Now that we know open source is everywhere, we should start to ask what open source really is. It’s not what we’ve been told. There’s often talk of “the community”, but there is no community. Open source is a vast collection of independent projects. Some of these projects are worked on by Fortune 100 companies, some by scrappy startups, and some are just a person in their basement who only can work on their project from 9:15pm to 10:05pm every other Wednesday. And open source is big. We can steal a quote from Douglas Adams’ Hitchhiker’s Guide to the Galaxy to properly capture the magnitude of open source:

“Open source … is big. Really big. You just won’t believe how vastly hugely mind-bogglingly big it is. I mean, you may think it’s a long way down the road to the chemist, but that’s just peanuts to open source.”

The challenge with something like open source isn’t just claiming it’s big. We all know it’s big. The challenge is showing how mind-bogglingly big it is. Imagine the biggest thing you can; open source is bigger.

Let’s do some homework.

The size of NPM

For the rest of this post we will focus on NPM, the Node Package Manager, which is how we install dependencies for our Node.js applications. This data was picked because it’s very easy to work with, the public data is good, and NPM is the largest package ecosystem in the world today.

It should be said that NPM isn’t special in the context of the data below; if we compare these graphs to Python’s PyPI, for example, we see very similar shapes, just not as large. In the future we may explore other packaging ecosystems, but fundamentally they will look a lot like this. All of this data was generated using scripts stored in GitHub, in a repo aptly named npm-analysis.

Let’s start with the sheer number of NPM package releases over time. It’s a very impressive and beautiful graph.

This is an incredible number of packages. At the time the data was captured, there were 32,600,904 packages. There are of course far more now; just look at the growth. By packages, we mean every version of every package released. There are about 2.3 million unique packages, but when we multiply those packages by all of their released versions, we end up with over 32 million.

It’s hard to imagine how big this really is. There was a recent proposal suggesting we could try to conduct a security review of 10,000 open source projects per year. That alone is a number that would need thousands of people to accomplish. But even at 10,000 projects per year, it would take more than 3,000 years to get through just NPM at its current size. And that ignores the fact that we’ve been adding more than 1 million packages per year, so doing some math … we will be done … never. The answer is never.

The people

As humans, we love to invent reasons for this sort of growth. Maybe it’s all malicious packages, or spammers using NPM to sell vitamins. “It’s probably big projects publishing lots of little packages,” or “have you ever seen the amount of stuff in the React framework?” It turns out almost all of NPM is single-maintainer projects. The graph below shows the number of maintainers for a given project. More than 18 million releases list a single maintainer in their package.json file. That’s over half of all NPM releases ever with just one person maintaining them.

This graph shows that a ridiculous amount of NPM is one person, or a small team. If we look at the graph on a logarithmic scale, we can see what the larger projects look like; the linear graph is sort of useless because of the sheer number of one-person projects.

These graphs contain duplicate entries when it comes to maintainers. Many maintainers have more than one project; it’s quite common, in fact. If we filter the graph by the number of unique maintainers, we see this chart.

It’s far fewer maintainers, but the data is still dominated by single-maintainer projects. In this data set we see 727,986 unique NPM maintainers. That is an amazing number of developers and a true testament to the power and reach of open source.

New packages

Now that we see there are a lot of people doing an enormous amount of work, let’s talk about how things are growing. We mentioned earlier that more than one million packages and versions are being added per year.

If this continues we’re going to be adding more than one million new packages per month soon.

Now, it should be noted this graph isn’t new packages, it’s new releases, so if an existing project releases five updates, it shows up in this graph all five times.

If we only look at brand new packages being added, we get the below graph. A moving average was used here because this graph is a bit jumpy otherwise. New projects don’t get added very consistently.

This shows us we’re adding less than 500,000 new projects per year, which is way better than one million! But still a lot more than 10,000.

The downloads

We unfortunately don’t have an impressive graph of downloads to show. The most download data we can get from npm covers one year, and it’s a single number per package rather than a series spread out by date.

In the last year, there were 130,046,251,733,027 NPM downloads. That feels like a fake number; it’s 15 digits, 130 TRILLION downloads. Those downloads are not spread out evenly. The median package has only 217 downloads, the bottom 5% have 71 or fewer, and the top 5% have more than 16,000. It’s pretty clear the distribution of downloads is very uneven; the most popular projects get most of the downloads.

Here is a graph of the top 100 projects by downloads. It follows a very common power distribution curve.

We probably can’t imagine what this download data over all time must look like. It’s almost certainly even more mind boggling than the current data set.

Most of these don’t REALLY matter

Nobody would argue if someone said that the vast majority of NPM packages will never see widespread use. Using the download data, we can show that 95% of NPM packages aren’t widely used. But the sheer scale is what’s important: 5% of NPM is still more than 100,000 unique packages. That’s a massive number; even at a review pace of 10,000 packages a year, that’s more than ten years of work, and this is just NPM.

If we filter our number of maintainers graph to only include the top 5% of downloaded packages, it basically looks the same, just with smaller numbers

Every way we look at this data, these trends seem to hold.

Now that we know how incredibly huge this all really is, we can start to talk about this supposed supply chain and what comes next.

What we can actually do about this

First, don’t panic. Then the most important thing we can do is to understand the problem. Open source is already too big to manage and growing faster than we can keep up. It is important to have realistic expectations. Before now many of us didn’t know how huge NPM was. And that’s just one ecosystem. There is a lot more open source out there in the wild.

There’s another quote from Douglas Adams’ Hitchhiker’s Guide to the Galaxy that seems appropriate right now:

“I thought,” he said, “that if the world was going to end we were meant to lie down or put a paper bag over our head or something.”

“If you like, yes,” said Ford.

“Will that help?” asked the barman.

“No,” said Ford and gave him a friendly smile.

Open source isn’t a force we command, it is a resource for us to use. Open source also isn’t one thing, it’s a collection of individual projects. Open source is more like a natural resource. A recent report from the Atlantic Council titled Avoiding the success trap: Toward policy for open-source software as infrastructure compares open source to water. It’s an apt analogy on many levels, especially when we realize most of the surface of the planet is covered in water.

The first step to fixing a problem is understanding it. It’s hard to wrap our heads around just how huge open source is, humans are bad at exponential growth. We can’t have an honest conversation about the challenges of using open source without first understanding how big and fast it really is. The intent of this article isn’t to suggest open source is broken, or bad, or should be avoided. It’s to set the stage to understand what our challenge looks like.

The importance and overall size of open source will only grow as we move forward. Trying to use the ideas of the past can’t work at this scale. We need new tools, ideas, and processes to face our new software challenges. There are many people, companies, and organizations working on this but not always with a grasp of the true scale of open source. We can and should help existing projects, but the easiest first step is to understand how big our open source use is. Do we know what open source we’re using?

Anchore is working on this problem every day. Come help with our open source projects Syft and Grype, or have a chat with us about our enterprise solution.

Josh Bressers
Josh Bressers is vice president of security at Anchore where he guides security feature development for the company’s commercial and open source solutions. He serves on the Open Source Security Foundation technical advisory council and is a co-founder of the Global Security Database project, which is a Cloud Security Alliance working group that is defining the future of security vulnerability identifiers.

Breaking Down NIST SSDF: Spotlight on PW.6 Compilers and Interpreter Security

In this part of the long-running series breaking down the NIST Secure Software Development Framework (SSDF), also known as NIST 800-218, we are going to discuss PW.6. This control is broken into two parts, PW.6.1 and PW.6.2, which are related and defined as:

PW.6.1: Use compiler, interpreter, and build tools that offer features to improve executable security.
PW.6.2: Determine which compiler, interpreter, and build tool features should be used and how each should be configured, then implement and use the approved configurations.

We’re going to lump both of these together for the purpose of this post; it doesn’t make sense to split them apart when reviewing what they actually mean. There will still be two posts for PW.6, though, and this is part one. Let’s start by looking at the examples for some hints on what the standard is looking for:

PW.6.1
Example 1: Use up-to-date versions of compiler, interpreter, and build tools.
Example 2: Follow change management processes when deploying or updating compiler, interpreter, and build tools, and audit all unexpected changes to tools.
Example 3: Regularly validate the authenticity and integrity of compiler, interpreter, and build tools. See PO.3.

PW.6.2
Example 1: Enable compiler features that produce warnings for poorly secured code during the compilation process.
Example 2: Implement the “clean build” concept, where all compiler warnings are treated as errors and eliminated except those determined to be false positives or irrelevant.
Example 3: Perform all builds in a dedicated, highly controlled build environment.
Example 4: Enable compiler features that randomize or obfuscate execution characteristics, such as memory location usage, that would otherwise be predictable and thus potentially exploitable.
Example 5: Test to ensure that the features are working as expected and are not inadvertently causing any operational issues or other problems.
Example 6: Continuously verify that the approved configurations are being used.
Example 7: Make the approved tool configurations available as configuration-as-code so developers can readily use them.

If we review the references, we find a massive swath of suggestions: everything from code signing to obfuscating binaries, to handling compiler warnings, to threat modeling. The net was cast wide on this one. Every environment is different. Every project or product uses its own technology. There’s no way to “one size fits all” this control. This is one of the challenges that has made compliance so difficult for developers in the past. We have to determine how this applies to our environment, and the way we apply it will be drastically different from the way someone else does.

We’re going to split this topic along the lines of build environments and compiler/interpreter security. For this blog, we are going to focus on using modern protection technology, specifically in compiler security and runtimes. Of course, you will have to review the guidance and understand what makes sense for your environment, everything we discuss here is for example purposes only.

Compiler security
When we think about the security of applications, we tend to focus on the code itself. Security vulnerabilities are the result of attackers causing unexpected behavior in the code: printing an unescaped string, adding or subtracting a very large integer, maybe even getting the application to open a file it shouldn’t. We’ve all heard about memory safety problems and how hard they are to avoid in certain languages; C and C++ are legendary for their lack of memory protection. Our intent should be to write code that doesn’t have security vulnerabilities. The NSA and even Consumer Reports have recently come out against using memory unsafe languages. We can also lean on technology to reduce the severity of memory safety bugs while we can’t yet abandon memory unsafe languages, and maybe we never will; there’s still a lot of COBOL out there, after all.

While attackers can exploit some bugs in ways that cause unexpected behavior, there are technologies, especially in compilers, that can lower the severity or even eliminate the danger of certain bug classes. For example, stack buffer overflows in C used to be a huge problem; then we created stack canaries, which have reduced the severity of these bugs substantially.

Every compiler is different, every operating system is different, and every application is different, so all of this has to be decided for each individual application. For simplicity, we will use gcc to show how some of these technologies work and how to enable them. The Debian Wiki Hardening page has a huge amount of detail; we’ll just cover some of the quick, easy things.
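The terminal sessions below use a small test program, test-overflow.c. The original file isn’t shown, but judging from the compiler warning later in this post, it is roughly something like the following reconstruction (an illustration, not the exact source):

cat > test-overflow.c << 'EOF'
#include <string.h>

void function(void)
{
	char s[9];
	strcpy(s, "This string is too long"); /* 24 bytes into a 9-byte buffer */
}

int main(void)
{
	function();
	return 0;
}
EOF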

user@debian:~/test$
user@debian:~/test$ gcc -o overflow test-overflow.c
user@debian:~/test$ ./overflow
Segmentation fault
user@debian:~/test$ gcc -fstack-protector -o overflow test-overflow.c
user@debian:~/test$ ./overflow
*** stack smashing detected ***: terminated
Aborted
user@debian:~/test$

In the above example, we can see what the gcc stack protector feature does: instead of the program silently corrupting its stack and segfaulting (or worse, being quietly exploited), the overflow is detected at runtime and the program is terminated with a clear message.

Most of these protections will only reduce the severity of a very narrow group of bugs. These languages still have many other problems and moving away from a memory unsafe language is the best path forward. Not everyone can move to a memory safe language, so compiler flags can help.

Compiler warnings are bugs
There was once a time when compiler warnings were ignored because they were just warnings. It didn’t really matter, or so we thought. Compiler warnings were just suggestions from the compiler, if there’s time later those warnings can be fixed. Except there is never time later. It turns out that sometimes those warnings are really important. They can be hints that a serious bug is waiting to be exploited. It’s hard to know which warnings are harmless and which are serious, so the current best practice is to fix them all to minimize vulnerabilities in your code.

If we use our example code, we can see:

user@debian:~/test$
user@debian:~/test$ gcc -o overflow test-overflow.c
test-overflow.c: In function 'function':
test-overflow.c:6:2: warning: '__builtin_memcpy' writing 24 bytes into a region of size 9 overflows the destination [-Wstringop-overflow=]
6 | strcpy(s, "This string is too long");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
user@debian:~/test$

We see a warning telling us our string is too long. The build doesn’t fail, but that’s not a warning you should ignore.
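One way to act on that, and on the “clean build” concept from PW.6.2’s examples, is to promote warnings to errors so the build fails instead of quietly succeeding; with gcc that can be as simple as adding a couple of flags (a sketch, not the only way to do it):

gcc -Wall -Werror -o overflow test-overflow.c

With -Werror the same stringop-overflow warning becomes a hard error, the binary is never produced, and the issue has to be fixed (or explicitly triaged) before the code ships.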

Interpreted languages

The SSDF’s suggestion for interpreted languages is to use the latest interpreter. These languages are memory safe, but they are still vulnerable to logic bugs. Many of the interpreters themselves are written in C or C++, so you could double check that they are built with the various compiler hardening features enabled.
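One way to do that on a Debian-style system is the hardening-check script from the devscripts package, which inspects a binary for features such as stack protection, PIE, and read-only relocations (the interpreter path here is just an example):

hardening-check /usr/bin/python3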

There aren’t often protections built into the interpreter itself. This goes back to the wide swath of guidance for this control: programming languages have an infinite number of possible use cases, and the problem set is too large to protect comprehensively. Memory safety is a very narrow set of problems that we still can’t get right; general purpose programming is an infinitely wide set of problems.

There were some attempts to secure interpreted languages in the past, but the hardening proved to be too easy to break to rely on as a security feature. PHP and Ruby used to have safe mode, but it turned out they weren’t actually safe. Compiler and interpreter protections are hard to make effective in meaningful ways.

The best way to secure interpreted languages is to run the code in sandboxes using things like virtualization and containers. That guidance won’t be covered in this post; in fact, the SSDF doesn’t cover how to run applications securely at all, as it focuses on development. There is plenty of other guidance on that, and we’ll make sure to cover it once the SSDF series is complete.

This complexity and difficulty are almost certainly why the SSDF guidance is to just run the latest interpreter. The latest interpreter version ensures that known bugs, security or otherwise, have been fixed.

Wrapping up
As we can see from this post, compiler and runtime security isn’t a simple task. It’s one of those things that can feel easy, but it really isn’t; the devil is in the details. The only real guidance here is to figure out what works best in your environment and go with that.

If you missed the first post in this series, you can view it here. Next time we will discuss build systems. Build systems have been a popular topic over the last few years as they have been targets for attackers. Luckily for us there is some solid guidance we can draw upon for securing a build system.



Why is this massive supply chain attack being ignored?

If you read security news, you may have heard about a recent attack that resulted in 144,000, that’s one hundred and forty-four THOUSAND, packages being uploaded to NuGet, PyPI, and NPM. That’s a mind-boggling number; with all the supply chain news, it seems like it would be all anyone is talking about. Instead, it seems to have flared up quickly and then died right back down.

The discovery of this attack was made by Checkmarx. Essentially what happened was attackers created a number of accounts in the NuGet, PyPI, and NPM packaging ecosystems. Those fake accounts then uploaded a huge number of packages that linked to phishing sites in the package description. The intention seems to have been to improve search ranking of those sites as well as track users that enter sensitive details.

Supply chain security is an overused term

The term “supply chain security” is very overused these days. What we tend to call software supply chain security is really a bundle of other things that are sometimes hard to describe: reproducible builds, attestation, source code control, and slim containers are a few examples. An attack like this can’t be solved with the current toolset we have, which is almost certainly why it’s not getting the attention it deserves. It’s easy to talk about something an exciting project or startup can help with; it’s much harder to understand and fix systemic problems.

Why this one is different

To understand why this is so different and hard, let’s break the problem down into its pieces. The first part of the attack is the packaging ecosystems. The accounts in question were valid; they weren’t hacked and weren’t impersonating someone else. The various packaging ecosystems have low barriers to entry, which is why we all use them and why they are so incredible. In these ecosystems we see new accounts and packages all the time; in fact, thousands of new packages are added every day. There’s nothing unexpected about an attacker creating many accounts, so no alarm bells would go off. Once an account exists, it can start adding packages.

The second piece of this attack is that someone has to download the package in question. It should be pointed out that in this particular instance the actual package content isn’t malicious, but it’s safe to say nobody wants any of these packages in their application. The volume of bad packages is important for this part of the attack: developers will mistype package names, stumble on a package thinking it solves their problem, or just have bad luck and install something by accident. Again, this is working within the constraints of the system. So far nothing happening is outside of everyday operations.

Then the last part of this attack is how it gets cleaned up. The packaging ecosystems have stellar security teams working behind the scenes. As soon as they find a bad package it gets delisted. It’s rare for these malicious packages to last more than a few days once they are discovered. Quickly removing packages is the best course of action. But again, the existing supply chain security solutions won’t pick up any of these happenings at this time. When a package is delisted, it just vanishes. How do you know if any of the packages you already installed are a problem? What if your artifact registry just cached a malicious package? It can be difficult to understand if you have a malicious package installed.

How should this work?

How we detect these problems is where things start to get really hard. There will be calls for the packaging ecosystems to lock down their environments, that’s probably a bad idea. The power of open source is how fast and easy it is to collaborate. Putting up walls won’t solve this, it just moves the problem somewhere else, often in a way that hides the real issues.

We have existing databases that track vulnerabilities and bad packages, but they can’t handle this scale today. There are examples of malicious packages listed in OSV and GitHub’s vulnerability database. Other databases like CVE have explicitly stated they don’t want to track this sort of malware. Just knowing where to look and how to catalog these malicious packages isn’t simple, yet it’s an ongoing problem. There have been several instances of malicious packages just this year.

To understand the scale of this data: the CVE project has existed since 1999, and there were about 200,000 IDs in total at the end of 2022. Adding 144,000 new IDs would be significant.

At the end of the day, the vulnerability databases are where this data needs to exist. Creating a new way to track malicious packages and expecting everyone to watch it just creates new problems. We are good at finding and fixing vulnerabilities in our software, this is fundamentally the same problem. Malicious packages are no different than vulnerabilities. We also need to keep in mind this will continue to happen.

There are a huge number of tools that exist and parse vulnerability databases, then alert developers. Alerting developers is exactly what these datasets and tools were built for, but none of them are picking up this type of supply chain problem today. If we add this data to the existing data all the pieces can fall into place with minimal disruption.

What can we do right now?

A knee jerk reaction to an event like this is to create constraints on developers in an attempt to only use trusted packages. While that can work, it’s always important to remember that when you create constraints for a person, they become more creative. Using curated open source repositories will need ongoing maintenance. If you just make pulling new packages harder without the ability to quickly add new packages, the developers will find another way.

At the moment there’s no good solution for detecting these packages. The best option is to generate a software bill of materials (SBOM) for all of your software, then look for the list of known bad packages against what’s in the SBOMs. In this particular case even if you have one of these packages in your environment, it will be harmless. But the purpose of this post is to explain the problem so the community can have informed conversations. This is about starting to work together to solve hard problems.

In the future we need to see lists of these known malicious packages cataloged somewhere. It’s boring and difficult work though, so it’s unlikely to get much attention. This is the equivalent of buried utilities that let modern society function. Extremely important, but not something that turns many heads unless it goes terribly wrong.

There’s no way any one group can solve this problem. We will need a community effort. Everyone from the packaging ecosystems, to the vulnerability databases, to the tool manufacturers, and even the security researchers all need to be on the same page. There are efforts underway to help with this. OSV and GitHub allow community contributions. The OpenSSF has a Securing Software Repos working group. The Cloud Security Alliance has the Global Security Database. These are some of the places to find or generate productive and collaborative conversations that can drive progress that hinders use of malicious packages in the software supply chain.


Breaking Down NIST SSDF: Spotlight on PS.3.2

This is the second post in a long running series to explain the details of the NIST Secure Software Development Framework (SSDF), also known as the standard NIST 800-218. You can find more details about the SSDF on the NIST website.

Today we’re going to cover control PS.3.2, which is defined as:

PS.3.2: Collect, safeguard, maintain, and share provenance data for all components of each software release (e.g., in a software bill of materials [SBOM]).

This one sounds really simple, we just need an SBOM, right? But nothing is ever that easy, especially in the world of cybersecurity compliance.

Let’s break this down into multiple parts. Nearly every word in this framework is important for a different reason. The short explanation is we need data that describes our software release. Then we need to safely store that data. It sounds simple, but like many things in our modern world of technology, the devil is in the details.

Start with the SBOM

Let’s start with an SBOM. Yes, you need an SBOM. That’s the provenance data. There are many ways to store release data, but the current expectation across the industry is that SBOMs will be the primary document. The intent is we have the ability to receive and give out SBOMs. For the rest of this post we will put a focus on how to meet this control using an SBOM and SBOM management.

It doesn’t matter how fast or slow the release process is: every time you ship or deploy software, you need an SBOM. For most of us the days of putting out a release every few years are long gone; almost everyone is releasing software at a breakneck pace. Humans cannot be a part of this process, because humans are slow and make mistakes. To solve the challenge of SBOM automation, we need, well, automation. SBOMs should be generated automatically during stages of the development process. There are many different ways to accomplish this; here at Anchore we’re pretty partial to the Syft SBOM generator. We will be using Syft in our examples, but there are many ways to create this data.

Breaking it Down

Creating an SBOM is the easiest step of meeting this control. If we have a container we need an SBOM for, let’s use the Grype container for our example. It can be as easy as running

syft -o spdx-json anchore/grype:latest

and we have an SBOM of the Grype container image in the SPDX format. In this example we generated an SBOM from a container in the Docker registry, but there’s no reason to wait for a container to be pushed to the registry. You can add Syft into the build process; for example, there is a Syft GitHub action that does this step automatically on every build. There are even ways to include the SBOM in the registry metadata now.

Once we have our SBOMs generated (keep in mind the ‘s’ is important, you are going to have a lot of SBOMs), we need to collect them. Some applications will have one SBOM, some will have several; if you ship three container images for an application, you will end up with at least three SBOMs. This is why the word “collect” exists in the control. Collecting really just means making sure you can find the SBOMs that were automatically generated. In our case, we would collect and store the SBOMs in Anchore Enterprise, a tool that does a great job of keeping track of a lot of SBOMs. More details can be found on the Anchore Enterprise website.
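What “collect” looks like in practice will vary. As a minimal sketch, assuming a release made up of three hypothetical container images, it can be as simple as generating the SBOMs together and storing them under the release version:

RELEASE=v2.7.0
mkdir -p "sboms/${RELEASE}"
for image in registry.example.com/shop/frontend:${RELEASE} \
             registry.example.com/shop/api:${RELEASE} \
             registry.example.com/shop/worker:${RELEASE}; do
  name=$(basename "${image%%:*}")   # frontend, api, worker
  syft -o spdx-json "$image" > "sboms/${RELEASE}/${name}.spdx.json"
done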

Protect the Data Integrity

After the SBOMs are collected, we have to safeguard the SBOMs contents. The word safeguard isn’t very clear. One of the examples states “Example 3: Protect the integrity of provenance data, and provide a way for recipients to verify provenance data integrity.” This seems pretty straightforward. It would be dishonest to make the claim “just sign the SBOM and you’re done” because digital signatures are still hard.

It’s probably best to use whatever mechanisms you use to safeguard your application artifacts to also safeguard the SBOM. This could be digital signatures. It could be read-only bucket storage over HTTPS. It could be checksum data made available out of band, or maybe just a system that provides audit logs of when data changes. There’s no single way to do this, and unfortunately there’s no universal advice to hand out for this step. Be wary of anyone claiming this is a solved problem today. The smart folks working on Syft have some ideas on how to deal with it.
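As one possible illustration of the checksum-plus-signature approach (it assumes you already have a GPG key and somewhere out of band to publish the files):

# record checksums for every SBOM in the release, then sign the checksum file
sha256sum sboms/v2.7.0/*.spdx.json > sboms/v2.7.0/SHA256SUMS
gpg --armor --detach-sign sboms/v2.7.0/SHA256SUMS

Recipients can then verify what they received with sha256sum -c SHA256SUMS and gpg --verify SHA256SUMS.asc SHA256SUMS.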

We are also expected to maintain the SBOMs we are now collecting and safeguarding. This seems easy, since in theory an SBOM is a static document, but it can be interpreted in several ways. NIST has a glossary; it doesn’t define “maintain,” but it does define maintenance as “Any act that either prevents the failure or malfunction of equipment or restores its operating capability.” It’s safe to say the intent is to make sure the SBOMs remain available now and into perpetuity. In a fast-moving industry it’s easy to forget that two or more years from now the data in an SBOM could be needed by customers, auditors, or even forensic investigators. On the other side of that coin, it’s just as possible that in a few years what passes as an SBOM today won’t be considered an SBOM at all. Maintaining SBOMs should not be dismissed as unimportant or simple. You should find an SBOM management system that can store and convert SBOM formats as a way to future-proof the documents.

There are new products coming to market that can help with this maintain stage. They are being touted as SBOM management platforms. Anchore Enterprise is a product that does this. There are also open source alternatives such as Dependency Track. There will no doubt be even more of these tools into the future as SBOM use increases and the market matures.

Lastly, and possibly most importantly, we have to share the SBOMs.

One aspect of SBOMs that keeps coming up is the idea that every SBOM needs to be available to the public. This is specifically covered by CISA in their SBOM FAQ; it comes up on a pretty regular basis and is a point of confusion. You get to decide who can access an SBOM: you can distribute SBOMs only to your customers, distribute them to the public, or keep them internal. Today there isn’t a well-defined way to distribute SBOM data. Many ecosystems have their own ways of including it; in the world of containers, for example, registries are putting SBOMs in metadata, and even GoReleaser lets you create them. Depending on how your product or service is accessed, there may not be a simple answer to this question.

One solution could be having customers email support asking for a specific SBOM. Maybe you have the SBOM available in the same place customers download your application or login to your service. You can even just package the SBOM up into the application, like a file in a zip archive. Once again, the guidance does not specifically tell us how to accomplish this.

Pro Tip: Make sure you include instructions for anyone downloading the SBOM on how to verify the integrity of your application and your SBOM. PS.3.1 talks about how to secure the integrity of your application; we’ll cover that in a future blog post.

Final Thoughts

This is one control out of 42. It’s important to remember this is a journey, it’s not a one and done sort of event. We have many more blog posts to share on this topic, and a lot of SBOMs to  generate. Like any epic journey, there’s not one right way to get to the destination. 

Everyone has to figure out how they want to meet each NIST SSDF control, ideally in a way that is valuable to the organization as well as customers. Processes that create unnecessary burden will always end up worked around, and processes integrated into existing workflows are far less cumbersome. Let’s aim high and produce verifiable components that not only meet NIST compliance, but also ease the process for downstream consumers.

To sum it all up, you need to create SBOMs for every release, safeguard them the same way you safeguard your application, store them in a future proof manner, and be able to share the SBOMs. There’s no one way to do any of this, if you have any questions subscribe to our newsletter for monthly updates on software supply chain security insights and trends.


Ask Me Anything: SBOMs and the Executive Order

The software supply chain is under intense pressure and scrutiny with the rise of malicious attacks that target open source software and components. Over the past year the industry has received guidance from the government with the Executive Order on Improving the Nation’s Cybersecurity and the most recent M-22-18 Enhancing the Security of the Software Supply Chain through Secure Software Development Practices. Now, perhaps more than ever before, it’s critical to have a firm understanding of the integrity of your software supply chain to ensure a strong security posture. This webinar will provide you with open access to a panel of Anchore experts who can discuss the role of a software bill of material (SBOM) and answer questions about how to understand and tackle government software supply chain requirements.

An Introduction to the Secure Software Development Framework

It’s very likely you’ve heard of a new software supply chain memo from the US White House that came out in September 2022. The content of the memo has been discussed at length by others. The actual memo is quite short and easy to read, you wouldn’t regret just reading it yourself.

The very quick summary of this document is that everyone working with the US Government will need to start following NIST 800-218, also known as the NIST Secure Software Development Framework, or SSDF. This is a good opportunity to talk about how we can start to do something with SSDF today. For the rest of this post we’re going to review the actual SSDF standard and start creating a plan of tackling what’s in it. The memo isn’t the interesting part, SSDF is.

This is going to be the first of many, many blog posts as there’s a lot to cover in the SSDF. Some of the controls are dealt with by policy. Some are configuration management, some are even software architecting. Depending on each control, there will be many different ways to meet the requirements. No one way is right, but there are solutions that are easier than others. This series will put extra emphasis on the portions of SSDF that deal with software bill of materials (SBOM) specifically, but we are not going to ignore the other parts.

An Introduction to the Secure Software Development Framework (SSDF)

If this is your first time trying to comply with a NIST standard, keep in mind this will be a marathon. Nobody starts following the entire compliance standard on day one. Make sure to set expectations with yourself and your organization appropriately. Complying with a standard will often take months. There’s also no end state, these standards need to be thought about as continuous projects, not one and done.

If you’re looking to start this journey I would suggest you download a spreadsheet NIST has put together that details the controls and standards for SSDF. It looks a little scary the first time you load it up, but it’s really not that bad. There are 42 controls. That’s actually a REALLY small number as far as NIST standards go. Usually you will see hundreds or even thousands.

An Overview of the NIST SSDF Spreadsheet

There are 4 columns: Practices, Tasks, Notional Implementation Examples, References

If we break it down further, we see there are 19 practices and 42 tasks. While this can all feel intimidating, 19 practices and 42 tasks is something we can work with. The practices are the logical groupings of tasks, and the tasks are the actual controls we have to meet. The SSDF document covers all of this in greater detail, but the spreadsheet makes everything more approachable and easier to group together.

The Examples Column

The examples column is where the spreadsheet really shines. The examples are how we can better understand the intent of a given control. Every control has multiple examples and they are written in a way anyone can understand. The idea here isn’t to force a rigid policy on anyone, but to show there are many ways to accomplish these tasks. Most of us learn better from examples than we do from technical control text, so be sure to refer to the examples often.

The References Section

The references section looks scary. There are a lot of references, and anyone who tries to read them all will be stuck for weeks or months. That’s OK; they aren’t something you have to actively read. They exist to give additional guidance if something isn’t clear. There’s already a lot of security guidance out there, and it can be easier to cross-reference work that already exists than to write all new content. This is how you can get clarifying guidance on the tasks. It’s also possible you are already following one or more of these standards, which means you’ve already started your SSDF journey.

The Tasks

Every task has a certain theme. There’s no single product you can buy that will solve all of these requirements. Some tasks can only be met with policy, and some with secure software development processes. Most will have multiple ways to meet them: some can be met with commercial tools, some with open source tools.

Interpreting the Requirements

Let’s cover a very brief example (we will cover this in far more detail in a future blog post): PO.1.3, third-party requirements. The text of this control reads

PO.1.3: Communicate requirements to all third parties who will provide commercial software components to the organization for reuse by the organization’s own software. [Formerly PW.3.1]

This requirement revolves around communicating your own requirements to your suppliers. But today the definition of supplier isn’t always obvious. You could be working with a company. But what if you’re working with open source? What if the company you’re working with is using open source? The important part is better explained in the examples. Example 3: “Require third parties to attest that their software complies with the organization’s security requirements.”

It’s easier to understand this in the context of having your supplier prove they are in compliance with your requirements. Proving compliance can be difficult even in the best situations. Keep in mind you can’t do this in one step. You probably first need to know what you have (an SBOM is a great way to do this). Once you know what you have, you can start to define expectations for others. And once you have expectations and an SBOM, you can hand out an attestation.
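As a hedged sketch of that first step (Syft is assumed installed; the vendor registry and image name are made up), generating an SBOM for a supplier-provided component gives you the inventory to base your expectations on:

# inventory a vendor-supplied container image before defining requirements for the supplier
syft vendor-registry.example.com/payments-service:2.4.1 -o spdx-json > payments-service-2.4.1.spdx.json

From there, the expectations and eventual attestation can reference what the inventory actually shows rather than what the supplier says is inside.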

One of the references for this one is NIST 800-160. If we look at section 3.1.1, there are multiple pages that explain the expectations. There isn’t a simple solution as you will see if you read through NIST 800-160. This is an instance where a combination of policy, technology, and process will all come together to ensure the components used are held to a certain standard.

This is a lot to try to take in all at once, so we should think about how to break this down. Many of us already have existing components. How we tackle this with existing components is not the same approach we would take with a brand new application security project. One way to think about this is you will first need an inventory of your components before you can even try to create expectations for your suppliers.

We could go on explaining how to meet this control, but for now let’s leave the discussion here. The intent was to show what the challenge looks like, not to solve it today. We will revisit this in another blog post, where we can dive deep into the requirements, ideas for how to meet them, and even how to define what those requirements are!

Your Next Steps

Make sure you check back for the next post in this series, where we will take a deep dive into every control specified by the SSDF. New compliance requirements are a challenge, but they exist to help us improve what we are already doing in terms of secure software development practices. Securing the software supply chain is not just a popular topic; it’s a real challenge we all have to meet now. It’s easy to talk about securing the software supply chain, but it’s a lot of hard work to actually secure it. Luckily for us, there are more examples and more information to build on than ever before. Open source isn’t just about code; it’s about sharing information and building communities. Anchore has several ways to help you on this journey. You can contact us, join our community Discourse forum, and check out our open source projects: Syft and Grype.

Josh Bressers
Josh Bressers is vice president of security at Anchore where he guides security feature development for the company’s commercial and open source solutions. He serves on the Open Source Security Foundation technical advisory council and is a co-founder of the Global Security Database project, which is a Cloud Security Alliance working group that is defining the future of security vulnerability identifiers.

NSA Securing the supply chain for developers: the past, present, and future of supply chain security

Last week the NSA, CISA, and ODNI released a guide that lays out supply chain security guidance with a focus on developers. This was a welcome break from much of the existing guidance we have seen, which mostly focuses on deployment and integration rather than on software developers. The software supply chain is a large space, and that space includes developers.

The guide is very consumable. It’s short and written in a way anyone can understand; the audience here is not compliance professionals. It also provides fantastic references. Re-explaining the document isn’t needed; just go read it.

However, even though the guide is very readable, it could be considered immature compared to much of the other guidance we have seen come from the government recently. That immaturity likely comes through because developer-focused supply chain guidance is, in fact, an immature space. Developer compliance has never been successful outside of some highly regulated industries, and this guide reminds us why. Much of the guidance presented carries themes of the old heavy-handed way of trying to do security, while also attempting to incorporate some new and interesting concepts being pioneered by groups such as the Open Source Security Foundation (OpenSSF).

For example, there is guidance suggesting that developer systems not be connected to the Internet. This sort of guidance was common a decade ago, but few developers today could imagine operating a development environment without Internet access. It is a non-starter in most organizations. The old way of security was to create heavy-handed rules developers would find ways to work around. The new way is to empower developers while avoiding catastrophic mistakes.

But next to outdated guidance, we also see modern guidance such as using Supply-chain Levels for Software Artifacts, or SLSA. SLSA is a series of levels that can be attained when creating software to help ensure the integrity of the built artifacts. It is an open source project under the OpenSSF that is working to create controls to help secure our software artifacts.

If we look at SLSA Level 1 (there are 4 levels), it’s clearly the first step in a journey. All we need to do for SLSA Level 1 is keep metadata about how an artifact was built and what is in it. Many of us are already doing that today! The levels then get increasingly structured and strict until we have a build system that cannot connect to the internet, is version controlled, and signs artifacts. This gradual progression makes SLSA very approachable.
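As an illustration only (this is not the official SLSA provenance format; the image name is a placeholder and Syft is assumed for the SBOM), the spirit of Level 1 is simply writing down what you built and how:

# record what is in the artifact
syft myapp:1.0.0 -o spdx-json > myapp-1.0.0.sbom.json

# record basic facts about how it was built
cat > myapp-1.0.0.buildinfo.json <<EOF
{
  "artifact": "myapp:1.0.0",
  "source_commit": "$(git rev-parse HEAD)",
  "builder": "$(hostname)",
  "built_at": "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
EOF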

There are also modern suggestions that are bleeding edge and aren’t quite ready yet. Reproducible builds are mentioned, but there is a lack of actionable guidance on how to accomplish them. A reproducible build means you can build the source code for a project on two different systems and get the exact same output, bit for bit. Today, everyone doing reproducible builds does so through enormous effort, not because the build systems make it easy. It’s not realistic guidance for the general public yet.

The guide also expands the existing integrator-focused guidance around SBOMs and verifying components, which is an important point. It seems pretty well accepted at this point that generating and consuming SBOMs are table stakes in the software world. The guide reflects this new reality.

Overall, this guide contains an enormous amount of advice. Nobody could do all of it even if they wanted to, so don’t treat this as an all-or-nothing effort. It is a great starting point for developer supply chain security. We need to better define the guidance we can give to developers to secure the supply chain. This guide is the first step, and the first draft is never perfect, but the first draft is where the journey begins.

Understand what you are doing today, figure out what you can easily do tomorrow, and plan for some of the big things well into the future. And most importantly, ignore the guidance that doesn’t fit your environment. When guidance doesn’t match what you’re doing, it doesn’t mean you’re doing it wrong. Sometimes the guidance needs to be adjusted. The world often changes faster than compliance does.

The most important takeaway is not to view this guide as an end state. This guide is the start of something much bigger. We have to start somewhere, and developer supply chain security starts here. Both how we protect the software supply chain and how we create guidance are part of this journey. As we grow and evolve our supply chain security, we will grow and evolve the guidance and best practices.

3 Myths of Open Source Software Risk and the One Nobody Is Discussing

Open source software is being vilified once again and, in some circles, even considered a national security threat. Open source software risk has been a recurring theme: First it was classified as dangerous because anyone could work on it and then it was called insecure because nobody was in charge. After that, the concern was that open source licenses were risky because they would require you to make your entire product open source.

Let’s consider where open source stands today. It’s running at minimum 80% of the world. Probably more. Some of the most mission-critical applications and services on the planet (and on Mars) are open source. The reality is, open source software isn’t inherently more risky than anything else. It’s simply misunderstood, so it’s easy to pick on.

Myth 1: Open source software is a risk because it isn’t secure

Open source software may not be as risky as you have been led to believe, but that doesn’t mean it gets a free pass either.

The most recent and top-of-mind example is the Log4Shell vulnerability in Log4j. It’s easy to put the blame on open source, but the lack of proper insight into our infrastructure is the fundamental issue.

The question, “Are we running Log4j?” took many of us weeks to answer when we needed that answer in a few minutes. The key to managing our software risk (and that’s all software, not just open source) is to have the ability to know what is running and where it’s running. This is the literal purpose for a software bill of materials (SBOM).

The foundation for managing open source risk begins with knowing what we have in our software supply chain. Any software can be a potential risk if you don’t know you’re running it. You should be generating and receiving an SBOM for every piece of software used and have the capability to store and search the data. Not knowing what you’re running in your software supply chain is a far greater risk than actually running it.
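For example, here is a minimal sketch of answering “are we running Log4j?” from an SBOM (Syft and jq are assumed installed; myorg/myapp is a placeholder image name):

syft myorg/myapp:latest -o json > myapp-sbom.json
jq -r '.artifacts[] | select(.name | test("log4j")) | "\(.name) \(.version)"' myapp-sbom.json

If the SBOMs already exist, that second command is the only one you need to run when the next emergency hits.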

The reality is that open source software is just software. It’s when we do a poor job of incorporating it into our products, deploying it, and tracking it that creates this mythic “security risk” we often hear about.

Myth 2: Open source software is a risk because it isn’t high quality

It was easier a decade ago to claim that open source software was inferior because there wasn’t a lot of open source in use. Today too much of the world runs on top of open source software to make the claim that it is low quality — the idea is simply laughable.

The message that open source software is not suitable for enterprise use — which you’ll often hear from legacy software vendors — really boils down to a claim that open source software is inferior to commercially developed software.

In actuality, we’re not in a place to measure the quality of any of our software. While work is ongoing to fill this need, your best option today is to find the open source software that solves your problem and then make sure that it is up to date and has no major bugs that can leave your software supply chain susceptible to vulnerabilities.

Myth 3: Open source software is a risk because you can’t trust the people writing it

Myth 3 is loosely tied to the first myth, that open source software is not secure. There are efforts to measure open source quality, which is a noble cause; not all open source is created equal. It’s a common misconception that open source projects with only one maintainer are of lower quality (see myth 2) and that you can’t trust the people who build them.

There are plenty of projects in wide use where nobody really knows who is working on them. It’s a GitHub ID and that’s about it. So it’s possible the maintainer is an adversary. It’s also possible the intern that your endpoint vendor just hired is an adversary. The only difference is that in the open source world, we can at least figure it out.

Although there are open source projects that are nefarious, there are also many people working to uncover the malicious activity. They include a wide range of individuals from end users pointing out strange behavior to researchers scanning repositories and endpoint teams looking for active threats. The global community is a mighty power when it turns its attention to finding malicious open source software.

Again, open source software risk is less about trust than it is about having insight into what we are using and how we are using it. Trying to find malicious code is not realistic for many of us, but when it does get found, we need the ability to quickly pinpoint it in our software and remove it.

The true risk of open source software

In an era where the use of open source software is only increasing, the true risk in using open source — or any software for that matter — is failing to understand how it works. In the early days of open source, we could only understand our software by creating it. There wasn’t a difference between being an open source user and an open source contributor.

Open source is very different today. The number of open source users is huge (the population of the world, to be exact), while the number of open source contributors is much smaller. And this is OK, because not everyone should be expected to be an open source contributor. There’s nothing wrong with taking in open source packages and using them to build something else. That’s the whole point!

If there’s one piece of advice I can give, it’s that consuming open source can help you create better software faster as long as you manage the risk. There are many good tools that scan for vulnerabilities, and there are SBOM-driven solutions to help you identify security issues in all your software components. Open source is a journey, and each of us will experience it differently. But like any journey, we have to pay attention along the way or we could find ourselves off course.

Josh Bressers
Josh Bressers is vice president of security at Anchore where he guides security feature development for the company’s commercial and open source solutions. He serves on the Open Source Security Foundation technical advisory council and is a co-founder of the Global Security Database project, which is a Cloud Security Alliance working group that is defining the future of security vulnerability identifiers.

Grype now supports CycloneDX and SPDX

In the world of software bills of materials (SBOM) there are currently two major standards: Software Package Data Exchange (SPDX) and CycloneDX. SPDX is a product of the Linux Foundation. It’s been a standard for over ten years now. CycloneDX is brought to us by the OWASP project. It’s a bit newer than SPDX, and just as capable. If you’re following the SBOM news, these two formats are often topics of discussion.

It is expected that anyone who is creating or consuming SBOMs will probably use one of these two formats to ensure a certain amount of interoperability. If you expect the consumers of your software to keep track of your SBOM, you need a standard way of communicating. Likewise, if we are expecting an SBOM from our vendors, we want to make sure it’s in a format we can actually use. This is one of those cases where more isn’t better; two is plenty.

If you’re familiar with Anchore’s open source projects Syft and Grype, there’s also another format you’ve probably seen, known as the Syft lossless SBOM. This format was tailored specifically to the needs of Syft and Grype when the projects were just starting out. It’s a great format and contains a huge amount of information, but there aren’t many tools out there that can generate or consume this SBOM format today.
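If you want to see the standard formats in action, here is a minimal sketch (assuming Syft is installed) of producing the same SBOM in each of them:

syft ubuntu:latest -o spdx-json > ubuntu.spdx.json
syft ubuntu:latest -o cyclonedx-json > ubuntu.cdx.json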

When we think about vulnerability scanners, we tend to think about pointing a scanner at a container, a directory, or even a source repo, then scanning that location to find vulnerabilities in the dependencies. Grype has a neat trick, though: it can scan an SBOM for vulnerabilities. Instead of first scanning the files to identify them and then figuring out if any have vulnerabilities, Grype can skip the identification step by using an SBOM. Since most of the time a vulnerability scanner spends is in that identification stage, scanning an SBOM for vulnerabilities is incredibly fast.

Initially Grype was only able to use a Syft-format SBOM to scan for vulnerabilities. That’s awesome, but we come back to the problem of what happens when a vendor gives us an SBOM in SPDX or CycloneDX format. The easy answer is to support those formats too, of course. The next obvious question is which format Grype should support next: SPDX or CycloneDX? Since making a decision is hard, and you can’t really pick a favorite among SBOM formats any more than you can among children, it was decided to support both!

If you download the latest version of Grype, you can now use it to scan your SPDX and CycloneDX SBOMs for vulnerabilities. If a vendor ships you an SBOM, it can be fed directly into Grype. At the time of writing, we’re pretty sure Grype is the first open source vulnerability scanner that supports both SPDX and CycloneDX. We think that’s a pretty big deal!
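As a quick sketch (reusing the SBOM files from the Syft example above; a vendor-supplied SBOM works the same way), you point Grype straight at the SBOM:

grype sbom:./ubuntu.cdx.json
grype sbom:./ubuntu.spdx.json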

Now, it should be noted that this functionality is very new. There are going to be bugs and difficulties scanning SPDX and CycloneDX SBOMs; we would be fools to pretend the features are perfect. However, Grype is also an open source project, so you don’t have to sit on the sidelines and watch. Open source is a team sport. If you scan an SBOM with Grype and run into any problems, please file a bug. You can even submit a patch if that’s more your style; we love pull requests from our community.

Stay tuned for even more awesome features coming soon. We’re just getting started!


Viewpoint: The Future of Software Supply Chain Security

Hello friends. My name is Josh and I’ve just started at Anchore as the Vice President of Security. I’ll talk more about what the role means in future posts, but for the moment I want to answer a few questions that are top of mind right now: what I think the future of software supply chain security looks like, and why I chose to work with Anchore to help organizations better protect their software supply chains.

Back in 2004 I started working on the Red Hat Product Security Team. My focus has always been on securing the open source we all use  every day. Back then we were securing the open source supply chain, but there wasn’t really a name for this practice yet. I have always felt very strongly about the integrity and security of software products as well as the security of open source. It’s a happy coincidence that these two topics have merged in the last few years!

Today open source software makes up a majority of the code in almost every software application we use. Combining open source with cloud platforms and modern development technologies has completely changed the way we build and deliver software applications. These changes have helped us to fundamentally transform how we interact with technology in our jobs and our lives. But now, this dependence on software, and the supply chain that produces it, has created a new set of attack points for bad actors. I believe this industry-wide and economy-wide realization will change the foundations of how we build, deliver, and use technology.

What’s different now?

There was once a time the security team would end every conversation with “if you don’t listen to us someday you’ll be sorry.” Nobody was ever sorry. But the world has changed a lot and that  “someday” may be now. There are many new threats and attacks that create significant and measurable losses. Breaches are expensive, ransomware is expensive, personal data has monetary value now. Anchore’s 2021 Software Supply Chain Security Report found that 64% of organizations had been impacted by a software supply chain attack in the past year. We exist at a nexus point that has made the risk very real. Every company has gone digital with almost everything online now, DevOps has made the number of services uncountable and the pace of change almost unmeasurable. Meanwhile, the adversaries are organized and highly motivated. Separately any one of these factors might be manageable, but when you put it all together we need to completely rethink the approach to software supply chain security. Big problems need bold new ideas.

We are also in a period of disruptive change that allows new ideas and real change to happen much faster than normal. The explosion of ransomware and increasing supply chain attacks, against the backdrop of a global pandemic that has changed the very foundations of society, are creating the imperative to act. As we witness the growing attention software supply chain security is getting, it’s important to notice that we are no longer just talking about software supply chain security; we’re actually taking concrete steps to solve the problems.

What will the future look like?

Now we are starting to understand what the future will look like as we move toward solutions that will help better protect the software supply chain against these growing risks. In the past it was very common to conduct a security review once a product was “done”. This often resulted in a lot of missed security vulnerabilities running in production or being delivered to customers. Modern development has changed such that security is expected to be tightly integrated into every step of the development process. This is where the term “shift left” originated.

We are already seeing the beginning of this change with the growing attention paid to the software bill of materials (SBOM) and vulnerability scanning as critical components of software supply chain security. Neither of these ideas is new, but we are seeing convergence around SBOM standards. Groups like The Linux Foundation’s OpenSSF and the Cloud Native Computing Foundation (CNCF) are working in the open source ecosystem to create a common understanding of the problems and define potential solutions. The United States Cybersecurity and Infrastructure Security Agency (CISA) has a supply chain task force. Conferences have entire supply chain tracks to share emerging best practices. The time to address software supply chain security is here.

There are new practices, processes, and tools that will need to be put into place to protect the software supply chain. While the importance of SBOM and vulnerability scanning is well understood, the critical challenge is in using the data to improve security. I think what we do with this data is the biggest area for improvement. Having an SBOM by itself isn’t useful. You need the ability to store it, track it over time, to aggregate the data, search the data, and get back actionable answers to questions. The same holds true for vulnerability scanning. Just scanning software after it has been built isn’t enough. What happens after the scan runs? How do you use the data to identify and remediate problems to reduce risk?

I want to use the Heartbleed vulnerability as a great example of where we started, where we are today, and where I want to see us go next. If you were around for Heartbleed, it was an eye opening experience. Just determining which systems you had running a vulnerable version of OpenSSL was a herculean task. Most of us had to manually go looking for files on a disk. Today with our ability to generate and distribute SBOMs, it’s not hard to figure out what systems are using OpenSSL. We can even construct policies now that could prevent a new build or deployment that contains an old version of OpenSSL.
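As a rough sketch of what that looks like today (assuming you keep Syft-format JSON SBOMs for your deployed images in a ./sboms/ directory and that jq is installed; both are assumptions for illustration):

# which of our stored SBOMs contain OpenSSL, and which version?
for f in sboms/*.json; do
  jq -r --arg f "$f" '.artifacts[] | select(.name | test("openssl")) | "\($f): \(.name) \(.version)"' "$f"
done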

The future I want to see is having insight into your end-to-end software supply chain and the security of the software you create and use. Being able to craft policies that can be enforced about what your software should look like. Not just having the ability to ask what a vulnerability or bug means for your application, but having tools that tell you before you even ask the question.

Why Anchore?

This all brings us to Anchore. I’ve known about Anchore for quite some time. In my previous role in Product Security, I worked with a large number of organizations focused on software supply chain issues. This included open source projects, software vendors, consultants, and even supply chain working groups. It became very obvious that while there was increasing focus on software supply chain security, there wasn’t always a consensus on the best practices or tooling needed.

The current state of tools is very uneven. Few tools provide comprehensive SBOMs with all of the relevant metadata needed to make accurate security assessments and decisions. Some scanning tools want to report zero false positives, resulting in lots of false negatives. Other tools simplistically report every possible vulnerability which results in lots of irrelevant false positives. I’m not looking to point any fingers here, this is all very new and everyone is continuing to learn. In my experience the sweet spot is somewhere in the middle—some false positives should be expected, but too many or too few are both bad. The purpose of tooling is to help provide data to make decisions. Bad data results in bad decisions.

Every time I interacted with an organization in the software supply chain space, I kept seeing Anchore occupying that sweet spot in the middle. Anchore starts from a foundation of open source tools that are easy for developers to integrate and use. Syft, an open source SBOM generator, is incredibly useful and accurate. Grype, an open source vulnerability scanner, is one of the best vulnerability scanners I’ve ever used. Anchore’s commercial product, Anchore Enterprise, builds on that open source foundation and adds powerful features for cataloging SBOMs, remediating vulnerabilities, and enforcing policies. Everywhere I looked, it seemed that Anchore was the one company that “got it.” Anchore was doing all the things that were important to me in a way that made sense: relevant scanning results, easy SBOM creation and use, and the ability to leverage existing policies (like CIS) instead of trying to build new ones.

And lastly, open source. Open source isn’t something I think is a good idea, it’s part of who I am. My entire life has been shaped and built within the open source community. I know anywhere I work has to be extremely open, very open source friendly, and have a culture that mirrors the ways open source thinks and works. Anchore has the open source culture and open source focus that I know is so very important. They have a whole blog dedicated to their culture, give it a read, it’s fantastic!

What’s next?

The easiest way to see what’s next is to give the Anchore open source tools a spin. Generate an SBOM with Syft. Then scan the SBOM file for vulnerabilities with Grype. It’s all open, try them out, file some bugs, submit pull requests. Open source works best when everyone works together.
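A minimal end-to-end run looks something like this (assuming both tools are installed; ubuntu:latest is just an example target):

syft ubuntu:latest -o json > sbom.json
grype sbom:./sbom.json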

If you want to pull it all together for an end-to-end solution for securing your software supply chain, check out Anchore Enterprise. It’s a nice way to tie the tools together in one place to meet the needs of a larger organization or multiple teams.

I love to talk about these topics. If you’re interested in having a chat or even just saying hi feel free to reach out. Watch this space, there’s a lot to talk about, and even more work to do. It’s going to be a truly epic adventure!

How to Check for CISA Catalog of Exploited Vulnerabilities

Last week the United States Cybersecurity and Infrastructure Security Agency (CISA) published a binding operational directive describing a list of security vulnerabilities that all federal agencies are required to fix. Read the directive here: https://cyber.dhs.gov/bod/22-01/ 

The directive establishes a CISA-managed catalog of known exploited vulnerabilities that carry significant risk to federal agencies. The list can be found here: https://www.cisa.gov/known-exploited-vulnerabilities-catalog

While CISA’s directive is binding only on U.S. federal agencies, companies can also leverage this catalog to prioritize vulnerabilities that may put their organization at risk.

There has been a lot of discussion about this directive and what it will mean. Rather than add commentary about the directive itself, let’s discuss what’s actually inside this list of vulnerabilities and what actions you can take to check if you are using any of the software in question.

It’s important to understand that the list of vulnerabilities in this catalog will not be static. CISA has stated in their directive that the list will be modified in the future, meaning that we can expect more vulnerabilities to be added. Even if a federal agency is not currently running any of the vulnerable software versions, as the list grows and evolves and the software that is running evolves, it will be important to have a plan for the future. Think about handling vulnerabilities like delivering the mail. Even if you finish all your work by the end of the day, there will be more tomorrow.

If you work with lists of vulnerabilities you will be used to vulnerabilities having a severity assigned by the National Vulnerability Database (NVD). The NVD is a U.S. government repository of vulnerability data that is managed by the National Institute of Standards and Technology (NIST). The data in NVD enriches the CVE data set with additional product information as well as a severity rating for the vulnerability based on the CVSS scoring system.

It is very common for policy decisions to be made based on the NVD CVSS severity rating. Any vulnerability with a CVSS rating of critical or high is expected to be fixed very quickly, while more time is allowed to fix medium and low severity vulnerabilities. The idea is that these severity ratings can help us decide which vulnerabilities are the most dangerous, so those can be fixed right away.

However, this new list of must-fix vulnerabilities from CISA goes beyond just considering the CVSS score. At the time of writing this the CISA list contains 291 vulnerabilities that require special attention. But why these 291 when there are an almost immeasurable number of vulnerabilities in the wild? The directive indicates that these vulnerabilities are being actively exploited, which means there are attackers using these vulnerabilities to break into systems right now.

Not all vulnerabilities are created equally

Examining the catalog of vulnerabilities from CISA, many of the IDs have received a rating of critical or high from NVD, but not all. For example, CVE-2019-9978 is a vulnerability in a WordPress plugin with a severity of medium. Why would a medium severity rating make this list? Attackers don’t pay attention to severity.

Remember this list isn’t based on the NVD CVSS severity rating, it’s based on which vulnerabilities are being actively exploited. CISA has information that organizations do not and is aware of attackers using these particular vulnerabilities to attack systems. The CVSS rating does not indicate if a vulnerability is being actively attacked, it only scores on potential risk. Just because a vulnerability is rated as medium doesn’t mean it can’t be attacked. The severity only describes the potential risk; low risk does not mean zero risk.

How Anchore can help

There are a few options Anchore provides that can help you handle this list. Anchore has an open source tool called Grype which is capable of scanning containers, archives, and directories for security vulnerabilities. For example, you can use Grype to scan the latest Ubuntu image by running
docker run anchore/grype ubuntu:latest
You will have to manually compare the output of Grype to the list from CISA to determine if you are vulnerable to any of the issues. Luckily, CISA has provided a CSV of all the CVE IDs here:
https://www.cisa.gov/sites/default/files/csv/known_exploited_vulnerabilities.csv

Here’s a simplified example you can use right now to check if a container is vulnerable to any of the items on the CISA list.

First, use Grype to scan a container image. You can also scan a directory or archive; this example just uses containers because it’s simple. Extract just the CVE IDs, sort them, then store the sorted list in a file called scan_ids.txt in /tmp.

docker run anchore/grype ubuntu:latest | sed -r 's/.*(CVE-[0-9]{4}-[0-9]{4,}).*/\1/g' | sort > /tmp/scan_ids.txt

Next, download the CISA CSV file, extract the CVE IDs, sort them, and store the results in a file called “cisa_ids.txt” in /tmp/

curl https://www.cisa.gov/sites/default/files/csv/known_exploited_vulnerabilities.csv | sed -r 's/.*(CVE-[0-9]{4}-[0-9]{4,}).*/\1/g' | sort > /tmp/cisa_ids.txt

Then compare the two lists, looking for any IDs that are on both lists

comm -1 -2 /tmp/cisa_ids.txt /tmp/scan_ids.txt

The “comm” utility when run with the “-1 -2” flags only returns things it finds in both lists. This command will return the overlap between the vulnerabilities found by Grype and those on the CISA list. If the container doesn’t contain any CVE IDs on the CISA list, then nothing is returned.

Users of Anchore Enterprise can take advantage of a pre-built, curated CISA policy pack that will scan container images and identify any vulnerabilities found that are on the CISA list.

Download the CISA policy pack for Anchore Enterprise here.

Once downloaded, Anchore customers can upload the policy pack to Anchore Enterprise by selecting the Policy Bundles tab as seen below:

Anchore policy tab

Next, upload the policy pack by selecting the Paste Bundle button.

Upload policy bundle to Anchore

If done correctly, you should see something very similar to what is depicted below, where you can see the raw json file loaded into the policy editor:

Loaded policy bundle

Lastly, activate by clicking the radio button for the bundle, so that it can be used in your CI/CD pipelines and/or runtime scans to detect the relevant CVEs from the CISA catalog that are specified within the policy.

Activate a policy on Anchore

You can now see the results generated by the CISA policy pack against any of your images, as demonstrated below against an image that contains Apache Struts vulnerabilities that are included within the CISA vulnerability list.

Policy results

From here, you can easily generate automated reports listing which CVEs from the CISA policy exist within your environments.

Looking ahead

Organizations should expect new vulnerabilities to be added to the CISA catalog in the future. Attackers are always changing tactics, finding new ways to exploit existing vulnerabilities, and finding new vulnerabilities. Security is a moving target and security teams must remain vigilant. Anchore will continue to follow the guidance coming out of organizations such as CISA and enable customers and users to take action to secure their environments based on that guidance.