The NVD Enrichment Crisis: One Year Later—How Anchore is Filling the Vulnerability Data Gap

About one year ago, Anchore’s own Josh Bressers broke the story that NVD (National Vulnerability Database) was not keeping up with its vulnerability enrichment. This week, we sat down with Josh to see how things are going.

> Josh, can you tell our readers what you mean when you say NVD stopped enriching data?

Sure! When people or organizations disclose a new security vulnerability, it’s often just a CVE (Common Vulnerabilities and Exposures) number (like CVE-2024-1234) and a description. 

Historically, NVD would take this data, and NVD analysts would add two key pieces of information: the CPEs (Common Platform Enumerations), which are meant to identify the affected software, and the CVSS (Common Vulnerability Scoring System) score, which is meant to give users of the data a sense of how serious the vulnerability is and how it can be exploited. 
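To make that concrete, here is a simplified sketch of what enrichment adds (illustrative only, not NVD's exact schema; the CVE number and product names are made up). The CPE string identifies the affected product and version, and the CVSS vector records how the severity score was derived:

"cve": "CVE-2024-1234",
"cpes": ["cpe:2.3:a:examplevendor:exampleapp:1.2.3:*:*:*:*:*:*:*"],
"cvss": {
  "baseScore": 9.8,
  "severity": "CRITICAL",
  "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"
}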

For many years, NVD kept up pretty well. Then, in March 2024, they stopped.

> That sounds bad. Were they able to catch up?

Not really. 

One of the problems they face is that the number of CVEs in existence is growing exponentially. They were having trouble keeping up in 2024, but 2025 is producing new CVEs even faster than 2024 did, and on top of that they still have the backlog of CVEs that weren't enriched during 2024.

It seems unlikely that they can catch up at this point.

Graph showing how few CVE IDs are being enriched with matching data since April 2024.

Graph showing the number of total CVEs (green) and the number of enriched CVEs (red). "The line slopes say it all"—NVD is behind and the number of unreviewed CVEs is growing.

> So what’s the upshot here? Why should we care that NVD isn’t able to enrich vulnerabilities?

Well, there are basically two problems with NVD not enriching vulnerabilities. 

First, if CVEs aren't enriched with CPEs, there's no machine-readable way to know what software they affect. In other words, part of the work NVD was doing was writing down, in a machine-readable way, what software (or hardware) is affected, which is what lets vulnerability scanners and other software tell which components are affected.

The loss of this is obviously bad. It means that there is a big pile of security flaws that are public—meaning that threat actors know about them—but security teams will have a harder time detecting them. Un-enriched CVEs are not labeled with CPEs, so programmatic analysis is off the table and teams will have to fall back to manual review.

Second, enrichment of CVEs is supposed to add a CVSS score—essentially a severity level—to CVEs. CVSS isn’t perfect, but it does allow organizations to say things like, “this vulnerability is very easy to exploit, so we need to get it fixed before this other CVE which is very hard to exploit.” Without CVSS or something like it, these tradeoffs are much harder for organizations to make.

> And this has been going on for more than a year? That sounds bad. What is Anchore doing to keep their customers safe?

The first thing we needed was a place where we could pick up some of the slack NVD has left. To do this, we created a public database of our own CVE enrichment data. This means that, when major CVEs are disclosed, we can enrich them ahead of NVD, so that our scanning tools (both Grype and Anchore Secure) are able to detect vulnerable packages—even if NVD never has the resources to look into that particular CVE.

Additionally, because NVD severity scores are becoming less reliable and less available, we’ve built a prioritization algorithm into Anchore Secure that allows customers to keep doing the kind of triaging they used to rely on NVD CVSS for.

> Is the vulnerability enhancement data publicly available?

Yes, the data is publicly available. 

Also, the process for changing it is out in the open. One of the more frustrating things about working with NVD enrichment was that sometimes they would publish an enrichment with really bad data and then all you could do was email them—sometimes they would fix it right away and sometimes they would never get to it.

With Anchore’s open vulnerability data, anyone in the community can review and comment on these enrichments.

> So what are your big takeaways from the past year?

I think the biggest takeaway is that we can still do vulnerability matching. 

We’re pulling together our own public vulnerability database, plus data feeds from various Linux distributions and of course GitHub Security Advisories to give our customers the most accurate vulnerability scan we can. In many ways, reducing our reliance on NVD CPEs has improved our matching (see this post, for example).

The other big takeaway is that, because so much of our data and tooling are open source, the community can benefit from and help with our efforts to provide the most accurate security tools in the world.

> What can community members do to help?

Well, first off, if you’re really interested in vulnerability data or have expertise with the security aspects of specific open source projects/operating systems, head on over to our vulnerability enhancement repo or start contributing to the tools that go into our matching like Syft, Grype, and Vunnel.

But the other thing to do, and I think more people can do this, is just use our open source tools!

File issues when you find things that aren’t perfect. Ask questions on our forum.

And of course, when you get to the point that you have dozens of folders full of Syft SBOMs and tons of little scripts running Grype everywhere—call us—and we can let Anchore Enterprise take care of that for you.



Grype Support for Azure Linux 3 released

On September 26, 2024, the OSS team at Anchore released general support for Azure Linux 3, Microsoft's new cloud-focused Linux distribution. This blog post will share some of the technical details of what goes into supporting a new Linux distribution in Grype.

Step 1: Make sure Syft identifies the distro correctly

In this case, this step happened automatically. Syft is pretty smart about parsing /etc/os-release in an image, and Microsoft has labeled Azure Linux in a standard way. Even before this release, if you’d run the following command, you would see Azure Linux 3 correctly identified.

syft -q -o json mcr.microsoft.com/azurelinux/base/core:3.0 | jq .distro
{
  "prettyName": "Microsoft Azure Linux 3.0",
  "name": "Microsoft Azure Linux",
  "id": "azurelinux",
  "version": "3.0.20241005",
  "versionID": "3.0",
  "homeURL": "https://aka.ms/azurelinux",
  "supportURL": "https://aka.ms/azurelinux",
  "bugReportURL": "https://aka.ms/azurelinux"
}
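
For reference, the os-release data Syft is parsing looks roughly like this (abridged and reconstructed from the output above, so treat the exact contents as illustrative):

NAME="Microsoft Azure Linux"
VERSION="3.0.20241005"
ID=azurelinux
VERSION_ID="3.0"
PRETTY_NAME="Microsoft Azure Linux 3.0"
HOME_URL="https://aka.ms/azurelinux"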

Step 2: Build a vulnerable image

You can’t test a vulnerability scanner without an image that has known vulnerabilities in it. So just about the first thing to do is make a test image that is known to have some problems.

In this case, we started with Azure’s base image and intentionally installed an old version of the golang RPM:

FROM mcr.microsoft.com/azurelinux/base/core:3.0@sha256:9c1df3923b29a197dc5e6947e9c283ac71f33ef051110e3980c12e87a2de91f1

RUN tdnf install -y golang-1.22.5-1.azl3

This has a couple of CVEs against it, so we can use it to test whether Grype is working end to end.

$ docker build -t azuretest:latest .
$ docker image save azuretest:latest > azuretest.tar
$ grype ./azuretest.tar
  Parsed image sha256:49edd6d1eff19d2b34c27a6ad11a4a8185d2764ae1182c17c563a597d173b8
  Cataloged contents e649de5ff4361e49e52ecdb8fe8acb854cf064247e377ba92669e7a33a228a00
   ├──  Packages                        [122 packages]
   ├──  File digests                    [11,141 files]
   ├──  File metadata                   [11,141 locations]
   └──  Executables                     [426 executables]
  Scanned for vulnerabilities     [84 vulnerability matches]
   ├── by severity: 3 critical, 57 high, 3 medium, 0 low, 0 negligible (21 unknown)
   └── by status:   84 fixed, 0 not-fixed, 0 ignored
NAME          INSTALLED      FIXED-IN         TYPE       VULNERABILITY   SEVERITY
coreutils     9.4-3.azl3     0:9.4-5.azl3     rpm        CVE-2024-0684   Medium
curl          8.8.0-1.azl3   0:8.8.0-2.azl3   rpm        CVE-2024-6197   High
curl-libs     8.8.0-1.azl3   0:8.8.0-2.azl3   rpm        CVE-2024-6197   High
expat         2.6.2-1.azl3   0:2.6.3-1.azl3   rpm        CVE-2024-45492  High
expat         2.6.2-1.azl3   0:2.6.3-1.azl3   rpm        CVE-2024-45491  High
expat         2.6.2-1.azl3   0:2.6.3-1.azl3   rpm        CVE-2024-45490  High
expat-libs    2.6.2-1.azl3   0:2.6.3-1.azl3   rpm        CVE-2024-45492  High
expat-libs    2.6.2-1.azl3   0:2.6.3-1.azl3   rpm        CVE-2024-45491  High
expat-libs    2.6.2-1.azl3   0:2.6.3-1.azl3   rpm        CVE-2024-45490  High
golang        1.22.5-1.azl3  0:1.22.7-2.azl3  rpm        CVE-2023-29404  Critical
golang        1.22.5-1.azl3  0:1.22.7-2.azl3  rpm        CVE-2023-29402  Critical
golang        1.22.5-1.azl3  0:1.22.7-2.azl3  rpm        CVE-2022-41722  High
krb5          1.21.2-1.azl3  0:1.21.3-1.azl3  rpm        CVE-2024-37371  Critical

Normally, we like to build test images with CVEs from 2021 or earlier, because that set of vulnerabilities changes slowly. But hats off to the team at Microsoft: we could not find an easy way to get a three-year-old vulnerability into their distro. So, in this case, the team did some behind-the-scenes work as part of this release to make it easier to add test images that only have newer vulnerabilities.

Step 3: Write the vunnel provider

Vunnel is Anchore's "vulnerability funnel," the open-source project that downloads vulnerability data from many different sources, then collects and normalizes it so that Grype can match against it. This step was pretty straightforward because Microsoft publishes complete and up-to-date OVAL XML, so the Vunnel provider can just download it, parse it into our own format, and pass it along.
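
The real provider lives in the Vunnel repo, but as a rough illustration of that flow (not the actual Vunnel provider API; the feed URL and XML field details here are assumptions), a minimal standalone version might look something like this:

# Illustrative sketch only: download an OVAL XML feed, pull a few fields out of
# each <definition>, and emit simplified, normalized records. Not Vunnel's API;
# the URL and exact XML structure are assumptions for the example.
import urllib.request
import xml.etree.ElementTree as ET

OVAL_NS = {"oval": "http://oval.mitre.org/XMLSchema/oval-definitions-5"}
FEED_URL = "https://example.com/azurelinux-3.0-oval.xml"  # placeholder URL

def fetch_oval(url: str) -> ET.Element:
    """Download the OVAL document and return the parsed XML root."""
    with urllib.request.urlopen(url) as resp:
        return ET.fromstring(resp.read())

def normalize(root: ET.Element) -> list[dict]:
    """Turn each OVAL <definition> into a small, scanner-friendly record."""
    records = []
    for definition in root.findall(".//oval:definitions/oval:definition", OVAL_NS):
        metadata = definition.find("oval:metadata", OVAL_NS)
        if metadata is None:
            continue
        title = metadata.findtext("oval:title", default="", namespaces=OVAL_NS)
        cves = [
            ref.get("ref_id")
            for ref in metadata.findall("oval:reference", OVAL_NS)
            if ref.get("source") == "CVE"
        ]
        records.append({"title": title, "cves": cves})
    return records

if __name__ == "__main__":
    for record in normalize(fetch_oval(FEED_URL)):
        print(record)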

Step 4: Wire it up in Grype and scan away

Now that Syft identifies the distro, we have test images in our CI/CD pipelines to make sure we don't regress, and Vunnel is downloading the Azure Linux 3 vulnerability data from Microsoft, we're ready to release the Grype change. In this case, it was a simple change telling Grype where to look in its database for vulnerabilities affecting the new distro.

Conclusion

There are two big upshots of this post:

First, anyone running Grype v0.81.0 or later can scan images built from Azure Linux 3 and get accurate vulnerability information today, for free.
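
For example, pointing Grype at the public base image (or at any image built on top of it) is all it takes:

grype mcr.microsoft.com/azurelinux/base/core:3.0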

Second, Anchore’s tools make it possible to add a new Linux distro to Syft and Grype in just a few pull requests. All the work we did for this support was open source – you can go read the pull requests on GitHub if you’d like (vunnel, grype-db, grype, test-images). And that means that if your favorite Linux distribution isn’t covered yet, you can let us know or send us a PR.

If you’d like to discuss any topics this post raises, join us on discourse.

Who watches the watchmen? Introducing yardstick validate

Grype scans images for vulnerabilities, but who tests Grype? If Grype does or doesn’t find a given vulnerability in a given artifact, is it right? In this blog post, we’ll dive into yardstick, an open-source tool by Anchore for comparing the results of different vulnerability scans, both against each other and against data hand-labeled by security researchers.

Quality Gates

In Anchore’s CI pipelines, we have a concept we call a “quality gate.” A quality gate’s job is to ensure each change to each of our tools results in matching at least as good as the version before the change. To talk about quality gates, we need a couple of terms:

  • Reference tool configuration, or just “reference tool” for short – this is an invocation of the tool (Grype, in this case) as it works today, without the change we are considering making
  • Candidate tool configuration, or just "candidate tool" for short – this is an invocation of the tool with the change we're trying to verify. Maybe we changed Vunnel, or the Grype source code itself, for example.
  • Test images are images that Anchore has built that are known to have vulnerabilities
  • Label data is data our researchers have labeled, essentially writing down, “for image A, for package B at version C, vulnerability X is really present (or is not really present)”

The important thing about the quality gate is that it’s an experiment – it changes only one thing to test the hypothesis. The hypothesis is always, “the candidate tool is as good or better than the reference tool,” and the one thing that’s different is the difference between the candidate tool and the reference tool. For example, if we’re testing a code change in Grype itself, the only difference between reference tool and candidate tool is the code change; the database of vulnerabilities will be the same for both runs. On the other hand, if we’re testing a change to how we build the database, the code for both Grypes will be the same, but the database used by the candidate tool will be built by the new code.

Now let’s talk through the logic of a quality gate:

  1. Run the reference tool and the candidate tool to get reference matches and candidate matches
  2. If both tools found nothing, the test is invalid. (Remember we’re scanning images that intentionally have vulnerabilities to test a scanner.)
  3. If both tools find exactly the same vulnerabilities, the test passes, because the candidate tool can’t be worse than the reference tool if they find the same things
  4. Finally, if both the reference tool and the candidate tool find at least one vulnerability, but not the same set of vulnerabilities, then we need to do some matching math

Matching Math: Precision, Recall, and F1

The first math problem we do is easy: Did we add too many False Negatives? (Usually one is too many.) For example, if the reference tool found a vulnerability, and the candidate tool didn’t, and the label data says it’s really there, then the gate should fail – we can’t have a vulnerability matcher that misses things we know about!

The second math problem is also pretty easy: did we leave too many matches unlabeled? If so, we can't do a comparison: if the reference tool and the candidate tool both find a lot of vulnerabilities but we don't know which of them are really present, we can't say which set of results is better. So the gate should fail, and the engineer making the change will go and label more vulnerabilities.

Now, we get to the harder math. Let's say the reference tool and the candidate tool both find vulnerabilities, but not exactly the same ones, and the candidate tool doesn't introduce any false negatives. Suppose it does introduce a false positive or two, but it also fixes false positives and false negatives that the reference tool got wrong. Is it better? Now we have to borrow some math from science class:

  • Precision is the fraction of matches that are true positives. So if one of the tools found 10 vulnerabilities, and 8 of them are true positives, the precision is 0.8.
  • Recall is the fraction of vulnerabilities that the tool found. So if there were 10 vulnerabilities present in the image and Grype found 9 of them, the recall is 0.9.
  • F1 score is a calculation based on precision and recall that tries to reward high precision and high recall, while penalizing low precision and penalizing low recall. I won’t type out the calculation but you can read about it on Wikipedia or see it calculated in yardstick’s source code.
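
Since the list above stops short of the formula, here is a minimal sketch of the gate math in Python (illustrative only, not yardstick's actual implementation): matches are treated as (package, vulnerability) pairs, and the label data tells us which pairs are really present.

# Minimal, illustrative sketch of the quality-gate math (not yardstick's code).
# A "match" is a (package, vulnerability) pair; truly_present is the set of
# pairs the label data says are really in the image.

def precision_recall_f1(matches: set, truly_present: set) -> tuple[float, float, float]:
    true_positives = len(matches & truly_present)
    precision = true_positives / len(matches) if matches else 0.0
    recall = true_positives / len(truly_present) if truly_present else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def gate_passes(reference: set, candidate: set, truly_present: set,
                max_new_false_negatives: int = 0,
                max_f1_regression: float = 0.1) -> bool:
    # Check 1: the candidate must not miss real vulnerabilities the reference found.
    new_false_negatives = (reference & truly_present) - candidate
    if len(new_false_negatives) > max_new_false_negatives:
        return False
    # Check 2 (too many unlabeled matches) is omitted here; it needs the full
    # label data set, not just the "truly present" subset.
    # Check 3: the candidate's F1 score must not regress too far.
    _, _, reference_f1 = precision_recall_f1(reference, truly_present)
    _, _, candidate_f1 = precision_recall_f1(candidate, truly_present)
    return candidate_f1 >= reference_f1 - max_f1_regression

# Example: ten real vulnerabilities; the candidate fixes one false positive
# that the reference reported, and otherwise matches the reference exactly.
truly_present = {("some-pkg", f"CVE-2021-{n:04}") for n in range(10)}
reference = truly_present | {("some-pkg", "CVE-2021-9999")}
candidate = set(truly_present)
print(gate_passes(reference, candidate, truly_present))  # True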

So what's new in yardstick?

Recently, the Anchore OSS team released the yardstick validate subcommand. This subcommand encapsulates the work above in a single command, centralizing a bunch of test Python code that had been spread across the different OSS repositories.

Now, to add a quality gate with a set of images, we just need to add some yaml like:

pr_vs_latest_via_sbom_2022:
    description: "same as 'pr_vs_latest_via_sbom', but includes vulnerabilities from 2022 and before, instead of 2021 and before"
    max_year: 2022
    validations:
      - max-f1-regression: 0.1 # allowed to regress 0.1 on f1 score
        max-new-false-negatives: 10
        max-unlabeled-percent: 0
        max_year: 2022
        fail_on_empty_match_set: false
    matrix:
      images:
        - docker.io/anchore/test_images:azurelinux3-63671fe@sha256:2d761ba36575ddd4e07d446f4f2a05448298c20e5bdcd3dedfbbc00f9865240d
      tools:
        - name: syft
          # note: we want to use a fixed version of syft for capturing all results (NOT "latest")
          version: v0.98.0
          produces: SBOM
          refresh: false
        - name: grype
          version: path:../../+import-db=db.tar.gz
          takes: SBOM
          label: candidate # is candidate better than the current baseline?
        - name: grype
          version: latest+import-db=db.tar.gz
          takes: SBOM
          label: reference # this run is the current baseline

We think this change will make it easier to contribute to Grype and Vunnel. We know it helped out in the recent work to release Azure Linux 3 support.

If you’d like to discuss any topics this post raises, join us on discourse.