Generating a software bill of materials (SBOM) as part of your DevSecOps process is an essential step to secure your software supply chain. SBOMs are becoming critical due to the growing prominence of catastrophic supply chain attacks like XZ Utils, Log4j and CUPS.
SBOMs are a comprehensive record of every software component in an application, along with critical metadata such as supplier, licensing, and security details. They serve as the foundational data structure for security, OSS license compliance, and risk management use-cases, among many others across a DevSecOps organization.
Importantly, SBOMs are needed to comply with the White House Executive Order (EO) 14028 and the recently ratified Cyber Resilience Act in the European Union.
Fortunately, there are a number of free, open source tools that generate SBOMs. Generating your first one only takes a few simple steps:
syft <source> -o <format>
If you need instant gratification, check out our one minute tutorial. It covers generating an SBOM and scanning it for vulnerabilities:
There are many tools available for generating SBOMs, so the first thing you’ll need to do is pick one to use. Below is a TL;DR of the most impactful evaluation criteria:
A full-length discussion of each of these points can be found at the end of this article.
Some of the most popular SBOM tools are:
cdxgen by OWASP
If you are looking for a comprehensive list of open source SBOM generation tools in the wild, we track the complete list in our open source SBOM eBook. For this article, we’ll focus on Syft. It is easy to use in many different scenarios and supports a variety of ecosystems. Syft can run on your desktop, in CI systems, or as a Docker container, and it can scan a wide variety of ecosystems, from Linux distributions to many types of build dependency specifications.
Additionally, Syft is the recommended software composition analysis (SCA) tool for generating SBOMs in NVIDIA’s AI Blueprint for Vulnerability Analysis.
Enough with the prelude! Let’s get into the meat of how to generate an SBOM. We’ll walk you through the process below:
The first thing to do is install Syft. There are a number of ways to do this:
curl
The recommended method to get Syft for macOS and Linux is by using curl:
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin
For macOS, Syft is available to install from Homebrew:
brew install syft
You can directly download Syft binaries for many platforms including Windows from the GitHub releases page.
There is also a Syft Docker image with every release, anchore/syft, which can be run like this:
docker run -it --rm anchore/syft <args>
To confirm Syft was installed correctly, simply run:
syft version
Which will produce output similar to:
Application: syft
Version: 1.4.1
BuildDate: 2024-05-09T19:45:46Z
GitCommit: c200896a9644f9b6bd4bc3785c848276c33bb53c
GitDescription: v1.4.1
Platform: darwin/arm64
GoVersion: go1.21.9
Compiler: gc
Note: Syft was version 1.4.1 at the time of this writing
Once you have Syft installed, creating your first SBOM is simple. Syft supports multiple sources to scan when generating an SBOM using both the local filesystem and container images.
To generate an SBOM for a Docker or OCI image, even without a Docker daemon, simply run:
syft <image>
Syft can generate SBOMs from a variety of other sources, such as a filesystem directory or a specific file, Podman, tar archives, or directly from an OCI registry even when Docker is not available. Check out the full list of sources.
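If you script your scans, it can help to name the source type explicitly with a scheme prefix. The sketch below only builds the Syft command lines in Python; it does not run them. The scheme names reflect Syft's source syntax as we understand it, so confirm them against `syft --help` for your version. The helper name `syft_command` and the example paths are hypothetical.

```python
import shlex

# Source scheme prefixes (an assumption based on Syft's documented sources;
# verify with `syft --help` for the version you have installed).
SCHEMES = {
    "docker", "podman", "registry",
    "docker-archive", "oci-archive", "oci-dir",
    "dir", "file",
}


def syft_command(source, scheme=None, output_format="table"):
    """Return the argv list for a Syft scan. Nothing is executed here."""
    if scheme is not None:
        if scheme not in SCHEMES:
            raise ValueError(f"unknown source scheme: {scheme}")
        source = f"{scheme}:{source}"
    return ["syft", source, "-o", output_format]


examples = [
    syft_command("alpine:latest"),                         # let Syft detect the source type
    syft_command("./my-project", scheme="dir"),            # hypothetical directory
    syft_command("./image.tar", scheme="docker-archive"),  # hypothetical tar archive
    syft_command("alpine:latest", scheme="registry", output_format="spdx-json"),
]

for argv in examples:
    # shlex.join renders the argv list as a copy-pasteable shell command
    print(shlex.join(argv))
```

Keeping the command as a list rather than a string avoids shell-quoting surprises if you later pass it to `subprocess.run`.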
To scan the latest Alpine image, simply run:
syft alpine:latest
You should see output similar to this:
✔ Loaded image alpine:latest
✔ Parsed image sha256:ace17d5d883e9ea5a21138d0608d…
✔ Cataloged contents 064d96b7a2fc6398b4c596e2a693fcc961a…
├── ✔ Packages [15 packages]
├── ✔ File digests [80 files]
├── ✔ File metadata [80 locations]
└── ✔ Executables [17 executables]
NAME VERSION TYPE
alpine-baselayout 3.4.3-r2 apk
alpine-baselayout-data 3.4.3-r2 apk
alpine-keys 2.4-r1 apk
apk-tools 2.14.0-r5 apk
busybox 1.36.1-r15 apk
busybox-binsh 1.36.1-r15 apk
ca-certificates-bundle 20230506-r0 apk
libc-utils 0.7.2-r5 apk
libcrypto3 3.1.4-r5 apk
libssl3 3.1.4-r5 apk
musl 1.2.4_git20230717-r4 apk
musl-utils 1.2.4_git20230717-r4 apk
scanelf 1.3.7-r2 apk
ssl_client 1.36.1-r15 apk
zlib 1.3.1-r0 apk
By default, the SBOM you’ll see will be a nicely formatted table rather than any standardized SBOM format, which leads us to…
Depending on your use-cases, it may be important to use a particular SBOM format. The most popular ones are Software Package Data Exchange (SPDX) and CycloneDX, both of which Syft supports. Syft also has a lossless intermediate format which interoperates with Anchore’s open source vulnerability scanner, Grype. If you’re looking for more detailed information on SBOM formats, standards and examples follow the link.
While Syft supports these different formats, they have slightly different goals and features. It may be important to pick SPDX or CycloneDX for interoperability with other tools or as a standardized format to distribute to downstream consumers.
If your use-case requires an SBOM in SPDX format, Syft has you covered. SPDX has been around the longest of all the formats mentioned here. There are multiple variants of SPDX. Syft supports SPDX Tag-Value (spdx-tag-value) and SPDX JSON (spdx-json). For SPDX JSON, simply add the -o spdx-json argument. For example, run:
syft alpine:latest -o spdx-json
You’ll see there is a lot more data than the table view allows! You should see something resembling:
{
"spdxVersion": "SPDX-2.3",
"dataLicense": "CC0-1.0",
"SPDXID": "SPDXRef-DOCUMENT",
"name": "alpine",
"...": "...",
"creationInfo": {
"licenseListVersion": "3.25",
"creators": [
"Organization: Anchore, Inc",
"Tool: syft-1.19.0"
],
"created": "2025-01-27T15:29:58Z"
},
"packages": [
{
"name": "alpine-baselayout",
"SPDXID": "SPDXRef-Package-apk-alpine-baselayout-421bc6506abee7e4",
"versionInfo": "3.6.8-r1",
"supplier": "Person: Natanael Copa (ncopa@alpinelinux.org)",
"...": "...",
"sourceInfo": "acquired package info from APK DB: /lib/apk/db/installed",
"licenseConcluded": "NOASSERTION",
"licenseDeclared": "GPL-2.0-only",
"copyrightText": "NOASSERTION",
"description": "Alpine base dir structure and init scripts",
"externalRefs": [
{
"referenceCategory": "SECURITY",
"referenceType": "cpe23Type",
"referenceLocator": "cpe:2.3:a:alpine-baselayout:alpine-baselayout:3.6.8-r1:*:*:*:*:*:*:*"
},
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:apk/alpine/alpine-baselayout@3.6.8-r1?arch=aarch64&distro=alpine-3.21.2"
}
]
},
]
}
Not only does this format contain the package names, but also Package URLs, license information, and a host of additional metadata, such as the location of associated files Syft identified within the package.
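To get a feel for that extra data, here is a minimal Python sketch that pulls the name, version, declared license, and Package URL out of an SPDX JSON document. The embedded sample is a hand-trimmed stand-in shaped like the output above, with the purl written out in full as an assumption. The function name `summarize_spdx` is our own, not part of Syft.

```python
import json

# Hand-trimmed sample shaped like the SPDX JSON output above (one package).
SAMPLE_SPDX = """
{
  "spdxVersion": "SPDX-2.3",
  "name": "alpine",
  "packages": [
    {
      "name": "alpine-baselayout",
      "versionInfo": "3.6.8-r1",
      "licenseDeclared": "GPL-2.0-only",
      "externalRefs": [
        {"referenceCategory": "SECURITY",
         "referenceType": "cpe23Type",
         "referenceLocator": "cpe:2.3:a:alpine-baselayout:alpine-baselayout:3.6.8-r1:*:*:*:*:*:*:*"},
        {"referenceCategory": "PACKAGE-MANAGER",
         "referenceType": "purl",
         "referenceLocator": "pkg:apk/alpine/alpine-baselayout@3.6.8-r1?arch=aarch64&distro=alpine-3.21.2"}
      ]
    }
  ]
}
"""


def summarize_spdx(document):
    """Return one dict per package: name, version, declared license, purl."""
    rows = []
    for pkg in document.get("packages", []):
        # The purl lives in externalRefs; take the first entry typed "purl"
        purl = next(
            (ref["referenceLocator"]
             for ref in pkg.get("externalRefs", [])
             if ref.get("referenceType") == "purl"),
            None,
        )
        rows.append({
            "name": pkg.get("name"),
            "version": pkg.get("versionInfo"),
            "license": pkg.get("licenseDeclared", "NOASSERTION"),
            "purl": purl,
        })
    return rows


rows = summarize_spdx(json.loads(SAMPLE_SPDX))
for row in rows:
    print(row["name"], row["version"], row["license"], row["purl"])
```

For a real SBOM, replace `json.loads(SAMPLE_SPDX)` with `json.load(...)` on the file that Syft wrote.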
Similarly, if you need to generate an SBOM in CycloneDX format, use a CycloneDX format option. Syft supports CycloneDX XML (cyclonedx-xml) and JSON (cyclonedx-json). For CycloneDX XML:
syft <source> -o cyclonedx-xml
To run this against the same latest Alpine image, run:
syft alpine:latest -o cyclonedx-xml
And you should see a result resembling this:
<?xml version="1.0" encoding="UTF-8"?>
<bom xmlns="http://cyclonedx.org/schema/bom/1.6" serialNumber="urn:uuid:ffc83bc6-8806-4331-b515-2d1962589034" version="1">
<metadata>
<timestamp>2025-01-27T10:49:00-05:00</timestamp>
<tools>
<components>
<component type="application">
<author>anchore</author>
<name>syft</name>
<version>1.19.0</version>
</component>
</components>
</tools>
<component bom-ref="327aecd176f7b31f" type="container">
<name>alpine</name>
<version>sha256:47badde288cf303fe43766ba3c0be01df313b84ad91480c1f21b7e907a7f2337</version>
</component>
</metadata>
<components>
<component bom-ref="pkg:apk/alpine/alpine-baselayout@3.6.8-r1?arch=aarch64&amp;distro=alpine-3.21.2&amp;package-id=421bc6506abee7e4" type="library">
<publisher>Natanael Copa &lt;ncopa@alpinelinux.org&gt;</publisher>
<name>alpine-baselayout</name>
<version>3.6.8-r1</version>
<description>Alpine base dir structure and init scripts</description>
<licenses>
<license>
<id>GPL-2.0-only</id>
</license>
</licenses>
<cpe>cpe:2.3:a:alpine-baselayout:alpine-baselayout:3.6.8-r1:*:*:*:*:*:*:*</cpe>
<purl>pkg:apk/alpine/alpine-baselayout@3.6.8-r1?arch=aarch64&amp;distro=alpine-3.21.2</purl>
<externalReferences>
<reference type="distribution">
<url>https://git.alpinelinux.org/cgit/aports/tree/main/alpine-baselayout</url>
</reference>
</externalReferences>
<properties>
<property name="syft:package:foundBy">apk-db-cataloger</property>
<property name="syft:package:type">apk</property>
<property name="syft:package:metadataType">apk-db-entry</property>
<property name="syft:cpe23">cpe:2.3:a:alpine-baselayout:alpine_baselayout:3.6.8-r1:*:*:*:*:*:*:*</property>
...
</properties>
</component>
...
</components>
</bom>
Again, there is a lot more data than the table allows, but a different set of data than the SPDX format because there simply is not a one-to-one mapping of properties between the two.
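If you want to read the CycloneDX XML programmatically, the main wrinkle is the XML namespace, which changes with the spec version. The following Python sketch reads the namespace from the root element instead of hard-coding it. The sample document is a hand-trimmed stand-in shaped like the output above, and `list_components` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-trimmed sample shaped like the CycloneDX XML output above.
SAMPLE_CDX = """
<bom xmlns="http://cyclonedx.org/schema/bom/1.6" version="1">
  <components>
    <component type="library">
      <name>alpine-baselayout</name>
      <version>3.6.8-r1</version>
      <licenses>
        <license>
          <id>GPL-2.0-only</id>
        </license>
      </licenses>
      <purl>pkg:apk/alpine/alpine-baselayout@3.6.8-r1?arch=aarch64&amp;distro=alpine-3.21.2</purl>
    </component>
  </components>
</bom>
"""


def list_components(xml_text):
    """Return name, version, purl, and license IDs for each top-level component."""
    root = ET.fromstring(xml_text.strip())
    if not root.tag.startswith("{"):
        raise ValueError("expected a namespaced CycloneDX <bom> root element")
    # ElementTree tags look like "{namespace}bom"; extract the namespace part
    ns = {"cdx": root.tag[1:root.tag.index("}")]}

    components = []
    for comp in root.findall("cdx:components/cdx:component", ns):
        components.append({
            "name": comp.findtext("cdx:name", default="", namespaces=ns),
            "version": comp.findtext("cdx:version", default="", namespaces=ns),
            "purl": comp.findtext("cdx:purl", default="", namespaces=ns),
            "licenses": [
                el.text
                for el in comp.findall("cdx:licenses/cdx:license/cdx:id", ns)
            ],
        })
    return components


components = list_components(SAMPLE_CDX)
for comp in components:
    print(comp["name"], comp["version"], comp["licenses"], comp["purl"])
```

The path `cdx:components/cdx:component` only matches components directly under the root, so the Syft tool entry inside the metadata block is skipped.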
The last format we’ll talk about is Syft’s own format. If there isn’t a need to provide an SBOM to other tools or you’re using a tool that supports Syft’s lossless intermediate format (e.g., Grype), then the Syft JSON format will deliver the highest fidelity data. Both SPDX and CycloneDX prune data fields from the metadata that Syft’s scan generates while the Syft lossless format does not.
Although Grype works great with SPDX and CycloneDX, converting to one of these formats can drop data that Grype’s matching uses. That would impair Grype’s ability to detect vulnerabilities, which is why we recommend the Syft lossless format.
To use the Syft JSON format, use the -o json argument.
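As a rough illustration of working with the Syft JSON format, the sketch below counts packages by type. It assumes the document has a top-level `artifacts` list whose entries carry `name`, `version`, and `type` fields, which matches Syft JSON as we understand it; check the schema for your Syft version. The sample entries reuse package names and versions from the table output earlier in this article.

```python
import json
from collections import Counter

# Stand-in document; a real one comes from `syft <source> -o json`.
# Assumption: packages are listed under a top-level "artifacts" key.
SAMPLE_SYFT_JSON = """
{
  "artifacts": [
    {"name": "alpine-baselayout", "version": "3.4.3-r2", "type": "apk"},
    {"name": "busybox", "version": "1.36.1-r15", "type": "apk"},
    {"name": "musl", "version": "1.2.4_git20230717-r4", "type": "apk"}
  ]
}
"""


def packages_by_type(document):
    """Count cataloged packages per package type (apk, deb, npm, ...)."""
    return Counter(
        artifact.get("type", "unknown")
        for artifact in document.get("artifacts", [])
    )


document = json.loads(SAMPLE_SYFT_JSON)
counts = packages_by_type(document)
for package_type, count in sorted(counts.items()):
    print(f"{package_type}: {count}")
```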
There’s a lot more that this open source SBOM tool can do. A few features of note:
Write the SBOM to a file with --file path/to/file
Exclude paths from a scan with --exclude path/**/*.txt
Configure Syft with a .syft.yaml file
See all Syft features and configuration options in the docs.
Now that you’ve got an SBOM, what’s next? A logical next step would be to integrate with your build pipeline to have SBOMs generated automatically. In fact, it can make sense to generate SBOMs at more than one point, such as at build time, after a container image is built, or during a release process.
The SBOMs then could be scanned for license compliance and continuously for vulnerabilities. In fact, if you are using GitHub Actions, there are a couple of actions to do just that: sbom-action to generate SBOMs using Syft and scan-action to perform vulnerability scanning. For a few repositories, it’s very simple to set these up, but it might be challenging when there are a lot of repositories to keep track of.
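If your CI system is not GitHub Actions, the same two steps can be scripted directly. Below is a minimal Python sketch of a build gate: generate an SBOM with Syft, then scan it with Grype and fail the build on critical findings. It assumes `syft` and `grype` are on the PATH and that the `-o json=<path>`, `sbom:<path>`, and `--fail-on` options behave as we understand them; the script name and function names are hypothetical.

```python
import subprocess
import sys


def pipeline_commands(image, sbom_path, fail_on="critical"):
    """Return the Syft and Grype argv lists for one build step."""
    # -o json=<path> writes the Syft JSON SBOM to a file (assumed syntax)
    generate = ["syft", image, "-q", "-o", f"json={sbom_path}"]
    # --fail-on makes Grype exit non-zero at or above the given severity
    scan = ["grype", f"sbom:{sbom_path}", "--fail-on", fail_on]
    return generate, scan


def run_pipeline(image, sbom_path):
    """Run both steps in order; return the first non-zero exit code, or 0."""
    for argv in pipeline_commands(image, sbom_path):
        result = subprocess.run(argv)
        if result.returncode != 0:
            return result.returncode
    return 0


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python sbom_gate.py <image> <sbom-output-path>
    sys.exit(run_pipeline(sys.argv[1], sys.argv[2]))
```

Saving the SBOM file as a build artifact means you can rescan it later, when new vulnerabilities are published, without rebuilding the image.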
As we’ve talked about, using SBOMs as a central part of securing your software supply chain is increasingly important. Integrating automated SBOM generation into your DevOps process is vital. Storing, managing, and analyzing those SBOMs to inform security measures should be an important consideration for you and your organization.
For more comprehensive SBOM management, an enterprise level solution like Anchore Enterprise will enable you to generate comprehensive SBOMs with every build, detect drift from one build to the next, share SBOMs internally or externally, and quickly identify risk such as vulnerabilities, secrets, malware, and misconfigurations.
As promised at the beginning of this article, below is a detailed discussion of the criteria to choose the best SBOM generation tool for your organization.
When choosing from the array of SBOM generation tools in the market, it is important to frame your decision with the outcome(s) that you are trying to achieve. If your goal is to improve the response time/mean time to remediation when the next Log4j-style incident occurs—and be sure that there will be a next time—an SBOM tool that excels at correctly identifying open source licenses in a code base won’t be the best solution for your use-case (even if you prefer its CLI ;-D).
What to Do:
It can be tempting to prioritize an SBOM generator that is best suited to our preferences and workflows; we are the ones that will be using the tool regularly—shouldn’t we prioritize what makes our lives easier? If we prioritize our needs above the goal of the initiative we might end up putting ourselves into a position where our choice in tools impedes our ability to recognize the desired outcome. Using the correct framing, in this case by focusing on the use-cases, will keep us focused on delivering the best possible outcome.
SBOMs can be utilized for numerous purposes: security incident response, open source license compliance, proactive vulnerability management, compliance reporting, or software supply chain risk management. We won’t address all use-cases/outcomes in this blog post; a more comprehensive treatment of the potential SBOM use-cases can be found on our website.
Example SBOM Use-Cases:
Pro tip: While you will inevitably leave many SBOM use-cases out of scope for your current project, keeping secondary use-cases in the back of your mind while making a decision on the right SBOM tool will set you up for success when those secondary use-cases eventually become a priority in the future.
SBOM generators aren’t just tools to ingest data and re-format it into a standardized format. They are typically paired with a software composition analysis (SCA) tool that scans an application/software artifact for metadata that will populate the final SBOM.
Support for the complete array of programming languages, build artifacts and operating system ecosystems is essentially an impossible task. This means that support varies significantly depending on the SBOM generator that you select. An SBOM generator’s ability to help you reach your organizational goals is directly related to its support for your organization’s software tooling preferences. This will likely be one of the most important qualifications when choosing between different options and will rule out many that don’t meet the needs of your organization.
Considerations:
This is one of the most important criteria when evaluating different SBOM tools. An SBOM generator may claim support for a particular programming language but after testing the scanner you may discover that it returns an SBOM with only direct dependencies—honestly not much better than a package.json or go.mod file that your build process spits out.
Two different tools might both generate a valid SPDX SBOM document when run against the same source artifact, but the content of those documents can vary greatly. This variation depends on what the tool can inspect, understand, and translate. Being able to fully scan an application for both direct and transitive dependencies, as well as navigate non-idiomatic patterns for how software can be structured, ends up being the true differentiator among SBOM generation contenders.
Imagine using two SBOM tools on a Debian package. One tool recognizes Debian packages and includes detailed information about them in the SBOM. The other can’t fully parse the Debian .deb format and omits critical information. Both produce an SBOM, but only one provides the data you need to power use-case based outcomes like security incident response or proactive vulnerability management.
Let’s make this example more concrete by simulating this difference with Syft, Anchore’s open source SBOM generation tool:
$ syft -q -o spdx-json nginx:latest > nginx_a.spdx.json
$ grype -q nginx_a.spdx.json | grep Critical
libaom3 3.6.0-1+deb12u1 (won't fix) deb CVE-2023-6879 Critical
libssl3 3.0.14-1~deb12u2 (won't fix) deb CVE-2024-5535 Critical
openssl 3.0.14-1~deb12u2 (won't fix) deb CVE-2024-5535 Critical
zlib1g 1:1.2.13.dfsg-1 (won't fix) deb CVE-2023-45853 Critical
In this example, we first generate an SBOM using Syft then run it through Grype—our vulnerability scanning tool. Syft + Grype uncover 4 critical vulnerabilities.
Now let’s try the same thing but “simulate” an SBOM generator that can’t fully parse the structure of the software artifact in question:
$ syft -q -o spdx-json --select-catalogers "-dpkg-db-cataloger,-binary-classifier-cataloger" nginx:latest > nginx_b.spdx.json
$ grype -q nginx_b.spdx.json | grep Critical
In this case, none of the critical vulnerabilities found in the first run are reported.
This highlights the importance of careful evaluation of the SBOM generator that you decide on. It could mean the difference between effective vulnerability risk management and a security incident.
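When you evaluate two generators side by side, comparing their package lists directly is often more revealing than comparing scan results. Here is a small Python sketch that reports which packages one SPDX JSON SBOM contains that another is missing. The two documents are hand-written stand-ins, not real tool output; they reuse package names and versions from the Grype results above. The function names are our own.

```python
def package_keys(spdx_document):
    """Return the set of (name, version) pairs in an SPDX JSON document."""
    return {
        (pkg.get("name"), pkg.get("versionInfo"))
        for pkg in spdx_document.get("packages", [])
    }


def missing_packages(reference, candidate):
    """Packages present in `reference` but absent from `candidate`, sorted."""
    return sorted(package_keys(reference) - package_keys(candidate))


# Stand-in data: tool A found three packages, tool B only one.
sbom_a = {"packages": [
    {"name": "libssl3", "versionInfo": "3.0.14-1~deb12u2"},
    {"name": "openssl", "versionInfo": "3.0.14-1~deb12u2"},
    {"name": "zlib1g", "versionInfo": "1:1.2.13.dfsg-1"},
]}
sbom_b = {"packages": [
    {"name": "openssl", "versionInfo": "3.0.14-1~deb12u2"},
]}

missed = missing_packages(sbom_a, sbom_b)
for name, version in missed:
    print(f"missing from B: {name} {version}")
```

To compare the files from the example above, load `nginx_a.spdx.json` and `nginx_b.spdx.json` with `json.load` and pass the resulting dictionaries to `missing_packages`.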
If the SBOM generator is packaged as a self-contained binary with a command line interface (CLI) then it should tick this box. CI/CD build tools are most amenable to this deployment model. If the SBOM generation tool in question isn’t a CLI then it should at least run as a server with an API that can be called as part of the build process.
Integrating with an organization’s DevSecOps pipeline is key to enable a scalable SBOM generation process. By implementing SBOM creation directly into the existing build tooling, organizations can leverage existing automation tools to ensure consistency and efficiency which are necessary for achieving the desired outcomes.
Using an open source SBOM tool is considered an industry best practice because it guards against the risks associated with vendor lock-in. As a bonus, the ecosystem for open source SBOM generation tooling is very healthy. Open source tools tend to have an advantage over proprietary ones in ecosystem coverage and data quality because they reach more users, which creates a feedback loop that closes gaps in coverage or quality.
Finally, even if your organization decides to utilize a software supply chain security product that has its own proprietary SBOM generator, it is still better to create your SBOMs with an open source SBOM generator, export to a standardized format (e.g., SPDX or CycloneDX) then have your software supply chain security platform ingest these non-proprietary data structures. All platforms will be able to ingest SBOMs from one or both of these standards-based formats.
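A platform that ingests SBOMs from several generators first has to work out which format it was handed. The Python sketch below shows one simple way to do that for JSON documents by checking for each format's marker fields. The `spdxVersion` and `bomFormat` keys come from the SPDX 2.x and CycloneDX JSON formats; the Syft JSON check (`artifacts` plus `schema`) is an assumption based on Syft's output as we understand it.

```python
def detect_sbom_format(document):
    """Guess the SBOM format of a parsed JSON document from its top-level keys."""
    if "spdxVersion" in document:
        return "spdx"
    if document.get("bomFormat") == "CycloneDX":
        return "cyclonedx"
    # Assumption: Syft JSON carries both an "artifacts" list and a "schema" block
    if "artifacts" in document and "schema" in document:
        return "syft-json"
    return "unknown"


samples = {
    "spdx": {"spdxVersion": "SPDX-2.3", "packages": []},
    "cyclonedx": {"bomFormat": "CycloneDX", "specVersion": "1.6", "components": []},
    "syft-json": {"artifacts": [], "schema": {"version": "example"}},
    "unknown": {"hello": "world"},
}

for expected, doc in samples.items():
    print(expected, "->", detect_sbom_format(doc))
```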
Now that you understand the many reasons to generate SBOMs, whether for compliance or vulnerability analysis, you can see that using Syft to generate them is a flexible and simple process with many options to tailor SBOMs to your specific use-cases.
If you’d like to explore using Anchore Enterprise for its robust features like continuous visibility, SBOM monitoring, drift detection, and policy enforcement then access a free 15 day trial here.