Early in 2020 we started work on Syft, an SBOM generator that is easy to install and use and supports a variety of SBOM formats. In late September of 2020 we released the first version of Syft with seven ecosystems supported. Since then we’ve had 168 releases made up of the work from 134 contributors, collectively adding an additional 18 ecosystems -- a total of over 40 package catalogers.
Today we’re going one step further: we’re thrilled to announce the release of Syft v1.0.0! 🎉
What is Syft?
At a high level, Syft answers the seemingly simple question: “what is in my application?”. To a finer point, Syft is a standalone CLI tool and library that scans container images and file systems in order to generate an SBOM: the Software Bill of Materials that describes what software components were found.
While Syft’s capability to generate a detailed SBOM is useful in its own right, we consider Syft to be a foundational tool that supports both our growing list of open-source offerings as well as dozens of other OSS projects. Ultimately it delivers a way for users to both generate SBOM material for their projects as well as use those same SBOMs for other important functions such as vulnerability scanning, OSS license reporting, tracking components at various stages of the application lifecycle, and other custom use cases.
Some of the important dimensions of Syft are:
- Sources: types of software applications and artifacts that can be scanned by Syft to produce an SBOM, such as docker container images, source code directories, and more.
- Catalogers: functions that are implemented for a given type of software artifact / packaging ecosystem, such as Java JAR files, RPMs, Go modules, and more.
- Output Formats: interchangeable formats to put the SBOM material into, post-SBOM generation, such as the Syft-native JSON, SPDX (multiple versions), CycloneDX (multiple versions), and user-customizable formats via templating.
Over time, we’ve seen the list of sources, catalogers, and output formats continue to grow, and we are looking forward to adding even more capabilities as the project continues forward.
Capabilities of Syft
To start with, Syft needs to know how to read what you're giving it. There are a number of different types of sources Syft supports: directories, individual files, archives, and - of course - container images! And there are a lot of options such as letting you choose what scope you want to be cataloged within an image. By default we consider a squashed layer scope, which is most like what the filesystem will look like when a container is created from the image. But we also allow for looking at all layers within the container image, which would include contents that are distributed but not available when a container is created. In future versions we want to expand these selections to support even more use cases.
When scanning a source we have several catalogers scouring for packaging artifacts, covering several ecosystems:
- Alpine (apk)
- C/C++ (conan)
- Dart (pubs)
- Debian (dpkg)
- Dotnet (deps.json)
- Objective-C (cocoapods)
- Elixir (mix)
- Erlang (rebar3)
- Go (go.mod, Go binaries)
- Haskell (cabal, stack)
- Java (jar, ear, war, par, sar, nar, native-image)
- JavaScript (npm, yarn)
- Jenkins Plugins (jpi, hpi)
- Linux kernel module and archives (ko & vmlinz)
- Nix (outputs in /nix/store)
- PHP (composer)
- Python (wheel, egg, poetry, requirements.txt)
- Red Hat (rpm)
- Ruby (gem)
- Rust (cargo.lock)
- Swift (cocoapods, swift-package-manager)
- Wordpress Plugins
We don’t blindly run all catalogers against all kinds of input sources though – we tailor the catalogers used to create the most accurate SBOM possible based on what is being analyzed. For example, when scanning images, we enable the alpm-db-cataloger but don’t enable the cocoapods-cataloger:
If there is a set of catalogers that you must / must not use for whatever reason, you can always tailor the set that runs with --override-default-catalogers and --select-catalogers to meet your needs. If you’re using Syft as an API, you can go further and implement your own package cataloger and provide it to Syft when creating an SBOM.
When it comes to SBOM formats, we are unopinionated about which format might be best for your use case -- so we support as many as we can, including the most popular formats: SPDX and CycloneDX. This is a core decision point for us -- this way you can pivot between SBOM formats and not be locked into a format based on a tooling decision. You can even select which version of a format to output (even output multiple at once!):
syft scan alpine:latest -o [email protected]
syft scan alpine:latest -o [email protected]
syft scan alpine:latest -o [email protected]
syft scan alpine:latest -o [email protected]=./alpine.spdx.json -o cyclonedx-json=./alpine.cdx.json
From the beginning, Syft was designed to lean into the Unix philosophies: do one thing and one thing well, allowing for it to plug into downstream tooling easily. For instance, to use your SBOM to get a vulnerability analysis, pipe the results to Grype:
Or to discover software licenses and check for compliance, pipe the results to Grant:
What does v1 mean to us?
Version 1 primarily signals a stabilization of the CLI and API. Moving forward you can expect that breaking changes will always coincide with a major version bump of Syft. Some specific guarantees we wanted to call out explicitly are:
- We version our JSON schema of the syft JSON output separately than that of Syft, the application. In the past in a v0 state this has meant that breaking changes to the JSON schema would only result in a minor version bump of syft. Moving forward the JSON schema version used for the default JSON output of Syft will never include breaking changes.
- We’ve implicitly supported the ability to decode any previous version of the Syft JSON model, however, moving forward this will be an explicit guarantee -- If you have a Syft JSON document of any version, then you will be able to convert it into the latest Syft JSON version.
- Stereoscope, our library for parsing container images and crafting squashed filesystem representations (critical to the functionality of Syft), will now be versioned at each release and have release tags.
What does this mean for the average user? In short: probably nothing! We will continue to ship enhancements and fixes every few weeks while remaining compatible with legacy SBOM document schemas.
So are we done?...
…Not even close 🤓. As the SBOM landscape keeps changing our goal is to grow Syft in terms of what can be cataloged, the accuracy of the results, the formats that can be expressed, and its usefulness in more use cases.
Have an ecosystem you need that we don’t support yet? Let us know, and let’s build it together! Found an issue or have a feature idea? Open an issue and let’s talk about it! Looking to contribute and are looking for a good place to start? We have some issues set aside just for you! Curious as to what we’re looking to work on next? Check out our ever-growing roadmap! And always, come join us every-other week at our community office hours chats if you want to meet with us and talk about topical issues regarding our OSS projects.