Source vs Build SBOMs
Before we dive into the actual guide, let’s start by talking about SBOM generation. Generally speaking, an SBOM is either built from source (also known as ‘pre-build’ in CycloneDX terminology) or built after installation. When we talk about source SBOMs, we normally mean generating an SBOM from a lockfile; for build SBOMs, we look at what the package manager has actually installed. These might seem interchangeable, and the output might even be identical. Whether it is, however, depends on the quality of the lockfile, so let’s look at some examples.
Lockfile Example
There are a lot of ways to manage packages in Python. The most widely used way to pin dependencies in Python is to use requirements.txt with pip. We’ll talk about why you might not want to use this approach later on, but let’s unpack some scenarios where you might end up with an incomplete SBOM.
Let’s imagine that we have a Flask-based Python server. It’s possible that your requirements.txt file will only list Flask like this:
Flask==3.0.3
Every time you run pip install -r requirements.txt, pip will resolve and install Flask along with everything it depends on.
However, if we were to generate an SBOM from this file, we’d get just one entry, namely Flask. This is of course incomplete: none of Flask’s dependencies (i.e., transitive dependencies) are captured. Thus, it’s important that we capture all the dependencies reported by pip freeze and lock them into our requirements.txt file, resulting in something that looks like this:
blinker==1.8.2
click==8.1.7
Flask==3.0.3
importlib_metadata==8.5.0
itsdangerous==2.2.0
Jinja2==3.1.4
MarkupSafe==2.1.5
Werkzeug==3.0.4
zipp==3.20.2
Now, this probably doesn’t come as a surprise to any experienced developer, but it’s still important to highlight that the quality of the SBOM is directly tied to the quality of the lockfile. If a package is missing from the lockfile, it will be missing from the SBOM.
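To make the gap concrete, here is a minimal sketch that compares a requirements.txt against what is actually installed. The function names and the `installed` dictionary are hypothetical and only for illustration; in practice you would fill the dictionary from the output of pip freeze.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pinned_names(lockfile_text: str) -> set[str]:
    """Return the normalized names of all `name==version` lines."""
    names = set()
    for line in lockfile_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            names.add(normalize(line.split("==", 1)[0].strip()))
    return names


def missing_from_lockfile(lockfile_text: str, installed: dict[str, str]) -> list[str]:
    """List installed packages that the lockfile never mentions."""
    pinned = pinned_names(lockfile_text)
    return sorted(
        normalize(name) for name in installed if normalize(name) not in pinned
    )


# Hypothetical environment: a subset of what pip freeze reports for Flask.
installed = {
    "blinker": "1.8.2",
    "click": "8.1.7",
    "Flask": "3.0.3",
    "itsdangerous": "2.2.0",
    "Jinja2": "3.1.4",
    "MarkupSafe": "2.1.5",
    "Werkzeug": "3.0.4",
}

# A lockfile that only lists Flask leaves every other package unaccounted for.
print(missing_from_lockfile("Flask==3.0.3\n", installed))
```

Every name this prints is a package that would be installed but would never show up in an SBOM generated from that file.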
Now, let’s move on to some more complicated issues.
Advanced Gotchas
If you’re a seasoned Python developer, you’ve most likely run across requirements.txt, setup.cfg, or pyproject.toml files that include expressions like foobar>=1.2.3 (version specifiers, defined in PEP 440). While there are good reasons for using ranges or less/more-than expressions for versions, they create real issues when we use these files for SBOM generation. This is where source/pre-build and build SBOMs will diverge. Technically speaking, it’s impossible to generate an accurate source/pre-build SBOM from a file with non-exact pinning, as the version is ambiguous until the package is actually installed.
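One way to catch this early is to scan the file for anything that is not an exact pin before generating an SBOM from it. The sketch below is a simplified check, not a full PEP 440 or PEP 508 parser: the function name and pattern are assumptions made for illustration, and it deliberately ignores environment markers, URLs, and pip options.

```python
import re

# Simplified pattern: a name, optional [extras], then a single `==version`
# with no wildcard. This is an illustrative approximation only.
EXACT_PIN = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s*,;]+$"
)


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that do not pin one exact version."""
    loose = []
    for raw in text.splitlines():
        # Drop comments, environment markers, and trailing continuations.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        line = line.rstrip("\\").strip()
        # Skip blanks and pip options such as --hash or --index-url.
        if not line or line.startswith("-"):
            continue
        if not EXACT_PIN.match(line):
            loose.append(line)
    return loose


example = """\
Flask==3.0.3
foobar>=1.2.3
click~=8.1
requests==2.*
"""

# Everything except the Flask line is ambiguous for a pre-build SBOM.
print(unpinned_requirements(example))
```

If the returned list is non-empty, a build SBOM (or a properly locked file) is the safer input.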
The Importance of Including Hashes
Modern Python package managers like Poetry, pipenv, and Conda will automatically record cryptographic hashes for the packages they lock.
This is very important, and a requirement if you want to meet NTIA Minimum Elements for your SBOMs. Remember, if you’re building the SBOM from a lockfile, the SBOM is only as good as its input.
It is possible to add hashes to requirements.txt files too, but it requires a bit of extra work:
$ pip install pip-tools
$ pip freeze > requirements.in
$ pip-compile \
    --generate-hashes \
    --output-file requirements.txt \
    requirements.in
You’ll now notice that if you open up your new requirements.txt file, you have hashes added to each package:
$ cat requirements.txt
[...]
flask==3.0.3 \
    --hash=sha256:34e815dfaa43340d1d15a5c3a02b8476004037eb4840b34910c6e21679d288f3 \
    --hash=sha256:ceb27b0af3823ea2737928a4d99d125a06175b8512c445cbd9a9ce200ef76842
[...]
It is, however, important to point out that hashes in requirements.txt files are less widely supported by SBOM generation tools than those of other package managers, as the approach is less common.
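Before feeding such a file to an SBOM tool, it can be worth confirming that every requirement really carries a hash. The following is a minimal sketch under the assumption that the file follows the layout pip-compile produces (one requirement per logical line, hashes on backslash-continued lines); the function name is hypothetical.

```python
def requirements_without_hashes(text: str) -> list[str]:
    """Return requirement specifiers that carry no --hash option."""
    # Join backslash-continued lines so each requirement is one logical line.
    logical_lines = text.replace("\\\n", " ").splitlines()
    missing = []
    for line in logical_lines:
        line = line.strip()
        # Skip blanks, comments, and pip options such as --index-url.
        if not line or line.startswith(("#", "-")):
            continue
        if "--hash=" not in line:
            missing.append(line.split()[0])
    return missing


sample = (
    "click==8.1.7\n"
    "flask==3.0.3 \\\n"
    "    --hash=sha256:34e815dfaa43340d1d15a5c3a02b8476004037eb4840b34910c6e21679d288f3 \\\n"
    "    --hash=sha256:ceb27b0af3823ea2737928a4d99d125a06175b8512c445cbd9a9ce200ef76842\n"
)

# click has no hash here, so it is the only entry reported.
print(requirements_without_hashes(sample))
```

An empty result means every listed requirement has at least one hash; it does not verify that the hashes themselves are correct.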
Generating an SBOM from a Python Lockfile
There are a number of tools available to generate SBOMs for Python on our Resources page. Your personal preferences may vary depending on whether you want a format-agnostic tool that can generate both SPDX and CycloneDX SBOMs or if you want a format-specific tool.
It’s important to stress that regardless of the tool you use, you still need to go through all steps in the SBOM lifecycle.
The output of any of these tools completes the first step (i.e., “Generation”), but you still need to do Enrichment and Augmentation to have a complete SBOM.
With that out of the way, one of the more versatile SBOM tools is Trivy from Aqua Security. This tool is able to generate both CycloneDX and SPDX from a number of Python lockfiles.
$ trivy fs \
    --format cyclonedx \
    --output my_sbom.cdx.json \
    requirements.txt
As warned above, Trivy does not support reading hashes from requirements.txt files. For this, we might consider using a more specialized tool, like CycloneDX’s own Python library:
$ cyclonedx-py \
    requirements requirements.txt \
    > my_sbom.cdx.json
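Whichever tool you pick, it is easy to check whether the hashes actually made it into the generated file. The sketch below assumes a CycloneDX JSON document with a top-level components array whose entries may carry a hashes list; the function name is hypothetical, the example document is hand-written and heavily trimmed, and nested components are not inspected.

```python
import json


def components_without_hashes(sbom: dict) -> list[str]:
    """Return `name@version` for each top-level component lacking a hash."""
    missing = []
    for component in sbom.get("components", []):
        if not component.get("hashes"):
            missing.append(f"{component.get('name')}@{component.get('version')}")
    return missing


# Hand-written, trimmed-down CycloneDX-style document for illustration only.
example_sbom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "flask", "version": "3.0.3",
     "hashes": [{"alg": "SHA-256",
                 "content": "ceb27b0af3823ea2737928a4d99d125a06175b8512c445cbd9a9ce200ef76842"}]},
    {"type": "library", "name": "click", "version": "8.1.7"}
  ]
}
""")

# Only click is reported, since it has no hashes entry.
print(components_without_hashes(example_sbom))
```

To run the same check on a real file, load it with json.load from my_sbom.cdx.json and pass the resulting dictionary to the function.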
Let’s Make Your Life Easier
Hopefully, the steps above didn’t scare you away from building SBOMs. Now that you understand the basics, you are probably thinking about how to automate this in your CI/CD pipeline. Enter the sbomify GitHub Action, a Swiss Army knife for SBOMs. It’s an abstraction layer that ties together various other open source SBOM tools to make your life easier.
The GitHub Actions module is open source and can be run both with and without the sbomify platform.
Instead of running the commands above, you can just add a job that looks like this:
---
name: Upload an SBOM to sbomify
on: [push]
jobs:
  [...]
  upload-sbom:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Upload SBOM
        uses: sbomify/github-action@master
        env:
          TOKEN: 'My GitHub Secret Reference'
          COMPONENT_ID: 'my-component-id'
          LOCK_FILE: 'requirements.txt'
          OUTPUT_FILE: 'my_sbom.cdx.json'
          AUGMENT: false
          ENRICH: false
          UPLOAD: false
At the end of this run, you will have an SBOM.
If you have an sbomify account, you can also automatically augment this SBOM with your own metadata (vendor, supplier, and license information) using the AUGMENT: true setting. You can also set UPLOAD: true to automatically upload the SBOM to our platform. Once uploaded, you can invite all your stakeholders.
The GitHub Action can take most Python lockfiles (Pipfile.lock, poetry.lock, and requirements.txt) as input and build an SBOM for you with minimal fuss.