
The ultimate SBOM guide for Python

Source vs Build SBOMs

Before we dive into the actual guide, let’s start by talking about SBOM generation. Generally speaking, an SBOM is either built from source (also known as ‘pre-build’ in CycloneDX terminology) or built after installation. A source SBOM is normally generated from a lockfile, whereas a build SBOM describes what the package manager actually installed. The two might seem interchangeable, and the output can be identical, but whether it is depends on the quality of the lockfile. Let’s look at some examples.

Lockfile Example

There are a lot of ways to manage packages in Python. The most widely used way to pin dependencies in Python is to use requirements.txt with pip. We’ll talk about why you might not want to use this approach later on, but let’s unpack some scenarios where you might end up with an incomplete SBOM.

Let’s imagine that we have a Flask-based Python server. It’s possible that your requirements.txt file will only list Flask like this:

Flask==3.0.3

Running pip install -r requirements.txt will still install everything you need, because pip resolves and installs Flask’s own dependencies at install time.

However, if we were to generate an SBOM from this file, we’d get just one entry: Flask. This is of course incomplete, since none of Flask’s dependencies (i.e., transitive dependencies) are captured. It’s therefore important to capture the full list of installed packages reported by pip freeze and lock it into our requirements.txt file, resulting in something that looks like this:

blinker==1.8.2
click==8.1.7
Flask==3.0.3
importlib_metadata==8.5.0
itsdangerous==2.2.0
Jinja2==3.1.4
MarkupSafe==2.1.5
Werkzeug==3.0.4
zipp==3.20.2

This probably doesn’t come as a surprise to any experienced developer, but it’s still important to highlight that the quality of the SBOM depends directly on the quality of the lockfile. If a package is missing from the lockfile, it will be missing from the SBOM.
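To make the gap concrete, here is a minimal Python sketch that compares the packages pinned in a requirements file with a set of installed package names. The function names (normalize, pinned_names, installed_names, missing_from_lockfile) are hypothetical helpers written for this illustration, and the parsing only understands simple name==version lines.

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a project name (lowercase, runs of -_. become a single '-')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pinned_names(requirements_text: str) -> set[str]:
    """Return normalized names of packages pinned with '==' in a requirements file."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "==" in line and not line.startswith("-"):
            # Keep only the project name, dropping extras such as "[async]".
            name = line.split("==", 1)[0].split("[", 1)[0].strip()
            names.add(normalize(name))
    return names


def installed_names() -> set[str]:
    """Return normalized names of every distribution in the current environment."""
    return {
        normalize(dist.metadata["Name"])
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    }


def missing_from_lockfile(requirements_text: str, installed: set[str]) -> set[str]:
    """Return installed packages that the requirements file does not pin."""
    return installed - pinned_names(requirements_text)


# Illustration with a hard-coded environment instead of the real one.
installed = {"flask", "werkzeug", "jinja2", "click"}
print(sorted(missing_from_lockfile("Flask==3.0.3\n", installed)))
# prints ['click', 'jinja2', 'werkzeug']
```

In a real project you would pass installed_names() and the contents of your own requirements.txt. Note that the result will then also include tooling such as pip itself, so treat it as a hint rather than a verdict.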

Now, let’s move on to some more complicated issues.

Advanced Gotchas

If you’re a seasoned Python developer, you’ve most likely run across requirements.txt, setup.cfg, or pyproject.toml files that include expressions like foobar>=1.2.3 (version specifiers are defined in PEP 440). While there are good reasons for using version ranges, they create problems when used for SBOM generation. This is where source/pre-build and build SBOMs diverge: a build SBOM records the version that was actually installed, whereas a file without exact pins leaves the version ambiguous. Technically speaking, it’s impossible to generate an accurate source/pre-build SBOM from a lockfile that has non-precise pinning.
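One way to catch this before generating an SBOM is to scan the file for requirements that are not pinned to an exact version. The sketch below is a simplified check based on a regular expression, not a full PEP 440 parser; unpinned_requirements and EXACT_PIN are hypothetical names chosen for this example.

```python
import re

# Matches "name==1.2.3" (optionally with extras), but not wildcards like "==1.2.*".
EXACT_PIN = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s*;,]+$"
)


def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return the requirement specifiers that are not exact '==' pins."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        # Skip blank lines and pip options such as "-r other.txt" or "--hash=...".
        if not line or line.startswith("-"):
            continue
        # Drop environment markers, per-requirement options, and line continuations.
        spec = line.split(";", 1)[0].split(" --", 1)[0].rstrip("\\").strip()
        if not EXACT_PIN.match(spec):
            unpinned.append(spec)
    return unpinned


sample = "Flask==3.0.3\nfoobar>=1.2.3\nrequests~=2.31\nclick==8.1.*\n"
print(unpinned_requirements(sample))
# prints ['foobar>=1.2.3', 'requests~=2.31', 'click==8.1.*']
```

A check like this could run in CI and fail the build whenever the list is non-empty, so that ambiguous versions never reach the SBOM step.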

The Importance of Including Hashes

Modern Python package managers like Poetry, pipenv, and Conda automatically record cryptographic hashes for the packages they install.

This is very important: a component hash is among the data fields that NTIA recommends in addition to its Minimum Elements for SBOMs. Remember, if you’re building the SBOM from a lockfile, the SBOM is only as good as the lockfile it was built from.

It is possible to add hashes to requirements.txt files too, but it requires a bit of extra work:

$ pip install pip-tools
$ pip freeze > requirements.in
$ pip-compile \
    --generate-hashes \
    --output-file requirements.txt requirements.in

You’ll now notice that if you open up your new requirements.txt file, you have hashes added to each package:

$ cat requirements.txt
[...]
flask==3.0.3 \
    --hash=sha256:34e815dfaa43340d1d15a5c3a02b8476004037eb4840b34910c6e21679d288f3 \
    --hash=sha256:ceb27b0af3823ea2737928a4d99d125a06175b8512c445cbd9a9ce200ef76842
[...]

It is, however, important to point out that because hashes in requirements.txt are less common, SBOM generation tools support them less well than they support the lockfiles of other package managers.
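If your SBOM tool ignores these hashes, you can still read them yourself. The following is a minimal sketch that extracts the --hash options from a requirements file in the format shown above and checks a downloaded artifact against them. hashes_by_requirement and matches_any_hash are hypothetical helper names, and the parser only handles the simple layout that pip-compile produces.

```python
import hashlib
import re

HASH_OPTION = re.compile(r"--hash=(\w+):([0-9a-fA-F]+)")


def hashes_by_requirement(requirements_text: str) -> dict[str, list[tuple[str, str]]]:
    """Map each requirement to the (algorithm, digest) pairs listed for it."""
    # Join backslash-continued lines so each requirement is one logical line.
    joined = re.sub(r"\\[ \t]*\n", " ", requirements_text)
    result = {}
    for line in joined.splitlines():
        line = line.strip()
        # Skip blanks, comments, and standalone pip options.
        if not line or line.startswith(("#", "-")):
            continue
        requirement = line.split("--hash", 1)[0].split(";", 1)[0].strip()
        result[requirement] = HASH_OPTION.findall(line)
    return result


def matches_any_hash(data: bytes, hashes: list[tuple[str, str]]) -> bool:
    """Return True if the data matches at least one of the listed digests."""
    return any(
        hashlib.new(algorithm, data).hexdigest() == digest.lower()
        for algorithm, digest in hashes
    )


SAMPLE = r"""
flask==3.0.3 \
    --hash=sha256:34e815dfaa43340d1d15a5c3a02b8476004037eb4840b34910c6e21679d288f3 \
    --hash=sha256:ceb27b0af3823ea2737928a4d99d125a06175b8512c445cbd9a9ce200ef76842
"""

parsed = hashes_by_requirement(SAMPLE)
print({name: len(found) for name, found in parsed.items()})
# prints {'flask==3.0.3': 2}
```

A requirement usually carries more than one hash because the wheel and the source distribution are separate files, which is why the check accepts a match against any of them.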

Generating an SBOM from a Python Lockfile

A number of tools for generating Python SBOMs are listed on our Resources page. Which one suits you depends on whether you want a format-agnostic tool that can generate both SPDX and CycloneDX SBOMs or a format-specific tool.

It’s important to stress that regardless of the tool you use, you still need to go through all steps in the SBOM lifecycle.

Lifecycle

The output of any of these tools completes only the first step (i.e., “Generation”); you still need to do Enrichment and Augmentation to have a complete SBOM.

With that out of the way, one of the more versatile SBOM tools is Trivy from Aqua Security. This tool is able to generate both CycloneDX and SPDX from a number of Python lockfiles.

$ trivy fs \
    --format cyclonedx \
    --output my_sbom.cdx.json \
    requirements.txt

As mentioned earlier, Trivy does not support reading hashes from requirements.txt files. For this, we might consider using a more specialized tool, like CycloneDX’s own Python library:

$ cyclonedx-py \
    requirements requirements.txt \
    > my_sbom.cdx.json
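Whichever tool you pick, it is worth checking what actually ended up in the output. Here is a minimal sketch that inspects a CycloneDX JSON document and lists the components that carry no hashes. It relies only on the top-level components array and each component’s name, version, and hashes fields; components_without_hashes is a hypothetical helper, and the sample BOM is hand-written for illustration rather than real tool output.

```python
import json


def components_without_hashes(bom: dict) -> list[str]:
    """Return 'name==version' for each top-level component lacking hashes."""
    missing = []
    for component in bom.get("components", []):
        if not component.get("hashes"):
            missing.append(f"{component.get('name')}=={component.get('version')}")
    return missing


# Hand-written sample; a real SBOM contains many more fields.
sample_bom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {
      "type": "library",
      "name": "flask",
      "version": "3.0.3",
      "hashes": [
        {
          "alg": "SHA-256",
          "content": "34e815dfaa43340d1d15a5c3a02b8476004037eb4840b34910c6e21679d288f3"
        }
      ]
    },
    {"type": "library", "name": "click", "version": "8.1.7"}
  ]
}
""")

print(components_without_hashes(sample_bom))
# prints ['click==8.1.7']
```

To run this against your own file, load my_sbom.cdx.json with json.load and pass the resulting dictionary to the function. The sketch does not descend into nested components.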

Let’s Make Your Life Easier

Hopefully, the steps above didn’t scare you away from building SBOMs. Now that you understand the basics, you are probably thinking about how to automate this in your CI/CD pipeline. Enter the sbomify GitHub Action, a Swiss Army knife for SBOMs. It’s an abstraction layer that ties together various other open source SBOM tools to make your life easier.

The GitHub Action is open source and can be run both with and without the sbomify platform.

Instead of running the commands above, you can just add a job that looks like this:

---
name: Upload an SBOM to sbomify

on: [push]

jobs:
  [...]
  upload-sbom:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Upload SBOM
        uses: sbomify/github-action@master
        env:
          TOKEN: 'My GitHub Secret Reference'
          COMPONENT_ID: 'my-component-id'
          LOCK_FILE: 'requirements.txt'
          OUTPUT_FILE: 'my_sbom.cdx.json'
          AUGMENT: false
          ENRICH: false
          UPLOAD: false

At the end of this run, you will have an SBOM.

If you have an sbomify account, you can also automatically augment this SBOM with your own information (vendor, supplier, and license information) by setting AUGMENT: true. You can also set UPLOAD: true to automatically upload the SBOM to our platform. Once uploaded, you can invite all your stakeholders.

The GitHub Action can take most Python lockfiles (Pipfile.lock, poetry.lock, and requirements.txt) as input and build an SBOM for you with minimal fuss.