
Source of Origin and Build Provenance

Within software development, ensuring the authenticity and integrity of packages can be critical to the software supply chain. Source of origin verification and build provenance play key roles in this context, providing a robust framework to validate the origins of packages and the integrity of the build process.

What is source of origin verification and build provenance?

Source of origin verification is a practice that increases confidence in a package's authenticity by confirming that it originates from a specific source code repository and that its other claimed metadata is accurate. This practice is crucial for several reasons:

  • It ensures authenticity and legitimacy. A verified source of origin provides additional guarantees around the legitimacy of a package, ensuring that it is genuine and has not been tampered with. In the current landscape, some package managers lack the capability to verify that the metadata published alongside a package is accurate and truthful. This gap in verification can lead to security vulnerabilities and misinformation.

    Without source of origin verification, package managers are susceptible to attacks such as starjacking, where attackers claim to be associated with a popular repo to deceive users. Additionally, it becomes easier for attackers to perform typosquatting attacks, where a malicious package is created with a name similar to a popular package, in an attempt to trick users into downloading it. A verified source of origin helps prevent these attacks by ensuring that the package’s claimed origin and popularity metrics are authentic.

  • It establishes a foundation for build provenance. Source of origin verification is a foundational step in establishing build provenance, which ensures that the compiled source code results in a package with integrity. Build provenance goes beyond verifying the source of the code, ensuring that the entire build process, from code compilation to package distribution, is secure and verifiable.

Both source of origin verification and build provenance are significantly enhanced by the use of cryptographic signing. By signing packages and their associated metadata, developers can provide a verifiable proof of origin and integrity. Users, in turn, can verify these signatures to ensure that the package has not been altered since it was signed, and that it indeed originates from the claimed source.

How does Trusty display source of origin and build provenance?

Trusty displays two different types of source of origin and build provenance information for open source packages:

  1. Sigstore provenance. Sigstore is an open source project that makes it easier for developers to cryptographically sign and verify artifacts. When packages have been built and signed with Sigstore using GitHub Actions, Trusty displays a badge indicating that there is a verifiable chain of trust back to the source code.
    1. Currently, npm is the only package ecosystem that supports publishing packages with provenance.
  2. Historical provenance. For packages that have not been signed or built with Sigstore, Trusty can establish a link from a published package to its source code by mapping Git tags and releases to published versions of the package. This is called “historical provenance.”

How Trusty establishes historical provenance

When Sigstore provenance information is not available for a package, Trusty relies on another method to determine whether there is a strong link from a published package back to its source code, and a clear proof of origin. This method, called historical provenance, counts the Git tags and releases in the package's listed source repository and compares them to the published versions of that package listed on the package manager's registry (e.g., PyPI or crates.io).

Why do we rely on Git tags to determine this?

Git tags are commonly used to mark a specific state of the code that corresponds to a given release (via a commit). Each commit in Git is identified by a hash of its contents, which cover the source tree, commit message, author, and date, as well as the hash of the parent commit(s). Because of this chaining, every commit hash depends on the entire repository history up to that point. If any part of a commit's data were to change, its hash would change, and so would the hashes of all subsequent commits. This makes the history relatively tamper-proof (and even more so when combined with a digital signature). Additionally, package manager registries record and publish timestamps for events such as version releases.
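The chaining property can be illustrated with a small sketch. This is a toy model, not Git's actual object format: the `commit_id` and `build_history` helpers and the use of SHA-256 over a simple text payload are assumptions made purely for illustration.

```python
import hashlib


def commit_id(parent_id: str, tree: str, message: str) -> str:
    """Toy commit hash: covers the parent's id plus this commit's content.

    Hypothetical simplification; real Git objects use a different layout.
    """
    payload = f"parent {parent_id}\ntree {tree}\n\n{message}".encode()
    return hashlib.sha256(payload).hexdigest()


def build_history(commits, root: str = "") -> list:
    """Return the chain of commit ids for a list of (tree, message) pairs."""
    ids = []
    parent = root
    for tree, message in commits:
        parent = commit_id(parent, tree, message)
        ids.append(parent)
    return ids


original = build_history([
    ("src-v1", "initial"),
    ("src-v2", "add feature"),
    ("src-v3", "release 1.0"),
])

# Same history, except the content of the second commit was altered.
tampered = build_history([
    ("src-v1", "initial"),
    ("src-v2-evil", "add feature"),
    ("src-v3", "release 1.0"),
])

# The first commit is unaffected; the altered commit and every later one differ.
changed = [a != b for a, b in zip(original, tampered)]
```

In this sketch, `changed` flags the second and third commits even though only the second one was modified, which is why a tag pointing at a known commit hash is a meaningful anchor for a release.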

To map out historical provenance, we compare the number of releases published on the package manager registry to the number of tags in the source repository, along with the times at which each was created. We also do a fuzzy match on the tag and version strings themselves. If the repo and the package share even a small number of versions, Trusty assumes the package did in fact come from its listed source repo.

For any given package, when there is a sufficient number of matching tags to versions, Trusty will display a notification that there is a strong mapping from the package to its source repository, providing proof of origin.

If we are not able to match any tags to package versions, or if the source repo does not include any tags, Trusty will display a notification that we are unable to match the package to its source repo. In this case, you should take a closer look at the package and make sure you can trust it before installing it.
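The matching idea described above can be sketched roughly as follows. This is not Trusty's actual implementation: the function names, the prefix-stripping rule used as the "fuzzy" match, the `max_skew` time window, and the `min_matches` threshold are all hypothetical values chosen for illustration.

```python
import re
from datetime import datetime, timedelta


def normalize(name: str) -> str:
    """Strip a leading 'v' or 'release-' so that 'v1.2.0' compares equal to '1.2.0'."""
    return re.sub(r"^(?:v|release[-_/]?)(?=\d)", "", name.strip().lower())


def match_tags_to_versions(tags, versions, max_skew=timedelta(days=2)):
    """Return registry versions that have a same-named tag created at about the same time.

    `tags` and `versions` map a name to a datetime. `max_skew` is an assumed tolerance.
    """
    tag_times = {normalize(name): when for name, when in tags.items()}
    matched = []
    for version, published in versions.items():
        tagged = tag_times.get(normalize(version))
        if tagged is not None and abs(published - tagged) <= max_skew:
            matched.append(version)
    return matched


def has_strong_mapping(tags, versions, min_matches=2):
    """Assumed rule: enough matching versions counts as a strong mapping."""
    return len(match_tags_to_versions(tags, versions)) >= min_matches


# Hypothetical data: Git tags in the source repo and versions on the registry.
tags = {
    "v1.0.0": datetime(2023, 1, 10),
    "v1.1.0": datetime(2023, 3, 2),
    "v2.0.0": datetime(2023, 6, 1),
}
versions = {
    "1.0.0": datetime(2023, 1, 10, 12),  # published 12 hours after the tag
    "1.1.0": datetime(2023, 3, 3),       # published a day after the tag
    "1.2.0": datetime(2023, 4, 1),       # no corresponding tag
    "2.0.0": datetime(2023, 7, 15),      # tag exists, but timing is far off
}

matched = match_tags_to_versions(tags, versions)
strong = has_strong_mapping(tags, versions)
```

With this sample data, two of the four published versions line up with tags in both name and time. A repository with no tags at all produces no matches, which corresponds to the "unable to match" case.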

Sigstore and SLSA: Gold standards for establishing source of origin and build provenance

At Stacklok, we believe that Sigstore and Supply chain Levels for Software Artifacts (SLSA), together with their foundations in in-toto and The Update Framework (TUF), represent the best solutions currently available for achieving source of origin verification and build provenance. Sigstore provides a transparent and accessible platform for signing software artifacts, while SLSA offers a comprehensive framework for ensuring the provenance of software artifacts.

Historical provenance is not a replacement for Sigstore or SLSA verification. However, in the absence of that information, it can provide a strong signal to developers about whether or not they can trust that a package is what it says it is. For developers, having this signal before they install a package is critical for keeping their software secure.

We welcome any other tools and methods that achieve similar outcomes, as our primary goal at Stacklok is to enhance the security of the open source software ecosystem. We are all about making open source more secure for all.

Fostering adoption and collaboration

Stacklok is committed to nurturing the adoption of Sigstore and SLSA, and we continue to actively work with open source communities to introduce and implement these controls. By collaborating with developers, maintainers, and users, we aim to foster a culture of security and trust, ensuring that the software supply chain remains robust, secure, and verifiable.

At present, the package managers that support source of origin verification with Sigstore are:

Others are in the process of adding support:

Through the Sigstore community and partners, we are working with the maintainers of other package managers to add support for source of origin verification.