Notes on GitHub Actions, Reproducibility, and Trusting Trust

created: March 10, 2026

I am working towards a course project related to "trusting trust" attacks in the GitHub CI/CD ecosystem. The idea is that GitHub Actions have a lot of pull in software supply chains, and the publication method allows malicious developers to upload compromised action artifacts.

For example, GitHub prefers that JavaScript actions be minified and compressed to the smallest size possible, recommending tools like rollup.js to compile all dependencies into one large .js file. This format is small and portable, but not so fun for auditors. GitHub's trust model assumes the build artifacts developers upload match the source, but nothing stops someone from injecting obfuscated JavaScript into the release, compromising the CI/CD pipelines of every project that uses it!
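To make the threat concrete, here is a minimal sketch of how such a bundle gets produced. The file paths and plugin choices are my own illustration of a typical setup, not GitHub's exact recommendation:

```javascript
// rollup.config.js — hypothetical config bundling an action into one file.
// The node-resolve and commonjs plugins flatten every dependency out of
// node_modules into the single output artifact.
import resolve from '@rollup/plugin-node-resolve';
import commonjs from '@rollup/plugin-commonjs';

export default {
  input: 'src/index.js',   // action entry point (hypothetical path)
  output: {
    file: 'dist/index.js', // the one artifact committed/released
    format: 'cjs',
  },
  plugins: [resolve(), commonjs()],
};
```

Only `dist/index.js` ships; once every dependency is flattened into it, nothing ties those bytes back to `src/` unless the build can be reproduced.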

The usual solution for linking source code to build artifacts is reproducible builds. To work, this requires project maintainers to be extremely explicit about which source code, environment, and dependencies were used to produce their artifacts (recorded as 'provenance attestations'). Under those circumstances, anyone can gain some degree of certainty that an artifact is legitimate by building it themselves from the same source under the same conditions and getting an output that is identical down to the last bit.

Note that this says nothing about whether the original source code is legitimate. It is also not true protection from 'trusting trust' attacks, because all of the independent builders could be using a compromised toolchain (i.e., a specially crafted compiler binary which always compiles in a backdoor). To rule out the latter, you would have to bootstrap the full environment, compiling absolutely everything from source without using any binary you didn't compile yourself. Nonetheless, reproducibility is what I wanted to focus on.

A second problem with reproducibility: if a GitHub Actions maintainer truly wanted to compromise dependent projects by injecting malware into their bundled actions, they certainly would not release information on how they built their project. If the maintainer doesn't want to cooperate, it's difficult to discern whether something is intentionally non-deterministic.

What is there to do?

Diffoscope is the standard tool for determining what differs between two sets of build artifacts. In the JavaScript example, it uses js-beautify to de-obfuscate the bundled code and produce a readable diff. In cases of non-reproducibility, it may be possible to feed this prettified output into further static analysis tools to assess whether a payload has been introduced. However, most papers I have read leave this as a manual process. Perhaps it would be good to hand it to an LLM and assign a confidence level?
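Since I can't assume diffoscope or js-beautify here, a toy stand-in for the normalize-then-diff step might look like the following. The normalization is far cruder than js-beautify (it would mangle whitespace inside string literals, for one), but it shows the shape of the idea:

```javascript
// Toy sketch of the diffoscope approach for bundled JS (not the real tool):
// canonicalize both bundles into statement lists, then report statements
// that appear only in the released artifact — the candidate injected code.
function normalize(source) {
  // Crude canonicalization: split into statements and drop all whitespace,
  // so a minified bundle and its unminified rebuild compare equal.
  return source
    .split(';')
    .map(s => s.replace(/\s+/g, ''))
    .filter(s => s.length > 0);
}

function suspiciousAdditions(rebuiltSource, releasedSource) {
  // Statements present only in the released artifact are what you would
  // feed into further static analysis (or an LLM).
  const expected = new Set(normalize(rebuiltSource));
  return normalize(releasedSource).filter(stmt => !expected.has(stmt));
}
```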

There are legitimate reasons why a piece of software isn't reproducible, but they can usually be solved. I've recently run into this while packaging software for GNU Guix (which supports both reproducible and bootstrappable builds), where diffoscope also saved the day; I may write a follow-up post. As for actions, the question I ask myself now is whether some heuristic could be automated and scaled up to cover a larger portion of the GitHub Actions Marketplace. For example, could we create a GitHub App to scan new commits from tracked repositories and publicly flag potentially evil code snippets?
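As a sketch of what such a heuristic pass might flag — the pattern list below is my own illustration, and every entry has legitimate uses in real actions, so hits would be leads to triage (perhaps by that LLM with a confidence level), not verdicts:

```javascript
// Naive heuristic pass over a bundled action's source. Illustrative only:
// real bundles legitimately use child_process, process.env, etc., so a
// match is a reason to look closer, never proof of malice.
const SUSPICIOUS_PATTERNS = [
  { name: 'dynamic-eval', re: /\beval\s*\(/ },
  { name: 'shell-exec',   re: /child_process|execSync/ },
  { name: 'base64-blob',  re: /[A-Za-z0-9+/]{120,}={0,2}/ }, // long opaque payload
  { name: 'env-read',     re: /process\.env/ },              // possible secret exfil
];

function scanBundle(source) {
  // Return the name of every pattern that matches anywhere in the bundle.
  return SUSPICIOUS_PATTERNS
    .filter(p => p.re.test(source))
    .map(p => p.name);
}
```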