Inspecting pull requests to understand behavior changes
I have a very bad habit to confess.
😇 I pin my third-party GitHub Actions to a SHA. [1] 😇
This ensures that the same specific commit runs every time. It’s widely considered a security best practice (docs). It is also one of the easiest steps to take in securing your CI system, should you choose to use third-party reusable Actions (or literally any open source software built on them, which is a lot).
😈 But I also blindly accept Dependabot pull requests without reviewing them. 😈
This completely negates the benefit of pinning to an immutable version to prevent software supply chain attacks in CI. Let’s stop doing that, shall we? 🙈
Let’s inspect the automated pull requests that Dependabot makes for updating third-party GitHub Actions and all of the transitive dependencies that changed with it. Finished workflow for the impatient.
We’ll use an open source, air-gap capable YARA-based scanner that accepts many types of inputs with flexible outputs. It can run as a check before merging code, a promotion check to move into an offline system, or many other places during software development.
Using malcontent
To do this, we’re going to add a workflow using malcontent to check our changes to GitHub Actions. There is a lot of hidden complexity in the GitHub Actions ecosystem. Some assembly is required to do this. Let’s start with using it and work up to having it review more complex code change systems.
Take a minute to read the documentation on the project’s repo. There are a ton of nifty features I won’t be needing for this walkthrough.
A few important things to note:
- You can use it in a container or install it natively. We’ll be using it in a container here, since that provides portability for it and all dependencies needed.
- It can do a quick scan or deeper analyze of an arbitrary set of files. It can also diff capabilities between two directories or binaries. Pull requests lend themselves naturally to differential analysis.
- There are a couple of output formats, but none are standardized (e.g., SARIF or similar) for drop-in ingest into other tools. The JSON output is handy for machine parsing. Markdown is always pretty.
Some assembly required
From here, we’re going to:
- Get a basic scan or two going locally
- Move that basic scan into GitHub Actions
- Run a differential analysis on a pull request in GitHub Actions as a check
- Combine it all to analyze differences in a GitHub Actions version bump from Dependabot
This will hopefully prevent some unfortunate recent attacks such as tj-actions and reviewdog from earlier in 2025. This sort of attack is inherent to using third-party code in your build systems, as I’d learned during the Codecov attack of 2021.
1. Some basic scans
Let’s start locally and use the pre-packaged Docker container. This includes all dependencies and tens of thousands of YARA rules in the image, making it highly portable for air-gapped usage. It’s free to use and updated at least daily to save me from a ton of setup and update work too.
Scan a container image or folder
docker run --rm -v .:/tmp \
cgr.dev/chainguard/malcontent:latest \
--format=markdown --output=/tmp/container-scan.md \
scan -i ghcr.io/some-natalie/jekyll-in-a-can:latest
Many GitHub Actions are shipped as containers, so we’ll use this capability. A few interesting things to note:
- It doesn’t need the --privileged flag because it isn’t running the container or any container runtime. Under the hood, it’s using crane to interact with the container image as a tarball of filesystem snapshots.
- The -i flag specifies it as an image and not a file.
- In order to write the output to a file, it needs a volume mount. I did the lazy thing and used my current working directory as /tmp in the container. Obviously edit this if we’re not in an ephemeral environment (or playing around locally).
- Full results of above command.
Analyze a file
docker run --rm -v .:/tmp \
cgr.dev/chainguard/malcontent:latest \
--format=json --output=/tmp/malcontent.json \
analyze /tmp/nc
This does a deeper look into any particular file and enumerates what it can do. The full results of this command are in JSON in this case. It’s handy for
- Looking at compiled artifacts without access to source code, such as libraries you pull in during a build. 🕵🏻‍♀️
- Using GitHub Actions or other compute-as-a-service platform to analyze the contents of a remote file store. Since those are ephemeral VMs, the security of this system is someone else’s problem. 🙊
- Inspecting those MP3 files you definitely didn’t torrent in college. 😉 🎧
Diff the capabilities of two files or folders
docker run --rm -v .:/tmp \
cgr.dev/chainguard/malcontent:latest \
--format=json --output=/tmp/malcontent.json \
diff /tmp/nc /tmp/libffmpeg.dirty.dylib
This is the foundation we’re going to build our pull request workflow on. It only looks at the differences from the first file to the second, meaning we can highlight only what’s changed. The full results from this command are available for further use, but we’ll use the Markdown output to comment on the real PR in GitHub.
2. Now in a CI pipeline
For our first workflow, let’s inspect an arbitrary file in a repository or remote storage bucket. For simplicity, use the built-in run summary system rather than uploading the report elsewhere. (full analyze workflow file)
# stuff above here, like cloning or copying or mounting a remote file share
- name: Analyze the files
  id: malcontent
  run: |
    docker run --rm -v ${{ github.workspace }}:/tmp \
      cgr.dev/chainguard/malcontent:latest \
      --format=markdown analyze /tmp >> $GITHUB_STEP_SUMMARY
a very simple scan of files in GitHub Actions
3. Viewing diffs at pull request
To use differential analysis, remember a few things:
- First, build the original code with all the old dependencies (or pull the finished artifacts in).
- Then, build the proposed changes with the changed dependencies.
- Build both of these in separate directories.
- Create the diff between each finished product to analyze.
- Upload that report somewhere that it will be seen in code review and/or fail the build based on a threshold.
Here’s the full diff workflow example. It creates some lovely pull request comments, as shown below.
if only every naughty PR was so straightforward as to simply say “add malware”
4. Looking at GitHub Actions PRs
Now let’s add in how Dependabot works, some nuances of the GitHub Actions ecosystem, and our scanning knowledge from above to inspect PRs changing our Actions. (full example file )
Note that this does not work for grouped Dependabot PRs. It needs one PR per changed dependency, because we’re inspecting each changed dependency itself rather than testing the project that uses it.
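One way to catch grouped PRs early is to count how many distinct Actions a diff touches and bail out when it’s more than one. This is only a minimal sketch and isn’t part of the original workflow: the function names are made up for illustration, and it assumes the pull request’s unified diff arrives on stdin.

```shell
# Hypothetical helpers, not part of malcontent or the original workflow.

# Count the distinct Action references added by a unified diff on stdin.
count_action_bumps() {
  { grep -E '^[+][[:space:]]*(-[[:space:]]+)?uses:' || true; } \
    | sed -E 's/[[:space:]]+#.*$//; s/^.*uses:[[:space:]]*//' \
    | sort -u \
    | wc -l \
    | tr -d '[:space:]'
}

# Succeed only when the diff on stdin bumps exactly one Action.
require_single_bump() {
  local count
  count=$(count_action_bumps)
  if [ "$count" -ne 1 ]; then
    echo "Found $count changed Actions; review this PR by hand." >&2
    return 1
  fi
}

# Example: two different Actions in one PR should be rejected.
printf '%s\n' \
  '+        uses: example-org/action-one@1111111 # v1.0.1' \
  '+        uses: example-org/action-two@2222222 # v2.3.0' \
  | require_single_bump || echo "grouped PR detected"
```

The same Action bumped in several workflow files still counts as one, since the references are de-duplicated before counting.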
Setup
This workflow only needs to run on pull requests that change workflow files and target the default branch (main in this case). It also only needs to read the contents of the repo. Specifying all this in YAML looks like this:
name: 🔍 Malcontent differential analysis 🔍
on:
  pull_request:
    branches: ["main"]
    types:
      - opened
      - synchronize
      - reopened
    paths:
      - ".github/workflows/**.yml"
      - ".github/workflows/**.yaml"
permissions:
  contents: read
From here, there are three jobs that must be done in order.
Extract information about the Action
First, we need to parse the pull request for what’s changed, then output those changes as job outputs we can use later on in the workflow. The diff we’re interacting with looks like this:
diff --git a/.github/workflows/zizmor.yml b/.github/workflows/zizmor.yml
index 5bfbf9f..04ee17f 100644
--- a/.github/workflows/zizmor.yml
+++ b/.github/workflows/zizmor.yml
@@ -23,7 +23,7 @@ jobs:
           persist-credentials: false
       - name: Install the latest version of uv
-        uses: astral-sh/setup-uv@6b9c6063abd6010835644d4c2e1bef4cf5cd0fca # v6.0.1
+        uses: astral-sh/setup-uv@f0ec1fc3b38f5e7cd731bb6ce540c5af426746bb # v6.1.0
       - name: Run zizmor 🌈
         run: uvx zizmor --format=sarif . > results.sarif
However, we need to break this up into the old and new Action’s repo (owner/repo) and the tag or SHA in use. The tag or SHA needs to be sanitized to remove the version number left as an in-line comment, if there’s one there.
This is a simple yet verbose set of tasks to do with grep. This section of the workflow contains the full set of scripting steps.
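As a rough illustration of that parsing (not the author’s actual script), the changed lines can be reduced to an owner/repo pair and a bare tag or SHA with grep, sed, and awk. The function name extract_uses is hypothetical, the diff is assumed to arrive on stdin, and local (./path) or docker:// references are not handled.

```shell
# Hypothetical helper: print "owner/repo ref" for each changed `uses:` line.
# $1 is "-" for the old version or "+" for the new one; the diff comes on stdin.
extract_uses() {
  local marker="$1"
  { grep -E "^[${marker}][[:space:]]*(-[[:space:]]+)?uses:" || true; } \
    | sed -E 's/[[:space:]]+#.*$//; s/[[:space:]]+$//; s/^.*uses:[[:space:]]*//' \
    | tr -d "\"'" \
    | awk -F'@' '{ split($1, parts, "/"); print parts[1] "/" parts[2], $2 }'
}

# The two changed lines from the Dependabot diff.
sample_diff='-        uses: astral-sh/setup-uv@6b9c6063abd6010835644d4c2e1bef4cf5cd0fca # v6.0.1
+        uses: astral-sh/setup-uv@f0ec1fc3b38f5e7cd731bb6ce540c5af426746bb # v6.1.0'

printf '%s\n' "$sample_diff" | extract_uses -   # old repo and SHA
printf '%s\n' "$sample_diff" | extract_uses +   # new repo and SHA
```

The sed step strips the trailing version comment, and the awk step keeps only the first two path segments, so an Action living in a subdirectory of a repository still resolves to owner/repo. In a workflow, each value could then be appended to the file named by GITHUB_OUTPUT for later jobs to read.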
Now figure out how to set up dependencies
From here, we’ll need to check out and build the code for this Action on the original and proposed changes. To figure out how to build the Action, we need to parse the action.yml file in the changed Action repository. This section of the workflow will tell us what needs to happen, then set it up.
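Checking out the Action’s repository at each commit could look roughly like the sketch below. It is an assumption about how to do it, not a copy of the linked workflow: fetch_action_at is a hypothetical helper, and it relies on GitHub allowing a shallow fetch of a specific commit SHA or tag.

```shell
# Hypothetical helper: fetch one commit (or tag) of an Action into DEST.
# Usage: fetch_action_at OWNER/REPO REF DEST
fetch_action_at() {
  local repo="$1"
  local ref="$2"
  local dest="$3"
  git init -q "$dest" || return 1
  git -C "$dest" remote add origin "https://github.com/$repo.git" || return 1
  # A depth of 1 pulls only the requested commit, not the full history.
  git -C "$dest" fetch -q --depth 1 origin "$ref" || return 1
  git -C "$dest" checkout -q FETCH_HEAD
}

# Example (needs network access, so it is left commented out):
# fetch_action_at astral-sh/setup-uv 6b9c6063abd6010835644d4c2e1bef4cf5cd0fca /tmp/prior-commit
# fetch_action_at astral-sh/setup-uv f0ec1fc3b38f5e7cd731bb6ce540c5af426746bb /tmp/current-commit
```

Fetching by SHA rather than by tag keeps the check honest, since it inspects exactly the commit the workflow is pinned to.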
An Action can be one of three things:
- JavaScript, optionally with dependencies vendored.
- A container, either built on each run or pulled from a registry.
- A composite Action, which is an “action of actions”. I purposely skipped over these to force manual inspection. As best I know, there’s no limit to how deep these can nest … 😰
Lucky for us, most things fall into those top two categories. Let’s do some bash if statements to figure out how to set these up.
# figure out type of Action
type=$(yq '.runs.using' action.yml)
if [[ $type == docker ]]; then
  image=$(yq '.runs.image' action.yml)
  if [[ $image == "Dockerfile" ]]; then
    echo "This Action uses a Dockerfile, building it..."
    docker build -t prior-action-image .
    docker save prior-action-image -o prior-action-image.tar
    tar -xf prior-action-image.tar
  else
    # registry images are written as docker://image:tag in action.yml
    image=${image#docker://}
    echo "This Action uses a pre-built Docker image: $image"
    docker pull "$image"
    docker save "$image" -o prior-action-image.tar
    tar -xf prior-action-image.tar
  fi
elif [[ $type == node* ]]; then
  echo "This Action uses Node.js, installing dependencies..."
  npm install
elif [[ $type == composite* ]]; then
  echo "This is a composite Action, skipping build."
fi
From here, we now have a workspace that looks like this, complete with dependencies at each of those commits.
.
└── /tmp/
    ├── prior-commit/
    │   ├── action
    │   └── all action dependencies
    └── current-commit/
        ├── action
        └── all dependencies for it
This makes it easy to run malcontent in differential analysis mode as the next step. Create a Markdown report, then upload it as a workflow artifact.
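Here is a sketch of that differential step, reusing the same flags as the local diff command from earlier. The wrapper name run_capability_diff is hypothetical, and the report name malcontent-results.md is borrowed from the commenting script; the workspace path should be absolute so Docker can mount it.

```shell
# Hypothetical wrapper around the malcontent container's diff mode.
# Usage: run_capability_diff WORKDIR [REPORT_NAME]
# WORKDIR must contain prior-commit/ and current-commit/.
run_capability_diff() {
  local workdir="$1"
  local report="${2:-malcontent-results.md}"
  local dir
  for dir in prior-commit current-commit; do
    if [ ! -d "$workdir/$dir" ]; then
      echo "missing directory: $workdir/$dir" >&2
      return 1
    fi
  done
  # Mount the workspace at /tmp so the report lands next to both trees.
  docker run --rm -v "$workdir":/tmp \
    cgr.dev/chainguard/malcontent:latest \
    --format=markdown --output="/tmp/$report" \
    diff /tmp/prior-commit /tmp/current-commit
}

# Example (needs Docker, so it is left commented out):
# run_capability_diff /tmp
```

Checking for both directories first gives a clearer error than a failed scan if one of the checkouts or builds went wrong.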
Commenting on the PR is the easy part
From here, the next job uses that Markdown report and puts it verbatim into the pull request as a comment for a human to review. This workflow section is what does the task. GitHub supports in-line JavaScript with actions/github-script, making this a short snippet’s task.
const fs = require('fs');
const filePath = 'malcontent-results.md';
// Fail loudly if the report from the previous job is missing
if (!fs.existsSync(filePath)) {
  throw new Error(`File not found: ${filePath}`);
}
const fileContent = fs.readFileSync(filePath, 'utf8');
// Await the API call so any error from it surfaces in this step
await github.rest.issues.createComment({
  issue_number: context.issue.number,
  owner: context.repo.owner,
  repo: context.repo.repo,
  body: fileContent,
});
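One caveat with posting the report verbatim: GitHub rejects comment bodies longer than 65,536 characters, and a big dependency bump can produce a report larger than that. A small guard like the one below could run before the commenting step. It is a hypothetical helper, not something in the linked workflow, and it cuts by bytes, so the 60,000 default is just a conservative guess that leaves headroom.

```shell
# Hypothetical helper: print a report, cut down to MAX_BYTES if it is larger.
# Usage: truncate_report FILE [MAX_BYTES]
truncate_report() {
  local src="$1"
  local max_bytes="${2:-60000}"
  local size
  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(( $(wc -c < "$src") ))
  if [ "$size" -le "$max_bytes" ]; then
    cat "$src"
  else
    # Cutting by bytes may split a multi-byte character at the boundary.
    head -c "$max_bytes" "$src"
    printf '\n\n_Report truncated; the full version is in the workflow artifact._\n'
  fi
}

# Example:
# truncate_report malcontent-results.md > comment-body.md
```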
Putting it all together
Put the full workflow file on the main branch in your repo. Wait for Dependabot, or anyone else, to open a PR bumping a GitHub Action.
🎉 Here’s what that analysis looks like in practice:
a likely uninteresting capabilities change in one dependency’s NPM dependencies (PR shown)
Final thoughts
Now I have a bit of visibility into the changes happening upstream in my GitHub Actions and in their upstream too. However, YARA rules can be noisy, so finding meaning in that data still takes human oversight. There’s also no guarantee that the capability change in a transitive dependency is used in the Action.
But at least there’s more information about the changes going on in that little one-line automated pull request bumping my Action forward, auto-magically commented in that proposed change. 🔎
Footnotes
[1] Okay, it’s a work in progress, and I’ve changed most of my projects to do this.