Searching for secrets in container images

Posted Apr 15, 2025 Updated Apr 18, 2025

By Natalie Somersall

🙈 Yet another place to look for secrets? It’s a common finding as teams move their workloads into containers and navigate the security challenges that come with them. Yet it’s also puzzling … how does a container scanner find an API key that isn’t in the finished image? Let’s create a container with a “secret”, then go over a few ways to extract it from the image. 🕵🏻‍♀️

A finished container looks much like this:

a container with multiple layers, and a secret used in build hidden in one layer

What you see when you shell into a running container (or mount it as a file system on your laptop) is only that top “finished” layer. As we discussed earlier when tidying our large container images, each build step that changes the filesystem becomes a layer, and all of those layers are downloaded together as one image. It’s why being mindful of how we use those layers has such an impact on the finished size. It also makes it easy to assume you’re safer than you really are when using long-lived credentials.

Let’s hide something!

If you’re not interested in making your own image with a hidden secret, jump to looking through image layers. Pull the image at ghcr.io/some-natalie/some-natalie/secret-example:latest to follow along in the next sections.

First let’s make a file that has a “secret” in it.

this is super secret serious business

actually, it's probably a boring api key

with root access, naturally

Next, write a Containerfile that copies the secret file in, then deletes it in the next layer.

        
FROM cgr.dev/chainguard/wolfi-base:latest

COPY secret.txt /not-a-secret.txt

RUN rm -rf /not-a-secret.txt

Build it, then let’s start to pick it apart.

Launch it and open a shell inside the running container to verify that the file isn’t visible.

~ ᐅ docker run --rm -it ghcr.io/some-natalie/some-natalie/secret-example:latest
8be11d233bda:/# ls -lah /
total 12K
drwxr-xr-x    1 root     root          20 Apr 16 01:49 .
drwxr-xr-x    1 root     root          20 Apr 16 01:49 ..
-rwxr-xr-x    1 root     root           0 Apr 16 01:49 .dockerenv
lrwxrwxrwx    1 root     root           7 Mar 31 13:08 bin -> usr/bin
drwxr-xr-x    5 root     root         340 Apr 16 01:49 dev
drwxr-xr-x    1 root     root          12 Apr 16 01:49 etc
drwxr-xr-x    1 root     root          14 Jan  1  1970 home
drwxr-xr-x    1 root     root         564 Jan  1  1970 lib
lrwxrwxrwx    1 root     root           3 Mar 31 13:08 lib64 -> lib
drwxr-xr-x    1 root     root           0 Jan  1  1970 opt
dr-xr-xr-x  278 root     root           0 Apr 16 01:49 proc
drwx------    1 root     root          24 Apr 16 01:49 root
drwxr-xr-x    1 root     root           0 Jan  1  1970 run
lrwxrwxrwx    1 root     root           7 Mar 31 13:08 sbin -> usr/bin
dr-xr-xr-x   11 root     root           0 Apr 14 03:34 sys
drwxrwxrwt    1 root     root           0 Jan  1  1970 tmp
drwxr-xr-x    1 root     root          10 Jan  1  1970 usr
drwxr-xr-x    1 root     root          56 Jan  1  1970 var

Notice how that /not-a-secret.txt file isn’t present? That doesn’t mean a scanner alert about it is a false alarm. Don’t dismiss this finding.

Look through image layers

There are a few tools that make visualizing changes at each layer simple. My favorite, as a text user interface, is dive (GitHub). This lets you explore what each layer’s build step was and the changes it made, extract files, and more.

dive, continuing to be awesome

But … let’s do this the manual way to really know what we’re doing here. Then, we can understand what a scanner is doing to return “found secrets” in an image.

Extracting secrets from images

Remember how OverlayFS, the foundation for a container’s filesystem, works? Container images are (basically) some JSON and file system layers wrapped up in a tarball. Crack open that tarball and explore!

        
# convert the image into a tarball
docker save ghcr.io/some-natalie/some-natalie/secret-example:latest -o secrets.tar

# open up the tarball
mkdir secret-example
tar xf secrets.tar --directory=secret-example
cd secret-example

The directory structure isn’t human-friendly, but the secret is hidden somewhere in here.

.
├── blobs
│   └── sha256
│       ├── 44276f079bfa2a573e96b47f2de9b619ba3af8296f2e56c8a3179aa366b620a8
│       ├── 5d9ca308799efeead16b21fa95dc3ee54d34eb6e91fc1bd5a7ba7f2f68cd52f9
│       ├── 849b2beeb8b817251ac75a4bf897ef4ff4413faac65417b9d20a502aebebaf4e
│       ├── bcfdc337a6da7ceaae240acda0d8c51d3aa796e80f1bd2f9f34c1f0c5a9f32d3
│       ├── c17fffd7182302ee1e4bbf0511d749bcd423ea45c374243577795b5a1baae8d6
│       ├── d688176b49d54a555e2baf2564f4d3bb589aa34666372bf3d00890a244004d02
│       ├── ed5970c83cd47c1004fafa5577356a1e41eb0347d4408d77f5cf01762f810d66
│       └── f8d9d69c53752e0ac2a555f782a6155f6ef77ca3a1911408f51280d8a3cc6bae
├── index.json
├── manifest.json
├── oci-layout
└── repositories

3 directories, 12 files

🗺️ To help make sense of it, let’s look at the manifest.json file. It’s a map of everything in this archive. This is a simple image, with three layers and no extra metadata attached to it. The file has been truncated, but we now have the list of layers to look through.¹

[
  {
    "Config": "blobs/sha256/d688176b49d54a555e2baf2564f4d3bb589aa34666372bf3d00890a244004d02",
    "RepoTags": ["ghcr.io/some-natalie/some-natalie/secret-example:latest"],
    "Layers": [
      "blobs/sha256/bcfdc337a6da7ceaae240acda0d8c51d3aa796e80f1bd2f9f34c1f0c5a9f32d3",
      "blobs/sha256/f8d9d69c53752e0ac2a555f782a6155f6ef77ca3a1911408f51280d8a3cc6bae",
      "blobs/sha256/849b2beeb8b817251ac75a4bf897ef4ff4413faac65417b9d20a502aebebaf4e"
    ]
  }
]

Looking at each layer now

A small trick - each of those layers is also a tarball, just without the file extension. Now that we know which files contain layer data, it’s a simple task to untar them, then grep through the contents for the string or regex that we’re looking for. The contents of the plain text files copied in are in the layers … in plain text … secrets and all.

this is super secret serious business

actually, it's probably a boring api key

with root access, naturally
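
The untar-and-grep step described above can be sketched as a small shell function. This is a minimal sketch, not how any particular scanner works: `find_in_layers` is a hypothetical helper name, and it streams each layer's file contents through grep instead of unpacking them to disk. Rather than parsing manifest.json, it skips any blob that tar can't read, which filters out the JSON config and manifest blobs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search every layer blob of an extracted `docker save`
# archive for a pattern.
# Usage: find_in_layers <extracted-image-dir> <extended-regex>
find_in_layers() {
  local image_dir="$1" pattern="$2" blob hits
  for blob in "$image_dir"/blobs/sha256/*; do
    # Config and manifest blobs are JSON, not tarballs, so skip anything
    # that tar cannot list.
    tar -tf "$blob" >/dev/null 2>&1 || continue
    # Write the contents of every file in the layer to stdout and grep them,
    # treating binary data as text (-a). Skip layers with no match.
    hits="$(tar -xOf "$blob" 2>/dev/null | grep -aE -- "$pattern")" || continue
    printf 'layer %s:\n%s\n' "${blob##*/}" "$hits"
  done
}
```

Run from the parent of the secret-example directory extracted earlier, `find_in_layers secret-example 'super secret'` should name the layer that still holds the file and print the matching line.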

A container scanner or “secrets finder” works in a similar way to what we just did manually. It searches the image’s layers for strings that match regular expressions for known credential formats, or for high-entropy (read: very random) strings.
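
To make those two kinds of checks concrete, here is a rough sketch of both in one awk pass. Everything specific in it is an assumption for illustration: `scan_for_secrets` is a hypothetical name, the only "known pattern" is the familiar shape of an AWS access key ID (`AKIA` followed by uppercase letters and digits), and the 20-character minimum and 4.0 bits-per-character threshold are arbitrary starting points. Real scanners ship far larger rule sets and tune these numbers.

```shell
#!/usr/bin/env bash
# Hypothetical helper: flag lines on stdin that match a known token shape or
# contain a long, random-looking string.
# Usage: some-command | scan_for_secrets [entropy-threshold]
scan_for_secrets() {
  local threshold="${1:-4.0}"
  awk -v threshold="$threshold" '
    # Shannon entropy of a string, in bits per character.
    function entropy(s,    n, i, c, count, p, h) {
      n = length(s)
      for (i = 1; i <= n; i++) count[substr(s, i, 1)]++
      h = 0
      for (c in count) { p = count[c] / n; h -= p * log(p) / log(2) }
      return h
    }
    # Rule 1: a known credential shape (AKIA plus at least 16 more
    # uppercase letters or digits, 20 characters in total).
    match($0, /AKIA[0-9A-Z]+/) && RLENGTH >= 20 { print "pattern: " $0; next }
    # Rule 2: any token of 20 or more characters whose entropy is above
    # the threshold.
    {
      n = split($0, tokens, "[^A-Za-z0-9+/_-]+")
      for (i = 1; i <= n; i++)
        if (length(tokens[i]) >= 20 && entropy(tokens[i]) > threshold) {
          print "entropy: " $0
          next
        }
    }'
}
```

The demo secret in this post would sail straight past both rules, since it is ordinary English with no long tokens. Plain grep found it only because we knew what to look for, which is why scanners rely on recognizable token formats and randomness instead.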

Even if our “end process” can’t see that credential, anyone who can run the image must also pull it - meaning they can read every file in every layer that went into the build. This often means credentials or API tokens used in the build, but it’s a common exfiltration path for proprietary source code too.
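
Why can’t the end process see the file? In the OCI image layer format, a deletion is recorded in the later layer as an empty “whiteout” marker named `.wh.<filename>`. The marker hides the file from the merged view without touching the earlier layer that still holds it. Below is a minimal sketch for listing those markers in an extracted image. `list_whiteouts` is a hypothetical helper name, and the exact entries you see depend on how the image was built and saved.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list whiteout markers (recorded deletions) in each
# layer blob of an extracted `docker save` archive.
# Usage: list_whiteouts <extracted-image-dir>
list_whiteouts() {
  local image_dir="$1" blob entries
  for blob in "$image_dir"/blobs/sha256/*; do
    # Non-tar blobs produce no listing, so grep finds nothing and we move on.
    # Whiteout entries have a basename that starts with ".wh.".
    entries="$(tar -tf "$blob" 2>/dev/null | grep -E '(^|/)\.wh\.')" || continue
    printf 'layer %s deletes:\n%s\n' "${blob##*/}" "$entries"
  done
}
```

A layer produced by a step like `RUN rm -rf /not-a-secret.txt` would be expected to show up here with a `.wh.not-a-secret.txt` entry, while the layer from the `COPY` step still contains the real file.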

Parting thoughts

Exfiltration of sensitive information and credentials through container images is an easy risk to overlook in software distribution. Scanners may help find leaked secrets, but they will not block them. Reduce this risk by

Avoiding long-lived credentials like API keys, passwords, or deploy keys.
Not putting them in your build … ever.
If one must be in the build, deleting it during the build, then squashing your final image to ship a single-layer image.
Using build secrets where your builder supports them, which give the build access to a credential without storing it in the image.
Using a multi-stage build if you’re worried about source code leaking. Build in one stage, then copy the finished artifacts over to another one. This also reduces the finished image size.
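
As one example of build secrets, Docker’s BuildKit can mount a secret for a single RUN step. The sketch below writes a hypothetical Containerfile to show the shape of it. The file name, the `api_key` id, and the `wc -c` stand-in command are all assumptions for illustration, and other builders have their own syntax.

```shell
#!/usr/bin/env bash
# Write a hypothetical Containerfile that can read a secret only while one
# RUN step executes.
cat > Containerfile.build-secret <<'EOF'
# syntax=docker/dockerfile:1
FROM cgr.dev/chainguard/wolfi-base:latest

# The secret is mounted at /run/secrets/api_key for this step only and is
# not written into any layer. Counting its bytes is a stand-in for real
# work, such as authenticating to a private package registry.
RUN --mount=type=secret,id=api_key wc -c < /run/secrets/api_key
EOF

# Build with BuildKit, passing the secret file from the host (not run here):
#   docker build --secret id=api_key,src=secret.txt -f Containerfile.build-secret .
```

Repeating the docker save and untar steps from earlier on the resulting image is a quick way to check that the key doesn’t show up in any layer.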

basically the security of every long-lived credential

Footnotes

¹ You can also get a list of layers directly from the docker CLI using a little bit of jq wizardry. On a host using the overlay2 storage driver, this one-liner prints the layer directories: docker image inspect ghcr.io/some-natalie/some-natalie/secret-example:latest | jq '.[].GraphDriver.Data.UpperDir + ":" + .[].GraphDriver.Data.LowerDir | split(":") | reverse'

This post is licensed under CC BY-NC-SA 4.0 by the author.