Finding secrets on GitLab

Everyone’s been there before, you included something you shouldn’t have in a commit so you undo the commit and force push.

This data’s gone, right?(!)

This blog post is about finding those mistakenly published commits, and to serve as a reminder that you should always rotate potentially exposed credentials even if you think you’ve deleted them.

How can these “missing commits” still exist?

When you force pushed your branch you rewrote the branch history to no longer reference those commits. This doesn’t remove those commits though, it just detached them from the chain of commits that represents the branch. commits still exist on the GitLab server despite being inaccessible via the branches.

Here’s a writeup of how this works locally

How do we find these missing commits

In theory GitLab have an API for getting commits from a repository, and there’s a parameter for getting all commits.

As of the time of writing, this API fails to return these dangling commits https://gitlab.com/gitlab-org/gitlab/-/issues/443263 which I consider a bug.

However, every time a commit is pushed a pushed to event is generated by GitLab. An example of this is below.

{
    "id": 3182909502,
    "project_id": 55263882,
    "action_name": "pushed to",
    "target_id": null,
    "target_iid": null,
    "target_type": null,
    "author_id": 20255114,
    "target_title": null,
    "created_at": "2024-02-24T21:29:08.711Z",
    "author": {
      "id": 20255114,
      "username": "RichardoC",
      "name": "Richard Tweed",
      "state": "active",
      "locked": false,
      "avatar_url": "https://secure.gravatar.com/avatar/3af3c4bc67b146186dbd2ef852f8faa16c91d9268e81b7b292168e7dc1fdc7b6?s=80&d=identicon",
      "web_url": "https://gitlab.com/RichardoC"
    },
    "push_data": {
      "commit_count": 1,
      "action": "pushed",
      "ref_type": "branch",
      "commit_from": "31ed8bb6dd627bd38fba1aa350a15136c636c932",
      "commit_to": "7faedfa06dd7dda69ca94169c9ec01aa605da08b",
      "ref": "main",
      "commit_title": "Tidy readme",
      "ref_count": null
    },
    "author_username": "RichardoC"
}

These events reference the commit_from and commit_to commits. These can be used to generate a set of all commits that actually happened over the last 3 years (they’re only retained for 3 years)

By comparing that set with the official history (the all commits API above) we can find dangling commits, and generate the URL where they can be seen.

Very cool, where’s the code?

https://github.com/RichardoC/gitlab-secrets

An example Gitlab repo with dangling commits can be found at https://gitlab.com/gitlab-secrets/gitlab-secrets for you to test this out with

Original idea

This was based on the work by Neodyme where they used this technique to find secrets on GitHub


Copyright © 2024 Richard Finlay Tweed. All rights reserved. All views expressed are my own