Ghosts in Commits: How I Earned $64,000 from Deleted Files in Git

A security researcher built a system to scan thousands of public GitHub repositories, recovering deleted files from Git history to find leaked API keys and tokens, earning over $64,000 in bug bounties.

The Core Idea

I built a system for finding leaked secrets in public GitHub repositories by recovering deleted files from Git history. The result: hundreds of active API keys, tokens, and credentials that were still sitting in commit history after their owners had tried to delete them.

How the System Works

The process consists of several stages:

  • Step 1: Set up 10 servers (cloud machines, VPS, Raspberry Pi) with 120+ GB of free space
  • Step 2: Clone repositories using the gh CLI tool
  • Step 3: Recover deleted files by comparing commits through git diff
  • Step 4: Scan using TruffleHog with the --only-verified flag
  • Step 5: Send notifications to Telegram when active secrets are found
  • Step 6: Delete repositories to free up disk space
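
The clone, scan, notify, and clean-up steps above can be sketched as a single function. This is a minimal illustration rather than the author's actual tooling: the function names, the TELEGRAM_TOKEN and TELEGRAM_CHAT_ID environment variables, and the example repository name are all assumptions, and the deleted-file recovery step is left out for brevity.

```shell
# Sketch of one pipeline iteration: clone -> scan -> notify -> clean up.
# TELEGRAM_TOKEN and TELEGRAM_CHAT_ID are assumed environment variables.

notify_telegram() {
  # Telegram Bot API sendMessage call; the response body is discarded.
  curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_TOKEN}/sendMessage" \
    -d "chat_id=${TELEGRAM_CHAT_ID}" \
    --data-urlencode "text=$1" >/dev/null
}

process_repo() {
  local repo="$1"              # e.g. "some-org/some-repo" (hypothetical)
  local workdir findings
  workdir=$(mktemp -d)

  if gh repo clone "$repo" "$workdir/repo"; then
    # --only-verified keeps only secrets TruffleHog confirmed as live.
    findings=$(trufflehog git "file://$workdir/repo" --only-verified --json 2>/dev/null)
    if [ -n "$findings" ]; then
      notify_telegram "Verified secret(s) in $repo"
    fi
  fi

  rm -rf "$workdir"            # Step 6: free disk space before the next repo
}
```

Calling process_repo once per repository name keeps disk usage bounded to a single clone at a time, which is presumably why the delete step comes last in the list.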

Technical Recovery Methods

Recovery via diff: By comparing each commit with its parent, we can identify deleted files and extract their contents:

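git rev-list --all | while read -r commit; do
  # Root commits have no parent; for merges, diff against the first parent only.
  parent=$(git rev-parse --quiet --verify "$commit^") || continue
  # Print "<parent>:<path>" for each deleted file; `git show` on that spec prints the file.
  git diff --name-only --diff-filter=D "$parent" "$commit" | sed "s|^|$parent:|"
done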
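
The loop above only lists deletions. To write the deleted contents to disk for scanning, one option is to read each file from the parent commit with git show. The sketch below assumes a hypothetical helper named restore_deleted and an output layout of out_dir/parent-sha/path; paths containing newlines are not handled.

```shell
# restore_deleted <repo_dir> <out_dir>
# Writes the last stored version of every deleted file under <out_dir>.
restore_deleted() {
  local repo="$1" out="$2" commit parent path dest
  mkdir -p "$out"
  git -C "$repo" rev-list --all | while read -r commit; do
    # Skip root commits; merges are compared against their first parent.
    parent=$(git -C "$repo" rev-parse --quiet --verify "$commit^") || continue
    git -C "$repo" -c core.quotepath=off diff --name-only --diff-filter=D \
        "$parent" "$commit" |
      while IFS= read -r path; do
        dest="$out/$parent/$path"
        mkdir -p "$(dirname "$dest")"
        # The file still exists in the parent commit's tree.
        git -C "$repo" show "$parent:$path" > "$dest"
      done
  done
}
```

Grouping output by parent commit keeps two deleted versions of the same path from overwriting each other.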

Extracting dangling objects: By default, Git keeps unreachable objects for about two weeks (the gc.pruneExpire setting) before garbage collection prunes them. We can list unreachable blobs with:

git fsck --unreachable --dangling --full | grep blob | awk '{print $3}'
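
The command above prints object IDs only. A possible next step is to dump each blob to a file so a scanner can read it. The helper name dump_unreachable_blobs is hypothetical, and --no-reflogs is an added assumption so that objects referenced only by reflogs are also reported.

```shell
# dump_unreachable_blobs <repo_dir> <out_dir>
# Writes every unreachable blob to <out_dir>/<sha>.
dump_unreachable_blobs() {
  local repo="$1" out="$2" sha
  mkdir -p "$out"
  git -C "$repo" fsck --unreachable --no-reflogs --full 2>/dev/null |
    awk '$2 == "blob" {print $3}' |
    while read -r sha; do
      # cat-file -p prints the raw blob contents.
      git -C "$repo" cat-file -p "$sha" > "$out/$sha"
    done
}
```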

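Additionally, .pack files contain compressed objects that may include deleted content. Even after running git filter-branch, the old objects can persist in pack files, because filter-branch keeps backup refs under refs/original/ and reflogs continue to point at the old commits until they expire.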
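
To see what a pack actually holds, git verify-pack can list its objects. The sketch below assumes a non-bare repository and a hypothetical helper named list_packed_blobs; it prints the ID and size of every blob stored in any pack.

```shell
# list_packed_blobs <repo_dir>
# Prints "<sha> <size>" for each blob stored in a pack file.
list_packed_blobs() {
  (
    cd "$1" || exit 1
    for idx in .git/objects/pack/pack-*.idx; do
      [ -e "$idx" ] || continue      # repository has no packs yet
      # verify-pack -v lines look like: <sha> <type> <size> <packed-size> <offset> ...
      git verify-pack -v "$idx" | awk '$2 == "blob" {print $1, $3}'
    done
  )
}
```

Any ID printed here can be inspected with git cat-file -p, whether or not a current commit still references it.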

Discovered Secrets and Rewards

The most valuable findings and their bounty ranges:

  • GCP/AWS tokens: $5,000–$15,000
  • Slack tokens: $3,000–$10,000
  • GitHub tokens: $5,000–$10,000
  • OpenAI tokens: $500–$2,000
  • SMTP credentials: $500–$1,000

Total reward: $64,350

Less Valuable Findings

Not everything found was worth reporting:

  • Test private keys (not used in production)
  • One-time accounts
  • Canary tokens (honeypots/traps)
  • Open Web3 API keys
  • Frontend API keys with limited permissions

Why Secrets Were Leaked

1. Lack of Git knowledge: Developers deleted files locally, but Git preserves the entire history. A git rm followed by a new commit removes the file only from the latest snapshot; every earlier commit still contains it.

2. Neglecting .gitignore: Binary files (.pyc, .pdb) were accidentally committed and later deleted, but the compiled code inside them contained embedded secrets: API keys, database connection strings, and tokens hardcoded at build time.

3. Insufficient history rewriting: Rewriting history with git filter-branch does not by itself delete the old objects. Until backup refs and reflogs are cleared and garbage collection prunes them, the "cleaned" content can persist in .pack files.
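
For the third point, a local clean-up after a history rewrite might look like the sketch below. The helper name purge_old_objects is hypothetical, and it only affects the local repository: copies on a hosting service, in forks, or in other people's clones are untouched.

```shell
# purge_old_objects <repo_dir>
# Removes leftovers that keep pre-rewrite objects alive, then prunes them.
purge_old_objects() {
  local repo="$1" ref
  # filter-branch leaves backups under refs/original/; drop them first.
  git -C "$repo" for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do git -C "$repo" update-ref -d "$ref"; done
  # Expire every reflog entry so nothing still points at the old commits.
  git -C "$repo" reflog expire --expire=now --all
  # Prune unreachable objects immediately instead of after the grace period.
  git -C "$repo" gc --prune=now --quiet
}
```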

Key Takeaways

Most leaks came from deleted binary files that contained compiled code with embedded secrets. Git keeps every object that is reachable from history in .git/objects indefinitely; only unreachable objects are pruned, by default about two weeks after they become unreachable. In a public repository, anyone can copy them in the meantime and keep them indefinitely.

The lesson is clear: if a secret has ever been committed to a public repository, consider it compromised. Rotate it immediately. Don't rely on deletion alone: rewrite history using tools like git filter-repo (not the older git filter-branch), and always use pre-commit hooks and secret scanning tools to prevent leaks in the first place.
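
As an illustration of the pre-commit idea, the sketch below checks staged changes against a few well-known token shapes. The function name staged_secrets and the pattern list are assumptions made for this example; a dedicated scanner covers far more formats.

```shell
# Patterns for an AWS access key ID, a GitHub personal access token,
# and a PEM private key header (illustrative, not exhaustive).
PATTERNS='AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36}|-----BEGIN [A-Z ]*PRIVATE KEY-----'

# staged_secrets: prints matching added lines; exit status 0 means a match.
staged_secrets() {
  git diff --cached -U0 --no-color |
    grep '^+' |            # keep added lines only
    grep -v '^+++ ' |      # drop the "+++ b/<file>" headers
    grep -E -e "$PATTERNS"
}
```

A .git/hooks/pre-commit script could call staged_secrets and exit with a non-zero status when it succeeds, which makes Git abort the commit.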