Show HN: Tracking AI Code with Git AI


Over the Fourth of July weekend I got curious about how much of my weekly code was AI-generated. I was hoping to find something in the data that would help me learn to collaborate with AI agents better. What kinds of tasks are ok to delegate fully? Where do I need to stay very involved? How much of my agent code ends up being thrown out and rewritten a week later?

I was surprised how hard it was to get a good answer from Cursor and Claude Code. Their dashboards count the lines they insert, but because I'd revert a lot of those additions, the stats never came close to the line counts GitHub said I added over the same period.

What was missing

You can't know which code in your repo is AI-generated by counting inserted lines. You have to follow each inserted line into a commit, through any resets/rebases/merges, through a PR, and into a production build.

So I set out to build an enhanced git blame that tracks AI authorship alongside commit authorship. For every commit I wanted to know the human author, the AI agents used, and which prompts generated which lines. Critically, I had to be able to carry that information forward through any future git operations, reformats, refactors, etc. for as long as the code survived.

Once we had accurate attribution for every line, I reasoned that accurate stats would follow naturally (...this ended up being really hard too, but with a lot of help we got there).

Now, a few months later, that weekend project has hit 1.0.

Git AI Project Goals: build the standard for tracking AI code from development to production

  • Multi-agent from day 0. Most teams use a combination of AI agents -- they should all work well with Git AI.
  • Install per-machine, not per-repo. Related: teammates without Git AI installed shouldn't have a degraded experience.
  • Work 100% offline.
  • No background daemons, keyloggers or filewatchers. 🤮
  • Avoid heuristics. Coding agents are responsible for explicitly marking the code they contribute as AI-generated. Git AI is responsible for tracking that code going forward.
  • Unnoticeable performance impact: <100ms for common commands, <1s for large rebases or resets.
  • Git-native and compatible with any SCM (stores AI attributions in Git notes).

tl;dr - With a lot of help from the community, we figured out how to reliably track AI code through any Git workflow. If you want to give it a try, install the Git AI binary. The rest of this post explains how we solved the problem.

Big thanks to svarlamov, mm-zacharydavidson, AtnesNess and hsethiya for their contributions! And shout out to msfeldstein at Cursor for giving us early access to Agent hooks so we could make our Cursor integration much more accurate.

Release highlights:

  • AI Authorship is now preserved across all major Git rewrite operations. You can merge, rebase, cherry-pick, squash and reset without losing your AI attributions.
  • AI Authorship is preserved when you copy/paste code to different parts of your project.
  • Performance is ~800-1000x faster and scales with commit size, not repo size (the same scaling characteristics as git).
  • Enterprise configuration options to support rolling Git AI out via MDM

Demo Video

How it works

When you install Git AI, the script:

  • Configures hooks for all the supported agents on your machine. These hooks call the git-ai checkpoint command to mark their contributions as AI-generated.
  • Wraps your git binary so Git AI can hook into commit, rebase, reset, cherry-pick, and other history-changing commands without needing to install hooks in every local repository (see the Performance FAQs).
  • Adds the git-ai command to your path. This is the home of AI blame, stats, and other Git AI-specific commands.

Once Agent Hooks are configured and the shell path is updated to map invocations of git to the git-ai binary, you're ready to go. Nothing about your workflow needs to change; Git AI will just start tracking AI code as you work.
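
To make the wrapping concrete, here's a minimal sketch of the dispatch idea -- not Git AI's actual implementation, and the hook points and hard-coded git path are assumptions: a shim checks whether the invoked subcommand can rewrite history, runs checkpoint logic around it, and delegates everything else straight through to the real git.

    use std::process::Command;

    // Subcommands around which Git AI would need to reconcile authorship state.
    const HISTORY_COMMANDS: &[&str] = &["commit", "rebase", "reset", "cherry-pick", "merge"];

    fn main() {
        let args: Vec<String> = std::env::args().skip(1).collect();
        let is_history_op = args
            .first()
            .map(|c| HISTORY_COMMANDS.contains(&c.as_str()))
            .unwrap_or(false);

        if is_history_op {
            // Assumption: snapshot pending changes as a checkpoint first.
            let _ = Command::new("git-ai").args(["checkpoint"]).status();
        }

        // Delegate to the real git binary with the original arguments.
        // (A real shim would resolve the path instead of hard-coding it.)
        let status = Command::new("/usr/bin/git")
            .args(&args)
            .status()
            .expect("failed to run git");

        if is_history_op && status.success() {
            // This is where post-rewrite handling (rebuilding Authorship
            // Logs for the new commits) would run.
        }

        std::process::exit(status.code().unwrap_or(1));
    }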

Checkpoints & Authorship

Git AI uses a sequence of local checkpoints to determine who authored each line of code in a commit. Think of each checkpoint as a lightweight snapshot that tracks changes—they're similar to commits but stay entirely on your machine and never enter your Git history.

When a Coding Agent decides to edit files:

  • The PreEdit hook is triggered, creating a human checkpoint -- basically telling Git AI that any changes since the last checkpoint were made by a human.
  • The agent applies its changes to the file system.
  • Immediately after writing, the PostEdit hook is triggered, explicitly marking all the changes the Agent just made as AI-generated.
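
Here's a toy model of that flow, with made-up types and a naive positional line comparison standing in for real diffing; the core idea is just that every change between two checkpoints is attributed to the author of the later one.

    use std::collections::HashMap;

    // Who gets credit for changes since the previous checkpoint.
    #[derive(Clone, Debug)]
    enum Author {
        Human,
        Agent(String), // e.g. Agent("claude-code".into())
    }

    // A lightweight local snapshot: file path -> file contents.
    struct Checkpoint {
        author: Author,
        files: HashMap<String, String>,
    }

    // Attribute every line that changed between two checkpoints to the
    // author of the *later* checkpoint.
    fn attribute(prev: &Checkpoint, next: &Checkpoint) -> HashMap<String, Vec<(usize, Author)>> {
        let mut out = HashMap::new();
        for (path, new_text) in &next.files {
            let old_lines: Vec<&str> = prev
                .files
                .get(path)
                .map(|t| t.lines().collect())
                .unwrap_or_default();
            let mut changed = Vec::new();
            for (i, line) in new_text.lines().enumerate() {
                if old_lines.get(i) != Some(&line) {
                    changed.push((i + 1, next.author.clone()));
                }
            }
            out.insert(path.clone(), changed);
        }
        out
    }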

When you commit, Git AI reduces the checkpoints to an Authorship Log that tracks conversation threads (by hash) and the lines each prompt added ("+") and removed ("-") in each file. For example:

/path/to/file.rs 9fc943b -5 +22 +31-59 +101-103 cda9aa2 +2

Authorship Logs are saved as a Git note attached to the commit. These line ranges are valid in the context of their commit, but as more code is added/removed/updated in future commits, the lines will move around -- that's why we had to build git-ai blame and support for tracking AI code through all of Git's rewrite operations.
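
The way I read that format: a file path, followed by groups, where each group is a prompt hash plus the +added / -removed line ranges it produced (I'm reading +31-59 as "added lines 31 through 59"). A parser sketch under that assumption -- the type and field names are mine, not Git AI's:

    // One attribution entry: the lines a prompt (identified by a short
    // hash) added or removed in a file.
    #[derive(Debug, Default)]
    struct PromptEntry {
        prompt_hash: String,
        added: Vec<(u32, u32)>,   // inclusive line ranges
        removed: Vec<(u32, u32)>,
    }

    // Parse a line like:
    //   /path/to/file.rs 9fc943b -5 +22 +31-59 +101-103 cda9aa2 +2
    fn parse_log_line(line: &str) -> Option<(String, Vec<PromptEntry>)> {
        let mut tokens = line.split_whitespace();
        let path = tokens.next()?.to_string();
        let mut entries: Vec<PromptEntry> = Vec::new();
        for tok in tokens {
            if let Some(rest) = tok.strip_prefix('+') {
                entries.last_mut()?.added.push(parse_range(rest)?);
            } else if let Some(rest) = tok.strip_prefix('-') {
                entries.last_mut()?.removed.push(parse_range(rest)?);
            } else {
                // A bare token starts a new prompt group.
                entries.push(PromptEntry {
                    prompt_hash: tok.to_string(),
                    ..Default::default()
                });
            }
        }
        Some((path, entries))
    }

    fn parse_range(s: &str) -> Option<(u32, u32)> {
        match s.split_once('-') {
            Some((a, b)) => Some((a.parse().ok()?, b.parse().ok()?)),
            None => {
                let n: u32 = s.parse().ok()?;
                Some((n, n))
            }
        }
    }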

Git AI Blame

Checkpoints and Authorship Logs track AI code within a single commit, but tracking AI code across your entire repository history means spanning many commits. Git's blame command provides the robust logic needed to track how line ranges change across commits, so we built on top of it.

When you run git-ai blame file.txt:

  • It first runs git blame with porcelain output (which includes each line's number in its original commit).
  • It looks up Authorship Logs for all the commits that appear in the file's blame.
  • It checks whether each line appears in an Authorship Log (using the original line number from that commit) and, if so, extracts the Coding Agent's name.
  • It overlays the AI author on top of the regular blame output and prints it to stdout or your pager (for long files).
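
Here's a sketch of steps 1, 3, and 4, assuming the Authorship Logs have already been loaded into a lookup table keyed by (commit SHA, original line number) -- that loading step is elided. Only the porcelain header lines matter for the lookup: each starts with a 40-character SHA followed by the line's original and final line numbers.

    use std::collections::HashMap;
    use std::process::Command;

    // Assumed to be built from the Authorship Log notes beforehand:
    // (commit sha, line number in that commit) -> agent name.
    type AiIndex = HashMap<(String, u32), String>;

    fn ai_blame(path: &str, ai: &AiIndex) {
        let out = Command::new("git")
            .args(["blame", "--porcelain", path])
            .output()
            .expect("git blame failed");

        let mut current: Option<(String, u32)> = None;
        for line in String::from_utf8_lossy(&out.stdout).lines() {
            let fields: Vec<&str> = line.split(' ').collect();
            // Porcelain header: "<40-hex sha> <orig line> <final line> [count]".
            let is_header = fields.len() >= 3
                && fields[0].len() == 40
                && fields[0].chars().all(|c| c.is_ascii_hexdigit());
            if is_header {
                if let Ok(orig) = fields[1].parse::<u32>() {
                    current = Some((fields[0].to_string(), orig));
                }
            } else if let Some(content) = line.strip_prefix('\t') {
                // Content lines are tab-prefixed; overlay the AI author, if any.
                let tag = current
                    .as_ref()
                    .and_then(|key| ai.get(key))
                    .map(|agent| format!("[{agent}] "))
                    .unwrap_or_default();
                println!("{tag}{content}");
            }
        }
    }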

Today git-ai blame just shows the author's name, but it also knows the prompt that yielded each line of AI-generated code. Have ideas for what we should do with them? Open a feature request issue or a PR.

Rebases, squashes, and other rewrites

History-rewriting operations are part of our daily git workflows, but they are a problem for Git AI because they create new commits (without Authorship Logs) and unwind or recombine previous ones (invalidating existing Authorship Logs).

We solve this problem by rewriting Authorship Logs after history rewrites complete:

  • After a reset, Git AI copies AI attribution out of the old Authorship Logs and into the Checkpoints, so that the code that ended up in your working copy or index has correct AI attributions when you commit.
  • After a cherry-pick, Git AI uses blame to find the AI code in the files you picked and preserves those attributions in the Authorship Log (if immediately committed) or the Checkpoints (if staged).
  • After a rebase, Git AI walks the affected commits and rebuilds authorship commit by commit. We snapshot the AI attributions from the original branch head, then replay every new commit produced by the rebase in order. Our AI-aware blame replays the commit’s diff to find the attribution segments that appeared in that commit and writes them into the new Authorship Log. If the content was reordered, split, or tweaked during conflict resolution, we still match it back to the original AI author so nothing is lost. Dropped commits simply contribute nothing to the final authorship notes. Squashed commits have their AI attributions merged into the single squashed Authorship Log.
  • After a merge --squash, Git AI computes the new AI Authorship from all the squashed commits and adds Checkpoints to mark all the AI code in the working tree / index.
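
To show the shape of the replay step, here's a deliberately simplified sketch that matches added lines back to a snapshot by content. The real implementation is blame-based (which is how reordered and split content survives); the snapshot-by-content strategy and type names below are illustrative assumptions, not Git AI's code.

    use std::collections::{HashMap, HashSet};

    // Snapshot from the original branch head before the rebase: the set
    // of (trimmed) line contents known to be AI-authored at that point.
    fn snapshot_ai_lines(head_lines: &[(String, bool)]) -> HashSet<String> {
        head_lines
            .iter()
            .filter(|(_, is_ai)| *is_ai)
            .map(|(text, _)| text.trim().to_string())
            .collect()
    }

    // Replay one rewritten commit: given the lines its diff added, decide
    // which ones keep AI attribution in the rebuilt Authorship Log.
    fn reattribute(
        added: &[(u32, String)],
        ai_snapshot: &HashSet<String>,
    ) -> HashMap<u32, bool> {
        added
            .iter()
            .map(|(lineno, text)| (*lineno, ai_snapshot.contains(text.trim())))
            .collect()
    }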

That's it! Lots of hard work went into making this reliable and performant, especially over the last month! Thank you to everyone who tested, opened Issues and contributed to the repo.

Seems fun to work on? Here are some areas where we could use new contributions!

  • We want Git AI to become the standard for tracking AI code from development to production. Today we only support Claude Code, Cursor and GitHub Copilot in VSCode. Does your company build coding agents? Do you use one we don't support? Contribute the integration!
  • Work at GitHub, GitLab, Azure DevOps or BitBucket? Help us get the PR diff view to show reviewers which code was AI generated!
  • Start using Git AI with your team, help us find issues, and (if you have time) help us fix them! The project has gotten 1000x better since large teams started using it and sharing feedback.