5. Allocating responsibility for AI-authored code at the protected branch boundary

LLMs are good at writing code, and they're good at catching bugs and vulnerabilities. It does not follow that they write bug-free code. A bug can be a gap in the user's understanding, or something the model wrote because it was never given a complete specification, or anything else.

Code reviews and agile sprint retrospectives don't (we presume) assign blame, but let's at least agree that they identify responsibility so that the same failure doesn't happen again.

Whatever path you took to get there, a push to a protected branch in your source repo is the point where the code has, presumably, been seen and reviewed by a human. A protected branch is a contract. Whatever lands there is what the team — and the company — has agreed to ship. When a human approves an AI-authored PR, they aren't endorsing the AI; they're stepping forward as the person responsible for that code. The reviewer owns the merge.
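That contract can be made mechanical. A minimal sketch, assuming GitHub's branch-protection API via the `gh` CLI (OWNER, REPO, and the reviewer count are placeholders; this is a config fragment, not a drop-in script):

```shell
# Sketch only: enforce "a human must approve before anything lands on main".
# Requires an authenticated `gh` CLI; OWNER/REPO are placeholders.
gh api -X PUT repos/OWNER/REPO/branches/main/protection --input - <<'EOF'
{
  "required_pull_request_reviews": { "required_approving_review_count": 1 },
  "required_status_checks": { "strict": true, "contexts": [] },
  "enforce_admins": true,
  "restrictions": null
}
EOF
```

The point of `enforce_admins` is that nobody, however senior, gets to skip the human-approval step.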

That sounds heavy, and it is. Organizations run on accountability at every level. An AI can't be paged at 3am. It can't attend a postmortem. It can't be fired or promoted. So the question isn't whether a human is on the hook — that's settled — but whether the human has done enough work to deserve to be. This post is about that work: how to catch up to a change you didn't write, and how to know when to say yes, no, or "not yet."

Catching up with code

When you write code yourself, understanding is a byproduct of typing it; by the time you've finished nursing and tuning it, you've practically memorized it. When you review AI-generated code, understanding is something you have to actively go and get. The diff looks plausible, the tests pass, and the PR description is well written. None of that means you understand it. You wrote the spec and the prompt, so you have some idea, but plausibility is not the same as comprehension. AI-generated pull requests read smoothly because the model is good at producing smooth prose and conventional code. Human-written PRs from a tired colleague at 4pm are often messier and easier to challenge; your brain notices the seams. Smooth code lulls you into inaction. You have to deliberately puncture the smoothness and make a little mess.

Practical ways to catch up, ordered by cost:

  1. Read the diff twice. First pass: what's the shape of the change? Second pass: what could be wrong with it? The second pass is where review actually happens. Get another LLM to play devil's advocate, and point it at the areas you suspect are buggy.
  2. Re-derive the why. Without looking at the PR description, ask yourself: what problem is this solving, and is this the right shape of the solution?
  3. Trace one path end-to-end. Pick the most important code path the change touches and walk it from entry point to exit, through the new code. If you can't follow it without referring back to the PR, you're not ready to merge.
  4. Run it. Not just the tests — the actual feature. UI changes get exercised in a browser. API changes get hit with curl or Postman. Get AI to write some unit and integration tests. "CI builds" answers a different question than "does it actually work."
  5. Look one layer out. What calls this? What does this call? AI-generated changes tend to be locally correct and globally naive — the function works, but the third caller you didn't read passes a value the new code doesn't handle. The AI may not be to blame, as it may be your prompt or context that was too narrow.
  6. Imagine the rollback. If this breaks in production, what's the revert path? If revert is "just revert the commit," you're in good shape. If revert is "revert the commit and also run a script and also notify customers", then the bar for understanding thoroughly has just gone up.
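Step 6 can be rehearsed rather than imagined. A minimal sketch in a throwaway repo (file names and messages are hypothetical stand-ins for the real change): revert the candidate commit locally and see whether anything else has to move with it.

```shell
# Rehearse the revert before merging, in a scratch clone of the branch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email reviewer@example.com
git config user.name Reviewer
echo "stable" > feature.txt
git add feature.txt && git commit -qm "baseline"
echo "ai-generated change" > feature.txt
git add feature.txt && git commit -qm "AI-authored change"
# The rehearsal: does a plain revert restore the old behavior cleanly?
git revert --no-edit HEAD >/dev/null
cat feature.txt   # prints "stable": the revert path is clean
```

If the rehearsal needs extra steps (a data script, a customer notice), that's your answer about how high the bar for understanding is.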

If you can't do steps 1–4 in the time you have, that's a signal — not necessarily to say no, but to stop and ask what's actually going on.

When to say yes

A confident yes usually has these properties:

  • Bounded blast radius. If this is wrong, the damage is contained — a single feature, a single endpoint, a single user flow. You can name what would break.
  • Reversible. A revert undoes it cleanly. No data migration to unwind, no external API call already made, no message already sent. No apologies to customers and no meeting with the legal department.
  • You understand the change. You could explain it to a colleague without reading the PR description. You could defend the design choices.
  • The tests test the right thing. Not just "tests exist" — the tests would actually fail if the behavior regressed. AI-generated tests are particularly prone to testing the implementation instead of the behavior.
  • It matches the team's rules. Style, structure, error handling and observability. It should look like code your team would have written, because it'll be maintained as if it were. AI coding and code-review tools give you ample opportunity to document these conventions and make the model follow them.

When all five hold, approve and move on. Hesitation is its own cost; over-scrutinizing low-risk changes burns the trust budget you'll need for genuinely tricky ones.
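Whether the tests test the right thing can be checked cheaply: break the behavior on purpose and confirm the suite goes red. A toy sketch of that check, where the "app" and its "test" are one-line stand-ins for a real codebase:

```shell
# Poor man's mutation test: a test that still passes after you break the
# behavior is testing the implementation, not the behavior.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email reviewer@example.com
git config user.name Reviewer
# Stand-in "code" and "test": the test passes only if the fix is present.
echo "fixed" > app.txt
printf '#!/bin/sh\ngrep -q fixed app.txt\n' > test.sh
chmod +x test.sh
git add . && git commit -qm "AI-authored fix plus test"
./test.sh && echo "test passes with the fix in place"
# Undo just the fix and confirm the test now fails.
echo "broken" > app.txt
if ./test.sh; then echo "test is vacuous"; else echo "test guards the behavior"; fi
git checkout -q -- app.txt   # restore the fix
```

Two minutes of this is often more informative than rereading the test file.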

When to say no

A no — or at least a "not like this" — is right when:

  • You can't predict the failure modes. If you can't list two or three plausible ways this breaks in production, you don't understand it well enough to ship it.
  • The change is one-way. Schema migrations, dropped columns, deleted data, public API changes, pricing logic, security boundaries. The AI doesn't know which doors are one-way; you do.
  • The diff is too large to actually review. A 2,000-line PR doesn't get reviewed; it gets waved through. Send it back and ask for it to be split. The author (human or otherwise) can split it; you can't un-merge it.
  • It touches something the AI can't have full context on. Cross-team contracts, undocumented secrets, organization lore like "we did it this way because of an incident in 2023", or regulatory and legal imperatives. If the change contradicts something only humans know, that's a no until a human writes down why.
  • The fix is suspiciously local. Bugs are often symptoms. A change that patches the symptom without naming the root cause is a yellow flag, and a red flag if the AI proposed it.
  • You're OKing under time pressure. "I'll just merge it, it's probably fine" is the failure mode the protected branch exists to prevent. If you don't have time to review, you don't have time to merge.
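The "too large to actually review" bullet can even become a soft gate. A sketch with an arbitrary 400-line threshold, using a throwaway two-branch repo to stand in for main and the PR branch:

```shell
# Soft size gate: flag PRs whose total changed lines exceed a threshold.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email reviewer@example.com
git config user.name Reviewer
git checkout -qb main
seq 1 10 > base.txt && git add . && git commit -qm "baseline"
git checkout -qb feature
seq 1 500 > big.txt && git add . && git commit -qm "large AI-authored change"
limit=400   # arbitrary; tune per team
# Sum added + deleted lines between the merge base and the PR branch.
changed=$(git diff --numstat main...feature | awk '{n += $1 + $2} END {print n}')
if [ "$changed" -gt "$limit" ]; then
  echo "PR touches $changed lines (> $limit): ask for a split"
fi
```

In a real setup the same one-liner would run in CI against `origin/main...HEAD` and post a comment rather than block outright.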

The middle ground: "not yet"

Most reviews aren't yes or no — they're not yet. Some common scenarios:

  • "Split this PR." The single highest-leverage thing a reviewer can ask for. A 400-line PR split into four 100-line PRs gets four times the review quality, not one quarter.
  • "Code coverage: add a test." Especially for AI-authored fixes. The test is the artifact that suggests the human author understood the potential failure.
  • "Explain this in the PR description, not in a comment." If the why isn't obvious, it belongs in the description. Comments rot; PR history doesn't.
  • "Show me it working." A screenshot, a curl output, a log line. Cheap to produce, expensive to fake, dramatically raises confidence.
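"Show me it working" evidence is usually one command. A sketch using a local stand-in server (a real PR would hit the staging endpoint instead; port and filenames are arbitrary):

```shell
# Capture the actual call and response status as pasteable PR evidence.
# A throwaway local server stands in for the staging environment here.
python3 -m http.server 8099 --directory "$(mktemp -d)" >/dev/null 2>&1 &
server=$!
for i in 1 2 3 4 5; do
  curl -sS -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8099/ \
    > evidence.txt 2>/dev/null && break
  sleep 1
done
cat evidence.txt
kill $server
```

Paste `evidence.txt` (or the equivalent screenshot or log line) into the PR and the conversation gets much shorter.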

A rule worth adopting

Make it explicit, in writing, that the merger is the author of record for protected-branch purposes. Once a team internalizes that, the review behavior changes on its own — people stop merging things they don't understand, because they understand they're signing for them.

It follows that the team has to make that level of review possible. That means: smaller PRs, clear descriptions, sane schedules, and a culture where "I don't understand this yet" is a normal thing to say out loud. Without those, the human at the merge button becomes a non-responsible ghost — just a name on a commit that nobody actually read.

The protected branch is the line where the team's collective judgment becomes the company's code. AI can help you get to that line faster. It can't cross it for you.