# Your AI-Written Codebase Is a Gift to Your Competitors
Companies shipping AI-generated code have made a trade they don’t fully understand. They carry full liability if the AI copied someone else’s work, but they gain zero IP protection for what it produced. Under settled US law, code written entirely by AI belongs to no one. It enters the public domain the moment it is generated. Every company publicly crediting AI for their codebase is handing competitors a legal argument to copy it.
Key terms used throughout this post:
- DMCA (Digital Millennium Copyright Act) – the US law that lets copyright holders send takedown notices to platforms hosting infringing content. A DMCA takedown is how companies force platforms like GitHub to remove code that violates their copyright.
- USCO (US Copyright Office) – the federal body that registers copyrightable works and issues guidance on what qualifies for protection.
- Clean-room defense – a legal strategy where one team reads the original code and writes a spec, and a separate unexposed team builds the new implementation from that spec alone.
This is not theoretical. On March 31, 2026, Anthropic accidentally leaked the entire source code of Claude Code – 512,000 lines of TypeScript. Within hours, a developer fed it into OpenAI’s Codex and produced a Python rewrite. That rewrite has 75,000+ GitHub stars. Anthropic has not taken it down. The reason they haven’t is the reason you should care about this.
## The law is settled
The USCO requires human authorship for copyright protection. The courts agree at every level:
```mermaid
timeline
    title AI Copyright Rulings — US
    Aug 2023 : Thaler v. Perlmutter — "Human authorship is a bedrock requirement"
    Jan 2025 : USCO Part 2 Report — Detailed prompting alone does not qualify
    Mar 2025 : D.C. Circuit affirms Thaler — AI is a tool, conception is human
    Jul 2025 : Bartz v. Anthropic — AI training on purchased content is fair use
    Mar 2 2026 : Supreme Court denies cert in Thaler — final word, no appeals left
    Mar 31 2026 : Claude Code leak — the Python port tests every open question
```
The authorship question closed steadily over three years, culminating in the Supreme Court’s refusal to hear the case. There is no remaining avenue for appeal.
The USCO has registered “hundreds” of AI-assisted works, but only where humans exercised “ultimate creative control” over expressive elements. Where you fall on the spectrum:
| Scenario | Copyrightable? | Why |
|---|---|---|
| Accept AI suggestion verbatim | No | No human expression in the output |
| Prompt AI, select from outputs, arrange | Maybe | Depends on whether selection rises to creative control |
| Use AI to draft, then substantially rewrite | Likely yes | Human expression exists in the modifications |
| Human writes code, AI autocompletes boilerplate | Yes | Human is the author; AI is the tool |
The table answers one question: where does tool use end and delegation begin? The line is fact-specific and largely untested in court. Most enterprise AI usage falls in the first two rows.
## The Claude Code leak proves the problem
The leak created three layers of legal mess, each harder than the last.
Layer 1: Direct mirrors. Anthropic filed DMCA takedowns against repos hosting the raw leaked TypeScript. These are legally sound: the codebase as a whole contains human-authored elements, and copyright in a collective work doesn’t require every line to be human-written.
Layer 2: The Python port. Developer @realsigridjin fed the leaked TypeScript into Codex and produced a Python rewrite with 75,000+ stars and 75,000+ forks. Anthropic has not filed a DMCA against it. The traditional clean-room defense fails here because the developer fed the original directly into the AI. But the output may not be copyrightable anyway: under Thaler, AI-generated output is public domain, so neither Anthropic nor the rewriter definitively “owns” it. Plagiarism Today coined the term “AI-washing copyright” for this pattern – using AI to circumvent license obligations while claiming legal compliance. Courts have not addressed it.
Layer 3: “Cleanroom as a Service”. A startup called Malus now offers automated clean-room rebuilds: one group of bots analyzes requirements and drafts specs, a second group generates code. They claim the result is “legally distinct code that you own outright”. Legal experts call this “completely untrue”. The AI-generated code is public domain. Nobody owns it, including the buyer.
## Anthropic has no good option
| Action | Consequence | Downstream effect |
|---|---|---|
| DMCA the Python port | Must argue the port reproduces copyrightable expression | Could establish that AI-written codebases have no copyright |
| Do nothing | Port stays up, competitors get a blueprint | Signals any codebase can be AI-translated with impunity |
| Open-source Claude Code | Neutralizes the leak | Concedes the harness isn’t the moat |
Every row leads to a precedent Anthropic probably doesn’t want. The middle path – doing nothing – is what they’ve chosen so far, but silence is itself a signal to the industry.
## The asymmetric risk you are carrying
The deeper problem, as paddo.dev frames it: companies using AI to write code face asymmetric risk. You carry full liability for infringement – if the AI reproduced copyrighted training data, you can be sued. But you gain no protection for the output – if someone copies what the AI wrote for you, you have no legal recourse.
GitHub’s own research shows ~1% of Copilot suggestions match training data verbatim. At enterprise scale, that is a significant exposure surface. Doe v. GitHub, now at the Ninth Circuit, alleges Copilot strips copyright notices from training data output – a violation of DMCA Section 1202(b) – and breaches open-source license contracts. Two claims survived dismissal. Bartz v. Anthropic (July 2025) ruled that using legally purchased content for LLM training constitutes fair use, but that helps on the training-data side, not the output-copyright side.
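That ~1% exposure surface is something you can estimate yourself. Below is a minimal sketch of the idea behind verbatim-match scanning: break source text into overlapping token windows ("shingles"), hash them, and count how many of your candidate file's shingles appear verbatim in a reference corpus. The `WINDOW` size, function names, and corpus-as-list-of-strings interface are illustrative assumptions, not any vendor's actual scanner, which typically matches on roughly 150-character snippets against the full training set.

```python
# Sketch: shingle-hash overlap as a rough proxy for verbatim reuse.
# All names and thresholds here are illustrative, not a real scanner's API.
import hashlib

WINDOW = 30  # tokens per shingle; real scanners use ~150-char thresholds


def shingles(text: str, window: int = WINDOW) -> set[str]:
    """Hash every overlapping window of `window` tokens in the text."""
    tokens = text.split()
    return {
        hashlib.sha256(" ".join(tokens[i:i + window]).encode()).hexdigest()
        for i in range(max(len(tokens) - window + 1, 1))
    }


def exposure(candidate: str, corpus: list[str]) -> float:
    """Fraction of the candidate's shingles found verbatim in the corpus."""
    cand = shingles(candidate)
    corpus_hashes: set[str] = set()
    for text in corpus:
        corpus_hashes |= shingles(text)
    return len(cand & corpus_hashes) / len(cand) if cand else 0.0
```

A score near 1.0 means the file is almost entirely reproduced elsewhere; near 0.0 means no verbatim windows matched. The point is not the exact number but that verbatim reuse is mechanically detectable, by you and by anyone who might sue you.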
## “But our terms of service say we own the output”
Terms of service can assign ownership, but they cannot create copyright in works the law deems uncopyrightable. A contract saying you own public-domain material has no legal force. Prompting alone does not create authorship – the USCO requires creative control over the expressive elements of the output, not just the input.
| Risk | Likelihood | Impact |
|---|---|---|
| Competitors copy your AI-generated code legally | High (if accepted verbatim) | High (no legal recourse) |
| License violation from AI-reproduced training data | Medium (~1% verbatim match rate) | High (DMCA exposure at scale) |
| Public AI-authorship claims undermine your IP | High (if leadership credits AI) | Catastrophic |
| AI-translated port of proprietary code appears | Low-Medium | High (no clear DMCA path) |
This table is the risk landscape for engineering leaders. The third row is the most overlooked: every public statement crediting AI for your codebase is potential evidence against your own copyright claims.
## What to do differently
The cost of AI-assisted velocity is a narrower IP moat. That trade-off may be worth it. It should be a conscious decision, not a surprise.
If your competitive advantage depends on proprietary code – not brand, not model weights, not network effects – prioritize human authorship on code that matters competitively. Use AI freely on commodity code where copying is irrelevant. Stop telling investors that “our engineers use AI for everything”.
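A policy of "human authorship on code that matters" is only defensible if you can document it. One approach is to record provenance in commit metadata and audit it per path. The `AI-Assisted: yes` trailer below is a hypothetical convention, not an established standard; the script is a sketch of how such an audit could work.

```python
# Sketch: score what fraction of commits touching a path declare AI
# assistance via a hypothetical "AI-Assisted: yes" commit-message trailer.
import subprocess

TRAILER = "AI-Assisted: yes"  # illustrative convention, not a standard


def ratio_from_messages(messages: list[str]) -> float:
    """Fraction of non-empty commit messages carrying the trailer."""
    msgs = [m for m in messages if m.strip()]
    if not msgs:
        return 0.0
    return sum(TRAILER in m for m in msgs) / len(msgs)


def ai_assisted_ratio(path: str = ".") -> float:
    """Pull commit messages for `path` from git history and score them."""
    log = subprocess.run(
        ["git", "log", "--format=%B%x00", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return ratio_from_messages(log.split("\x00"))
```

A high ratio on a competitively critical directory is a signal that your copyright position there rests on thin ground; the paper trail also cuts the other way, so decide deliberately what you want discoverable.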
What to watch:
- Doe v. GitHub at the Ninth Circuit, for DMCA liability on AI output
- Anthropic’s response to the Python port
- Whether Congress passes anything meaningful before the 2026 midterms
## Further reading
- USCO Part 2: Copyrightability Report (PDF)
- Cleanroom as a Service: AI-Washing Copyright (Plagiarism Today)
- All the Liability, None of the Protection (paddo.dev)
- Diving into Claude Code’s Source Leak (Engineer’s Codex)
- CLEAR Act (IPWatchdog)
- Doe v. GitHub Ninth Circuit Brief (CCIA)
- NeetCode: “Claude Code’s Entire Source Code Was Leaked” (YouTube)
- Supreme Court Denies Cert in Thaler (Baker Donelson)
- Supreme Court Declines AI Copyright Case (Morgan Lewis)
- The Final Word on AI Authorship (Holland & Knight)
- Claude Code Leak (VentureBeat)
- Claude Code Leak (The Register)