# Your AI-Written Codebase Is a Gift to Your Competitors
Companies shipping AI-generated code have made a trade they don’t fully understand. They carry full liability if the AI copied someone else’s work, but they gain zero IP protection for what it produced. Under settled US law, code written entirely by AI belongs to no one. It enters the public domain the moment it is generated. Every company publicly crediting AI for their codebase is handing competitors a legal argument to copy it.
Key terms used throughout this post:
- DMCA (Digital Millennium Copyright Act) – the US law that lets copyright holders send takedown notices to platforms hosting infringing content. A DMCA takedown is how companies force platforms like GitHub to remove code that violates their copyright.
- USCO (US Copyright Office) – the federal body that registers copyrightable works and issues guidance on what qualifies for protection.
- Clean-room defense – a legal strategy where one team reads the original code and writes a spec, and a separate unexposed team builds the new implementation from that spec alone.
This is not theoretical. On March 31, 2026, Anthropic accidentally leaked the entire source code of Claude Code – 512,000 lines of TypeScript. Within hours, a developer fed it into OpenAI’s Codex and produced a Python rewrite. That rewrite has 75,000+ GitHub stars. Anthropic has not taken it down. The reason they haven’t is the reason you should care about this.
## The law is settled
The USCO requires human authorship for copyright protection. The courts agree at every level:
```mermaid
timeline
    title AI Copyright Rulings — US
    Aug 2023 : Thaler v. Perlmutter — "Human authorship is a bedrock requirement"
    Jan 2025 : USCO Part 2 Report — Detailed prompting alone does not qualify
    Mar 2025 : D.C. Circuit affirms Thaler — AI is a tool, conception is human
    Jul 2025 : Bartz v. Anthropic — AI training on purchased content is fair use
    Mar 2 2026 : Supreme Court denies cert in Thaler — final word, no appeals left
    Mar 31 2026 : Claude Code leak — the Python port tests every open question
```
The authorship question closed steadily over three years, culminating in the Supreme Court’s refusal to hear the case. There is no remaining avenue for appeal.
The USCO has registered “hundreds” of AI-assisted works, but only where humans exercised “ultimate creative control” over expressive elements. Where you fall on the spectrum:
| Scenario | Copyrightable? | Why |
|---|---|---|
| Accept AI suggestion verbatim | No | No human expression in the output |
| Prompt AI, select from outputs, arrange | Maybe | Depends on whether selection rises to creative control |
| Use AI to draft, then substantially rewrite | Likely yes | Human expression exists in the modifications |
| Human writes code, AI autocompletes boilerplate | Yes | Human is the author; AI is the tool |
The table answers one question: where does tool use end and delegation begin? The line is fact-specific and largely untested in court. Most enterprise AI usage falls in the first two rows.
## The Claude Code leak proves the problem
The leak created three layers of legal mess, each harder than the last.
Layer 1: Direct mirrors. Anthropic filed DMCA takedowns against repos hosting the raw leaked TypeScript. These are legally sound: the codebase as a whole contains human-authored elements, and copyright in a collective work doesn’t require every line to be human-written.
Layer 2: The Python port. Developer @realsigridjin fed the leaked TypeScript into Codex and produced a Python rewrite with 75,000+ stars and 75,000+ forks. Anthropic has not filed a DMCA against it. The traditional clean-room defense fails here because the developer fed the original directly into the AI. But the output may not be copyrightable anyway: under Thaler, AI-generated output is public domain, so neither Anthropic nor the rewriter definitively “owns” it. Plagiarism Today coined the term “AI-washing copyright” for this pattern – using AI to circumvent license obligations while claiming legal compliance. Courts have not addressed it.
Layer 3: “Cleanroom as a Service”. A startup called Malus now offers automated clean-room rebuilds: one group of bots analyzes requirements and drafts specs, a second group generates code. They claim the result is “legally distinct code that you own outright”. Legal experts call this “completely untrue”. The AI-generated code is public domain. Nobody owns it, including the buyer.
## Anthropic has no good option
| Action | Consequence | Downstream effect |
|---|---|---|
| DMCA the Python port | Must argue the port reproduces copyrightable expression | Could establish that AI-written codebases have no copyright |
| Do nothing | Port stays up, competitors get a blueprint | Signals any codebase can be AI-translated with impunity |
| Open-source Claude Code | Neutralizes the leak | Concedes the harness isn’t the moat |
Every row leads to a precedent Anthropic probably doesn’t want. The middle path – doing nothing – is what they’ve chosen so far, but silence is itself a signal to the industry.
## The asymmetric risk you are carrying
The deeper problem, as paddo.dev frames it: companies using AI to write code face asymmetric risk. You carry full liability for infringement – if the AI reproduced copyrighted training data, you can be sued. But you gain no protection for the output – if someone copies what the AI wrote for you, you have no legal recourse.
GitHub’s own research shows ~1% of Copilot suggestions match training data verbatim. At enterprise scale, that is a significant exposure surface. Doe v. GitHub, now at the Ninth Circuit, alleges Copilot strips copyright notices from training data output – a violation of DMCA Section 1202(b) – and breaches open-source license contracts. Two claims survived dismissal. Bartz v. Anthropic (July 2025) ruled that using legally purchased content for LLM training constitutes fair use, but that helps on the training-data side, not the output-copyright side.
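That ~1% exposure surface is something you can estimate yourself. Below is a minimal sketch of the idea behind verbatim-match scanning: break source text into overlapping token windows ("shingles"), hash them, and count how many of your candidate file's shingles appear verbatim in a reference corpus. The `WINDOW` size, function names, and corpus-as-list-of-strings interface are illustrative assumptions, not any vendor's actual scanner, which typically matches on roughly 150-character snippets against the full training set.

```python
# Sketch: shingle-hash overlap as a rough proxy for verbatim reuse.
# All names and thresholds here are illustrative, not a real scanner's API.
import hashlib

WINDOW = 30  # tokens per shingle; real scanners use ~150-char thresholds


def shingles(text: str, window: int = WINDOW) -> set[str]:
    """Hash every overlapping window of `window` tokens in the text."""
    tokens = text.split()
    return {
        hashlib.sha256(" ".join(tokens[i:i + window]).encode()).hexdigest()
        for i in range(max(len(tokens) - window + 1, 1))
    }


def exposure(candidate: str, corpus: list[str]) -> float:
    """Fraction of the candidate's shingles found verbatim in the corpus."""
    cand = shingles(candidate)
    corpus_hashes: set[str] = set()
    for text in corpus:
        corpus_hashes |= shingles(text)
    return len(cand & corpus_hashes) / len(cand) if cand else 0.0
```

A score near 1.0 means the file is almost entirely reproduced elsewhere; near 0.0 means no verbatim windows matched. The point is not the exact number but that verbatim reuse is mechanically detectable, by you and by anyone who might sue you.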
## “But our terms of service say we own the output”
Terms of service can assign ownership, but they cannot create copyright in works the law deems uncopyrightable. A contract saying you own public-domain material has no legal force. Prompting alone does not create authorship – the USCO requires creative control over the expressive elements of the output, not just the input.
| Risk | Likelihood | Impact |
|---|---|---|
| Competitors copy your AI-generated code legally | High (if accepted verbatim) | High (no legal recourse) |
| License violation from AI-reproduced training data | Medium (~1% verbatim match rate) | High (DMCA exposure at scale) |
| Public AI-authorship claims undermine your IP | High (if leadership credits AI) | Catastrophic |
| AI-translated port of proprietary code appears | Low-Medium | High (no clear DMCA path) |
This table is the risk landscape for engineering leaders. The third row is the most overlooked: every public statement crediting AI for your codebase is potential evidence against your own copyright claims.
## What to do differently
The cost of AI-assisted velocity is a narrower IP moat. That trade-off may be worth it. It should be a conscious decision, not a surprise.
If your competitive advantage depends on proprietary code – not brand, not model weights, not network effects – prioritize human authorship on code that matters competitively. Use AI freely on commodity code where copying is irrelevant. Stop telling investors that “our engineers use AI for everything”.
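A policy of "human authorship on code that matters" is only defensible if you can document it. One approach is to record provenance in commit metadata and audit it per path. The `AI-Assisted: yes` trailer below is a hypothetical convention, not an established standard; the script is a sketch of how such an audit could work.

```python
# Sketch: score what fraction of commits touching a path declare AI
# assistance via a hypothetical "AI-Assisted: yes" commit-message trailer.
import subprocess

TRAILER = "AI-Assisted: yes"  # illustrative convention, not a standard


def ratio_from_messages(messages: list[str]) -> float:
    """Fraction of non-empty commit messages carrying the trailer."""
    msgs = [m for m in messages if m.strip()]
    if not msgs:
        return 0.0
    return sum(TRAILER in m for m in msgs) / len(msgs)


def ai_assisted_ratio(path: str = ".") -> float:
    """Pull commit messages for `path` from git history and score them."""
    log = subprocess.run(
        ["git", "log", "--format=%B%x00", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return ratio_from_messages(log.split("\x00"))
```

A high ratio on a competitively critical directory is a signal that your copyright position there rests on thin ground; the paper trail also cuts the other way, so decide deliberately what you want discoverable.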
What to watch:
- Doe v. GitHub at the Ninth Circuit, for DMCA liability on AI output
- Anthropic’s response to the Python port
- Whether Congress passes anything meaningful before the 2026 midterms
## Further reading
- USCO Part 2: Copyrightability Report (PDF)
- Cleanroom as a Service: AI-Washing Copyright (Plagiarism Today)
- All the Liability, None of the Protection (paddo.dev)
- Diving into Claude Code’s Source Leak (Engineer’s Codex)
- CLEAR Act (IPWatchdog)
- Doe v. GitHub Ninth Circuit Brief (CCIA)
- NeetCode: “Claude Code’s Entire Source Code Was Leaked” (YouTube)
- Supreme Court Denies Cert in Thaler (Baker Donelson)
- Supreme Court Declines AI Copyright Case (Morgan Lewis)
- The Final Word on AI Authorship (Holland & Knight)
- Claude Code Leak (VentureBeat)
- Claude Code Leak (The Register)