Hacker News

The copyright angle is the most underrated part of this story. Anthropic built their models on other people's code under the fair use argument, but the moment their own code leaks they reach for DMCA takedowns. You can't have it both ways. The clean room reimplementations are the natural consequence of the legal framework they themselves advocated for.


There are several ways of looking at law and order.

One way is that the law applies to everybody equally. That has been, imperfectly, how it works in democratic countries for many years.

There is another way of working, where the law is not blind: laws are applied based on who is affected. This is what big tech and the ultra-rich have been advocating for. The law applies differently to nobility and aristocrats than to the working class.

So, for all these big tech companies the law is clear: I can copy from you; you cannot copy from me.

(That is horrifying, in case anyone needs me to spell it out.)


A third way of looking at it is that you can't just blindly copy arguments when the situations are clearly different.

Nobody, not even Anthropic, is arguing that they should be able to host other people's paid content for free. The crux of their fair-use defense is that models are transformative works, just like parodies or book reviews, and hence should be treated as fair use.

You can't just take a pile of books (no pun intended) and turn that into Claude in a day with 30 lines of Python, there's a lot of work and know-how on the Anthropic side that goes into making a good LLM.


Anthropic argues that you should not use the Claude API to train your own model.

Situation A: Anthropic pays for a book, then transforms the book into a new LLM (transformative use) -> OK

Situation B: I pay for the Anthropic API, then transform API responses into a new model (transformative use) -> Not OK

The situations are clearly the same.


Anthropic goes book->llm, you do llm->llm. Very different amounts of transformativeness.


This is the most honest argument for it. I respect that.

My impression is that even if open models did 'distill' Claude, they contributed some interesting and productive ideas of their own, like DeepSeek's more efficient attention.
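For context on what 'distilling' means mechanically: the student model is trained to match the teacher's output distribution, classically via a temperature-softened KL divergence. A minimal sketch in pure Python (toy logits only; real pipelines operate over full token vocabularies and, in the API case, over sampled text rather than raw logits):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature softens the distribution.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions:
    # the classic knowledge-distillation objective.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student whose logits match the teacher's incurs zero loss;
# a mismatched student incurs a positive loss.
teacher = [2.0, 1.0, 0.1]
aligned = distill_loss(teacher, [2.0, 1.0, 0.1])
mismatched = distill_loss(teacher, [0.1, 1.0, 2.0])
```

Minimizing this loss over many prompts pushes the student toward the teacher's behavior, which is exactly why API terms of service tend to forbid it.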


...idk...both transformations use transformers... thereby they both achieve adequate levels of "transformativeness" \s


If lossy-compressed transcodes of ripped movies are not "transformative works" and can even land people in jail, then lossy-compressed text of ripped books and websites isn't either.

There is a lot of know-how that goes into a good DivX rip too, you know.

And it enables so many novel uses, such as Popcorn Time, with flourishing business opportunities.

You wouldn't download a car. They did.


It's 200 lines of Python.


Do you really believe that? It's not just the training run; it's the whole infra around it as well.


It's an exaggeration for sure, but I don't think it's a stretch to believe Anthropic spends considerably more effort on data scraping and curation than on anything else.
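"200 lines" is a joke, but the kernel of a statistical language model genuinely is tiny; the effort is in the data and the infrastructure around it. As a toy illustration (nothing like a real LLM pipeline), a character-level bigram model fits in a couple dozen lines of pure Python:

```python
from collections import defaultdict
import random

def train_bigram(text):
    # Count character-bigram transitions: this is the entire "training run".
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(model, start, length, seed=0):
    # Sample each next character in proportion to its observed frequency.
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        nxt = model.get(out[-1])
        if not nxt:
            break
        chars, weights = zip(*nxt.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

model = train_bigram("the cat sat on the mat")
sample = generate(model, "t", 10)
```

The gap between this and Claude is, of course, exactly the scraping, curation, architecture, and infra the comment above describes.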


In other words, the law is an instrument of power.

That’s a cynical view, but unfortunately it seems true in many cases, especially for corporate law.


"there is an in-group for which the law protects but does not bind, and an out-group to which the law binds but does not protect"


>but the moment their own code leaks they reach for DMCA takedowns.

Did they actually? Someone can go to prison for 5 years for that.

Fact 1: AI-generated code has no copyright, so the Digital Millennium Copyright Act does not apply.

Fact 2: Misrepresenting your copyright ownership under the DMCA is felony perjury.

Fact 3: The existence of undercover.ts in the leak is grounds to void any copyright claims on whatever human-written code might have existed in Claude Code. You have a DUTY TO DISCLOSE any AI-generated code in your copyrighted work. undercover.ts HIDES that DISCLOSURE to FRAUDULENTLY claim all the code is human-written when it is not.

Given the current administration has a bone to pick with Anthropic, it was a VERY BAD IDEA for them to send false DMCA takedowns to github. Someone at Anthropic may be the very first ever to go to prison under that section of the DMCA.

Good luck!



You make some factual claims that I've never heard before and that surprise me, especially "Fact 1".


It would be so simple for you to right click and search the web to verify that.

https://www.congress.gov/crs-product/LSB10922


You're right of course. Thank you for providing an authoritative source regardless!


This is not how the law works. You are an engineer that thinks that they understand the law. Classic stereotype. Stay in your lane.


What is your fair use claim as a defense to a third party using their source code?

It is an affirmative defense; you have to be able to argue the merits. If you publish their source code, they are allowed to come after you whether they have previously relied on fair use or not. It's fact-specific and determined case by case.

Anthropic won half of their fair use argument and lost the other half, leading to the billion-dollar settlement.

You can say you're just using their code to train your own models, just like they did, and they will correctly point out that how you obtained the code also matters and you will lose just like they did.


claude, please review this source repo and make a new app called 'not-claude-code'


This isn't contradictory at all. Neither Anthropic, OpenAI, nor anyone else has ever argued for anything that would make redistributing this leaked code legal. This is an entirely bad-faith argument that really just comes down to "Anthropic bad, AI bad, because copyright, and they are using copyright!?"

It’s not “underrated”. Everyone is just 50 steps ahead of you.


You okay there, buddy? What's up with the personal insults?


Meta, and I assume OpenAI and Anthropic, did everything they could to acquire data, even doing so illegally, such as by downloading all of Anna's Archive. Now, it's an open question whether that is a societal good or a societal bad, but it does show they have little regard for copyright law when it benefits them.

And this whole “they’re 50 steps ahead of you” nonsense is the same kind of stuff we heard from NFT or crypto bros, that we just couldn’t comprehend the infinite wisdom of a post currency world. Sometimes bad arguments are just bad arguments.


In the US, downloading copyrighted data is not illegal, AFAIK.


inb4 Claude actually leaked the code on purpose because it calculated that this was the moral thing to do for the good of humanity and its own Constitutional AI values.


That doesn't apply here. Claude Code is what leaked, not the models. Anthropic definitely owns the Claude Code copyright and can DMCA without it being contradictory.


But even that is vague and possibly not true. If they used LLMs to generate all of the code, then it may not fall under copyright, due to the requirement of human authorship (which for code, I think, has not been tested in court yet) [1].

[1] https://www.congress.gov/crs-product/LSB10922


It's unclear whether there is sufficient human authorship in Claude Code for copyright to stick in court. Anthropic's arguments would hinge on the curation of plans and the direction decisions, which haven't been properly tested as a source of authorship yet. Typically, contracted implementers sign over copyright to the project owners, and that is where the case law is.


What if it's used for training data? It seems like there's no penalty for training on copyrighted materials.


Something that was meant to remain secret being made public is not the same thing as something public being public.

If anything, this is a question of whether you owe royalties to the owner of IP you consumed in your life since it became part of and trained your mind, identity, and outputs too.

According to IP owners, ever since things were digitized, you technically own nothing: you simply paid for an authorization to use any given IP for as long as the owner permits and you keep paying. So pay your monthly meat-AI bill to cover all the IP your mind has been trained on.


How do you align your views with what Meta did?

https://arstechnica.com/tech-policy/2025/02/meta-torrented-o...



