Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> Rather OpenAI stole from individual artists and YouTube and other social media

"stole"?

They consumed publicly available material on the Internet

I am no fan of these billionaire capitalists and their henchpersons but condem them for their multitude of sins.

Consuming publicly available Internet resources is not one of them. IMO



Being publicly available does not mean that copyright is invalid. Copyright gives the holders the right to restrict USE, not merely restrict reproduction. Adaptation is also an exclusive right of the copyright holder. You're not allowed to make derivative works.


> They consumed publicly available material on the Internet

I agree that there are some important distinctions and word-choices to be made here, and that there are problems with equating training to "stealing", and that copyright infringement is not theft, etc.

That said, if you zoom out to the overall conduct, it's fair to argue that the companies are doing something unethical, the same as if they paid an army of humans to memorize other people's work and then regurgitate slightly-reworded copies.


> That said, if you zoom out to the overall conduct, it's fair to argue that the companies are doing something unethical, the same as if they paid an army of humans to memorize other people's work and then regurgitate slightly-reworded copies.

I would use the analogy of those humans learning from the material. Like reading books in the library

"regurgitate slightly-reworded copies" in my experience using LLMs (not insubstantial) that is an unfairly pejorative take on what they do


It's not that they consumed publicly available material, it's that they re-published that information, and sold it.


They stole the data just as much as a painter steals the view.


Who created the view?


The view is created by every spectator.


By that logic a copy of source code for a propriatary app that someone has stolen and placed online is immediately free for all to use as they wish.

Being on the internet doesnt make it yours, or acceptable to take. In the case of OpenAI (and Anthropic) they should be following the long held principle of the robots.txt file on sites, which can be specifically set to tell just them that they may not take your content - they openly ignore that request.

OpenAI absolutely is stealing from everyone, hence why most will have little sympathy when they complain someone stole from them.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: