Hacker News

I don't think you're trying to claim ownership. It sounded like you were suggesting that the only recourse for OpenAI would be to fund a UBI program as a form of payment instead of directly paying the people who own the IP it ingested?


They ingested the entirety of the internet. Everything anyone has ever written that is online, including our (implicitly copyrighted) HN comments and letters written 400 years ago, was used to train GPT-4.


Yes, I'm saying that because there's (currently) no way to even tell how much the model was improved by my comments on HN vs. an equal number of tokens that came from e.g. nytimes.com. Furthermore, to the extent that the model is even capable of causing economic losses to IP holders, I think this necessarily requires it to be actually good and not just a bad mimic[0] or prone to whimsy[1], and that this economic damage will occur equally to all IP holders regardless of whether or not their IP was used in training. For both of these reasons independently, I currently think UBI is the only possible fair outcome.

[0] I find the phrase "stochastic parrot" to be ironic, as people repeat it mindlessly and with a distribution that could easily be described by a Markov model.

[1] If the model is asked to produce something in the style of the NYT, but everyone knows it may randomly insert a nonsense statement about President Trump's first visit to the Moon, that doesn't detract from the value of buying a copy of the newspaper.


So because it's "difficult" for ChatGPT to pay people for what it ingested we need to change our entire economic model to accommodate their inability (I'd argue it's the lack of will) to solve this problem?

Imagine a scenario where your employer decides that it's going to go plant trees to save the environment, a laudable goal, in lieu of your paycheck. Their excuse would be "it's too difficult to process payroll and easier to plant trees. Since planting trees is good for the environment by the transitive property it's good for the employee. Thus, they should be happy."


> So because it's "difficult" for ChatGPT to pay people for what it ingested we need to change our entire economic model to accommodate their inability (I'd argue it's the lack of will) to solve this problem?

It's not merely "difficult". So far as I know, the only way to do this even in theory is to retrain the model from scratch with certain subsets of the training data removed, and that turns out to be not just hard but impossible in practice, for reasons that may not be obvious until you see the explanation below.

Unfortunately, while you can do this for any single source, you absolutely cannot, owing to the computational limits of the universe, do this for all 2^(2*10^9) subsets of the people on the internet who produced some training data (even a mere 2^(765.5) is too large to *count* within the theoretical computational limits of the universe). Those subsets are necessary because inside the model all these sources interact with each other, so you also can't (accurately) just assume that whatever Alice's contribution was, it was independent of whatever Bob's contribution was. As a toy model: pretend Alice explained algebra and Bob explained geometry; there's a lot you can do with either one, but even more you can do with both.
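To make the combinatorics concrete, here is a minimal Python sketch (all the numbers and per-subset scores are hypothetical, and the Shapley value is used here as one standard formalization of "fair split when contributions interact", not anything OpenAI actually does):

```python
from itertools import combinations
from math import factorial

def num_subsets(n):
    """Exact per-source attribution needs one full training run per
    subset of sources: 2**n runs for n sources."""
    return 2 ** n

# Even 100 sources would already need more than 10**30 training runs;
# the comment above is talking about n on the order of 2 * 10**9 people.
assert num_subsets(100) > 10**30

# Toy two-source model: Alice (algebra) and Bob (geometry) are each
# worth something alone but strictly more together (hypothetical scores).
v = {frozenset(): 0,
     frozenset({"alice"}): 3,
     frozenset({"bob"}): 3,
     frozenset({"alice", "bob"}): 10}

def shapley(player, players, value):
    """Average marginal contribution of `player` over all coalitions."""
    n = len(players)
    others = [p for p in players if p != player]
    total = 0.0
    for k in range(len(others) + 1):
        for coalition in combinations(others, k):
            s = frozenset(coalition)
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += weight * (value[s | {player}] - value[s])
    return total

print(shapley("alice", ["alice", "bob"], v))  # 5.0: the 4-point synergy is split evenly
```

Note that the `shapley` loop itself iterates over every coalition of the other players, which is exactly why this is computable for 2 contributors and not for 2*10^9.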

As for "changing our entire economic model": if/when AI is good enough to make everyone permanently unemployable, what else would you suggest?

Even in the lead-in to that if/when conditional, I think we need something like UBI when there's an AI that's "only" making everyone of IQ≤85 permanently unemployable; similar economic changes, but with different potential solutions, are also needed for "only" truck drivers (as a category, I'm not making an IQ claim about them), or "only" the traditional handloom weavers[0], as in all cases the alternative is they choose between rioting and starving.

And in case you're wondering, I think the image generators may put artists in the second group (a specific role in society that is quite understandably upset about being suddenly automated out of a perfectly respectable career), while LLMs look like they might be the former (even if the question of exactly what this whole "IQ" thing means anyway is surprisingly difficult to answer).

> Imagine a scenario where your employer decides that it's going to go plant trees to save the environment, a laudable goal, in lieu of your paycheck. Their excuse would be "it's too difficult to process payroll and easier to plant trees. Since planting trees is good for the environment by the transitive property it's good for the employee. Thus, they should be happy."

Bad example. For one, as a software developer my job is to automate myself into redundancy. For another, I am saying that *all actual harm* from GenAI has to be balanced by someone else producing *the same economic output*. In the environment/trees example, that would have to be "my employer plants 2,400 trees per minute to make up for not paying anyone" (according to Wikipedia), or possibly "my employer cut down literally all the trees on the planet and has decided to make up for this by planting replacements, which means they no longer have any money to pay anyone". The latter would leave me very confused about my financial status, but I'd probably ask the government to step in to make sure they didn't go under before finishing putting the trees back.

[0] https://en.wikipedia.org/wiki/Power-loom_riots



