The 2nd order effect that not a lot of people talk about is price: the fact that model scaling at this pace also correlates with price is amazing.
I think this is just as important to distribution of AI as model intelligence is.
AFAIK there are no fundamental "laws" that prevent price from continuing to fall, at least correlated with Moore's law (or whatever the current AI/Nvidia chip development cycle is called right now)- each new generation of hardware is significantly faster/cheaper than the next- so will we see a ChatGPT-5 model at half the price in a year? (yes I know that thinking models cost more, but just on a per-token basis)
You are vastly underestimating the price decline. To cherrypick one article; in the first two years since GPT 3.5, inference price for the same amount of intelligence has decreased 10x per year according to a study by Andreessen Horowitz https://a16z.com/llmflation-llm-inference-cost/. So in a stark slowdown scenario, we could still see a 1000x decrease in the next 5 years.
Price deflation is not tied to Moore's right now because much of the performance gains are from model optimization, high bandwidth memory supply chains, and electrical capacity build out, not FLOP density.
True! I just know that model optimization gains are much less guaranteed than say, FLOP density, even though model optimization has so far provided way more gains than hardware advancements.
Part of me is optimistic that when the AI bubble bursts the excess data center capacity is going to be another force driving the cost of inference down.
> I just know that model optimization gains are much less guaranteed than say, FLOP density, even though model optimization has so far provided way more gains than hardware advancements.
Performance gained from model improvements has outpaced performance gained from hardware improvements for decades.
I think this is just as important to distribution of AI as model intelligence is.
AFAIK there are no fundamental "laws" that prevent price from continuing to fall, at least correlated with Moore's law (or whatever the current AI/Nvidia chip development cycle is called right now)- each new generation of hardware is significantly faster/cheaper than the next- so will we see a ChatGPT-5 model at half the price in a year? (yes I know that thinking models cost more, but just on a per-token basis)