I don't think they are the most efficient for small GPUs. I think they might just be the only ones that have the capex and certainty required for a multimillion-dollar purchase of a GB200 NVL72 or something of that scale.
I have many diverse friend groups, and HN is a lot more anti-AI than even the worst of the non-technical groups I am in. E.g. [1] is so detached from reality, and it stayed on the front page for multiple days recently. Or [2].
I haven't seen a positive story on the front page for a long time (I am not talking about things like a current model, just about how positive AI could be, like the Sam Altman post). Feel free to disprove me.
I would also suggest that if you believe your two links are "detached from reality", you may be struggling with your own bias. They're both perfectly reasonable takes that you are totally reasonable to disagree with. Try to come down from that feeling and realize that disagreeing does not necessitate that the person you are disagreeing with is insane.
Why do these kinds of articles get onto the HN front page every day? There is nothing substantive here, and these are the kind of empty articles that anti-AI folks should be against.
No one is forcing or pressuring you not to call a friend to ask for a recipe. Even AI would say that you should talk to your friends.
> It is almost guaranteed that a 60-90B model can outperform current SOTA in coding tasks within 2-3 years.
I am ready to bet against this. Scores on knowledge benchmarks like SimpleQA aren't increasing for small models.
> It is far less clear that a 1.2T model will be meaningfully better enough to justify training it.
Well, for one, we know for certain that there is Mythos, which is meaningfully better. And I think there is a lot of juice left to squeeze from Mythos-class models.
Knowledge benchmarks can't really be improved upon via distillation or RL. Improving them requires the facts to be added to the training corpus and the model to memorize them better. Neither distillation nor RL really does that, and thus we shouldn't expect improvements on SimpleQA unless some other interventions are being made.
Model intelligence and knowledge aren't necessarily directly related. If we can pack greater intelligence and agency at the cost of it forgetting factoids, that would actually be a good thing. We don't need LLMs to memorize facts, we need them to learn how to interact with the world such that they can find the facts that are necessary and surface them to the user.
If we could distill all of the knowledge out of an LLM and just be left with a very agentic model that only knows the facts in its context, I think some very interesting stuff would happen.
A lot of things aren't facts that can be stated. No one can just read a dictionary or a list of word translations and start speaking that language.
There isn't a clear definition of what is knowledge and what is intelligence. Is being able to write in C knowledge? Is knowing its undefined behaviour knowledge?
My point is that if I made someone "smarter" they wouldn't suddenly know "What day, month, and year was Carrie Underwood’s album “CryPretty” certified Gold by the RIAA?" which is an example of a question in the SimpleQA benchmark.
So (in my opinion) knowledge benchmarks stagnating for small models is not evidence that small model agentic coding performance will stop improving soon. Small models do not struggle with syntax; the barrier is not knowledge. The barrier is long-context coherence and problem solving, and I don't see a bottleneck on improving those for small models in the near term, as we get more and more high-quality reasoning traces to train on.
> What do you mean by 3-4 orders of magnitude better? Was Einstein 3-4 order of magnitude better than us?
I'm talking about output quality compared to parameter size.
Mythos is not 4 orders of magnitude larger than Opus - it's quite possible (likely, even) that no LLM ever reaches that size - and its output is only barely better...
> Gartner forecasts that large AI companies would need to earn cumulatively close to $7 trillion in AI-driven revenue through 2029, which is close to $2 trillion per year by the end of the period. In order to achieve “historic returns,” the providers would need to earn nearly $8.2 trillion in the same period.
The numbers are made-up political correctness anyway.
Everyone's agency is 100% captured by belief in Wall Street. Too few people under 50 have any meaningful labor skills to blink.
We'll continue to have consent manufactured via media platforms and in 3 years no one will bat an eye at these companies being worth $12 trillion as Altman and Musk climb two ladders holding a "mission accomplished" banner.
Did you even try to verify your claims? I tested it on a few translated Wikipedia articles using [1], and Norwegian takes 15-20% more tokens.
English performs the best because there is more data in English and high quality sources are either only in English or there is a good translation in English.
In tests I've done with NO and FI texts, for the same number of characters, I get around 2x as many tokens as for EN with the GPT5 tokenizer. With the older tokenizers it's more like 2x or even 3x.
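For anyone who wants to check numbers like these themselves, here is a rough Python sketch of the comparison: tokens per character for each text, then the ratio between the two. Everything named here (`tokens_per_char`, `token_overhead`, `utf8_byte_tokenizer`) is hypothetical and made up for illustration. The byte-level stand-in is NOT a real LLM tokenizer, so it will not reproduce the figures quoted in this thread; it only shows the shape of the measurement.

```python
from typing import Callable, Sequence

# Any callable that turns a string into a sequence of tokens.
Tokenizer = Callable[[str], Sequence]


def tokens_per_char(tokenize: Tokenizer, text: str) -> float:
    """Number of tokens divided by number of characters (code points)."""
    if not text:
        raise ValueError("text must be non-empty")
    return len(tokenize(text)) / len(text)


def token_overhead(tokenize: Tokenizer, baseline_text: str, other_text: str) -> float:
    """Ratio of tokens-per-character: other_text relative to baseline_text.

    A value of 1.2 means other_text needs 20% more tokens per character.
    """
    return tokens_per_char(tokenize, other_text) / tokens_per_char(tokenize, baseline_text)


def utf8_byte_tokenizer(text: str) -> list:
    """Stand-in tokenizer (assumption): one token per UTF-8 byte.

    Real BPE tokenizers merge frequent byte sequences, so their counts
    will be much lower and will depend on the training data.
    """
    return list(text.encode("utf-8"))


if __name__ == "__main__":
    english = "The weather is nice today."
    norwegian = "V\u00e6ret er fint i dag."  # roughly the same sentence in Norwegian
    ratio = token_overhead(utf8_byte_tokenizer, english, norwegian)
    print(f"tokens per character, NO relative to EN: {ratio:.2f}")
```

To measure a real tokenizer, pass its encode function as `tokenize`. If the third-party `tiktoken` package is installed, something like `tiktoken.get_encoding("o200k_base").encode` should work; whether that matches the tokenizer used in the tests above is an assumption. Short sample sentences are noisy, so longer parallel texts give a more trustworthy ratio.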
> anyone can host and get almost exact same power.
This is not true at all.
And the claim that the mission of the church and the mission of AI are the same is absurd. So is the idea of AI being an authority. The rest of that comment does not apply to AI at all.