
If you knew that the model never changed, it might be very helpful, but most of the big providers constantly mess with their models.


Even if you used a local copy of a model, it would still just be a semi-quantitative version of “everyone knows ‹thing-you-don't-have-a-grounded-argument-for›”


Their performance also varies depending on load (concurrent users).


Dear god does it really? That’s very funny.


Why are you surprised? It’s a computational thing, after all.


It’s not that crazy; it’s just that the serving architecture you’d need to do that, with differently quantized models and so on, is impressive, all things considered.


The models are the same; it’s the surrounding processing, like the "thinking" iterations, that gets adjusted.
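
A minimal sketch of what that kind of adjustment could look like (the names, numbers, and the linear scaling below are my own assumptions for illustration, not any provider's actual logic): the weights stay fixed and only the per-request reasoning budget shrinks as load rises.

    # Hypothetical illustration: same model weights, but the "thinking" token
    # budget per request is scaled down as concurrent load grows.

    def thinking_budget(concurrent_users: int,
                        max_budget: int = 8192,
                        min_budget: int = 1024,
                        soft_limit: int = 500) -> int:
        """Return a reasoning-token budget that shrinks as load rises."""
        if concurrent_users <= soft_limit:
            return max_budget
        # Scale down roughly linearly past the soft limit, never below min_budget.
        overload = concurrent_users / soft_limit
        return max(min_budget, int(max_budget / overload))

    if __name__ == "__main__":
        for users in (100, 500, 1000, 4000):
            print(users, "users ->", thinking_budget(users), "thinking tokens")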


That only works for LRMs, no? Not for traditional LLM inference.



