
What makes you say that? There are constant improvements in how they're being trained and what they're being trained with; there really isn't any particular reason to believe we're at a maximum. Especially with multimodality being introduced!


My understanding is that they have essentially been trained on everything (meaning the whole internet), so there is not much left except niche sources adding incremental benefit. Granted, I can imagine the data being used more effectively for training, but I doubt that would produce a step change in capabilities - my suspicion is that the techniques, like the data, have reached a maximum or close to it.


There's still plenty of data out there, including in other languages and undigitised books - and that's before you get to data in other modalities, like speech and videos. Synthetic data can also be used quite effectively if you're trying to distill a model rather than grow its capabilities, as Phi-1.5 demonstrates.
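For reference, classic distillation (in the Hinton et al. sense) trains a small student to match a larger teacher's softened output distribution; Phi-1.5 itself took the related route of training on teacher-generated synthetic text rather than matching logits. A minimal PyTorch sketch of the logit-matching variant, with all names and shapes illustrative:

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both distributions with temperature > 1 so the student
        # learns from the teacher's full distribution, not just its argmax.
        student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
        teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
        # F.kl_div expects log-probs as input and probs as target;
        # the T^2 factor keeps gradient scale comparable across temperatures.
        return F.kl_div(student_log_probs, teacher_probs,
                        reduction="batchmean") * temperature ** 2

    # Toy usage: a batch of 4 token positions over a 50k-token vocabulary.
    student_logits = torch.randn(4, 50_000, requires_grad=True)
    teacher_logits = torch.randn(4, 50_000)
    loss = distillation_loss(student_logits, teacher_logits)
    loss.backward()

In practice this term is usually mixed with a standard cross-entropy loss on the ground-truth tokens.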

For capability growth, well, we don't know what we don't know. There are still many unknowns when it comes to architecture, training, data, modalities, incremental learning, alignment, self-critique, and more. There are plenty of companies and governments trying to find their angle here.

Even if we're at the very peak of what LLMs are capable of -- which seems unlikely -- there's still potentially decades of research in making what we have more effective.



