
LLMs work by converting your question into a list of numbers and projecting that list, like a shadow, into a high-dimensional space which was constructed through training on other lists of numbers. Where the projection lands gives a new list of numbers, which is then translated back into words.

Because of the way the model (i.e. the projection surface) was constructed, the strings returned look plausible. However, you're still just seeing the number-back-to-language translation of a vector which was guessed by statistical inference.



Wow, I love this explanation. Are there any resources that explain LLMs in such an accessible and easy-to-understand way?


3Blue1Brown has a fantastic video on GPT that I'd highly recommend called "But what is a GPT? Visual intro to transformers" [1]. You'll need to have some background in mathematical reasoning, but not too much. He does a great job of drawing pictures to show the mathematical transformations that happen under the hood.

[1] https://www.youtube.com/watch?v=wjZofJX0v4M


There are videos on YouTube where you can watch token generation happen step by step, like an algorithm.

The process itself is quite simple. I think everybody is surprised by how well it works, because it's the power of statistics + great data.

Unfortunately, it can't reason at all, and it would need techniques from other AI fields to tackle even some simple puzzles.
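The "statistics + data" generation loop the comment describes can be sketched with the simplest possible statistical language model, a bigram model: count which word follows which in a corpus, then repeatedly sample the next token from those counts. The tiny corpus and the wrap-around at the end are my own illustrative assumptions; a real LLM conditions on the whole context with a neural network, not just the previous token.

```python
import random
from collections import defaultdict

# Toy corpus (hypothetical); wrap the last word back to the first so the
# chain never hits a token with no observed successor.
corpus = "the cat sat on the mat the cat ate the fish".split()

counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:] + corpus[:1]):
    counts[prev][nxt] += 1

def next_token(prev, rng):
    """Sample the next token in proportion to how often it followed `prev`."""
    candidates = counts[prev]
    tokens = list(candidates)
    weights = [candidates[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)
out = ["the"]
for _ in range(5):
    out.append(next_token(out[-1], rng))
print(" ".join(out))
```

Each step is just "look up a probability distribution, sample from it, append, repeat", which is the same loop an LLM runs, only with a vastly better-informed distribution at every step.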


I'm not too sure. When I first encountered LLMs, I decided to try implementing one from scratch. I didn't get far, because it's difficult, and I only have finite time and energy. Still, it was the process of trying to implement some basic stuff that gave me an idea of how they work under the hood.



