
> Large language models have taken the public attention by storm – no pun intended.

As far as I can tell, no pun made, either!



To make a pun in this domain, attention is all you need!


"Attention" is the would be pun, I think.


The pun is in "attention," because GPT uses attention to weigh each input token: it computes an attention score between whatever token is currently being generated and all the input tokens, then uses those scores to determine how much weight each contributes to the output. Something along those lines... I'm no GPT expert.
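
Roughly, that scoring-and-weighting step looks like this (a minimal NumPy sketch of scaled dot-product attention, as in the "Attention Is All You Need" paper; the names and shapes here are illustrative, not GPT's actual code):

    import numpy as np

    def attention(q, K, V):
        # q: query vector for the token being generated, shape (d,)
        # K, V: key/value matrices for the n input tokens, shape (n, d)
        scores = K @ q / np.sqrt(q.shape[0])  # one attention score per input token
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()              # softmax: weights sum to 1
        return weights @ V                    # each token contributes by its weight

    rng = np.random.default_rng(0)
    out = attention(rng.normal(size=8), rng.normal(size=(5, 8)), rng.normal(size=(5, 8)))
    print(out.shape)  # (8,)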



