Hacker News
Ask HN: Why can't ChatGPT solve Wordle-type questions?
5 points by standeven on April 30, 2024 | 14 comments
My daughter likes to give me substitution cyphers to solve, and sometimes it’s just a single word. For example, “cormorant”, but substituted so it appears as “avtfvtpwz”. If I ask ChatGPT to list every 9-letter word with the second and fifth characters the same, the third and sixth characters the same, and all other characters unique to each other, it cannot get it right. It hallucinates and tells me all sorts of other words as fitting the criteria when they don’t, at all.

I can ask it to paraphrase the rules and it totally understands, it just can’t get close to the right answer. Same with other AI chat models that I’ve tried. Any idea why this seemingly simple question is a limitation?



LLMs work by converting your question into a list of numbers and projecting that list, like a shadow, into a high-dimensional space which was constructed through training on other lists of numbers. Where the projection lands gives a new list of numbers, which is then translated back into words.

Because of the way the model (i.e. the projection surface) was constructed, the strings returned look plausible. However, you're still just seeing the number-back-to-language translation of a vector which was guessed by statistical inference.
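The numbers-in, numbers-out idea above can be sketched in a few lines. This is a toy illustration only, not a real LLM: the two-dimensional "embedding table" and the words in it are invented for the example.

```python
# Toy illustration of the text -> numbers -> text round trip.
# The embedding table below is invented; real models learn
# thousands of dimensions per token during training.
EMBEDDINGS = {
    "bird":  [0.9, 0.1],
    "fish":  [0.1, 0.9],
    "plane": [0.8, 0.3],
}

def embed(word):
    """Word -> list of numbers (the 'projection' into the space)."""
    return EMBEDDINGS[word]

def nearest_word(vec):
    """Numbers -> word: decode by picking the closest known vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(EMBEDDINGS, key=lambda w: dist(EMBEDDINGS[w], vec))

# A vector that lands near "bird" decodes back to "bird". The model
# only ever manipulates numbers; words exist only at the boundary.
print(nearest_word([0.85, 0.15]))  # -> bird
```

The point of the sketch: everything between `embed` and `nearest_word` is arithmetic on vectors, which is why the output can be statistically plausible without being correct.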


Wow, I love this explanation. Are there any resources that explain LLMs in such an accessible and easy-to-understand way?


3Blue1Brown has a fantastic video on GPT that I'd highly recommend called "But what is a GPT? Visual intro to transformers" [1]. You'll need to have some background in mathematical reasoning, but not too much. He does a great job of drawing pictures to show the mathematical transformations that happen under the hood.

[1] https://www.youtube.com/watch?v=wjZofJX0v4M


There are videos on YouTube where you can watch how token generation happens, step by step, like an algorithm.

The process itself is quite simple. I think everybody is surprised how well it works because it's the power of statistics + great data

Unfortunately, it can't reason at all and would need other AI fields in order to tackle some other simple puzzles.


I'm not too sure. When I first encountered LLMs, I decided to try implementing one from scratch. I didn't get far, because it's difficult, and I only have finite time and energy. Still, it was the process of trying to implement some basic stuff that gave me an idea of how they work under the hood.


Tokenization. The tokens ChatGPT uses are longer than a single character. You're asking it to play the piano wearing oven mitts.
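To make the oven-mitts point concrete, here is a toy greedy longest-match tokenizer over an invented subword vocabulary. Real tokenizers (BPE) are more sophisticated, but the effect is the same: the model receives multi-character chunks, not letters.

```python
# Invented toy vocab; real vocabularies contain tens of thousands
# of learned subword pieces.
VOCAB = {"cor", "mor", "ant", "c", "o", "r", "m", "a", "n", "t"}
MAX_LEN = 3

def tokenize(text):
    """Greedy longest-match tokenization against VOCAB."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(MAX_LEN, 0, -1):  # prefer the longest piece
            piece = text[i:i + size]
            if piece in VOCAB:
                tokens.append(piece)
                i += size
                break
    return tokens

# "cormorant" arrives as three opaque chunks. A question about its
# 2nd and 5th *letters* asks about structure the model never sees.
print(tokenize("cormorant"))  # -> ['cor', 'mor', 'ant']
```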


It's because it generates token by token based on probabilities, and it has no reasoning capabilities. Some AI experts like to call it reasoning for benchmarking purposes, but it isn't reasoning the way we humans do it.

LLMs typically struggle with tasks about the words themselves, or with basic counting; some providers like OpenAI use hacks to keep them from failing in a miserable way.
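The token-by-token loop can be sketched like this. The probability table here is invented for illustration; in a real model those probabilities come from a neural network, not a lookup.

```python
import random

# Invented toy table: probability of the next token given the
# previous one. Real models condition on the whole context.
NEXT = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"<end>": 1.0},
    "dog": {"<end>": 1.0},
}

def generate(seed=0):
    """Sample one token at a time until an end marker appears."""
    random.seed(seed)
    out, tok = [], "<start>"
    while tok != "<end>":
        probs = NEXT[tok]
        tok = random.choices(list(probs), weights=list(probs.values()))[0]
        if tok != "<end>":
            out.append(tok)
    return " ".join(out)

print(generate())  # e.g. "the cat" -- plausible output, no reasoning
```

Each step is just a weighted draw; nothing in the loop checks whether the sentence is true, consistent, or satisfies a constraint like "letters 2 and 5 must match".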


LLMs encode words as numbers which don't have any relationship with the letters of the word.


You are not "asking" ChatGPT and there is no answer, and this is the basis for all the confusion.

You are transforming text using a text transformer. You have input text and output text.

You are asking why the output text is not what you expected. That is because this particular transformer has these particular weights.


This is the most technically correct answer - language models are at their core trying to continue a string of text. If you had trained the model on knowing exactly which letters represented the other letters in this particular puzzle, it might work.


What you are trying to solve is trivial if you have a suitable list of words. A solver would be easy to create and not need anything near as complex as AI. There are Scrabble solvers on the web which may work for you. Not everything is AI.
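A sketch of that non-AI solver: reduce each word to its substitution "pattern" (the index at which each letter first appears), then match the cipher's pattern against a word list. The three-word `WORDS` list below is a stand-in for a real dictionary file.

```python
def pattern(word):
    """Map a word to its letter-repetition pattern, e.g.
    'cormorant' -> (0, 1, 2, 3, 1, 2, 4, 5, 6)."""
    seen = {}
    return tuple(seen.setdefault(ch, len(seen)) for ch in word)

def solve(cipher, words):
    """Return every dictionary word whose pattern matches the cipher's."""
    target = pattern(cipher)
    return [w for w in words if pattern(w) == target]

WORDS = ["cormorant", "raspberry", "wonderful"]  # toy dictionary
print(solve("avtfvtpwz", WORDS))  # -> ['cormorant']
```

With a full word list this directly answers the original puzzle: a substitution cipher preserves exactly this pattern, so "avtfvtpwz" and "cormorant" reduce to the same tuple.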


But why? That's what OP wants to know.


That's like, the sky is blue so why isn't everything else blue? I give up.


There are a few answers/hypotheses already, and complete papers could be written to answer OP's question, based on which improvements to LLMs could be made to do well on this particular problem class.



