Hacker News
Ask HN: Why can't ChatGPT solve Wordle-type questions?
5 points by standeven on April 30, 2024 | 14 comments
My daughter likes to give me substitution cyphers to solve, and sometimes it’s just a single word. For example, “cormorant”, but substituted so it appears as “avtfvtpwz”. If I ask ChatGPT to list every 9-letter word with the second and fifth characters the same, the third and sixth characters the same, and all other characters unique to each other, it cannot get it right. It hallucinates and tells me all sorts of other words as fitting the criteria when they don’t, at all.

I can ask it to paraphrase the rules and it totally understands, it just can’t get close to the right answer. Same with other AI chat models that I’ve tried. Any idea why this seemingly simple question is a limitation?



LLMs work by converting your question into a list of numbers and projecting that list, like a shadow, into a high-dimensional space which was constructed through training on other lists of numbers. Where the projection lands gives a new list of numbers, which is then translated back into words.

Because of the way the model (i.e. the projection surface) was constructed, the strings returned look plausible. However, you're still just seeing the number-back-to-language translation of a vector which was guessed by statistical inference.
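The numbers-in, numbers-out idea above can be sketched in a few lines. This is a toy illustration only, not a real LLM: the two-dimensional "embedding table" and the words in it are invented for the example.

```python
# Toy illustration of the text -> numbers -> text round trip.
# The embedding table below is invented; real models learn
# thousands of dimensions per token during training.
EMBEDDINGS = {
    "bird":  [0.9, 0.1],
    "fish":  [0.1, 0.9],
    "plane": [0.8, 0.3],
}

def embed(word):
    """Word -> list of numbers (the 'projection' into the space)."""
    return EMBEDDINGS[word]

def nearest_word(vec):
    """Numbers -> word: decode by picking the closest known vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(EMBEDDINGS, key=lambda w: dist(EMBEDDINGS[w], vec))

# A vector that lands near "bird" decodes back to "bird". The model
# only ever manipulates numbers; words exist only at the boundary.
print(nearest_word([0.85, 0.15]))  # -> bird
```

The point of the sketch: everything between `embed` and `nearest_word` is arithmetic on vectors, which is why the output can be statistically plausible without being correct.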


Wow, I love this explanation. Are there any resources that explain LLMs in such an accessible and easy-to-understand way?


3Blue1Brown has a fantastic video on GPT that I'd highly recommend called "But what is a GPT? Visual intro to transformers" [1]. You'll need to have some background in mathematical reasoning, but not too much. He does a great job of drawing pictures to show the mathematical transformations that happen under the hood.

[1] https://www.youtube.com/watch?v=wjZofJX0v4M


There are videos on YouTube where you can watch how token generation happens, step by step, like an algorithm.

The process itself is quite simple. I think everybody is surprised how well it works because it's the power of statistics + great data

Unfortunately, it can't reason at all and would need other AI fields in order to tackle some other simple puzzles.


I'm not too sure. When I first encountered LLMs, I decided to try implementing one from scratch. I didn't get far, because it's difficult, and I only have finite time and energy. Still, it was the process of trying to implement some basic stuff that gave me an idea of how they work under the hood.


Tokenization. The tokens ChatGPT uses are longer than a single character. You're asking it to play the piano wearing oven mitts.
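To make the oven-mitts point concrete, here is a toy greedy longest-match tokenizer over an invented subword vocabulary. Real tokenizers (BPE) are more sophisticated, but the effect is the same: the model receives multi-character chunks, not letters.

```python
# Invented toy vocab; real vocabularies contain tens of thousands
# of learned subword pieces.
VOCAB = {"cor", "mor", "ant", "c", "o", "r", "m", "a", "n", "t"}
MAX_LEN = 3

def tokenize(text):
    """Greedy longest-match tokenization against VOCAB."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(MAX_LEN, 0, -1):  # prefer the longest piece
            piece = text[i:i + size]
            if piece in VOCAB:
                tokens.append(piece)
                i += size
                break
    return tokens

# "cormorant" arrives as three opaque chunks. A question about its
# 2nd and 5th *letters* asks about structure the model never sees.
print(tokenize("cormorant"))  # -> ['cor', 'mor', 'ant']
```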


It's because it generates token by token based on probabilities, and it has no reasoning capabilities. Some AI experts like to call it reasoning for benchmarking purposes, but it isn't reasoning the way we humans do it.

LLMs typically struggle with tasks about the words themselves, or with basic counting; some providers like OpenAI use hacks to keep them from failing in a miserable way.
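The token-by-token loop can be sketched like this. The probability table here is invented for illustration; in a real model those probabilities come from a neural network, not a lookup.

```python
import random

# Invented toy table: probability of the next token given the
# previous one. Real models condition on the whole context.
NEXT = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"<end>": 1.0},
    "dog": {"<end>": 1.0},
}

def generate(seed=0):
    """Sample one token at a time until an end marker appears."""
    random.seed(seed)
    out, tok = [], "<start>"
    while tok != "<end>":
        probs = NEXT[tok]
        tok = random.choices(list(probs), weights=list(probs.values()))[0]
        if tok != "<end>":
            out.append(tok)
    return " ".join(out)

print(generate())  # e.g. "the cat" -- plausible output, no reasoning
```

Each step is just a weighted draw; nothing in the loop checks whether the sentence is true, consistent, or satisfies a constraint like "letters 2 and 5 must match".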


LLMs encode words as numbers which don't have any relationship with the letters of the word.


You are not "asking" ChatGPT and there is no answer, and this is the basis for all the confusion.

You are transforming text using a text transformer. You have input text and output text.

You are asking why the output text is not what you expected. That is because this particular transformer has these particular weights.


This is the most technically correct answer - language models are at their core trying to continue a string of text. If you had trained the model on knowing exactly which letters represented the other letters in this particular puzzle, it might work.


What you are trying to solve is trivial if you have a suitable list of words. A solver would be easy to create and not need anything near as complex as AI. There are Scrabble solvers on the web which may work for you. Not everything is AI.
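A sketch of that non-AI solver: reduce each word to its substitution "pattern" (the index at which each letter first appears), then match the cipher's pattern against a word list. The three-word `WORDS` list below is a stand-in for a real dictionary file.

```python
def pattern(word):
    """Map a word to its letter-repetition pattern, e.g.
    'cormorant' -> (0, 1, 2, 3, 1, 2, 4, 5, 6)."""
    seen = {}
    return tuple(seen.setdefault(ch, len(seen)) for ch in word)

def solve(cipher, words):
    """Return every dictionary word whose pattern matches the cipher's."""
    target = pattern(cipher)
    return [w for w in words if pattern(w) == target]

WORDS = ["cormorant", "raspberry", "wonderful"]  # toy dictionary
print(solve("avtfvtpwz", WORDS))  # -> ['cormorant']
```

With a full word list this directly answers the original puzzle: a substitution cipher preserves exactly this pattern, so "avtfvtpwz" and "cormorant" reduce to the same tuple.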


But why? That's what OP wants to know.


That's like, the sky is blue so why isn't everything else blue? I give up.


There are a few answers/hypotheses already, and complete papers could be written to answer OP's question, based on which improvements to LLMs could be made to do well on this particular problem class.



