This is an oft-repeated meme, but I’m convinced the people saying it are either ...

jazzyjackson · 2026-03-19T03:52:40 1773892360

If you have 10,000 people flipping coins over and over, one person will be experiencing a streak of heads, another a streak of tails.

Which is to say, of a million people who just started playing with LLMs, a bunch of people will get hit or miss, while one guy is winning the neural net lottery and has the experience of the AI nailing every request, some poor bloke is trying to see what all the hype is about and cannot get one response that isn’t fully hallucinated garbage

ericpauley · 2026-03-19T04:02:23 1773892943

Sure, but that doesn’t explain the volume of these complaints. I think the more likely answer is the pitiful sycophancy of some models as demonstrated in BSBench.

odo1242 · 2026-03-19T04:13:53 1773893633

Claude Opus 4.6 is the best possible model to use in this test, with the least sycophancy. OpenAI and Gemini models are bad in comparison.

mkozlows · 2026-03-19T04:38:32 1773895112

ChatGPT thinking models are very good; the instant model is bad. Gemini is always desperate to find an answer, and will give you one no matter what.

odo1242 · 2026-03-20T05:24:48 1773984288

Nope, I use GitHub Copilot (agentic mode) and I end up having to use the (more expensive) Claude model because ChatGPT never second-guesses me or even itself. Gemini is slightly worse though.

odo1242 · 2026-03-20T05:56:45 1773986205

For a less biased source, check out BSBench (where Claude dominates, and the highest rating GPT is 2x worse): https://petergpt.github.io/bullshit-benchmark/viewer/index.v...

ahofmann · 2026-03-19T07:07:20 1773904040

I have access to the ChatGPT account of my boss and it is unusable sycophancy slop, horrible to read because every information is buried under endless emojis and the like. And it is almost indistinguishable if the LLM is wrong or right, every answer looks the same, often with a "my final answer" at the end. It's a mess.

I'm using Claude Opus 4.6 and it is much calmer, or "professional" in tone and much more information and almost no fluff.

braebo · 2026-03-19T10:49:33 1773917373

Thank you for saying this.. ChatGPT is SO BAD. I suspect anyone that says OpenAI models are good are either lying or botting.

AlexeyBelov · 2026-03-21T07:02:54 1774076574

Of course. They are using it wrong, their prompts are bad and actually they should try the latest model. It's always the same.

Garlef · 2026-03-19T13:28:57 1773926937

> anthropomorphism

I think it's a topic worthy of discussion. But I would propably not leave it to Searle...

basilikum · 2026-03-19T03:02:27 1773889347

Can you share your system prompt?

reverius42 · 2026-03-19T03:52:20 1773892340

I'm seeing the described behavior with whatever the default system prompt is in Claude Code.