
Technically, the garbage-in/garbage-out problem is not being hand-waved away. I've seen a lot of articles on this; it's sometimes called a degrading feedback loop. The more of the web that is LLM-generated, the more new models will be trained on generated data, and they will fuzz out. Or 'drift'.

For a specific example (sorry, I didn't grab screenshots at the time): it had to do with updating a dataframe in pandas. It gave me a solution that generated an error. I'd keep asking it to change steps to fix the previous errors, and it would go in a circle: fix the error but generate other warnings, make further changes to eliminate the warnings, and eventually recommend the same thing that originally caused the error.
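For what it's worth, the classic pandas trap that produces exactly this fix-one-thing-break-another loop is chained assignment. The original code wasn't shared, so this is a hypothetical reconstruction of the pattern, but it usually looks something like:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Chained assignment: this writes to what may be a copy, so pandas
# emits SettingWithCopyWarning (and under copy-on-write semantics the
# write silently doesn't reach df at all).
sub = df[df["a"] > 1]
sub["b"] = 0  # warning; df itself may or may not change

# The fix a model eventually circles back to: a single .loc call
# on the original frame, which updates it unambiguously.
df.loc[df["a"] > 1, "b"] = 0
print(df["b"].tolist())  # [4, 0, 0]
```

An LLM that "fixes" the warning by copying the subset, then "fixes" the resulting stale data by going back to subsetting, can loop between these two forms indefinitely.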

Also, I'm a big fan. I use GPT-4 all the time. So I'm not waving it away, just kind of curious how it sometimes fails in unexpected ways.



> The more of the web that is LLM generated, then the more new models will be trained on generated data, and will fuzz out

And yet it's so obvious that a random Hacker News commenter independently discovers it and repeats it on every ChatGPT post, and prophesies it as some inevitable future. Not could happen, will happen. The clueless researchers will be blindsided by this, of course; they'll never see it coming from their ivory tower.

And yes, ChatGPT fails to write code that runs all the time. But it's not very interesting to talk about without an example.


How? It isn't exactly easy to reproduce these examples. I'd have to write a few pages to document and explain it, and scrub it to remove anything too internal so I could create a vanilla example of the bug. And then it would be too long for a post, so what then? I'd have to go sign up to blog it somewhere and link to it.

I'm not arguing that GPT is bad, just that it is as susceptible to rabbit holes as any human.

I'm actually having a hard time narrowing down where your frustration is aimed.

At naysayers? At those who don't put effort into documenting? At GPT itself? Or that a news site on the internet dares to have repetition?


So someone could sabotage LLMs by writing some scripts to fill GitHub (or whatever other corpus is used) with LLM-generated crap? Someone must be doing this, no?



