Thank you for trying! I first built it as a 'detect the human' game, but that ran counter to the 'slop or not' framing. Yeah, I'm observing the same based on the first few hundred people's results. The harder models seem to write almost too well, and that's generally not how humans write on the internet unless it's a blog post/essay. The easier models seem to be the ones tripping people up the most.
Ha :) I'm not building models, nor am I affiliated with any big labs. The idea is to use this to educate people on how to spot the tells of AI writing. Although, like any data that's made open, this could be used to train future models as well, I suppose.
Thank you so much for taking a look :) Yeah, you'd be surprised how difficult it can get to spot the nuances sometimes. And sometimes there isn't any nuance at all, and the AI is just as good at writing about a topic while only pretending to know it.
You'd be surprised at the nuances we tend to miss :)
This time around I didn't necessarily prompt the models to be adversarial - I didn't ask them to try and fool the reader. But I did give them contextual info - something to the effect of "you're a user posting on Hacker News".
True, if you look for all the "obvious patterns" and filter those out of the dataset, probably not much will be left. Maybe the best approach is then to just publish as complete a dataset as possible: all questions, all user answers, the number of questions each user did, time per question, etc. Then people using that dataset can draw their own conclusions.
Thanks for checking it out! The color signal is useful feedback. Let me think about it and rework!
Yeah, there are some very obvious tells, but the most capable models are very good at writing like a human.
Especially when the human responses to the Reddit or HN prompts were presumably written after reading the content of the article or post, while the model is simply going off the title.
Thanks for checking it out! The obvious ones are (hopefully) the weaker models :) But yes, my experience has been that unless you're engaging with human-written content consistently, the line really blurs easily.
Absolutely. Maybe the best way to do this would be some kind of recipe store where the user can run (and we can fuzzy match?) tshark one-liners. I'd love your thoughts on what the easiest/quickest integration would be.
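To make the fuzzy-matching idea concrete, here's a minimal sketch of what a recipe store could look like: a named set of tshark one-liners, with the user's free-text query matched against recipe names via stdlib `difflib`. The recipe names, the `find_recipe` helper, and the hardcoded commands are all placeholders I made up for illustration, not an actual integration.

```python
# Minimal recipe-store sketch: fuzzy-match a user's query against
# named tshark one-liners. Names and commands here are placeholders.
import difflib

RECIPES = {
    "dns queries": "tshark -r capture.pcap -Y 'dns.flags.response == 0' -T fields -e dns.qry.name",
    "http hosts": "tshark -r capture.pcap -Y http.request -T fields -e http.host",
    "tcp retransmissions": "tshark -r capture.pcap -Y tcp.analysis.retransmission",
}

def find_recipe(query: str, cutoff: float = 0.4):
    """Return (name, command) for the closest recipe, or None if nothing is close enough."""
    matches = difflib.get_close_matches(query.lower(), RECIPES, n=1, cutoff=cutoff)
    if not matches:
        return None
    name = matches[0]
    return name, RECIPES[name]

# A slightly misspelled query still resolves to the right recipe.
print(find_recipe("dns querys"))
# An unrelated query returns None instead of a bad guess.
print(find_recipe("zzzz"))
```

A real version would presumably also match on descriptions/tags and show the command for confirmation before running it, rather than executing the closest match blindly.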