
I really like Oxide's take on AI for prose: https://rfd.shared.oxide.computer/rfd/0576 and how it breaks the "social contract" where usually it takes more effort to write than to read, and so you have a sense that it's worth it to read.

So I get the frustration that "ai;dr" captures. On the other hand, I've also seen human writing incorrectly labeled AI. I wrote (using AI!) https://seeitwritten.com as a bit of an experiment on that front. It basically is a little keylogger that records your composition of the comment, so someone can replay it and see that it was written by a human (or a very sophisticated agent!). I've found it to be a little unsettling, though, having your rewrites and false starts available for all to see, so I'm not sure if I like it.
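A rough sketch of what "record the composition, then replay it" could look like. This is not how seeitwritten.com actually works; the `KeyEvent` record, the `replay` function, and the `speed` parameter are all hypothetical names made up for illustration. The idea is just to store each keystroke with a timestamp and rebuild the text frame by frame.

```python
from dataclasses import dataclass


@dataclass
class KeyEvent:
    t_ms: int       # milliseconds since the start of composition
    kind: str       # "insert" or "backspace" (assumed event types)
    char: str = ""  # the typed character; empty for a backspace


def replay(events, speed=1.0):
    """Yield (delay_seconds, text_so_far) for each recorded event.

    `speed` scales the delays, so speed=5.0 plays back five times faster.
    """
    text = []
    prev_t = 0
    for ev in events:
        if ev.kind == "insert":
            text.append(ev.char)
        elif ev.kind == "backspace" and text:
            text.pop()
        delay = (ev.t_ms - prev_t) / 1000.0 / speed
        prev_t = ev.t_ms
        yield delay, "".join(text)


# A tiny recording: type "hu", delete the "u", then type "i".
events = [
    KeyEvent(0, "insert", "h"),
    KeyEvent(120, "insert", "u"),
    KeyEvent(400, "backspace"),
    KeyEvent(650, "insert", "i"),
]
frames = list(replay(events, speed=5.0))
final_text = frames[-1][1]
```

A real player would sleep for each delay before drawing the frame; the false starts stay visible because every intermediate state is kept.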




My biggest sorrow right now is the fact that my beloved emdash is a major signal for AI generated content. I've been using it for decades now but these days, I almost always pause for a second.

> I've been using it for decades now but these days, I almost always pause for a second.

Wrote about this before [0] but my 2c: you shouldn't pause and you should keep using them because fuck these companies and their AI tools. We should not give them the power to dictate how we write.

[0]: https://manuelmoreale.com/thoughts/on-em-dashes


That's not really how it works.

Gemini tells me that for thousands of years, the swastika was used as "a symbol of positivity, luck and cosmic order". Try drawing it on something now and showing it to people. Is this an effective way to fight Nazism?

I think it's brave to keep using em dashes, but I don't think it's smart, because we human writers who like using them (myself very much included) will never have the mindshare to displace the culturally dominant meaning. At least, not until the dominant forces in AI decide of their own accord that they don't want their LLMs emitting so many of them.


When you say "show it to people" I guess you don't mean the people in India, Japan, etc who still use the symbol for its original purpose?

I think it's safe to assume they meant it within their specific cultural context. That the symbol has different connotations in other cultures doesn't really change the point being made.

My point is just: if a test for what a symbol ‘really means’ depends on choosing an audience that conveniently erases everyone who uses it differently, that’s not describing intrinsic meaning, that’s describing the author’s cultural bubble and bias.

And on em dashes—most people outside tech circles see no “AI fingerprint,” and designers like myself have loved them since early Mac DTP, so the suspicion feels hilariously retroactive and very knee-jerk. So what if somebody thinks my text here is written by a bot?


> So what if somebody thinks my text here is written by a bot?

Then they might not read it at all. I often zone out as soon as I expect I'm reading slop and that's the reason I try to ensure my own writing isn't slop adjacent.

I'm also not sure there is an "AI bubble." Everyone I know is using it in every industry. Museum education, municipal health services, vehicle engineering, publishing, logistics, I'm seeing it everywhere.

As mentioned elsewhere I've seen non-tech people refer to them as "AI dashes."

> if a test for what a symbol ‘really means’

There was no suggestion of such a test. No symbol has an intrinsic meaning. GP's point was about considering how your output will be received.

That point was very obviously made within a specific cultural context, at the very least limited to the world of the Latin alphabet. I'm sure there are other LLM signifiers outside of that bubble.


> I often zone out as soon as I expect I'm reading slop and that's the reason I try to ensure my own writing isn't slop adjacent.

And how is this a problem someone else has to address? Some people zone out when they see a text is too long: are we supposed to only publish short form then? I have 10 years of writing on my site, if someone in 2026 sees my use of em dashes and suddenly starts thinking that my content is AI generated that's their problem, not mine.

Too many people are willingly bending to adapt to what AI companies are doing. I'm personally not gonna do it. Because again, now it's em dashes, tomorrow it could be a set of words, or a way to structure text. I say fuck that.


> And how is this a problem someone else has to address?

Where has anyone made the claim that it is?

> Some people zone out when they see a text is too long: are we supposed to only publish short form then?

No, but a good writer will generally consider if their text is needlessly verbose and try to make it palatable to their audience.

> starts thinking that my content is AI generated that's their problem, not mine.

If you want to reach them with your writing then it might become a problem. Obviously the focus on em dashes alone isn't enough but it's undoubtedly one of the flags.

> Too many people are willingly bending to adapt to what AI companies are doing.

It's bending rather to what readers are feeling. It's not following the top down orders of a corporation, it's being aware of how technology shapes readers' expectations and adapting your writing to that.


I'm not confident that the average person is aware of an em dash nor that it is widely associated with AI; I think the current culturally dominant meaning is just a fat hyphen (which most people just call a dash anyway).

My wife was working from home recently and I overheard a meeting she was having. It's a very non technical field. She and her team were working on a presentation and her boss said "let's use one of those little AI dashes here."

I find that amusing but I know somewhere an English major is crying.

> Gemini tells me that for thousands of years, the swastika was used as "a symbol of positivity, luck and cosmic order". Try drawing it on something now and showing it to people. Is this an effective way to fight Nazism?

I'm happy to change my position when some 13 million people are killed by lunatics that used the em dash as the symbol of their ideology. Until then, I'll keep using it everywhere it's appropriate.

Also, if we don't have the guts to resist even when the stakes are this low and the consequences for our resistance are basically non existent, then society is doomed. We might as well roll on our side and die.

> At least, not until the dominant forces in AI decide of their own accord that they don't want their LLMs emitting so many of them.

It's not a power I'm willing to give them. What if tomorrow they tweak something and those tools start to use a specific word more often? Or a different punctuation sign? What do we do then? Do we constantly adapt, playing whack-a-mole? What if AI starts asking a lot more questions in their writing? Do we stop asking them as a result?

You feel free to adapt and bend. I'm personally not going to do it, and if someone starts thinking that I'm using AI to write my thoughts as a result, that's on them.


The hooked cross is the Nazi symbol; historians attributed it to a different culture to save the word 'cross'.

For what it's worth, whatever LLMs do extensively, they do because it's a convention in well-established writing styles.

LLMs have a bias towards expertise and confidence due to the proportion of books in their training set. They also lean towards an academic writing style for the same reason.

All this to say, if LLMs write like you were already writing, it means you have very good foundations. It's fine to avoid them out of fear, but you have this Internet stranger's permission to use your em dash pause to think "Oh yeah, I'm the reference for writing style."


> For what it's worth, whatever LLMs do extensively, they do because it's a convention in well-established writing styles.

I think that's only part of the story. I think that while it's true what LLMs do is somehow represented in their corpus of training data, they also lack any understanding of how to adapt to the context, how to find a suitable "voice", and how not to overdo it, unless you explicitly prompt them otherwise, which is too much of a burden. Their default voice sucks, basically.

So let's say they learned to speak in Redditese. They don't know when not to speak in that voice. They always seem to be trying to make persuasive arguments, follow patterns of "It's not X. It's Y. And you know it (mic drop)." But real humans don't speak like this all the damn time. If you speak like this to your mom or to your closest friends, you're basically an idiot.

It's not that you cannot speak like this. It's that you cannot do it all the time. And that's the real problem with LLMs.

(Sorry, couldn't resist!)


I think that bias is not due to the proportion of books and more due to how they are fine-tuned after the pretraining.

Aren’t books massively outweighed by the crawled internet corpus?

I would doubt that, because books are probably weighted as higher quality and more trustworthy than random Reddit posts.

Especially if it's unsupervised training


To quote Office Space, “Why should I change? He’s the one who sucks.”

Mostly because when I see an em dash now, I assume that it was written by AI, not that the author is one of the people who puts enough effort into their product that they intentionally use specific sized dashes.

AI might suck, but if the author doesn't change, they get categorized as a lazy AI user, unless the rest of their writing is so spectacular that it's obvious an AI didn't write it.

My personal situation is fine though. AI writing usually has better sentence structure, so it's pretty easy (to me at least) to distinguish my own writing from AI because I have run-on sentences and too many commas. Nobody will ever confuse me with a lazy AI user, I'm just plain bad at writing.


> assume

There's your trouble. The real problem is that most internet users are setting their baseline for "standard issue human writing" at exactly the level they themselves write. The problem is that more and more people do not draw a line between casual/professional writing, and as such balk at very normal professional writing as potentially AI-driven.

Blame OS developers for making it easy—SO easy!—to add all manner of special characters while typing if you wish, but the use of those characters, once they were within easy reach, grew well before AI writing became a widespread thing. If it hadn't, would AI be using it so much now?


As someone who frequently posts online- with em dashes- I wonder if I am part of the problem with training llms to use them so much- and am going to get punished in the future for doing so.

I also tend to way overuse parentheses (because I tend to wander in the middle of sentences) but they haven't shown up much in llms so /shrug.


If you’re judging my writing so shallowly, I don’t think I’m writing for you.

> If you’re judging my writing so shallowly, I don’t think I’m writing for you.

No, you are writing for people who see LLM-signals and read on anyway.

Not sure that that's a win for you.


"Seeing LLM-signals" == "reading shallowly", so I think I covered that case.

Or you're writing for the people who haven't deluded themselves into thinking that they're magical LLM detectors, which definitely does seem like a win.

> Or you're writing for the people who haven't deluded themselves into thinking that they're magical LLM detectors, which definitely does seem like a win.

What delusion? The false positive rate just on HN alone is so low it's not even a rounding error.


I don’t think I’m judging shallowly- there is no em-dash on a standard keyboard. The one way it ends up in real writing is if you use a typesetting program like LaTeX, or Word changes an en-dash with auto formatting, or the user consciously interrupts their writing flow to insert the character with a special keystroke combination or by pasting it in. The proportion of people who do any of those things in writing for the web is quite small. The number of clearly AI written posts with em-dashes is quite large. So large, that I immediately suspect AI writing when I see an em-dash and I rarely see countering evidence that suggests the author is human but meticulous about how they write.

> there is no em-dash on a standard keyboard. The one way it ends up in real writing is (…)

Then you proceed to list multiple ways to do it, but neglected to mention that by default on Apple operating systems they are inserted automatically when typing “--“. It’s something you have to explicitly turn off if you don’t want it. On Apple mobile operating systems you can also long press the hyphen to get the option. Em-dashes are trivial to type.


Both of the examples you gave fall under "a special keystroke combination," which I did list. Typing "--" is two keystrokes compared to one for an en-dash.

The iOS example isn't just "long press the hyphen" it's "press the [123] button, long press the hyphen, and slide your finger over the em-dash" compared to "press the [123] button, long press the hyphen" for the en-dash.

If you're going to argue at least be genuine. I didn't say it was hard to type an em-dash, I showed that every way to get an em-dash into your writing takes an extra step. Taking an extra step compared to other characters means it isn't trivial.

For someone writing publication quality work, em-dashes appear and if I see an em-dash in a book I don't assume AI writing. But for comments on the internet or blog posts that aren't meticulous everywhere else, an en-dash is a pretty good signal that the work is AI generated. When people are writing, needing an extra step to insert an em-dash is disruptive to most people's train of thought.


> I didn't say it was hard to type an em-dash

Neither did I say you said that. I only said they are trivial to type. Which they are. I do it all the time, and it doesn’t interrupt my train of thought any more than a comma. I also do the keyboard shortcuts for things like “smart quotes” and apostrophes (’). For some of those I even have my own special snippets in Alfred, like typing "" produces “” with the caret in between. I can’t even tell you what the exact shortcuts for those are without looking at my fingers, because they are so ingrained in my muscle memory. I know I’m far from alone in that.

> But for comments on the internet or blog posts that aren't meticulous everywhere else, an en-dash is a pretty good signal that the work is AI generated.

Provably false.

https://news.ycombinator.com/item?id=45071722


To be fair, if they don’t know that they probably run Android, and are you even writing for them?

I bet their bubbles are… green. The horror!


To continue the story, the guy saying this got fired and probably wouldn't have without taking this stand.

Exactly this! I love(d) using em dashes. Now they’ve become ehm dashes, experiencing exactly that pause — that moment of hesitation — that you describe

AI never uses em dashes in a pair like this, whereas most people who like em dashes do. Anyone who calls paired em dash writing AI is only revealing themselves to be a duffer.

In my limited text generation experience, LLMs use em-dashes precisely like that, only without spaces on the sides and always in pairs in a single sentence. Here are some examples from my Gemini history:

"The colors we see—like blue, green, and hazel—are the result of Tyndall scattering."

"Several interlocking cognitive biases create a "safety net" around the familiar, making the unknown—even if objectively better—feel like a threat."

"A retrograde satellite will pass over its launch region twice every 24 hours—once on a "northbound" track and once on a "southbound" track—but because of the way Earth rotates, it won't pass over the exact same spot on every orbit."

"Central, leverages streaming telemetry to provide granular, real-time performance data—including metrics (e.g., CPU utilization, throughput, latency), logs, and traces—from its virtualized core and network edge devices."

"When these conditions are met—indicating a potential degradation in service quality (e.g., increased modem registration failures, high latency on a specific Remote PHY)—Grafana automatically triggers notifications through configured contact points (e.g., Slack, PagerDuty)."

After collecting these samples I've noticed that they are especially probable in prompts like "explain something" or "write descriptive text". In short queries there is not enough text in total to trigger this effect.


Yes, the LLMs have made great progress in that regard. It wasn't too long ago that the majority of dashes seen in LLM material could have been commas, periods, or nothing at all with no loss of tone or meaning, and almost none were used to offset parenthetical phrases. It was nearly exclusively an overdramatic flourish. And now look at them. They're growing up. Just makes you want to squeeze them until they pop.

> ”AI never uses em dashes in a pair”

I wish that were true, but I feel a little bit vindicated nevertheless


(Claude Opus 4.6 does use double hyphens to simulate em-dashes, in code comments.)

Embrace the double hyphen -- it's still attested in Garner's ;)

We're in the brief window of time when AI's writing style is the weirdness. It's an artifact of the production process, like JPG blur, MP3 distortion, autotune's rigidity. And it didn't take long for those things to become normalized, in fact for them to become artifacts that people proudly adopted and embraced. DJs release tracks built from MP3 samples instead of waves. Autotune is famously a 'sound' that was once something to be subtly added and never confessed to, but which now genres and artists lean into rather than away from.

Long story short: I think emoji in headings and lists, em dashes, and the vile TED Talk paragraph structure of "long sentence with lots of words asking a question or introducing a possibility. followed by. short sentences. rebutting. or affirming." are here to stay. My money is that it gets normalized and embraced as "well of course that's how you best communicate because I see it everywhere."


Short sentences were popularized in writing only in the last hundred and fifty years. Styles change.

Yes, but it's kinda sad, isn't it, that this robotic way of writing in turn teaches a new generation of people how to write?

Also, you forgot the extremely enervating: "It's not X. It's Y. <Clincher>."


> "well of course that's how you best communicate because I see it everywhere."

These assumptions might also change though. Up until now any writing you saw "everywhere" was probably written by someone who studied and loved written communication and was bringing their artisanal care to the table. That's no longer the case.

It's called slop for a reason. When I come across a GitHub README written by AI I don't feel put off just because the author used AI to write it, I feel frustrated because it's genuinely poorly communicating with me. Full of extraneous details, artifacts from the conversation, and stuff I already know ("uses GitHub to share the source democratically!").


I've gone back to using two dashes--LLMs typically don't write them that way.

I'm going to propose that we name this the --gnu-long-form :)

I used to enjoy the literate usage of the word "literally".

You'll get over it.


Using literally to mean figuratively goes back hundreds of years

Not to mention "seriously", "really", "truly", "very", "verily", etc. There's a long history of using words related to truth as intensifiers in English.

Also, unfortunately I have in my global instructions to never use em dashes...

Maybe I'll get over it eventually.

What I do – and I know this isn't conventional style – is use en dashes. (Or, you could use spaces between em dashes, as incorrect as it is.)

Chicago says to format dashes like this—and ellipses . . . like this. . . .

AP says to format dashes like this — and ellipses ... like this. ...

Who's "correct"?


I've noticed that LLM-generated text often has spaces around em dashes, which I found odd. They don't always do that, but they do it often enough that it stood out to me since that isn't what you'd normally see.


> Or, you could use spaces between em dashes, as incorrect as it is.

That's the normal way of using them in British English. Though they also tend to be the (slightly shorter) en-dashes too.

I feel that style is pretty common on the "old" internet - possibly related to how easily they could be replaced by a hyphen back when ASCII was a likely limitation.


> Or, you could use spaces between em dashes, as incorrect as it is.

It's a matter of style preference. I support spaces around em-dashes — particularly for online writing, since em-dashes without spaces make selecting and copying text with precision an unnecessary frustration.

By the way,what other punctuation mark receives no space on at least one side?Wouldn't it look odd,make sentences harder to read,and make ideas more difficult to grok?I certainly think so.Don't you? /s


I use it to trigger false positives in haters – why not?

I don't think someone who doesn't want AI slop filtering out someone who gets mad at that to the point of calling them haters is really a false positive.

My history teacher taught me to use "8==3" instead; the Romans used it to sign their graffiti.

This is the modern day "I can tell that's photoshopped because I've seen some 'shops in my day." The sooner we stop glorifying the people who think they're magical LLM detectors, the better, frankly.

It doesn't have to be a perfect filter to be a good heuristic. And unless you have a better suggestion how people can avoid slop then it'll keep being used.

The correct thing to do is to use an en-dash with spaces. ;)

You can still use them — it’s just that they have a new purpose; getting things ignored by AI detection or AI;DR.

Now you can ask for outlandish things at work knowing your boss won’t read it and his summariser will ignore it as slop — win.


You’re absolutely right. I hate AI writing — it’s not that I hate AI, it’s that it makes everything it says sound a specific combination of smug and authoritative — No matter the content. Once you realize it’s not saying anything, that’s the real aha moment.

/s


The problem with AI writing is that it's a waste of everyone's time.

It’s literal content expansion, the opposite of gzip’ing a file.

It’s like a kid who has a 500 word essay due tomorrow who needs to pad their actual message up to spec.


Well, LLMs can be either side of that. They can also be used to turn something verbose into a series of bullet points.

I agree that reading an LLM-produced essay is a waste of time and (human) attention. But in the case of overly-verbose human writing, it's the human that's wasting my time[1], and the LLM is gzip'ing the spew.

[1] Looking at you, New Yorker magazine.


> I agree that reading an LLM-produced essay is a waste of time and (human) attention. But in the case of overly-verbose human writing, it's the human that's wasting my time[1], and the LLM is gzip'ing the spew.

gzip is lossless, LLM summaries are not.
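For anyone who wants the "lossless" part made concrete, here is a minimal sketch using Python's standard `gzip` module. The sample text is made up; the point is only that decompressing gives back exactly the bytes that went in, which is a guarantee no summary can make.

```python
import gzip

# Made-up, deliberately repetitive sample text.
original = ("The quick brown fox jumps over the lazy dog. " * 20).encode("utf-8")

compressed = gzip.compress(original)
restored = gzip.decompress(compressed)

# Lossless: the round trip returns the exact original bytes.
round_trip_ok = restored == original
# The compressed form is smaller because the input is highly redundant.
saved_bytes = len(original) - len(compressed)
```

An LLM summary has no equivalent of `gzip.decompress`: there is no operation that recovers the original essay from the bullet points.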


We can run each LLM-produced essay through an LLM to reduce it to bullet points.

We can then run it through another LLM to reduce it to a single bullet point.

Then we can run it through a final filter, which reduces it to "ai;dr".


Right, we are headed towards LLM-generated slop summarized by another LLM. The wire format is expanded slop.

I like the idea that various communications media have implicit social contracts that can be broken. In my opinion, power point presentations break an implicit social contract that is held in handwritten talks: if it's worth you displaying a piece of information, so that I the listener feel the need to take it in or even copy it down, it has to be worth your time to actually physically write it on the board. With power point talks this is not honored, and the average power point talk is much, much worse than the average chalk talk. I bet there are lots of other examples.

Go thee to the land of government contracting and see thou how well thine ideas hold up.

I actually have worked in this space and it, uh, has not shaken my belief that powerpoint talks are bad.

Well the thing with powerpoint presentations is that the listener doesn't have to write them down and can instead use the copy you share with them if they need a future reference.

And it should still be worth their while to listen if you don't suck at presenting and don't just read the text from the slides.


In 2020 at the start of covid, I did an experiment I called Project 35 where, for 35 days straight before my 35th birthday, I wrote 3 times per day, for 10 minutes each, I livestreamed it and whatever I wrote I would put directly into a book with no edits. While I didn't invite many people to join the calls (maybe fear, maybe just not wanting to coordinate it all), I found the process to be more raw, more human, and less perfect than 10x edited writing. It also helped me get better at typing in the moment and not rewriting everything, especially for social media, HN, and other places.

Anyway, it's at https://www.jimkleiber.com/p35/ if you wanna check it out, all sessions posted as blog posts, I think there's a link to the ebook (pay-what-you-want) and there may be audio (I recorded myself reading the writing right after each session).

If you check it out, please let me know :-)


LLM-generated prose undermines a social contract of sorts: absent LLMs, it is presumed that of the reader and the writer, it is the writer that has undertaken the greater intellectual exertion. (Cantrill)

The amount of energy needed to refute bullshit is an order of magnitude bigger than that needed to produce it. (Brandolini)

https://en.wikipedia.org/wiki/Brandolini's_law


Years ago I wrote something similar to test a biometric security piece that used keystroke timings (dwell and stroke) to determine if the person typing the password is the same person who owns the account. Short version of a long story is that it would be trivial to get data for AI to reproduce human typing. Because I did it years ago using something only slightly more sophisticated than urandom.
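To illustrate how little it takes, here is a hedged sketch, not the commenter's actual tool. It assumes "dwell" means how long a key is held down and treats the second timing as the gap between releasing one key and pressing the next (called "flight" here). The function names, the Gaussian model, and the default means and spreads are all assumptions picked for illustration.

```python
import random


def dwell_and_flight(events):
    """Compute timing features from (key, down_ms, up_ms) tuples in typing order."""
    dwell = [up - down for _, down, up in events]
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return dwell, flight


def synth_events(text, rng, dwell_mean=95.0, dwell_sd=20.0,
                 flight_mean=140.0, flight_sd=50.0):
    """Generate fake key events with Gaussian-jittered timings (assumed model)."""
    events = []
    t = 0.0
    for ch in text:
        dwell = max(20.0, rng.gauss(dwell_mean, dwell_sd))  # clamp to a floor
        events.append((ch, t, t + dwell))
        t += dwell + max(10.0, rng.gauss(flight_mean, flight_sd))
    return events


# Hand-made example: "a" held 100 ms, 50 ms gap, "b" held 80 ms.
dwell, flight = dwell_and_flight([("a", 0, 100), ("b", 150, 230)])

# Synthetic "typing" of a password with a seeded generator.
fake = synth_events("hunter2", random.Random(42))
fake_dwell, fake_flight = dwell_and_flight(fake)
```

A per-user model would fit the means and spreads to recorded samples instead of using fixed defaults, which is exactly why a timing check alone is weak evidence of a human at the keyboard.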

Man, failing that device's check because I'm sleep deprived or drunk would be a world of pain lol

> https://seeitwritten.com

Fun, I'd make playback speed something like 5x or whatever feels appropriate, I think nobody truly wants to watch those at 1x.


I had a take on this same thing a number of years ago. Much simpler, but the idea was just to see it at a glance. https://miniatureape.github.io/sprezzatura/

yeah the idea is not new at all:

https://news.ycombinator.com/item?id=557191

I can't believe etherpad lost this item...

edit: oh, I found the one I was looking for: https://byronm.com/13sentences.html


You could totally make a believable timing generation model from a few (hundreds) recordings of human writing. Detecting AI is hard...

Based on the programs I was nudged to as a child, it was a surprise to no one but me that I scored higher verbal on the SATs than I did math, which I would have told you was my favorite subject. Despite the fact that French was my easiest subject. I can still picture the look on my French teacher’s face if I’d have mentioned this in front of him.

There are a lot of people like me in software. I’m tempted to say we are “shouted down”, but honestly it’s hard to be shouted down when you can talk circles around some people. But we are definitely in a minority. There are actually a lot of parallels between creative writing and software and a few things that are more than parallel. Like refactoring.

If you’re actually present when writing docs instead of monologuing in your head about how you hate doing “this shit”, then there’s a lot of rubber ducking that can be done while writing documentation. And while I can’t say that “let the AI do it” will wipe out 100% of this value, because the AI will document what you wrote instead of what you meant to write, I do think you will lose at least 80% of that value by skipping out on these steps.


I don't like AI writing because it's bad writing. It's convoluted and inefficient and doesn't get to the point. If someone writes something that feels like AI, it doesn't matter if it was or not because it's still bad writing. I'm not talking about having a cliche here and there, but rather when the text is just incredibly inefficient.

I like the idea, but personally I would rather be thought a bot than show that I’m a human idiot who takes three tries to spell basic words.

I respect Oxide a lot. And here, too, with their non adoption of the marketing term (AI) for LLMs, ML.

ai;dr is the new "looks shopped, I can tell by the pixels".

Re: unsettling: perhaps it could replace any characters that will go on to be deleted with asterisks.
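One possible way to do that masking, as a sketch under assumptions: the event format (`("insert", ch)` and `("backspace", "")` tuples) and the `mask_doomed` name are hypothetical, and only end-of-text backspacing is handled. A first pass finds which inserted characters get deleted later; a second pass replays the log showing those as asterisks.

```python
def mask_doomed(events):
    """Replay an edit log, showing '*' for characters that are later deleted.

    events: list of ("insert", ch) or ("backspace", "") tuples (assumed format).
    Returns the visible text after each event.
    """
    # Pass 1: find the indices of inserts that a later backspace removes.
    stack, doomed = [], set()
    for i, (kind, _) in enumerate(events):
        if kind == "insert":
            stack.append(i)
        elif kind == "backspace" and stack:
            doomed.add(stack.pop())

    # Pass 2: replay, masking the doomed characters.
    frames, shown = [], []
    for i, (kind, ch) in enumerate(events):
        if kind == "insert":
            shown.append("*" if i in doomed else ch)
        elif kind == "backspace" and shown:
            shown.pop()
        frames.append("".join(shown))
    return frames


# Typing "hu", deleting the "u", then typing "i".
frames = mask_doomed([
    ("insert", "h"),
    ("insert", "u"),
    ("backspace", ""),
    ("insert", "i"),
])
```

Viewers would still see that a rewrite happened and how long it took, just not the misspelling itself.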

This can only be fixed by authors paying humans to read instead of the other way around.

To be fair, Oxide is a joke.

They want all this artisanal handwritten prose under the candle light with the moon in the background. And you are a horrible person for using AI, blablabla.

But ask for feedback? And you get Inky, Blinky, Pinky, and Clyde. Aka ghosted. But boy, do they tell a good story. Just ain't fucking true.

Counter: companies deserve the same amount of time invested in their application as they spend on your response.



