Hacker News | dspillett's comments

The same could be said of robots.txt

And anything else that might tell them not to access something.


robots.txt predates the modern web though

My point was that llms.txt not working is no different from them ignoring everything else that came before and probably everything that is yet to come.

If they want it, they will take it; polite directives in text files will have no effect.


> it was blocked by captcha

If the captcha is there to block the automated creation of accounts, then the difficulty you had creating accounts autonomously is completely intentional.

> and consume all my tokens

Your bad coding/config (or your agent's bad coding/config) allowing runaway token use is a problem for you, not a problem for everyone else.

> but we are in agentic era now.

Enabling autonomous account creation will enable a huge pile of spam and scam account creation. We've been in the spam/scam era for decades and it will not be over in the foreseeable future, so your agents will have to put up with that just like us humans do.

> that feature is a friction for agents

[reaches for microscope to aid playing the appropriate violin]

> this validates the thesis and narrative of agentic first git platform

Perhaps your next thesis should be the result of a project where you design and fully verify a system that will solve the spam/scam problem that enabling your use case will also enable, instead of just expecting the world to make all the effort of rearranging itself around the way you want to work.

Automated account creation is quite likely a direct breach of their AUP anyway. Maybe you should instruct your agents to check such things and try not to do things that are in direct contravention of the rules of the tool they are trying to use/abuse.

--------

If your response to this is “Fine, our agents just won't use your systems then.”, then that is great: The system works!


https://archive.is/pcRNR for those who prefer minimal stalking as they travel the web.

[even when the top-level tracking preferences look fully off, if you dig down you'll find some “part” on, and you can't set them fully off (you are blocked from disabling tracking by Amazon at least)]

[Mental note to self: add “windowslatest.com” to “are you really sure you want to go there?” DNS greylist]


£5.50+ is for the better “main” options, some of which used to be in the base deal, in most places that do a meal deal.

> often Conservatives

Almost always Conservatives, the key exception in recent history being Tony “Tory Lite” Blair's time in office (he pretty much ignored many years of promises to undo the direction Thatcher and Major had taken NHS and university/student funding should Labour be returned to power, greatly irritating many of us who voted for them that time around). Unfortunately this is a common pattern: parties like Labour get control and realise how hard it is going to be to fight what has been set in motion, so do too little or actively push on in the existing direction (just applying a little lipstick to the pig for public appearances). The current lot are trying to do better in that regard, but are failing so impressively elsewhere that they likely won't have a second term and one term is not enough to build momentum, so their replacement will just put a stop to any good that has actually been achieved. The scary thing is that their replacement (assuming Reform don't rip themselves apart from the inside between now and the next election, which is something there is still hope of happening) might make the old Tories look extremely moderate.


> they will be public as paid training data.

Your data is already training data. If they promise to delete everything from their models or those elsewhere that they made the data available to, even if you pay, I'd call them liars.


> I think this move is subjectively scummy

I'd argue that it is objectively scummy.


If they can't, then why did they offer to (or at least give the impression that they were going to)?

That is just restating the problem: democracy breaks when money easily overrides the needs/desires of the people.

We also have votes, but unfortunately getting people to consider this sort of issue while casting their ballot¹ is rather difficult. Getting people to vote for the bigger picture for their benefit and that of us all doesn't work well as they are far more likely to vote on a single issue that has been in the news recently and whoever they vote for will u-turn on once in control anyway…

--------

¹ or even getting them to care at all, in the case of European elections [not that this is directly relevant here because of spit brexit].


Another possibility for why it needs to be done that way is dealing with error conditions.

I've not looked at the code (or even the man pages) and it is a long time since I touched anything that low level, so this might be completely wrong, but if there is an error before the next 64KiB (including just hitting EOF) then the semantics could be different. Asking for 1x64KiB I would expect to just error as there aren't the requested number of bytes. Asking for 64Ki lots of 1 byte might simply error just the same, or it might at least populate the buffer with what it can read, or if the meaning of 1,65536 is actually “up to 64Ki lots of 1B” then it would populate the buffer as far as possible and return the amount read rather than an error condition.

If the per-byte option is slow but still fast enough, and dealing with the semantics is less faff, then people will go for that because the tiny time loss is worth the larger effort reduction. Of course this assumes the underlying system doesn't change, as with the “making local code to run as on-demand networked code” example higher in the thread, which changes the relative performance characteristics of the two calling methods significantly.
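
To make the two call shapes above concrete: assuming the function under discussion is C's fread (the comment never names it), the standard behaviour is that it returns the number of complete items read, and a short count covers both EOF and error. The Python below is only an emulation of that return convention, not a binding to the real call; fread_like, CHUNK and payload are hypothetical names made up for the sketch.

```python
import io

def fread_like(stream, size, count):
    """Emulate C fread's return convention: the number of COMPLETE items read.

    Hypothetical helper for illustration only; it does not call libc.
    """
    data = stream.read(size * count)  # may come back short at EOF
    return len(data) // size, data

CHUNK = 64 * 1024
payload = b"x" * 1000  # fewer bytes than one 64 KiB item

# One item of 64 KiB: the item is incomplete, so the reported count is 0,
# even though some bytes were transferred.
items, data = fread_like(io.BytesIO(payload), CHUNK, 1)
print(items, len(data))  # 0 1000

# 64 Ki items of 1 byte each: the reported count is the number of bytes read.
items, data = fread_like(io.BytesIO(payload), 1, CHUNK)
print(items, len(data))  # 1000 1000
```

So under that assumption the guess of “up to 64Ki lots of 1B” is about right: the per-byte form tells you how much arrived, while the single-large-item form only tells you that the item was not completed, leaving the caller to work out the rest.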

