Ahh, but the cookie is created client-side by some JavaScript that does computationally intensive work for a second. Doesn't bother you as an end-user, but if you're writing a crawler and you're not driving a headless browser (expensive), then you probably can't trivially run arbitrary JavaScript code (or else you have the extra work of integrating Deno or something to do that part for you).
Either way, it means you can't just curl the webpage and get it. That's obviously the point when defeating DDoS attacks is the use case, but it doesn't work for crawlers, many of which are legitimate "users" like in the article.
These services should offer some other easier proof-of-work mechanism.
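For context, the usual scheme behind these cookies is hashcash-style proof-of-work: the server hands the client a challenge, the client burns CPU searching for a nonce whose hash meets a difficulty target, and the server verifies with a single hash. A minimal sketch (the challenge string, difficulty, and function names here are illustrative, not any particular vendor's protocol):

```python
import hashlib
import itertools

def solve_pow(challenge: str, difficulty: int) -> int:
    """Client side: find a nonce so sha256(challenge + nonce) starts with
    `difficulty` zero hex digits. Cost grows ~16x per difficulty step."""
    target = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce

def verify_pow(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash, essentially free to check."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

# Illustrative values: difficulty 4 means ~65k hashes on average.
nonce = solve_pow("example-challenge", 4)
print(verify_pow("example-challenge", nonce, 4))
```

The asymmetry (expensive to solve, trivial to verify) is the whole point, and it's also why you can't shortcut it in a curl script without reimplementing the solver yourself.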
> computationally intensive stuff for a second. Doesn't bother you as an end-user
Actually, it does. It wastes the CPU cycles of my devices for absolutely no good reason. It's very environmentally unfriendly as well, and it shifts the cost onto the end user, which is not a very nice way to go about it. They probably don't describe the drawbacks to their own customers either, so everyone simply opts in thinking there are none; but the end result is a diminished user experience and tonnes of extra CO2 emissions throughout the world.
I agree 100% and wish there was a better mechanism to prove you're not an attacker, but it's hard to think of one that isn't annoying like a traditional CAPTCHA is.
The bigger question that no one's asking is the cost to generate the page:
Does it take them 1 second of CPU time to generate the page?
* If not, isn't that a disproportionate amount of time for the client to spend on silly throw-away work?
* If yes, why don't they improve their infrastructure so that static pages can be properly cached, as they should be, and slightly stale versions served to everyone at a lower total cost than requiring even a few select users with "abnormal" parameters to solve the captchas?
At the end of the day, all these DDoS protections are placed in front of pages that by all accounts should be cacheable static pages, which should take less time to produce and consume than the repeated 5-second JavaScript captchas that they replace these static pages with.
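The "serve a slightly stale version" part doesn't even need special infrastructure; standard HTTP caching directives already express it. A minimal sketch of the response headers a cacheable static page could send (the specific values are made up; `stale-while-revalidate` is a standard Cache-Control extension from RFC 5861 that lets caches serve a stale copy while refetching in the background):

```python
# Illustrative headers for a cacheable static page. Any shared cache or CDN
# honoring these can absorb a traffic spike without touching the origin.
headers = {
    # Fresh for 5 minutes; serve stale for up to an hour while revalidating.
    "Cache-Control": "public, max-age=300, stale-while-revalidate=3600",
    # Hypothetical validator so revalidation can be a cheap 304, not a rebuild.
    "ETag": '"v1-example"',
}

for name, value in headers.items():
    print(f"{name}: {value}")
```

A cache hit like this costs far less per request than running a JavaScript proof-of-work challenge on every visitor's machine.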
The underlying issue is that one solution can be sold as a standalone one-size-fits-all product and the other cannot, so that's why we face daily disappointment if our browsing setup is "abnormal" in any way.