
Uh, no. We're literally doing the opposite. We used to have our own caching infrastructure for "Always Online" and we're getting rid of it and using archive.org instead.


Thanks, so maybe this page is outdated where it mentions your own crawler with user-agent? Or does the Internet Archive use it for these crawls? https://www.cloudflare.com/always-online/


How do you handle robots.txt? The previous incarnation of Always Online didn't care about robots.txt, while archive.org does.


https://blog.cloudflare.com/cloudflares-always-online-and-th...

We tell archive.org about the URI, they crawl it. They handle robots.txt.


archive.org doesn't handle robots.txt in any meaningful way (see my comment above at https://news.ycombinator.com/item?id=24516875 ). If that's changed recently, I'd like to know more.


Note that archive.org has not respected robots.txt since 2017. [1]

In my experience, the site owner must email archive.org support to be excluded from its crawler and archiving.

[1]: https://boingboing.net/2017/04/22/internet-archive-to-ignore...


And thank god for it. Trying to explain to end users why their site was not, in fact, always online on account of the creaking behemoth that plodded along in IAD barely managing to successfully cache and serve anything ever was never any fun.

The original Always Online infra was long unloved and probably kept on life support far too long for lack of want to deprecate an early feature.


"We're literally doing the opposite."

How does what you do now contradict what you will do in the future? What legal assurances are there that you won't do that when you leave? (See the Facebook/Oculus "no Facebook account" promise.)


Wait... so you think Cloudflare's master plan is to roll this new thing out to get people to accept it as normal, and then suddenly make a big shift to.... what they currently have?

Why don't they skip this step and just keep what they have now, then? No one seems to be up in arms that they currently provide their customers offline caching...



