Uh, no. We're literally doing the opposite. We used to have our own caching infr...

Nemo_bis · on Sept 30, 2020

Thanks, so maybe this page is outdated where it mentions your own crawler with user-agent? Or does the Internet Archive use it for these crawls? https://www.cloudflare.com/always-online/

jorams · on Sept 17, 2020

How do you handle robots.txt? The previous incarnation of Always Online didn't care about robots.txt, while archive.org does.

jgrahamc · on Sept 17, 2020

https://blog.cloudflare.com/cloudflares-always-online-and-th...

We tell archive.org about the URI, they crawl it. They handle robots.txt.

AnonHP · on Sept 18, 2020

archive.org doesn't handle robots.txt in any meaningful way (see my comment above at https://news.ycombinator.com/item?id=24516875 ). If that's changed recently, I'd like to know more.

AnonHP · on Sept 18, 2020

Note that archive.org stopped respecting robots.txt in 2017. [1]

In my experience, the site owner must email archive.org support to be excluded from its crawler and archiving.

[1]: https://boingboing.net/2017/04/22/internet-archive-to-ignore...

throwaway56909 · on Sept 17, 2020

And thank god for it. Trying to explain to end users why their site was not, in fact, always online, on account of the creaking behemoth that plodded along in IAD barely managing to successfully cache and serve anything ever, was never any fun.

The original Always Online infra was long unloved and probably kept on life support far too long out of reluctance to deprecate an early feature.

KingOfCoders · on Sept 17, 2020

"We're literally doing the opposite."

How does what you do now contradict what you will do in the future? What legal assurances are there that you won't do that when you leave? (See Facebook/Oculus "no Facebook account promise")

cortesoft · on Sept 17, 2020

Wait... so you think Cloudflare's master plan is to roll this new thing out to get people to accept it as normal, and then suddenly make a big shift to.... what they currently have?

Why don't they skip this step and just keep what they have now, then? No one seems to be up in arms that they currently provide their customers offline caching...