archive.is has both things from Patri’s LiveJournal:
“Obama is a Muslim :). Seriously!”
“Yer buttons, they got pressed!”
(Unlike archive.org, archive.is does not, IIRC, respect robots.txt.)
Gwern Branwen has a page on link rot and URL archiving.
Why does archive.is not obey robots.txt?
Because it is not a free-walking crawler, it saves only one page acting as a direct agent of the human user.
--archive.is faq
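
For anyone unsure what "obeying robots.txt" involves on the crawler side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the user-agent name `ExampleCrawler`, and the URL are all made up for illustration; this is not how archive.is or archive.org are actually implemented, only what a generic well-behaved crawler would check before fetching a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that bars every crawler from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots_txt may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical crawler name and URL, for illustration only.
allowed = crawler_may_fetch(
    ROBOTS_TXT, "ExampleCrawler", "https://example.com/some-post.html"
)
print("crawler may fetch:", allowed)
```

A tool that fetches a single page at a user's request can simply skip this check, which is the distinction the FAQ answer is drawing.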
A few months ago we stopped referring to robots.txt files on U.S. government and military web sites [...] As we have moved towards broader access it has not caused problems, which we take as a good sign. We are now looking to do this more broadly.
--archive.org blog, 2017-04-17