Comments on: How can I have my site removed from the Wayback Machine? http://faq.web.archive.org/how-can-i-have-my-site-removed-from-the-wayback-machine/ Thu, 08 Dec 2011 17:37:11 +0000

By: Ichabod Mudd http://faq.web.archive.org/how-can-i-have-my-site-removed-from-the-wayback-machine/#comment-822 Thu, 08 Dec 2011 17:37:11 +0000

Brilliant answer: on-the-fly checks for robots.txt.
Nice design too.
I tested it, and it works.

By: wayback http://faq.web.archive.org/how-can-i-have-my-site-removed-from-the-wayback-machine/#comment-746 Wed, 30 Nov 2011 19:29:37 +0000

Hi dnm,
I think perhaps you’ve misunderstood how the robots.txt block works. When someone tries to view a site through the Wayback Machine, *before* we display the archived site to them, we first go to the live website and check its live robots.txt file to see whether it tells us to block showing the site. The “answer” from your live site may be cached for up to 24 hours, so changing your robots.txt file isn’t instantaneous, but it should take effect for blocking content from the Wayback Machine within about 24 hours.
Thanks,
Alexis (IA)

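As a rough illustration of the behaviour Alexis describes (check the live robots.txt at view time, then reuse that answer for up to 24 hours), here is a minimal Python sketch. It is not the Wayback Machine's actual code: the `RobotsCache` class, the `fetch` callable, and the `ia_archiver` user-agent string are all assumptions made for this example.

```python
import time
from urllib.robotparser import RobotFileParser

# "Up to 24 hours", per the reply above.
CACHE_TTL_SECONDS = 24 * 60 * 60


class RobotsCache:
    """Hypothetical cache of per-host robots.txt answers (illustration only)."""

    def __init__(self, fetch, user_agent="ia_archiver",
                 ttl=CACHE_TTL_SECONDS, clock=time.time):
        self._fetch = fetch            # callable: host -> robots.txt text
        self._user_agent = user_agent  # assumed crawler name, not from the thread
        self._ttl = ttl
        self._clock = clock
        self._cache = {}               # host -> (checked_at, parser)

    def is_blocked(self, host, path="/"):
        """Return True if the live robots.txt says not to show this path."""
        now = self._clock()
        entry = self._cache.get(host)
        # Re-read the live robots.txt only when no answer is cached
        # or the cached answer is at least `ttl` seconds old.
        if entry is None or now - entry[0] >= self._ttl:
            parser = RobotFileParser()
            parser.parse(self._fetch(host).splitlines())
            entry = (now, parser)
            self._cache[host] = entry
        return not entry[1].can_fetch(self._user_agent, path)
```

Under these assumptions, a robots.txt change made just after a check would not be noticed until the cached answer expires, which matches the "within about 24 hours" delay described above.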
By: dnm http://faq.web.archive.org/how-can-i-have-my-site-removed-from-the-wayback-machine/#comment-742 Wed, 30 Nov 2011 06:29:52 +0000

Lawrence’s point still stands: the robots.txt file only helps the NEXT TIME you bother to crawl the site. For some sites, that can be months. What if we want our sites removed *right now*? That should be an automated process we can perform, not something we have to e-mail about or wait until the next crawl for.

Even a ‘check my site now’ button would help, so that the robots.txt file can be picked up for new sites and wipe out old content.

By: wayback http://faq.web.archive.org/how-can-i-have-my-site-removed-from-the-wayback-machine/#comment-343 Thu, 07 Jul 2011 18:58:22 +0000

Hi Lawrence,

Placing a robots.txt file on your site does exclude historically collected pages from the Wayback Machine. From above:

“By placing a simple robots.txt file on your Web server, you can exclude your site from being crawled in the future *as well as exclude any historical pages* from the Wayback Machine.”

If you are unable to do this, please email info@archive.org.

Thanks,
Alexis

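For readers wondering what such a "simple robots.txt file" might look like, the sketch below embeds one in Python and checks it with the standard library's `urllib.robotparser`. The thread does not quote the rule itself, so the `ia_archiver` user-agent name is an assumption here, and `blocks_agent` is a hypothetical helper written only for this example.

```python
from urllib.robotparser import RobotFileParser

# Assumed exclusion rule; "ia_archiver" is not named anywhere in this thread.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def blocks_agent(robots_txt, user_agent, path="/"):
    """Return True if robots_txt disallows `path` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)
```

Calling `blocks_agent(ROBOTS_TXT, "ia_archiver")` is a quick way to confirm that a rule applies to the crawler you intend before putting the file on a live server. Other crawlers are unaffected by this particular rule.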
By: Lawrence http://faq.web.archive.org/how-can-i-have-my-site-removed-from-the-wayback-machine/#comment-339 Thu, 07 Jul 2011 12:50:04 +0000

Dear Wayback Team,

The title of this article is “How can I have my site removed from the Wayback Machine?” but it provides only instructions on excluding your pages *from now on*.

What about *pages we published in the past*, which we do not want archived? Even if there’s nothing embarrassing or hazardous about them at all, we may want them completely e-shredded.

I suspect this wish is what brings most visitors to this page. Please set up a page with clear instructions on doing just that.

By: wayback http://faq.web.archive.org/how-can-i-have-my-site-removed-from-the-wayback-machine/#comment-8 Wed, 19 Jan 2011 01:11:07 +0000

Hi dhs,

That appears to be what we actually archived from that domain when it was crawled. The current, live site at http://sarahpalin.com/ just reads, “This page intentionally left blank,” so it doesn’t seem likely there was ever much real content there.

Thanks,
Wayback team

By: dhs http://faq.web.archive.org/how-can-i-have-my-site-removed-from-the-wayback-machine/#comment-5 Thu, 13 Jan 2011 03:43:03 +0000

Forbidden

You don’t have permission to access / on this server.

Above is the error I get when trying to access ANY page of sarahpalin.com.
Can you fix?

Thanks
