Wayback Machine Beta FAQs
http://faq.web.archive.org
Tue, 16 Aug 2011 19:46:36 +0000

How can I find all the pages from one site?
http://faq.web.archive.org/find-all-the-pages-from-one-site/
Tue, 16 Aug 2011 19:45:37 +0000

You can search for all of the pages archived from a particular site by adding an asterisk (*) to the end of the URL. For example, instead of searching for the page archive.org, you could search for all saved archive.org pages by entering archive.org* in the search field:

  • http://wayback.archive.org/web/*/archive.org*
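
As a rough sketch, the wildcard lookup above can also be assembled in code. The function name `wildcard_search_url` and its `base` parameter are illustrative assumptions, not part of any Wayback Machine API; only the URL pattern itself comes from this FAQ.

```python
def wildcard_search_url(site: str, base: str = "http://wayback.archive.org/web") -> str:
    """Build a search URL that lists every archived page under `site`.

    Hypothetical helper: it only reproduces the pattern shown in the FAQ,
    i.e. <base>/*/<site>* with a trailing asterisk as the wildcard.
    """
    # Avoid doubling the wildcard if the caller already typed "archive.org*".
    site = site.rstrip("*")
    return f"{base}/*/{site}*"


if __name__ == "__main__":
    print(wildcard_search_url("archive.org"))
```

Passing either `archive.org` or `archive.org*` yields the same search URL, so the helper is safe to call on text copied straight from the search field.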

Additionally, if you are looking for a particular type of file (PDF, JPG, etc.) that we may have saved, you can filter the search results by file extension. See the example below:

[Screenshot: filtering results]

How can I view a page without the Wayback code in it?
http://faq.web.archive.org/page-without-wayback-code/
Thu, 11 Aug 2011 18:56:29 +0000

Pages in the Wayback Machine have been rewritten to preserve the browsing experience. For example, when you look up the home page of archive.org in the Wayback Machine from 2005 and click a link there, we take you to that new page as we archived it in 2005 (or as close as we can find). If we did not rewrite the links to point to archived pages, you would be bounced out of the archive to a live, current version of the page (if it still exists). We also add code to render the navigational toolbar that you see at the top of archived pages.

If you want to view a page from the Wayback Machine that does not have all of the Wayback rewritten code in it, you can view the bare, archived page by adding “id_” to the end of the date (the run of digits after /web/) in the URL.

Page with rewritten links and other Wayback code in it:

  • http://web.archive.org/web/20051001001126/http://www.archive.org/

Page rendered exactly as it was archived:

  • http://web.archive.org/web/20051001001126id_/http://www.archive.org/
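
The rewrite from the first URL form to the second can be sketched as a small string transformation. This is a minimal illustration that assumes the URL contains a `/web/<digits>/` segment like the examples above; the helper name `bare_capture_url` is hypothetical and not something the Wayback Machine provides.

```python
import re

# Matches the "/web/<date digits>/" segment seen in the example URLs above.
_CAPTURE_SEGMENT = re.compile(r"(/web/\d{1,14})/")
# Matches a segment that already carries the "id_" suffix.
_ALREADY_BARE = re.compile(r"/web/\d{1,14}id_/")


def bare_capture_url(archived_url: str) -> str:
    """Append "id_" to the date in an archived-page URL (hypothetical helper)."""
    if _ALREADY_BARE.search(archived_url):
        return archived_url  # nothing to do, the URL already requests the bare page
    rewritten, count = _CAPTURE_SEGMENT.subn(r"\g<1>id_/", archived_url, count=1)
    if count == 0:
        raise ValueError(f"no /web/<date>/ segment found in {archived_url!r}")
    return rewritten


if __name__ == "__main__":
    print(bare_capture_url("http://web.archive.org/web/20051001001126/http://www.archive.org/"))
```

Only the first `/web/<date>/` segment is touched, so an archived address that itself embeds another archive URL is left otherwise unchanged.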

Can I get a copy of this web page?
http://faq.web.archive.org/can-i-get-a-copy-of-this-web-page/
Thu, 11 Aug 2011 18:48:54 +0000

Internet Archive’s terms of use specify that users of the Wayback Machine are not to copy data from the collection.

Many people also ask us if they can get copies of entire web sites. Our terms of use do not cover backups for the general public. However, you may use the Internet Archive Wayback Machine to save archived versions of a site to which you own the rights. We can’t guarantee that a site has been or will be archived, and Internet Archive does not provide a site packaging service. However, Old Dominion University has a tool called Warrick that may work for this purpose. Please keep in mind that this is a third-party tool and we cannot promise results.

The page I want redirects now – how can I see the old versions?
http://faq.web.archive.org/the-page-i-want-redirects-now-how-can-i-see-the-old-versions/
Sat, 22 Jan 2011 01:44:13 +0000

When you search from the front page or the toolbar of the new Wayback BETA, we take you to the latest capture of a site by default. If that site had a redirect on it when we crawled it, we’ll automatically redirect you to the new site as well (we’re replaying the behavior of the site as it was when we archived it). You can easily get around this. From the front page, enter the URL of the site you want to visit and press the “Show All” button. That will take you to a page that lists all of the captures for the URL over time. To see the site as it was, just select a date from before the site started redirecting!

How should I report issues?
http://faq.web.archive.org/how-should-i-report-issues/
Wed, 19 Jan 2011 02:49:28 +0000

Thanks for trying out the BETA test version of the Wayback Machine. Your feedback will help us iron out any remaining issues, so we encourage you to report problems by contacting us directly, or by leaving comments on this post. You may also want to check out the known issues page to see if we’re already aware of the problem.

When you report issues, please try to include the following if possible:

  • the full web address (URL) of the page you requested
  • the URL of the page you ended up at, if different
  • a description of the problem
  • any information you can think of that will help us recreate the problem, such as what you were doing leading up to the issue (e.g. “I went to this URL, pressed this button, then saw the problem”)
  • the browser you are using (Internet Explorer, Firefox, Safari, etc.), including the version number if you know it

Examples

This problem would be HARD for us to track down and resolve:
“NPR is messed up.”
We won’t know what page on NPR had a problem, or on what archived date, and we won’t know what problem to look for even if we manage to figure it out.

This problem would be EASY for us to track down:
“I entered npr.org in the search box on the front page, which took me to http://replay.waybackmachine.org/20091014164421/http://www.npr.org/.  That page looked fine.  But when I clicked on the graph in the toolbar, I ended up at http://replay.waybackmachine.org/20040701042039/http://www.npr.org// and got a page that keeps ‘blinking’ at me. I’m using Firefox version 3.6.13.”
This report gives us all of the information we need to find and reproduce the issue.

So, the more descriptive you can be, the more likely it is that we will be able to find and fix the problem.  We may contact you directly by email to get more information if necessary.  You can use the contact form on this site to let us know about issues, or feel free to post a comment here.  Thanks for taking the time to let us know about any issues you find.

Can I get just one page archived?
http://faq.web.archive.org/can-i-get-just-one-page-archived/
Wed, 19 Jan 2011 01:32:06 +0000

If you want to make sure we capture one single page of a site, you can follow the directions below. We will only capture the one page you enter, and it does not get added to any sort of list to be crawled regularly, so doing this will not ensure that your site will be captured regularly or with any completeness. (If you want your entire site archived, please read the “My site’s not archived! How can I add it?” FAQ.)

To archive a single page:

  • Go to the BETA test version of the Wayback Machine at http://web.archive.org
  • Enter the URL of the page in the search box
  • You will either go to our last indexed capture of the site (in which case the page is archived and indexed already), or you will go to the live version of the site with a toolbar at the top that tells you we’ve just archived it.

When you see this toolbar, we will have just captured a copy of this page for the archive (if robots.txt allows us to). As explained in the toolbar, you will not be able to access this crawled version through the Wayback Machine immediately – we have to add it to the index first, and we only update the index every few months. Again, we only capture that one page and we only capture it one time; the site will not be archived beyond this. If you want a full archive of your site, please read the “My site’s not archived! How can I add it?” FAQ.

What’s the difference between the classic Wayback Machine and the new BETA test version?
http://faq.web.archive.org/whats-the-difference-between-the-classic-wayback-machine-and-the-new-beta-version/
Tue, 21 Dec 2010 23:32:02 +0000

The Wayback Machine (WM) was first launched in 2001 using proprietary software written by Alexa Internet. A few years ago, the Internet Archive wrote our own version of the Wayback and made that software open source. We have been hosting smaller collections using the newer, open source software but had not attempted to transfer the entire, web-wide collection to use the new software… until now!

The BETA test software and index are designed to be faster than the classic WM.  The two other biggest differences you’re likely to notice are the new toolbar on archived pages, and the new look of the calendar of page captures.

The new toolbar will hopefully make it easier for you to know what date you’re looking at, move through time more quickly using the sparkline of captures over time, and do new searches without having to leave the page you’re looking at.

The new calendar page is just cooler.  If you have suggestions, let us know!

What is the Wayback Machine?
http://faq.web.archive.org/what-is-the-wayback-machine/
Tue, 21 Dec 2010 22:12:14 +0000

The Wayback Machine is an historical archive of preserved web pages. Type in a URL and start surfing through time!

Most societies agree that it is important to preserve artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. Internet Archive’s mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars. The Archive collaborates with institutions including the Library of Congress and the Smithsonian.

The archive of pages goes back to 1996.  The original Wayback Machine interface was released in 2001 with about 10 billion pages.

What are the known issues with the BETA test version?
http://faq.web.archive.org/what-are-the-known-issues-with-the-beta-version/
Wed, 22 Dec 2010 01:18:27 +0000

We’ve tried to make sure the BETA test version works just as well as the classic, but we’ve probably overlooked an issue or two. We would very much appreciate your feedback on the site, particularly if you run into something that looks like a bug. Take a look at the “How should I report issues?” page to learn more about making your problem report as helpful as possible.

Graph on top of calendar page shifts sometimes.
The “calendar” page that shows all captures has a graph across the top, like the one on the toolbar. We’ve seen the contents of the graph shift to the right sometimes when a year is clicked. If you see this, refreshing the page will fix the problem. (Also, clicking the years below the graph still works – that will still display the captures from that year, even if the graph does move.) This problem appears to be intermittent – sometimes it happens, sometimes it doesn’t – but we’ll try to nail this down and fix it as soon as possible.

Archived page code sometimes interferes with toolbar.
The toolbar shown at the top of the archived page capture is inserted into the original page code… and occasionally the code on that archived page tries to override our toolbar code.  We caught many of these issues during QA, but there are likely to be more out there in the wide, wild web.  If you find an example for us, please send us a link to the archived page where the toolbar is displaying strangely.  We’ll take a look to see if we can conquer that code.  Please keep in mind, some display glitches are actually an accurate reflection of the archived site, and not a “bug” – but we’ll take a look to make sure!

Occasional display glitches on the search results page.
We’ve noticed that sometimes the JavaScript on the search results page doesn’t play nicely (here’s an example). Everything on the page works, but it might not always achieve optimum prettiness. We’re working on a fix for this.

Gaps in collection or display of resources.
Usually when a particular page, image, or other resource is not available, it’s because we didn’t capture it when the site was crawled. For example, if you’re looking at a page that we captured in 1998 and an image is missing, it’s probably because we just didn’t save the image way back in 1998. We would not be able to fix that omission after the fact. These sorts of problems are not likely to be an issue with the Wayback Machine software; they are more likely an artifact of how well we crawled the site. Gaps might be due to crawl limits that were in effect at the time, changes in our ability to collect certain types of media, or other limitations or issues.

How can I have my site removed from the Wayback Machine?
http://faq.web.archive.org/how-can-i-have-my-site-removed-from-the-wayback-machine/
Wed, 22 Dec 2010 00:30:14 +0000

The Internet Archive is not interested in preserving or offering access to Web sites or other Internet documents of persons who do not want their materials in the collection. By placing a simple robots.txt file on your Web server, you can exclude your site from being crawled in the future as well as exclude any historical pages from the Wayback Machine.

Internet Archive uses the exclusion policy intended for use by both academic and non-academic digital repositories and archivists. See our exclusion policy.

Here are directions on how to automatically exclude your site. If you cannot place the robots.txt file, opt not to, or have further questions, email us at info at archive dot org.

If you are emailing to ask that your website not be archived, please note that you’ll need to include the URL (web address) in the text of your message.
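
Before relying on a robots.txt file for exclusion, it can help to check locally how a crawler would interpret it. The sketch below uses Python's standard `urllib.robotparser` for that. The user-agent token `ia_archiver` and the two-line file are assumptions made for illustration; this FAQ does not spell out the exact directives, so follow the official exclusion directions for the real values.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: "ia_archiver" stands in for the crawler's user-agent token.
# It is not stated in this FAQ; check the official exclusion directions.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_excluded(robots_txt: str, page_url: str, agent: str = "ia_archiver") -> bool:
    """Return True if `robots_txt` forbids `agent` from fetching `page_url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, page_url)


if __name__ == "__main__":
    url = "http://www.example.com/index.html"
    print("excluded for ia_archiver:", is_excluded(EXAMPLE_ROBOTS_TXT, url))
    print("excluded for another bot:", is_excluded(EXAMPLE_ROBOTS_TXT, url, agent="SomeOtherBot"))
```

This only tests how the rules parse; it says nothing about when a crawler will next read the file or how quickly the exclusion takes effect.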
