Poster: Bakeneko
Date: October 09, 2011 01:40:46am
Forum: web
Subject: Re: We were unable to get the robots.txt document to display this page
I've been having the same problem for 2+ weeks on the archive page for this site:
http://beevee.topcities.com
I know this is a well-archived site because I've visited it fairly regularly for years, ever since it fell off the web.
On top of the "Cannot display Robots Text" error, there's a smaller line stating that it was a server timeout - though it takes less than a second for the error page to come up.
I've read that server timeouts are usually fixed within two weeks. It's been a little over two weeks, though, and no dice.
I know the servers are undergoing maintenance this weekend, so I had hoped to stay patient and see the page come back up when it's over. But after reading this thread, and a few others in the forums complaining of the same thing, I am worried it won't.
What's causing this, and how do we work around it? I notice the link you're using is different. Was there a migration that went bad?
FWIW, I tried the archive part of the links you're posting for my link, and it still doesn't work. =(
http://wayback.archive.org/web/*/http://beevee.topcities.com
or
http://web.archive.org/web/20050206033713/http://beevee.topcities.com
Any help is greatly appreciated.
-edit-
I'm now reading that when a defunct site's domain passes to a new owner or reseller, they can change its robots.txt, and that blocks access to the whole archived history of that domain. Wow, that seems like a rather glaring oversight.
Regardless, I went and checked the robots.txt archive for both topcities and geocities, and the most recent crawl of each is not blocking archive.
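For anyone who wants to repeat that check, here is a minimal Python sketch, assuming you have already copied the text of a robots.txt capture by hand. The function name `blocks_archive` and the sample rules are hypothetical, and the `ia_archiver` user agent is my assumption about how the Archive's crawler identifies itself; the parser itself is the one in Python's standard library. It only tests whether the rules disallow the root path, so it says nothing about the timeout error.

```python
from urllib.robotparser import RobotFileParser


def blocks_archive(robots_txt: str, user_agent: str = "ia_archiver") -> bool:
    """Return True if these robots.txt rules disallow the root path for user_agent.

    "ia_archiver" is an assumed user agent name for the Archive's crawler.
    """
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access happens here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


# Hypothetical sample rules, not the real topcities or geocities files.
permissive_rules = "User-agent: *\nDisallow:\n"
blocking_rules = "User-agent: ia_archiver\nDisallow: /\n"

print(blocks_archive(permissive_rules))  # False
print(blocks_archive(blocking_rules))    # True
```

If the captured file comes back as not blocking, as it did for me, then the rules themselves are probably not the cause and the failed live lookup is the more likely suspect.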
I can't see the server for geocities archives because of maintenance, but topcities hasn't been crawled since July, and there was a HUGE spike of tracking in July too, compared to the whole previous history back to 2001.
So, the most recent record of robots.txt is not blocking the Archive, and the Archive can see that one and all earlier robots.txt captures. But try to look up a page under those domains and you get the error that the server timed out and robots.txt could not be found.
Geocities has a parking space online on Yahoo's pages.
Topcities server can't be found.
Yikes, sounds like this is the problem for the one site I really want. I'm going to be very sad if that's the case and the Archive can't show it because there is no robots.txt anymore.
This post was modified by Bakeneko on 2011-10-09 08:40:46