I expected one of my sites to be on archive.org. It was there originally, but it is not now. Looking at the robots.txt file, it currently reads:

User-agent: *
Disallow: /cgi-bin/
Disallow: /cgi/

The December 2012 copy from archive.org reads:

# Default /robots.txt File for all Community Architect Partner pages
User-agent: *
Disallow: /cgi-bin/

I looked at the wiki on this, but it does not explain what is or is not covered by the "Disallow: /cgi/" statement.
I assume the change was made because it is an ad-supported site and the host gets no revenue from such caches; remote linking is not allowed, presumably for the same reason.
1/ If I have access to that directory (I doubt I do), would changing robots.txt back to the earlier version break anything? The file has not appeared in any directory listing I have made while connected with upload/download access.
2/ On another site, which I reach via www access, there is no robots.txt. Does that mean it should turn up on archive.org (provided archive.org is aware of the site and page)?
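
For anyone who wants to check how the two rule sets read to a standard parser, here is a minimal sketch using Python's built-in urllib.robotparser. It only shows how that parser interprets the rules; it is not how archive.org itself decides what to keep. The host example.com, the sample paths, and the agent name "ia_archiver" are placeholders I made up for illustration (any agent name falls under "User-agent: *").

```python
from urllib.robotparser import RobotFileParser

# The two rule sets quoted in the post, plus an empty file to stand in
# for a site that has no robots.txt rules at all.
CURRENT_RULES = """User-agent: *
Disallow: /cgi-bin/
Disallow: /cgi/
"""

OLD_RULES = """User-agent: *
Disallow: /cgi-bin/
"""

NO_RULES = ""


def allowed(robots_txt, path, agent="ia_archiver"):
    """Return True if `agent` may fetch `path` under the given robots.txt text.

    The agent name and the example.com host are illustrative assumptions.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    # Hypothetical paths: an ordinary page and one script under each directory.
    for path in ("/index.html", "/cgi/form.pl", "/cgi-bin/form.pl"):
        print(
            path,
            "current:", allowed(CURRENT_RULES, path),
            "old:", allowed(OLD_RULES, path),
            "no rules:", allowed(NO_RULES, path),
        )
```

Under this parser, "Disallow: /cgi/" only affects paths that begin with /cgi/, so ordinary pages stay fetchable under both versions of the file, and an empty rule set allows everything. If that reading is right, the extra line on its own would not seem to explain the whole site disappearing.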