OT: robots.txt query

- N_Cook
  
posted
7 years ago

Fri, Sep 2, 2016 11:33 AM

I expected one of my sites to be on archive.org. Originally it was, but not now. Looking at the robots.txt file (currently):

User-agent: *
Disallow: /cgi-bin/
Disallow: /cgi/

(Dec 2012, from archive.org)

# Default /robots.txt File for all Community Architect Partner pages
User-agent: *
Disallow: /cgi-bin/

I looked at the wiki on this, but it does not explain what is or is not covered by the "/cgi/" statement.
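For anyone wanting to check what the extra line changes, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The two rule sets are the ones quoted above; the user-agent name `ExampleBot`, the helper `allowed`, and the test paths are made up for illustration. It only shows how this one parser reads the rules (a `Disallow` line is a path-prefix match), so a real crawler may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Rule sets as quoted in this post.
CURRENT_RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /cgi/
"""

OLD_RULES = """\
User-agent: *
Disallow: /cgi-bin/
"""


def allowed(rules: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under the given rules.

    `ExampleBot` is a hypothetical crawler name; with "User-agent: *"
    any name is treated the same way.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise each rule.
    for path in ("/index.html", "/cgi/form.pl", "/cgi-bin/form.pl"):
        print(path,
              "old:", allowed(OLD_RULES, path),
              "current:", allowed(CURRENT_RULES, path))
```

Under this reading, the added line only shuts crawlers out of paths beginning with /cgi/; ordinary pages outside those two directories are still permitted by both versions of the file.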

I assume this is because it's a site paid for by ads and they get no revenue from such caches; no remote linking is allowed, presumably for the same reason.

1/ If I have access to that directory, which I doubt, would changing it back to what it was before corrupt operations? robots.txt has not appeared in any directory listing I've made while on upload/download access to it.

2/ On another site, accessed via www, there is no robots.txt. Does that mean it should turn up on archive.org (if archive.org is aware of the site and page, that is)?
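As a rough illustration of the "no robots.txt" case in question 2, the sketch below feeds Python's `urllib.robotparser` an empty rule set as a stand-in for a missing file. The helper name `allowed_without_rules` and the agent `ExampleBot` are hypothetical. It shows only that nothing is disallowed in that situation; whether a given archive actually crawls the site is a separate matter that this cannot answer.

```python
from urllib.robotparser import RobotFileParser


def allowed_without_rules(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` when there are no rules.

    An empty list of lines stands in for a site that serves no
    robots.txt at all (an assumption made for this sketch).
    """
    parser = RobotFileParser()
    parser.parse([])  # no User-agent or Disallow lines
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, used only for demonstration.
    for path in ("/index.html", "/cgi/form.pl"):
        print(path, allowed_without_rules(path))
```

The same parser's `read()` method also treats most 4xx responses for robots.txt (other than 401 and 403) as "allow everything", which matches the usual convention that a missing file means no restrictions.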

