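My small site was crawled so heavily by Yandex that it went down. I added a robots.txt, but it takes effect too slowly, so I found the following .htaccess blocking code online: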
RewriteCond %{HTTP_USER_AGENT} "Bingbot|MSNbot|Webdup|AcoonBot|AhrefsBot|Ezooms|EdisterBot|EC2LinkFinder|jikespider|Purebot|MJ12bot|WangIDSpider|WBSearchBot|Wotbox|xbfMozilla|Yottaa|YandexBot|Jorgee|SWEBot|spbot|TurnitinBot-Agent|mail.RU|curl|perl|Python|Wget|Xenu|ZmEu" [NC]
RewriteRule !(^robots\.txt$) http://en.wikipedia.org/wiki/Robots_exclusion_standard [R=403,L]
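I don't quite understand this line:
RewriteRule !(^robots\.txt$) http://en.wikipedia.org/wiki/Robots_exclusion_standard [R=403,L]
Does it mean that these crawlers are only allowed to access robots.txt? And what is the URL at the end for? If anyone understands this, please point me in the right direction.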
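
One possible reading of that rule, sketched in Python as a rough model rather than a definitive answer. As far as I know, when the status code given to R is outside the 300-399 range, mod_rewrite drops the substitution URL, so the Wikipedia link would never actually be sent, and the listed bots would simply get 403 on everything except robots.txt. The function name status_for, the shortened agent list, and the 200 returned for unaffected requests are assumptions made for illustration only.

```python
import re

# A shortened subset of the user-agent fragments from the RewriteCond above.
# re.IGNORECASE plays the role of the [NC] flag.
BLOCKED_AGENTS = re.compile(r"YandexBot|AhrefsBot|MJ12bot|curl|Wget", re.IGNORECASE)

# Same pattern as in the RewriteRule; the leading "!" in the rule negates it.
ROBOTS_PATTERN = re.compile(r"^robots\.txt$")


def status_for(path: str, user_agent: str) -> int:
    """Predict the HTTP status for a request (hypothetical helper).

    `path` is the path as .htaccess sees it, without the leading slash.
    """
    is_listed_bot = BLOCKED_AGENTS.search(user_agent) is not None
    is_robots_txt = ROBOTS_PATTERN.search(path) is not None
    if is_listed_bot and not is_robots_txt:
        # R=403 is not a redirect status, so the target URL is ignored
        # and the request is refused.
        return 403
    # The rule does not apply; assume normal handling succeeds.
    return 200
```

Under this reading, a listed crawler can still fetch robots.txt (so it can see it is unwelcome) but is refused everywhere else, while ordinary browsers are unaffected.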