blocking bad bots from .htaccess

 
lavinya
PR1
Joined: 24 Jul 2006
Posts: 159
Location: Turkey

Posted: Thu Sep 14, 2006 6:41 pm    Post subject: blocking bad bots from .htaccess

Hello all.

Is this code right or wrong?

Code:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} *FrontPage* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *httrack* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *Teleport* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *webzip* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *WebStripper* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *NetMechanic* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *CherryPicker* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *EmailCollector* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *EmailSiphon* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *WebBandit* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *EmailWolf* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *ExtractorPro* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *SiteSnagger* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *Cheese* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *Quester* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *WebZip* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *moget* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *WebSauger* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *WebCopier* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *WWW-Collector* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *InfoNavi* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *Harvest* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *Bullseye* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *LinkWalker* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *LinkextractorPro* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *Proxy* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *BlowFish* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *WebEnhancer* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *TightTwatBot* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *LinkScan* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *WebDownloader* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *BruteForce* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} lwp [NC,OR]
RewriteCond %{HTTP_USER_AGENT} lwp-* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} *anonym* [NC,OR]
RewriteRule !^403.html$ - [F,L]


I would be happy if you could reply. Thanks.
dcz
Administrateur - Site Admin
Joined: 28 Apr 2006
Posts: 13354

Posted: Thu Sep 14, 2006 9:33 pm    Post subject: Re: blocking bad bots from .htaccess

Well, apart from the wildcards (*) not being very useful (if you don't set where the pattern begins and ends, it is searched for anywhere in the UA string), it's correct for banning all those User Agents.

But it's just one level above robots.txt, as really bad bots do not use a static User Agent, so that they can get through such walls.

This is not for next week, but I am working on a solution which will allow us to fight back against bad bots and known exploits.
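
To make the point about the wildcards concrete, here is a minimal sketch of how the first rule set could be written without them. It is an illustration only, not dcz's own code, and it keeps just a few of the names from the original post. Two things differ from the code as posted, and both are assumptions about how mod_rewrite treats the patterns: a bare name is already searched for anywhere in the User-Agent string, so the surrounding * can be dropped (a leading * is not a valid regular expression in many Apache builds), and the last RewriteCond does not carry [OR], since no further condition follows it.

```apache
RewriteEngine on
# A bare name is a substring match: it is found anywhere in the UA string.
# [NC] ignores case, [OR] chains this condition to the next one.
RewriteCond %{HTTP_USER_AGENT} FrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} httrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Teleport [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebZip [NC,OR]
# "lwp" also matches "lwp-trivial" and similar, so one line covers both.
RewriteCond %{HTTP_USER_AGENT} lwp [NC,OR]
# Last condition in the chain: no [OR] here.
RewriteCond %{HTTP_USER_AGENT} anonym [NC]
# Answer 403 Forbidden for every request except the 403 page itself.
RewriteRule !^403\.html$ - [F]
```

The remaining names from the original list would go in the same way, one RewriteCond per name, each with [NC,OR] except the last.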

++

_________________
Useful links :
SEO Forum || SEO Directory || SEO phpBB || SEO phpBB3 || Search
lavinya
PR1
Joined: 24 Jul 2006
Posts: 159
Location: Turkey

Posted: Fri Sep 15, 2006 8:35 am    Post subject: Re: blocking bad bots from .htaccess

Thanks, dcz. OK.
lavinya
PR1
Joined: 24 Jul 2006
Posts: 159
Location: Turkey

Posted: Fri Sep 15, 2006 9:40 am    Post subject: Re: blocking bad bots from .htaccess

Here is a new rule set. It redirects with a 302 to robotstxt.org. But there is a problem, e.g.

RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR]

does not block all versions of HTTrack. It only blocks old versions, or a User Agent that is just "httrack".


Code:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
RewriteCond %{HTTP_USER_AGENT} ^asterias [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackDoorBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Black.Hole [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlowFish [OR]
RewriteCond %{HTTP_USER_AGENT} ^BotALot [OR]
RewriteCond %{HTTP_USER_AGENT} ^BuiltBotTough [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bullseye [OR]
RewriteCond %{HTTP_USER_AGENT} ^BunnySlippers [OR]
RewriteCond %{HTTP_USER_AGENT} ^Cegbfeieh [OR]
RewriteCond %{HTTP_USER_AGENT} ^CheeseBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^CopyRightCheck [OR]
RewriteCond %{HTTP_USER_AGENT} ^cosmos [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DittoSpyder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^EroCrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Foobot [OR]
RewriteCond %{HTTP_USER_AGENT} ^FrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^Harvest [OR]
RewriteCond %{HTTP_USER_AGENT} ^hloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^httplib [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^humanlinks [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InfoNaviRobot [OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JennyBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Kenjin.Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Keyword.Density [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^libWeb/clsHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkextractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkScan/8.1a.Unix [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^lwp-trivial [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mata.Hari [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIIxpc [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister.PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^moget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/2 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3.Mozilla/2.01 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^NPBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline.Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^ProPowerBot/2.14 [OR]
RewriteCond %{HTTP_USER_AGENT} ^ProWebWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^QueryN.Metasearch [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^RepoMonkey [OR]
RewriteCond %{HTTP_USER_AGENT} ^RMA [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpankBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^spanner [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^suzuran [OR]
RewriteCond %{HTTP_USER_AGENT} ^Szukacz/1.4 [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Telesoft [OR]
RewriteCond %{HTTP_USER_AGENT} ^The.Intraformant [OR]
RewriteCond %{HTTP_USER_AGENT} ^TheNomad [OR]
RewriteCond %{HTTP_USER_AGENT} ^TightTwatBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Titan [OR]
RewriteCond %{HTTP_USER_AGENT} ^toCrawl/UrlDispatcher [OR]
RewriteCond %{HTTP_USER_AGENT} ^True_Robot [OR]
RewriteCond %{HTTP_USER_AGENT} ^turingos [OR]
RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot/1.5 [OR]
RewriteCond %{HTTP_USER_AGENT} ^URLy.Warning [OR]
RewriteCond %{HTTP_USER_AGENT} ^VCI [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebBandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEnhancer [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web.Image.Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebmasterWorldForumBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website.Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster.Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZip [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWW-Collector-E [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu's [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^(.*)$ http://www.robotstxt.org/ [R=302,L]
dcz
Administrateur - Site Admin
Joined: 28 Apr 2006
Posts: 13354

Posted: Fri Sep 15, 2006 6:28 pm    Post subject: Re: blocking bad bots from .htaccess

Actually, only
Code:

RewriteCond %{HTTP_USER_AGENT} WWWOFFLE [OR]


is needed. There is no need to use the ^ anchor, as the test string could be anywhere in the UA string.
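
As an illustration of the point about the ^ anchor, the sketch below contrasts the anchored and unanchored forms for HTTrack, the case raised in the previous post. The sample User-Agent string in the comments is an assumption chosen for the example, not a value taken from a log in this thread.

```apache
RewriteEngine on

# Sample UA (assumed for illustration):
#   Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)

# Anchored form: matches only when the UA string starts with "HTTrack",
# so the sample above would not be caught.
# RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR]

# Unanchored form: matches "HTTrack" anywhere in the UA string,
# so the sample above would be caught.
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
# Last condition in the chain, so it carries no [OR].
RewriteCond %{HTTP_USER_AGENT} WWWOFFLE
RewriteRule ^(.*)$ http://www.robotstxt.org/ [R=302,L]
```

The trade-off is that an unanchored name also matches any other User Agent that happens to contain the same text, so short names are best checked against real traffic before being added.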

++

lavinya
PR1
Joined: 24 Jul 2006
Posts: 159
Location: Turkey

Posted: Fri Sep 15, 2006 7:28 pm    Post subject: Re: blocking bad bots from .htaccess

Hello. Thanks, dcz.

dcz, I posted this code but I don't know whether it is correct. Can you give me an example? One line is enough. Thanks.
dcz
Administrateur - Site Admin
Joined: 28 Apr 2006
Posts: 13354

Posted: Fri Sep 15, 2006 8:10 pm    Post subject: Re: blocking bad bots from .htaccess

oki doki ;)

linus
PR0
Joined: 02 Jul 2006
Posts: 54

Posted: Mon Dec 04, 2006 6:26 pm    Post subject: Re: blocking bad bots from .htaccess

GOOD :o ;)

_________________
...
arch stanton
PR1
Joined: 04 Oct 2006
Posts: 113

Posted: Thu Feb 01, 2007 12:57 pm    Post subject: Re: blocking bad bots from .htaccess

Does it matter where in the .htaccess file you put this script?

At the moment, I have an anti-hotlink script, a mod rewrite script and Google sitemaps rewrite script.

Is it better to put the bad bots blocker above or below these, or does it make no difference?

Also, I would suggest adding ConveraCrawler to the list...
dcz
Administrateur - Site Admin
Joined: 28 Apr 2006
Posts: 13354

Posted: Thu Feb 01, 2007 11:50 pm    Post subject: Re: blocking bad bots from .htaccess

You should put this kind of code before any RewriteRule. There is no need to do more work when we are going to deny access anyway, and an earlier RewriteRule with the [L] flag would cut mod_rewrite processing short when it matches, and with it the deny.
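
A rough sketch of that ordering, assuming a .htaccess that also holds other rewrite rules like the ones mentioned in the question. The rule in the second half (forum-N.html mapped to viewforum.php) is a placeholder invented for the example, not a rule from this thread.

```apache
RewriteEngine on

# 1. Bad bot block first: a denied request gets its 403 here
#    and none of the rules below are evaluated for it.
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebZip [NC]
RewriteRule .* - [F]

# 2. Other rules afterwards (hypothetical URL rewrite for illustration).
#    If this rule came first, a match with [L] would stop the current
#    pass before the block above was reached.
RewriteRule ^forum-([0-9]+)\.html$ viewforum.php?f=$1 [QSA,L]
```

Anti-hotlink and sitemap rewrite rules would follow in the same way, below the bot block.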

++
