One .htaccess File to Rule Them All
Monday, 16 April 2007
For those of you who don't know what .htaccess is, it is a per-directory configuration file read by the Apache web server; its directives apply to the directory it sits in and everything below it. In my case, I use it to block a large volume of spammers, scrapers, bad and rude robots, and other Internet evils. Before you start, make sure that your server runs Apache and that your host allows you to use .htaccess files. Beyond that, just be careful: a single typo or unsupported directive can easily get you a "500 Internal Server Error". Note: at the bottom of this article is a link to a copy-and-pasteable example that goes a long way toward securing your site.

That said, let's get on to the good stuff! To block an IP address with your .htaccess file, simply add the following:

Example: deny from 000.000.000.000

You can even block using partial IP addresses, if you so desire:
 
Example: deny from 000.000.000

The above methods are good for blocking bots, domains, and unruly users. I know that many people use them to block governments, as well as the RIAA and MPAA from their sites.
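
As a fuller sketch of the deny lines above: in Apache 2.2 they normally sit next to an Order line that tells the server how to combine allow and deny rules. The addresses below are documentation placeholders that I am assuming for illustration, not real offenders.

```apache
# Apache 2.2-style access control (mod_authz_host).
# Allow everyone first, then subtract the addresses listed below.
Order Allow,Deny
Allow from all

# Single address (placeholder)
Deny from 192.0.2.10

# Partial address: every host from 198.51.100.0 to 198.51.100.255
Deny from 198.51.100

# The same kind of range written in CIDR notation
Deny from 203.0.113.0/24
```

With Order Allow,Deny, a visitor who matches both an Allow and a Deny line is denied. Apache 2.4 replaced these directives with Require (the old ones only keep working there through mod_access_compat), so check which version your host runs before pasting.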

Another use for .htaccess is blocking common exploits with mod_rewrite. Common as they are, these exploits can still be used to take over a site that isn't secured.

Example (# denotes a comment):
# Block out any script trying to modify a _REQUEST variable via URL
RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2})
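
A RewriteCond does nothing by itself; it only guards the RewriteRule that follows it. Here is a minimal sketch of the complete block, assuming you simply want to refuse the request. The [F] rule is my choice for illustration; the rule at the end of this article sends matches to a page instead.

```apache
# Switch the rewrite engine on once, before any conditions or rules
RewriteEngine On

# Block out any script trying to modify a _REQUEST variable via URL
RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2})

# "-" leaves the URL untouched; [F] answers with 403 Forbidden
RewriteRule ^ - [F]
```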

Some rules look complicated, but they do their job in the end. Character classes such as [0-9] and [A-Za-z] simply match any single digit from 0 to 9, or any single letter in upper or lower case. In the example below, several of these are applied to the remote address to check whether a certain bad bot is attempting to crawl; if it is, the request gets blocked.

Example:
# Cyveillance spybot
RewriteCond %{REMOTE_ADDR} "^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$" [OR]
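
To see what that pattern actually covers, it helps to expand the last octet by hand. The sketch below does so in comments and pairs the condition with a rule; the [F] rule and the CIDR alternative are my additions, not part of the original blocklist.

```apache
RewriteEngine On

# Last octet: 2(2[4-9]|[3-4][0-9]|5[0-5]) expands to
#   22[4-9]      -> 224 to 229
#   2[3-4][0-9]  -> 230 to 249
#   25[0-5]      -> 250 to 255
# so the condition matches 63.148.99.224 through 63.148.99.255.
RewriteCond %{REMOTE_ADDR} "^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$"
RewriteRule ^ - [F]

# Alternative without mod_rewrite (Apache 2.2 syntax): those 32 addresses
# are exactly the CIDR block 63.148.99.224/27.
# Deny from 63.148.99.224/27
```

The dots are escaped as \. because an unescaped dot in a regular expression matches any character, not just a literal dot.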

These methods are useful for blocking certain user-agents too!

Example:
# Turnitin spybot
RewriteCond %{HTTP_USER_AGENT} TurnitinBot [OR]
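
Here is a small sketch of how several user-agent conditions chain together. "ExampleBadBot" is a made-up name for illustration, and the [NC] flag and [F] rule are assumptions on my part.

```apache
RewriteEngine On

# [NC] ignores case; [OR] means "or the next condition"
RewriteCond %{HTTP_USER_AGENT} TurnitinBot [NC,OR]

# Hypothetical second bot. The last condition in the chain carries no [OR].
RewriteCond %{HTTP_USER_AGENT} ExampleBadBot [NC]

# Refuse any request that matched one of the conditions above
RewriteRule ^ - [F]
```

Without [OR], consecutive conditions are combined with AND, so a list of bots without the flag would only match a visitor that is all of them at once.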


You can even block single words coming from referers, such as spammy domains.

Example:
RewriteCond %{HTTP_REFERER} viagra [NC,OR]
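
A bare word like this matches anywhere in the referer, including the path or query string of an innocent page. As a hedged sketch, this is how a substring match and a host-only match could sit side by side; "spam-example.invalid" is a placeholder domain I made up.

```apache
RewriteEngine On

# Substring match: fires on any referer containing the word, in any case
RewriteCond %{HTTP_REFERER} viagra [NC,OR]

# Host-only match: the placeholder domain and any of its subdomains
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?spam-example\.invalid([/:]|$) [NC]

RewriteRule ^ - [F]
```

Keep in mind that the referer header is supplied by the client and is trivial to fake or omit, so this filters lazy spammers rather than determined ones.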

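Finally, for the last nail in these baddies' coffins, put this rule at the end of your blocklist and create a noindex.html page for it to point at. Every request that matched one of your conditions gets that page instead of your content, which will really tick them off. Make sure the last RewriteCond before the rule has no [OR] flag, or the rule can end up matching every request. If you really want to hurt them, you could serve an endless PHP loop instead to confuse whoever or whatever bot it may be, but remember that PHP runs on your own server, so the loop costs you resources too.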

Example:
RewriteRule ^.*$ /noindex.html [L]
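
Putting the pieces together, a cut-down blocklist might look like the sketch below. It is a minimal illustration, not the full file linked in this article, and the first condition is a guard I am adding on my own assumption.

```apache
RewriteEngine On

# Guard: never rewrite the landing page itself
RewriteCond %{REQUEST_URI} !^/noindex\.html$

# The blocklist: any one of these is enough
RewriteCond %{HTTP_USER_AGENT} TurnitinBot [NC,OR]
RewriteCond %{REMOTE_ADDR} "^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$" [OR]
RewriteCond %{HTTP_REFERER} viagra [NC]

# Everything that matched is sent to the dead-end page
RewriteRule ^.*$ /noindex.html [L]
```

Because [OR] binds tighter than the implicit AND, this reads as "not already noindex.html, and (bad user-agent or bad address or bad referer)".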

An example file of everything discussed here, with blocks for hundreds of baddies, can be found by clicking this sentence.
Don't forget to rename it .htaccess!

For an official manual to the .htaccess file, see this Apache tutorial.

Next time, I will write a brief article about the robots.txt file.







© Matt Parnell's Brain: Plugged In!