Mod_rewrite: September 2006 Archives

2006 Sep 13

Spam is a problem, so avoid posting email addresses in blog entries. Many automated programs harvest email addresses. Besides the spam they cause, they can also eat bandwidth, because these programs read your entire website. If your site has only a small bandwidth allocation, you will see a Bandwidth Limit Error in due time.

What did I do? I blocked all unwanted robots from my site using Apache's mod_rewrite. First, examine your access log file and google the robots that have visited your site to find out whether they are safe or just scrapers. Be careful not to block the major search engine spiders such as Googlebot, Inktomi Slurp, msnbot, or Ask Jeeves, unless you don't want them to crawl your website.
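
The log review step can be partly automated. Below is a minimal Python sketch that tallies the user-agent strings in an access log, assuming the log uses Apache's combined format (where the user agent is the last quoted field). The function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: combined log format, where each line ends with
#   "referer" "user-agent"
# Adjust the pattern if your server logs in a different format.
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping each user-agent string to its request count."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:  # lines that do not fit the format are skipped
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines; in practice, pass an open log file instead.
SAMPLE_LOG = [
    '192.0.2.1 - - [13/Sep/2006:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Siphon/1.0"',
    '192.0.2.1 - - [13/Sep/2006:10:00:01 +0000] "GET /a.html HTTP/1.1" 200 734 "-" "Siphon/1.0"',
    '192.0.2.7 - - [13/Sep/2006:10:00:05 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
]

# List the busiest user agents first.
for agent, hits in count_user_agents(SAMPLE_LOG).most_common():
    print(hits, agent)
```

The agents at the top of the list are the ones worth looking up before deciding whether to block them.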

To block unwanted robots from scraping your website, add the following to your .htaccess file:

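<IfModule mod_rewrite.c>
RewriteEngine on
# Match user agents that start with "Siphon" or with "LinkWalker"/"Linkwalker"
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
# The last condition must not carry [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link[Ww]alker
# Leave the URL unchanged (-) and answer with 403 Forbidden
RewriteRule ^.* - [F]
</IfModule>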

The code above tells the spiders Siphon and LinkWalker that they are not allowed on the website by returning a 403 Forbidden error.
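
Before deploying patterns like these, it can help to check which user-agent strings they actually catch. The Python sketch below mirrors the two conditions. It assumes that Python's `re` module treats these simple patterns the same way Apache's regex engine does, and `is_blocked` is a name invented here.

```python
import re

# Same patterns as the RewriteCond lines. mod_rewrite conditions are
# case-sensitive unless the [NC] flag is given, so IGNORECASE is not used.
BLOCKED_AGENTS = [
    re.compile(r"^Siphon"),
    re.compile(r"^Link[Ww]alker"),
]


def is_blocked(user_agent):
    """True if any pattern matches, like conditions chained with [OR]."""
    return any(pattern.search(user_agent) for pattern in BLOCKED_AGENTS)


# Made-up user-agent strings to try against the patterns.
for agent in ["Siphon/1.0", "Linkwalker", "Mozilla/5.0 (compatible; LinkWalker)"]:
    print(agent, "->", "blocked" if is_blocked(agent) else "allowed")
```

Because of the `^` anchor, a robot is caught only when its user-agent string begins with the name. A string that merely contains the name somewhere in the middle passes through.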

There are also good robots, most of them used for link checking, so redirecting them to the proper areas is a better solution.

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} reciprocalman [OR]
RewriteCond %{HTTP_USER_AGENT} LinksManager\.com_bot
# Requests for the site root (empty path) are answered from /resources/
RewriteRule ^$ /resources/ [L]
</IfModule>

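The code above sends reciprocalman and LinksManager.com_bot to the resources directory whenever they request the site root. The pattern ^$ matches only the empty path, so requests for other pages are left alone.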

2006 Sep 7

Bandwidth is precious, and seeing a bandwidth limit exceeded message on your website is frustrating. Blocking unwanted referrers may be your best option. If you use Apache as your web server, you can take advantage of its mod_rewrite module to do this.

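To stop other sites from linking directly to large files such as images and MPEG or AVI videos, add the following to your .htaccess file. Requests that arrive with a foreign referrer are redirected to nohotlink.jpg: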

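<IfModule mod_rewrite.c>
RewriteEngine on
# Skip requests that send no referrer at all
RewriteCond %{HTTP_REFERER} !^$
# Skip requests referred from domain.com or one of its subdomains
RewriteCond %{HTTP_REFERER} !^http://([-a-z0-9]+\.)?domain\.com [NC]
# Never redirect the replacement image itself, which would cause a redirect loop
RewriteCond %{REQUEST_URI} !/nohotlink\.jpg$
RewriteRule .*\.(jpg|gif|avi|wmv|mpg|mpeg)$ http://www.domain.com/nohotlink.jpg [R,NC,L]
</IfModule>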
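
To see which requests the referrer conditions would redirect, the check can be mirrored in a few lines of Python. This is only a sketch: it covers the two referrer conditions and the file pattern, assumes Python's `re` module handles these patterns the same way Apache does, and uses the invented name `is_hotlink`.

```python
import re

# [NC] in the Apache rules corresponds to re.IGNORECASE here.
ALLOWED_REFERER = re.compile(r"^http://([-a-z0-9]+\.)?domain\.com", re.IGNORECASE)
PROTECTED_FILE = re.compile(r".*\.(jpg|gif|avi|wmv|mpg|mpeg)$", re.IGNORECASE)


def is_hotlink(path, referer):
    """True if a request for `path` with this referrer would be redirected."""
    if not PROTECTED_FILE.search(path):
        return False  # the RewriteRule pattern does not match this file
    if referer == "":
        return False  # an empty referrer is let through (first condition)
    return ALLOWED_REFERER.match(referer) is None  # foreign referrer


# Made-up requests to try against the rules.
print(is_hotlink("images/photo.jpg", "http://www.domain.com/page.html"))
print(is_hotlink("images/photo.jpg", "http://other.example/forum"))
print(is_hotlink("images/photo.jpg", "https://www.domain.com/page.html"))
```

Two details of the pattern show up when experimenting this way. It accepts only `http://` referrers, so pages served over `https://` on the same domain count as foreign. It also has no anchor after `domain\.com`, so a referrer such as `http://domain.com.other.example/` is accepted.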

About this Archive

This page is an archive of entries in the Mod_rewrite category from September 2006.

