Spammer Hammer

03/20/05

Permalink 05:45:23 pm, by Paul Oliver, 1250 words, 44589 views   English (US)
Categories: News

Wowee. Ever since I started using a blogging package to handle all the fancy blogging features for me (RSS feeds, comments, auto-archiving, stats, etc.), I have come to the realization that if a medium can be spammed, it will be spammed. It's really disgusting, the stuff I have to block from my blog every 2-3 days.

The stats on the right-hand side (top referers, recent referers) are auto-generated. When someone visits my site by clicking on a link from Google, then Google will become a recent referer and will also receive a point as far as ranking is concerned. So if I get a lot of hits from Google, it may become the top referer. Why is this useful? Well, if you like my blog, maybe you can find other blogs you like by going to some that refer to mine. Plus it's neat for me to see where people are coming from.

Unfortunately, spammers have jumped all over this. They can hack their browsers, or write a program that pretends to be a browser, to visit my site with a bogus referring link. For example, if Spamford McGhee wanted people to come to his site selling the Billy Bass, he could make a program that makes it look like he's visiting paultastic.com from billybass.com. Make it run a couple hundred times a day, and voila! A link shows up in my blog to Spamford's site, because billybass.com has become the top referrer! Do this on enough websites, and the search engines will boost billybass.com's page rank because of all these sites which link back to it.

The blogging package I use, b2evo, has some nice anti-spam tools. For example, I can delete and block any reference to "billybass.com" on my site. Billybass will disappear forever from my blog and will never return. Even cooler yet, b2evo has a centralized database of blocked website keywords that the users share with each other. A guy in Rumania might not get spammed by billybass.com today, but when he updates his local list of spammers, he won't ever get spammed by that particular keyword.

Unfortunately, this method only works if I visit every day and prune my spam links regularly. I don't have time for this!

So Google's come up with a wonderful idea. It will ignore any links that have

rel="nofollow"

in them. What does this mean? Well, spammers will get no benefit from Google (and most likely the other search engines) for spamming your site. They can spam all they want, but what's the point?

Well, today I logged in and there were about 20 sites that were strangling out all the legitimate referrers to my site. I won't name them, so as not to promote their page rank, but it took me 10 minutes to clean them all out. As I write this, my web server is getting hammered by the little spam programs that are pretending to visit my site from these spammer domains. I was mad enough to do something about it.

First, I wanted to implement the "nofollow" feature, so my site will never benefit these spammers again. I visited here and here, which tell you how to edit b2evo to put in the "nofollow" attribute. It was only about 8 edits total, and now all the links on my site will not actually help the spammers...even if they slip through the cracks.

Going through my logs, I noticed that there were four IP addresses that accounted for the 20 spammer domain links I was getting today. So, after researching the problem (hosts.allow does not work for Apache on FreeBSD unless it's compiled with tcp wrappers), I found that the best way was to use mod_access.
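One way to find which hosts sit behind a spammy referrer is to count hits per client for that referrer. The following is only a sketch under stated assumptions: the function name top_referrer_hosts is made up for illustration, and it assumes the access log is in Apache's "combined" format, where the referrer is the second quoted field on each line.

```shell
# Hypothetical helper: list the client hosts that sent a given referrer,
# busiest first.
#   $1 = text to look for in the referrer field (plain substring, not a regex)
#   $2 = path to an access log in Apache "combined" format (assumed)
top_referrer_hosts() {
    # Splitting on double quotes makes $4 the referrer and leaves the
    # client host as the first word of $1.
    awk -F'"' -v pat="$1" '
        index($4, pat) { split($1, parts, " "); print parts[1] }
    ' "$2" | sort | uniq -c | sort -rn
}
```

Something like `top_referrer_hosts billybass.com /usr/local/www/logs/paultastic_access.txt` would then print one line per host with its hit count in front. Because it only looks at the referrer field, a request that merely mentions the domain in its URL is not counted.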

I added the following lines to my httpd.conf file:


<Directory "/usr/local/www/paultastic4/blogs">
Order Allow,Deny
Allow from all
Deny from marketscore.com
Deny from block.alestra.net.mx
Deny from 207.127.0.2
Deny from 170.224.224.149
</Directory>

Those are the actual domains and IPs that were spamming me. Then I did an apache graceful restart, so it would honor the new settings:

$> apachectl graceful

And tada! The spammers won't even get to talk to my blog anymore. So if they're spamming with 20 more domains tomorrow, I don't even have to re-prune the new spam domains.
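Since a typo in httpd.conf is easy to make when pasting in Deny lines, it can help to check the syntax before reloading. Here is a minimal sketch: reload_apache is a hypothetical helper name and the APACHECTL override variable is an assumption added for illustration, while configtest and graceful are standard apachectl subcommands.

```shell
# Hypothetical helper: check the configuration first, and only ask for a
# graceful restart if the check passes.
# APACHECTL is an assumed override; it defaults to "apachectl" on the PATH.
reload_apache() {
    ctl="${APACHECTL:-apachectl}"
    if "$ctl" configtest; then
        "$ctl" graceful
    else
        echo "httpd.conf has errors; not restarting" >&2
        return 1
    fi
}
```

Running `reload_apache` after each edit prints any syntax error that configtest reports and leaves the running server alone when the file is broken.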

How to tell if it's working

A great way is to temporarily add a Deny line for the IP address of another computer you control. Then go to that machine and see whether you can still pull up a page from the blocked site.

Better yet, check your apache logs. Before:

host-148-244-150-58.block.alestra.net.mx - - [05/Mar/2005:12:25:30 -0600] "GET /blogs/index.php HTTP/1.0" 200 41930 "http://www.billybass.com" "Mozilla/4.0 (compatible; MSIE 4.0; Symbian OS/1.1.0)"

This guy requested /blogs/index.php and got a 200 code. That's the HTTP code meaning the request succeeded. The number after 200 is the number of bytes served up. The following field is the referring site (billybass.com!). After that is the browser the guy is using.

Here's what I see after I implemented the Apache httpd.conf fixes:

host-148-244-150-58.block.alestra.net.mx - - [20/Mar/2005:16:56:48 -0600] "GET /blogs/index.php HTTP/1.0" 403 288 "http://www.billybass.com/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.5) Gecko/20031007 Firebird/0.7"

As you can see, it's coming from the same IP address (this is an actual spammer address), 15 days later. The major difference now is that I'm serving up an HTTP 403 status code--which means forbidden! It's working. This guy cannot spam me anymore, which should cut down on the number of spam entries in my blog....hopefully.
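The same before-and-after check can be done in bulk by counting status codes for one host. This is a sketch, not a definitive tool: status_counts is a made-up name, and it assumes the common/combined log layout shown above, where the status code is the ninth whitespace-separated field.

```shell
# Hypothetical helper: count responses per HTTP status code for one client.
#   $1 = host or IP exactly as it appears in the first log field
#   $2 = path to an access log in common/combined format (assumed)
status_counts() {
    awk -v host="$1" '
        $1 == host { n[$9]++ }                 # tally each status code
        END { for (s in n) print s, n[s] }
    ' "$2" | sort
}
```

For a blocked spammer, the 200 count should stop growing after the restart while the 403 count climbs.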

Update 3/22/2005

I'm having more and more dudes spam me. Those four domains cut out a lot of spam hits, but they're still rolling in. For those of you running Linux/Unix/BSD, I came up with a few shell scripts to help automate this:

/usr/local/bin/paulsGrabSpamIPs:


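#!/usr/local/bin/bash

# Append every client host whose log line matches the pattern in $1.
# "-e" lets the pattern start with a dash (e.g. "-gambling-").
grep -e "$1" /usr/local/www/logs/paultastic_access.txt | awk '{print $1}' | sort | uniq >> /root/denyIPs.txt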

NOTE: the "grep" line is all on one line.

/usr/local/bin/paulsGrabSpamIPs2:


#!/usr/local/bin/bash

# Prefix each collected host with "Deny from " and drop duplicates.
awk '{print "Deny from " $1}' /root/denyIPs.txt | sort | uniq > /root/denyIPsHttpdConf.txt

NOTE: the "awk" line is all on one line.

For example:


$> paulsGrabSpamIPs billybass.com
$> paulsGrabSpamIPs "-gambling-"

The first command will put all the IP addresses that referenced billybass.com into /root/denyIPs.txt. The second one will grab all the IP addresses and hosts that referenced the phrase "-gambling-" into /root/denyIPs.txt. Next we run


$> paulsGrabSpamIPs2

This command will write all of the past and current IP addresses to /root/denyIPsHttpdConf.txt. They will be sorted, with duplicates stripped out. Each one will also have "Deny from " in front of it, so you only need to paste the lines into your httpd.conf file. It will look like this:


<Directory "/usr/local/www/paultastic4/blogs">
Order Allow,Deny
Allow from all
Deny from 134.75.217.55
Deny from 193.159.244.70
Deny from 160.109.67.43
Deny from 164.58.25.34
Deny from 170.224.224.149
Deny from 193.95.70.74
Deny from 195.229.241.184
Deny from 195.229.241.187
Deny from 196.203.64.2
Deny from 200-207-168-133.dsl.telesp.net.br
Deny from planetlab2.ru.is
Deny from pncccache4.palaunet.com
Deny from ppp-203.144.197.194.revip.asianet.co.th
Deny from tgvarna.pro-lan.net
Deny from webshield.sulanet.net
Deny from xdsl-264.lubin.dialog.net.pl
</Directory>
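To skip the copy-and-paste step, the whole block could be generated in one go. The following is a rough sketch under assumptions: make_deny_block is a hypothetical helper name, and the output file in the usage note is only an example path.

```shell
# Hypothetical helper: wrap a list of hosts/IPs (one per line) in the
# <Directory> block shown above and print it to stdout.
#   $1 = directory to protect
#   $2 = file of hosts, e.g. /root/denyIPs.txt
make_deny_block() {
    printf '<Directory "%s">\n' "$1"
    printf 'Order Allow,Deny\nAllow from all\n'
    # Skip blank lines, add the "Deny from " prefix, and drop duplicates.
    awk 'NF { print "Deny from " $1 }' "$2" | sort -u
    printf '</Directory>\n'
}
```

For example, `make_deny_block /usr/local/www/paultastic4/blogs /root/denyIPs.txt > /usr/local/apache/conf/spammers.httpd.conf` writes the block to its own file, which httpd.conf can then pull in with an Include line instead of a hand-pasted copy.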

Of course, now all you have to do is an apachectl graceful to restart Apache gracefully. This will reload the configuration file so you can block all those stinkin' spammers. Good luck!

Comments, Pingbacks:

Comment from: Sadie [Visitor]
Thank you so much for all your hard work!

I had a few suspect hosts in my logs which I was able to tell for sure were bad news after looking at your block file.

You've saved me hours of searching Google :)
Permalink 10/15/05 @ 10:39
Comment from: Paul Oliver [Member] · http://www.paultastic.com
Thanks for your comments. Check out the evolution of this idea here.
Permalink 10/18/05 @ 15:26
Comment from: Nahoo [Visitor] · http://dahost.net
There are errors in your conf file:


Syntax error on line 3747 of /usr/local/apache/conf/spammers.httpd.conf:

The specified IP address is invalid.
Syntax error on line 24067 of /usr/local/apache/conf/spammers.httpd.conf:
The specified IP address is invalid.
Permalink 03/31/06 @ 07:55
Comment from: Paul Oliver [Member] · http://www.paultastic.com
Thanks for the heads up Nahoo. The syntax works on my Apache (1.3.33), but perhaps your apache doesn't like the netmask shortcut my file is using.

I changed it from

Deny from 200.6.20/24

to

Deny from 200.6.20.0/24

and did the same thing on line 24067. I'll bet it works now.
Permalink 03/31/06 @ 09:09
Comment from: Nahoo [Visitor] · http://dahost.net
I'll wait until you have updated the conf file with these changes as I received the same messages...

Syntax error on line 3755 of spammers.httpd.conf:
The specified IP address is invalid.

Syntax error on line 24090 of spammers.httpd.conf:
The specified IP address is invalid.


I can't believe how many 403 errors I get now... it's amazing to see that many spam-bots looking at my sites.

Thanks for your attention and keep up the good work!
Permalink 04/02/06 @ 18:51
Comment from: Alan Doherty [Visitor] Email · http://hosting.alandoherty.net/
i have a similar system on my server {hosting multiple sites}
but it uses hidden links {denied by robots.txt to stop google following them} to catch the ips of harvesters/spammers etc
when a link is followed it runs two php scripts: the first adds the "deny from IP" line to the .htaccess that's in the root below all directories {except the directory containing the error pages and php scripts for blacklisting}
and then it runs a script from projecthoneypot.org that feeds them some tagged addresses and logs the ip, to monitor for mail following the harvesting of the unique addresses

it's quite successful and immediately kills the malicious bot's access
also, in case of accidental listing or an innocent user being issued the ip later, the custom 403 page offers a captcha-based "de-list your ip" option

but the downside i'm seeing is proxies
apache seems to have no way of blacklisting ips except for direct connects {i'd like to be able to list the ip behind the proxy {for many big public and useful proxies} rather than all users of the proxy}
{currently it lists them but doesn't block the proxy {connecting ip}, because when i do i have bots getting the proxy listed and legit users de-listing it all day}

would be great if apache offered a
setenvif header={line in file} blocked=yes

so i could add blocked ips to file and then just test the headers
REMOTE_ADDR = proxy IP/your IP
HTTP_VIA = proxy IP/your IP
HTTP_X_FORWARDED_FOR = your IP

etc as well as connecting ip

but atm doing this would require making my htaccess many times bigger {as i have i think 5 headers to check in total atm, so it would need a 5 times bigger file to parse for each request}

any ideas?

btw my malicious spider trapping code and httpd.conf are available on request if you want a gander
Permalink 11/12/07 @ 13:07
Comment from: Paul Oliver [Member] · http://www.paultastic.com
Great idea Alan! I'm very interested in seeing the code.
Permalink 11/13/07 @ 21:42
Comment from: Alan Doherty [Visitor] Email · http://www.alandoherty.net/info/webservers/phpscripts/
sorry for the delay, drop me an email to organise stuff and i can email them over to you
Permalink 05/03/09 @ 18:51
