A simple Bot trap using robots.txt

Set up a simple bot trap for your website or web application to improve the security of your server: lure malicious bots and script kiddies into a trap, then ban them for a set number of minutes.

Introduction

Every web-accessible site should have a robots.txt file that tells friendly bots and spiders which directories to crawl and which ones to leave alone. The robots.txt example below tells all spiders to crawl everything except the /controlpanel/ directory.

User-agent: *
Disallow: /controlpanel/

The “good” bots, such as search engines, will skip your /controlpanel/ directory. The “bad” bots will go straight for it to see what they can find, perhaps hoping for a login form to exploit.

Setting up the trap

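  1. Upload or modify your robots.txt file so it contains the Disallow: /controlpanel/ line.
  2. Create the /controlpanel/ directory inside your document root. The script below writes blacklist.txt using a relative path, so the web server user will normally need write access to this directory.
  3. Upload a single index.php file into it with the content below: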
<?php
 
	$log_to_syslog = true;
	$log_to_file = true;
	$file = "blacklist.txt";
	$enable_expiry = true;
	$expiry_timeout = 300;
	$timestamp = time();
	$ip_address = $_SERVER['REMOTE_ADDR'];
 
	if($log_to_file){
		if(!file_exists($file)){
			touch($file);
		}
		if(!is_writeable($file)){
			die("Cannot write to logfile");
		}
	}
 
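	// Purge expired entries once the file has not been touched for a while
	if($enable_expiry && file_exists($file)){
		if(($timestamp - filemtime($file)) > $expiry_timeout){
			$data = file($file);
			$new  = "";
			foreach($data as $row){
				$row = explode("|",trim($row));
				// Keep only well-formed entries whose ban has not expired yet
				if(count($row) == 2 && ((int)$row[1] + $expiry_timeout) > $timestamp){
					$new .= "$row[0]|$row[1]\n";
				}
			}
			file_put_contents($file,$new);
		}
	}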
 
	if($log_to_syslog){
		error_log("Bottrap banned IP $ip_address");
	}
 
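	if($log_to_file){
		$data = file_get_contents($file);
		// Match the whole "ip|" field at the start of a line, so 1.2.3.4 does not match 11.2.3.45
		if(strpos("\n".$data,"\n".$ip_address."|") === false){
			file_put_contents($file,$ip_address."|".$timestamp."\n",FILE_APPEND | LOCK_EX);
		}
	}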
?>
<html>
	<head>
		<meta http-equiv='content-type' content='text/html; charset=UTF-8'>
		<title>Control Panel Login</title>
<style>
 
body {
	font-family:'Lucida Grande', Tahoma, Arial, Verdana, sans-serif;
	background: #d2d2d2;
	font-size: 12px;
}
 
div {
	background-color: white;
	width: 600px;
	padding: 30px;
	border: 3px solid #7e7e7e;
	color: #757575;
	margin: 0 auto;
	display: block;
	margin-top: 20px;
}
 
h2 {
	color: #20499a;
	margin: 0px;
	margin-bottom: 10px;	
}
</style>
 
</head>
 
<body>
 
<div class='message'>
	<h2>Browser Cert Error - unable to verify your Identity</h2>
	<?php
		if(isset($_SERVER['GEOIP_COUNTRY_NAME'])){
			$country = $_SERVER['GEOIP_COUNTRY_NAME'];
		} else {
			$country = '';
		}
	?>
	<?php if($country != '' && strlen($country)>2):?>
		<p>Your Browser Certificate could not be validated, but it is required for accessing the Control Panel.<br />
		Your IP (<?=$_SERVER['REMOTE_ADDR'];?>, <?=$country;?>) has been recorded and temporarily banned.</p>
	<?php else:?>
		<p>Your Browser Certificate could not be validated, but it is required for accessing the Control Panel.<br />
		Your IP (<?=$_SERVER['REMOTE_ADDR'];?>) has been recorded and temporarily banned.</p>
	<?php endif;?>
	<br />
	<h3>Trouble logging in?</h3>
	<ul>
		<li>Make sure your Identification Certificate Key is installed correctly into your Browser.</li>
		<li>If you continue to have problems accessing your Control Panel please contact Customer Service</li>
	</ul>
</div>
</body>
</html>

I could have kept the above index.php much simpler, but I wanted to give the impression that there is actually a login here, one that requires a browser certificate to be installed. Hopefully that makes people looking for low-hanging fruit look elsewhere.

Now, when “bad” bots poke around your site and access the /controlpanel/ directory, the script records their IP in a file and writes an entry to the Apache error log.

# Visiting /controlpanel/ with your browser gets you an Apache error log entry like this
[Sat Nov 24 15:37:46 2012] [error] [client 51.111.25.150] Bottrap banned IP 51.111.25.150

Besides logging to Apache’s error.log, we also log the bot’s IP address to the blacklist.txt file, so it would be trivial to hook this into your web application and deny them any further access to your site. I like to take it one step further and create a fail2ban filter that blocks them at the firewall level, for every service.
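Hooking the blacklist into your own application could look roughly like the sketch below. It is not part of the original trap: the function name bottrap_is_banned and the path in the usage comment are made up for illustration, and it assumes the "ip|timestamp" line format that index.php writes, with the same 300-second timeout.

```php
<?php
// Hypothetical helper (not part of the trap itself): returns true if $ip has
// an unexpired entry in the blacklist file. Assumes one "ip|timestamp" entry
// per line, as written by the trap's index.php.
function bottrap_is_banned($file, $ip, $timeout = 300, $now = null){
	if($now === null){
		$now = time();
	}
	if(!is_readable($file)){
		return false; // no blacklist yet, so nobody is banned
	}
	$rows = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
	if($rows === false){
		return false;
	}
	foreach($rows as $row){
		$parts = explode("|", trim($row));
		// Skip malformed lines and entries for other addresses
		if(count($parts) != 2 || $parts[0] !== $ip){
			continue;
		}
		// The ban is active while timestamp + timeout lies in the future
		if(((int)$parts[1] + $timeout) > $now){
			return true;
		}
	}
	return false;
}

// Possible use at the very top of your application (the path is an assumption):
// if(bottrap_is_banned(__DIR__."/controlpanel/blacklist.txt", $_SERVER['REMOTE_ADDR'])){
//     http_response_code(403);
//     exit("Forbidden");
// }
```

The helper compares the whole IP field instead of searching for a substring, and it checks the timestamp itself, so a stale entry that has not been purged yet does not keep anyone locked out.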

Integrating with fail2ban

You remember fail2ban, right? We talked about this intrusion detection framework in our recent post.

First, append the below to your /etc/fail2ban/jail.conf file to set up a new jail:

[apache-bottrap]
enabled = true
banaction = pf[localhost=127.0.0.1]
port = 80
filter = apache-bottrap
logpath = /var/log/apache/error.log
maxretry = 1

Second, create a new file called /etc/fail2ban/filter.d/apache-bottrap.conf with the contents below:

[Definition]
failregex = [[]client <HOST>[]] Bottrap banned IP .*
ignoreregex = 

That makes fail2ban watch your Apache error log and ban any IP that accesses the trap. Restart the fail2ban service and try to visit your website’s /controlpanel/ directory. If you get banned, it worked. If not, check the relevant log files to start troubleshooting.
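If you want to sanity-check the pattern against a log line before locking yourself out, a rough PHP approximation is sketched below. The helper name bottrap_match_host is made up, and the character class standing in for <HOST> (written without spaces, as fail2ban expects) is a simplification of my own; fail2ban expands that tag to a more elaborate pattern internally, so treat a match here as a hint only. The fail2ban-regex tool that ships with fail2ban tests the real filter file against the real log.

```php
<?php
// Hypothetical helper: approximates how the failregex matches a log line.
// ASSUMPTION: <HOST> is replaced by a simple IPv4/IPv6 character class,
// which is not the exact pattern fail2ban uses internally.
function bottrap_match_host($failregex, $line){
	// Works for this filter because the pattern contains no "/" delimiter
	$pattern = '/'.str_replace('<HOST>', '(?P<host>[0-9A-Fa-f:.]+)', $failregex).'/';
	if(preg_match($pattern, $line, $m) === 1){
		return $m['host'];
	}
	return null; // the line would not trigger a ban
}

$failregex = '[[]client <HOST>[]] Bottrap banned IP .*';
$line = '[Sat Nov 24 15:37:46 2012] [error] [client 51.111.25.150] Bottrap banned IP 51.111.25.150';

// Dumps the captured client address, or NULL if the line does not match
var_dump(bottrap_match_host($failregex, $line));
```

Feed it a few lines copied from your own error log; if the trap's entries do not match here, they will most likely not match in fail2ban either.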

Finally, enjoy. While it is simple to understand and not effective against a determined attacker, over the years it has made me smile time and time again when someone ran into the bot trap. It’s quite easy to extend this idea to other areas as well, which we will talk about in a future post; if you have any ideas, please comment and share!
