Redirect Stupid Bots to Existing Resources

This content originally appeared on Perishable Press and was authored by Jeff Starr

In case you hadn’t noticed, I’m on another one of my posting sprees: going through the past year’s worth of half-written drafts and collected code snippets, and sharing anything that might be useful or interesting. Here is a bit of .htaccess that brings together several redirection techniques into a single plug-and-play code snippet.

Help stupid bots reach their destination

Most websites are swarming with bot activity. Good bots find useful resources and are on their way. Stupid bots are too stupid to follow links and instead make requests for resources that don’t even exist. The result is a steady stream of 404 “Not Found” errors draining server resources 24/7, all for common resources that are easy to find at their real URLs. For example:

  • Bots requesting login.php on a WordPress site
  • Bots requesting favicons.png and similar files
  • Bots requesting robots.txt in weird locations
  • Bots requesting xmlrpc.php in weird locations

Observing the crawl behavior of such bots, it’s clear they’re not actually looking for the login page, site favicon, robots.txt, and so forth. Instead they’re looking for irregularities and inconsistencies to exploit for nefarious purposes. Or maybe they actually are trying to find the site’s robots file, but are just too stupid (read: badly programmed) to find it.

Fortunately, such suspect behavior is easy to remedy with a touch of .htaccess. To give you an idea, here is a code snippet that helps misguided bots reach their apparently intended destinations.

<IfModule mod_rewrite.c>
	
	# Harmless if already enabled elsewhere in the file
	RewriteEngine On
	
	# LOGINS: send stray login requests to the real login page
	RewriteCond %{REQUEST_URI} !/wp/wp-login\.php [NC]
	RewriteCond %{REQUEST_URI} (wp\-login|login)\.php [NC]
	RewriteRule .* https://example.com/wp/wp-login.php [R=301,L]
	
	# FAVICONS: send stray favicon requests to the real favicon
	RewriteCond %{REQUEST_URI} !^/favicon\.ico$ [NC]
	RewriteCond %{REQUEST_URI} !/images/favicons\.png$ [NC]
	RewriteCond %{REQUEST_URI} /favicon(s)?\.?(png|gif|ico|jpg)?$ [NC]
	RewriteRule .* https://example.com/favicon.ico [R=301,L]
	
	# ROBOTS: send robots.txt requests in subdirectories to the root file
	RewriteCond %{REQUEST_URI} /robots\.txt$ [NC]
	RewriteCond %{REQUEST_URI} !^/robots\.txt$ [NC]
	RewriteRule .* https://example.com/robots.txt [R=301,L]
	
	# XMLRPC: send stray xmlrpc.php requests to the real file
	RewriteCond %{REQUEST_URI} !/wp/xmlrpc\.php$ [NC]
	RewriteCond %{REQUEST_URI} xmlrpc\.php$ [NC]
	RewriteRule .* https://example.com/wp/xmlrpc.php [R=301,L]
	
</IfModule>

This code snippet may be added to your site’s public/root .htaccess file (or added via server config). Remember to replace each instance of https://example.com with your actual site URL, and adjust the /wp/ paths if WordPress is installed somewhere else. Or you can simply remove the domain and use root-relative paths instead, like /robots.txt and /wp/xmlrpc.php for example.
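To illustrate the relative-path option, here is a rough sketch of the ROBOTS rule with the domain removed. It assumes the rewrite engine is already enabled (as it usually is in a WordPress .htaccess file) and that the real file lives at /robots.txt. With the R flag, Apache should fill in the host that the request came in on.

```apache
<IfModule mod_rewrite.c>
	
	# ROBOTS (root-relative target, no hard-coded domain)
	RewriteCond %{REQUEST_URI} /robots\.txt$ [NC]
	RewriteCond %{REQUEST_URI} !^/robots\.txt$ [NC]
	RewriteRule .* /robots.txt [R=301,L]
	
</IfModule>
```

The trade-off is portability versus control: the relative form works unchanged on any domain, while the full URL lets you force a specific protocol and hostname in the same redirect.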

Once in place, the above code will redirect requests for non-existent resources to the actual file. Note that some of these rules are intended for WordPress sites, so remove the LOGINS and XMLRPC rules for sites not running WordPress.

Regardless of the site, the main goal of the above code sample is to give you an idea of how to better manage traffic. With a few well-crafted Apache/.htaccess rules, you can help wayward bots find what they’re looking for, which in turn improves traffic quality and helps minimize exposure to any irregularities.
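The same two-condition pattern can probably be adapted to other files that bots request in the wrong place. As a hypothetical example (not part of the original snippet), here is how it might look for a sitemap, assuming your only sitemap lives at /sitemap.xml and that example.com is replaced with your own domain:

```apache
<IfModule mod_rewrite.c>
	
	# SITEMAP (hypothetical; assumes the one real file is /sitemap.xml)
	# Match any request ending in /sitemap.xml ...
	RewriteCond %{REQUEST_URI} /sitemap\.xml$ [NC]
	# ... except the real one at the site root
	RewriteCond %{REQUEST_URI} !^/sitemap\.xml$ [NC]
	RewriteRule .* https://example.com/sitemap.xml [R=301,L]
	
</IfModule>
```

Skip a rule like this if your site legitimately serves sitemaps from subdirectories, because those requests would be redirected too.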
