
UrlToolkit, extended Match

JanH
22 October 2010, 13:05
Hello,

I'd like to setup some UrlToolkit rules for the User-Agent part of the request. For example:
# stopping bad hackers (http://0entropy.blogspot.com/2010/09/phpmyadmin-scans.html) :
MatchBrowser ZmEu Ban 1000
# stopping bot connections :
MatchBrowser YandexBot Ban 1000


Matching other parts of the request might be interesting too. Would that be possible?

Thanks


Hiawatha version: 7.3
Operating System: Debian Lenny
Hugo Leisink
22 October 2010, 16:43
In my opinion, URL rewriting should only be used for having nice looking URLs. If you use it for anything else, the chance is too big that you are creating something that is part of the business logic of your website. In my opinion, all business logic should be inside the website code, not in the webserver configuration.

For that reason, I will not extend Hiawatha's URL toolkit to match against any other part of the request. Keep all of your website's business logic in one place and let the webserver do the only thing it should do: pass data on to the client.
JanH
22 October 2010, 20:15
For me, Hiawatha is not just software used to "pass on data to the client". I could make any webserver do that. I use Hiawatha because it's "the world's most secure and advanced webserver", built with security in mind from the start.

Paranoid admins like me end up misusing options that were meant just for making nice looking URLs in order to secure their sites even harder. I like to use rules like this:
Match /conf\w*/ DenyAccess
Match ^/(php|pma|round|web|squirrel) Ban 600
That's what Ban and DenyAccess are for.
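For reference, this is roughly how I wire such rules up. It's only a sketch: the toolkit name "harden" is just something I picked, and the rest of the VirtualHost settings are left out.
UrlToolkit {
    ToolkitID = harden
    # deny direct access to configuration directories
    Match /conf\w*/ DenyAccess
    # ban clients probing for common webapp paths for 10 minutes
    Match ^/(php|pma|round|web|squirrel) Ban 600
}

VirtualHost {
    ...
    UseToolkit = harden
}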

Which hosts will be served?
What aliases do they have?
Which environment variables do the webapps need?
Which folders should be served, and which of them must be password protected?
In my opinion, the webserver configuration plays an important role in the website logic. It's often not possible to keep the business logic in the code without also touching the webserver configuration. The configuration is an integral part of the website and cannot be separated from it.

Often, third-party open source software is used online, and sometimes it's not even properly updated. I cannot rely on the security of that code. I rely on a clean, secure setup of the firewall and webserver, which is much more reliable than trying to secure the third-party code. I don't think there is anything wrong with that.

Please correct me if I'm wrong about any of these statements. My approach may not be the best; I'm open to all suggestions.
Hugo Leisink
23 October 2010, 16:38
Being a secure webserver has nothing to do with keeping the business logic of your webapplication in one place.

The Ban option in the URL toolkit exists to discourage people from attacking your webserver while they are scanning for the presence of certain files. Anyone requesting /phpmyadmin/ while it's not there is very likely up to no good, so you want to stop them from doing any further scanning by banning them.
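For example, a rule like this (just a sketch; it assumes a UrlToolkit section that the VirtualHost references via UseToolkit) bans such a client for ten minutes:
# example pattern: ban clients probing for a phpMyAdmin installation that isn't there
Match ^/phpmyadmin(/|$) Ban 600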

The DenyAccess option in the URL toolkit is there because many people who call themselves web developers are not able to write a decent piece of software, and especially not a secure one. So, the only reason DenyAccess exists is to compensate for other people's lack of programming skill.

So, the existing options in the URL toolkit are there for security reasons. That's totally different from including features that could just as well be handled in the website code. Splitting up the business logic of your website is dangerous, especially if part of the security of your website depends on it.

URL rewriting is not something your website can do, so that should be done by the webserver. Selecting the output based on certain headers in the request can easily be done in the website code, so do it there.
JanH
25 October 2010, 11:08
You are right that blocking connections by User-Agent at the webserver level is not standard, nor generally recommended.
But it turns out to be a very effective and popular method:
http://perishablepress.com/press/2009/03/29/4g-ultimate-user-agent-blacklist/
As that blog states: "...serious reduction in wasted bandwidth, stolen resources, and comment spam." This may hold for Hiawatha just as much as for Apache. In my experience, a bad User-Agent cannot be stopped just by BanOnFlooding, BanOnMaxPerIP, or similar options.
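For completeness, this is roughly how I have those binding-level ban options set. A sketch with placeholder values; the exact limits depend on your traffic, and the comments describe my understanding of the option formats:
Binding {
    Port = 80
    # more than 10 requests within 1 second gets the client banned for 300 seconds
    BanOnFlooding = 10/1:300
    # ban time (in seconds) when a client reaches the maximum number of connections per IP
    BanOnMaxPerIP = 60
}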

You're right, "..selecting the output based on certain headers in the request can easily be done in the website code..", but I can't secure the code which I did't wrote.
I'm just hosting webapps like TWiki, TRAC, OTRS, Roundcube, Hobbit, Squirrelmail, phpMyAdmin, Horde, Joomla, Typo3... I don't have the time and resources to make sure all that code is secure enough; I even have a hard time keeping it all up to date. Every single security-oriented webserver option is a huge help here.

Bad User-Agents should be stopped from requesting anything as soon as possible. My logfiles are filled with entries like:
... User-Agent: Made by ZmEu @ WhiteHat Team - www.whitehat.ro ...
... User-Agent: Toata dragostea mea pentru diavola ...

These attempts are clearly malicious; there is no reason why the webserver or web application should even respond to them. They create unnecessary load on the server and exhaust my allowed connections limit. Even if all applications were absolutely secure, I'd still want to have a "MatchBrowser" option as a security/performance feature.

Anyway, thank you for making Hiawatha simply the best webserver out there.
Hugo Leisink
25 October 2010, 12:46
If you only want to block clients based on their User-Agent, you should take a look at the DenyBot option. It was made to block annoying searchbots, but it can also be used to block annoying scanners.
VirtualHost {
    ...
    DenyBot = ZmEu:/
    DenyBot = Toata:/
}


I agree with you that it's nearly impossible to secure other people's code. I'll see what I can do with the UrlToolkit to make your life easier.
JanH
25 October 2010, 16:19
Thank you, this is cool; I somehow overlooked the DenyBot option.
However, compared to the UrlToolkit it has some drawbacks:
- no regexes
- can't Ban the client
- it has to be listed for each virtualhost separately

Anyway, I'll use it for now and see how it performs. This is what I was looking for.

Hiawatha is making my life easier already. Thanks again for developing this great webserver.

Hugo Leisink
26 October 2010, 10:15
Another interesting option for you might be the Hiawatha Monitor. With that, you can see when a vulnerability scanner has scanned your website (you'll see a lot of 404's in your log). Please note that the Hiawatha Monitor is not a Google Analytics replacement; its only purpose is to monitor your webserver and websites in order to track down bugs and other problems.
This topic has been closed.