Weblog

The responses to my previous weblog article made me think a lot about centralized configuration. I've decided not to implement database support in Hiawatha. That's simply too much work, adds too much complexity and makes Hiawatha too bloated. I also don't like the idea of connecting to a database for every request. The whole idea simply doesn't feel right.

So, what else? Before explaining my new idea, I just want to make clear that the original way of loading the configuration from a file will always be available. That will never change. Any centralized configuration storage will just be an extra option. Now, back to my new idea. Instead of using database connections, I can make use of the type of connection Hiawatha already knows best: HTTP. What if the local configuration file only contains a single option which tells Hiawatha which webserver to connect to in order to retrieve its configuration? Hiawatha connects via HTTP(S) to that webserver, downloads the configuration file and handles it as if it was loaded from a local file. That's not hard to implement.
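A minimal sketch of that bootstrap step, written in Python purely for illustration (Hiawatha itself is written in C). The option name ConfigURL and the example URL are assumptions made up for this sketch; no such option exists in Hiawatha today. If the local file has no such line, it is treated as a normal configuration file, so the original file-based loading keeps working.

```python
import urllib.request


def parse_bootstrap(text):
    """Return the URL from a hypothetical 'ConfigURL = ...' line, or None."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "configurl":
            return value.strip()
    return None


def fetch_http(url, timeout=10):
    """Download the configuration text over HTTP(S)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def load_config(local_text, fetch=fetch_http):
    """Return the configuration text the server should parse.

    If the local file points at a remote URL, download that instead;
    otherwise the local text is the configuration.
    """
    url = parse_bootstrap(local_text)
    if url is None:
        return local_text
    return fetch(url)
```

The fetch parameter is there so the download can be swapped out, for example in a test or to add client certificate handling. A real implementation would also have to decide what to do when the configuration server is unreachable at startup, which this sketch does not cover.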

Now, about changing the configuration without restarting the webserver. Simply reloading the configuration is not an option. It's too complex. What to do with the old configuration upon reload? It's very likely still in use by the current connections. What to do with changes in the binding configuration? Kicking all clients after a configuration reload is the same as a restart. This whole configuration reloading is simply asking for trouble, so I won't go that way.

My biggest question in all this is: how bad is restarting a webserver really? Those 2 seconds of offline time, do they really hurt anybody? And how fast does a new website need to be online? What can be done is a cronjob that checks every night for a configuration change and restarts the webserver if there is one. Is that acceptable? If so, the solution is simple. It's like I just described. If not, there are some options, but I haven't figured out which I think is best.
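The nightly check could look roughly like the sketch below. It is only an illustration under a few assumptions: the configuration has already been downloaded to a local file, the digest of the last applied version is kept in a separate state file (a name invented here), and the actual restart is passed in as a function, because the right restart command differs per system.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 digest of a file's contents, as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def config_changed(config_path, state_path):
    """Return True if the config differs from the digest stored last time.

    The stored digest is updated as a side effect. On the very first run
    there is no stored digest, so the config counts as changed.
    """
    current = file_digest(config_path)
    state = Path(state_path)
    previous = state.read_text().strip() if state.exists() else None
    if current == previous:
        return False
    state.write_text(current + "\n")
    return True


def restart_if_changed(config_path, state_path, restart):
    """Call restart() only when the configuration has changed."""
    if not config_changed(config_path, state_path):
        return False
    restart()
    return True
```

Note that this sketch records the new digest before the restart runs. A more careful version would first validate the new configuration and only store the digest after a successful restart.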

One option is to use some sort of dynamic virtual host configuration. A special configuration option points to a directory, in which each subdirectory is named after its virtual host. So, simply adding a new directory adds a new virtual host. This works only if a new website is comparable to the other websites (same URL toolkit, same FastCGI daemon, etc). Otherwise, adding the extra configuration options for that new virtual host makes it all more complex.
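To make the directory idea concrete, here is a small Python sketch of the lookup. The function names are made up for this example and nothing like this exists in Hiawatha. It assumes every subdirectory name is a hostname and that all of these virtual hosts share the same settings.

```python
from pathlib import Path


def discover_vhosts(root):
    """Map each subdirectory name (the hostname) to its directory path."""
    vhosts = {}
    for entry in sorted(Path(root).iterdir()):
        # Only directories count; hidden ones such as .git are skipped.
        if entry.is_dir() and not entry.name.startswith("."):
            vhosts[entry.name.lower()] = entry
    return vhosts


def lookup(vhosts, host_header):
    """Find the website directory for the Host header of a request.

    Simplification: a ':port' suffix is stripped by splitting on the
    first colon, so bracketed IPv6 literals are not handled.
    """
    host = host_header.split(":", 1)[0].strip().lower()
    return vhosts.get(host)
```

Calling discover_vhosts again picks up a newly added directory, so no restart is needed for a new website as long as it needs no settings of its own.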

Another option I thought of is to make the centralized configuration server use some sort of POST request to the webserver, with the new virtual host or URL toolkit as the request body. This makes it more flexible than the first option, but making sure the configuration that has been pushed to the webserver is in sync with the one stored in the database adds extra complexity.

Both options can be implemented as an extra feature of the Hiawatha Monitor. That way you have a centralized tool to maintain your webserver configurations and to keep track of their performance and status.

Although there is still a lot to think of and to work out, I like the general idea of it a lot more than the previous idea (using database connections). This simply feels more right. I'm really interested in your thoughts about this idea.

Pål Sollie
28 July 2015, 10:56
If people are really running as many webservers as they claimed in the previous post, and are not already using something like Salt or Puppet, they're doing something wrong.

Let your configuration management system do what it does best, manage configs and reload services as needed. No need to add more complexity to Hiawatha.
Jeff
28 July 2015, 10:58
I'm hosting 60+ websites with hiawatha and most config changes are related to vhosts, adding or editing them. I have a separate vhost directory (/etc/hiawatha/vhosts) included in the main hiawatha.conf file, which has {domain}.{tld}.conf files for each vhost. So basically every time I need to change a website configuration I edit the proper vhost file programmatically (or manually) and restart hiawatha. I haven't had any issues doing this so far, except if a person is uploading files or submitting a form.

So maybe your idea of monitoring a directory could be applied to vhost configuration files, monitoring them for changes in md5sum. If something changed, the file could be re-read to replace that vhost's configuration values. I haven't properly read the hiawatha source, but a hash table could be used to store every vhost configuration file. When a vhost is accessed its configuration values could be write locked, so the configuration reload needs to wait before attempting to replace the old conf with the new one.

Also instead of monitoring a directory (which would be redundant and consuming) maybe a cli switch like -r|--refresh should be added that only scans for vhost configuration changes/additions on demand.

With that in mind, a directory like /etc/hiawatha/conf.d could be used for configurations that should be read dynamically/on demand, limiting the type of configurations that can be modified on the fly.
Hugo Leisink
28 July 2015, 11:08
@Pål Sollie: How webserver independent are tools like Salt and Puppet? Do they explicitly have to support Hiawatha or are they capable of generally maintaining configuration files and restarting services?
David Johnson
28 July 2015, 11:36
A couple of thoughts:

(I've not worked with the source much, so the practicality/usefulness of these ideas may be limited)

1) Working with your idea of having hiawatha pull its config from a remote http resource - could this be implemented so that the config can be split over multiple remote urls - like multiple include files in a config file. This would allow different ways of organizing the config to suit different setups. For instance, it might be useful to split into vhosts.

2) As a webserver restart appears to be the only uncomplicated way to reload the config, perhaps time/effort would be better spent trying to improve startup/restart time. Where are the bottlenecks in the process? Can the process be reordered so that not everything has to be torn down and rebuilt? Can that 2s restart time be reduced to 1s or 0.5s?

Does the process of attaching to privileged ports and then dropping privileges take significant time - can this part be avoided if the listener/global part of the config is unchanged?

Is the parsing of the config file slow? Would it help to "compile" the config into a form that can be loaded more quickly? In this case, if the config can be parsed and compiled offline ahead of a restart, would this speed up the restart significantly? Could a restart then be organised as: compile the config whilst continuing to serve, and when the new config is ready, stop serving and load the new config?

DJ
Hugo Leisink
28 July 2015, 11:43
@David:
1) That can be done, but that makes the checking for a correct configuration more complex. Checking the correctness of one configuration part can only be done when the entire configuration is checked, because one part can depend on another part (VirtualHost using a URL toolkit or FastCGI daemon). And if parts can be used in multiple configurations, it becomes even more complex. So, I don't know if this is such a good idea.

2) The loading of the configuration during a restart is not the bottleneck. Starting Hiawatha takes less than half a second (on my server). It's the shutdown of the currently running process that takes time. Simply killing the process is not desirable. What is done during a shutdown is closing the sockets for new incoming connections, signalling all the client threads to shut down and giving them time (which causes most of the delay) to finish uploading files / executing CGI and to terminate. After that, the remaining threads are killed and the server terminates.
ZEROF
28 July 2015, 12:20
For me, Hugo, you don't need to spend time on extra features, but making a small administration panel would be cool. Just a few options inside: start, stop and set vhosts. After that you will see other devs jump in. Using Bootstrap as a base is a good solution. Just my 2 cents.
Pål Sollie
28 July 2015, 12:29
@Hugo:
My personal experience is mainly with Saltstack, but I believe it to be equally true for most of the other config management systems.

Salt doesn't care which service you throw at it. It knows how to start/stop/reload/restart a service based on which distro/os you're on. You can make it watch a file/directory/package and reload/restart if it detects changes in any of those. You could make the restart depend on "service hiawatha check" exiting with 0.

It would need to be implemented for your setup, but Salt defaults to YAML for its states, so it's pretty easy to get the hang of.
Kapageridis Stavros
28 July 2015, 16:25
IMHO there is a need for something like a repository with configuration files for hiawatha (better if it is separated into levels: default, secure, insane) and toolkits (wp, joomla, etc.), so when I create a new vhost I only need to set some parameters (URLs) in a config file.
As for all of this, and also for vhost creation and management, I strongly believe that it can be done inside the monitor panel (expanding this panel into a hiawatha manager).

I know that this needs a lot of work and I do not know how safe it can be, but I believe that it will make more administrators turn to Hiawatha.

In a few words, the administrator will connect to his Hiawatha manager and manage everything as we do now from the command line. He will also have all the preferred options and suggestions, as they are added in the future, ready for deployment.

Imagine it like this: log in to the Hiawatha Administration Panel, go to vhosts, create a new vhost, enter what I need there (domain name...), select the options I like for this vhost by ticking checkboxes or filling in textboxes (toolkits, max connections, etc.) and press "Create vHost".

I do not see the restarting of the webserver as a problem because it lasts a few seconds. No database is needed, I like the way hiawatha works.

For anyone who has the same thought as me, I will be happy to analyze it more.
Michael
28 July 2015, 16:34
As far as changing the config for current users goes:
1. When a server senses via HTTPS that its config file has changed, it starts doing connection tracking.
2. Stop any new IPs from connecting to it, via hiawatha or by creating an iptables rule.
3. The current users' sessions will end, and since you do such a good job of detecting and blocking attacks, this will happen sooner than on most web servers. Wait until the server goes quiescent, then load the new config.
4. Open that server to new connections again.
5. Save the config and date it. Why not? Always good to be able to roll back, unless storage is limited.
The above works on the assumption that the load balancer on top of an array of hiawatha servers is doing connection tracking right and keeping traffic from the same sources localized on the same server out of the many it manages.
Nere
28 July 2015, 18:13
Hello Hugo,

I think what's much more important is Python 3 support in Hiawatha.
Do you think that you could do something for that?
Mark
29 July 2015, 07:03
I agree with the comment that if you're doing any of this by hand you're doing it wrong.

I think Hiawatha should concentrate on being a web server, and being the best one (cf https://github.com/h2o/h2o and http://blog.kazuhooku.com/2015/07/h2o-version-140-released-with.html). It is a specialist service app in a stack. Build (packer.io, Vagrant), installation (Chef), configuration (Chef), orchestration (Terraform, Karamel) and service discovery (Consul) are also specialist tasks. Hiawatha could maybe provide out-of-the-box defaults/examples for slotting into these stacks, e.g. Salt.

Hope that helps.
Hugo Leisink
29 July 2015, 08:41
@Nere: What is needed to support Python 3?
Alexander S
29 July 2015, 10:59
I used some parallel shell thing to restart/update the webserver and configs, which included the webserver node's IP via an include of a separate, untouched file.

I usually don't use single point of failures or additional attack surfaces.
I mainly did a cdn thing, so the cfg was not changed too frequently, but when, the editing took some time, yes.

That https config thing could really be interesting though, I like the idea of setting up a php script that listens to specific POSTs and sends back a configuration.

Running python3 with hiawatha? Interesting.

Here's a python doc on how to use python on the web, that might be helpful or give an idea:
https://docs.python.org/3/howto/webservers.html
Paolo
3 August 2015, 14:15
I also think that a control panel would be a boost feature.
Excluding a DB solution, what about the possibility to save config file in JSON format?
Easy to encode and decode ---> easy creation of a web panel in any language available.
My 2c.
faelar
11 August 2015, 21:00
I work as a system administrator for an ISP. We try to avoid restarting apache but do not fear to do so when a simple "reload" is not sufficient. Maybe the best solution is to allow the reloading of "simple" (if there is such a thing) changes, and require a full restart for more sensitive stuff.

I don't care about a web interface/panel, hiawatha configuration is more than clear to me and we only use CLI for that kind of administration.

One important point you should target if you aim for production is to add the ability to easily generate as many metrics as possible. I never tried hiawatha monitor, but we already have many tools and services (cacti, nagios, crappy-internal-app...). I can't install a full LAMP stack to have yet another thing to look at while working. I would like to see something as clean as the rest of hiawatha: an option to drop data in an easily parsable file.

On a side note, the keyword for python support is "WSGI"
Rune
12 September 2015, 17:29
If the configuration has been changed and the webserver needs to be reloaded then maybe it could be possible to start a new web server that will take care of new connections and when all old connections are closed then the old instance is closed down.
Hugo Leisink
12 September 2015, 17:31
That won't work, because then both instances need to bind to the same ports. That's not possible. But thanks for the idea anyway.