Manoj Kasichainula, Collab.Net
manojk+ac2k@io.com, manoj@collab.net
Apache 1.3 on Unix is a preforking web server. This means that it maintains a pool of processes that are responsible for handling connections. Each child process deals with a single HTTP connection as it arrives, and after that connection is handled, the process hangs around waiting for another connection to process.
This method is robust: the death of a single process affects only a single connection. But it's not very scalable. A web server handling 5000 simultaneous connections needs 5000 processes running to deal with them. Each of these processes can potentially live a long time, since today's traffic is full of low-speed modem users.
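The isolation property can be illustrated with a small sketch. It is not Apache's code: the name toy_prefork_run and the crash_index parameter are made up for illustration, and each child here pretends to serve exactly one connection and then exits, whereas a real preforked child waits for further connections. One child exits with a failure status to stand in for a process that dies, and the parent counts how many children finished cleanly.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define TOY_MAX_CHILDREN 64

/* Fork `nchildren` processes, each standing in for one connection.
 * The child whose index equals `crash_index` exits with a failure
 * status to simulate the death of a process (pass -1 for no failure).
 * Returns the number of children that exited cleanly. */
int toy_prefork_run(int nchildren, int crash_index)
{
    pid_t pids[TOY_MAX_CHILDREN];
    int started = 0, clean = 0, i;

    assert(nchildren >= 0 && nchildren <= TOY_MAX_CHILDREN);

    for (i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                  /* could not create more children */
        if (pid == 0) {
            /* Child: pretend to serve one connection, then exit. */
            _exit(i == crash_index ? 1 : 0);
        }
        pids[started++] = pid;      /* parent remembers the child */
    }

    /* Parent: reap every child and count the clean exits. */
    for (i = 0; i < started; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i]
            && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            clean++;
    }
    return clean;
}
```

Calling toy_prefork_run(4, 2) starts four children of which one fails; the other three are unaffected, which is the robustness argument made above. The cost is also visible: one whole process per connection.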
To alleviate the problem somewhat, Apache 2.0 will have support for threads on Unix systems that have a pthreads interface. This is done through multiprocessing modules (MPMs) that are responsible for managing processes and threads while passing the actual handling of a connection to the Apache core.
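As a rough sketch of this division of labor, assuming a pthreads system: the hypothetical toy_mpm_run below only creates and joins threads, and hands each pretend connection to a handler callback that stands in for the Apache core. None of these names come from Apache, and a real MPM would accept connections from a socket rather than count integers.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_MAX_THREADS 64

/* Stand-in for the server core: handles one connection. */
typedef void (*toy_conn_handler)(int conn_id);

typedef struct {
    int conn_id;
    toy_conn_handler handler;
} toy_job;

/* Thread entry point: pass the connection to the core handler. */
static void *toy_worker(void *arg)
{
    toy_job *job = arg;
    job->handler(job->conn_id);
    return NULL;
}

/* Run one thread per connection and wait for all of them.
 * Returns the number of threads started, or -1 if nconn is out of range. */
int toy_mpm_run(int nconn, toy_conn_handler handler)
{
    pthread_t tids[TOY_MAX_THREADS];
    toy_job jobs[TOY_MAX_THREADS];
    int started = 0, i;

    assert(handler != NULL);
    if (nconn < 0 || nconn > TOY_MAX_THREADS)
        return -1;

    for (i = 0; i < nconn; i++) {
        jobs[i].conn_id = i;
        jobs[i].handler = handler;
        if (pthread_create(&tids[i], NULL, toy_worker, &jobs[i]) != 0)
            break;                  /* stop if a thread cannot be created */
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started;
}

/* Example handler: counts handled connections under a mutex. */
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static int toy_handled = 0;

void toy_count_connection(int conn_id)
{
    (void)conn_id;
    pthread_mutex_lock(&toy_lock);
    toy_handled++;
    pthread_mutex_unlock(&toy_lock);
}

int toy_handled_count(void)
{
    return toy_handled;
}
```

Because the thread management knows nothing about what the handler does, a different threading policy could be swapped in without touching the handler, which is the point of separating the two.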
In addition to support for the original preforking model, there are also MPMs that use a single thread per connection. There can be a single process that contains all the connection-handling threads, or these threads can be split between different processes to improve reliability.
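The trade-off is simple arithmetic, sketched here with a hypothetical helper. The function name and the figure of 50 threads per process are illustrative assumptions, not Apache defaults.

```c
#include <assert.h>

/* Number of processes needed to serve `connections` simultaneous
 * connections when each process runs `threads_per_process` threads,
 * one thread per connection. The division rounds up. */
long toy_processes_needed(long connections, long threads_per_process)
{
    assert(connections >= 0 && threads_per_process > 0);
    return (connections + threads_per_process - 1) / threads_per_process;
}
```

With one thread per process, which is the preforking model, toy_processes_needed(5000, 1) is 5000. With 50 threads per process it drops to 100, and with all threads in one process it is 1. Fewer processes means less overhead, but the death of a process then takes more connections with it.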
For the administrator, the addition of MPMs can add some complication. Each MPM can have its own configuration directives, because they behave differently. For example, a simple preforking MPM naturally wouldn't have any sort of threading configuration. We hope to simplify the situation before the final 2.0 release, but for now, here is the layout. Note that the names of these MPMs probably will change.
MinSpareServers and MaxSpareServers are gone, replaced with:
ThreadsPerChild
MinSpareThreads
MaxSpareThreads (the counterpart to MinSpareThreads; specifies the upper threshold for idle threads)
Dexter's directives are a more radical shift from 1.3 and the prefork MPM. Almost all the process-management directives have been replaced.
NumServers
StartThreads
MinSpareThreads
MaxSpareThreads (the counterpart to MinSpareThreads; specifies the upper threshold for idle threads in a process)
Apache 2.0's support for platforms other than Unix should be far better as well. The MPMs described above allow each platform to have its own module for managing threads and/or processes. There are MPMs written for Windows, OS/2, and BeOS, all taking advantage of special features and quirks of their platforms.
We have also based the web server on a new library called the Apache Portable Runtime (APR). APR provides a mostly platform-independent wrapper around platform-dependent system services. This allows Apache to avoid using OS-provided POSIX-emulation layers, which can severely hurt performance and stability.
The build system from 1.3 has been rewritten. It is now based on autoconf and uses libtool. This makes the process of building Apache similar to that for other open source packages, and will hopefully allow us to expend less effort on build configuration and more on cool features. We may provide a configuration interface similar to that used since the early days of Apache. Many people prefer using a text file rather than a command line for build-time configuration.
Apache now has some of the infrastructure in place to support serving multiple protocols. mod_echo has been written as an example. In theory, any protocol that runs over a single TCP connection should be implementable, and many multi-connection protocols (FTP, for example) should be possible.
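To give a feel for what an echo protocol handler does with a connection, here is a minimal sketch. It is not the source of mod_echo: it works directly on a POSIX file descriptor rather than through Apache's APIs, and the name toy_echo_connection is invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Read everything the peer sends on `fd` and write it straight back.
 * Returns the number of bytes echoed, or -1 on an I/O error. */
long toy_echo_connection(int fd)
{
    char buf[512];
    long total = 0;

    assert(fd >= 0);
    for (;;) {
        ssize_t off = 0;
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)
            break;                      /* peer finished sending */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, try again */
            return -1;
        }
        while (off < n) {               /* write() may be partial */
            ssize_t w = write(fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    shutdown(fd, SHUT_WR);              /* tell the peer no more data follows */
    return total;
}
```

The handler never looks at the bytes it moves, so nothing in it is specific to HTTP. A handler for another single-connection protocol would have the same shape with real parsing in the middle.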
Apache 1.3 used a table of calls into a module to allow the module to take over processing of an HTTP request at various stages. This wasn't very flexible, and misordering the modules in the configuration file caused trouble.
Modules for 2.0 will instead call a function to register their hooks. For example, mod_auth calls:
ap_hook_check_user_id(authenticate_basic_user,NULL,NULL,HOOK_MIDDLE);
to register its interest in the check_user_id stage.
With this change, modules get more control over how and when they are called. For example, a module can use HOOK_FIRST or HOOK_LAST to indicate that it needs to be called before or after all other modules in that particular stage of processing.
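The ordering idea can be sketched with a toy registry. This is not Apache's hook implementation: the toy_ names, the numeric values given to HOOK_FIRST, HOOK_MIDDLE and HOOK_LAST, and the fixed-size table are all assumptions made for illustration, and the toy version leaves out the two extra arguments that appear in the mod_auth call above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Ordering constants named as in the text; the values are assumed. */
enum { HOOK_FIRST = 0, HOOK_MIDDLE = 10, HOOK_LAST = 20 };

#define TOY_MAX_HOOKS 16

typedef int (*toy_hook_fn)(void);

typedef struct {
    const char *module;     /* name of the registering module */
    toy_hook_fn fn;         /* function to call at this stage */
    int order;              /* HOOK_FIRST, HOOK_MIDDLE or HOOK_LAST */
} toy_hook;

static toy_hook toy_hooks[TOY_MAX_HOOKS];
static size_t toy_hook_count = 0;

/* Register a hook, keeping the table sorted by `order`. Hooks with the
 * same order keep their registration order. Returns 0, or -1 if full. */
int toy_hook_register(const char *module, toy_hook_fn fn, int order)
{
    size_t i;

    assert(module != NULL && fn != NULL);
    if (toy_hook_count == TOY_MAX_HOOKS)
        return -1;
    i = toy_hook_count;
    while (i > 0 && toy_hooks[i - 1].order > order) {
        toy_hooks[i] = toy_hooks[i - 1];    /* shift later hooks down */
        i--;
    }
    toy_hooks[i].module = module;
    toy_hooks[i].fn = fn;
    toy_hooks[i].order = order;
    toy_hook_count++;
    return 0;
}

/* Call every hook in order and record the module names, separated by
 * commas, in `trace`. Returns the number of hooks called. */
size_t toy_hook_run_all(char *trace, size_t cap)
{
    size_t used = 0, i;

    assert(trace != NULL && cap > 0);
    trace[0] = '\0';
    for (i = 0; i < toy_hook_count; i++) {
        size_t len = strlen(toy_hooks[i].module);
        toy_hooks[i].fn();
        if (used + len + 2 > cap)
            continue;                       /* trace full; hook still ran */
        if (used > 0)
            trace[used++] = ',';
        memcpy(trace + used, toy_hooks[i].module, len);
        used += len;
        trace[used] = '\0';
    }
    return toy_hook_count;
}

/* Example hook body; a real hook would inspect the request. */
int toy_noop_hook(void)
{
    return 0;
}
```

The order in which modules register no longer decides the order in which they run: a hook registered with HOOK_LAST runs last even if its module registered first.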
Modules can also specify that a certain hook must not be allowed to run before or after another module's hook. This functionality is used in mod_mime_magic, to make sure that it only gets called to check a MIME type if mod_mime fails.
This topic is further discussed in Ryan Bloom's talk "Migrating Apache 1.3 modules to Apache 2.0."
There is also a new process_connection hook that's used by modules that provide support for protocols other than HTTP.
APR allows Apache to avoid many of the details of platform independence. APR should also be used by any modules developed for Apache 2.0. The 2.0 APIs have changed to use APR types instead of POSIX types. So at a minimum, modules will need to use the translation functions to convert to APR types and back. However, using the APR types throughout the module can be beneficial, because it will improve the portability of your module to many different platforms.
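The idea of translating between a portable type and the native one can be sketched as follows. This is deliberately not APR's API: toy_file_t, toy_os_file_t and the two conversion functions are hypothetical names that only show the shape of such a wrapper.

```c
#include <assert.h>
#include <stddef.h>

/* The native handle type differs per platform (assumed here to be a
 * pointer-sized handle on Windows and a file descriptor elsewhere). */
#ifdef _WIN32
typedef void *toy_os_file_t;
#else
typedef int toy_os_file_t;
#endif

/* The portable type that module code would pass around. */
typedef struct {
    toy_os_file_t native;
} toy_file_t;

/* Wrap a native handle in the portable type. */
toy_file_t toy_file_from_os(toy_os_file_t native)
{
    toy_file_t file;
    file.native = native;
    return file;
}

/* Recover the native handle from the portable type. */
toy_os_file_t toy_file_to_os(const toy_file_t *file)
{
    assert(file != NULL);
    return file->native;
}
```

A module that converts only at its edges still contains platform-specific code around each conversion. A module that uses the portable type throughout confines that code to the wrapper, which is the portability benefit described above.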
This topic is further discussed in Ryan Bloom's talk "APR: What is it, and why we use it in Apache."