Apache vs Others

Interesting comment on the Homo-Adminus blog. I'll copy it here so it doesn't disappear. Meditate on this I will (Yoda-style speaking ;) ).

Apache doesn’t fork a new instance for every connection.

Rather, it “preforks” a pool of instances, all of which wait for connections. When a connection comes in, one of the idle instances processes it. Since selecting that process is an O(1) operation (the preforked process's stack is already allocated), it actually scales better than single-process servers, which have to allocate data structures for each connection as it arrives; depending on the data structures used, that is usually O(log n) [in the best case?!].
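The prefork pattern described above can be sketched in a few lines. This is a minimal demo, not Apache's actual implementation: the parent creates the listening socket, forks a few workers that all block in accept() on the same inherited descriptor, and the kernel picks which worker handles each incoming connection. The worker count and one-connection-per-worker behavior are simplifications for the demo; it assumes a POSIX system where os.fork is available.

```python
import os
import socket

NUM_WORKERS = 3

# The parent creates the listening socket BEFORE forking, so every
# worker inherits the same file descriptor -- this is "preforking".
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))      # ephemeral port, demo only
listener.listen(8)
port = listener.getsockname()[1]

pids = []
for _ in range(NUM_WORKERS):
    pid = os.fork()
    if pid == 0:
        # Worker: block in accept() alongside its siblings. The kernel
        # wakes exactly one waiting worker per incoming connection.
        conn, _ = listener.accept()
        conn.sendall(b"hello from pid %d\n" % os.getpid())
        conn.close()
        os._exit(0)                  # each demo worker serves one connection
    pids.append(pid)

# Parent plays client: each connection is served by some idle worker.
replies = []
for _ in range(NUM_WORKERS):
    c = socket.create_connection(("127.0.0.1", port))
    replies.append(c.recv(100))
    c.close()

for pid in pids:
    os.waitpid(pid, 0)

print(len(replies))  # 3
```

Note there is no per-connection fork() and no scheduling decision in user space: dispatch is simply whichever blocked worker the kernel wakes up.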

All the instances share the same memory segments for code and data; a page is only copied when it is written to (copy-on-write). So every instance uses only minimal extra memory, and only while processing connections. The same holds for single-process servers: processing a connection costs them at least as much memory too (or more: counting the poll arrays, they can very well take more).
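The copy-on-write behavior above is easy to observe from user space. In this sketch (again assuming a POSIX fork), parent and child nominally share the same data after fork(); the child's write triggers a private copy, so the parent's view is untouched:

```python
import os

# After fork(), parent and child share the same physical pages
# until one of them writes (copy-on-write).
data = bytearray(b"shared")

pid = os.fork()
if pid == 0:
    data[0:6] = b"copied"            # write -> kernel copies the page for the child
    os._exit(0 if bytes(data) == b"copied" else 1)

_, status = os.waitpid(pid, 0)
assert os.WEXITSTATUS(status) == 0   # child saw its private copy

print(bytes(data))  # b'shared' -- parent's page was never modified
```

This is why a large pool of preforked workers is far cheaper than "N processes × full process size" naive accounting would suggest.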

Since “zero-copy” file sending is more efficient than copying from kernel to user space and back again, efficient servers will want to use sendfile or an equivalent. But sendfile is synchronous, so the kernel has no chance to reorder disk accesses when a single process is sendfile-ing from scattered places all over the disk(s). When multiple processes sendfile, however, the kernel can reorder disk accesses efficiently, because the processes block and yield. Sending an individual file will therefore have more latency, but higher throughput, and thus higher overall performance.

The disk has an upper limit on throughput: with a modern RAID array this might be 300 MB/s for sequential access, but only 20 MB/s when seeking randomly. Depending on the size of the site served (sites that need to pay attention to httpd performance tend to be huge), the disk and in-memory cache can saturate. Single-process servers will then hit the 20 MB/s limit because of seeking, while multi-process servers will only hit the sequential throughput limit, which is much higher (say 300 MB/s).
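For reference, here is what the zero-copy path looks like from Python on Linux. This is a toy sketch over a local socket pair, not a web server: os.sendfile hands the file descriptor to the kernel, which moves pages from the page cache to the socket buffer without a round trip through user space (no read()-into-buffer, write()-from-buffer loop).

```python
import os
import socket
import tempfile

# Prepare a small file to transmit (small enough to fit in the
# default socket buffer, so this single-process demo can't deadlock).
payload = b"x" * 16384
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

src, dst = socket.socketpair()       # stands in for a client connection

with open(path, "rb") as f:
    sent = 0
    while sent < len(payload):
        # Kernel copies straight from the page cache to the socket;
        # the file's bytes never enter this process's address space.
        sent += os.sendfile(src.fileno(), f.fileno(), sent, len(payload) - sent)
src.close()

received = b""
while True:
    chunk = dst.recv(65536)
    if not chunk:
        break
    received += chunk
dst.close()
os.unlink(path)

print(len(received))  # 16384
```

Note that the sendfile call blocks until the kernel has queued the data, which is exactly the synchronous behavior the comment is talking about: one process doing this serializes its disk reads, while many processes doing it concurrently give the I/O scheduler something to reorder.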

Therefore, “preforking” servers such as Apache, xs-httpd or AOLserver are theoretically, in every respect I can think of, higher-performance than single-process servers such as lighttpd, nginx, thttpd or Zeus.