Discussion: stress testing of Apache server
Sergey Ten
2005-05-03 20:08:46 UTC
Hello all,

SourceLabs is developing a set of tests (and appropriate workload data) to
perform stress testing of an Apache server using requests for static HTML
pages only. We are interested in getting feedback on our plans from the
Apache server community, which has a lot of experience in developing,
testing and using the Apache server.

Information available on the Internet, as well as our own experiments, make
it clear that stressing a web server with requests for static HTML pages
requires special care to avoid situations where either network bandwidth or
disk IO becomes the limiting factor. Thus simply increasing the number of
clients (HTTP requests sent) is not, by itself, an appropriate way to stress
the server. We think that using special workload data (including httpd.conf
and .htaccess files) will help execute more code and, as a result, better
stress the server.

We anticipate that our tests will cover 80% of the code (measured with code
coverage metrics) for the following Apache modules, chosen as the most
commonly used:
. mod_rewrite;
. mod_auth and mod_auth_ldap;
. mod_ssl.

We would like to know your opinions and feedback on:
. What are the stress scenarios where you have had problems?
. What are the modules (including external ones) you have stressed that
are missing from our list?
. What are the modules (including external ones) you would be interested in
stressing?

Additional feedback on the validity of our test plan, on ranking the modules
according to their importance, etc., would also be much appreciated.

Best regards
Sergey Ten,
SourceLabs

Dependable Open Source Systems
Sergey Ten
2005-05-04 00:36:38 UTC
Hello Paul.

Thank you very much for your time and suggestions.

My comments are inline.

Thanks,
Sergey
-----Original Message-----
Sent: Tuesday, May 03, 2005 1:52 PM
Subject: Re: stress testing of Apache server
Post by Sergey Ten
Hello all,
SourceLabs is developing a set of tests (and appropriate workload data) to
perform stress testing of an Apache server using requests for static HTML
pages only. We are interested in getting feedback on our plans from the
Apache server community, which has a lot of experience in developing,
testing and using the Apache server.
Information available on the Internet, as well as our own experiments, make
it clear that stressing a web server with requests for static HTML pages
requires special care to avoid situations where either network bandwidth or
disk IO becomes the limiting factor. Thus simply increasing the number of
clients (HTTP requests sent) is not, by itself, an appropriate way to stress
the server. We think that using special workload data (including httpd.conf
and .htaccess files) will help execute more code and, as a result, better
stress the server.
Which tools are you planning to use? Flood might be useful:
http://httpd.apache.org/test/flood/
We are using flood to test Apache. The functionality it provides is adequate
for our needs; there are a couple of things missing (for instance, flood does
not allow verifying that an HTTP request returns a given code: the
verify_resp functions treat any return code other than 200 or 3xx as a
failure), but they can be easily fixed.
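
For reference, our flood configurations look roughly like flood's bundled
round-robin example (the hostname and counts below are placeholders, not our
actual workload):

    <flood configversion="1">
      <!-- The URLs to request; the real list covers many static pages -->
      <urllist>
        <name>static</name>
        <description>Static HTML workload</description>
        <url>http://testserver.example.com/index.html</url>
      </urllist>
      <!-- How each simulated client walks the URL list -->
      <profile>
        <name>stress</name>
        <description>Round-robin over the URL list</description>
        <useurllist>static</useurllist>
        <profiletype>round_robin</profiletype>
        <socket>generic</socket>
        <!-- the verify function discussed above: 200/3xx pass, all else fails -->
        <verify_resp>verify_200</verify_resp>
        <report>relative_times</report>
      </profile>
      <!-- A farmer repeats the profile; the farm runs farmers in parallel -->
      <farmer>
        <name>client</name>
        <count>100</count>
        <useprofile>stress</useprofile>
      </farmer>
      <farm>
        <name>Bingo</name>
        <usefarmer count="10">client</usefarmer>
      </farm>
      <seed>1</seed>
    </flood>
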
I am not 100% sure on the purpose of your test. If you want to tune
apache for the highest performance, one of the first things is to remove
.htaccess files. Doing hundreds of mod_rewrite rules inside an
.htaccess file for example has a *huge* penalty.
Post by Sergey Ten
We anticipate that our tests will cover 80% of the code (measured with code
coverage metrics) for the following Apache modules, chosen as the most
commonly used:
. mod_rewrite;
. mod_auth and mod_auth_ldap;
. mod_ssl.
Which mod_auth? File? DBM? Basic? Digest?
A mixture of File and DBM, via virtual hosts and .htaccess. We are using
Basic authentication. In order to use Digest, the flood sources need to be
modified (currently flood can only include Basic auth info in the HTTP
header). The change is simple, and I will probably make it when I have cycles.
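
To give an idea of the shape of the setup (the paths and realm name below
are placeholders), a protected directory gets an .htaccess along the lines of

    # Basic auth against a flat password file (mod_auth);
    # the DBM variant swaps AuthUserFile for AuthDBMUserFile (mod_auth_dbm).
    AuthType Basic
    AuthName "Stress Test Realm"
    AuthUserFile /var/www/passwd/users
    Require valid-user
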
Different MPMs? Worker, Event, Prefork, Leader.. etc.
It would be nice if you could do 1.3, 2.0 and 2.1. Some things also
depend greatly on the OS... For the threaded MPMs on Linux, using a 2.6
kernel and NPTL can make a huge difference.
We are currently testing Apache 1.3 on RHEL 3 on 32-bit x86. We plan to run
tests against Apache 2.x pretty soon, although we have not finalized the
details yet (the choice of MPMs is one of them). We are scheduling additional
resources to run tests on different platforms/OSes. Which platforms/OSes
would be interesting to you?
Post by Sergey Ten
. What are the stress scenarios where you have had problems?
Running out of RAM. If the machine runs out of RAM, performance will
tank. The reason many 'benchmarks' aren't realistic is that they do not
simulate real 'Internet' load. On the Internet you have a huge range of
clients, from 28.8 modems with 1 full second of latency, to DSL and
everything in between. Many of the benchmarks are done over 100mbit or
greater LANs, that do not accurately simulate how traffic hits a real
server.
Post by Sergey Ten
. What are the modules (including external) you have stressed which
are missed from our list?
. What are the modules (including external) you would be interested in
stressing?
mod_disk_cache
mod_proxy
It would also be interesting to compare proxying for Tomcat, and test
mod_proxy_ajp vs mod_jk vs mod_proxy_http.
The effects of mod_deflate? (Gzip content compression... look at how
much server CPU it costs for how much less bandwidth it uses. Most servers
of static content are bandwidth limited, not CPU limited, so in most cases
it makes sense to use mod_deflate.)
I will take a look at these modules. Thank you.
Post by Sergey Ten
Additional feedback on the validity of our test plan, ranking modules
according to their importance, etc would also be much appreciated.
It would be cool to have more details on how you intend to do the
benchmark, including the configuration files, and what client
application...
We will make all our tests, including the configuration files (for both
server and clients), available to the open source community.
Paul Querna
2005-05-03 20:51:55 UTC
Post by Sergey Ten
Hello all,
SourceLabs is developing a set of tests (and appropriate workload data) to
perform stress testing of an Apache server using requests for static HTML
pages only. We are interested in getting feedback on our plans from the
Apache server community, which has a lot of experience in developing,
testing and using the Apache server.
Information available on the Internet, as well as our own experiments, make
it clear that stressing a web server with requests for static HTML pages
requires special care to avoid situations when either network bandwidth or
disk IO become a limiting factor. Thus simply increasing the number of
clients (http requests sent) alone is not the appropriate way to stress the
server. We think that use of a special workload data (including httpd.conf
and .htaccess files) will help to execute more code, and as a result, better
stress the server.
Which tools are you planning to use? Flood might be useful:
http://httpd.apache.org/test/flood/

I am not 100% sure on the purpose of your test. If you want to tune
apache for the highest performance, one of the first things is to remove
.htaccess files. Doing hundreds of mod_rewrite rules inside an
.htaccess file for example has a *huge* penalty.
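
For example, a per-directory rule like this (a contrived sketch) has to be
re-read and re-applied on every single request under that directory:

    # .htaccess -- parsed on *every* request
    RewriteEngine On
    RewriteRule ^old/(.*)$ /new/$1 [R=301,L]

whereas the same rule in httpd.conf is parsed once at startup. Multiply by
hundreds of rules and the difference is dramatic.
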
Post by Sergey Ten
We anticipate that our tests will cover 80% of the code (measured with code
coverage metrics) for the following Apache modules, chosen as the most
commonly used:
. mod_rewrite;
. mod_auth and mod_auth_ldap;
. mod_ssl.
Which mod_auth? File? DBM? Basic? Digest?

Different MPMs? Worker, Event, Prefork, Leader.. etc.

It would be nice if you could do 1.3, 2.0 and 2.1. Some things also
depend greatly on the OS... For the threaded MPMs on Linux, using a 2.6
kernel and NPTL can make a huge difference.
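
(For 2.x the MPM is picked at build time, e.g.:

    # build httpd 2.x with the worker MPM; prefork, event, etc. work the same way
    ./configure --with-mpm=worker
    make && make install

so comparing MPMs means separate builds.)
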
Post by Sergey Ten
. What are the stress scenarios where you have had problems?
Running out of RAM. If the machine runs out of RAM, performance will
tank. The reason many 'benchmarks' aren't realistic is that they do not
simulate real 'Internet' load. On the Internet you have a huge range of
clients, from 28.8 modems with 1 full second of latency, to DSL and
everything in between. Many of the benchmarks are done over 100mbit or
greater LANs, that do not accurately simulate how traffic hits a real
server.
Post by Sergey Ten
. What are the modules (including external) you have stressed which
are missed from our list?
. What are the modules (including external) you would be interested in
stressing?
In 2.x:
mod_disk_cache
mod_proxy

It would also be interesting to compare proxying for Tomcat, and test
mod_proxy_ajp vs mod_jk vs mod_proxy_http.

The effects of mod_deflate? (Gzip content compression... look at how
much server CPU it costs for how much less bandwidth it uses. Most servers
of static content are bandwidth limited, not CPU limited, so in most cases
it makes sense to use mod_deflate.)
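
Measuring that should only need a minimal config, something like (a sketch
for 2.x, compressing text types only since images are already compressed):

    LoadModule deflate_module modules/mod_deflate.so
    AddOutputFilterByType DEFLATE text/html text/plain text/css

and then comparing CPU use and bytes on the wire with and without it.
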
Post by Sergey Ten
Additional feedback on the validity of our test plan, ranking modules
according to their importance, etc would also be much appreciated.
It would be cool to have more details on how you intend to do the
benchmark, including the configuration files, and what client application...
Paul A. Houle
2005-05-04 15:30:39 UTC
Post by Sergey Ten
Hello all,
SourceLabs is developing a set of tests (and appropriate workload data)
to
perform stress testing of an Apache server using requests for static
HTML
pages only. We are interested in getting feedback on our plans from the
Apache server community, which has a lot of experience in developing,
testing and using the Apache server.
Although Apache is hardly the fastest web server, it's fast enough at
serving static pages that there are only about 1000 sites in the world
that would be concerned with its performance in that area...

Ok, there's one area where I've had trouble with Apache performance,
and that's in serving very big files. If you've got a lot of people
downloading 100 MB files via dialup connections, the process count can
get uncomfortably high. I've tried a number of the 'single process' web
servers like thttpd and boa, and generally found they've been too glitchy
for production work -- a lot of that may involve spooky problems like
sendfile() misbehavior on Linux.
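
(On Apache 2.0 at least you can take sendfile() out of the picture to test
that theory:

    # httpd.conf -- serve files via read()/write() instead of sendfile()
    EnableSendfile Off

at some throughput cost.)
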
Post by Sergey Ten
Information available on the Internet, as well as our own experiments,
make
it clear that stressing a web server with requests for static HTML pages
requires special care to avoid situations when either network bandwidth
or
disk IO become a limiting factor. Thus simply increasing the number of
clients (http requests sent) alone is not the appropriate way to stress
the
server. We think that use of a special workload data (including
httpd.conf
and .htaccess files) will help to execute more code, and as a result,
better
stress the server.
If you've got a big working set, you're in trouble -- you might be
able to get a factor of two by software tweaking, but the answers are:

(i) 64-bit (or PAE) system w/ lots of RAM.
(ii) good storage system: Ultra320 or Fibre Channel. Think seriously
about your RAID configuration.

Under most circumstances, it's not difficult to get Apache to
saturate the Ethernet connection, so network configuration turns out to
be quite important. We've had a Linux system that's been through a lot of
changes, and usually when we changed something, the GigE would revert to
half duplex mode. We ended up writing a script that checks that the GigE
is in the right state after boot completes and beeps my cell phone if it
isn't.
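
The script is nothing fancy -- roughly this, with the interface name and the
pager address obviously site-specific:

    #!/bin/sh
    # Page us if eth0 did not negotiate full duplex after boot.
    if ! ethtool eth0 | grep -q 'Duplex: Full'; then
        echo "eth0 fell back to half duplex" | mail -s "GigE alert" pager@example.com
    fi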

==================

Whenever we commission a new server we do some testing on the machine to
get some idea of what it's capable of. I don't put a lot of effort into
'realistic' testing, but rather do some simple work with ApacheBench.
Often the answers are pretty ridiculous: for instance, we've got a site
that ranks around 30,000 in Alexa that does maybe 10 hits per second at
peak times... We've clocked it doing 4000+ static hits per second w/
small files, fewer hits per second for big files because we were
saturating the GigE.
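
(Nothing exotic behind those numbers -- plain ApacheBench runs along the
lines of

    # 100 concurrent clients, 10000 requests for one small static page
    ab -n 10000 -c 100 http://www.example.com/index.html

with the concurrency and file sizes varied.)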

What was useful, however, was quantifying the performance effects of
configuration changes. For instance, the Apache documentation warns that
"ExtendedStatus On" hurts performance. A little testing showed the effect
was minor enough that we don't need to worry about it with our workload.

Similarly, we found we could put ~1000 rewriting rules in the httpd.conf
file w/o really impacting our system performance. We found that simple PHP
scripts ran about 10x faster than our CGIs, and that static pages are
about 10x faster than that.

We've found tactical microbenchmarking quite useful at resolving our
pub-table arguments about engineering decisions that affect Apache performance.

Personally, I'd love to see a series of microbenchmarks that address
issues like

* Solaris/SPARC vs. Linux/x86 vs. Mac OS X/PPC w/ different MPMs
* Windows vs Linux on the same hardware
* configuration in .htaccess vs. httpd.conf
* working set smaller/larger than RAM
* cgi vs. fastcgi vs. mod_perl
* SATA vs. Ultra320 SCSI for big working sets

and so on... It would be nice to have an "Apache tweakers guide" that
would give people the big picture of what affects Apache performance under
a wide range of conditions -- I don't really need precise numbers, just a
feel accurate to within half an order of magnitude or so.

It would be nice to have a well-organized website with canned numbers,
plus tools so I can do these benchmarks easily on my own systems.

===============

Speaking of performance, the most frustrating area I've dealt with is
performance of reverse DNS lookups. This is another area where the Apache
manual is less than helpful -- it tells you to "not do it" rather than
give constructive help in solving problems.
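
The advice in question boils down to:

    # httpd.conf -- the manual's "don't do it"
    HostnameLookups Off

    # ...and resolve addresses offline when processing the logs:
    logresolve < access_log > access_log.resolved

which is fine as far as it goes, but no help if you need names at request time.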

We had a server that had heisenbug problems running RHEL 3; things
stabilized with a 2.6 mainline kernel. In the process of dealing with
those problems, we developed diagnostic tools that picked up glitches in
our system that people wouldn't really notice during operations. (People
expect 'the internet' to be a little glitchy, so we don't get howls when
the system is sporadically unavailable for a minute.)

We found out our system was 'seizing up' and becoming unavailable for
about two minutes every three hours because our DNS provider reloads the
tables on our DNS servers around that time. We also found that nscd, with
out-of-the-box settings for RHEL 3, was making the problem worse,
because it was set to use 5 threads -- resolving 100 or so unique
addresses a minute, it's not hard to block all 5.
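
The relevant knobs live in /etc/nscd.conf; something along these lines
(values illustrative, not a recommendation) relieves the pressure:

    # /etc/nscd.conf -- more resolver threads, longer hosts caching
    threads 10
    enable-cache hosts yes
    positive-time-to-live hosts 3600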

Problems like this are obscure, and it would be nice to see them talked
about in an "Apache tweakers guide".
Dirk-Willem van Gulik
2005-05-04 19:51:14 UTC
Post by Paul A. Houle
Ok, there's one area where I've had trouble with Apache performance,
and that's in serving very big files. If you've got a lot of people
downloading 100 MB files via dialup connections, the process count can
get uncomfortably high. I've tried a number of the 'single process' web
My experience with dialup or satcom links is that this is mostly a function
of the OS and its TCP stack. Apart from upping the obvious settings, it can
help to disallow (or intentionally allow) large buffers at the OS level,
and/or to tune the moment lingering close starts so as to let Apache off the
hook again. The behavioural differences between BSD, Linux and Solaris are
enormous for this type of workload.
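
On Linux 2.6, for instance, the buffer knobs are the usual sysctls
(numbers purely illustrative):

    # /etc/sysctl.conf -- raise the ceilings for per-socket TCP buffers
    net.core.rmem_max = 262144
    net.core.wmem_max = 262144
    net.ipv4.tcp_rmem = 4096 87380 262144
    net.ipv4.tcp_wmem = 4096 65536 262144

with the equivalents spelled quite differently on the BSDs and Solaris.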

Dw
