2007.01_Ready for Traffic-Tips for Optimizing Apache, Postfix, Oracle, Mysql, and Samba.pdf

Ready for Traffic - Tips for optimizing Apache, Postfix, Oracle, MySQL, and Samba - Linux Magazine
Tips for optimizing Apache, Postfix, Oracle, MySQL, and Samba
Ready for Traffic
Your homepage was just linked by Slashdot, a new email campaign goes out tonight, and you need the
database to deliver survey results. We'll show you how to help your servers survive the strain.
By Badran Farwati, Peer Heinlein, Ralf Hildebrandt, Charly Kühnast, and Volker Lendecke
www.fotolia.de, andyb1126
Server optimization is a question of survival: a server's efficiency drops when the load increases. And if
incoming requests push the load too high, the system could crash. To avoid bringing down critical services,
you need to be proactive. Faster hardware might not be an option, but a better configuration and some simple
optimization steps might help. In this article, five experts provide an inside look at how to maximize
throughput on your web, mail, database, and file servers.
A Healthy Start
No matter what kind of servers you manage, there are a few basic rules that every system administrator should
know:
1. Pay attention to the mix! One factor often determines the overall system performance: CPU, I/O, or
memory usage. It makes sense to distribute services to avoid running two services with the same
performance-defining factor on the same server (Figure 1).
Figure 1: Two services with completely different CPU cycle, memory, and I/O load requirements leverage the
power of a server far better than applications that compete for resources.
For example, a mail server's speed is typically restricted by disk and network latency. Of course, the integrated
virus scanner will keep the CPU busy on some servers, but if not, the CPU on a server that does nothing but
distribute mail should have plenty of time to handle other tasks.
2. Monitor everything! The only way of discovering the key performance factors is to monitor operations over
an extended period; this will give you values for the CPU, I/O, or memory usage during normal operations.
vmstat and the disk reports in sar show disk throughput; top, htop, uptime, or sar can help you monitor the
CPU; ps, top, or sar can help you track memory consumption.
You can extrapolate a full load scenario from the values for normal load. SNMP-based monitoring (using
Nagios, for example) will warn you of imminent disaster. After identifying the bottlenecks, don't forget to
disable unneeded logging.
3. Excessive swapping is every system administrator's nightmare. Many back-end services let you configure
an upper threshold for the maximum number of active instances. As swapping memory in and out drastically
impacts other disk access, setting these thresholds too high will leave more and more processes waiting for the
system to handle their I/O requests, which in turn leads to even more swapping - a vicious circle.
All the services taken together should not be allowed to spawn more instances than the server machine's
physical memory can hold.
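The rule above boils down to simple arithmetic. The following is only a rough sketch: the function name and the figures (2048 MB of RAM, 512 MB held back for the kernel, the page cache, and other daemons, 12 MB per instance) are assumptions for illustration, and you should replace them with values you have measured on your own server.

```python
def max_instances(total_ram_mb, reserved_mb, per_instance_mb):
    """Return how many service instances fit into physical RAM without swapping."""
    if per_instance_mb <= 0:
        raise ValueError("per_instance_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Never report a negative limit if the reserve already exceeds the RAM.
    return max(0, int(usable_mb // per_instance_mb))


# Hypothetical server: 2048 MB RAM, 512 MB reserved, 12 MB per instance.
limit = max_instances(2048, 512, 12)
print(limit)
```

If several services share the machine, the sum of their limits has to stay within the same budget.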
RAM or Disk?
4. Never waste your server's memory. Many back-end services can be streamlined; make sure you only load
the modules you need, or rebuild packages to match your needs. This gives you more space for instances in
RAM. By default, Apache will typically load a bunch of unnecessary modules; if you have Postfix, you might
consider building binaries without MySQL, TLS, or LDAP support. The memory footprint of CDB is much
smaller than that of the Berkeley DB. With read-only maps, changing the map type will often save memory.
5. Clearly separate partitions for data, system, and logs! Separate partitions give you the ability to select the
best filesystem for the job (for example, Ext3 for the system partition, and XFS for the data partition). Some
disk or RAID systems avoid hard disk thrashing in case of competing, concurrent access.
6. Log the actions your services perform! Without logs, you will not have any data to analyze or troubleshoot.
A minus sign (-) in front of the logfile name in /etc/syslog.conf enables asynchronous writing and reduces the
load on the filesystem.
Keep Safely
7. Back up your current configuration before you make a change. Small changes might affect your server's
performance in a completely unexpected way, and finding out why can take up a lot of your valuable time.
A system administrator should thus use version control for configuration files, or at least create a backup
before implementing changes. This gives you the ability to react if customers complain of sudden
performance hits.
8. Store everything that doesn't change in a cache! Caches can be a big help in many situations: reverse
proxies (such as Squid) upstream of database-driven content management systems can reduce the load on the
database. Caching DNS servers (such as dnscache or BIND) spare log analyzers and mail servers from repeating
the same DNS lookups against remote nameservers. The internal cache in the Amavisd-new virus scanner prevents
repeated analysis of the same content.
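The caching principle is the same at any scale. As a small illustration, here is a sketch of an in-process cache for reverse DNS lookups, as a log analyzer might use it. The function name make_cached_resolver is made up for this example, and unlike a real caching nameserver the sketch ignores TTLs, so it is only suitable for short-lived batch jobs.

```python
import socket


def make_cached_resolver(lookup=None):
    """Wrap a reverse-lookup function so that each IP address is resolved only once."""
    if lookup is None:
        # Default: ask the system resolver for the primary hostname.
        lookup = lambda ip: socket.gethostbyaddr(ip)[0]
    cache = {}

    def resolve(ip):
        if ip not in cache:
            try:
                cache[ip] = lookup(ip)
            except OSError:
                # Cache failures too, so dead addresses are not retried.
                cache[ip] = ip
        return cache[ip]

    return resolve
```

Calling resolve = make_cached_resolver() once and then resolve(ip) for every log line means that a client with thousands of hits costs a single lookup.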
Use a Doorkeeper
9. Get rid of uninvited guests as early as possible! Users without access to the server can't create load on the
server. A firewall, access controls, and smtpd_*_restrictions in Postfix send uninvited guests packing before they
have a chance to generate unwanted system load. The Anvil server in Postfix [1] will additionally restrict the
number of messages the server accepts over a unit of time to a level that will prevent performance loss due to
queueing. In a similar way, Cband [2] supports bandwidth limits on the Apache web server.
10. Knock softly! Port knocking is a resource-saving way of keeping the firewall completely tight but still
allowing trusted users to log on. Measures such as using one-time passwords, or relocating services such as
SSH to unknown ports, are good for security, and they'll prevent uninvited guests from stressing your CPU.
Figure 2: Classic crash scenario: if the number of processes under load increases so drastically that the
machine starts to swap, the drop in throughput will increase the load even more.
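To catch the swap scenario early, it helps to watch swap usage alongside the other values you monitor. The following sketch parses text in the format of /proc/meminfo on Linux; the function names are chosen for this example, and any alert threshold you build on top of them is your own call.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name to integer value.

    Most fields are reported in kB; a few (such as HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_used_kb(meminfo):
    """Return the amount of swap currently in use, in kB."""
    return meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)
```

On a live system, you would feed it the file contents, for example with open("/proc/meminfo").read(), and log the result at regular intervals.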
Web Servers
If your URL is published on a high-profile site, you can expect a dramatic increase in visitors. The following
steps will help your servers handle the load.
1. Take care to select the right multi-processing module! The prefork MPM forks a number of identical
Apache processes and is best suited to machines with up to two CPUs. The more CPUs your web server has,
the more likely it is that the worker MPM, which uses multiple threads per process, is the better choice.
2. Make good use of the cache! Apache has two mechanisms, mod_disk_cache and mod_mem_cache , for
caching frequently requested content. If you have a lot of RAM (and, after all, there is no replacement for
RAM), mod_mem_cache [3] is your best option.
Ditch the Ballast
3. Ditch your ballast! The .htaccess mechanism may be useful, but it is also a performance killer. So, get rid of
it if you don't need it. AllowOverride None will remove the need for time-consuming parsing of .htaccess.
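4. Ditch even more ballast! Sysadmins will also want to spare Apache its symlink checks (set Options
FollowSymLinks and avoid SymLinksIfOwnerMatch; otherwise, Apache issues an extra lstat call for every path
component) and remove any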
modules they don't need. The perfect solution is to build a static version of Apache with everything you need,
and not to load any modules at all at runtime.
5. Do without lookups! Hostname lookups will slow down even the fastest web server. HostnameLookups Off
removes the bottleneck. If you really need this information, you can perform any lookups you need later when
you review your logs with a tool such as Webalizer.
6. Honor your clients, and don't make them wait. The MaxClients directive is critical for web performance. If
you set too low a value for MaxClients, clients will be forced to wait in the TCP queue; if the value is too
high, the server starts to swap, and no client will be serviced in a timely fashion. The only way to discover
the right value is load testing.
7. Get rid of any logfiles you don't need! Logging costs time. Even a single logfile that nobody needs is one
too many. If you log on external disks, make sure you use an extremely fast SAN; NFS will tend to be a
bottleneck.
8. Always use sendfile! Sendfile is a system call that delegates the copying of a file to a network socket to the
kernel. This saves memory (by doing without a read buffer), and is quicker at the same time. Apache will use
sendfile if you set EnableSendfile On.
9. Be aware of MMAP! MMAP support (via the mod_mmap_static module in Apache 1.3, or mod_file_cache in
Apache 2.x) gives Apache the ability to access files as if they were a contiguous region of memory, which in
turn is good for performance.
10. Don't use internal server monitoring! Apache's self-monitoring ability ( SetHandler server-status... ) is
useful for tests and debugging, but make sure you disable it after completing your tests.
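Tip 6 leaves the right MaxClients value to load testing. Dedicated load testing tools are the usual choice, but the idea fits into a few lines. This is a minimal sketch, not a benchmark suite: the function names are invented for this example, and the percentile calculation is deliberately rough.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_fetch(url, timeout=10):
    """Return a zero-argument callable that performs one GET request for url."""
    def fetch():
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    return fetch


def load_test(fetch, total_requests, concurrency):
    """Call fetch total_requests times on `concurrency` threads; return sorted latencies in seconds."""
    def timed(_):
        start = time.perf_counter()
        fetch()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return sorted(pool.map(timed, range(total_requests)))


def percentile(sorted_values, fraction):
    """Rough percentile of a non-empty, sorted list (no interpolation)."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]
```

A run against a test machine could look like load_test(http_fetch("http://localhost/"), 200, 20), where the URL and the numbers are placeholders. Repeat the run with rising concurrency and note where the 95th percentile starts to climb; that is a hint that you have passed the value the server can sustain.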
Figure 3: As applications such as spam or virus filters on mail servers need constant access to files, it typically
makes sense to invest in a RAM disk for the filter software working directory on the server.
Mail Servers
If your mail server is threatening to buckle under the load, you definitely need to sort out your priorities. First
of all, you have to make sure that the system stays stable and works effectively despite the heavy load. Don't
even think of optimizing for more speed until you have achieved stability. Some practical tips will help you
manage the spikes.
1. Limit the number of instances! By default, Postfix allows up to 100 instances of each service (the
default_process_limit parameter; the maxproc column in master.cf overrides it for individual services).
Depending on the version and built-in capabilities, an instance can consume about 3 MB of RAM,
quickly leading to an out-of-memory condition on a server with restricted resources, and thus to a system
crash. Spam and virus filters also increase memory requirements if you have multiple parallel instances
running.
If swapping is slowing your server down, it makes sense to reduce the number of instances. Many parallel
instances on an overloaded system will just interfere with one another, severely affecting the data throughput.
2. Help the spam filter with a RAM disk! Spam and virus filters on mail servers, such as Amavisd-new or
SpamAssassin, often cause bottlenecks. They generate a heavy CPU and I/O load, and thus impact the total
throughput. In this case, moving /var/spool/amavis/tmp to a RAM disk might help. The improved
performance means that the server can now handle 14 instances instead of the seven recommended by the
vendor [4].
Avoid Roundabout Routes
3. Cache DNS requests! Mail servers rely on DNS, and they have to handle countless requests. Which are the
MX servers in the domain? Does the sender's domain really exist? Is the client on your RBL list? A caching
DNS server entry in /etc/resolv.conf will save valuable milliseconds in high volume scenarios.
4. Avoid roundabout routes! The typical approach is to forward messages from Postfix to the spam and virus
filters, which in turn hand them back to Postfix. Add more rounds if you use other appliances.
It is preferable to avoid handing messages from virus or spam filters back to Postfix and to pass them on to the
next appliance instead. If the mail chain handles only incoming mail, the last appliance in the chain can
forward the emails directly to an internal mail server instead of handing them back to Postfix.
Check Responsibilities
5. Only respond if it is your responsibility! If you accept mail that you can't deliver, you are forced to bounce
and return the messages; this is a clear waste of resources. Use local_recipient_maps and
relay_recipient_maps to tell Postfix to accept mail for existing accounts only. This avoids unnecessary load
when spammers are just trying out addresses.
The same thing applies to source addresses: if the sender domain given in the envelope does not exist, the
message is almost certainly spam, and there is no way you can respond. To improve total performance,
reject_unknown_sender_domain performs a DNS lookup to validate the source domain before the server
accepts the message.
6. Use only local files in Postfix lookup tables! No matter how convenient MySQL or LDAP-based user or
domain management may be, the effect it has on Postfix performance is negative. A lookup table in hash or
preferably btree format is far quicker. It makes sense to write a script that exports the updated user data from
the MySQL or LDAP table to a local file on the server every thirty minutes.
7. Keep pesky clients at bay with rate limiting! If you discover that an individual client is overtaxing the mail
server, or if an attack is in progress, rate limiting via the smtpd_client_connection_rate_limit parameter will
prevent this from affecting your mail traffic. A firewall can restrict the maximum connection count, and thus
keep attackers on compromised clients from opening hundreds of connections at once.
8. Don't waste time with problematic mails! If you have an outgoing mail traffic jam, this could be due to
Postfix wasting resources on a large volume of undeliverable mail. maximum_backoff_time sets the longest
interval that Postfix will wait between attempts to redeliver a deferred message. Increasing this value gives
you more cycles to complete a first
try, instead of launching into a series of repeats that are likely to fail. As an alternative, you could set the
fallback_relay parameter, which swaps problematic messages out to another machine that does the dirty work
for the mail server.
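The export script from tip 6 could look something like the following sketch. It covers only the file-writing half: fetching the entries from MySQL or LDAP is left out, and the function name, the example path, and the 0644 file mode are assumptions for this illustration. The file is written to a temporary name first and then renamed, so a half-written table is never visible under the final name.

```python
import os
import tempfile


def write_postfix_map(path, entries):
    """Atomically write 'key value' lines, the text format that postmap compiles."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".map-")
    try:
        with os.fdopen(fd, "w") as handle:
            for key in sorted(entries):
                handle.write(f"{key} {entries[key]}\n")
        # mkstemp creates the file with mode 0600; assume the table should be world-readable.
        os.chmod(tmp_path, 0o644)
        # Rename within the same directory, so the switch to the new file is atomic.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

A cron job could call write_postfix_map("/etc/postfix/relay_recipients", {"alice@example.com": "OK"}) with the freshly exported entries and then run postmap hash:/etc/postfix/relay_recipients to rebuild the lookup table.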
Database Servers
The quality of the SQL queries, the design of the database, and the server configuration can considerably
influence database performance. The following tips will help you boost your database server's performance.
1. Choose the right indexes! The indexes are one of the most important things about a database; the server's
response times depend to a great extent on their quality. A B*TREE index (the default index type for many
databases) should be used if the indexed column can hold many different values. The search tree for this index
type will grow more slowly than any other. For columns with just a few different values (such as product
groups), a bitmap type index is preferable for Oracle, or a similar type for other databases.
For tables with just a few rows, TABLE ACCESS (FULL) (or FULL TABLE SCAN in MySQL) will be faster
than index-based access. If many queries use functions such as UPPER(column xyz) , an index of the function
results will give you improved performance, assuming the database engine you are using supports
function-based indexes [5].
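You can watch the effect of a function-based index without touching a production database. The sketch below uses SQLite through Python's sqlite3 module purely as a stand-in, because it also supports indexes on expressions; the table, the index name, and the sample rows are invented for the example, and Oracle or MySQL will report their execution plans in a different format.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the detail column of EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)",
    [("Smith",), ("smith",), ("Jones",)],
)

query = "SELECT id FROM customers WHERE UPPER(name) = 'SMITH'"

# Without an index on the expression, the engine has to scan the whole table.
plan_before = query_plan(conn, query)

# Index the function result; the expression must match the one used in the query.
conn.execute("CREATE INDEX idx_customers_upper_name ON customers (UPPER(name))")
plan_after = query_plan(conn, query)

matches = [row[0] for row in conn.execute(query + " ORDER BY id")]
print(plan_before)
print(plan_after)
```

The first plan reports a table scan, whereas the second names the new index. On a table with three rows, the scan would of course still be faster, as noted above; the index pays off only as the table grows.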
2. Always delete unneeded indexes! The Oracle Optimizer does not use indexes that are not required by a
statement. But no matter whether an SQL statement uses them or not, the SQL engine still has to maintain every
single index that you define for a table whenever the table's data changes. This costs I/O resources and CPU load.
3. Avoid fragmented indexes! B*TREE indexes are prone to fragmentation over time, due to table updates or
inserts, and this really slows your queries down. You can issue an ANALYZE INDEX index name VALIDATE
STRUCTURE in Oracle to discover the fragmentation status. ALTER INDEX index name REBUILD ONLINE