Caching, practical caching.

Having managed a few sites, and a few servers before that, I have spent most of my time online on socializing, information security, server administration (security, optimization), and others. 'Others' probably makes up the bigger portion, but the point is that one part of server administration is optimization: making software work better for our particular case.

One major part of optimization is caching. A good example is a reverse proxy, which sits in front of your server and watches all requests passing through. Static content is cached on the first request, and subsequent requests are served from the cache. Squid used to be a good option for this. Then nginx came along and proved itself a better candidate, with a small memory footprint and a better ability to handle static content and large numbers of requests. Remi from WebFaction has done a benchmark in which nginx served static content at 10k requests per second. Cool!
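To make the miss-then-hit behaviour concrete, here is a minimal sketch of what a caching reverse proxy does with static content. It is a toy model, not how squid or nginx are implemented: the class name, the `origin` function and the path are all made up for illustration, and there is no expiry or size limit.

```python
from typing import Callable, Dict, Tuple


class ReverseProxyCache:
    """Toy model of a caching reverse proxy in front of an origin server."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # hypothetical call to the real server
        self._store: Dict[str, bytes] = {}   # cached responses, keyed by path
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> Tuple[bytes, str]:
        # Served from the cache: the origin server is not contacted at all.
        if path in self._store:
            self.hits += 1
            return self._store[path], "HIT"
        # First request for this path: ask the origin, then remember the answer.
        self.misses += 1
        body = self._fetch(path)
        self._store[path] = body
        return body, "MISS"


origin_calls = []


def origin(path: str) -> bytes:
    """Stand-in for the backend web server; records every request it receives."""
    origin_calls.append(path)
    return f"contents of {path}".encode()


proxy = ReverseProxyCache(origin)
first = proxy.get("/static/logo.png")    # goes to the origin
second = proxy.get("/static/logo.png")   # answered from the cache
```

After the two calls, `origin_calls` holds a single entry: the second request never reached the backend, which is the whole point of putting a cache in front of it.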

A few months ago I read about Varnish, but I did not look into it much further until a few weeks ago, when someone published a case study on how Varnish helped them with static content caching. I have run some tests on Varnish, and it is not that easy to deploy in a way that makes the best use of its capabilities. It needs to be tuned to your application, because you have to tell it what to cache and what not to. On top of that, Varnish is sensitive to cookies: by default it will not cache requests that carry them, so you need to manage all the cookies as well. In an environment with unpredictable application deployments, such as a hosting company, it will not be as effective as a configuration dedicated to a single application. That's the compromise you have to make. Still, it is a good option to have, at least to reduce the number of requests reaching your server.
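The kind of decision you end up encoding when tuning a cache for one application can be sketched as below. This is plain Python, not Varnish's own configuration language, and the extension list, cookie prefixes and return labels are assumptions chosen for illustration; a real deployment would express the same rules in the cache's configuration.

```python
from typing import Dict

# Assumption: which file types count as "static" for this application.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".ico")
# Example cookie name prefixes that mark a logged-in or session-bound visitor.
SESSION_COOKIE_PREFIXES = ("wordpress_logged_in", "PHPSESSID")


def parse_cookies(header: str) -> Dict[str, str]:
    """Split a Cookie header such as 'a=1; b=2' into a dict."""
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:
            cookies[name] = value
    return cookies


def cache_decision(method: str, path: str, cookie_header: str = "") -> str:
    """Return 'pass', 'cache' or 'cache-strip-cookies' for a request."""
    # Only safe, read-only methods are candidates for caching.
    if method not in ("GET", "HEAD"):
        return "pass"
    # Static files are identical for everyone, so cookies can be ignored.
    if path.split("?", 1)[0].lower().endswith(STATIC_EXTENSIONS):
        return "cache-strip-cookies"
    # Pages for visitors with a session must come from the application.
    cookies = parse_cookies(cookie_header)
    if any(name.startswith(SESSION_COOKIE_PREFIXES) for name in cookies):
        return "pass"
    return "cache"
```

For example, `cache_decision("GET", "/style.css?ver=2", "PHPSESSID=abc")` returns `"cache-strip-cookies"`, while the same cookie on a request for `/` returns `"pass"`. The rules differ per application, which is why a shared hosting setup is harder to tune than a single site.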

A Content Delivery Network, or CDN, also caches the static content it serves. The whole idea of a CDN is caching: visitors retrieve the same content from the nearest node, which reduces the load on the main server. There are a few CDN options nowadays. Cloudflare provides an easy interface for getting started. You just need to change the NS records of your domain, and you can then use the other features Cloudflare offers, such as DDoS protection and an application firewall. Other options are MaxCDN and Aflexi. Aflexi offers CDN software that lets anyone start offering a CDN service to their clients. You can sign up for one from Exabytes, which offers a CDN powered by Aflexi.

On the application side, if you're using WordPress, there are plugins that do caching, such as wp-cache and wp-supercache. I personally prefer wp-cache, which worked for me the last time I tried it. One thing to note is that wp-cache caches the whole page generated by WordPress and keeps it for a predetermined duration, set in its configuration section. Besides that, Jeff Starr has written an article on how to make WordPress faster, basically by hardcoding some internal values in the wp-config.php file, which lets WordPress skip the database queries for that information.

For example, defining blog and site URL:
define('WP_HOME', ''); // blog url
define('WP_SITEURL', ''); // site url

Hardcode the stylesheet and template paths:
define('TEMPLATEPATH', '/absolute/path/to/wp-content/themes/H5');
define('STYLESHEETPATH', '/absolute/path/to/wp-content/themes/H5');

And define the secret keys that protect internal data in WordPress. You can generate them from the secret-key service.

If you are a programmer, consider using memcached. Memcached is a lightweight in-memory object caching server that stores object data from your application so it can be retrieved faster later. It was developed by LiveJournal and has powered their web operation to this day. The LiveJournal setup is detailed in the article Distributed Caching with Memcached, by Brad Fitzpatrick. The article describes technically how memcached works and how it scales to a site-wide deployment. You will see why the same web server can also host memcached, possibly more than one instance, and how your application makes use of the whole memcached cluster. In LiveJournal's case, they had 28 instances of memcached running, holding 30GB of popular data.
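How an application spreads its keys over several memcached instances, and how it falls back to the database on a miss, can be sketched roughly as follows. To keep the example self-contained, `FakeMemcachedNode` is an in-process stand-in, not a real memcached client, and the key-to-node mapping is a simple hash modulo the number of nodes. Real client libraries may use other schemes, such as consistent hashing. All names here are hypothetical.

```python
import zlib
from typing import Callable, Dict, List, Optional


class FakeMemcachedNode:
    """In-process stand-in for one memcached instance (not a real client)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class Cluster:
    """Treats several cache nodes as one big cache."""

    def __init__(self, nodes: List[FakeMemcachedNode]) -> None:
        self.nodes = nodes

    def node_for(self, key: str) -> FakeMemcachedNode:
        # A stable hash, so the same key always maps to the same node.
        index = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return self.nodes[index]

    def get_or_compute(self, key: str, compute: Callable[[], str]) -> str:
        node = self.node_for(key)
        cached = node.get(key)
        if cached is not None:
            return cached          # cache hit: no database work
        value = compute()          # cache miss: do the expensive work once
        node.set(key, value)
        return value


cluster = Cluster([FakeMemcachedNode(f"cache{i}") for i in range(3)])
db_queries = []


def load_profile() -> str:
    """Stand-in for a slow database query."""
    db_queries.append("user:42")
    return "profile of user 42"


a = cluster.get_or_compute("user:42", load_profile)   # runs the query
b = cluster.get_or_compute("user:42", load_profile)   # served from the cache
```

Because each key lives on exactly one node, adding instances adds capacity rather than copies, which is how a pool of small instances can hold tens of gigabytes between them.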

One more thing: install this PHP script to monitor your memcached cluster. Written by Harun Yayli, the script is password protected and lets you view information for each configured memcached instance: hit rate, miss rate, uptime, version and data size.
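If you want to compute the hit rate yourself, it can be derived from the `get_hits` and `get_misses` counters that memcached reports in its `stats` output. The sketch below is not taken from the script mentioned above. It assumes the counters have already been fetched into a dict, and the sample numbers are invented.

```python
from typing import Mapping, Union


def hit_rate(stats: Mapping[str, Union[str, int]]) -> float:
    """Fraction of get requests answered from the cache (0.0 when idle)."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


# Invented sample counters for one instance; values arrive as strings.
sample = {"get_hits": "900", "get_misses": "100"}
rate = hit_rate(sample)
```

A low hit rate usually means the cache is too small for the working set or the application is caching the wrong things.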
