Kazuho at Work: September 2007 Archives


September 27, 2007

KeyedMutex - a mutex for web services

Yesterday, I wrote:

Normally, a cache entry has a single lifetime. The problem is that if a cache entry is read frequently and takes time to update, a situation known as the thundering herd occurs on expiration; i.e. many cache consumers detect the expiration and send identical update requests to the backend, causing a performance decline. There are two solutions to the problem:
  • use a proxy that combines identical update requests as a single request
  • initiate an update request prior to expiration, when the remaining lifetime falls below a certain threshold

Kazuho at Work: Swifty 0.05 and the Thundering Herd

As a complement to the second approach I took in Swifty 0.05, I wrote a tiny server that enforces exclusive access to databases. It is not actually a proxy; rather, it works much like an interprocess mutex, with a slightly different interface tailored to the realities of web services. Here is some sample code.

# start the server in shell
% keyedmutexd >/dev/null &
# perl source code
use KeyedMutex;

my $km = KeyedMutex->new;
...
until ($value = $cache->get($key)) {
    if ($km->lock($key)) {
        # locked, read from DB
        $value = get_from_db($key);
        $cache->set($key, $value);
        $km->release;
        last;
    }
}

As you can see, KeyedMutex is very simple to use. A call to the lock function either returns true immediately, telling you that the lock for the given key has been acquired, or blocks and returns false after some other client has stored the result to the cache.

This kind of approach is not always necessary, but is essential for services that need to issue heavy queries to the backend.
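The access pattern can be sketched in plain Perl with in-memory stand-ins for the cache and for keyedmutexd, so it runs without the server; the names lock_key, release_key, and get_from_db are illustrative, not part of any real API:

```perl
use strict;
use warnings;

# In-memory stand-ins so the sketch runs without keyedmutexd or a real cache.
my %cache;
my $holder;  # key currently locked, if any

sub lock_key    { my $key = shift; return 0 if defined $holder; $holder = $key; 1 }
sub release_key { undef $holder }
sub get_from_db { my $key = shift; "value-for-$key" }  # stands in for a heavy query

sub cached_get {
    my $key = shift;
    my $value;
    until (defined($value = $cache{$key})) {
        if (lock_key($key)) {
            # we hold the per-key lock: hit the backend exactly once
            $value = $cache{$key} = get_from_db($key);
            release_key();
            last;
        }
        # with the real KeyedMutex, lock() blocks here until the holder
        # releases, after which the loop re-reads the now-populated cache
    }
    return $value;
}

print cached_get("user:42"), "\n";  # prints "value-for-user:42"
```

The point of the pattern is that only the lock holder ever queries the backend; every other consumer waits on the lock and then serves the freshly cached value.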

September 26, 2007

Swifty 0.05 and the Thundering Herd

This is the release announcement of swifty 0.05.

Changes from previous version (0.03) are:

  • added code to support PowerPC and SPARC (not tested at all)
  • added an internal checksum routine to guarantee the integrity of cached data
  • added refresh_before property

About the refresh_before property

Normally, a cache entry has a single lifetime. The problem is that if a cache entry is read frequently and takes time to update, a situation known as the thundering herd occurs on expiration; i.e. many cache consumers detect the expiration and send identical update requests to the backend, causing a performance decline. There are two solutions to the problem:

  • use a proxy that combines identical update requests as a single request
  • initiate an update request prior to expiration, when the remaining lifetime falls below a certain threshold

In version 0.05, as I did in Cache::Adaptive, I implemented the latter approach*1 to let applications start updating cache entries prior to their expiration. When the refresh_before property is set to a nonzero value, swifty sets the do_refresh flag the first time a cache entry is read after its remaining lifetime drops below refresh_before.

The application logic can check whether the flag is set, and start updating the cache entry. Here's the example code.

my $swifty = Cache::Swifty->new({
    dir            => ...,
    lifetime       => 300, # cache lifetime is 5 minutes
    refresh_before => 10,  # one of the consumers will be notified when remaining lifetime < 10
});

(snip)

$entry = $swifty->get(...);
if ($swifty->do_refresh) {
    # start updating the cache entry
}
*1: The best way to solve the problem depends on the application, but there is nothing a cache system can do to help when the former approach is taken.
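The timing rule can be modeled in a few lines of plain Perl. This is a conceptual sketch of the decision only, not swifty's actual implementation: classify_entry is a made-up name, and the real swifty sets do_refresh only for the first reader after the threshold is crossed, bookkeeping this sketch omits.

```perl
use strict;
use warnings;

# Conceptual model of the expiration / refresh_before decision:
# given an entry's age, classify it as fresh, due for refresh, or expired.
sub classify_entry {
    my ($age, $lifetime, $refresh_before) = @_;
    return 'expired'    if $age >= $lifetime;
    # remaining lifetime below the threshold: notify a consumer to refresh
    return 'do_refresh' if $lifetime - $age < $refresh_before;
    return 'fresh';
}

# with lifetime => 300 and refresh_before => 10, as in the example above:
print classify_entry(100, 300, 10), "\n";  # prints "fresh"
print classify_entry(295, 300, 10), "\n";  # prints "do_refresh"
print classify_entry(301, 300, 10), "\n";  # prints "expired"
```

The benefit of the middle state is that the entry is still served from cache while one consumer rebuilds it, so expiration never becomes a stampede.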

September 18, 2007

Swifty 0.03

This is the release announcement of swifty version 0.03 and its perl binding Cache-Swifty 0.03.

To install swifty, run configure, make, and make install. To install Cache-Swifty, run perl Makefile.PL, make, and make install. Changes from the previous version are:

  1. changed the cache model from direct-mapped to n-way set-associative (max. n is 64)
  2. bug fixes
  3. optimizations in perl binding

Below are the performance numbers on an Opteron server running CentOS 5. The left column lists the cache modules used. Swifty_direct shows the numbers when calling the XS interface of Swifty directly. Hash is an OO-style wrapper for Perl's ordinary hash. Interesting points are:

  • Cache::FastMmap is comparatively faster than on Mac OS X
  • on reads, swifty_direct is only 19% to 28% slower than Perl's ordinary hash

I wonder if someone could rewrite the entire Cache::Swifty interface in C, so that we could see similar performance when using swifty through an OO interface.

$ perl -Iblib/lib -Iblib/arch -Ilib benchmark/all.pl 500 hash Cache::FastMmap Cache::Swifty swifty_direct

Write (16 bytes):
                  Rate Cache::FastMmap Cache::Swifty swifty_direct          hash
Cache::FastMmap 13.1/s              --          -90%          -94%          -97%
Cache::Swifty    133/s            919%            --          -40%          -74%
swifty_direct    223/s           1605%           67%            --          -56%
hash             510/s           3798%          283%          129%            --

Read  (16 bytes):
                  Rate Cache::FastMmap Cache::Swifty swifty_direct          hash
Cache::FastMmap 41.6/s              --          -80%          -91%          -93%
Cache::Swifty    205/s            392%            --          -57%          -65%
swifty_direct    476/s           1044%          132%            --          -19%
hash             588/s           1313%          187%           24%            --

Write (1024 bytes):
                  Rate Cache::FastMmap Cache::Swifty swifty_direct          hash
Cache::FastMmap 12.6/s              --          -90%          -94%          -97%
Cache::Swifty    125/s            893%            --          -39%          -70%
swifty_direct    207/s           1541%           65%            --          -50%
hash             413/s           3183%          231%          100%            --

Read  (1024 bytes):
                  Rate Cache::FastMmap Cache::Swifty swifty_direct          hash
Cache::FastMmap 38.6/s              --          -79%          -91%          -93%
Cache::Swifty    183/s            375%            --          -56%          -68%
swifty_direct    413/s            972%          126%            --          -28%
hash             575/s           1391%          214%           39%            --

September 12, 2007

Swifty-0.02 and Perl Binding

This is the announcement of swifty version 0.02 and its perl binding Cache-Swifty 0.02.

To install swifty, run configure, make, and make install. To install Cache-Swifty, run perl Makefile.PL, make, and make install. Changes from the previous version are:

  1. added configure script
  2. expiration handling by modification time and a global lifetime property
  3. introduction of perl binding

As part of the perl binding, I created a benchmark comparing swifty and memcached (the benchmark script is included in the perl binding).

$ perl -Iblib/lib -Iblib/arch -Ilib benchmark/memcached.pl

Write (16 bytes):
                   Rate Cache::Memcached    Cache::Swifty
Cache::Memcached 6.96/s               --             -94%
Cache::Swifty     108/s            1444%               --

Read  (16 bytes):
                   Rate Cache::Memcached    Cache::Swifty
Cache::Memcached 3.87/s               --             -99%
Cache::Swifty     291/s            7420%               --

Write (1024 bytes):
                   Rate Cache::Memcached    Cache::Swifty
Cache::Memcached 6.88/s               --             -93%
Cache::Swifty     101/s            1375%               --

Read  (1024 bytes):
                   Rate Cache::Memcached    Cache::Swifty
Cache::Memcached 3.77/s               --             -99%
Cache::Swifty     281/s            7357%               --
(displayed rates are in units of 1,000 iterations per second)

It is difficult to draw a single conclusion from this benchmark, since the design goals of the two pieces of software are quite different (see my previous post for the design of swifty). For example, swifty concentrates on speeding up a shared-memory cache on a single server, while memcached is a distributed system. But looking at the numbers, swifty seems to have a clear advantage in single-server systems. It also seems logical to use swifty as a frontend for memcached.

05:06pm added: Benchmark against Cache::FastMmap. Which cache module is better would depend on the cache access pattern and on how high the penalty of a cache miss is.

$ perl -Iblib/lib -Iblib/arch -Ilib benchmark/all.pl

Write (16 bytes):
                  Rate Cache::FastMmap   Cache::Swifty
Cache::FastMmap 8.63/s              --            -92%
Cache::Swifty    108/s           1157%              --

Read  (16 bytes):
                  Rate Cache::FastMmap   Cache::Swifty
Cache::FastMmap 36.5/s              --            -88%
Cache::Swifty    296/s            709%              --

Write (1024 bytes):
                  Rate Cache::FastMmap   Cache::Swifty
Cache::FastMmap 8.57/s              --            -92%
Cache::Swifty    104/s           1113%              --

Read  (1024 bytes):
                  Rate Cache::FastMmap   Cache::Swifty
Cache::FastMmap 35.5/s              --            -87%
Cache::Swifty    282/s            695%              --