Showing posts from November, 2009

Compiling Python 2.6 with sqlite3 support

Quick note to self, hopefully useful to others too:

If you compile Python 2.6 (or 2.5) from source and want to enable sqlite3 support (which is included in the stdlib for 2.5 and above), you need to pass a special USE flag on the configure command line, like this:

./configure USE="sqlite"

(note "sqlite" and not "sqlite3")
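Once the build finishes, a quick way to confirm that the new interpreter really has sqlite3 support is to run a small check like the one below. This is only a sketch: the function name `sqlite3_available` is my own (hypothetical), and it uses nothing beyond the stdlib `sqlite3` module and an in-memory database.

```python
def sqlite3_available():
    """Return True if this interpreter was built with a working sqlite3 module."""
    try:
        import sqlite3
    except ImportError:
        # The _sqlite3 extension was not built, so the import fails.
        return False
    # Use an in-memory database so nothing is written to disk.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (42)")
        row = conn.execute("SELECT x FROM t").fetchone()
        return row == (42,)
    finally:
        conn.close()
```

Run it with the freshly compiled interpreter; a False result most likely means the sqlite3 extension was skipped during the build.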

5 years of blogging

Today marks the 5th anniversary of my blog. It's been a fun and rewarding experience, and I hope to never run out of interesting topics to post about ;-)

As a sort of retrospective, I was curious to see which of my blog posts have been getting the most traffic. Here's the top 10 over the last 9 months, according to Google Analytics:

1. Performance vs. load vs. stress testing (as an aside, I think this has been wildly popular because I inadvertently hit on a lot of keywords in the title)
2. Experiences deploying a large-scale infrastructure in Amazon EC2
3. Ajax testing with Selenium using waitForCondition
4. Useful tools for writing Selenium tests
5. Load balancing in EC2 with HAProxy
6. Python unit testing part 1: the unittest module
7. HTTP performance testing with httperf, autobench and openload
8. Running a Python script as a Windows service
9. Apache virtual hosting with Tomcat and mod_jk
10. Configuring Apache 2 and Tomcat 5.5 with mod_jk

It's interesting that 2 of t…

Monitoring multiple MySQL instances with Munin

I've been using Munin for its resource graphing capabilities. I especially like the fact that you can group servers together and watch a common metric (let's say system load) across all servers in a group -- something that is hard to achieve with other similar tools such as Cacti and Ganglia.

I needed to monitor multiple MySQL instances running on the same server. I am using mysql-sandbox to launch and manage these instances. I haven't found any pointers on how to use Munin to monitor several MySQL instances, so I rolled my own solution.

First of all, here is my scenario:
server running Ubuntu 9.04 64-bit
N+1 MySQL instances installed as sandboxes rooted in /mysql/m0, /mysql/m1,..., /mysql/mN
munin-node package version 1.2.6-8ubuntu3 (installed via 'apt-get install munin-node')

Step 1

Locate mysql_* plugins already installed by the munin-node package in /usr/share/munin/plugins. I have 5 such plugins: mysql_bytes, mysql_isam_space_, mysql_queries, mysql_slo…

Behaviour Driven Infrastructure

I just read a post by Matthew Flanagan on Behaviour Driven Infrastructure or BDI, a concept that apparently originates with Martin Englund's post on this topic. The idea is that you describe what you need your system to do in natural language, using, for example, a tool such as Cucumber. What's more, you can then use the cucumber-nagios plugin to express the desired behaviour of the new system as a series of Nagios checks. The checks will initially fail (just like in a TDD or BDD development cycle), but you will make them pass by deploying the appropriate packages and applications to the system.

I also expressed the need for automated testing of production deployments in one of my blog posts. However, BDI goes one step further, by describing a test plan for production deployments in natural language. Pretty cool, and again I can only wish that the Python testing tools kept up with Ruby-based tools such as Cucumber and friends...
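To make the "checks fail first, then pass" idea a bit more concrete, here is a rough Python sketch of a Nagios-style check. It is not cucumber-nagios and not taken from either of the posts mentioned above; the function name `check_tcp_port` is hypothetical. The only convention it relies on is the standard Nagios plugin one, where status 0 means OK and 2 means CRITICAL.

```python
import socket

# Nagios plugin convention: status 0 means OK, 2 means CRITICAL.
OK, CRITICAL = 0, 2

def check_tcp_port(host, port, timeout=2.0):
    """Return (status, message) saying whether host:port accepts TCP connections."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # Nothing is listening yet (or the host is unreachable): the check fails.
        return CRITICAL, "CRITICAL - %s:%d not reachable (%s)" % (host, port, exc)
    conn.close()
    return OK, "OK - %s:%d accepts connections" % (host, port)
```

Before the service is deployed, a call such as check_tcp_port("localhost", 80) (a hypothetical target) reports CRITICAL; once the web server is installed and running, the same call reports OK, which mirrors the red/green cycle of TDD.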

Great series of posts on Tokyo Tyrant

Matt Yonkovit has started a series of posts on Tokyo Tyrant at Percona's MySQL Performance Blog. Great in-depth analysis of the reliability and performance of TT.

Part 1: Tokyo Tyrant -- is it durable?
Part 2: Tokyo Tyrant -- the performance wall
Part 3: Tokyo Tyrant -- write bottleneck

(parts 4 and 5, about replication and scaling, are hopefully coming soon)

Google using buildbot for Chromium continuous integration

Via Ben Bangert, this gem of a page showing the continuous integration status for the Chromium project at Google. It's cool to see that they're using buildbot. But just like Ben says -- I wish they open sourced the look and feel of that buildbot status page ;-)

NFS troubleshooting with iostat and lsof

Scenario: you mount a volume exported from a NetApp on several Linux clients via NFS

Problem: you see constant high CPU usage on the NetApp, and some of the Linux clients become sluggish, primarily in terms of I/O

Troubleshooting steps:

1) If iostat is not already on the clients, install the sysstat utilities.

2) On each client mounting from the filer, or on a representative sample of the clients, run iostat with -n so that it shows NFS-related statistics. The following command will run iostat every 5 seconds and show NFS stats in a nicely tabulated output:

# iostat -nh 5

3) Notice which client exhibits the most NFS operations per second, and correlate it with the NFS volume on that client which shows the most NFS reads and/or writes per second.

At this point you have found the most likely culprit in terms of sending NFS traffic to the filer (there could be several client machines in this position, for example if they are part of a cluster).
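If you have many clients, the comparison in step 3 can be done with a few lines of Python once you have written down the numbers. The sketch below assumes you have already collected operations-per-second figures by hand from the iostat output; it does not parse iostat itself. The function name `rank_by_ops`, the client names, the mount points and the numbers are all hypothetical.

```python
def rank_by_ops(samples):
    """Sort (client, mount, ops_per_sec) tuples from busiest to least busy."""
    return sorted(samples, key=lambda sample: sample[2], reverse=True)

# Hypothetical figures copied by hand from iostat runs, not real measurements.
samples = [
    ("client1", "/mnt/data", 120.0),
    ("client2", "/mnt/data", 950.5),
    ("client2", "/mnt/logs", 30.2),
]

ranked = rank_by_ops(samples)
busiest_client, busiest_mount, busiest_ops = ranked[0]
```

With these made-up numbers, the first entry of the ranking is client2 on /mnt/data, which is the client and volume you would look at first.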

5) If not already installed, download and install

Automated deployments with Puppet and Fabric

I've been looking into various configuration management/automated deployment tools lately. At OpenX we used slack, but I wanted something with a bit more functionality than that (although I'm not badmouthing slack by any means -- it can definitely be bent to your will to do pretty much whatever you need in terms of automating your deployments).

From what I see, there are 2 types of configuration management tools:

The first type I call 'pull', which means that the servers pull their configurations and their marching orders in terms of applying those configurations from a centralized location -- both slack and Puppet are in this category. I think this is great for initial configuration of a server. As I described in another post, you can have a server bootstrap itself by installing Puppet (or slack) and then 'call home' to the central Puppet master (or slack repository) and get all the information it needs to configure itself.

The second type I call 'push', …