Friday, August 18, 2006

On the importance of functional testing

I did not need further proof of the fact that functional tests are a vital piece in a project's overall testing strategy. I got that proof anyway last night, while watching the Pybots buildmaster status page. I noticed that the Twisted unit tests were failing, not because of errors within the Twisted package, but because pre-requisite packages such as ZopeInterface could no longer be installed. If you followed my post on setting up a Pybots buildslave, you know that before running the Twisted unit tests, I attempt to install ZopeInterface and other packages using the newly-built python binary, via "/tmp/python-buildbot/local/bin/python setup.py install".

Well, all of a sudden last night this last command was failing with errors such as:
error: invalid Python installation:
unable to open /tmp/python-buildbot/local/lib/python2.6/config/Makefile
(No such file or directory)

This proved to be a transient error, due to some recent check-ins that modified the release numbers in the Python svn trunk from 2.5 to 2.6. The issue was fixed within an hour, but the interesting thing to me was that, while this step was failing in the Pybots Twisted tests, the Python buildbots running the Python-specific unit tests against the trunk were merrily chugging along, green and happy (at least on Unix platforms). This was of course to be expected, since nothing major had changed as far as the internal Python unit tests were concerned. However, when running a functional test involving the newly-built Python binary -- and in my case that functional test consisted simply of running "python setup.py install" on some packages -- things started to break.

Lesson learned? Always make sure you test your application from the outside in, by exercising it as a user would. Unit tests are necessary (indeed, they are essential), but they are not sufficient by any means. A 'holistic' test strategy takes into consideration both white-box-type unit tests, and black-box-type functional tests. Of course, the recommended way of running all these types of tests is via a continuous integration tool such as buildbot.

Thursday, August 17, 2006

QA blog at W3C

Karl Dubost sent me a message about some issues he had with running Cheesecake on a Mac OS X machine. It turned out he was using an ancient version of Cheesecake, even though he had run "easy_install Cheesecake". I told him to upgrade to the latest version via "easy_install Cheesecake==0.6" and his problems disappeared.

Anyway, this is not what I was trying to blog about. Reading his email signature, I noticed he works as a Conformance Manager at W3C. Karl also mentions a QA blog at W3C in his signature. Very interesting blog, from the little I've seen so far. For example, from the "Meet the Unicorn" post, I found out about a W3C project (code-name Unicorn) which aims to be THE one tool to use when you want to check the quality -- i.e. the W3C conformance I suspect -- of web pages. This tool would "gather observations made on a single document by various validators and quality checkers, and summarize all of that neatly for the user." BTW, here is a list of validators and other test tools that you can currently use to check the conformance of your web pages.

Added the blog to my fluctuating collection of Bloglines feeds...Thanks, Karl!

Setting up a Pybots buildslave

If you're interested in setting up a buildbot buildslave for the Pybots project, here are some instructions:

Step 1

Install buildbot on your machine. Instructions can be found here, here, here and here.

Step 2

Create a user that will run the buildbot slave process. Let's call it buildslave, with a home directory of /home/buildslave. Also create a /home/buildslave/pybot directory.

Step 3

Create the file buildbot.tac in /home/buildslave/pybot, with content similar to this:

from twisted.application import service
from buildbot.slave.bot import BuildSlave

# set your basedir appropriately
basedir = r'/home/buildslave/pybot'
# host and port of the Pybots buildmaster
host = 'www.python.org'
port = 9070
# slave name and password assigned to you by the buildmaster admin
slavename = 'abc'
passwd = 'xyz'
# keepalive interval in seconds, and whether to run commands under a pty
keepalive = 600
usepty = 1

application = service.Application('buildslave')
s = BuildSlave(host, port, slavename, passwd, basedir, keepalive, usepty)
s.setServiceParent(application)


Step 4

Create a python-tool directory under /home/buildslave/pybot. You must name this directory python-tool, as the buildmaster will use this name in the build steps.

Step 5

Create a file called run_tests.py under the python-tool directory. This is where you will invoke the automated tests for your projects.

How this all works

The buildmaster will have your buildslave execute the following steps, every time a check-in is made into the python subversion trunk (and also every time a check-in is made in the 2.5 branch):

1. Update step: runs "svn update" from the python svn trunk
2. Configure step: runs "./configure --prefix=/tmp/python-buildbot/local"
3. Make step: runs "make all"
4. Test step: runs "make test" (note: this step runs the python unit tests, not your project's unit tests)
5. Make install step: runs "make install"; this will install the newly-built python binary in /tmp/python-buildbot/local/bin/python
6. Project-specific tests step: this is when your run_tests.py file will be run via "/tmp/python-buildbot/local/bin/python ../../python-tool/run_tests.py"
7. Clean step: runs "make distclean"

Important note: since your automated tests will be run via the newly-built python binary installed in /tmp/python-buildbot/local/bin/python, you need to make sure you install all the pre-requisite packages for your package using this custom python binary, otherwise your unit tests will fail because they will not find these pre-requisites. For example, for the Twisted unit tests, I had to install setuptools, ZopeInterface, pycrypto and pyOpenSSL, before I could actually run the Twisted test suite.

So in my run_tests.py file I first call a prepare_packages.sh shell script, before I launch the actual test suite (I copied the pre-requisite packages in /home/buildslave):

$ cat prepare_packages.sh

#!/bin/bash

cd /tmp

rm -rf setuptools*
cp ~/setuptools-0.6c1.zip .
unzip setuptools-0.6c1.zip
cd setuptools-0.6c1
/tmp/python-buildbot/local/bin/python setup.py install
cd ..

rm -rf ZopeInterface*
cp ~/ZopeInterface-3.1.0c1.tgz .
tar xvfz ZopeInterface-3.1.0c1.tgz
cd ZopeInterface-3.1.0c1
/tmp/python-buildbot/local/bin/python setup.py install
cd ..

rm -rf pycrypto-2.0.1*
cp ~/pycrypto-2.0.1.tar.gz .
tar xvfz pycrypto-2.0.1.tar.gz
cd pycrypto-2.0.1
/tmp/python-buildbot/local/bin/python setup.py install
cd ..

rm -rf pyOpenSSL-0.6*
cp ~/pyOpenSSL-0.6.tar.gz .
tar xvfz pyOpenSSL-0.6.tar.gz
cd pyOpenSSL-0.6
/tmp/python-buildbot/local/bin/python setup.py install
cd ..

rm -rf Twisted
svn co svn://svn.twistedmatrix.com/svn/Twisted/trunk Twisted

Then I call the actual Twisted test suite, via:

/tmp/python-buildbot/local/bin/python -Wall /tmp/Twisted/bin/trial --reporter=bwverbose --random=0 twisted
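For reference, here is a minimal sketch of what my run_tests.py boils down to for the Twisted buildslave. Treat it as a sketch only: the commands mirror the ones shown above, but the location of prepare_packages.sh (assumed here to be /home/buildslave) and the error handling are assumptions.

# Sketch of a run_tests.py for the Twisted buildslave; the commands mirror
# the ones described above, but the exact script may differ.
# Assumption: prepare_packages.sh lives in /home/buildslave.
import os
import sys

# Install the pre-requisite packages with the newly-built Python binary,
# then check out the latest Twisted trunk (see prepare_packages.sh above).
if os.system('bash /home/buildslave/prepare_packages.sh') != 0:
    sys.exit(1)

# Run the Twisted test suite under the freshly built interpreter.
rc = os.system('/tmp/python-buildbot/local/bin/python -Wall '
               '/tmp/Twisted/bin/trial --reporter=bwverbose --random=0 twisted')
if rc != 0:
    sys.exit(1)

Exiting with a non-zero status is what tells buildbot to mark the project-specific test step as failed.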

You can see the current Pybots status page here.

If you are interested in setting up your own buildslave to participate in the Pybots project, please send a message to the Pybots mailing list. I will send you a slavename and a password, and then we can test the integration of your buildslave with the buildmaster.

Update 10/16/09

I realized that these instructions for setting up a Pybots buildslave are a bit outdated. Discussions on the Pybots mailing list prompted certain changes to run_tests.py, even though you're still OK if you follow the instructions above to the letter.

Here are some enhancements that you can take advantage of:

1. You can test several projects, each in its own build step, simply by having your run_tests.py script be aware of an extra command-line argument, which will be the name of the project under test. An example of such a script is here: run_tests.py. The script receives a command-line argument (let's call it proj_name) and invokes a shell script called proj_name.sh. The shell script checks out the latest code for project proj_name (or downloads the latest distribution), then runs its unit tests. Here is an example: Cheesecake.sh.

2. You do not have to hardcode the path to the newly built Python binary in your run_tests.py or your shell scripts. You can simply retrieve the path to the binary via sys.executable. This run_tests.py script sets an environment variable called PYTHON via a call to
os.putenv('PYTHON', sys.executable)
The variable is then used as $PYTHON in the shell scripts invoked by run_tests.py (thanks to Elliot Murphy and Seo Sanghyeon for coming up with this solution). A minimal sketch combining these two enhancements is shown below.
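In this sketch, the project name argument, the PYTHON environment variable and the per-project shell scripts (Cheesecake.sh and friends) follow the conventions described above; everything else (file locations, error handling) is just an assumption for illustration.

#!/usr/bin/env python
# Minimal sketch of a multi-project run_tests.py (illustrative only).
import os
import sys

def main():
    if len(sys.argv) < 2:
        sys.stderr.write('Usage: run_tests.py <proj_name>\n')
        sys.exit(1)
    proj_name = sys.argv[1]

    # Expose the Python binary that buildbot used to run this script,
    # so the shell scripts can refer to it as $PYTHON.
    os.putenv('PYTHON', sys.executable)

    # Each project has its own shell script (e.g. Cheesecake.sh) that fetches
    # the latest code for the project and runs its unit tests.
    script_dir = os.path.dirname(os.path.abspath(__file__))
    rc = os.system('bash %s' % os.path.join(script_dir, proj_name + '.sh'))
    if rc != 0:
        sys.exit(1)

if __name__ == '__main__':
    main()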

Cheesecake case study: Cleaning up PyBlosxom

Will Guaraldi wrote an article on "Cleaning up PyBlosxom using Cheesecake". Cool stuff!

Will, I hope we meet at the next PyCon, I owe you a case of your favorite beer :-)

Wednesday, August 16, 2006

Pybots -- Python Community Buildbots

The idea behind the Pybots project (short for "Python Community Buildbots") is to allow people to run automated tests for their Python projects, while using Python binaries built from the very latest source code from the Python subversion repository.

The idea originated from Glyph, of Twisted fame. He sent out a message to the python-dev mailing list (thanks to John J. Lee for bringing this message to my attention), in which he said:

"I would like to propose, although I certainly don't have time to implement, a program by which Python-using projects could contribute buildslaves which would run their projects' tests with the latest Python trunk. This would provide two useful incentives: Python code would gain a reputation as generally well-tested (since there is a direct incentive to write tests for your project: get notified when core python changes might break it), and the core developers would have instant feedback when a "small" change breaks more code than it was expected to."

Well, Neal Norwitz made this happen by setting up a buildmaster process on one of the servers maintained by the PSF. He graciously allowed me to maintain this buildmaster, and I already added a buildslave which runs the Twisted unit tests (in honor of Glyph, who was the originator of this idea) every time a check-in is made in the Python trunk. You can see the buildmaster's status page here.

Note that some of the Twisted unit tests sometimes fail for various reasons. Most of these reasons are environment-related -- for example, the user that the buildbot slave runs as initially had no login shell set in /etc/passwd, so a test that tries to run the login shell as a child process was failing. Not necessarily a Twisted bug, but still something that's nice to catch.

And this brings me to a point I want to make about running your project's automated tests in buildbot: I can almost guarantee that you will find many environment-specific issues that would otherwise remain dormant, since most likely the way you're usually running your tests is in a controlled environment that you had set up carefully some time ago. There's nothing like running the same tests under a different user's account and environment.

Of course, if you've never run your tests under a continuous integration process such as buildbot, you'll also be pleasantly -- or maybe not so pleasantly -- surprised at the amount of stuff that can get broken by one of your check-ins that you considered foolproof. This is because buildbot tirelessly checks out the very latest source code of your project, then runs your unit tests against that code. When you run your unit tests on your local machine, chances are you might not have synchronized your local copy with the repository.

This does assume that you have unit tests for your project, but since you have been reading this post this far, I assume you either do, or are interested in adding them. I strongly urge you to do so, and also to contribute to the Pybots project by setting up a buildslave for your project.

I'll post another entry with instructions on how to configure a buildslave so that it can be coordinated by the Pybots buildmaster. I also have a mailing list here (thanks, Titus!) for people who are interested in this project. Please send a message there, and I'll respond to you.

The buildmaster is currently running the build steps every time a check-in is made to the Python trunk, and to the 2.4 branch. In the near future, there will be a 2.5 branch, and the trunk will be used for 2.6 check-ins. I'll modify the buildmaster configuration to account for this.

Update: The buildmaster is now aware of changes in both the trunk and the newly created release25-maint branch. You can watch the HTML status page for all builders, or for the trunk builders.

BTW, if you need instructions on setting up buildbot, you can find some here, here, here and here.

Dave Nicolette's recommended reading list on agile development

Worth perusing. I always enjoy Dave's blog posts on agile development, so I trust his taste :-)

Tuesday, August 15, 2006

Cheesecake 0.6 released

Thanks to Michał's hard work, we released Cheesecake 0.6 today. The easiest way to install it is via easy_install: sudo easy_install Cheesecake

Update: Please report bugs to the cheesecake-users mailing list.

Here's what you get if you run cheesecake_index on Cheesecake itself:

$ cheesecake_index -n cheesecake
py_pi_download ......................... 50 (downloaded package cheesecake-0.6.tar.gz directly from the Cheese Shop)
unpack ................................. 25 (package unpacked successfully)
unpack_dir ............................. 15 (unpack directory is cheesecake-0.6 as expected)
setup.py ............................... 25 (setup.py found)
install ................................ 50 (package installed in /tmp/cheesecakeNyfM4f/tmp_install_cheesecake-0.6)
generated_files ........................ 0 (0 .pyc and 0 .pyo files found)
---------------------------------------------
INSTALLABILITY INDEX (ABSOLUTE) ........ 165
INSTALLABILITY INDEX (RELATIVE) ........ 100 (165 out of a maximum of 165 points is 100%)

required_files ......................... 180 (6 files and 2 required directories found)
docstrings ............................. 63 (found 17/27=62.96% objects with docstrings)
formatted_docstrings ................... 0 (found 2/27=7.41% objects with formatted docstrings)
---------------------------------------------
DOCUMENTATION INDEX (ABSOLUTE) ......... 243
DOCUMENTATION INDEX (RELATIVE) ......... 70 (243 out of a maximum of 350 points is 70%)

pylint ................................. 36 (pylint score was 7.01 out of 10)
unit_tested ............................ 30 (has unit tests)
---------------------------------------------
CODE KWALITEE INDEX (ABSOLUTE) ......... 66
CODE KWALITEE INDEX (RELATIVE) ......... 83 (66 out of a maximum of 80 points is 83%)


=============================================
OVERALL CHEESECAKE INDEX (ABSOLUTE) .... 474
OVERALL CHEESECAKE INDEX (RELATIVE) .... 79 (474 out of a maximum of 595 points is 79%)

For a detailed explanation of how the scoring is done, see the main Wiki page, and/or run cheesecake_index in --verbose mode.

Stay tuned for a cool case study on improving a package using the Cheesecake guidelines, courtesy of Will Guaraldi.

Tuesday, August 01, 2006

A couple of Apache performance tips

I had to troubleshoot an Apache installation recently. Apache 2.0 was running on several Linux boxes behind a load balancer. If you ran top on each box, the CPU was mostly idle, there was plenty of memory available, and yet Apache seemed sluggish. Here are a couple of things I did to speed things up.

1. Disable RedirectMatch directives temporarily

All the Apache servers had directives such as:

RedirectMatch /abc/xyz/data http://admin.mysite.com/abc/xyz/data

This was done so administrators who visited a special URL would be redirected to a special-purpose admin server. Since the servers were pretty much serving static pages, and they were under considerable load due to a special event, I disabled the RedirectMatch directives temporarily, for the duration of the event. Result? Apache was a lot faster.

2. Increase MaxClients and ServerLimit

This is a well-known Apache performance optimization tip. Its effect is to increase the number of httpd processes available to service the HTTP requests.

However, when I tried to increase MaxClients over 256 in the prefork.c directives and I restarted Apache, I got a message such as:

WARNING: MaxClients of 1000 exceeds ServerLimit value of 256 servers, lowering MaxClients to 256. To increase, please see the ServerLimit directive.

There is no ServerLimit entry by default in httpd.conf, so I proceeded to add one just below the MaxClients entry. I restarted httpd, and I still got the message above. The 2 entries I had in httpd.conf in the IfModule prefork.c section were:

MaxClients 1000
ServerLimit 1000

At this point I resorted to all kinds of Google searches in order to find out how to get past this issue, only to notice after a few minutes that the number of httpd processes HAD been increased to well over the default of 256!

UPDATE 03/06/09: It turns out that the new MaxClients and ServerLimit values take effect only if you stop httpd and then start it back up again. Just doing a restart doesn't do the trick...


So, lesson learned? Always double-check your work and, most importantly, know when to ignore warnings :-)

Now I have a procedure for tuning the number of httpd processes on a given box:

1. Start small, with the default MaxClients (150).
2. If Apache seems sluggish, start increasing both MaxClients and ServerLimit; stop and start httpd every time you do this (per the update above, a plain restart is not enough).
3. Monitor the number of httpd processes; you can use something like:

ps -def | grep httpd | grep -v grep | wc -l

If the number of httpd processes becomes equal to the MaxClients limit you specified in httpd.conf, check your CPU and memory (via top or vmstat). If the system is not yet overloaded, go to step 2. If the system is overloaded, it's time to put another server in the server farm behind the load balancer.
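If you want to script step 3, here is a small sketch that counts the running httpd processes and compares the count to MaxClients. The httpd.conf path and the 'httpd' process name are assumptions; adjust them for your distribution.

#!/usr/bin/env python
# Sketch: compare the number of running httpd processes to MaxClients.
# The httpd.conf path and the 'httpd' process name are assumptions.
import re
import subprocess

HTTPD_CONF = '/etc/httpd/conf/httpd.conf'

def get_max_clients(conf_path=HTTPD_CONF):
    for line in open(conf_path):
        m = re.match(r'\s*MaxClients\s+(\d+)', line)
        if m:
            return int(m.group(1))
    return 150  # the stock default mentioned above

def count_httpd_processes():
    ps = subprocess.Popen(['ps', '-e', '-o', 'comm'], stdout=subprocess.PIPE)
    output = ps.communicate()[0]
    return len([l for l in output.splitlines() if l.strip() == 'httpd'])

if __name__ == '__main__':
    max_clients = get_max_clients()
    running = count_httpd_processes()
    print('httpd processes: %d, MaxClients: %d' % (running, max_clients))
    if running >= max_clients:
        print('At the MaxClients limit; check CPU and memory before raising it')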

That's it for now. There are many other Apache performance tuning tips that you can read from the official Apache documentation here.
