SciPy 2007 Testing BOF Notes
Discussion starter
- Testing in Python's stdlib: unittest and doctest.
- Unittest: JUnit inspired.
- Doctest: lightweight tests, in docstrings or standalone plaintext files. Docstring examples can be used as tests, and more complicated tests can be written in separate files.
- Alternative frameworks for testing: nose.py, py.test. Others?
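As a small illustration of the doctest style mentioned above, here is a minimal sketch using only the stdlib. The function `mean` is a hypothetical example, not something from the discussion; the finder/runner pair is used instead of `doctest.testmod()` only so that the result object is easy to inspect.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> mean([1, 2, 3, 4])
    2.5
    >>> mean([10])
    10.0
    """
    return sum(values) / len(values)


# Collect the examples embedded in the docstring of `mean` and run them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(mean, "mean", globs={"mean": mean}):
    runner.run(test)

# summarize() returns a TestResults tuple with `failed` and `attempted` counts.
results = runner.summarize(verbose=False)
print(results.failed, "failed out of", results.attempted)
```

The same examples could live in a standalone text file and be run with `doctest.testfile`, which is the "more complicated tests in separate files" case.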
Fernando's personal gripes
We want a smooth workflow based on testing as part of the development process, but that relies only on what the stdlib provides. We will thus restrict our effort to supporting and improving a workflow based on the unittest and doctest modules.
There are basically three types of testing that we want to be able to easily work with:
- Normal unittests: but with the improvement of easy-to-write parametrized tests.
- Doctests: but we want to make it easier/more convenient to develop and modify doctests.
- Standalone scripts: it is often easiest when developing something to hack out a little script that does a few things, and to either visually inspect or programmatically test its outputs for validity. Once done and satisfied with such a script, it should take NO additional effort to turn it into a proper unittest. We also want a single script to be able to easily record multiple test conditions (and successes) and to provide summary information.
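One stdlib route toward the "no additional effort" goal for standalone scripts is `unittest.FunctionTestCase`, which wraps a plain callable as a test case. The sketch below is only an assumption about how such a workflow might look; the `check_*` functions are hypothetical stand-ins for whatever a scratch script verifies.

```python
import unittest


def check_sorting():
    """A throwaway check, written the way one would in a scratch script."""
    data = [3, 1, 2]
    assert sorted(data) == [1, 2, 3]
    assert sorted(data, reverse=True) == [3, 2, 1]


def check_string_join():
    """A second, independent check from the same script."""
    assert "-".join(["a", "b"]) == "a-b"


# Wrap each plain function as a test case so the script's checks can join
# a regular unittest suite without being rewritten as TestCase classes.
suite = unittest.TestSuite(
    unittest.FunctionTestCase(func) for func in (check_sorting, check_string_join)
)
result = unittest.TextTestRunner(verbosity=0).run(suite)

# The result object provides the summary information for all conditions.
print("ran", result.testsRun, "ok" if result.wasSuccessful() else "FAILED")
```

Each function is counted as its own test, so a single script can record several conditions and report them together.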
Most importantly: I want an easy workflow!
- The time invested in trying out an idea and pounding on a small test script should contribute to the final test suite results without having to redo all that in some other format later (doctest, unittest, whatever).
- Very convenient debugging/exploration of failures. Interactive, pdb-style, good tracebacks, etc.
Questions
- What do other people use?
- Good workflows, tools?
- What should we put into python or numpy.testing so we all stop reinventing this particular wheel?
- code coverage: py.test/nose have it, but I think it belongs in the stdlib.
Meeting notes
Gripes
- Parametrized tests :
You want to test a function over many values. One solution is to write a loop inside a unit test; the problem is that the failure of one test case (one iteration of the loop) fails the entire test and hides the remaining cases.
Titus remarks that nose addresses this well.
- Doctest workflow :
Going back and forth between the test file and the python/ipython prompt is clumsy.
- Standalone scripts
I would like to be able to introspect/debug/explore/run the test as normal Python code. Travis O. seconds that: it is a pain to debug failing tests.
- Statistical test
Mike M. remarks that there is no statistics-based testing suite that runs through a large codebase (JUMBL, developed by U. Tennessee, provides this).
- Historical regression test suite
Eric J. mentions that there is no good regression test suite, including a performance test suite, that stores results in a historical database.
- Packaging testing
Eric J. mentions that testing a large distribution is very painful and has to be implemented by hand.
One solution is to use ipython, which can test the output of shell commands with a pexpect-based test runner (ipdoctest).
- UI testing
No good tools to do UI testing. How do you define ?
Tests in svn to do performance test.
- How do you test multiprocessor code (e.g. coverage analysis, can't open 50 000 files)?
- Testing levels like in scipy: smoke test in 10s. This is what unittest test suites are supposed to be used for. Nose has this kind of feature.
How do you test for memory size (gmemprof for C)?
Solutions
- Extension of the build framework to do regression testing on the distribution.
- UI testing:
Win32: using the accessibility framework
Qt4: has testing
- KWWidgets: cross-platform widget library, with testing built in (made by Kitware)
- VTK: compare two images (one reference, the other under test), make a diff between the two, and if the diff is above a threshold, complain (see DART2).
Red Hat dogtail.
- Eric J. mentions that good MVC-like separation, plus a good description of the GUI layout by text/object (see Traits or Pyre for instance), makes UI code easier to test.
LDTP: test anything accessibility-enabled, but Linux only.
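The reference-image approach mentioned above for VTK can be sketched in a few lines. This is not VTK's actual algorithm: the mean-absolute-difference metric, the default threshold, and the tiny nested-list "images" are all assumptions chosen to keep the example self-contained.

```python
def mean_abs_diff(reference, candidate):
    """Mean absolute per-pixel difference of two equally sized grayscale images."""
    if len(reference) != len(candidate) or any(
        len(r) != len(c) for r, c in zip(reference, candidate)
    ):
        raise ValueError("images must have the same shape")
    total = sum(
        abs(a - b) for r, c in zip(reference, candidate) for a, b in zip(r, c)
    )
    count = sum(len(r) for r in reference)
    return total / count


def images_match(reference, candidate, threshold=1.0):
    """Return True when the difference stays at or below the threshold."""
    return mean_abs_diff(reference, candidate) <= threshold


reference = [[0, 0, 0], [255, 255, 255]]
slightly_off = [[0, 1, 0], [255, 254, 255]]  # small rendering noise
broken = [[255, 255, 255], [0, 0, 0]]        # inverted image

print(images_match(reference, slightly_off))
print(images_match(reference, broken))
```

A nonzero threshold is what lets the test tolerate small platform-dependent rendering differences while still catching a genuinely wrong image.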
- Code coverage testing:
- Figleaf
- coverage.py
Nose
- Nice docs
- Ease of use
- Based on unittest (setup.py test-compatible)
- Test discovery
Future solutions
- Pylint can be set up to run every time you check in to your repository; this could be "borrowed".
References
- Testing in Python mailing list
- Python testing tools taxonomy