SciPy2007/TestingNotes - SciPy wiki dump

SciPy 2007 Testing BOF Notes

Discussion starter

Testing in Python's stdlib: unittest and doctest.
Unittest: JUnit inspired.
Doctest: lightweight tests, in docstrings or standalone plaintext files. Docstring examples can be used as tests, and more complicated tests can be written in separate files.
Alternative frameworks for testing: nose, py.test. Others?
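A minimal sketch of the two stdlib styles side by side, for readers who have not used both. The function `add` and the helper `run_both` are made up for illustration; only the `doctest` and `unittest` calls come from the stdlib.

```python
import doctest
import unittest


def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    """
    return a + b


class AddTest(unittest.TestCase):
    # JUnit-style test: one method per check.
    def test_add(self):
        self.assertEqual(add(2, 3), 5)


def run_both():
    """Run the docstring example and the TestCase; return both results."""
    # Doctest: parse the docstring into a test and run it.
    parser = doctest.DocTestParser()
    doc_test = parser.get_doctest(add.__doc__, {"add": add}, "add", None, 0)
    doc_result = doctest.DocTestRunner(verbose=False).run(doc_test)

    # Unittest: load the TestCase into a suite and run it.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTest)
    unit_result = unittest.TestResult()
    suite.run(unit_result)
    return doc_result, unit_result
```

In everyday use the same checks are usually started with `python -m doctest file.py` or `python -m unittest`; the explicit runner calls above are only there to make the mechanics visible.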

Fernando's personal gripes

We want a smooth workflow based on testing as part of the development process, but that relies only on what the stdlib provides. We will thus restrict our effort to supporting and improving a workflow based on the unittest and doctest modules.

There are basically three types of testing that we want to be able to easily work with:

Normal unittests: but with the improvement of easy-to-write parametrized tests.
Doctests: but we want to make it easier/more convenient to develop and modify doctests.
Standalone scripts: it is often easiest when developing something to hack out a little script that does a few things, and to either visually inspect or programmatically test its outputs for validity. Once done and satisfied with such a script, it should take NO additional effort to turn it into a proper unittest. We also want a single script to be able to easily record multiple test conditions (and successes) and to provide summary information.
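One stdlib route toward the "no additional effort" goal for standalone scripts is `unittest.FunctionTestCase`, which wraps a plain function as a test. The sketch below assumes the script is already organized as functions using bare `assert`; the names `check_sorting`, `check_reversal` and `script_suite` are hypothetical.

```python
import unittest


def check_sorting():
    # Body of a throwaway script: plain asserts, no unittest boilerplate.
    data = [3, 1, 2]
    assert sorted(data) == [1, 2, 3]
    assert max(data) == 3


def check_reversal():
    assert list(reversed([1, 2, 3])) == [3, 2, 1]


def script_suite():
    """Wrap the plain functions so any unittest runner can report on them."""
    functions = (check_sorting, check_reversal)
    return unittest.TestSuite(unittest.FunctionTestCase(f) for f in functions)
```

Running `unittest.TextTestRunner(verbosity=2).run(script_suite())` then gives the per-function summary, while each function can still be called directly from a prompt.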

Most importantly: I want an easy workflow!

The time invested in trying out an idea and pounding on a small test script should contribute to the final test suite results without having to redo all that in some other format later (doctest, unittest, whatever).
Very convenient debugging/exploration of failures. Interactive, pdb-style, good tracebacks, etc.

Questions

What do other people use?
Good workflows, tools?
What should we put into python or numpy.testing so we all stop reinventing this particular wheel?
Code coverage: py.test/nose have it, but I think it belongs in the stdlib.

Meeting notes

Gripes

Parametrized tests

You want to test a function over many values. One solution is to loop inside a unit test; the problem is that the failure of one test case (one iteration of the loop) fails the entire test.

Titus remarks that nose addresses this well.
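A stdlib-only sketch of one way around the loop problem: attach one test method per value to a `TestCase`, so each value is reported separately. This is not how nose does it; `is_even`, `build_case` and `run_cases` are hypothetical names, and the value 3 is a deliberately failing case.

```python
import unittest


def is_even(n):
    return n % 2 == 0


def build_case(values):
    """Return a TestCase class with one test method per input value."""

    class EvenTest(unittest.TestCase):
        pass

    def make_test(value):
        # A factory function binds `value` separately for each method.
        def test(self):
            self.assertTrue(is_even(value), "%r is not even" % (value,))
        return test

    for value in values:
        setattr(EvenTest, "test_is_even_%d" % value, make_test(value))
    return EvenTest


def run_cases(values):
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(build_case(values))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With `run_cases((0, 2, 3, 4))`, four tests run and only the one for 3 is reported as a failure; the others still execute.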
Doctest workflow

There is friction in the workflow: going from the test file to the python/ipython prompt is clumsy.
Standalone scripts

I would want to be able to introspect/debug/explore/run the test as normal Python code. Travis O. seconds that: it is a pain to debug failing tests.
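A partial answer that exists in the stdlib is `TestCase.debug()`, which runs a test without collecting the result, so the exception propagates like in normal code. The sketch below is an assumption about how one might use it; `make_case` and `run_for_debugging` are hypothetical names.

```python
import sys
import unittest


def make_case():
    class BrokenTest(unittest.TestCase):
        def test_division(self):
            denominator = 0
            self.assertEqual(1 / denominator, 1)

    return BrokenTest("test_division")


def run_for_debugging(case):
    """Run one test outside the result-collecting machinery.

    Returns None if the test passes, otherwise the (type, value,
    traceback) triple of the exception that escaped from the test.
    """
    try:
        case.debug()
    except Exception:
        return sys.exc_info()
    return None
```

At an interactive prompt, the returned traceback can be handed to `pdb.post_mortem(info[2])` to inspect the failing frame.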
Statistical test

Mike M. remarks that there is no statistics-based testing suite that runs through a large codebase (JUMBL, developed by U. Tennessee, provides this).
Historical regression test suite

Eric J. mentions that there is no good regression test suite, including performance test suites, that stores results in a historical database.
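To make the idea concrete, here is a rough sketch of storing timings in a historical database and flagging slowdowns. It is not an existing tool: the table layout, the function names and the 1.5x tolerance are all assumptions, and sqlite3 is used only because it ships with Python.

```python
import sqlite3
import time


def open_history(path=":memory:"):
    """Open (or create) the timing database; in-memory by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS timings "
        "(name TEXT, revision TEXT, seconds REAL)"
    )
    return conn


def record_timing(conn, name, revision, func, repeat=3):
    """Time func() `repeat` times and store the best time for this revision."""
    best = None
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    conn.execute("INSERT INTO timings VALUES (?, ?, ?)", (name, revision, best))
    conn.commit()
    return best


def is_regression(conn, name, seconds, tolerance=1.5):
    """True if `seconds` exceeds the best recorded time by the tolerance factor."""
    row = conn.execute(
        "SELECT MIN(seconds) FROM timings WHERE name = ?", (name,)
    ).fetchone()
    return row[0] is not None and seconds > tolerance * row[0]
```

A benchmark with no history is never reported as a regression; whether that is the right default is a design choice left open here.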
Packaging testing

Eric J. mentions that testing a large distribution is very painful and has to be hand-implemented.

One solution is to use ipython, which can test the output of shell commands with a pexpect-based test runner (ipdoctest).
UI testing

No good tools to do UI testing. How do you define ?
Tests in svn to do performance test.
How do you test multiprocessor code (e.g. coverage analysis, can't open 50 000 files)?
Testing levels like in scipy: smoke test in 10s. This is what UnitTest test suites are supposed to be used for. Nose has this kind of feature.
How do you test for memory size (gmemprof for C)?
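The testing-levels idea above can be sketched with plain unittest suites. This is not scipy's or nose's implementation: the `level` decorator, `build_case` and `suite_for_level` are hypothetical, and untagged tests are assumed to be level 1 (smoke tests).

```python
import unittest


def level(n):
    """Tag a test method with a level; 1 means fast smoke test."""
    def decorate(func):
        func.level = n
        return func
    return decorate


def build_case():
    class ArrayTests(unittest.TestCase):
        @level(1)
        def test_quick(self):
            self.assertEqual(sum([1, 2, 3]), 6)

        @level(10)
        def test_thorough(self):
            self.assertEqual(sum(range(100000)), 4999950000)

    return ArrayTests


def suite_for_level(case_class, max_level):
    """Build a suite with only the tests tagged at or below max_level."""
    names = unittest.defaultTestLoader.getTestCaseNames(case_class)
    selected = [
        case_class(name)
        for name in names
        if getattr(getattr(case_class, name), "level", 1) <= max_level
    ]
    return unittest.TestSuite(selected)
```

`suite_for_level(build_case(), 1)` would be the quick smoke run, and a higher `max_level` pulls in the slower tests.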

Solutions

Extension of the build framework to do regression testing on distribution.
UI testing:
- Win32: using the accessibility framework
- Qt4: has testing
- kwWidgets: cross-platform widget library with testing built in (made by Kitware)
- VTK: compare two images (one reference, the other tested), make a diff between the two, and if the diff is above a threshold, complain (see DART2).
- Red Hat dogtail.
- Eric J. mentions that good MVC-like separation, plus a good description of the GUI layout by text/object (see Traits or Pyre for instance), helps.
- LDTP: test anything accessibility-enabled, but Linux only.
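The image-comparison approach mentioned for VTK in the list above can be sketched in a few lines. This is only an illustration of the idea, not VTK's or DART2's code: images are assumed to be nested lists of grayscale values, and the mean-absolute-difference metric and default threshold are arbitrary choices.

```python
def mean_abs_diff(reference, tested):
    """Mean absolute per-pixel difference between two same-sized images."""
    if len(reference) != len(tested):
        raise ValueError("images have different heights")
    total = 0
    count = 0
    for ref_row, test_row in zip(reference, tested):
        if len(ref_row) != len(test_row):
            raise ValueError("images have different widths")
        for ref_pixel, test_pixel in zip(ref_row, test_row):
            total += abs(ref_pixel - test_pixel)
            count += 1
    if count == 0:
        raise ValueError("images are empty")
    return total / count


def images_match(reference, tested, threshold=2.0):
    """True if the diff stays at or below the threshold; complain otherwise."""
    return mean_abs_diff(reference, tested) <= threshold
```

A rendering test would then call `images_match(stored_reference, freshly_rendered)` and fail when it returns False.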
Code coverage testing:
- Figleaf
- coverage.py

Nose

Nice docs
Ease of use
Based on unittest (setup.py test-compatible)
Test discovery

Future solutions

Pylint can be set up so that it runs each time you check in to your repository; this could be "borrowed".

References

Testing in Python mailing list
Python testing tools taxonomy