This is an archival dump of old wiki content --- see scipy.org for current material

SciPy 2007 Testing BOF Notes

Discussion starter

  • Testing in Python's stdlib: unittest and doctest.
  • Unittest: JUnit inspired.
  • Doctest: lightweight tests, in docstrings or standalone plaintext files. Docstring examples can be used as tests, and more complicated tests can be written in separate files.
  • Alternative frameworks for testing: nose, py.test. Others?
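
As a reminder of what the docstring flavour of doctest looks like, here is a minimal sketch. The function `mean` and the helper `run_docstring_examples` are made up for illustration; only the `doctest` calls come from the stdlib.

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> mean([1, 2, 3, 4])
    2.5
    >>> mean([10])
    10.0
    """
    return sum(values) / float(len(values))


def run_docstring_examples():
    """Collect the examples in mean's docstring and run them as tests."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(mean, name="mean", globs={"mean": mean}):
        runner.run(test)
    # summarize() reports how many examples were attempted and how many failed.
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    print(run_docstring_examples())
```

The same examples could live in a standalone text file and be run with `doctest.testfile`, which is the second usage mentioned above.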

Fernando's personal gripes

We want a smooth workflow based on testing as part of the development process, but that relies only on what the stdlib provides. We will thus restrict our effort to supporting and improving a workflow based on the unittest and doctest modules.

There are basically three types of testing that we want to be able to easily work with:

  • Normal unittests: but with the improvement of easy-to-write parametrized tests.
  • Doctests: but we want to make it easier/more convenient to develop and modify doctests.
  • Standalone scripts: it is often easiest when developing something to hack out a little script that does a few things, and to either visually inspect or programmatically test its outputs for validity. Once done and satisfied with such a script, it should take NO additional effort to turn it into a proper unittest. We also want a single script to be able to easily record multiple test conditions (and successes) and to provide summary information.
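
One way the "script becomes a unittest with no extra effort" idea might look with only the stdlib is sketched below. It assumes the script body is a plain function that uses bare `assert` statements; `check_sorting` and `as_suite` are hypothetical names, while `unittest.FunctionTestCase` is the existing stdlib wrapper.

```python
import unittest


def check_sorting():
    """A throwaway script body: exercise some code and assert on the results."""
    data = [3, 1, 2]
    result = sorted(data)
    assert result == [1, 2, 3], result
    assert data == [3, 1, 2], "sorted() must not modify its input"


def as_suite(*functions):
    """Wrap plain functions as unittest test cases without rewriting them."""
    return unittest.TestSuite(unittest.FunctionTestCase(f) for f in functions)


if __name__ == "__main__":
    # Run as a plain script, e.g. under pdb or from an interactive prompt.
    check_sorting()
    # Run the very same function through the stdlib runner for a summary.
    unittest.TextTestRunner(verbosity=2).run(as_suite(check_sorting))
```

This only covers the "no rewriting" part of the wish list; recording several conditions per script and summarizing them would need more than this sketch.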

Most importantly: I want an easy workflow!

  • The time invested in trying out an idea and pounding on a small test script should contribute to the final test suite results without having to redo all that in some other format later (doctest, unittest, whatever).
  • Very convenient debugging/exploration of failures. Interactive, pdb-style, good tracebacks, etc.

Questions

  • What do other people use?
  • Good workflows, tools?
  • What should we put into python or numpy.testing so we all stop reinventing this particular wheel?
  • code coverage: py.test/nose have it, but I think it belongs in the stdlib.

Meeting notes

Gripes

  • Parametrized tests:

    You want to test a function over many values. One solution is to loop inside a unit test; the problem is that the failure of one case (one iteration of the loop) fails the entire test.

    Titus remarks that nose addresses this well.

  • Doctest workflow:

    Going back and forth between the test file and the python/ipython prompt is clumsy; there is too much friction in the workflow.

  • Standalone scripts

    I would like to be able to introspect/debug/explore/run the test as normal python code. Travis O. seconds that: it is a pain to debug failing tests.

  • Statistical testing

    Mike M. remarks that there is no statistics-based testing suite that runs through a large codebase (JUMBL, developed by U. Tennessee, provides this).

  • Historical regression test suite

    Eric J. mentions that there is no good regression test suite, including a performance test suite, that stores results in a historical database.

  • Packaging testing

    Eric J. mentions that testing a large distribution is very painful and has to be hand-implemented.

    One solution is to use ipython, which can test the output of shell commands with a pexpect-based test runner (ipdoctest).

  • UI testing

    No good tools to do UI testing. How do you define ?

  • Tests in svn to do performance testing.

  • How do you test multiprocessor code (e.g. coverage analysis, can't open 50 000 files)?

  • Testing levels like in scipy: a smoke test that runs in 10 s. This is what unittest test suites are supposed to be used for. Nose has this kind of feature.

  • How do you test for memory size (gmemprof for C)?

Solutions

  • Extension of the build framework to do regression testing on distribution.

  • UI testing:
    • Win32: using the accessibility framework

    • Qt4: has testing

    • KWWidgets: cross-platform widget library, with testing built in (made by Kitware)

    • VTK: compare two images (one reference, the other tested), make a diff between the two, and if the diff is above a threshold, complain (see DART2).

    • Red Hat dogtail.

    • Eric J. mentions that good MVC-like separation, plus a good description of the GUI layout by text/object, helps (see Traits or Pyre for instance).

    • LDTP: test anything accessibility-enabled, but Linux only.

  • Code coverage testing:
    • Figleaf
    • coverage.py

  • Nose:
    • Nice docs
    • Ease of use
    • Based on unittest (setup.py test-compatible)
    • Test discovery
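
The image-comparison approach listed under UI testing (diff a tested image against a reference and complain above a threshold) can be sketched in a few lines. This is not VTK's actual algorithm: the mean-absolute-difference metric, the list-of-rows image representation, and the names `image_difference` and `images_match` are all assumptions made for illustration.

```python
def image_difference(reference, candidate):
    """Mean absolute per-pixel difference between two equally sized images.

    Images are assumed to be lists of rows of grayscale values.
    """
    if len(reference) != len(candidate):
        raise ValueError("images differ in height")
    total = 0
    count = 0
    for ref_row, cand_row in zip(reference, candidate):
        if len(ref_row) != len(cand_row):
            raise ValueError("images differ in width")
        for ref_pixel, cand_pixel in zip(ref_row, cand_row):
            total += abs(ref_pixel - cand_pixel)
            count += 1
    return total / float(count) if count else 0.0


def images_match(reference, candidate, threshold=1.0):
    """Return True when the difference stays at or below the threshold."""
    return image_difference(reference, candidate) <= threshold


if __name__ == "__main__":
    reference = [[0, 0], [255, 255]]
    rendered = [[0, 1], [255, 254]]
    print(images_match(reference, rendered))
```

A small tolerance like this absorbs rendering noise between platforms, at the cost of possibly missing very small regressions.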

Future solutions

  • Pylint can be set up to run each time you check in to your repository; this could be "borrowed".

References

  • Testing in Python mailing list
  • Python testing tools taxonomy

SciPy: SciPy2007/TestingNotes (last edited 2015-10-24 17:48:26 by anonymous)