Organizing Python Unit Tests

Having gained most of my TDD experience in the Ruby world, I became very used to writing unit tests in RSpec.  Jana’s API is built in Python with Flask, and Python’s built-in unittest framework is straightforward to learn, since it is basically a direct port of JUnit.  However, I quickly felt like I didn’t have good patterns for organizing my tests.

Regardless of language and framework, I believe that good unit testing follows these widely discussed practices:

  • One assertion per test
  • Keep the data and setup close to the test

Quick note: I use methods to describe Ruby code, and functions to describe Python code.

One Assertion Per Test

You can get kind of crazy in RSpec and literally make one assertion per test. However, I find that a bit overboard for my taste. In addition, for tests that require somewhat costly setup, one assertion per test can make your test suite take much longer to run than necessary. So I think of it as “one scenario per test”: after my data setup and a single invocation of the method under test with one set of input, I can make any number of assertions to verify the output.
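
To make “one scenario per test” concrete, here is a minimal hypothetical sketch in unittest (the date-parsing example is invented for illustration): one setup, one invocation, then several assertions that all verify the same scenario’s output.

import unittest
from datetime import datetime

class ParseDateTest(unittest.TestCase):

    def test_parses_iso_formatted_string(self):
        # one scenario: a single invocation with one set of input
        parsed = datetime.strptime('2014-06-15', '%Y-%m-%d')

        # several assertions, all verifying that one scenario's output
        self.assertEqual(parsed.year, 2014)
        self.assertEqual(parsed.month, 6)
        self.assertEqual(parsed.day, 15)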

Keep your data close

Being able to scan code and tests quickly is often a time saver. If we’re optimizing for that, then the closer the test setup is to the test assertions, the better.  I’ll cover this in more depth in a follow-up post.
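
To give a flavor before that post, here is a small hypothetical sketch: the input and expected values sit right beside the assertion, so you never have to jump to a distant fixture to understand the test.

import unittest

class SlugifyTest(unittest.TestCase):

    def test_replaces_spaces_with_hyphens(self):
        # data defined here, right beside the assertion it supports
        title = 'Organizing Python Unit Tests'
        expected = 'organizing-python-unit-tests'

        self.assertEqual(title.lower().replace(' ', '-'), expected)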

An RSpec Example

In RSpec, you can use the describe method to arbitrarily organize your tests. One very common pattern is to organize your tests by class or module, then by method, then by scenario.  For example, if we had a Calculator class, we might have something like this (leaving out fancier RSpec syntax for the example):


describe Calculator do
  describe '#add' do
    it 'adds two numbers' do
      calculator = Calculator.new
      expect(calculator.add(2, 2)).to eq(4)
    end

    it 'raises InvalidInputError if an argument is nil' do
      calculator = Calculator.new
      expect { calculator.add(nil, 2) }.to raise_error(InvalidInputError)
    end
  end

  describe '#subtract' do
    # you get the point
  end
end

As you can see in this simple example, we make assertions per set of input.  Another reason to organize tests this way is to take advantage of the test output: you get a very detailed explanation of what failed. If we had just one test method that covered both successful addition and the exception case, our test output would read something like:

Calculator#add - Fail

when we could have the following, which is more specific about the failure and helps us troubleshoot and debug much faster:

Calculator#add adds two numbers - Fail

Python unittest

I noticed our test cases were starting to violate the one assertion per test guidance. Although we created a TestCase for each class, we still combined several scenarios into a single test function. It looked like this:

# test/CalculatorTest.py

class CalculatorTest(unittest.TestCase):

    def test_add(self):
        # test success
        calculator = Calculator()
        self.assertEqual(calculator.add(2, 2), 4)

        # test error
        self.assertRaises(InvalidInputError, calculator.add, None, 2)

    def test_subtract(self):
        # ...
        pass

Of course, you say, we should break up the success case and the error case into two test functions: test_add_success and test_add_nil_argument_raises_error.  Unluckily for us, our function names are not always so short, and describing the failure can take up a lot of space as well.  Imagine our calculator had a function perform_advanced_physics_simulation:

# test/CalculatorTest.py

class CalculatorTest(unittest.TestCase):

    def test_perform_advanced_physics_simulation_with_string_theory(self):
        # magic
        pass

    def test_perform_advanced_physics_simulation_in_a_vacuum(self):
        # more magic
        pass

    def test_perform_advanced_physics_simulation_when_dataset_is_sparse(self):
        # even more magic
        pass

The function names start to get long because describing the test scenario is important to understanding what you are testing. Not only does this become tricky with PEP8 compliance, it’s really tedious!

One TestCase per Function

So can we combine some of the ideas from the RSpec example with the facilities in unittest?  I think we can.  I’ve been using the following pattern in my more recent tests and have been happy with the results.  No one said you can’t use more than one TestCase in a file, so let’s take advantage of that. For every function we want to test, let’s create a TestCase.  Our file will logically organize the tests for our Calculator class.

# test/CalculatorTest.py

class AddTest(unittest.TestCase):

    def test_adds_two_numbers(self):
        calculator = Calculator()
        self.assertEqual(calculator.add(2, 2), 4)

    def test_raises_error_with_nil_argument(self):
        calculator = Calculator()
        self.assertRaises(InvalidInputError, calculator.add, None, 2)

class SubtractTest(unittest.TestCase):
    # ... test scenarios for the subtract function
    pass

class PerformAdvancedPhysicsSimulationTest(unittest.TestCase):

    def test_with_string_theory(self):
        # still magic
        pass

    def test_in_a_vacuum(self):
        # still more magic
        pass

    def test_when_dataset_is_sparse(self):
        # still even more magic
        pass

By organizing our functions under test into their own classes, we get several wins:

  1. We can omit the function under test’s name (Add, Subtract, PerformAdvancedPhysicsSimulation) from the test function name.
  2. We have more characters to describe our test scenario. This might not be a big deal if you’re not following PEP8’s 79-character line limit and 4-space indentation rules.
  3. When practicing TDD and using nose, you can run every scenario for one function by invoking its TestCase: nosetests test/CalculatorTest:AddTest. If you want to run a single test, you can still do: nosetests test/CalculatorTest:AddTest.test_adds_two_numbers
  4. If we have generic setup required to test each function, we can now leverage the setUp or setUpClass function in each TestCase. When setup differs for each function, this is more convenient than trying to share setup across all of them, and it helps us keep our test setup closer to our tests.  Note: sharing setup between functions under test requires us to add other Python functions to the file, or to create a superclass shared between our TestCases (see the sketch after this list).
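
For point 4, here is a rough sketch of the superclass approach, reusing the Calculator example from above (the shared setUp is deliberately trivial, just to show the shape):

# test/CalculatorTest.py

class CalculatorTestBase(unittest.TestCase):
    # shared setup for every function-level TestCase in this file
    def setUp(self):
        self.calculator = Calculator()

class AddTest(CalculatorTestBase):

    def test_adds_two_numbers(self):
        self.assertEqual(self.calculator.add(2, 2), 4)

class SubtractTest(CalculatorTestBase):

    def test_subtracts_two_numbers(self):
        self.assertEqual(self.calculator.subtract(4, 2), 2)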

Conclusion

I believe that when tests are low overhead and don’t feel tedious, they can provide a lot of value, and developers might even enjoy writing them.  Using some simple patterns to organize our tests, we can create simpler, easier-to-read, more maintainable tests without installing any additional packages (although nose seems like a big win).

Give this a shot and let me know what you think. I’ve been trying this out for a few weeks now and I’m happy with the results. If you have other techniques for writing better unit tests or doing TDD in Python, please leave a comment! I’m still looking for good red/green test output and an autotest/watchr/guard replacement.
