A Note About Comparisons

I thought it would be good to continue on the theme presented by Yohei last month and show one of the more subtle bugs I have encountered.

Suppose you need to write a “between” function that returns True if and only if x lies between a minimum and a maximum value, inclusive. Consider two possible implementations of the function:

    def between_0(x, minimum, maximum):
      if x >= minimum and x <= maximum:
        return True
      else:
        return False

    def between_1(x, minimum, maximum):
      if x < minimum or x > maximum:
        return False
      else:
        return True

Would you:

  1. Choose the first implementation
  2. Choose the second implementation
  3. Choose either because they are logically equivalent

At this point I hope that you trust my ability to select an interesting topic enough not to choose 3, even though De Morgan’s Law might convince you that the two implementations are equivalent. To help illustrate the difference between 1 and 2, let’s look at a unit test that we might want to use for this function:

    def test_nan_not_between():
      # `between` stands for whichever implementation is under test
      result = between(float('nan'), 1, 10)
      assert result is False

float('nan') here is a special floating-point value called “Not a Number”, commonly written as NaN. It is produced by arithmetic operations that have no meaningful numeric result, such as subtracting infinity from infinity. NaN has a very odd status: logically speaking it is incomparable to numeric floating-point values, but the floating-point standard (IEEE 754) dictates that rather than resulting in an error, the comparisons <, <=, >, >= and == all return False when either operand is NaN (only != returns True). An unfortunate consequence of this behavior is that, unlike for most comparable types, for floating-point numbers <= is not the logical negation of >.
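A quick way to see this for yourself is to evaluate the comparisons directly. This is only a small sketch; the variable names are mine and exist purely for illustration:

```python
import math

nan = float('nan')

# Every ordered comparison involving NaN evaluates to False.
ordered = [nan < 1, nan <= 1, nan > 1, nan >= 1]

# So "not (nan > 1)" and "nan <= 1" disagree with each other.
negation_of_gt = not (nan > 1)
le = nan <= 1

# NaN is not even equal to itself; math.isnan is the reliable way to detect it.
equals_itself = (nan == nan)
differs_from_itself = (nan != nan)
detected = math.isnan(nan)

print(ordered, negation_of_gt, le, equals_itself, differs_from_itself, detected)
```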

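Getting back to the question of which of the two implementations to choose: when x is NaN, every comparison in both functions is False. In between_0 the condition x >= minimum and x <= maximum is therefore False, so the function returns False and the test passes. In between_1 the condition x < minimum or x > maximum is also False, so the function falls through to the else branch and returns True, and the test fails. We should definitely not use between_1.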
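To check this claim, here is a condensed sketch of the two implementations run side by side. The bodies are collapsed to a single return expression each, which should behave the same as the if/else versions:

```python
def between_0(x, minimum, maximum):
    # True only when both comparisons succeed; NaN fails both.
    return x >= minimum and x <= maximum

def between_1(x, minimum, maximum):
    # True whenever neither "out of range" comparison succeeds;
    # NaN fails both comparisons, so it is wrongly reported as in range.
    return not (x < minimum or x > maximum)

nan = float('nan')
result_0 = between_0(nan, 1, 10)
result_1 = between_1(nan, 1, 10)

# For ordinary numbers the two implementations agree.
agree_inside = between_0(5, 1, 10) == between_1(5, 1, 10)
agree_outside = between_0(11, 1, 10) == between_1(11, 1, 10)

print(result_0, result_1, agree_inside, agree_outside)
```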

It is easy to forget about this odd interaction, and I’m sure this small error has been repeated many times. I even encountered a variant of this bug in the C# testing library’s assertAreEqual function that caused my tests to incorrectly pass erroneous results involving NaNs.
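As a hypothetical illustration of how this kind of bug can creep into an equality check (this is not the library’s actual code, and both function names are made up), compare a tolerance check written in the “fail if out of range” style with one written in the “pass only if in range” style:

```python
def approx_equal_buggy(expected, actual, delta):
    # Hypothetical: reports a mismatch only when the difference exceeds delta.
    # If actual is NaN, the difference is NaN and the comparison is False,
    # so the mismatch is never reported.
    if abs(expected - actual) > delta:
        return False
    return True

def approx_equal_safe(expected, actual, delta):
    # Hypothetical: reports a match only when the difference is within delta.
    # A NaN difference fails this comparison, so NaN never counts as equal.
    return abs(expected - actual) <= delta

nan = float('nan')
buggy = approx_equal_buggy(1.0, nan, 0.001)
safe = approx_equal_safe(1.0, nan, 0.001)

print(buggy, safe)
```

Note that the second version also treats two NaNs as unequal. Whether that is what you want depends on the test, so it may be worth handling NaN explicitly with math.isnan.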

This is just one of many complications of dealing with floating-point numbers. Keep it in mind when writing functions (and tests) that must deal with them.
