Why don’t people use doctest in Python?

Apr 29, 2023

What is this somewhat unknown, yet very useful Python feature?

One of the most important processes in programming is testing your code to make sure that it has no bugs and that it never produces unwanted output for the user.

There are a few ways to approach testing. Many programmers think of a handful of test cases and their desired outputs, then run each case by hand. This is highly inefficient, and important cases are often missed. Because of this, programmers have devised systematic ways to choose test cases so that as much of the program’s behaviour as possible is exercised. Two common ones are:

Path coverage: This method chooses test cases so that every path through the program is executed at least once. (A useful Python library called coverage reports which lines and branches your tests actually ran.)
Boundary values: This method tests the values on and around the edges of each category of input. For example, if we have a script that checks whether a number is between 0 and 9, inclusive, then the boundary values would be [-1, 0, 1] and [8, 9, 10].

def is_single_digit(number: int) -> bool:
    if number >= 0 and number <= 9:
        return True
    return False
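The boundary values listed above can be checked mechanically. Here is a minimal sketch using plain assert statements; the boundary_cases dictionary is my own illustration and not part of the original script.

```python
def is_single_digit(number: int) -> bool:
    if number >= 0 and number <= 9:
        return True
    return False

# Boundary values around both edges of the valid range (0 to 9, inclusive),
# mapped to the result we expect for each one.
boundary_cases = {-1: False, 0: True, 1: True, 8: True, 9: True, 10: False}

for value, expected in boundary_cases.items():
    assert is_single_digit(value) == expected, f"unexpected result for {value}"

print("all boundary cases passed")
```

If the comparison were accidentally written as number > 0, the loop would fail on the value 0, which is exactly the kind of off-by-one mistake boundary values are meant to catch.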

As programs get longer and more complex, choosing and running test cases by hand becomes difficult. That’s why it is useful to have tools that allow you to automatically run test cases. While you still have to come up with the test cases yourself, it does make the process a lot easier.

Python has a standard-library module called doctest, which runs examples written like an interactive session and compares the actual output with the expected output. To demonstrate this, I have written a simple Python script, factors.py, to calculate the factors of a given integer.

def find_factors(num):
    factor_list = []
    for i in range(1, num + 1):
        if num % i == 0:
            factor_list.append(i)
    return factor_list

Testing this program is quite simple, as we can easily check whether the factors calculated are correct: multiply the factor at index n by the factor at index -(n+1) in the list. If the product equals the input number, then those two numbers are factors of it. Repeat this for every index.
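The pairing check described above can be sketched in a few lines. The helper name pairs_multiply_to is hypothetical, made up for this illustration.

```python
def find_factors(num):
    factor_list = []
    for i in range(1, num + 1):
        if num % i == 0:
            factor_list.append(i)
    return factor_list


def pairs_multiply_to(num, factors):
    """Hypothetical helper: pair index n with index -(n + 1) and check each product."""
    return all(factors[n] * factors[-(n + 1)] == num for n in range(len(factors)))


print(pairs_multiply_to(10, find_factors(10)))  # True
print(pairs_multiply_to(10, [1, 3, 10]))        # False, because 3 * 3 != 10
print(pairs_multiply_to(10, [1, 10]))           # True, even though 2 and 5 are missing
```

As the last line shows, this check only confirms that the numbers in the list are factors. It cannot tell you that a pair of factors is missing, so it complements a list of expected outputs rather than replacing it.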

Let’s write a simple doctest file, test_factors.py.

"""
>>> from factors import find_factors as ff
>>> ff(2)
[1, 2]
>>> ff(10)
[1, 2, 5, 10]
"""
import doctest
doctest.testmod(verbose=True)

I added the optional parameter “verbose=True” to make doctest print what it is doing. Without it, doctest prints nothing when every test passes.

After running test_factors.py, we get an output of:

Trying:
    from factors import find_factors as ff
Expecting nothing
ok
Trying:
    ff(2)
Expecting:
    [1, 2]
ok
Trying:
    ff(10)
Expecting:
    [1, 2, 5, 10]
ok
1 items passed all tests:
   3 tests in __main__
3 tests in 1 items.
3 passed and 0 failed.
Test passed.

Every line that starts with “>>>” is a test case, and the line below it is the expected output. doctest compares the expected output with the actual output. In this case, all the test cases passed. Can we conclude that the program is error free and will always output the correct answer? The short answer is no. We have only shown that the program works for one positive prime number (2) and one positive non-prime number (10). What happens if the user enters a negative number, or a floating point number, or an absurdly large number, or nothing, or None, or a string, dictionary, list, tuple, set, nested lists, objects… You get the point.
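The examples do not have to live in a separate file. Here is a minimal sketch, assuming we move the same two test cases into the docstring of find_factors itself, so that the tests sit right next to the code they describe:

```python
import doctest


def find_factors(num):
    """Return the positive factors of a positive integer.

    >>> find_factors(2)
    [1, 2]
    >>> find_factors(10)
    [1, 2, 5, 10]
    """
    factor_list = []
    for i in range(1, num + 1):
        if num % i == 0:
            factor_list.append(i)
    return factor_list


if __name__ == "__main__":
    # testmod() collects the examples from every docstring in this module.
    doctest.testmod(verbose=True)
```

With the tests in the docstring, running python -m doctest -v factors.py from the command line should also execute them, without needing the two lines at the bottom.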

If we consider the mathematical definition of a factor (an integer a is a factor of an integer n if there is an integer b such that a × b = n), then the factors of a negative number are the positive and negative factors of the corresponding positive number. For instance, the factors of -10 are 1, -1, 2, -2, 5, -5, 10, -10. Checking these with the pairing method from earlier, we get:

1 × -10 = -10
-1 × 10 = -10
2 × -5 = -10
-2 × 5 = -10

Therefore these factors are correct. However, upon testing our script with -10 we get an empty list, which is of course incorrect. We can modify the script a bit to account for this:

def find_factors(num):
    factor_list = []
    if num >= 0:
        for i in range(1, num + 1):
            if num % i == 0:
                factor_list.append(i)
        return factor_list
    else:
        for i in range(-1, num-1, -1):
            if num % i == 0:
                factor_list.append(-i)
                factor_list.append(i)
        return factor_list

This will now return the correct values for -10 and any other negative numbers.

What if I input 432875987598723985723897593827598379853205894096843? This would take an absurdly long time to calculate, so we need to put limits on our program. I’m going to require that the input number is between -1 000 000 and 1 000 000, inclusive. Additionally, we want to return a clear error message to the user that explains the limitation.

While we’re at it, let’s fix the problem of people potentially entering floating-point numbers, strings, lists, tuples, None, nothing, etc. Instead of creating a blacklist of what cannot be entered, we will create a whitelist of what can be entered: any integer. (Some people may notice that I could have combined the two conditions, the range check and the integer whitelist, into one. I am keeping them separate so that I can show different error messages.)

Putting this all together we get:

def find_factors(num=None):

    if not isinstance(num, int):
        return "Input Must Be An Integer"
    if not (-1000000 <= num <= 1000000):
        return "Input Must Be Between -1 000 000 And 1 000 000, Inclusive"
    
    factor_list = []
    if num >= 0:
        for i in range(1, num + 1):
            if num % i == 0:
                factor_list.append(i)
        return factor_list
    else:
        for i in range(-1, num-1, -1):
            if num % i == 0:
                factor_list.append(-i)
                factor_list.append(i)
        return factor_list
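One edge case worth knowing about with the isinstance check above: in Python, bool is a subclass of int, so True and False pass the whitelist and are treated as 1 and 0. If that is not what you want, a stricter check is possible. This is only a sketch, and is_plain_int is a hypothetical helper name, not part of the script above.

```python
# bool is a subclass of int, so the plain isinstance check accepts it.
print(isinstance(True, int))  # True


def is_plain_int(value):
    """Hypothetical stricter whitelist: accept int but reject bool."""
    return isinstance(value, int) and not isinstance(value, bool)


print(is_plain_int(10))       # True
print(is_plain_int(True))     # False
print(is_plain_int("hello"))  # False
```

Whether booleans should count as valid input is a design decision. The point is that it is one more category of input to think about when writing the test cases.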

Now modifying our doctest file:

"""
>>> from factors import find_factors as ff
>>> ff(0)
[]
>>> ff(1)
[1]
>>> ff(10)
[1, 2, 5, 10]
>>> ff(-10)
[1, -1, 2, -2, 5, -5, 10, -10]
>>> ff(-1)
[1, -1]
>>> ff(1000001)
'Input Must Be Between -1 000 000 And 1 000 000, Inclusive'
>>> ff(-1000000)
[1, -1, 2, -2, 4, -4, 5, -5, 8, -8, 10, -10, 16, -16, 20, -20, 25, -25, 32, -32, 40, -40, 50, -50, 64, -64, 80, -80, 100, -100, 125, -125, 160, -160, 200, -200, 250, -250, 320, -320, 400, -400, 500, -500, 625, -625, 800, -800, 1000, -1000, 1250, -1250, 1600, -1600, 2000, -2000, 2500, -2500, 3125, -3125, 4000, -4000, 5000, -5000, 6250, -6250, 8000, -8000, 10000, -10000, 12500, -12500, 15625, -15625, 20000, -20000, 25000, -25000, 31250, -31250, 40000, -40000, 50000, -50000, 62500, -62500, 100000, -100000, 125000, -125000, 200000, -200000, 250000, -250000, 500000, -500000, 1000000, -1000000]
>>> ff("hello")
'Input Must Be An Integer'
"""
import doctest
doctest.testmod(verbose=True)

We get an output of:

Trying:
    from factors import find_factors as ff
Expecting nothing
ok
Trying:
    ff(0)
Expecting:
    []
ok
Trying:
    ff(1)
Expecting:
    [1]
ok
Trying:
    ff(10)
Expecting:
    [1, 2, 5, 10]
ok
Trying:
    ff(-10)
Expecting:
    [1, -1, 2, -2, 5, -5, 10, -10]
ok
Trying:
    ff(-1)
Expecting:
    [1, -1]
ok
Trying:
    ff(1000001)
Expecting:
    'Input Must Be Between -1 000 000 And 1 000 000, Inclusive'
ok
Trying:
    ff(-1000000)
Expecting:
    [1, -1, 2, -2, 4, -4, 5, -5, 8, -8, 10, -10, 16, -16, 20, -20, 25, -25, 32, -32, 40, -40, 50, -50, 64, -64, 80, -80, 100, -100, 125, -125, 160, -160, 200, -200, 250, -250, 320, -320, 400, -400, 500, -500, 625, -625, 800, -800, 1000, -1000, 1250, -1250, 1600, -1600, 2000, -2000, 2500, -2500, 3125, -3125, 4000, -4000, 5000, -5000, 6250, -6250, 8000, -8000, 10000, -10000, 12500, -12500, 15625, -15625, 20000, -20000, 25000, -25000, 31250, -31250, 40000, -40000, 50000, -50000, 62500, -62500, 100000, -100000, 125000, -125000, 200000, -200000, 250000, -250000, 500000, -500000, 1000000, -1000000]
ok
Trying:
    ff("hello")
Expecting:
    'Input Must Be An Integer'
ok
1 items passed all tests:
   9 tests in __main__
9 tests in 1 items.
9 passed and 0 failed.
Test passed.

Great! Our program has passed all the tests!
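The expected list for -1 000 000 is long enough that a typo in it would be hard to spot. One alternative, sketched below, is to test properties of the result and use doctest’s ELLIPSIS option so that “...” in the expected output matches the middle of the list. The find_factors here is the earlier version without input validation, to keep the sketch short, and the LONG_LIST_EXAMPLES name is made up for this illustration.

```python
import doctest


def find_factors(num):
    factor_list = []
    if num >= 0:
        for i in range(1, num + 1):
            if num % i == 0:
                factor_list.append(i)
    else:
        for i in range(-1, num - 1, -1):
            if num % i == 0:
                factor_list.append(-i)
                factor_list.append(i)
    return factor_list


# Hypothetical examples: check the length and both ends instead of every item.
LONG_LIST_EXAMPLES = """
>>> result = find_factors(-1000000)
>>> len(result)
98
>>> result[:4]
[1, -1, 2, -2]
>>> result  # doctest: +ELLIPSIS
[1, -1, 2, -2, ..., 1000000, -1000000]
"""

# Parse the string into a doctest and run it directly.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    LONG_LIST_EXAMPLES, {"find_factors": find_factors}, "long_list", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, "failed out of", results.attempted)
```

The trade-off is that these examples no longer pin down every value in the list, so they are best used alongside a few smaller cases whose full output is easy to read.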

It would be wonderful if we could just test every possible input, but that is just not possible. In this case, there are infinitely many integers that a user could input, and it is not feasible to check the output for each one individually. By using either path coverage or boundary values with a doctest file, you can try to account for most kinds of input.

You might be screaming at me right now for missing a possible test case, which I might very well have done (if I have, please leave a comment saying so). This shows the importance of giving your programs to other people to test, as not everyone approaches testing in the same way.

So please use a doctest file when programming in Python. It can come in very handy, and there is a lot more to the doctest module than what I have explained here, so keep exploring!

Cheers!

Written by Gideon Weiss