Saturday, May 11, 2013

python mocking

If you are into unit testing, you have probably been introduced to mocking. And if that is the case, you have probably already been bitten by it. Mocking requires some understanding of code execution, importing and name resolution that most people lack when they first run into such situations. In Python, mocking is a relatively simple process if you analyze carefully what needs to be done.

Mocking simply means replacing an object with another. This is usually done to avoid instantiating costly systems or to change the behaviour of a system. To begin with a simple example:
def some_function_in_your_code(a):
    if a.do_something():
        return 3.14159
    else:
        return 6.28318

def some_other_function_in_your_code():
    # ...
    a = create_complicated_object(*thousand_parameters)
    result = some_function_in_your_code(a)
    # ...
Here we have a function some_function_in_your_code that uses an object. Somewhere else in your code, some_other_function_in_your_code creates that object, which is a complicated process that involves hundreds of operations. If you just want to test some_function_in_your_code, you shouldn't need to go through this whole process[1].

To avoid that, we can use mocking. Notice that all we need to test some_function_in_your_code is an argument that has a member called "do_something" that can be called with no arguments and returns something that can be converted to bool. There are many ways to do that, but to keep things short, I'll skip right to the library I use most of the time, mock.
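For comparison, one of those other ways is a hand-written stand-in. This is only a minimal sketch: FakeArgument is a made-up name, and the function under test is repeated so the block runs on its own.

```python
def some_function_in_your_code(a):
    # The function under test, repeated so this block is self-contained.
    if a.do_something():
        return 3.14159
    else:
        return 6.28318


class FakeArgument(object):
    """Hypothetical stand-in: provides only what the function touches."""

    def __init__(self, answer):
        self.answer = answer

    def do_something(self):
        # Callable with no arguments, returns something usable as a bool.
        return self.answer


result_when_true = some_function_in_your_code(FakeArgument(True))
result_when_false = some_function_in_your_code(FakeArgument(False))
```

It works, but you end up writing a small class like this for every object you want to replace, which is the chore a mocking library takes off your hands.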

The main component of the mock library is the Mock class, which is basically an empty object with a few useful characteristics (be sure to check the documentation, because they are really useful). Two of them are important for this discussion. First, whenever you access an attribute that doesn't exist yet on a Mock object, another Mock object is created and assigned to that attribute, so later accesses return the same object.
>>> import mock
>>> m = mock.Mock()
>>> print(m)
<Mock id='140095453080976'>
>>> print(id(m.some_attribute))
140095453131664
>>> print(m.some_attribute)
<Mock name='mock.some_attribute' id='140095453131664'>
>>> print(id(m.some_attribute))
140095453131664
As you can see, when we tried to access some_attribute, a Mock object was created for us, and the second access returned that same object. We could do it by hand with a single line of code, but letting Mock do it makes the code easier, shorter and cleaner.
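If you'd rather check that with assertions than by comparing ids, here is a minimal sketch. It assumes Python 3, where the library ships in the standard library as unittest.mock (with the standalone package, import mock behaves the same way for these calls). The attribute names are made up.

```python
from unittest import mock

m = mock.Mock()

# The first access creates a child Mock; later accesses return that object.
first = m.some_attribute
second = m.some_attribute
same_object = first is second

# A different attribute name gets a different child mock.
different_object = m.some_attribute is not m.other_attribute

# The "by hand" version mentioned above: assign the attribute yourself.
by_hand = mock.Mock()
by_hand.some_attribute = mock.Mock()
assigned_is_mock = isinstance(by_hand.some_attribute, mock.Mock)
```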

The other feature of Mock objects is the return_value attribute. Whatever is contained in this attribute gets returned when the object is called as a function object [2].
>>> import mock
>>> m = mock.Mock()
>>> m.return_value = u'the return value'
>>> m()
u'the return value'
Using these two techniques, we can now test our function:
import mock

def your_test():
    a = mock.Mock()
    a.do_something.return_value = True
    if some_function_in_your_code(a) != 3.14159:
        raise Exception(u"return wasn't 3.14159.")
    # Change what do_something returns, not what calling a itself returns.
    a.do_something.return_value = False
    if some_function_in_your_code(a) != 6.28318:
        raise Exception(u"return wasn't 6.28318.")
Let's break that down. We first create a Mock object to represent the argument passed to the function. The next line takes care of the two things we need to test the function: the do_something attribute and its return value. Then, all we have to do is call the function, passing the mocked argument, and check the return value. After that, repeat the process, this time with a different return value.
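A Mock also remembers how it was called, which lets the test check that the function really used the argument. The following is a minimal sketch of the same test written with assert statements, assuming Python 3's unittest.mock. The function under test is repeated so the block runs on its own.

```python
from unittest import mock


def some_function_in_your_code(a):
    # The function under test, repeated so this block is self-contained.
    if a.do_something():
        return 3.14159
    else:
        return 6.28318


def your_test():
    a = mock.Mock()

    a.do_something.return_value = True
    assert some_function_in_your_code(a) == 3.14159

    a.do_something.return_value = False
    assert some_function_in_your_code(a) == 6.28318

    # The mock recorded both calls, each made with no arguments.
    assert a.do_something.call_count == 2
    a.do_something.assert_called_with()


your_test()
```

call_count and assert_called_with are part of the Mock API; assert_called_with only checks the most recent call.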

That was the easy part


This first section was easy, nothing you couldn't find out with a quick search on the internet. But the real world is not that pretty (at least mine isn't). The trickiest situation is when you have to change the behaviour of something inside a function, but you don't pass that something as an argument. Suppose we have this:
# some_file.py
import random

def f():
    if random.random() < 0.5:
        return 3.14159
    else:
        return 6.28318
How can you test that function if the value tested is conjured from oblivion in the middle of the function? Fear not, you can actually do it. The trick here is to mock the random function from the random module before the first line of f is executed. Let's start with this simple example, just to get the basic idea:
>>> import random
>>> random.random()
0.5212285734499994
>>> random.random()
0.40492920488281725
>>> import mock
>>> random.random = mock.Mock(return_value=0.5)
>>> random.random()
0.5
This seems pretty simple. But here is where everybody gets lost:
>>> import random
>>> import mock
>>> random.random = mock.Mock(return_value=0.5)
>>> import some_file
>>> assert some_file.f() == 6.28318, 'this should not be happening'
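With some_file exactly as written above, this passes every time: both modules share the same random module object, and f looks up random.random on it at call time, so it finds the mock. The trap is the other common import style. If some_file does from random import random and f calls random() directly, the mock is silently ignored and you have a 50% chance of getting an exception. You can see it by putting a print(random) inside f: it is still the original function. To understand why, we have to dig a little deeper.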

Import


What happens when you run import random? You can watch this video to know exactly what. Or you can just continue reading to get a summary. Or both. Anyway:
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', '__doc__': None, '__package__': None}
>>> import random
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', 'random': <module 'random' from '/usr/lib/python2.7/random.pyc'>, '__doc__': None, '__package__': None}
In Python, locals is a built-in function that returns the local variables accessible at the current point of execution (if you don't believe me, run print(__builtins__.locals)). When you execute import random, the interpreter does its magic to find the module and load it, but more importantly, it creates an entry called "random" in the current namespace referring to the loaded module. The critical part here is "current namespace". Try this:
>>> def f(): print(locals())
>>> import random
>>> f()
{}
Here, importing random didn't affect the local namespace of f. The same thing applies to the namespaces of other modules: each module has its own, and an import in the module where we run our tests only creates a name there. A mock only takes effect when it replaces the object that the code under test finds through its own names. To change what some_file sees, we have to go through its namespace explicitly:
>>> import some_file
>>> some_file.random.random = lambda: 0.5
>>> assert some_file.f() == 6.28318, 'this should not be happening'
You can run that as many times as you like if you don't trust me; it will always succeed. And it does because we are now changing the object through the namespace that f actually uses. You can check it by putting a print(random.random) in f again.
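To see the namespace rule at work without creating extra files, here is a minimal sketch. It builds two throwaway modules in memory, one using import random and one using from random import random, and patches each. The names uses_module, uses_name and make_module are made up for this illustration.

```python
import random
import types

# Hypothetical module sources: the same f, with two different import styles.
MODULE_STYLE = '''
import random

def f():
    return 3.14159 if random.random() < 0.5 else 6.28318
'''

NAME_STYLE = '''
from random import random

def f():
    return 3.14159 if random() < 0.5 else 6.28318
'''


def make_module(name, source):
    """Build an in-memory module so no files are needed."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


uses_module = make_module("uses_module", MODULE_STYLE)
uses_name = make_module("uses_name", NAME_STYLE)

original_random = random.random
random.random = lambda: 0.5
try:
    # f reaches the function through the shared module object at call
    # time, so it sees the replacement.
    module_style_result = uses_module.f()

    # uses_name bound its own name at import time and still holds the
    # original function, so its f would keep using real random numbers.
    name_style_untouched = uses_name.random is original_random
finally:
    # Always restore the shared module.
    random.random = original_random

# For the second style, replace the name inside the module that uses it.
original_name = uses_name.random
uses_name.random = lambda: 0.5
try:
    name_style_result = uses_name.f()
finally:
    uses_name.random = original_name
```

In both cases the replacement has to be made where f looks the name up: on the random module for the first style, and on the using module itself for the second.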

Being nice


Now you know how to mock, but there is something I must say before you leave. Always, always, ALWAYS restore any mock you do. Seriously. Even if you're sure no one will use the mocked attribute. You don't want to lose an entire day of work just to find out that the problem was an undone mock.

And doing it is so simple: store the original value in a variable and restore it after the operation. I like to do it as soon as the operation is complete, before anything else is executed, but you don't need to if you're not paranoid. Just to clear up any doubt, here is exactly how to do it:
>>> import random
>>> original_random = random.random
>>> random.random = lambda: 0.5
>>> # do something
>>> random.random = original_random
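One case the snippet above doesn't cover is an exception in the middle of the operation, which would skip the restore line. A minimal sketch of a guard using try/finally follows; run_with_fixed_random and failing_operation are made-up names for the illustration.

```python
import random


def run_with_fixed_random(operation, value=0.5):
    """Hypothetical helper: call operation() while random.random is replaced."""
    original_random = random.random
    random.random = lambda: value
    try:
        return operation()
    finally:
        # Runs whether operation() returned or raised.
        random.random = original_random


def failing_operation():
    raise ValueError("something went wrong in the middle of the test")


before = random.random

# The lambda looks random.random up at call time, so it sees the mock.
fixed_value = run_with_fixed_random(lambda: random.random())

try:
    run_with_fixed_random(failing_operation)
except ValueError:
    pass

# The original function is back even though the operation raised.
restored = random.random is before
```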
Now you have no excuse. Better yet, you can use another feature of the mock library called patch. But that would be an extension to an already long post. Maybe I'll cover it in the future. Anyway, happy mocking!
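Just as a small taste of patch, here is a minimal sketch assuming Python 3's unittest.mock. It uses patch.object as a context manager, which replaces the attribute on entry and restores it on exit, even if the body raises. The function f is repeated from some_file so the block runs on its own.

```python
import random
from unittest import mock


def f():
    # Same logic as f in some_file.py, repeated so this block is self-contained.
    if random.random() < 0.5:
        return 3.14159
    else:
        return 6.28318


original_random = random.random

with mock.patch.object(random, "random", return_value=0.5) as fake_random:
    # Inside the block, random.random is the mock created by patch.
    result_inside = f()
    fake_random.assert_called_once_with()

# Outside the block, the original function is back.
restored = random.random is original_random
```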

Notes


1: You shouldn't have complicated processes that involve hundreds of operations anyway, but that is another problem.

2: Curious to know what happens when you access return_value without setting it first? No? Well, I'll show you anyway:
>>> import mock
>>> m = mock.Mock()
>>> m.return_value
<Mock name='mock()' id='140095453168080'>
Since we didn't set it, we get the same default behaviour as with any other attribute access: another Mock is created for us.