unit-testing-tips

Unit Testing Games: Tip #3

Don’t let full coverage fool you

When playing XCOM 2, the survival and effectiveness of your squad is heavily influenced by the amount of cover available. Half-cover items provide some protection from enemy fire, but full-cover offers the best protection. Though critical to survival, full-cover doesn’t guarantee survival.

When testing your code, it’s common practice to examine code coverage. This is a metric that identifies what percentage of your code is executed by your tests. Higher coverage equates to more lines of code being put through the paces.

But just as full-cover doesn’t prevent your soldier from being pummeled in XCOM 2, having 100% code coverage doesn’t guarantee your code is bulletproof. Tests can execute lines of code, but the Engineer has to make sure the tests verify behavior.

Take this code sample of a square grid-based map, stored in a linear array.

class GameMap(object):
 
    NUM_ROWS = 10
    NUM_COLS = 10
 
    def __init__(self):
        # Initialize map to all 'grass'
        self._map = ['grass'] * GameMap.NUM_ROWS * GameMap.NUM_COLS
 
    def get_terrain(self, x, y):
        # grass, road, mountain (impassable), brickwall (impassable)
        return self._map[y * GameMap.NUM_ROWS + x]
 
    def is_passable(self, x, y):
        grid = self.get_terrain(x, y)
        if 'mountain' not in grid:
            return True
        return False

The get_terrain() function returns the type of terrain at a given map point. To test this function, we only need x and y coordinates that are inside the map area.

import unittest

from main import GameMap


class TestGameMap(unittest.TestCase):

    def test_get_grid_returns_value(self):
        m = GameMap()
        val = m.get_terrain(1, 1)
        self.assertEqual(val, 'grass')

Since the get_terrain() function only has a single line, it is guaranteed to be executed by our test function (100% test coverage). Great! Except, this doesn’t test the full range of behavior. What happens if x or y are negative? What about x or y being outside of the map bounds? For both cases, the behavior is “raise an exception when this condition is true”. These are edge cases we should cover in our tests.

One of the benefits of testing your code is to provide real-usage documentation. A new user might assume the only behavior of get_terrain() is to return a terrain value. They might proceed to write their code, assuming invalid map coordinates are handled gracefully. With explicit tests that highlight these edge cases, the new engineer can write safer code.

Another example is when a function doesn’t properly handle return values. The get_terrain() function has 4 return values, 2 of which are ‘impassable’. The is_passable() function relies on these return values to do it’s job. Our tests below also offer 100% code coverage by touching every line of the function.

    def test_is_passable(self):
        m = GameMap()
        m._map[0] = 'road'
        val = m.is_passable(0, 0)
        self.assertTrue(val)

    def test_is_passable_fails_for_mountains(self):
        m = GameMap()
        m._map[0] = 'mountain'
        val = m.is_passable(0, 0)
        self.assertFalse(val)

Coverage is at 100% as well, but these tests don’t offer insight into a bug in the code. As it’s written, is_passable() will incorrectly identify ‘brickwall’ as passable. The missing test that checks for ‘brickwall’ will catch this error.

    def test_is_passable_fails_for_brickwalls(self):
        m = GameMap()
        m._map[0] = 'brickwall'
        val = m.is_passable(0, 0)
        self.assertFalse(val)

This test will fail since ‘brickwall’ results in a True (passable) return value. Only ‘mountain’ terrain returns False (impassable).

Test coverage is a fantastic tool in your software engineering arsenal. Increasing the coverage percentage means more of your lines of code are executed. However, this should not be taken as the sole measure of quality.

Unit Testing Games: Tip #2

Keep tests fast

Tip #1 is about keeping unit tests small. This is greatly complemented by keeping them fast.

Unit tests are a powerful tool for your development team, and time is possibly the teams most valuable asset. Having tests that execute rapidly both promotes their use, and removes barriers to running them in the heat of Alpha. If an engineer can execute 300 tests in less than 2 seconds, they will likely get run. If 100 tests take 20 minutes, they will almost never be executed.

Remember not to confuse size with speed. Small tests will very likely be fast, but that’s not a guarantee. Let’s look at our “small” test from Tip #1.

def test_weapon_fire(self):
    # setup
    # =============================
 
    # This version will NOT implicitly load assets on construction
    weapon = Weapon("rifle")
    start_position = vector(2.0, 0.0, 2.0) 
 
    # execute
    # =============================
 
    # Reduce ammo
    # Note: we no longer require a target
    weapon.fire(start_position, None)
    actual_ammo_post_fire = weapon.get_current_ammo()
 
    # assert
    # =============================
 
    # Verify we lost ammo on fire
    self.assertEqual(weapon.get_max_ammo() - 1, actual_ammo_post_fire)

Depending on the details of the Weapon class constructor and the fire() method, this test could take a second or two to execute. It’s plausible the constructor loads art assets from disk. This could be for the weapon itself, as well as for it’s ammo. It’s also plausible the fire() method performs line-of-sight checks, which can be computationally expensive.

Having this test execute in 1-2 seconds may not be an issue in isolation. It becomes a problem when we add more tests. Adding tests for reloading (another 1-2 seconds) and switching (2-4 seconds to load 2 weapons) raises the tally to 4-8 seconds. When we have 100 tests (100-200 seconds) or, even better, 600 tests (600-1200 seconds), you see how this rapidly increases.

The Weapon class could be refactored to remove asset loading, maybe by moving this to a load() function. This could reduce construction to simple variable initialization, which takes milliseconds. If you can reduce the time to 100-200 milliseconds, you can now get through 100 tests in 1-2 seconds instead of 1. 600 tests now execute in 6-12 seconds instead of 20 minutes.

Something is better than nothing

It may not be reasonable to refactor like this for speed increases. That’s OK. The primary goal is to add reliability to your systems, which increases overall productivity. I will always prefer a single, slow, test over no tests at all!

Unit Testing Games: Tip #1

I previously discussed some benefits of Test-Driven Development for Games. Next, I’ll offer some tips for effectively adopting TDD and unit testing.

Keep tests small

Game teams strive to keep the size of their game assets to a minimum. This conserves disk space, and allows packing more content into a single release. Your unit tests should also be small, though for different reasons.

Small tests aid in comprehension and provide real-world documentation that is always up-to-date. Naming functions is hard, and sometimes the name doesn’t properly convey behavior. Unit tests identify exactly HOW the function should be called and if/what it should return.

A passive benefit of unit testing is forcing you to examine your interfaces. Spaghetti code usually inflates the size of your tests. Functions with multiple dependencies are hard to understand and refactor. If the dependencies require sizable setup, you lose even more flexibility.

Larger tests also translates to more written code, which can be a barrier to a team accepting the process.

The Weapon class in this code sample is very coupled to other game objects. The fire() function in particular has implicit dependencies on the Player class (someone must be holding the weapon), the World class for line-of-sight checks, and the Sound Manager class to play sounds.

def fire(self, target):
    # Must fire at a valid target player
    if target is None:
        raise MissingTargetException()

    # Weapon must be held by someone to be fired
    if self._player is None:
        raise NoWeaponOwnerException()

    # If no ammo, play "empty click" sound and return
    if self._current_ammo == 0:
        sound_manager.play(self._empty_sound)
        return

    # Query current map to verify line-of-sight to target
    # Player carring this weapon is cached on construction
    if world.get_current_map().check_los(self._player, target):
        # Decrease ammo
        self._current_ammo = self._current_ammo - 1

        # Apply damage to target
        target.take_damage(self._damage)

        # Play weapon "fire" sound
        sound_manager.play(self._fire_sound)

    else:
        # Play "error" sound indicating no line of sight
        sound_manager.play(self._not_visible_sound)

    return

Let’s add test coverage to the fire() method. We can start by adding a test which verifies the ammo count is correctly reduced by one.

The ‘setup’ portion of the test must instantiate the dependencies.

def test_weapon_fire(self):
    # setup
    # =============================

    # Instantiate player object because Weapon is required to belong to someone,
    # and uses Player for line-of-sight checks.
    player = Player("user_player")

    # fire() requires a valid target
    target_player = Player("some_enemy")

    # Weapon plays sounds, and requires a World instance for line-of-sight
    sound_mgr = sound_manager()
    world_mgr = world()

    sound_mgr.init()
    world_mgr.init("some_map")

    # Weapon will load required sounds and art assets on its own
    weapon = Weapon("rifle")

    # Weapon has circular reference to owning Player
    player.add_weapon(weapon)


    # execute
    # =============================

    # Reduce ammo
    weapon.fire(target)
    actual_ammo_post_fire = weapon.get_current_ammo()


    # assert
    # =============================
    # Verify expected results
    
    # Verify we lost ammo on fire
    self.assertEqual(weapon.get_max_ammo() - 1, actual_ammo_post_fire)

Comments aside, you can see how convoluted things get. The number of implicit dependencies add overhead to object creation and test comprehension. Remember, a benefit of unit testing is providing real-world usage and documentation. A new engineer will need a baseline understanding of multiple systems to work on Weapons.

If the Weapon class were decoupled from players, and world functionality, things get simpler. This alternate fire() function no longer requires the World class for line-of-sight checks. You could argue this should be handled by an A.I. system.

Attachment to a Player instance is also removed, and fire() can now operate without a valid target. There is still an implicit link to the Sound Manager, though it’s not required. The dependency overhead is reduced to zero so you can now test the function in isolation. The goal is to test that ammo is decreased, not that the sound system works or that the target can take damage. This simplifies our system and test.

def fire(self, start, target):
    # If no ammo, play "empty click" sound and return
    if self._current_ammo == 0:
        if sound_manager:
            sound_manager.play(self._empty_sound)
        return

    # Decrease ammo
    self._current_ammo = self._current_ammo - 1

    # Play weapon "fire" sound
    if sound_manager:
        sound_manager.play(self._fire_sound)

    # Apply damage to target
    if target:
        target.take_damage(self._damage)

    return

The test below operates on the second Weapon class. Having fewer responsibilities allows for more maintainable code and simpler testing. This is much easier for a new engineer to grok.

def test_weapon_fire(self):
    # setup
    # =============================

    # This version will NOT implicitly load assets on construction
    weapon = Weapon("rifle")
    start_position = vector(2.0, 0.0, 2.0) 

    # execute
    # =============================

    # Reduce ammo
    # Note: we no longer require a target
    weapon.fire(start_position, None)
    actual_ammo_post_fire = weapon.get_current_ammo()

    # assert
    # =============================

    # Verify we lost ammo on fire
    self.assertEqual(weapon.get_max_ammo() - 1, actual_ammo_post_fire)

Poor interfaces and tightly coupled code usually lead to long and convoluted tests. Writing tests first force you to consider how your interfaces will be used and help create more maintainable code.