How I Learned to Love Property Based Testing

Katie Cleary – Software Engineer


The first time I was invited to join a field test for OSCC, our open source car control platform, I was beyond excited. Driving a car with a game controller seemed like playing a real life racing game. As an engineer previously focused on game development, witnessing how the hardware, firmware, and software interacted with real-world physical interference captured my curiosity. After a deeper dive into the project, my Need for Speed¹ daydreams were shelved as I became aware of two important realities:

  • Actual car behavior isn't as easy to quantify as the deterministic physics involved in your average racing game.
  • Safety is critical for everyone inside and outside the vehicle.

While an incorrect assumption or forgotten edge case in a video game can make for an entertaining YouTube video, a bug in vehicle software has much graver consequences.

Beyond Mario Kart

After joining the project, I became focused on making things as safe, solid, and predictable as possible. The gamer in me wanted the controls to feel fun, but the software engineer in me didn’t want to stick code changes into a moving vehicle without first feeling confident in their correctness. When you become responsible for testing your steering changes in an actual car moving at 40 mph, safety and assurance in those changes easily take higher priority over making a real-life version of Mario Kart².

With an eye turned toward safety, I began the process of familiarizing myself with the system and its parts. After some digging, it became clear we needed a more thorough testing method. We could write unit tests, but these would only slightly improve upon the kinds of tests that already existed. Typically, unit tests exist to enumerate all the happy-path/sad-path scenarios a developer (i.e. me) can conjure up. Enter: property based testing!

Property Based Testing and the Real Girl

When I began to investigate property based testing, I was initially apprehensive. How would this be different from carefully selecting cases for unit tests? How do you determine a useful and complete set of properties that describe the general behavior of a function? It seemed like a big barrier to entry stood between me and a test suite that might provide only a few advantages over the kinds of testing I was already familiar with. Property based testing involves determining the general behavior of the piece of code you’d like to exercise and ensuring that it responds correctly to any potential input. For example, if I were to write a function that woke me up after some x hours of sleep and returned my mood:

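enum Mood { Tired, Excited }
fn wakeup(gets_coffee: bool, hrs_slept: u8) -> Mood {
    match hrs_slept {
        // Too little sleep: tired, coffee or not.
        0..=5 => Mood::Tired,
        // A good night's sleep plus coffee.
        6..=9 if gets_coffee => Mood::Excited,
        // No coffee, or more than 9 hours of sleep.
        _ => Mood::Tired,
    }
}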

Potential properties to test could be:

  • If Katie doesn’t get coffee, the function should never return an excited mood, regardless of the input.
  • For an input less than 6 or greater than 9, it should always return a tired mood, regardless of the state of Katie’s coffee intake.
  • If Katie is in an excited mood, that means that she was definitely given coffee and got between 6 and 9 hours of sleep.
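
The three properties above can be written as plain predicates over the function's inputs. The sketch below is a minimal illustration using only the standard library: the predicate names and the `all_properties_hold` helper are made up for this example, and because the input space is tiny (2 coffee states times 256 possible hour values) it simply checks every input. With a tool like QuickCheck, the same predicates would be fed generated inputs instead.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mood { Tired, Excited }

fn wakeup(gets_coffee: bool, hrs_slept: u8) -> Mood {
    match hrs_slept {
        0..=5 => Mood::Tired,
        6..=9 if gets_coffee => Mood::Excited,
        _ => Mood::Tired,
    }
}

// Property 1: without coffee, the mood is never Excited.
fn prop_no_coffee_never_excited(gets_coffee: bool, hrs_slept: u8) -> bool {
    gets_coffee || wakeup(gets_coffee, hrs_slept) != Mood::Excited
}

// Property 2: fewer than 6 or more than 9 hours always means Tired.
fn prop_out_of_range_is_tired(gets_coffee: bool, hrs_slept: u8) -> bool {
    (6..=9).contains(&hrs_slept) || wakeup(gets_coffee, hrs_slept) == Mood::Tired
}

// Property 3: Excited implies coffee AND 6 to 9 hours of sleep.
fn prop_excited_implies_coffee_and_rest(gets_coffee: bool, hrs_slept: u8) -> bool {
    wakeup(gets_coffee, hrs_slept) != Mood::Excited
        || (gets_coffee && (6..=9).contains(&hrs_slept))
}

// Hypothetical helper: true when every property holds for every possible input.
fn all_properties_hold() -> bool {
    [false, true].iter().all(|&coffee| {
        (0..=u8::MAX).all(|hrs| {
            prop_no_coffee_never_excited(coffee, hrs)
                && prop_out_of_range_is_tired(coffee, hrs)
                && prop_excited_implies_coffee_and_rest(coffee, hrs)
        })
    })
}

fn main() {
    assert!(all_properties_hold());
    println!("all three properties hold for every input");
}
```

Note that none of the predicates mentions a specific expected output for a specific input; each one describes a relationship that must hold no matter what goes in.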

Generalizing the behavior of the function makes it easier to automate test cases: we can send any valid input to the function and check that it always returns the appropriate output.

After learning the concepts behind this testing strategy, I came to see this type of testing as inherently more robust than traditional unit tests. When testing the steering firmware, for example, we had specific properties that should always remain true regardless of the specific input value or state:

  • It should only change state if it receives an enable, disable, fault report, or command steering CAN frame.
  • It should never change state due to receiving any other CAN frame.
  • It should never write out any values that would cause the vehicle to fault, even if it receives an unexpected input value.
  • If it receives a disable command or a fault report, it should definitely change to the requested state.
  • It should always package and send its state reports as correctly formatted CAN frames with the correct CAN frame ID.
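
To make a couple of these properties concrete, here is a rough sketch against a toy state machine. Everything in it is an assumption made for illustration: the `State` enum, the CAN frame IDs, and the `next_state` transition rules are hypothetical and are not taken from the OSCC firmware. It checks two of the properties above (other frames never change state; disable and fault reports always take effect) over every 11-bit standard CAN ID.

```rust
// Hypothetical model of a steering module's state; NOT the real firmware.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State { Disabled, Enabled, Faulted }

// Hypothetical CAN frame IDs, chosen only for this sketch.
const ID_ENABLE: u16 = 0x80;
const ID_DISABLE: u16 = 0x81;
const ID_COMMAND: u16 = 0x82;
const ID_FAULT_REPORT: u16 = 0x8F;

// The only frames that are allowed to influence the module's state.
fn is_control_frame(id: u16) -> bool {
    matches!(id, ID_ENABLE | ID_DISABLE | ID_COMMAND | ID_FAULT_REPORT)
}

// Assumed transition rules for the toy model.
fn next_state(state: State, id: u16) -> State {
    match id {
        ID_ENABLE if state == State::Disabled => State::Enabled,
        ID_DISABLE => State::Disabled,
        ID_FAULT_REPORT => State::Faulted,
        _ => state, // commands and unrelated frames leave the state alone
    }
}

// Property: a frame that is not a control frame never changes the state.
fn prop_ignores_other_frames(state: State, id: u16) -> bool {
    is_control_frame(id) || next_state(state, id) == state
}

// Property: disable and fault report always move to the requested state.
fn prop_disable_and_fault_always_apply(state: State) -> bool {
    next_state(state, ID_DISABLE) == State::Disabled
        && next_state(state, ID_FAULT_REPORT) == State::Faulted
}

// Check both properties from every state over all 11-bit CAN IDs.
fn check_all() -> bool {
    let states = [State::Disabled, State::Enabled, State::Faulted];
    states.iter().all(|&s| {
        prop_disable_and_fault_always_apply(s)
            && (0..=0x7FFu16).all(|id| prop_ignores_other_frames(s, id))
    })
}

fn main() {
    assert!(check_all());
    println!("state machine properties hold for all frame IDs");
}
```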

These properties gave us a good foundation for establishing a functional test suite that ensures correct behavior of the module, without having to choose specific cases. Instead of writing unit tests to ensure that an input value of 4000 was correctly constrained to 3440, we could write a property based test that checked that the outputs were always contained in the non-faulting range. We could run that test thousands of times with thousands of different generated inputs and ensure that the property always holds true. If we wanted to achieve the same coverage with unit testing, we would need to specify each case independently. With property based testing, we only needed to specify the input type to the function and let the generators do the work.
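
As a rough sketch of the range property described here, the example below generates inputs and checks that every output stays inside an assumed non-faulting range. The upper bound of 3440 comes from the example above; the lower bound, the `constrain` function, and the small `Rng` generator are all hypothetical stand-ins (a real suite would use the testing library's own generators).

```rust
// Assumed non-faulting output range. The upper bound matches the example
// in the text; the lower bound is a made-up placeholder.
const OUT_MIN: u16 = 600;
const OUT_MAX: u16 = 3440;

// Hypothetical function under test: constrain a raw request to the safe range.
fn constrain(raw: i32) -> u16 {
    raw.clamp(OUT_MIN as i32, OUT_MAX as i32) as u16
}

// Tiny xorshift generator standing in for a real library's input generators.
// The seed must be nonzero.
struct Rng(u64);

impl Rng {
    fn next_i32(&mut self) -> i32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 32) as i32
    }
}

// Property: whatever the input, the output lies in the non-faulting range.
fn prop_output_in_range(raw: i32) -> bool {
    (OUT_MIN..=OUT_MAX).contains(&constrain(raw))
}

// Run the property on a few boundary values plus `runs` generated inputs.
// Returns the first failing input, or None if the property always held.
fn check(runs: u32, seed: u64) -> Option<i32> {
    let mut rng = Rng(seed);
    let edges = [i32::MIN, -1, 0, 4000, i32::MAX];
    edges
        .iter()
        .copied()
        .chain((0..runs).map(|_| rng.next_i32()))
        .find(|&raw| !prop_output_in_range(raw))
}

fn main() {
    assert_eq!(constrain(4000), 3440);
    assert_eq!(check(10_000, 42), None);
    println!("output stayed in range for every generated input");
}
```

The single case a unit test would have covered (4000 constrained to 3440) is still exercised here, but only as one of thousands of inputs checked against the same property.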

Choosing QuickCheck

We used a property-based testing tool called QuickCheck for Rust. One of the reasons we went through the effort of integrating Rust into our testing procedures, even though the project is all C, is that defining the property model of a system can eventually become fairly complicated. Having a higher degree of trust in the correctness of the property model implementation becomes valuable quickly. Rust provides more assurances on some facets of the implementation than if we were to do the same in C itself.

QuickCheck enabled us to specify allowable ranges and which inputs to randomize. It also provided support for randomly generating each field of a struct. Since we were randomly generating inputs instead of manually choosing specific ones, unexpected edge cases were easier to find. The randomly generated inputs led us to several edge cases where we had critical casting or overflow/underflow issues that could have affected how we integrated with steering control and brake actuation.

QuickCheck for Rust also provides automatic shrinking: once it generates an input that creates a failure condition, it repeatedly re-tests with progressively smaller inputs in an attempt to discover the minimal failing input. The automation of these test conditions made it much easier to discover and diagnose issues that might have been missed completely with more traditional strategies.
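
To illustrate the idea of shrinking on a casting bug, here is a simplified sketch. It is not QuickCheck's actual shrinking algorithm: the `shrink` function, the buggy `to_actuator_units` conversion, and the property are all hypothetical, written with only the standard library to show how a large failing input can be reduced to the smallest one that still fails.

```rust
// Hypothetical buggy conversion: the narrowing cast silently wraps.
fn to_actuator_units(raw: u32) -> u16 {
    raw as u16 // bug: 70_000 becomes 4_464 instead of saturating
}

// Property: the conversion preserves the value, or saturates at u16::MAX.
fn prop_no_wraparound(raw: u32) -> bool {
    to_actuator_units(raw) as u32 == raw.min(u16::MAX as u32)
}

// Simplified greedy shrinker: repeatedly replace the failing input with a
// smaller candidate that still fails, until no candidate fails.
fn shrink(mut failing: u32, prop: fn(u32) -> bool) -> u32 {
    loop {
        // Smaller candidates, most aggressive reduction first.
        let candidates = [
            failing / 2,
            failing - failing / 4,
            failing.saturating_sub(1),
        ];
        match candidates
            .iter()
            .copied()
            .find(|&c| c < failing && !prop(c))
        {
            Some(smaller) => failing = smaller,
            None => return failing, // no smaller failing candidate found
        }
    }
}

fn main() {
    let found = 70_000; // pretend a generator produced this failing input
    assert!(!prop_no_wraparound(found));

    let minimal = shrink(found, prop_no_wraparound);
    assert_eq!(minimal, 65_536);
    println!("failing input {} shrank to {}", found, minimal);
}
```

A report of 65,536 points much more directly at a 16-bit overflow than an arbitrary value like 70,000 does, which is the practical benefit of shrinking when diagnosing a failure.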

A useful side effect of property based testing is that since you need to find properties that will always hold true over any inputs, you will develop a deeper understanding of the code and what it’s supposed to be doing. Instead of considering the expected output for specific inputs, you consider the output in general, which clarifies the purpose of the function under test. This makes it easier for developers to reason qualitatively about the high level purpose and responsibilities of the function, the things that it should or shouldn’t be able to change, and the ways in which it should be able to change them. Determining properties of what the function should be doing elicits a stronger analysis of the requirements of that piece of code, and a clearer idea of where it fits into the system at large. Being more concerned with the overall properties of the code under test allows you to focus on creating more robust code: code that responds correctly to any allowable input, instead of code that has only been tested against your assumptions in an avalanche of specific cases.

After carefully determining functional characteristics and implementing our property-based test suite, we were able to quickly run thousands of tests over many different inputs. This provided a much higher level of assurance than we’d been able to achieve with previous testing strategies. Through building our property-based testing suite, I developed a better understanding of the project’s requirements and each module’s responsibilities. Ultimately, it enabled us to make deeply informed engineering decisions, because we now had a better understanding of the properties of each piece of firmware.

The result? A modular design where each piece of code has one job that it does really well. Consequently, I feel much more comfortable while trying to get the Kia controller to drive like Mario Kart, no bananas included.


1. A product of Electronic Arts

2. A product of Nintendo


 
 
