Pydemic: A command-line implementation of the board game Pandemic
Introduction
I recently wrapped up work on a command-line implementation of the board game Pandemic. The origin of this project dates back to 2018 when, during my early days of learning Python, I was curious about object-oriented programming. Games are a great fit for object-oriented patterns since they have complex behaviors and interactions that are represented by literal objects. I had recently bought the game and was familiar with its rules, so I naturally chose it for this project. I ultimately managed to get a mostly usable prototype working before running out of steam and taking a break.
Six years ended up being the right amount of time to forget about my first go-around, so when I recently came across it again while re-organizing my coding projects, I thought it was a shame I left it almost finished. I expected it would only a few weeks in my spare time to tie up the remaining loose ends, but I ended up playing a good joke on myself because it ended up taking several months. Some of this time was undoubtedly spent chasing perfection, but the main reason it took so long is that it turns out game development (even for a simple text-based one) is very complex! I’m sure this is no surprise to anyone who’s worked on similar projects, but it honestly caught me off guard. Unlike the data pipelines I usually work on that handle a small number of rigid file formats, this program had to harmonize many interlocking systems, any of which could attempt to put the game into an invalid state during normal operation. It was a new style of programming for me, and since I learned a lot about creating robust and extensible programs, I wanted to briefly reflect on some lessons learned–both for myself and anyone else who might embark on a similar project in the future.
I’ve grouped my lessons into two broad categories. The first is a short but sweet plea for incorporating testing early and often in development, and the second is a series of observations about best practices for program design.
Test early and often
The last thing I did for this project was write a suite of tests, but I wish I had done it from the start. It would have saved me hours of manually playing the game attempting to debug complex states that can only be reached after many turns. It also would have enforced separation-of-concerns and modularity from the start. I was generally good about doing this anyway, but there were a few parts of the codebase where I had to disentangle large multi-purpose functions into smaller components to test their functionalities individually. For example, for game setup I originally wrote a single function that both creates the game objects and puts them in their proper states for the start of a game. While convenient for firing up a game with a single function call, this design didn’t provide access to a “clean slate” state that would allow me precise control to test more complex scenarios.
To write my test suite, I used pytest
which was easy to get started with but is also rich in features that support more complex tests. There are, however, many other options in the Python ecosystem, including a testing module in the Python Standard Library called unittest
. It supports a similar set of basic testing functionality, but pytest
has more features. I’ve only used it briefly in the past, but I generally find the functional style of pytest
cleaner and more intuitive over unittest
’s more verbose object-oriented patterns. However, if “batteries-included” testing is a high priority, it’s still a good option.
Best practices for program design
Choose meaningful and specific names
There’s a common saying in computer science that there are only two hard problems: naming things and cache invalidation.1 At this point, it’s thrown around so much and spawned so many variants that it feels trite to repeat it here, but I believe the first part at least holds a lot of truth. (As a data scientist rather than a software engineer, I can’t speak to the latter.) This shouldn’t be much of a surprise though because the value of computer languages in the first place is in abstracting the details of the how the computer works, so the developer can focus on solving the problem at hand. The right level of abstraction depends on the nature of that problem, but what all languages have in common is assigning meaningful names to operations. In assembly languages, these names correspond to the simple but fundamental operations of the processor, like moving a value into a registry, whereas in high-level languages such as Python they correspond to intuitive but complex operations, like appending a value to a list. However, none of these languages would be of any value to humans if these operations didn’t have names that reminded us what the computer is doing.
This principle also applies when writing programs, and part of the art of software engineering is identifying the right abstractions for a particular problem and giving them meaningful names. This came into play with Pydemic when naming the classes that encapsulated the behavior and state of various game concepts, like cities and players. The names of these classes are generally intuitive, corresponding closely or even exactly with their natural language names. For example, the class that models cities is called City
. Where this gets complicated, though, is instances of these classes are frequently represented by their natural language names, like san_francisco
for the City
representing it, both internally and externally. At first, I wasn’t always consistent with using variable names that reflected these nuances. For example, I would sometimes use a variable called city
to represent either the name of the City
or the City
itself rather than something more precise like city_name
. While this might seem like splitting hairs, whether a variable is a string or a City
is an essential piece of information when scanning a block of code and trying to quickly understand its function. I fixed these ambiguities where I found them, but the hard thing about these kinds of software engineering best practices is their adherence is entirely at the whim of the developer. A program will run even if all its variables have terrible names, but the project will suffer in the long run because those bad names become a barrier to entry as implementation details are forgotten or developers turn over. Ultimately, wrestling with code that I wrote years previously taught me that even a team of one is a collaboration, and coding standards are essential for facilitating that collaboration.
Use interfaces to enforce data integrity
On a related note, I largely avoided writing setters and getters for the attributes of classes, preferring instead to directly access and manipulate the underlying objects. This style is a result of my frustration of having to learn new APIs when working with Python packages that use them heavily. Why should I need to remember the specific method to add an item to a list when I can just access it directly and use append
? However, after working on this project, I have a newfound appreciation for just how brittle this approach is, especially when data validation is necessary. I was generally able to avoid breaking anything since as the only developer, I was familiar with the underlying assumptions of each object and could respect them when modifying their data. For larger teams, there are no guarantees, so setters and getters are effective techniques for enforcing data validation. Where I shot myself in the foot in this project was not using a setter for a Player
’s current city. Initially, I directly modified the underlying attribute (stored as a string). Later, I needed to change this “interface” to a function that did more than just modify an attribute, meaning I had to replace all instances of code that looked like player.city = city_name
to player.set_city(city_name)
.2 This is by no means the worst chore, but if I could start over I would use setters and getters more liberally to avoid committing to a specific interface syntax.
Avoid global state
A much more painful refactor was moving data accessed and modified by different objects from a module, shared
, into an encapsulated GameState
object. To be fair to my past self, this was better than using global variables in the main
module. However, it still reduced readability because it obscured how functions modify the game’s state, as they could directly access shared data through a top-level import rather than through an explicit argument. It also would have greatly complicated testing since there can be only one instance of a module at a time in a given Python session. I initially used this approach because it seemed “cleaner” than passing around a GameState
object to every function that needed it, but in the end that choice cost me a commit with a diff spanning several hundred lines of code.
Measure twice, cut once
My last lessons are the most general about program design. While hindsight is always 20/20, I wish I had thought more about the overall structure of Pydemic and how different parts of the program communicate with each other before starting to code. For example, player actions report their results by printing a message to the screen. While this works, actions don’t report a return state, so it’s difficult to check if they succeeded or failed as expected during testing. Moreover, these “external” interfaces don’t follow a consistent naming convention, so it’s not immediately apparent from scanning the codebase which functions handle user input. Similarly, Pydemic doesn’t have a consistent approach in whether it asks for permission or begs for forgiveness. In other words, sometimes Pydemic performs checks to avoid calling a function that would put the game into an invalid state, and other times it barrels ahead and instead relies on catching exceptions. Honestly, it may be impractical to use a single approach, especially if functions need to return objects other than numeric codes. After all, exceptions exist for a reason. However, because Pydemic was developed organically, I’m not confident there’s a consistent pattern when one method is used over the other.
Conclusion
There are countless other minor issues I would fix if I had the time and energy, but ultimately every project reaches a point where good enough is good enough. Ultimately, for me Pydemic was more about the journey than the destination, so by that measure it’s a success in spite of its rough edges.
This is commonly attributed to Phil Karlton, a software engineer at Netscape in the 90s. ↩︎
Python’s
property
decorator can make setter and getter functions behave like direct attribute access. However, the decorator only works for single-argument setters, and in this case the setter required multiple arguments. ↩︎