The Problems With Difficulty Modification

21 Mar 2017

Win/Loss Ranking

The win/loss ranking system, first described in this article, is a unique way of structuring single-player strategy games that provides some significant advantages over other formats. Under this structure, exemplified in Dinofarm Games’s Auro: A Monster Bumping Adventure and BrainGoodGames’s Minos Strategos, each match consists of the player trying to reach a binary goal and either succeeding or failing. The unique aspect of this format is that every match is balanced such that the player has about a 50% chance of winning. This is accomplished by giving the player a “rank,” a number that persists between matches and determines the difficulty of each game. The player’s rank increases if they win enough games and decreases if they lose enough. In this way, the win/loss ranking system mirrors matchmaking in multiplayer games, in which the player is always matched with an opponent of a similar skill level.
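As a rough illustration of the rank mechanic described above, here is a minimal sketch in Python. The class name, the `threshold` of three net results, and the floor at rank 1 are all assumptions made for the example; neither Auro nor Minos Strategos is claimed to work exactly this way.

```python
from dataclasses import dataclass


@dataclass
class WinLossRank:
    """Hypothetical rank tracker: the rank moves after enough net wins or losses."""

    rank: int = 1
    progress: int = 0   # net wins minus losses since the last rank change
    threshold: int = 3  # assumed number of net results needed to change rank
    min_rank: int = 1   # assumed floor on the rank

    def record(self, won: bool) -> int:
        """Record one match result and return the (possibly updated) rank."""
        self.progress += 1 if won else -1
        if self.progress >= self.threshold:
            self.rank += 1
            self.progress = 0
        elif self.progress <= -self.threshold:
            self.rank = max(self.min_rank, self.rank - 1)
            self.progress = 0
        return self.rank


tracker = WinLossRank()
for result in (True, True, True):
    tracker.record(result)
print(tracker.rank)
```

If the game at each rank is tuned correctly, a player whose skill matches their rank wins about half the time, so `progress` drifts around zero and the rank stays put.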

This match format is intended to solve several problems with other formats, particularly the “high-score” format. Two of the most significant problems it solves, in my opinion, are the problem of having indefinitely scaling match lengths as the player reaches for higher and higher scores, and the problem of having to replay the easy early stages of the game every match even though only the later stages of the game are difficult enough to be interesting.

Despite these advantages over formats such as the “high-score” format and the “roguelite” format, I believe that there are improvements that can be made.

Score-based Ranking

There is a variant of the single-player Elo format that I’ll call the “score-based ranking” format, described by Redless in his article A New Scoring System as the “score + ranking system.” In this format the player has a rank that determines the difficulty of a match, just like in win/loss ranking. The difference is that instead of there being a binary win/loss condition at the end of the match, the player receives a score. The way the player’s rank changes is different as well: since there isn’t a discrete win/loss condition, you can’t merely increase the rank if the player wins or decrease it if they lose. Instead, each match has a certain point goal, a number of points that the player is expected to reach on average. If they score above the goal their rank is increased, and if they score below it their rank is decreased.
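The rank update described here can be sketched in a few lines of Python. The function name, the one-rank step, and the floor at rank 1 are assumptions for illustration; Redless’s article may specify the details differently.

```python
def update_rank(rank: int, score: int, point_goal: int, min_rank: int = 1) -> int:
    """Return the new rank after a match, given the score and the match's point goal.

    Assumption: the rank moves by exactly one step per match and never drops
    below min_rank. A score exactly equal to the goal leaves the rank unchanged.
    """
    if score > point_goal:
        return rank + 1
    if score < point_goal:
        return max(min_rank, rank - 1)
    return rank


# A player at rank 5 with a point goal of 30 scores 32, then 10.
rank = update_rank(5, score=32, point_goal=30)
rank = update_rank(rank, score=10, point_goal=30)
print(rank)
```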

The biggest advantage of this format compared to win/loss ranking is that the feedback the player receives on the value of their actions is continuous instead of binary. Getting a score that is 2 points above your point goal or 20 points below it tells you a lot more about how well you did than just “you win” or “you lose.” This advantage in feedback efficiency is most important when comparing two strategies of marginally different value, because you can get a sense of the difference in fewer matches with continuous feedback than with binary feedback. It takes fewer games to be confident that you’ve made a small increase in score than to be confident that you’ve made a small increase in winrate.
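To put a rough number on this, here is a back-of-the-envelope comparison. It rests on assumptions that are not part of the formats themselves: scores are normally distributed with mean \mu and standard deviation \sigma, the point goal sits at the player’s current mean (so the winrate is 50%), and a strategy change shifts the mean score by a small amount \Delta.

```latex
% Continuous feedback: the average score over n matches has standard error \sigma/\sqrt{n}.
% Binary feedback: the winrate moves from 1/2 to \Phi(\Delta/\sigma), and the observed
% winrate over n matches has a standard error of about 1/(2\sqrt{n}).
\begin{align*}
\mathrm{SNR}_{\text{score}} &= \frac{\Delta}{\sigma/\sqrt{n}} = \frac{\Delta\sqrt{n}}{\sigma} \\
\Phi\!\left(\frac{\Delta}{\sigma}\right) - \frac{1}{2} &\approx \frac{\Delta}{\sigma\sqrt{2\pi}} \\
\mathrm{SNR}_{\text{win}} &\approx \frac{\Delta/(\sigma\sqrt{2\pi})}{1/(2\sqrt{n})}
  = \sqrt{\frac{2}{\pi}}\,\frac{\Delta\sqrt{n}}{\sigma} \\
\frac{n_{\text{win}}}{n_{\text{score}}} &\approx \frac{\pi}{2} \approx 1.57
\end{align*}
```

Under these assumptions, a player relying on wins and losses alone needs roughly 57% more matches to reach the same confidence as a player who sees their scores. Real score distributions are unlikely to be exactly normal, so this is an estimate of the size of the effect rather than a precise figure.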

Though this format is an improvement over win/loss ranking, I believe it still has problems that need to be solved.

Problems with Difficulty Modification

Though the win/loss ranking and score-based ranking formats solve the problems of early-game boredom and indefinite scaling, they have some unique problems caused by the fact that they include difficulty modification, or in other words that they change the difficulty of the game as the player gets better. One such issue is that if the rules change when the player’s rank increases, some of the strategic information that the player learned in the past becomes useless. For instance, in Auro the difference in difficulty between ranks is accomplished in part by changing the distribution of enemy monster types. So, when the player learns the game at a low rank, they are playing under different rules than at a higher rank. Thus, if the player has learned strategic information that is only useful for the enemy distribution at their current rank, and then their rank increases, that strategic information becomes useless. The player may learn, for example, that playing very offensively rather than defensively is a good idea under one enemy distribution, but when they are presented with a new distribution, the information they learned is destroyed.

To solve this problem, games should avoid using things like enemy distribution to modify difficulty between ranks, and instead modify a small number of variables in a consistent and simple way. A good example of this is Minos Strategos. In this game, the only variables that are modified as the player’s rank increases are the number of points needed to win and the number of points the enemy can score before the player loses. Technically, at low ranks the number of enemies that spawn is lowered and a few special enemies don’t show up, and at high ranks the enemy point goal reaches a minimum that higher ranks don’t go below, but this game is still a good example of what I’m talking about. By limiting the number of things that change with rank, and making the changes obvious and simple, you minimize the destruction of strategic information. The ultimate implementation of this idea would be to have everything in the game stay constant except for a single variable, like the number of points the player needs to win.
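A difficulty scheme of this kind can be sketched as a single function from rank to a pair of numbers. Every value below (the base goals, the one-point step per rank, and the minimum enemy goal) is invented for the example and is not taken from Minos Strategos.

```python
from typing import NamedTuple


class Difficulty(NamedTuple):
    player_goal: int  # points the player needs to win
    enemy_goal: int   # points the enemy can score before the player loses


def difficulty_for_rank(
    rank: int,
    base_player_goal: int = 10,  # assumed value at rank 1
    base_enemy_goal: int = 20,   # assumed value at rank 1
    min_enemy_goal: int = 5,     # assumed floor that higher ranks don't go below
) -> Difficulty:
    """Map a rank (1 or higher) to the only two variables that change with it."""
    steps = rank - 1
    player_goal = base_player_goal + steps
    enemy_goal = max(min_enemy_goal, base_enemy_goal - steps)
    return Difficulty(player_goal, enemy_goal)


print(difficulty_for_rank(1))
print(difficulty_for_rank(30))
```

Because the whole rule fits in one small function, the player can see exactly what changed between ranks, and everything they learned about the rest of the game still applies.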

Another issue with difficulty modification is that it leads to the player oscillating between two ranks if one of them is too hard and one is too easy. Though ideally the player’s rank corresponds exactly to their skill level, in practice the player will often be slightly worse or slightly better than their rank expects them to be, leading to the game being slightly too difficult or too easy. This leads to the player repeatedly moving back and forth between two ranks, spending half their time in a rank too low for them, and half in a rank too high.

The reason this is a problem is that the time spent in the lower rank will feel like time wasted to the player. When the player falls back down after attaining the higher rank, they feel as though they have lost progress and the time spent getting back to the higher rank feels like time wasted. Worse, if the winrate isn’t at 50% the feedback the player receives isn’t optimally useful (you can find a rigorous proof of that claim here).

This oscillation problem can be solved by making the differences in difficulty between consecutive ranks smaller while increasing the total number of ranks, so that moving a single rank represents only a small change in difficulty. For instance, both Auro and Minos Strategos expect most players’ ranks to be somewhere in the range 1 through 20. If instead most players fell into ranks 1 through 100, the oscillation problem wouldn’t be as big of a deal.

Of course, it would be necessary to make the rate at which the player progresses through ranks much faster. Having so many ranks also poses an issue in games like Auro where the difficulty is adjusted through changes in enemy distribution, since that presumably requires attention to be spent on the distribution at each individual rank. However, it would be much easier to accomplish having many different ranks in a game where the difficulty modification was accomplished through the change of a small number of variables, as I proposed previously.

With these two considerations in mind, modifying a small number of variables to change the difficulty between ranks and having a large number of ranks, I would like to propose a different format.

The Par Format

If a game modifies its difficulty only by modifying the number of points the player is attempting to reach, and you want to have as many ranks as possible, you can’t really get any more ranks than having one rank corresponding to each possible score. Thus, instead of having both a rank and a number of points determined by that rank, you can simply think in terms of a single number, the “par” that the player is attempting to reach. Just as in the score-based ranking format, the player’s par is increased when they score above par and decreased when they score below it. However, since the rank and point goal are now the same thing, you can just modify the par by moving it closer to the score that the player got. In fact, you could even just set the par to the average score the player got over the last several games to get a pretty reasonable approximation of a good rank for the player.
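Both ways of adjusting par mentioned above can be sketched briefly in Python. The names `nudge_par` and `RollingPar`, the `rate` of 0.25, and the window of five games are assumptions chosen for illustration, not tuned values.

```python
from collections import deque
from statistics import mean


def nudge_par(par: float, score: float, rate: float = 0.25) -> float:
    """Move par a fraction of the way toward the latest score.

    Assumption: rate is between 0 and 1. A score above par raises it and a
    score below par lowers it.
    """
    return par + rate * (score - par)


class RollingPar:
    """Par defined as the average of the last `window` scores (assumed window size)."""

    def __init__(self, window: int = 5, initial_par: float = 0.0) -> None:
        self.scores: deque = deque(maxlen=window)  # old scores fall off automatically
        self.par = initial_par

    def record(self, score: float) -> float:
        """Record one match score and return the updated par."""
        self.scores.append(score)
        self.par = mean(self.scores)
        return self.par


par = nudge_par(100, 120)  # scored above par, so par rises
tracker = RollingPar(window=3)
for s in (10, 20, 30, 40):
    tracker.record(s)
print(par, tracker.par)
```

The first variant reacts smoothly and needs no history; the second is easier to explain to the player, since par is literally their recent average.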

Of course, this isn’t very different from a normal high-score format game; in fact, the only concrete difference is that instead of showing the player their high score, you show them par. Since, in taking the idea of difficulty modification to its logical conclusion, we’ve ended up with what is essentially a system without difficulty modification, it seems like this system will fall victim to the same problems of the high-score format that win/loss ranking was originally intended to solve, namely the problem of indefinitely scaling match lengths and the problem of having to replay the early game every match. Thus, the idea of win/loss ranking in general is implicitly based on the premise that these problems can’t be solved merely with a game where only a single variable modifies difficulty, and instead require a more complex difficulty modification scheme. However, I believe that premise is false.

Optional-Challenge Rewards

To solve the early-game boredom problem without a more complex difficulty modification scheme, games should include optional challenges the player can take on that will make the game harder, but give them a higher score in return. If a game includes enough of these optional challenges, the early-game boredom problem is solved, since in the parts of the game that aren’t hard enough to interest the player, they will seek the extra rewards to get more points while simultaneously making the game harder, and thus more interesting. I call these rewards “optional-challenge rewards,” or OCRs, and I believe that including enough of these is the key to varying the difficulty of a game by modifying a single variable.

There are many ways to implement OCRs, and I will list a few possible examples here:

It’s also possible to have OCRs that don’t immediately give the player points, but instead make it easier for them to get points in the future. For example, imagine that in the early game the player can choose to spend effort gathering a resource that won’t immediately benefit them, but that will allow them to get more points once they get to the late game.

If you take into account optional-challenge rewards, the indefinitely-scaling-match-length problem is rather easy to solve: just put a limit on how long the game can last. Normally this would put a hard cap on how many points the player can get, since in many high-score games the number of points the player has is mostly determined by how far they get in the game. However, in a game with enough OCRs, the player’s score isn’t determined by how long they survive, but by how many points they are able to get in each moment of the game.

An example of this would be a game that always lasts the same amount of time, say 5 minutes. There is no concept of “death”; the game just ends once those five minutes are up. Here, the question of “how long does the player survive” isn’t even a consideration. Instead, the question is “how much value is the player able to squeeze out of the time that they are given.” However, games don’t necessarily need to go as far as having a completely static match length. A game can also just have a difficulty curve that becomes very steep at some point in the game, such that it’s implausible for the player to survive for an indefinite amount of time.
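The fixed-length idea can be illustrated with a small scoring sketch. It uses turns rather than minutes, and the `Turn` record, the cap of 10 turns, and the point values are all hypothetical; the point is only that two players who play the same number of turns can end up with very different scores.

```python
from dataclasses import dataclass
from typing import Sequence

MATCH_LENGTH = 10  # assumed fixed number of turns; the match ends here no matter what


@dataclass(frozen=True)
class Turn:
    base_points: int          # points earned from ordinary play this turn
    challenge_bonus: int = 0  # extra points from an optional challenge completed this turn


def score_match(turns: Sequence[Turn], match_length: int = MATCH_LENGTH) -> int:
    """Total score for a match; turns beyond the fixed length are never played."""
    played = turns[:match_length]
    return sum(t.base_points + t.challenge_bonus for t in played)


cautious = [Turn(base_points=2)] * MATCH_LENGTH
ambitious = [Turn(base_points=2, challenge_bonus=3)] * MATCH_LENGTH
print(score_match(cautious), score_match(ambitious))
```

Since survival time is identical for both players, the only way to raise the score is to take on, and complete, more of the optional challenges.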

The Par Format with a Binary Outcome

Some people, for one reason or another, believe that games need to have a binary win/loss condition at the end instead of simply giving the player a score. This isn’t a view that I agree with, but it is a view I respect, and for people who hold that view I would like to present a slightly modified version of the par format that works for a game with a binary win/loss outcome.

Under a binary outcome, the par system would work mostly the same, except that the match ends as soon as the player reaches par, which counts as a victory; if the match ends before they reach par, it counts as a loss. The other difference is in the way the par is modified between matches: instead of having the par tend towards the player’s average, the par needs to increase when the player wins and decrease when they lose.
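A minimal sketch of the binary variant might look like the following. The function names, the step of one point, and the floor of 1 on par are assumptions for illustration.

```python
def is_victory(score: int, par: int) -> bool:
    """The player wins if they reached par by the time the match ended."""
    return score >= par


def update_par_binary(par: int, won: bool, step: int = 1, min_par: int = 1) -> int:
    """Raise par after a win and lower it after a loss (assumed fixed step)."""
    if won:
        return par + step
    return max(min_par, par - step)


par = 50
for final_score in (50, 51, 40):  # win, win, loss
    par = update_par_binary(par, is_victory(final_score, par))
print(par)
```

With a symmetric step like this one, par settles near the score the player reaches about half the time, which keeps the winrate close to 50%.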

Other than that, I think the par format is mostly the same for a binary outcome as it is for a score outcome. To modify the game’s difficulty with a single variable and avoid indefinitely-scaling match lengths, the inclusion of optional-challenge rewards is still just as important as it is for a score-based game.


Difficulty modification in games can be improved by lowering the number of variables changed between ranks, to prevent the destruction of strategic information and minimize the number of new rules the player has to learn, and by increasing the number of ranks, to make the transition between them smoother and prevent rank oscillation. Taking these two ideas to their extreme, we can have a single number, the “par,” used to modify difficulty, that is both the only thing modified between matches and the representation of the rank itself.

To avoid the problem of the early game being boringly easy and the problem of having the length of matches scale indefinitely as the player gets better at the game, a game must include “optional-challenge rewards,” challenges that aren’t strictly necessary for success, but provide a point bonus if the player chooses to take them, at the expense of making the game harder.