Measuring competitiveness is, in some ways at least, easier than measuring fairness. Fairness is a complex notion, and even if you’ve clearly identified the aspect of fairness you’d like to measure, it may be difficult to know how fair a result is. Fairness (C), for example, is defined as rewarding merit. In my computer models, merit is simply a number, and so fairness measures relating to it can be calculated precisely. But in the world, what constitutes merit is not so clear.
Competition, in contrast, is more readily visible, though there are still some serious questions about how it might be measured. In this post, I’ll work through some of the issues, and suggest a couple of competition measures that I plan to test further.
In a footrace, the difference between the winning time and the time of the runner-up gives an indication of how competitive the race was. But the final time differential does not say everything about whether the race was excitingly close or not: perhaps the leader was comfortably ahead, and decided to trot across the finish line, winning with a time difference similar to that of a race in which the result was in doubt until the last.
For many sports, a point differential is a plausible indicator of how competitive a match is. But here, too, the way the final total was reached affects how competitive the match is perceived to be. A baseball game that ends with a score of 4-1 might have been a barnburner where there was a walk-off grand slam with two outs in the bottom of the ninth, but it is more likely to have been a rather unremarkable game.
The match model I use in my tourney simulator is simple. Each player has a normally-distributed skill factor, randomly generated but held constant for the duration of the tournament. For each game, the model adds to that skill factor a normally-distributed random number multiplied by a luck factor, and does the same for the opposition. The higher number wins. The difference between the two numbers will be the starting point for the metric of competition.
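To make the model concrete, here is a minimal sketch of a single game. It is an illustration, not the simulator's actual code: the function name `play_match`, the eight-player field, the luck factor of 1.0, and the use of standard normal draws (mean 0, standard deviation 1) for both skill and luck are all assumptions of mine.

```python
import random


def play_match(skill_a, skill_b, luck_factor, rng):
    """Return (performance_a, performance_b) for one game.

    Each performance is the player's fixed skill plus a fresh normal draw
    scaled by luck_factor. The higher performance wins.
    (Hypothetical helper; unit-variance draws are an assumption.)
    """
    perf_a = skill_a + luck_factor * rng.gauss(0.0, 1.0)
    perf_b = skill_b + luck_factor * rng.gauss(0.0, 1.0)
    return perf_a, perf_b


rng = random.Random(1)

# Skill is drawn once per player and held constant for the whole tournament.
skills = [rng.gauss(0.0, 1.0) for _ in range(8)]

# One game between players 0 and 1.
perf_a, perf_b = play_match(skills[0], skills[1], luck_factor=1.0, rng=rng)
winner = 0 if perf_a > perf_b else 1

# The gap between the two numbers is the raw material for a competition metric.
margin = abs(perf_a - perf_b)
```

Setting `luck_factor` to zero makes every game a pure comparison of skill, which is a handy sanity check on a model like this.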
But the right way to do this is not clear. Should the metric of competition be computed before or after the chance factors are added in? A tourney design that manages to match up players with very similar skill factors has done all it can to increase the competitiveness of the fixture, but that doesn’t mean that the match will actually be close – one side or the other might have all the luck. And, whether it’s done before or after the luck factor is applied, does it make sense to simply add up the differences between the two players? Perhaps the metric should be computed so as to discount the difference between an ordinary thumping and a huge blowout. In basketball terms, a 15-point victory is usually no more interesting a game than a 45-point blowout, and it may be wrong to measure competitiveness in a way that gives the latter three times the influence of the former.
My preferred alternative is to designate some threshold that separates close games from not-close games, and count the proportion of games in a tourney format that are close. The threshold chosen is bound to be somewhat arbitrary, but competition expressed in terms of the percentage of close games is a statistic that’s easy to interpret in a way that the existing fairness measures are not. If competition were given a similarly abstract numerical definition, it would take some time before one could get a feel for whether a score of, say, 23.43 indicated a highly competitive tourney, or otherwise. But a statement that 23.43% of matches were close enough to be considered competitive carries information that a score differential does not.
Accordingly, I plan to proceed, in the near future, to test two forms of a competition statistic, both of which will be couched in terms of a percentage of games that are close. The first will count games in which the skill disparity alone is less than some threshold – perhaps 0.2 for the current model. The second form will report the same percentage, but wait until after the chance factors have been applied. I may think of better names for these in the future, but for now I’ll just call them competition (A) and competition (B). Competition (A) is the percentage of games in which the opponents are nearly evenly matched, and competition (B) is the percentage of games in which, no matter what the disparity of skill level, the result of the match is within the threshold.
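The two statistics can be sketched as follows. This is a rough illustration under my own assumptions rather than a finished implementation: the `MatchRecord` type and the function names are hypothetical, "close" is taken to mean strictly less than the threshold, and the same 0.2 threshold is used for both measures simply because it is the only value floated so far.

```python
from typing import NamedTuple


class MatchRecord(NamedTuple):
    """One game: each side's fixed skill and its result after luck is added."""
    skill_a: float
    skill_b: float
    result_a: float
    result_b: float


def competition_a(matches, threshold=0.2):
    """Percentage of games in which the skill disparity alone is under the threshold."""
    if not matches:
        raise ValueError("no matches to measure")
    close = sum(1 for m in matches if abs(m.skill_a - m.skill_b) < threshold)
    return 100.0 * close / len(matches)


def competition_b(matches, threshold=0.2):
    """Percentage of games in which the final margin, luck included, is under the threshold."""
    if not matches:
        raise ValueError("no matches to measure")
    close = sum(1 for m in matches if abs(m.result_a - m.result_b) < threshold)
    return 100.0 * close / len(matches)


# Four made-up games, chosen only to show that the two measures can disagree.
sample = [
    MatchRecord(0.0, 0.1, 0.0, 1.0),    # evenly matched, lopsided result
    MatchRecord(0.0, 1.0, 0.5, 0.55),   # mismatched, close result
    MatchRecord(0.0, 0.05, 0.0, 0.5),   # evenly matched, lopsided result
    MatchRecord(1.0, -1.0, 2.0, -2.0),  # mismatched, blowout
]

pct_a = competition_a(sample)  # two of four games are close on skill
pct_b = competition_b(sample)  # one of four games is close on the result
```

Keeping the two functions separate, rather than folding them into one score, preserves the distinction between what the tourney design can control (the pairing) and what it cannot (the luck on the day).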