Tournament design has four goals: fairness, efficiency, participation, and spectator appeal. The metrics introduced so far give limited help in assessing whether some of those goals are achieved, and none at all for others.
So far, fairness has been measured only as fairness (C), the extent to which it is the best players who win. Efficiency is measured only in terms of the number of rounds in the tournament, a measure that is not important at all for an event in which the number of rounds is not a limiting factor. Participation has been measured only insofar as I’ve counted the number of repeated pairings likely to occur in a format. And spectator appeal has not been measured at all.
In this post, and another to come soon, I’ll introduce two new metrics: Fairness (B), and Competitiveness.
First, fairness (B): a measure of the extent to which all entries are treated equally.
The coefficient of fairness used to date measures only fairness (C), the extent to which the format rewards better performance. But in recent posts, particularly those treating the subject of byes, I’ve wanted to show that certain formats, though reasonably good at letting skill win out in the tournament as a whole, put some players or teams at a substantial disadvantage.
This is a particular hazard with respect to the treatment of less skillful players. If a format is to be judged exclusively by its fairness (C) coefficient, it doesn’t really matter what it does with respect to the players who are highly unlikely to win in any case. But for a tournament to be perceived as fair, it’s also got to give all entrants, including those who are very unlikely to win the whole thing, a fair chance to succeed.
I intend to implement a new fairness statistic, to be called fairness (B), to measure this. It’s a simple statistic:
1 / (SD + 0.01)
where SD is the standard deviation, taken across the players’ starting positions, of the number of overall tournament wins recorded from each position. The 0.01 is added so that the statistic will take a maximum value of 100 rather than infinity when the standard deviation is zero.
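To make the arithmetic concrete, here is a minimal Python sketch of the statistic. Several things in it are my assumptions rather than part of the definition above: the function name fairness_b is hypothetical, the population standard deviation is used (the text does not say whether population or sample is intended), and the win figures are invented percentages chosen only for illustration. The value of the statistic depends on the units in which wins are recorded, so the same units should be used whenever formats are compared.

```python
from statistics import pstdev


def fairness_b(wins_by_position):
    """Return 1 / (SD + 0.01) for a list of per-starting-position win figures.

    Assumption: SD is the population standard deviation; the original
    definition does not specify population versus sample.
    """
    sd = pstdev(wins_by_position)
    return 1.0 / (sd + 0.01)


# Hypothetical figures: share of tournament wins (in percent) from each of
# eight starting positions. These are made up for illustration only.
even_draw = [12.5] * 8
with_byes = [15.0, 15.0, 11.0, 11.0, 12.0, 12.0, 12.0, 12.0]

print(fairness_b(even_draw))   # SD is zero, so the statistic is at its cap
print(fairness_b(with_byes))   # any spread between positions lowers it
```

When every position wins equally often the standard deviation is zero and the statistic reaches its ceiling of 100; any spread between positions pulls it down.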
This is, admittedly, a rather blunt tool. It cares only about the entry position, so it will measure the effects of seeding, byes, and precious little else. In this way, it’s the mirror image of the fairness (C) statistic, which cares only about overall tournament wins. In the future, I’ll look for ways to extend the reach of both of these measures, so that fairness (B) can be assessed with respect to inequities that emerge after the first round, and fairness (C) can take account of how rewards other than outright victory are distributed.
But the decision to limit a formal fairness (B) metric to initial positions is not entirely arbitrary. Recall that in the analysis of the shifted bracket, we saw an inequity that emerged only in the second round. The winners of some initial-round matches did a little worse than others, but the losers of those same matches did a little better, so that the overall chances were not perceptibly different. I think it’s reasonable for a statistic to take this compensation into account.
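The compensation described here can be sketched with a little arithmetic. Everything in the example below is hypothetical: the function name title_chance and all of the probabilities are invented to show the mechanism, and none of them come from the shifted-bracket analysis itself. It also assumes a format in which a first-round loser can still win the tournament.

```python
def title_chance(p_win_first, p_title_if_win, p_title_if_lose):
    """Overall title chance from a starting position, combining the
    first-round winner's path and the first-round loser's path."""
    return p_win_first * p_title_if_win + (1 - p_win_first) * p_title_if_lose


# Hypothetical numbers. Entry B's first-round winner has a harder road than
# entry A's, but B's first-round loser has an easier one.
entry_a = title_chance(0.5, 0.30, 0.10)
entry_b = title_chance(0.5, 0.26, 0.14)

# The two starting positions come out level even though the paths differ.
print(entry_a, entry_b)
```

Because fairness (B) looks only at the totals for each starting position, two positions like these would contribute no spread to the standard deviation.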
Going forward, I expect to report the fairness (B) measure whenever it makes sense to do so, and over time to add it to previous analyses. It will often, however, be a pretty dull statistic. It will be very nearly 100 for any blind-draw tournament with a full, balanced bracket. And I expect it will also be very nearly 100 for other kinds of tournaments, including, for example, complete round robins. But I expect it will be useful for studying the allocation of byes, and for assessing different kinds of seeding.