For some time, I’ve been concerned that the fairness statistics I report are harder to interpret than I’d like. The new version of fairness (B) in particular has made the problem more apparent to me, and so I’m finally ready to make a change that I should probably have made long ago.
The difficulty is that the measures I’ve defined really measure unfairness more directly than they measure fairness. I define fairness (C) by adding up the instances of a less skillful player being rewarded in preference to a more skillful player. Fairness (B) is determined by adding up the inequalities in the results of similarly-placed competitors. In both cases, a higher raw total means less fairness, not more.
To date, I’ve been flipping this around by then taking a reciprocal, and that makes higher numbers good and lower numbers bad. But it also has a tendency to scale the numbers in ways that make them hard to interpret. For the new fairness (B) measure, the difference between a score of 3 and a score of 4 is quite significant – a format scoring 3 is much less fair than one scoring 4. But the difference between a fairness (B) score of 70 and one of 100 is pretty trivial, nothing that the designer should worry about.
For this reason, I’m getting rid of the reciprocals. Henceforth, lower numbers are good, and higher numbers are bad.
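To illustrate how a reciprocal distorts the scale, here is a minimal sketch. It assumes the reported score is a plain reciprocal of the underlying unfairness total, with no scaling constant; the real measures may be scaled differently, and the function names are hypothetical.

```python
def raw_unfairness(reported_score: float) -> float:
    """Undo the reciprocal: recover the raw unfairness total from a reported score.

    Assumption: reported_score = 1 / unfairness, with no extra scaling factor.
    """
    return 1.0 / reported_score


def unfairness_gap(lower_score: float, higher_score: float) -> float:
    """Difference in raw unfairness between two reported (reciprocal) scores."""
    return raw_unfairness(lower_score) - raw_unfairness(higher_score)


# Compare the two pairs of scores mentioned above.
for low, high in [(3, 4), (70, 100)]:
    print(f"scores {low} vs {high}: raw unfairness gap = {unfairness_gap(low, high):.4f}")

# scores 3 vs 4: raw unfairness gap = 0.0833
# scores 70 vs 100: raw unfairness gap = 0.0043
```

Under that assumption, the one-point step from 3 to 4 covers roughly nineteen times as much raw unfairness as the thirty-point step from 70 to 100, which is why the reciprocal scores are hard to read at a glance.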