I’ve long wondered whether the qualifying times for age group championship meets are fair. If the girls’ qualifying time is faster than the boys’, does it mean that girls are faster than boys or that the event favors boys at the expense of girls? And, vice-versa, if the boys’ qualifying time is faster than the girls’, does it mean that boys are faster than girls or that the event favors girls at the expense of boys? At what point do the qualifying times become fair? Without a large representative database of age group swims, it’s difficult for the casual eye to assess equality of opportunity from a championship meet announcement.
Age group athletes define their own success -- and build their confidence -- in large part by the number of championship meets they attend, the number of championship events they qualify for, and whether they swim on their team’s championship relays. Systematically unfair qualifying times would mean that USA-S is insidiously demoralizing athletes on the wrong side of the bias. Therefore it is vital to the mission and health of USA Swimming that championship qualifying times be fair.
In this post, we’ll explain how to analyze the fairness of an individual event in a championship meet, and how to roll up that analysis to assess the overall fairness of championship meets. In subsequent posts, we’ll apply our analysis to USA Swimming championship meets, including Junior Olympics, Zones, Sectionals, Futures, and Junior Nationals. We’ll also look at YMCA nationals. Our analysis will show that age group championship meets hosted by USA Swimming systematically favor boys at the expense of girls.
Throughout our analysis, we’ll stipulate that a championship event is "sex-fair" if male and female athletes of the same age are given the same opportunity to compete in that event. A priori, we’d expect that males and females would participate in sex-fair championship events in rough proportion to their overall participation in the sport.
This chart plots the fraction of female USA-S athletes as a function of age. It shows that USA-S has more 16/Under female athletes, and more 17/Over male athletes. Therefore, we would expect that sex-fair championship events would admit more 16/Under girls than 16/Under boys, and more 17/Over men than 17/Over women.
Our notion of fairness may be generalized to athlete attributes other than sex. For any athlete attribute -- such as age, birth month, or LSC -- we’ll say a championship event is fair with respect to that attribute if and only if athletes that differ with respect to that attribute have the same likelihood of qualifying for the event. Later we’ll see that championship events are extremely unfair with respect to age and birth month (link).
For the record, we don’t consider age groups to be inherently unfair, even though the athletes at the top of an age group have a greater opportunity to compete than those at the bottom. After all, in the normal course, every 9 year old should have an opportunity to compete as a 10 year old, every 11 year old should have an opportunity to compete as a 12 year old, and so on. Giving different ages different opportunities is not intrinsically unfair, unless the unfairness persists as the athlete grows older.
This narrow definition of fairness does not account for deeper inequities in USA Swimming, such as the inherent advantage of children raised in wealthy communities when compared with those in impoverished communities. It is designed to focus on the narrow question of whether a given meet entry specification is fair to existing USA-S age group athletes.
To compete in a championship event, an athlete must satisfy three conditions. Firstly, they must satisfy the sex and age restrictions of the event, if any. Secondly, they must have competed in the event (or an equivalent event) recently, typically in the past year. And thirdly, one of their recent swims in the event must have a time that is at least as fast as the event’s qualifying time.
Each of these three conditions progressively reduces the number of athletes who can compete in the event. Let’s consider an example from a recent elite championship meet targeted to 9-14 year old USA-S athletes. In the “13-14 200 Breaststroke”, the women’s qualifying time was 2:56.39; the men’s was 2:44.79. Times must have been achieved in the past year. Converted and non-conforming times are not accepted. We won’t name the meet because our goal is to develop our analytic technique rather than criticize the meet or its organizers.
The following table reports the number of athletes in our data who satisfy the three conditions for this championship event. Each percentage is relative to the preceding record. The first record, “Population”, reports that our data includes 434,928 female athletes and 309,995 male athletes. (Recall that we define an athlete to be a person of a given age in years, so an athlete who competes at three different ages is counted three times.) The second record, “Eligible”, reports that 136,642 of those women (31.42%) and 100,729 of those men (32.49%) are aged 13-14. The third record, “Recent Swim”, reports that 27,476 of those women (20.11%) and 20,630 of those men (20.48%) swam the event in their 13th or 14th year. The fourth record, “Qualifying Time”, reports that 3,178 of those women (11.57%) and 2,332 of those men (11.30%) achieved the qualifying time in that event in their 13th or 14th year.
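As a rough sketch of how the percentages above follow from the counts, the snippet below divides each record by the one before it. The names COUNTS and funnel are ours, chosen for illustration; only the counts themselves come from the table.

```python
# Counts quoted in the text for the 13-14 200 Breaststroke example.
COUNTS = {
    "women": {"Population": 434_928, "Eligible": 136_642,
              "Recent Swim": 27_476, "Qualifying Time": 3_178},
    "men": {"Population": 309_995, "Eligible": 100_729,
            "Recent Swim": 20_630, "Qualifying Time": 2_332},
}


def funnel(counts):
    """Return each record's count as a fraction of the preceding record."""
    stages = list(counts)  # dicts preserve insertion order
    return {
        stage: counts[stage] / counts[prev]
        for prev, stage in zip(stages, stages[1:])
    }


for sex, counts in COUNTS.items():
    for stage, fraction in funnel(counts).items():
        print(f"{sex:6} {stage:16} {fraction:.2%}")
```

Because every fraction is conditional on the preceding record, the men’s and women’s values can be compared directly even though the two populations differ in size.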
Putting this information in the following bar chart makes it easier to compare the relative frequencies. The bar chart adds one more record, “Accepted”, for the relative frequency with which eligible athletes are accepted to the event. In this case, 3,178 of the 136,642 eligible women (2.33%) and 2,332 of the 100,729 eligible men (2.32%) qualified. These “Accepted” values combine the likelihood of having swum the event recently with the likelihood of achieving the qualifying time in one of those swims.
This bar chart shows that the slightly greater likelihood of men having swum the event recently (“Recent Swim”) is offset by the slightly greater likelihood of women making the qualifying time (“Qualifying Time”), resulting in nearly identical likelihoods of qualifying for the event (“Accepted”).
Finally, if we only care about the sex-fairness of an event, the following bar chart is useful. It plots the natural logarithm of the ratio of the men’s likelihood to the women’s likelihood. The “Recent Swim” column plots ln(0.2048/0.2011); the “Qualifying Time” column plots ln(0.1130/0.1157); and the “Accepted” column plots ln(0.0232/0.0233). The logarithm of the ratio (aka “log-ratio”) is zero if the two likelihoods are identical, positive if the men’s likelihood is greater than the women’s, and negative if the women’s likelihood is greater than the men’s. The closer the men’s and women’s relative frequencies, the closer the log-ratio is to zero; the farther apart the two relative frequencies, the farther the log-ratio is from zero.
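The three log-ratios can be sketched directly from the counts quoted earlier. This is a minimal illustration rather than the code behind the chart; the function name log_ratio is our own.

```python
import math


def log_ratio(p_men, p_women):
    """Natural log of the men's likelihood over the women's likelihood."""
    return math.log(p_men / p_women)


# Counts from the example event: eligible, recent swim, qualifying time.
women = {"eligible": 136_642, "recent": 27_476, "qualified": 3_178}
men = {"eligible": 100_729, "recent": 20_630, "qualified": 2_332}

recent_swim = log_ratio(men["recent"] / men["eligible"],
                        women["recent"] / women["eligible"])
qualifying_time = log_ratio(men["qualified"] / men["recent"],
                            women["qualified"] / women["recent"])
accepted = log_ratio(men["qualified"] / men["eligible"],
                     women["qualified"] / women["eligible"])

print(f"Recent Swim     {recent_swim:+.4f}")
print(f"Qualifying Time {qualifying_time:+.4f}")
print(f"Accepted        {accepted:+.4f}")
```

Since the “Accepted” likelihood is the product of the other two, its log-ratio is the sum of the “Recent Swim” and “Qualifying Time” log-ratios. That is why a small positive value and a small negative value nearly cancel here.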
Log-ratio plots have three significant advantages over, say, plotting the ratios directly. Firstly, the log-ratio is zero when the men’s and women’s relative frequencies are identical, which makes it easy to spot sex-fair events. Secondly, the log-ratio is symmetric. If the women’s relative frequency is twice the men’s, then the log-ratio will have the same magnitude (but the opposite sign) as when the men’s relative frequency is twice the women’s. This makes it easy to compare the relative unfairness of events. Thirdly, the log-ratio compresses extremely large likelihood differences, which makes it easier to include events with large and small sex differences on the same plot. For these reasons, we’ll typically plot log-ratios when we want to assess the fairness of championship events.
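A small sketch makes the symmetry and compression properties concrete. The likelihood values here are made up purely for illustration.

```python
import math

# Symmetry: a 2x advantage has the same magnitude in either direction.
men_twice_women = math.log(0.02 / 0.01)
women_twice_men = math.log(0.01 / 0.02)
assert math.isclose(men_twice_women, -women_twice_men)

# The plain ratios are not symmetric about 1: 2.0 versus 0.5.
plain_ratios = (0.02 / 0.01, 0.01 / 0.02)

# Compression: a tenfold increase in the ratio adds a constant ln(10).
for ratio in (1, 10, 100, 1000):
    print(f"ratio {ratio:5d} -> log-ratio {math.log(ratio):.3f}")
```

On a plain ratio axis, a 1000-to-1 event would flatten every near-fair event into the baseline; on the log-ratio axis it sits only three times as far from zero as a 10-to-1 event.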
After all this analysis, it’s clear that our example championship event (the 13-14 200 Breaststroke) is very sex-fair at the age group level. The meet organizer chose the qualifying times for this event so that men and women have nearly identical opportunities to compete. The qualifying time likelihoods are very close and the overall accepted likelihoods are nearly identical. In our next post, we’ll analyze the sex-fairness of the other events in this championship meet. We’ll learn that many events in this meet are sex-unfair, as is the overall meet.
The preceding analysis shows that men and women in the 13-14 age group had nearly identical opportunities to compete in the 200 Breaststroke. The following plot breaks down that 13-14 age group by age in years. Breaking down the age group reveals the event is both sex- and age-unfair. The event is sex-unfair because the 14 year old men are 1.14 times as likely to qualify as the 14 year old women, while the 13 year old women are 1.45 times as likely to qualify as the 13 year old men. The event is also age-unfair because the 14 year old men are 3.0 times as likely to qualify as the 13 year old men, while the 14 year old women are 1.8 times as likely to qualify as the 13 year old women.
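To connect these ratios to the log-ratio plots, the sketch below converts the two sex ratios to log-ratios and checks that the four quoted figures are mutually consistent. It assumes each figure is a plain likelihood ratio (for example, 1.14 means the men’s likelihood divided by the women’s); the variable names are ours.

```python
import math

# Ratios quoted in the text, read as plain likelihood ratios (an assumption).
men_over_women_at_14 = 1.14
women_over_men_at_13 = 1.45
men_14_over_men_13 = 3.0
women_14_over_women_13 = 1.8

# Log-ratios use the men-over-women convention from the earlier plots.
log_ratio_at_14 = math.log(men_over_women_at_14)        # positive: favors men
log_ratio_at_13 = math.log(1.0 / women_over_men_at_13)  # negative: favors women

# Any three of the four ratios determine the fourth:
# w14/w13 = (w14/m14) * (m14/m13) * (m13/w13)
implied_women_14_over_13 = men_14_over_men_13 / (
    men_over_women_at_14 * women_over_men_at_13
)

print(f"log-ratio at 14: {log_ratio_at_14:+.3f}")
print(f"log-ratio at 13: {log_ratio_at_13:+.3f}")
print(f"implied women's 14/13 ratio: {implied_women_14_over_13:.2f}")
```

The implied women’s ratio lands close to the quoted 1.8, and the two log-ratios have opposite signs, which is how the age group as a whole can look sex-fair while each single year of age does not.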
This combination of age- and sex-unfairness has a subtle consequence. If every 13 year old athlete goes on to compete as a 14 year old, then women are more likely than men to make the cut both years (in their 13th and 14th year), while men are more likely than women to make the cut at least once (in their 14th year). It’s not clear whether this subtle sex difference is fair or not.
We can reframe championship event qualifying as a three-phase process:
In the first phase, athletes of the required age and sex progress to the second phase; everyone else is rejected from the event.
In the second phase, the athletes who have recently swum the event (or an equivalent event) progress to the third phase; everyone else is rejected.
In the third phase, athletes with a recent swim whose time is at least as fast as the event’s qualifying time are accepted to the event; everyone else is rejected.
The first and third phases are under control of the meet organizer, who chooses the age and sex restrictions for events and sets the qualifying times. The second phase is largely under control of the athlete and their coach, who decide which events the athlete swims. Dividing the championship event qualifying process into these three phases allows us to more clearly assess the sex-fairness of the meet organizer’s decisions.
We also refer to a synthetic “Accepted” phase that combines the second and third phases into an overall likelihood that an athlete has swum the event recently (second phase) and achieved the qualifying time in one of those swims (third phase). This “Accepted” phase shows the overall likelihood that an athlete of the required age and sex can enter the given event.
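The three phases can be sketched as a simple filter over athlete records. This is an illustration only: the Athlete record, its field names, and the sample athletes in the usage note are hypothetical, while the age range and qualifying times are those quoted for the 13-14 200 Breaststroke.

```python
from dataclasses import dataclass
from typing import Optional

# Qualifying times quoted in the text, converted to seconds.
QUALIFYING_SECONDS = {"F": 2 * 60 + 56.39, "M": 2 * 60 + 44.79}
ELIGIBLE_AGES = range(13, 15)  # ages 13 and 14


@dataclass
class Athlete:
    """Hypothetical record: one person at one age in years."""
    sex: str                              # "F" or "M"
    age: int
    best_recent_seconds: Optional[float]  # None if no recent swim in the event


def phase_outcome(athlete: Athlete) -> str:
    """Return where the athlete leaves the three-phase process."""
    # Phase 1: age and sex restrictions, set by the meet organizer.
    if athlete.sex not in QUALIFYING_SECONDS or athlete.age not in ELIGIBLE_AGES:
        return "rejected in phase 1"
    # Phase 2: a recent swim in the event, largely up to athlete and coach.
    if athlete.best_recent_seconds is None:
        return "rejected in phase 2"
    # Phase 3: the qualifying time, set by the meet organizer.
    if athlete.best_recent_seconds > QUALIFYING_SECONDS[athlete.sex]:
        return "rejected in phase 3"
    return "accepted"
```

For example, phase_outcome(Athlete("F", 13, 175.0)) returns "accepted", while phase_outcome(Athlete("M", 14, 170.0)) returns "rejected in phase 3" because 170.0 seconds is slower than the men’s cut of 164.79 seconds. Counting outcomes per sex over a full data set would give the relative frequencies for each phase, with the synthetic “Accepted” phase being the share of phase 1 survivors who reach the final return.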
In the next post, we’ll analyze the sex-fairness of the remaining events in this meet, as well as the overall sex-fairness of the meet.
Revision History.
2021-02-18
Published
Added links. Replaced “gender” with “sex”.