003.013a Deconstructing the Olympic Team Trials Qualifying Times

In this post, we analyze the process by which USA Swimming determined the qualifying times for the 2020 U.S. Olympic Team Trials.  We’ll learn that USA-S significantly favored men in three Trials events and women in two.  We’ll discover phenomena that favor younger women in the non-freestyle events and younger men in longer events.  These phenomena combine to create greater opportunities for 18/under men in the 1500m freestyle and 18/under women in the 200m butterfly and 400m individual medley.  We’ll also identify methodological flaws in the process used by USA-S to select the 2020 Trials cuts.

Background.

In fall 2018, USA Swimming published an article entitled “Behind the 2020 Olympic Trials Cuts: Business, Performance and Analytics”, which sketched the process for setting the 2020 Trials qualifying times.  According to the article, USA-S seeded the Trials cuts with the 70th fastest time in SWIMS for each event in competition year 2018, and then increased those times heuristically:


“We started with the 70th fastest time in each event as our baseline,” [Patrick] Murphy says. “We went into the database for 2010 and 2014 and determined how many would have qualified for the Trials two years later using those times. Using the 70th fastest time in each event in 2018, we could then project how [many] swimmers would qualify for the 2020 Trials.”


That number came to 1,148, 252 swimmers below the target. A handful of coaches took a look and suggested where to make adjustments. That raised the projection to 1,219. A small amount of secret sauce based on intuition was applied to the standards and the projection increased to 1,302.


If this description is correct, we should expect every Trials event to have at least 70 qualifiers in the 2018 competition year, and men’s and women’s events to have similar numbers of qualifiers.  Let’s retrace the steps outlined in the article.

From Top70 Times to Trials Cuts.

First, we’ll use the top times search on usaswimming.org to find the top 70 male and female swims for the Olympic swimming events in the 2018 competition year.  We’ll limit our search to USA-S members only. The times resulting from this search are shown in the table below.



CY2018 Top70 USA-S Members

Event     Female      Male
Fr50      25.87       22.88
Fr100     55.99       49.98
Fr200     2:01.40     1:50.19
Fr400     4:16.76     3:56.12
Fr800     8:50.70     8:14.75
Fr1500    17:02.20    15:48.44
Bk100     1:02.26     56.05
Bk200     2:14.63     2:02.69
Br100     1:10.62     1:02.64
Br200     2:32.72     2:16.53
Fl100     1:00.51     53.83
Fl200     2:14.51     2:01.01
IM200     2:17.13     2:03.73
IM400     4:51.24     4:26.11



Next we’ll compare these Top70 times to the 2020 Trials cuts.  Per the USA-S data research manager’s description, we expect the Trials cuts to be slower than the 2018 Top70 times, to qualify more athletes.  This graph plots the logarithm of the ratio of Trials cuts to Top70 times. Positive values indicate that the Trials cut is slower than the corresponding Top70 time; negative values indicate the opposite.  When the two times are close, as they are here, the log ratio closely approximates the fractional increase (or decrease) from the Top70 time to the Trials cut. For example, the Trials cut in the men’s 100m freestyle is about 1.0% greater than the Top70 time while the Trials cut in the women’s 800m freestyle is about 0.5% less than the Top70 time.
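As a rough illustration of why the log ratio can be read as a fractional change, here is a minimal Python sketch.  The Top70 time comes from the table above; the cut value is hypothetical, chosen only to be about 1% slower, and the helper names `to_seconds` and `log_ratio` are our own.

```python
import math

def to_seconds(swim_time: str) -> float:
    """Convert a swim time written as 'M:SS.ss' or 'SS.ss' to seconds."""
    minutes, _, seconds = swim_time.rpartition(":")
    return (int(minutes) * 60 if minutes else 0) + float(seconds)

def log_ratio(cut: str, top70: str) -> float:
    """Natural log of cut / Top70; positive means the cut is slower."""
    return math.log(to_seconds(cut) / to_seconds(top70))

top70 = "49.98"             # men's 100m freestyle Top70 time from the table
hypothetical_cut = "50.48"  # illustrative only, not the published cut

value = log_ratio(hypothetical_cut, top70)
fraction = to_seconds(hypothetical_cut) / to_seconds(top70) - 1
print(f"log ratio = {value:.4f}, fractional change = {fraction:.4f}")
```

For a difference of this size the two quantities agree to about four decimal places.  The approximation degrades as the ratio moves further from 1.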



This comparison of Top70 times and Trials cuts has two striking features.  The first feature is that 800m and 1500m freestyle are the only events whose Trials cuts are significantly faster than the Top70 times, for both men and women.  This is shown by the negative values on the plot, which indicate that the Trials cut is less than (i.e., faster than) the Top70 time. The second feature is that USA-S increased the men’s Trials cuts significantly more from the Top70 times than the women’s.  This is shown by the larger positive values in the plot, which indicate that the Trials cut is much greater than (i.e., slower than) the Top70 time. The greatest disparity between the men’s and women’s adjustments is in the 50m and 1500m freestyle, where USA-S increased the men’s cuts 0.9% more than the women’s.  On average the men’s Trials cuts are 0.5% greater than the Top70 times while the women’s are only 0.1% greater.


To visualize the effects of these adjustments on athlete qualification, we plot the number of USA-S athletes who swam at least as fast as the 2020 Trials cut in the 2018 competition year.  This graph shows, for example, that the Trials cuts for the 50m freestyle were top 89 for women in 2018 and top 125 for men. The 1500m Trials cuts were top 40 for women in 2018 and top 54 for men.
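The count behind each bar can be sketched as follows.  This is a minimal Python illustration, assuming each swim is available as an (athlete, seconds) pair; the sample swims and the cut are invented, and `qualifier_count` is a hypothetical helper rather than anything provided by SWIMS.

```python
def qualifier_count(swims, cut):
    """Count athletes whose best swim is at least as fast as the cut.

    swims: iterable of (athlete, seconds) pairs; an athlete may appear
    more than once, and only their fastest swim is considered.
    """
    best = {}
    for athlete, seconds in swims:
        if athlete not in best or seconds < best[athlete]:
            best[athlete] = seconds
    return sum(1 for seconds in best.values() if seconds <= cut)

# Invented sample data, for illustration only.
swims = [
    ("A", 22.10), ("A", 22.30), ("B", 22.95),
    ("C", 23.19), ("D", 23.40), ("B", 23.50),
]
hypothetical_cut = 23.19
count = qualifier_count(swims, hypothetical_cut)
print(count)
```

Reducing to one best time per athlete matters here, because a fast athlete with several qualifying swims should still count once.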



As in the previous graph of Top70 time adjustments, this graph has two notable features.


The first notable feature is that the distance freestyle events are the only events whose qualifying times are significantly below (i.e., faster than) Top70 times.  The 800m qualifying times are top 58 for women and top 63 for men; the 1500m qualifying times are top 40 for women and top 54 for men. Distance freestyle events are often heat-limited in age group meets due to their poor economics (event entry fee divided by pool and volunteer time).  Such an economic rationale would not apply to the Trials because 70 athletes with Top70 times would swim preliminaries of the 800m freestyle in just over an hour and the 1500m freestyle in just over two hours. The Trials format provides for preliminaries in the morning, optional time trials in the midday, and finals at night.  The Trials have a surfeit of pool and volunteer time relative to age group meets and the distance freestyle events could easily include many more athletes. Furthermore, an economic rationale would not justify favoring the men over the women in these events, as the Trials cuts appear to. So it is difficult to understand why the distance freestyle events are so disadvantaged in Trials.
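The session-length estimate can be checked with some back-of-the-envelope arithmetic.  The sketch below rests on assumptions that are ours, not the article’s: a 10-lane pool, 30 seconds of turnover between heats, and every heat lasting as long as the women’s Top70 time (an upper bound, since most entrants are faster).

```python
import math

def prelim_minutes(entrants, swim_seconds, lanes=10, turnover_seconds=30):
    """Estimated length of a preliminary session in minutes.

    lanes and turnover_seconds are assumed values, not published figures.
    """
    heats = math.ceil(entrants / lanes)
    return heats * (swim_seconds + turnover_seconds) / 60

women_800 = prelim_minutes(70, 8 * 60 + 50.70)    # Top70 time 8:50.70
women_1500 = prelim_minutes(70, 17 * 60 + 2.20)   # Top70 time 17:02.20
print(f"800m: {women_800:.0f} min, 1500m: {women_1500:.0f} min")
```

Under these assumptions the estimates land a little over one hour and a little over two hours respectively.  An 8-lane pool would add two heats per event and lengthen both sessions.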


The second notable feature of the preceding plot is that USA Swimming’s ad-hoc adjustments to the Top70 times consistently created more qualifying opportunities for men than women.  The only event where USA Swimming’s “secret sauce” favored women is the 400IM, and only by a small amount (0.2%). On average, the men’s Trials cuts are top 90 times while the women’s are only top 77 times!  We speculate this disparity may be intended to compensate for the larger number of foreign nationals in the men’s Top70 times than the women’s.


Let’s look at the effect of including foreign nationals in the Top70 times.  This graph plots the number of foreign nationals included in the Top70 times for the 2018 competition year.  It shows that the men’s Top70 times include significantly more foreign nationals than the women’s, and that shorter events typically include more than longer events.  On average the men’s Top70 times include 20.3 foreign nationals per event while the women’s include 13.6.


We’ve seen that USA Swimming adjusted the Top70 times to favor shorter events and to favor men’s events.  Accordingly, these adjustments would appear to mitigate the effects of including foreign nationals in their Top70 times.  Let’s now consider the net effect of these adjustments on the target Trials population of US nationals in the 2018 competition year.

The Empirical Truth Underlying Trials Cuts.

This graph plots the number of US nationals who swam at least as fast as a 2020 Trials cut in the 2018 competition year.  Trials are limited to passport-carrying US nationals, so this graph represents the bottom line outcome of USA Swimming’s process for setting the Trials cuts using swims from the 2018 competition year.  It shows that USA Swimming significantly advantaged men in three events and women in two, and that USA Swimming did not accurately undo the effects of including foreign nationals in their initial Top70 times.


The next graph plots the logarithm of the ratio of men’s to women’s qualifiers, for US nationals in the 2018 competition year.  Positive values indicate that an event’s qualifying times favored men over women; negative values indicate the opposite. The three events most advantageous to men were the 50m, 400m and 1500m freestyle, where USA-S allocated men 21%, 16% and 27% more qualifiers, respectively.  The two events most advantageous to women were the 200m butterfly and 400m IM, where USA-S allocated women 16% and 25% more qualifiers, respectively. Overall, USA-S allocated men 2% more qualifiers than women. The average women’s Trials cut is a top 62 time for US nationals in the 2018 competition year while the average men’s Trials cut is a top 63 time.



It is striking how arbitrary these Trials qualifying times are.  Why would USA Swimming seek to favor men in some events but women in others?  Why make it 29% easier for men to qualify in the 100m breaststroke than the 100m freestyle?  If anything we would expect USA Swimming to assign the most opportunity to the 100m and 200m freestyle, to ensure optimal staffing for its 400m and 800m freestyle relays.  It is also disappointing, considering how easy it would be to transparently construct fair, non-arbitrary Trials qualifying times using top 60 times for US nationals in the 2018 competition year.
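To show how simple the alternative would be, here is a minimal Python sketch that picks the Nth fastest time among US nationals, counting one best swim per athlete.  The record layout, the nationality flag, and the sample data are all assumptions made for illustration; `nth_fastest_cut` is a hypothetical helper, not part of any USA-S tool.

```python
def nth_fastest_cut(swims, n):
    """Return the nth fastest best time among US nationals.

    swims: iterable of (athlete, is_us_national, seconds) tuples.
    """
    best = {}
    for athlete, is_us_national, seconds in swims:
        if not is_us_national:
            continue  # foreign nationals cannot qualify for Trials
        if athlete not in best or seconds < best[athlete]:
            best[athlete] = seconds
    ranked = sorted(best.values())
    if len(ranked) < n:
        raise ValueError("fewer than n US nationals in the data")
    return ranked[n - 1]

# Invented sample data; a real run would use n=60 on a full season.
swims = [
    ("A", True, 49.10), ("B", False, 49.20), ("C", True, 49.50),
    ("A", True, 49.00), ("D", True, 49.98), ("E", True, 50.30),
]
cut = nth_fastest_cut(swims, 3)
print(cut)
```

Applying the same rule to every event and both sexes would give each event the same number of qualifiers in the seeding year by construction.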

Implications for Age Group Athletes.

Let’s now consider the implications of the Trials qualifying time selection process for 18/under athletes.  We’ll begin with a graph we’ve already shown, which counts the number of US nationals who swam at least as fast as a 2020 Trials cut in the 2018 competition year.



Next we’ll break down the female qualifiers by age group: 18/under versus 19/over.  This graph shows that 18/under women had a greater share of the non-freestyle qualifications while 19/over women had a greater share of the freestyle qualifications.



On average, 18/under women took 47% of the non-freestyle qualifications but only 30% of the freestyle qualifications.  Thus, 18/under women were de facto favored in the non-freestyle events versus the freestyle events by a factor of 1.6. Since the developmental difference between 18/under and 19/over women is small, we speculate that this disparity is due to an age-dependent difference in practice habits.



Turning now to the men, we see a very different pattern. This graph shows that 18/under men had a greater share of the longer distance qualifications (400m, 800m, 1500m freestyle; 200m backstroke, breaststroke, butterfly; and 400m individual medley) while 19/over men had a greater share of the shorter distance qualifications (50m, 100m, 200m freestyle; 100m backstroke, breaststroke, butterfly; and 200m individual medley).

On average, 18/under men took 37% of the longer distance qualifications but only 27% of the shorter distance qualifications.  Thus, 18/under men were de facto favored in the longer distances over the shorter ones by a factor of 1.4. We speculate that the 19/over men are stronger than the 18/under men on average, and prefer to train for shorter distance events.  As a result, the best Trials opportunity for 18/under men is in the longer distances.
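The two favoring factors quoted above are simply ratios of the 18/under shares.  A minimal Python sketch of that arithmetic, using only the percentages stated in the text (the function name is our own):

```python
def favoring_factor(share_favored: float, share_other: float) -> float:
    """Ratio of an age group's share in one set of events to its share in another."""
    return share_favored / share_other

# 18/under women: 47% of non-freestyle vs 30% of freestyle qualifications.
women_factor = favoring_factor(0.47, 0.30)
# 18/under men: 37% of longer-distance vs 27% of shorter-distance qualifications.
men_factor = favoring_factor(0.37, 0.27)

print(f"women: {women_factor:.2f}, men: {men_factor:.2f}")
```

Rounded to one decimal place, these give the factors of 1.6 and 1.4 cited for women and men respectively.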



To sum up, elite 18/under women are favored in the non-freestyle events over the freestyle events while elite 18/under men are favored in the longer events over the shorter ones.  These two factors combine with USA Swimming’s “secret sauce” to explain the 18/under event acceptance likelihoods including the previously noted Olympic distance freestyle asymmetry.


Let’s start with the event most favorable to 18/under men: the 1500m freestyle.  The USA-S “secret sauce” gifted men 1.27 times the number of women’s slots in this event.  18/under men earned a relatively large share (40%) of those endurance slots from 19/over men while 18/under women earned an unusually small share (30%) of those freestyle slots from 19/over women.  Combining these three factors results in a Trials event that qualified 19 swims by 18/under men in the 2018 competition year and only 10 by 18/under women, thus explicitly granting 18/under men 1.9 times as many slots as 18/under women.  In our next post (link), we use our database of 23 million age group swims to independently calculate that 18/under men had 1.9 times the unweighted opportunity of 18/under women in the Trials 1500m freestyle, so this phenomenon is not limited to the 2018 competition year.


Next let’s consider the event most favorable to 18/under women: the 200m butterfly.  The USA-S “secret sauce” gifted women 1.16 times the number of men’s slots in this event.  18/under women earned an unusually large share of those non-freestyle slots (55%) versus 19/over women.  Combining these two factors results in a Trials event that qualified 36 swims by 18/under women in the 2018 competition year and only 22 by 18/under men, thus explicitly granting 18/under women 1.6 times as many slots as 18/under men.  Again, our database of 23 million age group swims independently confirmed that 18/under women have the greatest likelihood of qualifying in the 200m butterfly, twice the likelihood of 18/under men, so this phenomenon is not limited to the 2018 competition year.

Implications for Coaching.

I was surprised to learn that 18/under men perform best in longer events relative to 19/over men.  In other sports, men attain their peak endurance after their peak strength, and both well after age 20.  VO2max doesn’t start declining until after age 30. The current world record in the men’s marathon was set by Eliud Kipchoge at age 33!  So if nature does not explain the relatively weaker performance of 19/over men in the longer events, perhaps nurture can. There may be an opportunity for a new approach to coaching 19/over men to improve their performance in longer swimming events.


I was also surprised to learn that 18/under women perform best in non-freestyle events relative to 19/over women.  Why would women experience a relative decline in their ability to swim backstroke, breaststroke, and butterfly after age 18?  Again, this seems like an opportunity for a new approach to coaching 19/over women to excel in non-freestyle events.

Methodological Recommendations.

At this point it is only fair to point out some defects in USA Swimming’s methodology for setting the 2020 Trials qualifying times.


The first process defect is a lack of transparency.  Failing to disclose the transformation from Top70 times to Trials cuts serves primarily to hide technical errors and political decisions that favor some athletes over others.  It’s difficult to believe that USA Swimming intended to simultaneously advantage men in the 1500m freestyle and disadvantage them in the 400m IM.  Maybe it was by mistake? The best way to address this process deficiency is for USA Swimming to publish the full data and analysis used to set the proposed Trials cuts, and publicly solicit feedback from the swimming and technical communities before finalizing the meet cuts.


The second process defect is a lack of accountability.  The crux of a successful forecasting effort is to forecast future events and validate the forecasts.  Only by holding an organization accountable for the accuracy of its forecasts can the accuracy of an organization’s forecasts improve.  In this case, USA Swimming internally forecast the number of Trials qualifiers for each of the twenty-eight Olympic swimming events, but only published their forecast for the aggregate: 1,302 total qualifiers across all Trials events.  An accountable forecasting process would publish its internal forecasts for each of the twenty-eight Trials events, along with a measure of uncertainty for each forecast.


The first technical defect is an apparent failure to match the historical data to the dependent variable. USA Swimming apparently included foreign nationals in the historical data used to forecast the number of 2020 Trials qualifiers.  Foreign nationals cannot qualify for the U.S. Olympic Team Trials! So the better approach would be to make the forecast using historical data on US nationals only, excluding foreign nationals. This simple change would obviate the need for coaches’ adjustments and the unfortunate “secret sauce based on intuition”.


The second technical defect is data sparsity.  USA Swimming used Top70 times from a single competition year to seed the Trials qualifying times.  At that time (2018), SWIMS contained fifteen years of elite competition data, including the results of four previous Olympic team trials.  I wouldn’t be surprised if Olympic years have more LCM swims than average, as more athletes attempt to make Trials cuts. Swimming has changed since USA Swimming started collecting data in 2003 but not so much as to invalidate swims prior to the current year.  So including more historical data in the forecast would likely result in a more accurate forecast.

Conclusion.

We’ve learned that the USA-S “secret sauce” explicitly favors men in the 50m, 400m, and 1500m freestyle and women in the 200m butterfly and 400m individual medley.  We’ve also learned that 18/under women perform best relative to 19/over women in the non-freestyle events, while 18/under men perform best relative to 19/over men at longer distances.   These effects combine multiplicatively to slightly favor 18/under women in the 200m butterfly and 400m individual medley and greatly favor 18/under men in the 1500m freestyle.


We’ve identified four methodological weaknesses in the process USA Swimming used to determine qualifying times for the 2020 U.S. Olympic Team Trials.  The process was neither transparent nor accountable. It failed to match historical data to the dependent variable and was based on an unnecessarily small sample of elite swims.  Fortunately, these methodological defects are easily remedied and we eagerly await the 2024 Trials cuts.