In the aftermath of last year's Boston Marathon attacks, the Boston Athletic Association (BAA) faced, among other, more serious problems, a pair of logistical issues. It had to consider how to give the nearly 6,000 runners who were unable to finish the 2013 race an opportunity to qualify for this year's race. It also had to decide how to treat their missing finish times, which were of real significance to people whose opportunity to fulfill a dream had been taken from them. Ultimately, all runners who completed the first half of the race but did not finish (DNF) received automatic entries. The BAA also pledged to provide official finish times for every runner who was still on the course when the race was halted, and made this a priority.
I was part of a team of experts assembled to advise the BAA on how best to predict the finish times of the DNF runners. Our proposed solution was to find a set of comparison runners who finished the race in 2010 and 2011 and whose running patterns were as close as possible to those of each DNF runner. These 2010 and 2011 runners were then used to predict the finish times. We did not consider the 2012 marathon because of a heat wave.
With the help of the BAA, we created a dataset consisting of all the runners in the 2013 race who reached the halfway point but failed to finish, and all the runners from the 2010 and 2011 Boston Marathons. The two explosions occurred at 2:49 p.m.; based on the latest recorded start time, anyone running faster than 3 hours, 56 minutes would have finished the race. Therefore, we restricted our analysis to runners projected to finish the race in more than four hours. The data consist of "split times" from each of the five-kilometer sections of the course and the final 2.2-kilometer section. Our task was to predict the missing split times for the runners who failed to finish the 2013 marathon.
We considered several different prediction algorithms, two of which I will discuss momentarily. First, though, we needed an objective way to validate the quality of the different algorithms. We wanted to know how close our predictions were to the true finish times. Of course, for runners in 2013 who failed to finish, this is impossible. However, we were able to create an independent validation dataset from the 2010 and 2011 data by setting aside the finish times of some of the runners. We then applied all of the proposed algorithms to predict the missing finish times in the validation dataset and compared our projected finish times with the true finish times of these runners. In this way, we were able to assess the accuracy of the different approaches.
The most common and straightforward way to predict finish times in a race is to employ a constant pace rule: assume all runners have run, and will continue to run, at a constant pace. For elite marathoners, this might be a viable solution; for the nearly 6,000 runners who did not finish the 2013 Boston Marathon, it is not ideal. To understand why, consider a hypothetical runner who has just crossed the 35-kilometer mark. Will this runner continue to run at a constant pace for the remaining 7.2 kilometers? Or will this runner hit the "wall" and slow down considerably? A less obvious problem is that the runner may have started too slowly, as is common with first-time marathoners, and might be expected to speed up over the last 7.2 kilometers. These different running patterns are the fundamental problem in predicting finish times, and we need to identify them in order to predict finish times accurately. For runners who were stopped after 35 kilometers but before 40, the constant pace rule came within three minutes of the true finish time for only 26 percent of runners in our validation dataset.
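As a rough illustration of the constant pace rule and the within-three-minutes check described above, here is a minimal Python sketch. The function names, the 42.2-kilometer course length (the 35 plus 7.2 kilometers mentioned above), and the sample numbers are assumptions made for illustration; this is not the code or data used in the analysis.

```python
# Course length in km, taken as 35 km plus the remaining 7.2 km (an assumption
# that matches the distances quoted in the text).
COURSE_KM = 42.2


def constant_pace_finish(last_checkpoint_km, elapsed_minutes):
    """Project a finish time (minutes) by extending the average pace so far."""
    if last_checkpoint_km <= 0:
        raise ValueError("runner must have passed at least one checkpoint")
    pace_min_per_km = elapsed_minutes / last_checkpoint_km
    return pace_min_per_km * COURSE_KM


def fraction_within(predicted, actual, tolerance_minutes=3.0):
    """Share of runners whose prediction is within the tolerance of the truth."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(abs(p - a) <= tolerance_minutes for p, a in zip(predicted, actual))
    return hits / len(actual)


if __name__ == "__main__":
    # Hypothetical runner: 35 km reached after 210 minutes (6 min/km).
    print(constant_pace_finish(35.0, 210.0))
```

Under this rule, a hypothetical runner who reaches 35 kilometers in 210 minutes (6 minutes per kilometer) is projected to finish in about 253.2 minutes, whether or not that runner was about to hit the wall.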
The constant pace rule's low accuracy emphasizes the importance of accounting for whether a runner is speeding up or slowing down. To achieve better predictions, we proposed five different prediction algorithms that, in one way or another, account for the different running patterns. While each of these methods outperformed the constant pace rule, a method based on a k-nearest-neighbors (KNN) algorithm worked best.
The KNN algorithm is a method that, as a runner, I find very intuitive. The idea is to find a set of comparison runners who finished the race in 2010 and 2011 and whose split times were as close as possible to those of each DNF runner. These runners are called the "nearest neighbors." We chose to consider the 200 nearest neighbors, and performed a local linear regression among them to produce our final predictions. (Oversimplifying, the local linear regression fits a line through the 200 nearest neighbors, and then uses this line to predict the finish time.) Under the KNN approach, we were able to predict the finish times of 69 percent of runners who were stopped after 35 kilometers but before 40 to within three minutes of their true finish time. This highlights the usefulness of incorporating all available information on the running pattern into the prediction algorithm rather than naively assuming a constant pace.
Another measure of accuracy is mean absolute error. We found that, on average across all DNFs, the KNN approach predicted the finish time within 1.57 minutes of the true finish time, compared with 3.25 minutes using the constant pace rule.
We recommended that the BAA adopt this approach, but they decided to use a constant pace rule that assumes all runners would continue to run at a constant pace. Was this a bit surprising? Absolutely, but I still believe that our approach is much better at predicting finish times than the method employed by the BAA, and the data support this claim. There is one silver lining, though.
In most cases, the BAA finish times based on the constant pace rule are lower than our projections. This means the BAA's projected finish times are more favorable to the runners than our predictions.
Looking beyond the specific issues posed by the 2013 Boston Marathon, there are other applications for this approach. Many large road races provide real-time information on competitors' performances via split times and predicted finish times. We adapted our KNN approach to provide real-time predictions of when each runner will arrive at various checkpoints along the course (e.g., the five-kilometer sections of the Boston Marathon) and more realistic estimates of finish times than the constant pace rule provides. This information would be valuable to spectators, who could receive real-time updates estimating when specific runners will arrive at various locations along the course.
This is an example of a problem for which the data sources are freely available and the problem is easily stated in language that does not require advanced scientific expertise. I encourage you to download the data from our website and produce your own predictions.
This article summarizes the work that Dorit Hammerling, Jessi Cisewski, Francesca Dominici, Giovanni Parmigiani, Charles Paulson, Richard Smith, and I did in conjunction with the BAA to provide official finish times for all of the runners of the 2013 Boston Marathon. Of our team of analysts, Francesca, Giovanni, and Dorit all participated in the 2013 race. For a complete description of the work, see our paper titled "Completing the Results of the 2013 Boston Marathon," appearing in the April 11 edition of PLOS ONE.
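For readers who want to experiment with the data, the nearest-neighbors idea described earlier can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method as implemented in the paper: it measures closeness by Euclidean distance between cumulative split times, and it fits a single straight line of finish time against the last recorded split among the neighbors. The function name, argument names, and the toy data in the test are hypothetical.

```python
import math


def knn_line_predict(hist_splits, hist_finish, runner_splits, k=200):
    """Predict a finish time from the k most similar historical runners.

    hist_splits   -- list of split-time lists (minutes) for past finishers
    hist_finish   -- matching list of their finish times (minutes)
    runner_splits -- split times recorded so far for the runner to predict
    """
    if len(hist_splits) != len(hist_finish) or not hist_splits:
        raise ValueError("need matching, non-empty historical data")

    # Rank past finishers by Euclidean distance between split-time vectors
    # (an assumed distance measure) and keep the k closest.
    order = sorted(
        range(len(hist_splits)),
        key=lambda i: math.dist(hist_splits[i], runner_splits),
    )
    nearest = order[:k]

    # Least-squares line: finish time against the last recorded split,
    # fitted only on the neighbors.
    xs = [hist_splits[i][-1] for i in nearest]
    ys = [hist_finish[i] for i in nearest]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        # All neighbors share the same last split; fall back to their mean.
        return y_mean
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    return y_mean + slope * (runner_splits[-1] - x_mean)
```

Because the line is fitted only to runners with similar splits, a runner who is already slowing down is compared with past runners who slowed in the same way, which is the behavior the constant pace rule cannot capture.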