2017 Dartmouth Natal Day 6 Mile Road Race Results Visualized and Analyzed
I recently developed a vision for how a generic set of running race results could be presented in more interesting ways with Tableau data visualization software than the plain table they are normally presented as. To share this vision, I've created a set of dashboards that I'll walk you through below, using the results from the 2017 Dartmouth Natal Day 6 Mile Road Race that took place just days ago in my area. The results have a dozen or so fields like runner name, bib number, home town, division (gender & age group), gun and chip times, pace per mile, and ranking among a few demographics. I've turned them into interactive dashboards that can mine the data for details, summarize it at high levels, visualize it in graphs, and customize it for the user's needs, with options to export it in several formats! You can send a link or get JavaScript embed code from the Share button at the bottom right. You can export what you see as a PDF or JPG image under Download. Or you can just view it at the sizes shown or in full screen.
Come with me and see what race results look like in my world!
A Preview
This is just the title dashboard to give you a preview of what's to come. Don't play with it too much as the real version comes later. But already, you can see race results with coloured icons for runners by gender and division, with their times. They are listed vertically by order of finish, where each rank is an equal gap from the next, and horizontally by proportional time gap. Hover your cursor over the icons representing the runners, or click on an icon, to see more info on each runner and their performance. Move the scroll bars at the right and/or bottom to see how the runners crossed the finish line. However, let's start from the beginning so that what comes later will make more sense.
Table of Contents
This is a basic overview of what can be done with these dashboards and, most importantly, your navigation centre to come back to if you aren't sure where to find a feature you saw among the many tabs at the top. Click on the TOC tab to return here any time.
Who turned up to race by demographics?
This is a survey of the field, showing who turned up to race in the various age and gender categories. It'd also be good for targeting demographics from which to try and attract more runners. You can look up the denominator in the various division placings and add them up, or just see them like this. Some information looks duplicated. However, I had room to show it in different ways for different comparisons: gender split in stacked bar format, side by side to really tell which had more, and grouped by gender to show the age distribution difference, or lack thereof, between the genders. Hover the mouse cursor over the bars to get even more information!
As for results, I'm not sure what the typical distribution of demographics should look like for a distance race in the 10K range. I'm more familiar with longer races' demographic distributions, which tend to be slightly older, with more runners in the higher age groups, and have a fair chance of having more female than male runners. That is not the case here, but I don't know if this is unusual. Also note the 355 total runners as an indication of race size.
From where did runners come and how did they do among each other?
This is where things get fun! People come from many places for a race, but have you ever wondered how you did among the people from your own town or city? Or maybe province? Or perhaps even country if only a small group of you showed up? Here, you can see squares representing runners from each home town or city lined up in a row. Towns and cities are grouped by province or state, then country, respectively. Each square represents one runner, in the order of their finish among runners from that town or city. Colour denotes their gender, with more of their information if you hover over any specific square. Suddenly, you have a new, unofficial category of competition!
You actually can have many more unofficial categories among which to compare your results, or that of others. Just use the filters on the side so you could see how runners from NS (Nova Scotia), who were males 20-29 fared. And maybe just the ones from Halifax and/or Dartmouth!
A percentage distribution on the side gives a quick comparison of how the runners were spread across home towns and cities.
Looking at the dashboard in the default format, I'd say it's good to see Dartmouth being the most represented home location of runners, even though it is less populous than Halifax. It is the Dartmouth Natal Day Road Race, after all! Their fastest runner was slower than a handful of the fastest runners from Halifax, but the guy, Wade Were, was quite fast, running a 5:48/mile race (3:36/km), and was in the 50-59 age group! All the younger runners in Dartmouth who were in this race need to step up their game for next year!
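If you want to check a per-mile to per-kilometre pace conversion like the one quoted above, here is a minimal Python sketch. It is not part of the dashboards; the function name `pace_per_km` is just something I made up for illustration, and it rounds to the nearest whole second.

```python
KM_PER_MILE = 1.609344  # length of the international mile in kilometres


def pace_per_km(pace_per_mile: str) -> str:
    """Convert an 'M:SS' per-mile pace to an 'M:SS' per-kilometre pace."""
    minutes, seconds = (int(part) for part in pace_per_mile.split(":"))
    # Total seconds per mile, scaled down to one kilometre and rounded.
    seconds_per_km = round((minutes * 60 + seconds) / KM_PER_MILE)
    return f"{seconds_per_km // 60}:{seconds_per_km % 60:02d}"


print(pace_per_km("5:48"))  # 3:36
```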
Looking further down, I see Alberta had a relatively strong representation! They aren't exactly near Nova Scotia where this race was held. New Brunswick and PEI are, and they hardly had anybody in the race! Geographically otherwise, Belgium (BE) and Bermuda (BM) were the only other countries that had representation. I hope those runners enjoyed this great race!
What did the start of the race look like?
The difference between your gun time and chip time tells how long it took you to cross the start line. From that, I've more or less recreated a picture of the start based on who crossed the line when, among others who crossed in the same second. Order within the second was by increasing bib number. Unlike the true start picture, though, you can tell each runner's division (age group and gender) visually, and see that distribution very easily and quickly.
The maximum number of runners that crossed the Start line in any one second was 16. The middle of the pack runner (median runner by order of start) crossed at 15 seconds. That's not bad for a group of 355 runners on a small street start. With just three runners having a sprinter-like reaction of less than half a second (as rounding probably determines this), it seems not everybody was keen to explode out of the gates after the gun, given that 16 were capable of crossing mid-pack in one second. At least there was a female among the very first group, with males otherwise much more aggressive in getting out of the gates. It's interesting to note that most of the top finishers crossed a second after the gun. Start out front and early, but no need to gun it like a 100m sprint and pull something like De Grasse, right? :)
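For anyone who wants to try the same idea outside Tableau, here is a rough Python sketch of the start reconstruction described above: start delay is gun time minus chip time, and runners within the same second are ordered by bib number. The sample rows and the names `to_seconds` and `start_rows` are made up for illustration; this is not how the dashboard itself was built.

```python
from collections import defaultdict


def to_seconds(clock: str) -> int:
    """Convert 'H:MM:SS' or 'MM:SS' to whole seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# Hypothetical sample rows: (bib number, gun time, chip time).
results = [
    (212, "45:10", "45:07"),
    (17, "33:02", "33:02"),
    (105, "40:31", "40:28"),
    (48, "38:15", "38:14"),
]


def start_rows(rows):
    """Group bibs by start delay in seconds, each group sorted by bib."""
    by_delay = defaultdict(list)
    for bib, gun, chip in rows:
        by_delay[to_seconds(gun) - to_seconds(chip)].append(bib)
    return {delay: sorted(bibs) for delay, bibs in sorted(by_delay.items())}


start_picture = start_rows(results)
print(start_picture)  # {0: [17], 1: [48], 3: [105, 212]}
```

Each key is a second after the gun, and each list is one "row" of the start picture.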
Who crossed the start when?
In this alternate version of the start line, I showed the runners' names along with their demographic information, and delay after the gun. Order within the same second here was determined by alphabetical order of first name (as names were registered in full as a field).
This view doesn't offer much more in terms of analytics than the previous dashboard. However, it does let you see more about the runners that would be more meaningful if you knew some of them.
The Official (Gun or Clock Time) Finish
Without mid-race data, the next thing I had to present was the Finish, which comes in two flavours: the official or gun time finish, on which awards are made, and the chip time finish, which more accurately shows the race you ran from when you crossed the start line. Awards are made on the former because it is a real race where you should be able to determine your immediate competition from who's closest to you, not a virtual race where someone beside you could be running with a significantly longer or shorter start delay than you.
I displayed the finish in two dimensions, vertically and horizontally. The vertical axis represents the order in which runners finished, where the difference between each place is the same. That is, first and second are a difference of 1 apart, just as 201st and 202nd are. Horizontally, though, there is a scaled time difference, so that if you finished a minute ahead of the next person, and 1:01 ahead of the person after that, the gap between those last two is 1/60th of the gap between you and the next runner behind you. This is done with the top finisher at the far left, the way one might see a finish from a spectator's viewpoint at the side. Left or right is a judgement call, but we're more familiar with top left than top right for first in something, so I picked left to go with the vertical alignment at the top.
I only showed the runner's name and time in the visual to avoid cluttering it and losing the symbols showing the finish gaps among all that information. Hover your mouse cursor over each runner's symbol, or click on it, to see much more information on them! I've basically included every calculation with respect to the "competition" for which they may receive an award, like overall, gender and division place. I also gave them a percentile placing, along with the usual race results. Those in the front may like to see some of these stats, whereas those in the back might not. However, if I had a cut-off point, many would call me judgemental, even if it were for good intent. Besides, the "risk" that comes with publicity in sports is that you can be the one who gets known for getting the winning score, or the one who gave it up.
Finally, I added median and mean times for the pack and genders. Mouse over the lines or click on them to see what they represent and their values. The median is the literal middle if you lined up everyone from fastest to slowest. The mean is an average of the times, so quite possibly nobody actually ran that time. The middle of the pack is the median time, and runners can see if they were in the front or back half of their demographic pack (overall, gender) by being on the left or right of that line, respectively... or right in the middle. There is no "mean" runner of the pack unless, by fluke, someone ran the same time as the one calculated. Those 6 lines were enough to start cluttering the visual, and similar times for divisions would be suspect, since a small sample size could give a misleading time much faster or slower than what it might have been had enough people in that division raced.
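To make the layout described above concrete, here is a small Python sketch of the idea: the vertical position is simply the finishing place, and the horizontal position is the time behind the winner multiplied by a scale. The function name `finish_positions` and the `pixels_per_second` scale are assumptions for illustration only, and the three times are made up to match the one-minute and 1:01 example.

```python
def finish_positions(gun_seconds, pixels_per_second=2.0):
    """Return (x, place) pairs: x scales with time behind the winner,
    place increases by exactly 1 per finisher."""
    ordered = sorted(gun_seconds)
    winner = ordered[0]
    return [
        ((t - winner) * pixels_per_second, place)
        for place, t in enumerate(ordered, start=1)
    ]


# Hypothetical times in seconds: winner, 1:00 behind, 1:01 behind.
positions = finish_positions([2041, 1980, 2040])
print(positions)  # [(0.0, 1), (120.0, 2), (122.0, 3)]

# The gap between 2nd and 3rd is 1/60th of the gap between 1st and 2nd.
gap_ratio = (positions[2][0] - positions[1][0]) / (positions[1][0] - positions[0][0])
```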
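Here is a tiny Python illustration of the median versus mean point made above, using the standard `statistics` module. The five times are invented, and `side_of_median` is a hypothetical helper name; it is only a sketch of the idea, not the dashboard's calculation.

```python
from statistics import mean, median

# Hypothetical finish times in seconds (not from the race).
times = [1980, 2400, 2520, 2700, 3600]

pack_median = median(times)  # the literal middle runner's time
pack_mean = mean(times)      # an average that nobody here actually ran


def side_of_median(runner_time, all_times):
    """Say which half of the pack a time falls in, relative to the median."""
    middle = median(all_times)
    if runner_time < middle:
        return "front half"
    if runner_time > middle:
        return "back half"
    return "right in the middle"


print(pack_median, pack_mean)        # 2520 2640
print(side_of_median(2400, times))   # front half
```

Note how the one slow time of 3600 pulls the mean above the median, which is the kind of skew the two lines let you spot.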
On to the results, I don't have other sets to compare to for this first visualization. One can easily see, though, that it was a good race between the first and second runners for at least a while. A big gap existed between second and third, though. The female race was closer at the front with smaller gaps, starting at the 19th finisher. You can very easily see all the close sprints to the finish for the entire race, actually, by scrolling down and looking for vertical groups of symbols practically on top of each other.
On the other end of the results, the end was a spread out affair with larger than usual gaps among the last half dozen runners or so, compared to the rest of the runners. Beyond that, at that level of detail, we'd probably be looking at specific runners and additional information, which probably isn't of general interest. There's plenty of text here already without going into that!
Looking at the aggregate results, the means and medians are pretty close for each category. That means there was probably no single time or small set of times that skewed the results unfairly (i.e. really warped a bell distribution). In plain language, that's a rough statement that isn't always true, but betting on it would likely win you money in the long run. Male times were faster than female times, but whether the gap was usual or not, I don't know enough to comment on. If you want that gap in terms of pace, grab the pace info from a runner whose time is close to the line you want a pace for. That's not easy to get Tableau to show with this feature.
The Chip Time Finish
This is the same dashboard format as the Finish Time dashboard, but based on Chip time, so there is no need to explain the dashboard features again. There is a lot of information here you would never get from a set of race results, since it is all "theoretical" as a race and set of results for the masses! But it's very interesting to see how the actual times run turned out compared to each other!
Chip times are more of a personal thing, as race awards and placements are based on official time. The general question people have about chip time is whether it would have made much of a difference had it been used instead of official time. While it covers each runner's race time more accurately than official time, it still doesn't account for traffic jams you might get into at the start, or somewhere along the way, that keep you from running your optimum race. I answer that question of difference in the next dashboard, but here, you can explore what the race would have been like if chip time had been used, something not offered with most race results. That should at least partially answer the question of difference, though not very well, since you'd have to memorize two finishing lists and their orders!
Since this visual is a theoretical race, I'll let you do your own analysis for what you want to see from it. It was easy to create from the other one, and worth doing for what it showed.
Does (or did) using Official time rather than Chip time matter much?
For practical purposes, the only impact this would have is for curiosity, and for whining if you somehow lost a placing on Official time instead of Chip time. It's not going to get you anything, or anything more. :)
To answer this question, I plotted the change in runners' places in each category for which they could have been awarded a prize (overall, gender, age group) if chip time had been used, given that official time determined their actual placings. The formula is:
- Official place minus Chip place (which I determined)
Just remember that a positive result means the runner would have benefitted from Chip time.
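As a rough sketch of the formula above, here is how the place change could be worked out in Python. The bibs and times are invented, the helper name `places` is hypothetical, and breaking ties by bib number is purely my assumption for the example.

```python
def places(times_by_bib):
    """Rank bibs by time (ties broken by bib, an assumption) starting at 1."""
    ordered = sorted(times_by_bib, key=lambda bib: (times_by_bib[bib], bib))
    return {bib: place for place, bib in enumerate(ordered, start=1)}


# Hypothetical times in seconds, keyed by bib number.
gun = {11: 2400, 22: 2405, 33: 2410}
chip = {11: 2399, 22: 2396, 33: 2409}

official_place = places(gun)
chip_place = places(chip)

# Official place minus Chip place: positive means Chip time would have helped.
change = {bib: official_place[bib] - chip_place[bib] for bib in gun}
print(change)  # {11: -1, 22: 1, 33: 0}
```

Bib 22 started further back, so it would move up one place on Chip time, while bib 11 would drop one.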
Knowing this, and looking at the graphs, the answer to whether using Chip time would have made much of a difference was clearly NO for this race. Keep in mind that usually only Top 3 awards are given, so any change beyond the third person in a category doesn't affect the awards.
In the overall graph, it took until the 21st finisher for Chip time to have made a difference. Not even close.
Top gender? The 20th male was the first affected, so not even close. For females it was the 7th, so closer, but still a no. There's no such thing as "close" here. It's either YES or NO.
Top Division? Filter out one division at a time if you need to, using the filters on the right. For the most part, the answer is as above, NO. There were literally a few cases but you can hunt for them if you want. They're the exception, won't make a difference, and I've got a lot more writing to do! lol
At least for this race, whether Chip or Gun time was used didn't really matter, but by the concept of a race being a true race run on the Official clock, it never should anyway. I'll just leave it at that.
How did various runner demographics perform?
Now comes the group analytics set of dashboards. Here, I compared the categories of gender, age group and both (as Division) to each other. Someone not experienced in distance running might well say males and youth are faster, on average, but you can look and see how well that holds true, or not! Just remember this is also only one race, with limited sample sizes for some categories, like Divisions, that could skew results unrealistically, while there were enough runners by gender to get a fair sample. That's why I included the number of runners in each category, so you can judge how reliable or "representative" the results you are looking at might be.
This was also only a 6 miler. Wait till I get a set of marathon results done for much more fun testing out that theory!
The top two graphs look similar, but each shows a different thing in a more convenient way. The top graph shows the gap between genders for each age group by having the age groups one below the other. The graph below it shows the distribution within each gender by having the gender symbols next to each other. Take out the gender and you get the age only comparison, third from the top, with the final graph being just gender, without the ages. Dotted lines show average (mean) times for the pack and genders.
Only the youngest age group, who are not yet adults mostly being 18 and under, did not fall into the general thinking of males and youths being faster. Kudos to the two gentlemen 70-79 years old for their relatively fast races, though!
How did runners from various locations perform as a group?
Were runners from Halifax faster than those from Dartmouth? Or guys from Bedford faster than guys from Lower Sackville? These graphs show that. You can even filter further, by age group for example, or just hide results from other locations. I only did averages, without standard deviations for determining statistical significance, because the small numbers from most places would have made that measure meaningless. Besides, it was for this race only, so just interpret the actual outcome for this race as a strict YES or NO.
This was a novel thing of curiosity that wasn't too hard to set up, but opened a new dimension to looking at results. :)
Dartmouth runners were definitely not faster than Haligonian ones, on average. That applied to the whole, as well as by gender. Guys from Bedford were slower than guys from Lower Sackville, but only by 0.7 minutes per runner (42 seconds).
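If you'd like to do this kind of location comparison yourself, here is a minimal Python sketch of averaging times by town and turning the gap into minutes. The four times are made up (they are not the real results; I only picked them so the gap works out to the same 42 seconds for illustration), and `average_by_town` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (town, finish time in seconds) rows, not the real results.
rows = [
    ("Bedford", 2700), ("Bedford", 2784),
    ("Lower Sackville", 2650), ("Lower Sackville", 2750),
]


def average_by_town(records):
    """Mean finish time in seconds for each town."""
    grouped = defaultdict(list)
    for town, seconds in records:
        grouped[town].append(seconds)
    return {town: mean(values) for town, values in grouped.items()}


averages = average_by_town(rows)
gap_seconds = averages["Bedford"] - averages["Lower Sackville"]
gap_minutes = round(gap_seconds / 60, 1)  # decimal minutes, as on the dashboard
print(gap_seconds, gap_minutes)  # 42 0.7
```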
Compare how your chosen groups of runners did!
This dashboard is really fun for your own running rivalries!
Identify your teams beforehand. Run your races. Then come here to check off boxes of team members and see the average times for each team, as well as specific runner comparisons, for your own competitive racing chart!
I just picked some fast runners I knew for examples. :)
Tabled Results
These final dashboards let you get customized results tables based on Bib number, Names or quick lists of any filter available. You can download them as PDFs or graphic JPGs in the Download menu from the button at bottom right.
AND THAT'S IT! For now, at least. Please let me know in the Comments below if you have suggestions and/or see errors. I'll fix the latter and see what I can do about the former. I can also answer questions, of course.
Thank you to Race Director Dave Nevitt and Troy Musseau from Atlantic Chip timing services for letting me use the data for this exercise.