Updated November 23, 2011
Runpaces 6.0, the first update to the program in three years, is now available. The basic performance prediction algorithm remains the same as before, as continued use has confirmed its high accuracy, but some modifications were made at the longer distances for highly trained runners, in light of further data analysis. The training and specialization analysis now includes estimated heart rates, as well as improved estimates of aerobic and lactate threshold paces. Air resistance effects now include consideration of a runner's height and weight. One example is a new feature estimating the effect of wind on running performance. A wide variety of scenarios are modeled: head wind, tail wind, out-and-back head then tail winds, cross winds, circular paths, and oval tracks with winds perpendicular or parallel to the straightaways. Previously, Version 5.0 added a couple of items, including tables of power output and caloric expenditure, and treadmill equivalent paces at different slopes.
The shareware demo is a free download; the price for the full, registered version is $35. Most features are enabled in the shareware, but some of the more specialized ones are reserved for the registered version. When you register, you send a code given by the program, and I then send you the key to unlock the extra features. I'm happy to supply two such keys, so that you can use it on two computers.
This shareware version, like its full, registered counterpart, is a Windows-based program that provides a (unique, I believe) way of modeling an individual's running race time versus distance relation, allowing predictions of race times at various distances and comparison to a standard performance curve to help evaluate relative strengths and weaknesses. For those who run 'on their own', it's a great tool for finding which performances were really the 'best' and for planning a realistic pace to run for a distance which is unfamiliar or at least not run recently. For coaches, it provides an objective way to arrive at realistic expectations and to guide the athlete towards the events for which he/she is best suited.
Runpaces is much more than a simple formula or curve-fit. It uses a model based on the physics and physiology and then sorts through your input data to pick the most appropriate performances for constructing your personal performance profile. Many other features are included as well (though some of these only with the registered version), such as your "best distance", aerobic and anaerobic threshold paces, equivalent performances at different distances, the effects of uneven pacing and hills, and the ability to store performances in personal data files. The user interface forms are easy to use, and output is both tabular and graphical. Printouts are available with the registered version.
The free demo, with most of the features of the full version, can be downloaded here. If you have questions, suggestions, or other comments, you can email me at firstname.lastname@example.org.
This is the main form, just after entering data and clicking the Calculate button.
Clicking on Analyze brings up this information.
Here's a sample of information obtained by clicking Special. This particular option shows the effects of different pacing strategies on a course with a grade of 5% uphill on half the course and 5% downhill on the other half, compared to the race time for a flat course.
Clicking Graph yields a plot of the generated performance curve and the points used to create it. It also shows world records, percent of record pace, and the performance level, indicating that this runner's best performances relative to others are at distances somewhere around 5 kilometers.
This table shows estimates of this same runner's performance at different ages, assuming fairly consistent training effort.
Another table shows equivalent paces on different slopes.
You can save race results for any number of individuals. These can be quickly selected for analysis on the main form.
The registered version has a few more items as well, not shown here.
Why I wrote it
I've been a runner and running enthusiast since about 1976 and began analyzing my own pace versus distance data as early as 1977, as a senior running track and cross country in high school. I never got that good at it (4:56 mile), but I've been running ever since and keeping data on it most of that time. I got my degrees in physics (which I really like) and started teaching it at the high school level. By 1990 I was coaching track and cross country and soon decided to try to attack the old pace versus distance problem from a physics and physiology standpoint, using the computer. I did this mostly for fun and just the challenge of it, but quickly began applying it both to myself and to those I coach, with great results. Now I want to share it with others, though I am asking the $35 for the full version since this represents a LOT of work.
How it works
The most basic assumption made here is that the energy used in running an 'all-out' race can be thought of as coming from two sources, one which is in more or less constant supply (the 'aerobic' component, roughly) and another which can be exhausted (the 'anaerobic' part). I also had to make an assumption about the relation of power output to running speed. Power output (in the form of VO2 max plus other components such as accumulation of blood lactate) is often assumed to be directly proportional to running speed, with a small additional term for air resistance. I assumed it to be proportional to the square of the speed, based on a (perhaps oversimplified) physics-based model in which the energy is being used largely to accelerate and decelerate the legs and arms.
The next step was to decide how much aerobic power was available. This varies from runner to runner, but correlates fairly well with performance in distance events. HOWEVER, two runners with, say, the same mile time might have quite different aerobic power output (per unit body mass) since this race is partly anaerobic as well. Simply put, a runner's 'speed' might compensate for a lack of 'endurance' or vice versa and it's impossible to tell from a single race how much of each was exploited. For this reason, I decided to solve the problem using TWO performances, since two equations are needed to solve for two unknowns!
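A minimal sketch of the two-equations idea, assuming only the simplest form of the model described above: power demand k*v^2, a constant aerobic power P, and a fixed anaerobic energy reserve E. Setting energy used equal to energy available over a race of distance d and time t gives k*(d/t)^2*t = P*t + E, or d^2/t = a*t + b with a = P/k and b = E/k. The function names, the symbols a and b, and the sample race times are illustrative assumptions and are not taken from Runpaces, which applies further corrections beyond this.

```python
from math import sqrt

def fit_two_races(race1, race2):
    """Solve d**2 / t = a*t + b for (a, b) from two (distance_m, time_s) races."""
    (d1, t1), (d2, t2) = race1, race2
    y1, y2 = d1**2 / t1, d2**2 / t2
    a = (y2 - y1) / (t2 - t1)  # aerobic term (P/k), constant supply
    b = y1 - a * t1            # anaerobic term (E/k), exhaustible reserve
    return a, b

def predict_time(a, b, distance_m):
    """Positive root of a*t**2 + b*t - d**2 = 0, i.e. the predicted race time."""
    return (-b + sqrt(b * b + 4 * a * distance_m**2)) / (2 * a)

# Hypothetical runner: 1500 m in 5:00 and 5000 m in 18:30.
a, b = fit_two_races((1500, 300.0), (5000, 1110.0))
t_3000 = predict_time(a, b, 3000)
print(f"predicted 3000 m time: {t_3000:.0f} s")
```

With these two sample races the sketch predicts a 3000 m time of roughly 10:46. The curve passes exactly through both input races, which is all that two equations in two unknowns can guarantee.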
My first results were somewhat realistic, but the performance curve wasn't quite the right shape, being too optimistic for distances outside the two input distances and slightly pessimistic in between. Any of my assumptions could have been wrong, but I decided to modify the anaerobic part by making it depend on race distance, the idea being that you simply can't get as 'exhausted' during a very short race as during a very long one (though this idea of exhaustion does NOT exactly correspond to lactic acid in the blood, which is not the only component of the phenomenon anyway). A little playing around with functions of distance quickly brought the curve BEAUTIFULLY in line with the large amount of data I had to check it against, which included my own and that of athletes, acquaintances, and friends of widely varying ability as well as that of world-class runners. The only significant departures occurred at very long distances (such as over 10 miles) and sprints (under 400 m). Another term, perhaps corresponding to the phenomenon of glycogen depletion, brought an appropriate correction to the very long distances, and other terms, such as reaction and acceleration time corrections, have brought the sprints very nicely in line as well, though this demo version doesn't include races under 400 meters.
Generally, I've been able to connect these mathematical tweakings with real physiological phenomena, but whether the labeling is correct or not, the model reflects reality very well. I am still wondering if my anaerobic adjustment really reflects, at least partly, a small error in the overall power-proportional-to-speed-squared assumption. I now believe, after reading more about it and curve-fitting treadmill test data I've seen (see Coe and Martin reference below) that the function is best described by power proportional to speed to a power between 1.2 and 1.7, at least if it's fit to a power law. It can also be modeled in more complex ways, such as the sum of two or more power-law parts, but the results I've gotten seem so good that I'm reluctant to tinker with this.
In short, the program finds a curve, based on a very reasonable physical/physiological model, that fits the individual's abilities - regardless of whether these are genetic in nature or due to training specificity - and makes predictions as well as evaluations of relative strengths (i.e. one's best race distance). This is not all the program does, however, as will be discussed below.
Comparison to other models
When I started on this project, I had seen very little in the way of others' attempts to do the same sort of thing. As I've begun showing the program to others and also as I've gotten access to the Internet, I've finally seen some of these efforts. Some are in the form of published tables while some others are in the form of computer programs (some free).
So what makes this program different? For one thing, most (though not quite all) of the others base performance on a single result. While this has the advantage of requiring little in the way of input, it does not at all address the problem (discussed above) of the different makeup (i.e. speed/endurance) of different runners. Usually the models or charts seem geared heavily towards runners specializing in the longer distances (5 or 10 K and up) and are quite far off when applied to, say, an 800 meter specialist who wants to try the 1500. In addition, some of these charts are really accurate only for elite runners, with fairly inaccurate results for more modest achievers. One attempt to address this can be found in Martin and Coe's book (referenced below). This consists of three sets of formulas - for 10K, 5K, and 1500 m specialists. Other than an obvious typo and some very minor discrepancies, I found these to fit my program's output very closely if I used two points for each formula. Coming from such an authoritative source, this may be seen mostly as a nice validation of my program, but still, it's three separate formulas and one may not know offhand which to use. There are also no provisions for race distances other than the 5 or 6 listed or for runners with other specialties. In short, it's useful but incomplete.
Another approach I have seen really has a different purpose in mind. A number of 'equivalent performance' schemes have been devised (a good example being Gardner and Purdy's reference below). These can work very nicely, but only over a rather limited range surrounding the runner's optimum event. For example, a miler who rates 700 'Purdy points' in that event probably rates quite close to that in the 2 mile and the half mile, but outside this the fit starts to stray significantly. The authors acknowledge this problem and, again, the issue of comparing performances at different distances in GENERAL is quite separate from that of comparing an INDIVIDUAL'S performances at different distances.
To address some specific models, one type I've run across is the power-law fit. An example of this is found on Runner's World's web page. When testing against data I've collected, I find this one to be fairly good only under certain circumstances. First of all, it assumes that an individual's race times are proportional to a certain power (1.07) of the race distance. This gives a rather small drop-off in speed with increasing distance, mainly appropriate for runners specializing in distances of at least 10K and running relatively high mileage. The other problem is that even in general (allowing different power laws for different runners), the fit is not the best. Specifically, I believe it gives over-optimistic interpolations and pessimistic extrapolations. Again, RUNPACES uses a model that has a sound physical/physiological basis and therefore fits real data much better than most arbitrary (even if inspired) choices of function are likely to, though even this model contains 'free parameters' that have allowed me to 'bend' the curve to make it even more accurate.
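For comparison, the single-race power-law fit discussed above can be sketched in a few lines. The function name and the sample 5K time are illustrative assumptions; the exponent is left as a parameter and defaults to the 1.07 quoted in the text.

```python
def power_law_time(known_distance, known_time_s, target_distance, exponent=1.07):
    """Predict a race time assuming time is proportional to distance**exponent."""
    return known_time_s * (target_distance / known_distance) ** exponent

# Hypothetical runner: 5000 m in 20:00, predicting 10000 m.
t_10k = power_law_time(5000, 1200.0, 10000)
pace_slowdown = 2 ** (1.07 - 1)  # pace ratio each time the distance doubles
print(f"predicted 10K: {t_10k:.0f} s, pace ratio per doubling: {pace_slowdown:.3f}")
```

With an exponent of 1.07, pace slows by only about 5% each time the distance doubles, and every runner gets the same drop-off no matter how their speed and endurance are balanced. That is the limitation described in the paragraph above.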
At this point, I need to point out that some of these other approaches DO have a place . . . in this program! For example, a generalized performance curve like Gardner and Purdy's allows a type of objective rating system and, when used in combination with an individual's performance curve, allows one to see which event that individual is 'best at'. I had developed such a performance curve based on world records, both for males and for females. I also adapted this curve to fit U.S. records, state high school records, or any other level. I found later that these curves almost exactly fit equal-Purdy-point curves, though I suspect the mathematical form may be fairly different.
As a first attempt to use such curves to rate performances, I simply had the program divide the individual's speed for a race by the generalized speed for that race using the particular level used for comparison. This works well, however, only if the individual is being compared to a curve based on runners of about the same ability. Getting a typical high-schooler's speeds as a percentage of world records, for instance, can be misleading. A 'good' high school sprinter, for example, may run 100m only 10% slower than the world record, while an 'equally good' high school miler might be nearly 20% off world records. To compensate for this effect, I developed a quantity I call 'performance factor' (not surprisingly, I found recently that this term has been used before for a similar type of rating). This quantity equals a percentage of world records at the mile, but is skewed to yield lower values at shorter distances and higher at longer ones so that, for example, the top high school sprinter in a state should get about the same rating for 100 m as the top cross country runner gets for 5K. In this way, performance factor measures essentially the same thing as do Purdy points, and the runner's best event is approximately the one that yields his/her highest p.f.
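The simple first-attempt rating described above, the runner's speed divided by the reference speed for the same distance, can be sketched as follows. The function name is an assumption, and the times are round hypothetical numbers chosen to mirror the sprinter/miler example, not actual records and not values from the program. The skewing that turns this into the 'performance factor' is not specified here, so it is not attempted.

```python
def percent_of_reference(time_s, reference_time_s):
    """Runner's average speed as a percentage of the reference speed at the same distance."""
    return 100.0 * reference_time_s / time_s

# Hypothetical round numbers, for illustration only.
sprinter = percent_of_reference(11.0, 10.0)   # 100 m: 11.0 s against a 10.0 s reference
miler = percent_of_reference(300.0, 240.0)    # mile: 5:00 against a 4:00 reference
print(f"sprinter: {sprinter:.1f}%  miler: {miler:.1f}%")
```

The two ratings differ by about ten percentage points even if the two athletes are 'equally good' for their level, which is the distortion the performance factor is meant to remove.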
As for curves based on only one point, there is some merit as well, since a runner may have been running only one event in recent weeks. For this reason, the program does allow single entries, but to make the curve more realistic, age and training data (in the simplified form of total miles per week, which ideally assumes some appropriate-to-the-event balance of speed versus endurance work) are used as well. Generally, higher mileages are assumed to be associated with greater endurance versus speed and aging is assumed to have a similar effect, though small since both endurance and speed show declines past about 30 years. With this information, the single point curve can be nearly as good as the two point one, or even better if one of the two points was a significantly sub-par performance. Some small sex difference is included as well, though male and female runners of similar ability have fairly similar performance curves.
Perhaps the program's most advanced capability is that it can sort through up to five different performances, weed out the bad ones, and try every possible combination of two points to yield the best one (though on very rare occasions with unusual data a point could get missed). It also can find the likely best performance of all and take into consideration the curve suggested by the remainder of points as well as the training and age data to generate a really accurate curve. Both options can be tried alternately on the same data set. If the races span a large range of distance and were all really good efforts, the 'best two' method may be better; with narrowly spaced results and/or widely varied race conditions or efforts the 'use all data' method is probably better.
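A rough sketch of the 'best two' search, under stated assumptions. The curve is the simplest two-parameter form consistent with the model described earlier (d^2/t = a*t + b), and the selection rule is a guess made for illustration: accept a pair only if its curve has physically sensible parameters and no other entered race beats it, then prefer the pair spanning the widest range of distance. The rule Runpaces actually uses is not described here and may well differ. All names and race times are hypothetical.

```python
from itertools import combinations
from math import sqrt

def fit(r1, r2):
    """Fit d**2 / t = a*t + b through two (distance_m, time_s) races."""
    (d1, t1), (d2, t2) = r1, r2
    y1, y2 = d1**2 / t1, d2**2 / t2
    a = (y2 - y1) / (t2 - t1)
    return a, y1 - a * t1

def predict(a, b, d):
    """Predicted time for distance d on the fitted curve."""
    return (-b + sqrt(b * b + 4 * a * d * d)) / (2 * a)

def best_pair(races):
    """Try every pair of races; return the accepted pair with the widest span."""
    candidates = []
    for r1, r2 in combinations(sorted(races), 2):
        if r1[1] == r2[1]:
            continue  # identical times cannot define a curve
        a, b = fit(r1, r2)
        if a <= 0 or b < 0:
            continue  # not a physically sensible curve
        # Assumed rule: no entered race may be faster than the curve predicts.
        if all(t >= predict(a, b, d) - 1e-6 for d, t in races):
            candidates.append((r2[0] - r1[0], (r1, r2)))
    return max(candidates)[1] if candidates else None

# Hypothetical data: the 3000 m result is a sub-par effort.
races = [(1500, 300.0), (3000, 680.0), (5000, 1110.0)]
print(best_pair(races))
```

For this sample data the sketch settles on the 1500 m and 5000 m results and leaves out the sub-par 3000 m race.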
Three other outputs generated are the 'fully aerobic training pace', which is actually used by the program, the 'aerobic threshold pace', and the 'VO2max pace'. The first closely represents an appropriate pace for longer training runs or 'easy' days, with heart rate about 70% of maximum, especially for fairly typical distance runners, though sprinters' (who don't often run far anyway) data may yield 'aerobic paces' that may be too slow. The second refers to the pace at which lactic acid begins to build much more rapidly and is appropriate on certain types of 'hard' days. The VO2max pace produces maximum oxygen uptake, though one can sprint faster. This pace is appropriate for interval training with the goal of increasing this ability to take in oxygen; faster interval training yields little additional benefit in this area and can be too stressful for optimal training.
In summary, I believe many aspects of this program's approach to be truly unique and remarkably accurate over a very wide range of abilities and distance specialties. A number of other options are available in the registered version.
Additional features of the registered version
Most of the features of the registered version are available to at least some extent in the shareware version. Here are the additional features you get with registration:
When you register, you will receive a unique compilation of the registered version, with your name visible at boot-up. For this reason, you may want to specify the first and last name of the runner who will most often use the program. You can change the name at run-time, however.
Obviously, human beings are not as predictable as, say, solar eclipses, so no model will be accurate for everyone all of the time. Not only does one's physical state vary widely from day to day, but the crucial mental component may be quite fickle as well. The type of runner that this program really models well is the diligent, mentally strong type who trains intelligently and can perform up to potential at will. There actually are a lot of runners like this, even in high school. Self-trained runners often have the self-discipline to perform consistently as well.
Course conditions and weather can have major and obvious effects as well. The program may have trouble predicting cross country times, as it really requires the consistency of a track or flat course for best results. Heat, while a rather minor factor in races lasting only a few minutes, really takes a toll in longer events. Basically the program can only be as accurate as the runner and conditions are consistent.
One potential problem to avoid is the confusion (to the program!) caused by using results collected over a span of many years, or even weeks if one's condition is changing rapidly. Using a mile time run in high school with a 10K run at age 40 may not be very meaningful, and for a runner peaking for an important race, a result from three weeks previous may give a misleading (often pessimistic) prediction. As for training data, the 'miles per week' is perhaps best thought of as an average over the last two months or so. All this is really fairly obvious, but mentioning it here should serve as a reminder to consider these issues in choosing data to feed the program. After all, 'garbage in, garbage out'.
Likewise, there is a situation in which some of your input races will not be fully included in the analysis. This occurs when some of the longer races entered were actually run at a faster pace than some of the shorter ones. This is the result of a procedure which deliberately 'weeds out' races that could not possibly be the runner's best, which of course is appropriate. The problem is that, if this occurs several times in the input data, there may be only one race left! In such cases the output is questionable - but then, so was the input. The solution is to always use the best few races, avoiding 'junk' data that should not be considered in the analysis.
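The weeding-out rule described above can be sketched as follows: a race is dropped if some longer race was run at a faster average speed. This is a minimal reading of that description; the function name and the sample times are hypothetical, and the program's own procedure may include refinements not shown here.

```python
def weed_out(races):
    """Keep only (distance_m, time_s) races not outpaced by a longer race."""
    kept = []
    for d, t in races:
        # A longer race run at a faster average speed means this race
        # could not have been the runner's best effort.
        outpaced = any(d2 > d and d2 / t2 > d / t for d2, t2 in races)
        if not outpaced:
            kept.append((d, t))
    return kept

# The 3000 m (about 4.41 m/s) is slower than the longer 5000 m (about 4.50 m/s).
print(weed_out([(1500, 300.0), (3000, 680.0), (5000, 1110.0)]))

# Each longer race is faster than the shorter ones: only one race survives.
print(weed_out([(1500, 330.0), (3000, 650.0), (5000, 1050.0)]))
```

The second call shows the 'only one race left' situation: every shorter race is outpaced by a longer one, so a single result remains and no two-point curve can be built from it.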
Of course the only way to see what I'm talking about firsthand is to try out the free demo! In case you don't want to scroll all the way back up, it can be downloaded here.
Daniels, Jack, Ph.D., Daniels' Running Formula, Second Edition. Champaign, Illinois, Human Kinetics, 2005.
Martin, David E. and Coe, Peter N., Training Distance Runners. Champaign, Illinois, Leisure Press, 1991.
Gardner, James B. and Purdy, J. Gerry, Computerized Running Training Programs. Los Altos, California, Tafnews Press, 1970.
Costill, David L., A Scientific Approach to Distance Running. Los Altos, California, Track & Field News, 1979.
Jarver, Jess (Ed.), Long Distances: Contemporary Theory, Technique, and Training. Mountain View, California, Tafnews Press, 1995.
World Association of Veteran Athletes (WAVA), Age-Graded Tables. Van Nuys, California, National Masters News, 1994.
Maffetone, Philip, The Big Book of Endurance Training and Racing. New York, New York, Skyhorse Publishing, 2010.
See Also: Large Scale Pace Study Web Page
Related links you might want to check out include Gopper's Running PR List, where you can look at the personal records at many distances for hundreds of runners of all abilities and even add your own records. Another page to check out is Patrick Hoffman's Cross Country and Running Analysis page, which discusses several different performance models including Purdy points and modifications thereof. If you want to try an online pace prediction, there's the Team Oregon Pace Wizard. This one is somewhat similar to Runpaces, but uses only one race on which to base your performance curve. You can actually enter three races, but it simply picks the one it thinks is best and apparently (from my experiments on it anyway) ignores the others. I believe it to be accurate only for long distance (i.e. over 10K) specialists and actually the page acknowledges something to that effect. One neat thing about it is that it gives heart rate estimates, too. Runner's World also has a pace predictor on its web page, but this one, which uses a 1.06 power law, is also geared toward high-mileage, long distance specialists and I believe the power law fit to be less accurate than either the Team Oregon model, or Runpaces, which I honestly feel is the best I've seen.
(c) Copyright Thomas J. Ehrensperger 2008