I’ve been working on several projects this offseason to evaluate my NASCAR handicapping process and results, and hopefully to improve the process. I’ve long wondered how much I should weight the prelim data (i.e., the practice and qualifying speed chart numbers) each week relative to the pre-prelim data, and I decided to find out.
First, a quick explanation of my handicapping process. For each NASCAR Sprint Cup race, I compile early driver rankings based on driver loop data and prelim data from prior races at that track and similar tracks. Then, following the prelims for that race, I re-compile the rankings by working in the new prelim data.
The question: When I work in the new prelim data, should I weight it equal to the early data? Or is the prelim data worth more as a predictive factor? Could it be worth less?
To figure out the optimum weighting formula, I examined the most recent 11 Sprint Cup races in which the prelim schedule was Practice 1–Qualifying–Practice 2–Practice 3 (PQPP). In theory, this schedule offers significantly more predictive value than the other NASCAR prelim schedule (i.e., Practice 1–Practice 2–Qualifying (PPQ)) because in PQPP you know every car is in race trim in practices 2 and 3. In PPQ, it’s pretty much impossible to consistently know which trim each of the 43 drivers is running in either practice because they haven’t qualified yet. But more important, looking at just PQPP races provided an apples-to-apples comparison.
I went through my rankings for each race and re-calculated them to reflect a wide range of different data weights, including:
- 0% Prelims–100% Earlies
- 10% Prelims–100% Earlies
- 20% Prelims–100% Earlies
- 30% Prelims–100% Earlies
- 40% Prelims–100% Earlies
- 50% Prelims–100% Earlies
- 60% Prelims–100% Earlies
- 70% Prelims–100% Earlies
- 80% Prelims–100% Earlies
- 90% Prelims–100% Earlies
- 100% Prelims–100% Earlies
- 100% Prelims–90% Earlies
- 100% Prelims–80% Earlies
- 100% Prelims–70% Earlies
- 100% Prelims–60% Earlies
- 100% Prelims–50% Earlies
- 100% Prelims–40% Earlies
- 100% Prelims–30% Earlies
- 100% Prelims–20% Earlies
- 100% Prelims–10% Earlies
- 100% Prelims–0% Earlies
I then compared all the new rankings to the actual results for each race and calculated the absolute deviation for each set of rankings. Finally, I charted the average deviations for each weighting formula.
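For readers who want to try something similar, here is a minimal Python sketch of the re-ranking and scoring steps described above. It is not my actual spreadsheet process. It assumes each driver has a numeric early score and a prelim score where lower is better, that the two are blended as a weighted average, and that "deviation" means the gap between a driver's predicted rank and finishing position. The function names (`weight_grid`, `blended_ranks`, `average_abs_deviation`) are made up for illustration.

```python
from typing import Dict, List, Tuple


def weight_grid() -> List[Tuple[float, float]]:
    """Return the 21 (prelim_weight, early_weight) pairs from the list above.

    Runs from 0% Prelims-100% Earlies up to 100%-100%, then down to
    100% Prelims-0% Earlies, in 10-point steps.
    """
    rising = [(p / 10, 1.0) for p in range(0, 11)]
    falling = [(1.0, e / 10) for e in range(9, -1, -1)]
    return rising + falling


def blended_ranks(
    prelim: Dict[str, float],
    early: Dict[str, float],
    w_prelim: float,
    w_early: float,
) -> Dict[str, int]:
    """Rank drivers (1 = best) by a weighted average of the two scores.

    Assumption: lower scores are better in both inputs. Ties are broken
    alphabetically by driver name so the result is deterministic.
    """
    total = w_prelim + w_early
    blended = {
        d: (w_prelim * prelim[d] + w_early * early[d]) / total for d in early
    }
    ordered = sorted(blended, key=lambda d: (blended[d], d))
    return {d: i + 1 for i, d in enumerate(ordered)}


def average_abs_deviation(
    predicted: Dict[str, int], finish: Dict[str, int]
) -> float:
    """Mean absolute gap between predicted rank and finishing position."""
    return sum(abs(predicted[d] - finish[d]) for d in predicted) / len(predicted)
```

Looping over `weight_grid()` for one race and calling `average_abs_deviation(blended_ranks(prelim, early, wp, we), finish)` for each pair gives one deviation per weighting formula, which is the number I charted.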
After the first five or so races I crunched, it was clear the average deviation began to soar after the equal, 100%–100% weighting. In other words, it was clear my rankings grew significantly and consistently worse once I began giving Prelims more weight than Earlies. So, to speed the tedious process up, I began calculating only from 0% Prelims–100% Earlies (i.e., 0%–100%) through 100%–100%. And after a couple more races I deemed it OK to narrow it further to 0%–100% through 80%–100%.
The results: After 11 races, my handicapping process produced the best (i.e., lowest) average deviation at the 40%–100% weighting. I’ve included a chart that shows the average deviation from 0%–100% through 80%–100%.
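Picking the winner is just an average across races followed by a minimum. The sketch below shows that step in Python under a couple of assumptions: each race is a dictionary mapping the prelim weight (in percent, with Earlies fixed at 100%) to that race's average deviation, and only weights computed for every race are compared, since I narrowed the range partway through. The function name `best_prelim_weight` is hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple


def best_prelim_weight(
    races: List[Dict[int, float]],
) -> Tuple[int, Dict[int, float]]:
    """Return the prelim weight with the lowest mean deviation across races.

    `races` holds one dict per race: prelim weight (%) -> average deviation.
    Only weights present in every race are compared. Ties go to the
    smaller prelim weight.
    """
    common = set.intersection(*(set(r) for r in races))
    averages = {w: mean(r[w] for r in races) for w in sorted(common)}
    best = min(averages, key=averages.get)
    return best, averages
```

Feeding it the 11 per-race dictionaries would return the best weight along with the averaged deviations that make up the chart.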
Some caveats: First, there is enormous variation between even just the PQPP races, so 11 races probably isn’t enough to be super definitive about the best weights. Second, these results and inferences are limited to my handicapping process; a different process may produce better prelims data, or better earlies data.
I now intend to use the 40%–100% weight as a starting point for each PQPP race in the 2013 season but will remain ready to tweak it if the prelims appear to offer more or less value than normal. For example, say the weather during prelims is cloudy and cool, it rains following Practice 3, and then on race day it’s sunny and boiling hot. In theory, those factors would reduce the predictive value of the prelims, so I might tweak my weighting to 32%–100%.
One more question: After all this, how much can I actually improve my rankings by using the optimum weighting formula? I looked at each race and concluded I should be able to gain a 5%–10% improvement each week over my performance last year. Not huge, but every edge helps, right?
Now I need to grind out the PPQ races and see if my aforementioned theory for that prelim schedule is correct!
How does actual qualifying speed/position play into this?
Karl, when I crunch the prelims data each week, of course the qualifying speed chart data goes in, and I weight the qualifying data the heaviest. I describe that process here:
https://nascarpredict.com/2013/12/17/my-prelims-scoring-system/
Regarding the track position piece of it: Sometimes I’ll manually move a driver up or down in the rankings due to their track position at a race’s start, but not often, and not too far. Example: a driver wins the pole at Bristol but then changes engines and must start at the rear–I might bump him down a little.
I don’t have a formula for working each driver’s starting position into my process. Why? First, I don’t know how to quantify how much that initial track position is worth. Second, its worth varies widely from track to track. At Talladega, for instance, it’s clearly worth close to zero, but it’s worth more at Sonoma. And third, I get the feeling it’s not worth that much–i.e., it doesn’t hold much predictive value. Guys come from the back all the time, and pole winners frequently drop like stones.