Given the analysis and discussion so far, we can now think of having a set of models to choose from, where the differences between models are defined by a few parameters. These parameters are the choice of weighting scheme on the historical strokes-gained averages (this involves just a single parameter that determines the rate of exponential decay moving backwards in time), and the weights that are used to incorporate the detailed strokes-gained categories through a reweighting method.
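As a rough sketch of what such a decay-weighted average could look like (the function name, the decay value, and the most-recent-first ordering are illustrative choices, not our production code):

```python
import numpy as np

def weighted_sg_average(strokes_gained, decay):
    """Exponentially decaying weighted average of historical strokes-gained.

    strokes_gained: rounds ordered from most recent to oldest.
    decay: rate of exponential decay moving backwards in time (0 < decay < 1).
    """
    sg = np.asarray(strokes_gained, dtype=float)
    weights = decay ** np.arange(len(sg))  # most recent round gets weight 1
    return np.sum(weights * sg) / np.sum(weights)

# Example: a golfer's last five rounds, most recent first.
print(weighted_sg_average([2.1, -0.5, 1.3, 0.8, -1.0], decay=0.95))
```

A decay close to 1 spreads weight over a long history, while a smaller decay concentrates it on recent form; that single number is one of the parameters being tuned below.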
The optimal set of parameters is selected through brute force: we loop through all possible combinations of parameters, and for each set of parameters we evaluate the model's performance through a cross validation exercise. This is done to avoid overfitting: that is, choosing a model that fits the estimating data very well but does not generalize well to new data.
The basic idea is to divide your data into a "training" set and a "testing" set. The training set is used to estimate the parameters of your model (for our model, this is basically just a set of regression coefficients [9]), and then the testing set is used to evaluate the predictions of the model.
We evaluate the models using mean-squared prediction error, which in this context is defined as the difference between our predicted strokes-gained and the observed strokes-gained, squared and then averaged. Cross validation involves repeating this process several times, i.e. re-splitting the data into different training and testing sets and averaging the resulting prediction errors. This repetition is again done to avoid overfitting.
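A minimal sketch of the brute-force search, assuming a simple linear regression and a hypothetical `build_features` helper that turns raw round histories into model inputs (both are stand-ins, not our actual pipeline):

```python
import numpy as np
from itertools import product
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

def cv_mse(X, y, n_splits=5):
    """Mean-squared prediction error of a linear model under K-fold cross validation."""
    errors = []
    for train_idx, test_idx in KFold(n_splits, shuffle=True, random_state=0).split(X):
        model = LinearRegression().fit(X[train_idx], y[train_idx])
        preds = model.predict(X[test_idx])
        errors.append(np.mean((y[test_idx] - preds) ** 2))
    return np.mean(errors)

# Brute force: evaluate every combination of candidate parameters and keep the
# one with the lowest cross-validated error.  `build_features`, `rounds`, and
# `observed_sg` are hypothetical placeholders.
grid = {"decay": [0.90, 0.95, 0.99], "category_weight": [0.25, 0.50, 0.75]}
# best_params = min(
#     (dict(zip(grid, combo)) for combo in product(*grid.values())),
#     key=lambda p: cv_mse(build_features(rounds, **p), observed_sg),
# )
```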
The model that performs the best in the cross validation exercise should hopefully be the one that generalizes best to new data. That is, after all, the goal of our predictive model: to make predictions for tournament outcomes that have not occurred yet. One thing that becomes clear when testing different parameterizations is how similarly they perform overall despite disagreeing in their predictions quite often.
This is troubling if you plan to use your model to bet on golf. For example, suppose you and I both have models that perform similarly well overall (i.e. have nearly identical cross-validation error) but often disagree on specific predictions. This means that each of us would find what we perceive to be "value" in betting on some outcome against the other's model.
However, in reality, there is not as much value as you think: roughly half of those discrepancies will be cases where your model is "incorrect" because we know, overall, that the two models fit the data similarly. The model that we select through the cross validation exercise has a weighting scheme that I would classify as "medium-term": rounds played years ago do receive non-zero weight, but the rate of decay is fairly quick.
Compared to our previous models this version responds more to a golfer's recent form. In terms of incorporating the detailed strokes-gained categories, past performance that has been driven more by ball-striking, rather than by short-game and putting, will tend to have less regression to the mean in the predictions of future performance.
To use the output of this model — our pre-tournament estimates of the mean and variance parameters that define each golfer's scoring distribution — to make live predictions as a golf tournament progresses, there are a few challenges to be addressed. First, we need to convert our round-level scoring estimates to hole-level scoring estimates.
This is accomplished using an approximation which takes as input our estimates of a golfer's round-level mean and variance and gives as output the probability of making each score type (i.e. birdie, par, bogey, and so on) on a given hole. Second, we need to take into account the course conditions for each golfer's remaining holes. For this we track the field scoring averages on each hole during the tournament, weighting recent scores more heavily so that the model can adjust quickly to changing course difficulty during the round.
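To illustrate the first challenge, here is one hedged way a round-to-hole conversion could work: spread the round-level mean and variance evenly across 18 holes and discretize the implied hole-level distribution into score types. The normality assumption and the cutoffs below are ours for illustration only; the actual approximation may differ.

```python
import numpy as np
from scipy.stats import norm

def hole_score_probs(round_mean, round_var, hole_adjustment=0.0):
    """Rough map from a round-level scoring distribution (relative to par) to
    hole-level probabilities of each score type.

    Treats hole scores as roughly independent, so the per-hole mean and variance
    are 1/18 of the round-level values; the cutoffs are illustrative only.
    """
    mu = round_mean / 18.0 + hole_adjustment        # expected strokes vs. par on this hole
    sigma = np.sqrt(round_var / 18.0)
    cutoffs = [-1.5, -0.5, 0.5, 1.5]                # boundaries between score types
    cdf = norm.cdf(cutoffs, loc=mu, scale=sigma)
    probs = np.diff(np.concatenate(([0.0], cdf, [1.0])))
    labels = ["eagle_or_better", "birdie", "par", "bogey", "double_or_worse"]
    return dict(zip(labels, probs))

# A golfer expected to shoot about 1 under par with a round-level variance of 7.5:
print(hole_score_probs(round_mean=-1.0, round_var=7.5))
```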
Of course, there is a tradeoff in this recency weighting between sample size and the model's speed of adjustment. Another important detail in a live model is allowing for uncertainty in future course conditions. This matters mostly for estimating cutline probabilities accurately, but it also matters for estimating finish probabilities.
If a golfer has 10 holes remaining, we allow for the possibility that these remaining 10 holes play harder or easier than they have played so far, due to wind picking up or settling down, for example. We incorporate this uncertainty by specifying a normal distribution for each hole's future scoring average, with a mean equal to its scoring average so far and a variance that is calibrated from historical data [10].
The third challenge is updating our estimates of player ability as the tournament progresses. This can be important for the golfers that we had very little data on pre-tournament. For example, if for a specific golfer we only have 3 rounds to make the pre-tournament prediction, then by the fourth round of the tournament we will have doubled our data on this golfer!
Updating the estimate of this golfer's ability seems necessary. To do this, we have a rough model that takes 4 inputs: a player's pre-tournament prediction, the number of rounds that this prediction was based off of, their performance so far in the tournament relative to the appropriate benchmark, and the number of holes played so far in the tournament.
The predictions for golfers with a large sample size of rounds pre-tournament will not be adjusted very much: a 1 stroke per round increase in performance during the tournament translates to only a small fraction of a stroke in the updated ability estimate. However, for a very low-data player the ability update can be much more substantial: the same 1 stroke per round improvement could translate to a considerably larger, though still fractional, adjustment.
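We don't spell out the functional form of this rough model here; the sketch below is just one way such an update could behave, with the weight on tournament data growing in holes played and shrinking in the number of pre-tournament rounds. Every constant in it is a made-up placeholder.

```python
def update_ability(pre_tournament_sg, n_prior_rounds, sg_vs_benchmark, holes_played):
    """Blend a pre-tournament skill estimate with within-tournament performance.

    The weight on the tournament data rises with holes played and falls with the
    amount of pre-tournament data, so well-established players barely move while
    low-data players can move substantially.  Constants are illustrative only.
    """
    rounds_played = holes_played / 18.0
    # Damp the tournament rounds: a single event is a noisy signal.
    effective_new = 0.5 * rounds_played
    w_new = effective_new / (effective_new + n_prior_rounds)
    return pre_tournament_sg + w_new * (sg_vs_benchmark - pre_tournament_sg)

# A golfer with 3 prior rounds moves far more than one with 150 after 36 holes:
print(update_ability(0.0, 3, 2.0, 36))    # low-data player
print(update_ability(0.0, 150, 2.0, 36))  # established player
```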
With these adjustments made, all of the live probabilities of interest can be estimated through simulation. For this simulation, in each iteration we first draw from the course difficulty distribution to obtain the difficulty of each remaining hole, and then we draw scores from each golfer's scoring distribution taking into account the hole difficulty.
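One iteration of that simulation might look roughly like the sketch below: draw each remaining hole's difficulty from a normal distribution around its observed scoring average, then draw golfer scores around that difficulty. The names, the continuous (rather than discrete) hole scores, and all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_remaining_holes(golfers, hole_avgs, hole_difficulty_var, n_sims=10_000):
    """Simulate total scores (vs. par) over the remaining holes.

    golfers: dict of name -> (per-hole mean vs. the field, per-hole variance).
    hole_avgs: observed field scoring average (vs. par) on each remaining hole.
    hole_difficulty_var: calibrated variance of future hole difficulty.
    """
    n_holes = len(hole_avgs)
    totals = {name: np.zeros(n_sims) for name in golfers}
    for s in range(n_sims):
        # Draw this iteration's difficulty for every remaining hole.
        difficulty = rng.normal(hole_avgs, np.sqrt(hole_difficulty_var))
        for name, (mu, var) in golfers.items():
            # Golfer's hole scores = hole difficulty + golfer-specific deviation.
            scores = difficulty + rng.normal(mu, np.sqrt(var), size=n_holes)
            totals[name][s] = scores.sum()
    return totals

golfers = {"A": (-0.08, 0.40), "B": (0.02, 0.40)}   # hypothetical per-hole skill
hole_avgs = np.array([0.10, 0.20, -0.05, 0.15, 0.30, 0.00, 0.10, 0.20, 0.05, 0.10])
totals = simulate_remaining_holes(golfers, hole_avgs, hole_difficulty_var=0.01)
print("P(A beats B):", np.mean(totals["A"] < totals["B"]))
```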
The clear deficiency in earlier versions of our model was that no course-specific elements were taken into account. That is, a given golfer had the same predicted mean (i.e. expected strokes-gained) and variance at every course. After spending a few months slumped over our computers, we can now happily say that our model incorporates both course fit and course history for PGA Tour events.
For European Tour events, the model only includes course history adjustments. Further, we now account for differences in course-specific variance, which captures the fact that some courses have more unexplained variance (e.g. TPC Sawgrass) than others. This will be a fairly high-level explainer.
We'll tackle course fit and then course-specific variance in turn. The approach to course fit that was ultimately successful for us was, ironically, the one we described in a negative light a year ago. For each PGA Tour course in our data we estimate the degree to which golfers with certain attributes under- or over-perform relative to their baselines (where a golfer's baseline is their predicted skill level at a neutral course).
The attributes used are driving distance, driving accuracy, strokes-gained approach, strokes-gained around-the-green, and strokes-gained putting. More concretely, we correlate a golfer's performance relative to baseline at a given course with their estimated skill levels in each of these attributes. Attribute-specific skill levels are obtained using methods analogous to those described in an earlier section for obtaining golfers' overall skill levels.
For example, a player's predicted driving distance skill at time t is equal to a weighted average of previous (field-strength-adjusted) driving distance performances, with more recent rounds receiving more weight, regressed appropriately depending on how many rounds comprise the average. The specific weighting scheme differs by characteristic; not surprisingly, past driving distance and accuracy are very predictive of future distance and accuracy, and consequently relatively few rounds are required to precisely estimate these skills.
Conversely, putting performance is much less predictive, which results in a longer-term weighting scheme and stronger regression to the mean for small samples. With estimates of golfer-specific attributes in hand, we can now attempt to estimate a course-specific effect for each attribute on performance — for example, the effect of driving distance on performance relative to baseline at Bethpage Black.
The main problem when attempting to estimate course-specific parameters is overfitting. Despite what certain sections of Golf Twitter would have you believe, attempting to decipher meaningful course fit insights from a single year of data at a course is truly a hopeless exercise.
This is true despite the fact that a year's worth of data from a full-field event yields a nominally large sample of rounds. Performance in golf is mostly noise, so finding a predictive signal requires, at a minimum, big sample sizes (it also requires that your theory makes some sense).
To avoid overfitting, we fit a statistical model known as a random effects model. It's possible to understand its benefits without going into the details. Consider estimating the effect of our 5 attributes on performance-to-baseline separately for each course: it's easy to imagine that you might obtain some extreme results due to small sample sizes.
Conversely, you could estimate the effect of our 5 golfer attributes on performance-to-baseline by pooling all of the data together: this would be silly, as it would just give you an estimate of 0 for all attributes (we are analyzing performance relative to each golfer's baseline, which has a mean of zero by definition). The random effects model strikes a happy medium between these two extremes by shrinking the course-specific estimates towards the overall mean estimate, which in this case is 0.
This shrinkage will be larger at courses for which we have very little data, effectively keeping their estimates very close to zero unless an extreme pattern is present in the course-specific data. Here is a nice interactive graphic and explainer if you want more intuition on the random effects model. Switching to this class of model is one of the main reasons our course fit efforts were more successful this time around.
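We won't reproduce an actual random effects fit here, but the shrinkage behaviour it delivers can be illustrated with a simple empirical-Bayes-style calculation (all variances and raw estimates below are made up):

```python
import numpy as np

def shrunk_course_effect(raw_estimates, n_rounds, sigma2_noise, tau2_effect):
    """Shrink per-course attribute effects toward the overall mean of 0.

    raw_estimates: per-course estimates of an attribute's effect on
                   performance-to-baseline (e.g. the distance effect).
    n_rounds: rounds observed at each course.
    sigma2_noise: residual variance of performance-to-baseline.
    tau2_effect: variance of true course effects around 0.

    Small-sample courses get a weight near 0 (pulled strongly toward 0);
    large-sample courses keep most of their raw estimate.
    """
    raw = np.asarray(raw_estimates, dtype=float)
    n = np.asarray(n_rounds, dtype=float)
    sampling_var = sigma2_noise / n
    weight = tau2_effect / (tau2_effect + sampling_var)
    return weight * raw

raw = [0.30, -0.25, 0.05]   # raw distance effects at three hypothetical courses
n = [150, 1500, 6000]       # rounds observed at each course
print(shrunk_course_effect(raw, n, sigma2_noise=7.0, tau2_effect=0.01))
```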
What are the practical effects of incorporating course fit? While in general the differences between the new model, which includes both course fit and course history adjustments, and the previous one (which we'll refer to as the baseline model) are small, there are meaningful differences in many instances.
If we consider the differences between the two models in terms of their respective estimated skill levels (i.e. predicted strokes-gained per round), the adjustments can approach a full stroke in the most extreme cases. I can't say I ever thought there would come a day when we would advocate for a 1 stroke adjustment due to course fit. And yet, here we are. Let's look at an example: before the Mayakoba Classic at El Camaleon Golf Club, we estimated Brian Gay to be 21 yards shorter off the tee and 11 percentage points more accurate in fairways hit per round than the PGA Tour average.
This made Gay an outlier in both skills, sitting at more than 2 standard deviations away from the tour mean. Furthermore, El Camaleon is probably the biggest outlier course on the PGA Tour, with a player's driving accuracy having almost twice as much predictive power on performance as their driving distance (there are only 11 courses in our data where driving accuracy has more predictive power than distance).
Therefore, at El Camaleon, Gay's greatest skill (accuracy) is much more important to predicting performance than his greatest weakness (distance). Further, Gay has had a good course history at El Camaleon, averaging more than a stroke per round better than expected there in past appearances. It's worth pointing out that we estimate the effects of course history and course fit together, to avoid 'double counting'.
That is, good course fit will often explain some of a golfer's good course history. Taken together, this resulted in an upward adjustment of a fraction of a stroke per round to Gay's predicted performance at El Camaleon. When evaluating the performance of this new model relative to the baseline model, it was useful to focus our attention on observations where the two models exhibit large discrepancies.
The correlation between the two models' predicted skill levels in the full sample is still very high. However, by focusing on observations where the two models diverge substantially, it becomes clear that the new model is outperforming the baseline model. As previously alluded to, the second course-specific adjustment we've made to our model is the inclusion of course-specific variance terms.
This means that the player-specific variances will all be increased by some amount at certain courses and decreased at others. It's important to note that we are concerned with the variance of 'residual' scores here, i.e. the deviations of players' actual scores from our model predictions. This is necessary to account for the fact that some courses, like Augusta National, have a higher variance in total scores in part because there is greater variance in the predicted skill levels of the players there.
All else equal, adding more unexplained variance (noise) to scores will pull the model's predicted probabilities for the tournament, for player-specific matchups, etc., closer together. That is, Dustin Johnson's win probability at a high residual-variance course will be lower than it is at a low-variance course, against the same field.
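A quick simulation makes the point: holding a hypothetical field's expected 72-hole scores fixed and changing only the residual standard deviation, the most skilled player's win probability falls as the noise grows. All numbers below are illustrative, not our actual variance estimates.

```python
import numpy as np

rng = np.random.default_rng(1)

def win_prob_of_best(expected_scores, residual_sd, n_sims=200_000):
    """Win probability of the lowest-expected-score player at a given noise level."""
    scores = rng.normal(expected_scores, residual_sd, size=(n_sims, len(expected_scores)))
    return np.mean(scores.argmin(axis=1) == int(np.argmin(expected_scores)))

field = np.array([-2.0, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0])  # hypothetical expected totals vs. field
print(win_prob_of_best(field, residual_sd=5.0))  # lower residual-variance course
print(win_prob_of_best(field, residual_sd=8.0))  # higher residual-variance course
```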
In estimating course-specific variances, care is again taken to ensure we are not overfitting. Perhaps surprisingly, course-specific variances are quite predictive year-over-year, leading to some meaningful differences in our final course-specific variance estimates. A subtle point to note here is that a course can simultaneously have high residual variance and also be a course that creates greater separation amongst players' predicted skill levels.
For example, at Augusta National, golfers with above-average driving distance, who tend to have higher baseline skill levels, are expected to perform above their baselines; additionally, Augusta National is a course with above-average residual variance. Therefore, whether we would see the distribution of win probabilities narrow or widen at Augusta relative to a typical PGA Tour course will depend on which of these effects dominates.
There are a few important changes to the model. First, we are now incorporating a time dimension into our historical strokes-gained weighting scheme. This was an important missing element from earlier versions of the model. For example, when Graham DeLaet returned after a one-year hiatus from competitive golf, our predictions were mostly driven by his data from before the layoff, even after DeLaet had played a few rounds following his return. It seems intuitive that more weight should be placed on DeLaet's few post-return rounds, given the absence of data from the intervening year, than in a scenario where he had played a full season.
Using a weighting function that decays with calendar time (i.e. with how long ago a round was played, rather than with how many rounds have been played since) addresses this. However, continuing with the DeLaet example, there is still a lot of information contained in his pre-hiatus rounds. Therefore we use an average of our two weighted averages: the first weights rounds by the sequence in which they were played, ignoring the time between rounds, while the second assigns weights based on how recently each round was played.
In DeLaet's case, the time-weighted and sequence-weighted predictions of his strokes-gained diverge substantially. Ultimately we combine the two into a single final prediction; the difference between this final value and the sequence-weighted average is what appears in the "timing" column on the skill decomposition page.
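A stylized version of the two averages and their combination is sketched below; the decay rate, half-life, and equal blend are illustrative placeholders rather than the values we actually use.

```python
import numpy as np

def sequence_weighted(sg, decay=0.97):
    """Weights rounds by the order in which they were played (most recent first)."""
    weights = decay ** np.arange(len(sg))
    return np.average(sg, weights=weights)

def time_weighted(sg, days_ago, half_life_days=365.0):
    """Weights rounds by how long ago (in days) they were played."""
    weights = 0.5 ** (np.asarray(days_ago, dtype=float) / half_life_days)
    return np.average(sg, weights=weights)

def blended_prediction(sg, days_ago):
    """Average of the two weighted averages."""
    return 0.5 * sequence_weighted(sg) + 0.5 * time_weighted(sg, days_ago)

# A DeLaet-like pattern: a few so-so recent rounds after a long layoff,
# preceded by a block of strong rounds played over a year ago.
sg       = [0.2, -0.1, 0.4, 1.8, 2.0, 1.5, 1.9, 2.2]
days_ago = [10, 20, 30, 400, 410, 420, 430, 440]
print(sequence_weighted(sg), time_weighted(sg, days_ago), blended_prediction(sg, days_ago))
```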
Have a look at DeLaet's true strokes-gained plot to understand why the different weighting methods cause such a large divergence in estimated skill. For golfers who are playing a steady schedule, there will not be large differences between the time-weighted and sequence-weighted strokes-gained averages. However, for players who play an above-average number of rounds per year (e.g. Sungjae Im), the time-weighting will tend to de-emphasize their most recent data. A second change to the model is that we are using yet another method to incorporate the strokes-gained categories into our baseline (i.e. total strokes-gained) predictions. As we've said elsewhere, it is surprisingly difficult to use the strokes-gained categories in a way that doesn't make your predictions worse.
This is because not all PGA Tour and European Tour events have strokes-gained data (which reminds me: another new thing for this season is that we have added European Tour SG category data). Therefore, if you were to leverage the category SG data but, by necessity, only use rounds with detailed SG data, you would be outperformed by a model that only uses total strokes-gained but uses data from all events.
This highlights the importance of recent data in predicting golf performance. Our previous strokes-gained category adjustment method involved predicting performance in the SG categories using total SG in rounds where the categories weren't available, and then estimating skill in each SG category using a combination of the real and imputed SG data.
This worked reasonably well but had its drawbacks. I'll omit the details on our current method, but it no longer uses imputed SG data. Therefore, if a golfer's recent performance is driven by a short-term change in ARG or PUTT, their SG adjustment will be in the opposite direction of that recent change (e.g. a golfer whose recent surge is driven mainly by hot putting will receive a downward SG adjustment).
A third model update involves how our predictions are updated between rounds within a tournament. In the past we have been a bit lazy when predicting a golfer's Round 2 (R2) performance given their R1 performance plus historical pre-tournament data. Now we have explicitly estimated what that update should be, and, interestingly, we also allow the weight applied to a golfer's R1 performance to vary depending on a few factors.
For example, as mentioned above, Sungjae Im is a golfer who doesn't take many weeks off; therefore, when predicting his R2 performance, Im's R1 score is weighted less than it would be for the typical tour player. It should be clear that this is tightly linked to the ideas behind using a time-weighted decay: the further in the past the bulk of a golfer's historical data lies, the more weight their R1 performance will receive when predicting R2.
Similar logic is applied when predicting R3 and R4 performance. This has obvious implications for our model's predictions once a tournament is underway (e.g. for cut and finish probabilities). The stronger the correlation between within-tournament performances (e.g. between R1 and R2 scores), the more a golfer's early rounds move their predicted skill, and therefore the range of outcomes, for the rounds that follow. Thinking through the simulation of a golf tournament can help clarify this: if a golfer plays well in the first round, their predicted skill for the second-round simulation is increased, while if they play poorly their R2 skill is decreased.
The larger the weight on R1 performance, the wider the range of their possible predicted skill levels for R2, which in turn leads to a wider range of potential scores. Compared to our previous model, this change in within-tournament weighting does not make a huge difference for pre-tournament predictions, but it will be noticeably different once the tournament starts and skill levels are actually updated.
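The within-tournament update can be written schematically as below; the specific weights are hypothetical and only meant to show how a player-specific R1 weight changes the size of the update.

```python
def r2_predicted_skill(pre_tournament_sg, r1_outperformance, r1_weight):
    """Predicted R2 skill: the pre-tournament estimate plus a fraction of the
    amount by which the golfer beat (or missed) their baseline in R1.

    r1_weight varies by player: smaller for golfers whose historical data is
    recent and plentiful, larger when the bulk of their data lies further in
    the past.  The weights used below are illustrative only.
    """
    return pre_tournament_sg + r1_weight * r1_outperformance

# Both golfers beat their baseline by 3 strokes in R1; the heavy-schedule
# player's estimate barely moves, the stale-data player's moves more.
print(r2_predicted_skill(1.5, 3.0, r1_weight=0.03))   # e.g. a Sungjae Im type
print(r2_predicted_skill(1.5, 3.0, r1_weight=0.12))   # player with older data
```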
A final point is that we also incorporate the strokes-gained categories into the within-tournament updates when possible, leveraging the fact that, e.g., outperforming one's baseline in OTT in R1 is much more informative than doing the same with putting. Finally, the European Tour version of our model will now include course fit adjustments.
Course fit, along with the aforementioned addition of European Tour strokes-gained category data, should bring the quality of our Euro predictions up to the level of our PGA predictions. There have been enough changes in our model recently to warrant a deviation from our previous schedule of once-a-year written updates.
We also think it's important for our users, as we try to add model customizability to the site, to know what our model is taking into account in order to understand the dimensions along which it can be improved. Therefore this is partly an attempt to put more relevant information about our model in one place.
Now, to the updates. First — and this was actually added a while back — the effects of "pressure" are now included in the model. This is an adjustment we make to players' predicted skill in the third and fourth rounds that depends on their position on the leaderboard to start the day.
This is not a player-specific adjustment, and it does not vary depending on player characteristics either. We haven't found meaningful differences in performance relative to expectation when leading across categories of players — it seems everyone performs worse when near the lead. There are a lot more details on the pressure-performance relationship in this blog and on this interactive page.
Second, we recently revisited the question of the optimal decay rates for the various weighting schemes used on a player's historical data. Relative to the market, it seems like our model weights longer-term data more heavily. This is for the sequence-weighted average described in the section above this one. Also mentioned in that section was the time-weighted average, which was a more recent addition.
We have now made that weighting scheme slightly more short-term oriented. The interesting, general thing that we learned by revisiting these decay rates is that the optimal weighting scheme for a specific weighted average shouldn't be chosen in isolation. For example, if we were only to use a sequence-weighted average to predict performance, then the optimal decay would be larger, i.e. the weighting would be more long-term.
In this specific case, I think that makes sense, as the role of the sequence-weighted average is in part to pick up longer trends in performance if a player hasn't played much recently. The other weighting schemes we revisited are those used on the specific strokes-gained categories. Omitting the details, we are also now slightly more short-term focused on all the SG categories for the same reason specified above — when using the categories together instead of in isolation, it appears that short-term weighting schemes are slightly better.
The upshot of this is that the strokes-gained category adjustments used to be somewhat biased towards longer-term data. That is, even ignoring differential performance in the categories, which is what we want to be driving that adjustment, if a player was performing well in the short-term they were likely to be receiving a negative SG adjustment.
Going forward this will no longer be an issue. As discussed here, because most of the Euro SG category data are event-level averages, we have to impute the round-level values. This is not a huge issue, but it does make it difficult to actually fit a model for predicting strokes-gained categories at the round level on the European Tour.
As a result, we have to rely on our PGA Tour models for the SG categories and hope the relationships in that data also hold in the Euro data. The degree to which this works can still ultimately be tested by looking at whether our overall predictions are improved.
However, again there are issues: we only have 4 years of European Tour strokes-gained category data, which is not quite enough to get precise answers. We want to determine whether we can use the European SG category data to improve on our baseline predictions, which are based only on total SG data; these two estimates of skill will inevitably be very highly correlated, and so over 4 years there aren't that many instances where the two methods disagree substantially enough for their respective predictive performance to be compared.
In any case, the practical takeaway here is that we are decreasing the size of the SG category adjustments we make on the European Tour slightly. With regards to the overall long-term versus short-term focus of our model, it is useful to consider two recent examples of players that we didn't update as quickly on relative to the market: Jordan Spieth and Lee Westwood.
They are instructive examples of cases that our model might not handle well for different reasons. In the case of Spieth, part of the reason the market reacted so quickly was that Spieth had proven himself to be a world-beater early in his career. The idea seemed to be that once Spieth flashed some of his old form we could essentially ignore the data from the two-year slump he was pulling himself out of.
While I obviously don't agree with ignoring the slumping-Spieth data, I do think it's important to account for the fact that Spieth used to be a top player; our current model doesn't do this, as rounds from more than a couple of years ago essentially receive no weight.