Selection Bias in Golf and How to Avoid It

June 2nd, 2018 | Golf Statistics

We are often prone to something called ‘Selection Bias’ when we discuss and analyze our golf games. If we are serious about improving, we need to eliminate the effect of this phenomenon as much as possible.

It is virtually impossible to have a statistically objective view of your golf game without using a tool such as Anova.Golf to measure your performance, because our minds don’t think this way ‘by default’. Furthermore, incomplete information riddled with selection bias tends to produce inefficient practice plans, which can derail even the most motivated players in their improvement cycles.

Hence, if you are serious about improving your game, you need to accept that Selection Bias is real, and make sure that your decisions about practice and tournament play are based on objective information about your own game.

Selection bias, as it relates to golf statistics, is the bias that arises when the data collected and selected isn’t representative of the performance we intend to analyze. In non-statistics terms this is called “Cherry Picking”: for example, including only our “good rounds” in the data and omitting everything that we believe isn’t “reflective of how we ‘should’ play”.

As golfers we mostly feel ‘entitled’ to a good lie when we hit the ball in the fairway. If the ball ends up in a divot, we feel that the shot should be excluded from the data because it was ‘bad luck’ or ‘an unusual circumstance’.

Similarly, we tend to discount performances on courses and in weather conditions that we believe are not ‘normal’, and we often omit these from our statistics as well. What we have done here is count only the good results (because we feel that this is the way we should play) and delete a very important part of our sample: what happens when we play in less than ideal situations.

Examples of this are the excuses we reach for in situations like the following:

  • My ball was in a divot
  • I had a super thick lie in the rough
  • I had bad luck
  • The ball hit a spike mark and didn’t go in
  • It wasn’t my fault
  • It was too windy
  • It was too rainy
  • My tee shot hit a tree
  • I had a headwind; I usually hit the ball a lot further than this
  • The ground was wet
  • I didn’t have time to warm up
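To make the cost of these exclusions concrete, here is a minimal sketch in Python. The scores and the excuses attached to them are invented purely for illustration; they are not real data and this is not how Anova computes anything. It simply compares a scoring average over every round with the average that remains after dropping each round that came with an excuse.

```python
from statistics import mean

# Hypothetical rounds as (score, excuse) pairs. An excuse of None means the
# golfer considers the round "reflective" of their game. All numbers are
# made up for illustration only.
rounds = [
    (72, None),
    (74, None),
    (81, "too windy"),
    (73, None),
    (79, "ball in a divot"),
    (75, None),
    (84, "too rainy"),
    (71, None),
    (78, "no time to warm up"),
    (73, None),
]


def scoring_average(rounds, cherry_pick=False):
    """Average score, optionally dropping every round that has an excuse."""
    scores = [
        score
        for score, excuse in rounds
        if not (cherry_pick and excuse is not None)
    ]
    return mean(scores)


all_rounds_avg = scoring_average(rounds)
cherry_picked_avg = scoring_average(rounds, cherry_pick=True)

print(f"All rounds:    {all_rounds_avg:.1f}")
print(f"Cherry-picked: {cherry_picked_avg:.1f}")
print(f"Bias:          {all_rounds_avg - cherry_picked_avg:.1f} strokes per round")
```

With these made-up numbers the honest average is 76.0 while the cherry-picked average is 73.0, so the player would believe they are three strokes better than they really are, and would likely plan practice around the wrong picture of their game.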

With a big enough sample size, every player will experience both what we think of as “bad breaks” and “good breaks”. The key is to have a lot of rounds in your database, and to include all of the rounds you play, so that your golf stats accurately represent your actual performance level.

It’s important to avoid Selection Bias when collecting and analyzing your stats.

On the PGA Tour, players play in a variety of conditions: rain, wind, thick rough, easy rough, narrow fairways, wide fairways. All the shots are counted and recorded, regardless of the course setup and weather conditions. All of this information is entered into the big database and used to derive the benchmark. Over a season on the Tour, some players will have tee shots that finish in a divot in the middle of the fairway, and some will ‘catch a break’ when their shots end up in easy lies where the crowd has trampled the rough down. This is all reflected in the data, and with a large enough sample size the effect of each individual ‘outlier’ is small. It is correct to include all of these shots because they actually happened, and omitting them would mean that we are guilty of Selection Bias.
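The claim that a single ‘outlier’ matters less as the sample grows is simple arithmetic: one unusual value moves an average by its distance from the typical value divided by the number of shots. The sketch below illustrates this with hypothetical numbers (a typical approach shot finishing 30 feet from the hole, and one shot from a divot finishing 90 feet away). These figures and the function name are assumptions chosen for illustration, not Tour or Anova data.

```python
def shift_from_one_outlier(typical, outlier, n_shots):
    """How far one outlier moves the average of n_shots otherwise-typical values."""
    average = (typical * (n_shots - 1) + outlier) / n_shots
    return average - typical


# Hypothetical values, in feet from the hole (illustration only).
TYPICAL_PROXIMITY = 30.0
DIVOT_PROXIMITY = 90.0

shifts = {
    n: shift_from_one_outlier(TYPICAL_PROXIMITY, DIVOT_PROXIMITY, n)
    for n in (10, 100, 1000)
}

for n, shift in shifts.items():
    print(f"{n:>5} shots: one bad break moves the average by {shift:.2f} feet")
```

Each tenfold increase in the number of recorded shots cuts the influence of that one bad break by a factor of ten, which is why keeping the shot in a large database does little harm, while deleting such shots systematically does.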

The easiest way to avoid Selection Bias is to record all of your rounds in Anova: your good rounds, your bad rounds and your average rounds. The more rounds you have in your database, the better it will describe your actual performance level, and the more insight you and your coaches will have for designing effective practice plans.

About the Author:

Thomas is a professional golfer and has played events on the European Tour, Web.Com Tour, Asian Tour, PGA Tour of Australasia as well as on PGA Tour China/Canada/Latinoamerica. He built Anova.Golf when no existing products could answer his detailed performance questions. The resulting information from Anova was astonishing: what he thought of as his biggest strength ended up being his biggest weakness.
