I hope all of you have had a chance to catch up with the normal subjects of this column in last week’s “By The Numbers: Week 8a”. In part b I want to introduce you to one of my favourite prediction tools…
The Poisson Distribution
I’m going to endeavour to make this article as readable as possible for the layman. Statistics can be an incredibly nerdy subject and I imagine that there are only a few readers who’d care about the majority of it. If, however, there’s anything that you’d like to ask then please do leave a comment below.
A brief look at Wikipedia will show you that the Poisson distribution is named after Siméon Denis Poisson and is used to “express the probability of a given number of events occurring in a fixed interval of time, if these events occur with a known average rate and independently of the time since the last event”. That might sound a little confusing, but for us it should actually read “express the probability of the number of goals a team will score during 90 minutes, if we know how many goals they score on average and there’s no link between the time one goal is scored and the time the next is scored”. That sounds right up our street, huh?
The following graphic shows the actual percentage distribution of scores during last season’s Premiership and, underneath, the probabilities forecast by the Poisson distribution (home mean 1.5736 goals per game, away mean 1.2166).
As you’ll see, with the exception of slight differences for the 1-1 and 0-1 score lines, the distribution does a fairly good job of emulating the final figures. I’m sure that many of you will have realised that if the distribution is accurate at the end of the season, it will be less so early in the season, when the mean is still rising or falling with each match’s result, and you’d be right. Predicting football games early in the season is always a game of incomplete information, so I’m afraid that every probability exercise will gain strength as more concrete results come in. To my mind, though, the distribution is still a great indicator of real world results.
If you want the full detail of the Poisson distribution (and you might want to check out the negative binomial distribution as well, as some prefer this for football forecasting) then stick it into Google. The only bit of maths I intend to show you in this article is what happens when you combine probabilities.
Combining probabilities
We have discussed previously that there is approximately a 50% chance of a home win, a 25% chance of a draw and a 25% chance of an away win. How, then, do you work out the chance that three specific matches will all finish as away wins? The maths is simple: take each 25% expressed as a decimal (0.25) and multiply the probabilities together, so 0.25 x 0.25 x 0.25. This gives a total of 0.0156 or, expressed as a percentage, 1.56%. That’s right, amazing huh? Three 25% chances give less than a 2% chance of success! (This does assume that the result of one match has no bearing on the others.) Many readers like to have a bet at the weekend and, for those of you who do, this should highlight what a bad idea accumulators (parlays to our American readers) are. A chance of 1.56% means that you would need odds of 63-1 just to break even in the long run. Have you ever seen a bookmaker offer these?
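For anyone who would rather let a computer do the multiplying, here is a minimal Python sketch of the sum above. The function names are my own invention for illustration, and it assumes the matches are independent of one another.

```python
from math import prod


def accumulator_probability(leg_probabilities):
    """Chance that every leg comes in, assuming the legs are independent."""
    return prod(leg_probabilities)


def fair_fractional_odds(probability):
    """Break-even odds in 'X-1' form: winnings per unit staked."""
    return 1 / probability - 1


# Three away wins at roughly 25% each, as in the example above.
treble = accumulator_probability([0.25, 0.25, 0.25])

print(f"Chance of all three away wins: {treble:.2%}")
print(f"Fair odds: {fair_fractional_odds(treble):.0f}-1")
```

Swap in your own probabilities for each leg to see how quickly the chance of success shrinks as you add matches.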
The reason that I’ve brought up combining probabilities is that once you have used the Poisson distribution to work out the probability of team “A” scoring two goals, you can multiply it by the probability of team “B” scoring two goals and you’ll then know the probability of a final score line of 2-2.
I’m sure that you’ll all breathe a sigh of relief when I tell you not to worry, because I’ll be putting together all of the probabilities for you, but for those who want an easy way of doing it themselves, turn to Excel. Simply use the following formula (Excel 2010 onwards):
=POISSON.DIST(Number of goals, average number of goals, FALSE)
It’s really that simple: just add in the detail, format as a percentage and you’ll have a relatively accurate prediction of your team scoring the desired number of goals. For those of you who use older versions of Excel, use =POISSON() with the same three inputs instead.
This week’s predictions
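If you don’t have Excel to hand, the same calculation can be sketched in a few lines of Python. This is only an illustration: the function name is mine, and the averages plugged in are last season’s league-wide home and away means quoted earlier, not figures for any particular team.

```python
from math import exp, factorial


def poisson_pmf(goals, mean_goals):
    """Probability of exactly `goals` goals given an average of `mean_goals`.

    Intended to match =POISSON.DIST(goals, mean_goals, FALSE) in Excel.
    """
    return mean_goals ** goals * exp(-mean_goals) / factorial(goals)


# Assumption: last season's league-wide averages stand in for real team data.
HOME_MEAN = 1.5736
AWAY_MEAN = 1.2166

for goals in range(5):
    print(f"Home side scores {goals}: {poisson_pmf(goals, HOME_MEAN):.1%}")

# Combining probabilities: the chance of a 2-2 final score, treating the two
# teams' goal tallies as independent of each other.
p_2_2 = poisson_pmf(2, HOME_MEAN) * poisson_pmf(2, AWAY_MEAN)
print(f"Chance of a 2-2 draw: {p_2_2:.1%}")
```

The last two lines are the combining step from the previous section: one probability per team, multiplied together.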
Firstly I would like to reiterate just how little data these predictions are based on but I’ve looked over them and I think they certainly fit with where I suspect most matches will fall. If everyone likes the format I’ll make it part of my normal article so please do leave comments below the article – both positive and negative.
The heat maps show where the Poisson distribution puts the most likely score lines. The further across the x axis, the higher the score for the away team; the further down the y axis, the higher the score for the home team. As a fantasy manager you should be looking for high scoring distributions such as Crystal Palace vs. Chelsea, where the forecast of 1-3 is equal highest, as opposed to Stoke vs. Swansea, where it appears the away team will take it by just a single goal.
You’ll note that I’ve collated a smaller table towards the bottom of the picture which shows the total probabilities of a home win, an away win or a draw. Similarly, by totalling the forecast score lines with and without goals, you’ll find text at the bottom which should highlight both great and not so great defensive picks.
That’s it from me this week. Please do let me know your thoughts, and if you have any requests for column input then please leave those in the comments too.
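For the curious, here is a rough Python sketch of how a grid like the heat map and its totals could be put together. It is not the exact spreadsheet behind the pictures: the function names are made up for illustration, the averages are last season’s league-wide means rather than real fixture data, the grid stops at six goals a side, and I’ve read “score lines without goals” as a clean sheet.

```python
from math import exp, factorial


def poisson_pmf(goals, mean_goals):
    """Probability of exactly `goals` goals given an average of `mean_goals`."""
    return mean_goals ** goals * exp(-mean_goals) / factorial(goals)


def score_grid(home_mean, away_mean, max_goals=6):
    """grid[h][a] = chance of the home side scoring h and the away side a.

    Rows run down the y axis (home goals), columns across the x axis
    (away goals). Assumes the two tallies are independent.
    """
    return [
        [poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
         for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]


def outcome_totals(grid):
    """Total probability of a home win, a draw and an away win."""
    home = draw = away = 0.0
    for h, row in enumerate(grid):
        for a, p in enumerate(row):
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


def clean_sheet_chances(grid):
    """Chance of each side conceding nothing (within the truncated grid)."""
    home_clean_sheet = sum(row[0] for row in grid)  # away side scores 0
    away_clean_sheet = sum(grid[0])                 # home side scores 0
    return home_clean_sheet, away_clean_sheet


# Assumption: league-wide averages used as placeholder inputs.
grid = score_grid(1.5736, 1.2166)
home, draw, away = outcome_totals(grid)
home_cs, away_cs = clean_sheet_chances(grid)

print(f"Home win {home:.1%}, draw {draw:.1%}, away win {away:.1%}")
print(f"Home clean sheet {home_cs:.1%}, away clean sheet {away_cs:.1%}")
```

Because the grid is cut off at six goals each, the three totals add up to very slightly less than 100%; raising max_goals closes the gap.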