
Applying Bill James' Formula To Predict Points for ISL Teams & Finding How Many Goals Equals One Win

  • Writer: Ayush Chaurasia
  • Jun 20, 2020
  • 5 min read

Updated: Jul 2, 2020

If you are into sports data analysis, you must have heard of Bill James and how he revolutionised the game of baseball. If you haven't heard of him, at least make a little effort to watch the movie Moneyball, starring Jonah Hill. (Yeah, I'm using Jonah Hill's name ahead of Brad Pitt's to promote the movie.)



What made Bill James special was that he didn't have advanced mathematics or statistics skills, yet he was still able to come up with simple formulas that helped baseball teams make better decisions. He came up with the Pythagorean win expectation formula for baseball teams while he was working as a security guard. This is what the formula looked like:

Win% = (Runs Scored)^2 / ((Runs Scored)^2 + (Runs Allowed)^2)



Pretty basic, I must say: no fancy symbols or calculus stuff. The formula estimates the percentage of games a baseball team "should" have won based on the number of runs they scored and allowed. If you want to read more on this, you can head here.

The formula has since been adapted for other sports, and versions of it are commonly used in the NBA and the NFL.
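To make the idea concrete, here is a minimal Python sketch of the Pythagorean expectation: runs scored squared, divided by the sum of runs scored squared and runs allowed squared. The function name, the `exponent` parameter and the example team (800 runs scored, 700 allowed, 162 games) are made up for illustration and are not taken from any real season.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Share of games a team 'should' have won, per Bill James' formula."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Hypothetical team, numbers invented for illustration only.
win_pct = pythagorean_win_pct(800, 700)
expected_wins = win_pct * 162  # 162 games in this made-up season

print(f"Expected win%: {win_pct:.3f}")
print(f"Expected wins: {expected_wins:.1f}")
```

A team that scores exactly as many runs as it allows comes out at 50%, which is a quick sanity check on the formula.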


People have obviously tried to come up with a similar formula for football, but it is slightly tricky because draws are a regular occurrence in football league games.

Nevertheless, the people at Statsbomb (who do a terrific job in the field of football analytics) came up with a rather simple formula. It is something like this:

Predicted points = (0.677*GD) + 52.29

Here GD is the team's goal difference (goals scored minus goals conceded) for the season.


Now, if you want to know how they arrived at that formula and where 0.677 and 52.29 came from, you can give this a read. It's a long read, but if you are into mathematics, you will undoubtedly love it. To be fair, Statsbomb did a terrific job of boiling it down to a simple equation in the end, so that even a math noob like me can apply it. I have looked at various other formulas, and some of them seemed like gibberish to me given my basic math skills.


By their own account, the formula is not perfect: it gave an RMSE (Root Mean Square Error) of 4.7 when they tested the equation against league results from the last two decades.

RMSE measures the difference between the values predicted by a model or an estimator and the actual values. Hence, the closer your RMSE is to 0, the better your predictor is. In Statsbomb's words, an RMSE of 4.7 is not too great but not too terrible (insert Dyatlov's gif from Chernobyl). Therefore, I decided to apply the formula and see how well it predicted the ISL results.
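For readers who have not computed an RMSE before, a minimal Python sketch follows. The function name and the three predicted/actual point totals are invented for illustration; they are not ISL or Statsbomb data.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up point totals for three hypothetical teams.
predicted_points = [30.0, 25.0, 20.0]
actual_points = [33.0, 25.0, 16.0]

print(round(rmse(predicted_points, actual_points), 2))
```

Because the errors are squared before averaging, one big miss hurts the score more than several small ones.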


I used the formula and it gave me a massive difference between predicted and actual points (sigh!). I felt this couldn't be correct, so I read Statsbomb's explanation in a little more detail and found that their formula is based on a 38-game season. Hence, I needed to convert the equation to an 18-game season, which is what we have in the ISL. With my basic math skills, I went by trial and error and converted the last value from '52.29' to '24.768', and it turned out to be a Eureka moment for me.
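The value was found by trial and error, but it happens to coincide with simply scaling the 38-game intercept by the number of games: 52.29 * 18 / 38 is roughly 24.769. The same scaling gives about 19.26 for a 14-game season, the value used later in this post for the older ISL campaigns. The Python sketch below assumes that proportional scaling and, as in this post, leaves the 0.677 slope unchanged; the function names are hypothetical.

```python
SLOPE = 0.677          # goal-difference coefficient quoted in the article
BASE_INTERCEPT = 52.29  # intercept for a 38-game season
BASE_GAMES = 38


def scaled_intercept(games: int) -> float:
    """Assumption: the intercept scales linearly with season length."""
    return BASE_INTERCEPT * games / BASE_GAMES


def predicted_points(goal_difference: float, games: int) -> float:
    """Predicted points with the slope kept at 0.677, as the article does."""
    return SLOPE * goal_difference + scaled_intercept(games)


print(round(scaled_intercept(18), 3))      # close to the article's 24.768
print(round(scaled_intercept(14), 2))      # close to the article's 19.26
print(round(predicted_points(0, 18), 2))   # a team with zero goal difference
```

Whether the slope should also change with season length is left open here; the sketch only mirrors what the post does.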


I started getting predicted points much closer to the actual result. This is how the formula fared for the 2019-20 ISL season.



Kerala Blasters turned out to be unlucky according to probability, as they should have had almost four more points given the number of goals they scored and conceded. Mumbai City meanwhile overperformed by almost 4 points.

The prediction is not bad overall, though. The RMSE for the season came out to be 2.34. I ran the formula in the same way for the other seasons. I had to make some changes for the seasons when the ISL had 14 games: the last value was converted to 19.26 for those campaigns, and the formula fared pretty well. The RMSE across the six ISL seasons came out to be 2.66, which shows that the formula did a pretty decent job, even better than Statsbomb's results. However, they used a significantly larger sample to test their formula, so there's that.


Bill James was also able to break the game down into small bits. He worked out that if you scored ten more runs in baseball (with your runs conceded staying the same), you would get one more win over the course of a season. Hence, he concluded that 10 runs = 1 win.


Similarly, I decided to find how many goals you would need to score to get one more win in ISL.

Using the 2019-20 ISL table, I found that the average goals for and goals against for the season both came out to be 27.4.



The formula bar shows how the probable win percentage is calculated. The ratio is GF/GA, and probable wins are calculated by multiplying the number of games in a league season (18 for the ISL) by the probable win%.

If you look at it, the formula is again pretty basic. The probable win percentage for an average ISL team is nothing but 1/3, i.e. a football game has three possible outcomes (a win, a loss and a draw), and each result is given a 33.33% chance of occurring. That means an average team would have 6 wins, 6 draws and 6 losses for the season.
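The spreadsheet formula itself is not reproduced in the text, so the Python sketch below uses an assumed form: win% = R^3 / (R^3 + 2), with R = GF/GA. This is a guess, not the confirmed formula. It was chosen because it returns exactly 1/3 when goals for equal goals against, and because it is consistent with the 29.69 goals-for figure quoted in the Goal Seek step. The function names are hypothetical.

```python
def probable_win_pct(goals_for: float, goals_against: float) -> float:
    """ASSUMED form of the probable win%: R^3 / (R^3 + 2), R = GF/GA."""
    ratio_cubed = (goals_for / goals_against) ** 3
    return ratio_cubed / (ratio_cubed + 2)


def probable_wins(goals_for: float, goals_against: float,
                  games: int = 18) -> float:
    """Probable wins = games in the season * probable win%."""
    return games * probable_win_pct(goals_for, goals_against)


# Average 2019-20 ISL team from the article: 27.4 goals for and against.
print(round(probable_win_pct(27.4, 27.4), 4))
print(round(probable_wins(27.4, 27.4), 2))
```

For the average team this gives a win% of one third and six probable wins, matching the 6 wins, 6 draws and 6 losses described above.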


Using Excel's Goal Seek function, you can now find how many more goals you need to score to get one more win, assuming your goals against stay the same.


So the goals for column jumped from 27.4 to 29.69. The difference between the two is 2.29. Hence, if you score around two goals more, then you will be expected to get one more win.
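Goal Seek is essentially a numerical search, and the same step can be sketched in Python with a simple bisection. The win% form used here, R^3 / (R^3 + 2) with R = GF/GA, is an assumption rather than a formula stated in the text; it is used because it gives 1/3 for an average team and lands on the 29.69 figure above. All function names are hypothetical.

```python
def probable_wins(goals_for: float, goals_against: float,
                  games: int = 18) -> float:
    """ASSUMED model: games * R^3 / (R^3 + 2), with R = GF/GA."""
    ratio_cubed = (goals_for / goals_against) ** 3
    return games * ratio_cubed / (ratio_cubed + 2)


def goals_for_needed(target_wins: float, goals_against: float,
                     games: int = 18, tol: float = 1e-6) -> float:
    """Bisection search for the goals-for value that yields target_wins."""
    low, high = 0.0, 10.0 * goals_against  # wins rise steadily with goals scored
    while high - low > tol:
        mid = (low + high) / 2
        if probable_wins(mid, goals_against, games) < target_wins:
            low = mid
        else:
            high = mid
    return (low + high) / 2


average_goals = 27.4  # average goals for and against, 2019-20 ISL
needed = goals_for_needed(7, average_goals)  # 6 wins -> 7 wins

print(round(needed, 2))                  # 29.69
print(round(needed - average_goals, 2))  # 2.29
```

The search bracket tops out at ten times the goals conceded, which is far more than any realistic target needs.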


Now you can start playing with this interactive formula. It is a handy way to understand how many goals you will probably need to score, or can afford to concede, to get to the top of the table or whatever your objective for the season is.


In the last three ISL seasons, the second-placed team hasn't earned more than 34 points. So let's assume that you want to get 35 points for the season. Say you are aiming for ten wins in the season and will pick up the remaining five points through draws. Now let's say that in an 18-game season, you are sure you will concede at least 18 goals. So how many goals will you need to score to get to 10 wins?



Around 25 goals are what you will need to get ten wins if you concede 18 goals.


Say you are now not too confident of your defence and you feel that you will be conceding 30 goals in a season. You will need to be unbelievably good in attack to get ten wins, so how many goals will you probably need now?


You will need to score 41 goals approx to get ten wins if you have a leaky defence and are going to concede 30 goals for the season.


You can work things the other way round as well. Say this time you are not confident about your attack and you are only hoping to score 18 goals in a season. How many goals can you allow and still get ten wins?


You can afford to concede only around 13 goals if you are hoping to win ten games by scoring 18 goals.
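The three scenarios above can also be worked out without Goal Seek if the win% formula is inverted by hand. The Python sketch below assumes win% = R^3 / (R^3 + 2) with R = GF/GA, which is a guess that is consistent with this post's numbers rather than a formula quoted in the text. Solving for R gives R = (2p / (1 - p))^(1/3), where p is target wins divided by games. The function names are hypothetical.

```python
def ratio_for_wins(target_wins: float, games: int = 18) -> float:
    """GF/GA ratio that yields target_wins under the ASSUMED model."""
    p = target_wins / games
    return (2 * p / (1 - p)) ** (1 / 3)


def goals_to_score(target_wins: float, goals_against: float,
                   games: int = 18) -> float:
    """Goals needed for target_wins when conceding goals_against."""
    return goals_against * ratio_for_wins(target_wins, games)


def goals_allowed(target_wins: float, goals_for: float,
                  games: int = 18) -> float:
    """Goals that can be conceded for target_wins when scoring goals_for."""
    return goals_for / ratio_for_wins(target_wins, games)


print(round(goals_to_score(10, 18), 2))  # tight defence, 18 conceded
print(round(goals_to_score(10, 30), 2))  # leaky defence, 30 conceded
print(round(goals_allowed(10, 18), 2))   # weak attack, 18 scored
```

Under this assumed model the three answers come out at roughly 24.4, 40.7 and 13.3. Rounded to whole goals in the safe direction (up for goals scored, down for goals conceded), they line up with the 25, 41 and 13 quoted above.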


What did we learn?

  • With Bill James and Statsbomb doing all the hard work, I was able to derive a formula to predict the number of points for the season. For an 18-game season, the formula is Predicted points = (0.677*GD) + 24.768. Through this, we can estimate whether a team overperformed or underperformed given the number of goals they scored and conceded.

  • The RMSE for the six ISL seasons came out to be 2.66, which can be termed as pretty decent as Statsbomb's RMSE turned out to be 4.7 (though the sample size we have is small.)

  • We were also able to find that 2.29 goals were equivalent to one win in the 2019-20 ISL season. For the last three seasons combined, 2.1 goals were equivalent to a win.

  • In the English Premier League for the 2018-19 season, 4.4 goals were required to get one extra win.

  • We were also able to calculate how many goals one can concede or score to get a desired number of wins.

 
 
 


©2020 by Footy Analytica. Proudly created with Wix.com
