Evaluating IPL batters' ability against Spin and Pace
Theory
In this article I want to create an index that measures how batters perform against spin bowling and against pace bowling, so that we can see which batsmen perform best against each. I will combine several variables to quantify a batsman’s performance and scale the result to a value between 0 and 100, which will be the batsman’s score.
The first variable I will use is the batsman’s run rate, calculated separately against pace and against spin. A higher run rate against spin, for example, implies that the batsman finds spin easier to play, and so should lead to a higher spin score.
The second variable we need to take into account is the wicket rate: the number of times a batsman has been dismissed divided by the number of balls they have faced. Dividing by balls faced keeps the comparison fair, because a seasoned international batter who has faced many more balls will not be penalised for having been dismissed more times than an emerging player.
The last two variables we will take into account are the rates at which the batsman hits fours and sixes against each type of bowler.
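As a rough illustration of the four variables, here is a minimal sketch that computes them from ball-by-ball records. The record layout, the sample deliveries, and the function name `batting_rates` are all made up for this example, and it assumes every rate is measured per ball faced (run rate could equally be expressed per over).

```python
from collections import defaultdict

# Hypothetical ball-by-ball records: (batter, bowling_type, runs_off_bat, is_wicket).
# This layout is an assumption for illustration, not the schema of the real dataset.
deliveries = [
    ("A", "spin", 4, False),
    ("A", "spin", 1, False),
    ("A", "spin", 0, True),
    ("A", "pace", 6, False),
    ("B", "spin", 6, False),
    ("B", "spin", 0, False),
]


def batting_rates(deliveries, bowling_type):
    """Return per-ball rates for each batter against one bowling type."""
    totals = defaultdict(
        lambda: {"balls": 0, "runs": 0, "outs": 0, "fours": 0, "sixes": 0}
    )
    for batter, kind, runs, is_wicket in deliveries:
        if kind != bowling_type:
            continue
        t = totals[batter]
        t["balls"] += 1
        t["runs"] += runs
        t["outs"] += int(is_wicket)
        # Simplification: any delivery worth 4 or 6 runs counts as a boundary.
        t["fours"] += int(runs == 4)
        t["sixes"] += int(runs == 6)

    return {
        batter: {
            "balls": t["balls"],
            "run_rate": t["runs"] / t["balls"],
            "wicket_rate": t["outs"] / t["balls"],
            "fours_rate": t["fours"] / t["balls"],
            "sixes_rate": t["sixes"] / t["balls"],
        }
        for batter, t in totals.items()
    }


spin_rates = batting_rates(deliveries, "spin")
pace_rates = batting_rates(deliveries, "pace")
```

Keeping the balls-faced count alongside the rates is deliberate, since it is needed later to judge how much each batter's numbers can be trusted.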
Now we need to standardise each variable so that it is a value between 0 and 1. To do that we can use the following formula:
\[x_N=\frac{x-\min(x)}{\max(x)-\min(x)}\]
Once we have normalised each variable we can place it into our equation to get a raw score:
\[SI_{i}=\alpha_{1}\cdot RR_{N}+\alpha_{2} \cdot W_{N}+\alpha_{3}\cdot F_{N}+\alpha_{4} \cdot S_{N}\]
Here \(RR_N\), \(W_N\), \(F_N\) and \(S_N\) are the normalised run rate, wicket rate, fours rate and sixes rate. We can now manually set the weights to appropriate values. Since a higher wicket rate implies that the batsman is performing worse, we set \(\alpha_2 = -1\). The other variables affect the score positively, so their weights are simply left at 1.
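The normalisation and the weighted sum can be sketched in a few lines. The rates below are invented numbers for three imaginary batters, and the helper name `min_max` is my own; only the formula and the weights (1, -1, 1, 1) come from the text above.

```python
def min_max(values):
    """Scale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every value is identical, so the formula would divide by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical per-ball rates against spin for three batters (illustrative only).
run_rate = [1.2, 1.5, 0.9]
wicket_rate = [0.04, 0.02, 0.05]
fours_rate = [0.10, 0.15, 0.05]
sixes_rate = [0.05, 0.08, 0.02]

# Weights alpha_1..alpha_4 in the order: run rate, wicket rate, fours, sixes.
weights = (1, -1, 1, 1)

# Normalise each variable across all batters, then combine them per batter.
columns = [min_max(c) for c in (run_rate, wicket_rate, fours_rate, sixes_rate)]
raw_scores = [
    sum(w * x for w, x in zip(weights, row))
    for row in zip(*columns)
]
```

With these weights the raw score can fall anywhere between -1 (worst on every variable) and 3 (best on every variable), so it is not yet on a 0 to 100 scale.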
The problem with this raw score is that small samples can produce anomalies. For example, if a young bowler hit a six off the only ball they faced in their international debut, their run rate would be much higher than anyone else's and their wicket rate would be zero, leading to a misleadingly high score.
To fix this we need to incorporate a confidence variable that depends on the total number of balls the batsman has faced. Fewer balls faced implies lower confidence in our raw score, and more balls faced implies higher confidence. We want a function that takes the balls faced as input and returns a confidence between 0 and 1. Furthermore, this function cannot be linear, because the difference between 0 and 600 balls faced matters far more than the difference between 2000 and 2600.
A good fit for these requirements is the sigmoid function, which I explained in my logistic regression article: large positive inputs map to values close to 1 and large negative inputs map to values close to 0.
\[\phi(B)=\frac{1}{1+e^{-B}}\]
where \(B\) is the number of balls faced.
We also need to shift and scale the input to match our data. I set the gradient \(k\) to 0.008, because we want the curve to rise smoothly around the halfway point rather than jump sharply. I set the bias \(M\), the midpoint of the curve, to 300 balls faced, which is the median number of balls faced by a batsman. This meets our earlier requirement: with these values, the confidence difference between 0 and 600 balls faced is more than 0.5, but between 2000 and 2600 it is practically 0.
\[\text{Confidence} = \phi\big(k(B - M)\big)=\frac{1}{1+e^{-k(B-M)}}\]
Finally, we can multiply the confidence by the raw score to get a final score. We then normalise this product so that the scores range from 0 to 1, otherwise it would be impossible to get a perfect score, and multiply by 100.
\[\text{Spin index} = (\text{Spin Confidence} \cdot \text{Spin Raw Score})_N \cdot 100\]
\[\text{Pace index} = (\text{Pace Confidence} \cdot \text{Pace Raw Score})_N \cdot 100\]
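To see how the confidence term changes the ranking, here is a small sketch of the last two steps. The values k = 0.008 and M = 300 come from the text above; the raw scores and balls faced are invented, and the function names `confidence` and `final_index` are hypothetical rather than taken from my repository.

```python
import math

K = 0.008  # gradient of the sigmoid, as chosen above
M = 300    # midpoint: the median number of balls faced


def confidence(balls_faced, k=K, m=M):
    """Sigmoid confidence in [0, 1] based on the number of balls faced."""
    return 1.0 / (1.0 + math.exp(-k * (balls_faced - m)))


def min_max(values):
    """Scale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def final_index(raw_scores, balls_faced):
    """Weight each raw score by its confidence, normalise, and scale to 0-100."""
    weighted = [confidence(b) * r for r, b in zip(raw_scores, balls_faced)]
    return [100.0 * x for x in min_max(weighted)]


# Hypothetical batters: a one-ball debutant with a huge raw score,
# an established hitter, and a steady veteran.
raw = [3.0, 1.8, 0.5]
balls = [1, 900, 2500]
scores = final_index(raw, balls)

# Check the requirement on the confidence curve described earlier.
early_gain = confidence(600) - confidence(0)      # large
late_gain = confidence(2600) - confidence(2000)   # practically zero
```

In this toy example the debutant starts with the highest raw score but ends with the lowest index, because a single ball faced gives a confidence of less than 0.1.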
Final Results
I have taken data from the IPL to generate these results. My code can be found at Cricket analysis GitHub.
Top 10 Players by Spin Index
| player | SpinIndex | PaceIndex |
|---|---|---|
| CH Gayle | 100 | 80.66 |
| GJ Maxwell | 97.08 | 73.6 |
| YK Pathan | 93.17 | 66.44 |
| KA Pollard | 89.77 | 73.16 |
| JC Buttler | 89.56 | 74.71 |
| RR Pant | 88.06 | 75.45 |
| Shubman Gill | 87.64 | 62.36 |
| SV Samson | 84.71 | 66.51 |
| DR Smith | 84.38 | 67.42 |
| DA Warner | 84.33 | 66.17 |
Top 10 Players by Pace Index
| player | SpinIndex | PaceIndex |
|---|---|---|
| AD Russell | 50.56 | 100 |
| N Pooran | 76.59 | 88.04 |
| LS Livingstone | 23.17 | 82.38 |
| SP Narine | 53.08 | 81.68 |
| AB de Villiers | 76.35 | 80.67 |
| CH Gayle | 100 | 80.66 |
| SO Hetmyer | 24.9 | 80.38 |
| PD Salt | 27.29 | 78.42 |
| H Klaasen | 49.37 | 77.62 |
| Abhishek Sharma | 50.72 | 77.08 |