Thursday, December 13, 2012

Analytics and Cricket - X : Reader's Response to DRS Debate

It's getting increasingly difficult to post on cricket given that the Indian cricket team is getting ripped to shreds by half-decent opposition despite home-ground advantage. Of course, as noted in an earlier post, home courts can significantly increase the chance of a choke, and this may well be happening. Mahendra Singh Dhoni (if, by some chance, he is still the captain of the Indian team after the current series) can win a few more tosses if he can exploit this idea. Desperate times call for analytical measures!

Meanwhile, an astute reader emailed a detailed response to the Bayes-theorem based analysis of the Decision Review System (DRS) used in cricket, which was posted on this blog a few months ago. He made some very pertinent points along with some brilliant comments on the game, which led to an informative exchange that will be carried in the next couple of cricket-related posts. Here is the 2x2 color-coded DRS matrix again for reference.

Raghavan notes:

".... I must question some of the steps in your analysis:

1. In your derivation you use P(RED|OUT) = 0.95. I think this is true only if all decisions are left to DRS. You have considered only those decisions that are deemed not out by the umpire and referred. The 95% number does not hold for these selected cases. It would be lower. Here's the rationale:
There is a high degree of correlation between DRS and umpires' decisions; understandably so, since all those "plumb" decisions are easy for both the umpire and DRS. Bowlers would rarely review these decisions. For the 10% or so cases when the umpire rules the batsman not out incorrectly, the DRS would very likely have a lower accuracy than its overall 95%.

2. If you assume the "red zone" in the picture is sufficiently small compared to 10%, you would get the accuracy of DRS being about 50% for the cases when the umpire incorrectly rules not out.  Now, this needs a bit of explanation.  

Let's assume that whenever the umpire rules out correctly, the DRS also rules correctly (well, at least close to 100% of the time). Note that this does not include just the referrals, but also all the "easy and obvious" decisions that are not referred. Since the overall accuracy of DRS is 95%, of the 10% that the umpire incorrectly rules not out, DRS also gets it wrong for half of those 10% cases, giving an overall 95% accuracy. In case the "red zone" corresponding to incorrect OUT decisions of the DRS is not close to zero, but say 2% (which is large in my opinion), the DRS accuracy in the bowler-referred cases we are talking of would be 70% rather than 50%. Still way lower than the 95% overall accuracy. [I have made some approximations here, but the overall logic holds]

3. Now, if you plug 70% instead of 95% in your next steps, you get P(OUT|RED) = 88.6%.  Nothing wrong with this number, except when you compare it with the 90% accuracy of umpires.  It's not apples to apples.  P(OUT|Umpire says OUT) is not 90% if you only considered referred cases.  It's actually a conditional probability:
P(OUT|Umpire says OUT, BOWLER REFERS).  I don't have enough information to estimate this, but I'm sure you'll agree it's lower than 90% since bowlers don't refer randomly.

4. I think the right comparison is between what the final decision would be with and without DRS.
There is no doubt that umpire + DRS referrals improve overall accuracy of decisions.  I admit that false positives would increase marginally, which affects batsmen more than bowlers because of the nature of the game (a batsman has no chance of a comeback after a bad decision, while a bowler does).  But I think it is because of the way Hawk-eye is used today.

5. In my opinion, the main problem with DRS is that its decisions are made to be black and white. There should be a reliability measure used. A very rudimentary form of this is currently used in LBW decisions. For example, if the umpire has ruled out, to be ruled not out by DRS the predicted ball path has to completely miss the stumps. But if the umpire has ruled not out, the predicted path should show that at least half the ball is within the stumps for the decision to be overturned. Eventually, I feel Hawk-eye would be able to estimate the accuracy of its decision. I'm sure Hawk-eye has statistics on its estimates. The standard deviation of the estimate would depend on several factors: (1) how far in front of the stumps the ball has struck the pads, (2) how close to the pads the ball has pitched (Hawk-eye needs at least a couple of feet after the bounce to track the changed trajectory), (3) the amount of turn, swing or seam movement observed.

If a standard deviation (sigma) can be estimated, then a window of say +/- 3*sigma could be used as the "region of uncertainty". If the ball is predicted to hit the stumps within this region of uncertainty then the decision should be out. Of course, the more complicated it gets to explain to the viewer, the more resistance there would be to its acceptance. But if it is the right way, it will eventually get accepted. Take the DL method for example. A vast majority of viewers don't understand it fully, but most of them know that it is fair.

6. There's another aspect that needs to be investigated. It's about how the decision of the on-field umpire is affected by the knowledge that DRS is available."
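The arithmetic behind point 2 is easy to check. The sketch below is only one way to read Raghavan's approximation: it assumes DRS is always right when the umpire correctly rules out, so that every DRS miss on a batsman who was really out falls inside the 10% of cases the umpire also missed. The function and parameter names are made up for this illustration.

```python
def drs_accuracy_on_umpire_misses(drs_error=0.05, umpire_miss=0.10, red_zone=0.0):
    """DRS accuracy restricted to cases where the umpire wrongly ruled not out.

    drs_error   : overall DRS error rate (1 - 0.95 from the email)
    umpire_miss : share of cases the umpire wrongly rules not out (10%)
    red_zone    : share of cases where DRS wrongly says OUT (false positives)

    Assumption (from the email): DRS is right whenever the umpire is right
    in ruling out, so all remaining DRS errors sit inside umpire_miss.
    """
    # DRS errors that are not false positives must be missed dismissals.
    missed_by_both = drs_error - red_zone
    return 1.0 - missed_by_both / umpire_miss


for red_zone in (0.0, 0.02):
    acc = drs_accuracy_on_umpire_misses(red_zone=red_zone)
    print(f"red zone {red_zone:.0%}: DRS accuracy on umpire misses = {acc:.0%}")
```

With a negligible red zone this gives 50%, and with a 2% red zone it gives 70%, matching the numbers in the email.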
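The +/- 3*sigma idea in point 5 can also be sketched in a few lines. This is a hypothetical, one-dimensional reading of the proposal, not how Hawk-eye works: it looks only at the sideways position of the ball's centre at the stumps, ignores height, and takes the wicket as 9 inches (22.86 cm) wide with a ball radius of roughly 3.6 cm. The function name and the three labels are assumptions, and what to do with an "uncertain" result (for example, staying with the on-field call) is left open.

```python
def lbw_call(pred_offset_cm, sigma_cm, half_wicket_cm=11.43,
             ball_radius_cm=3.6, k=3.0):
    """Classify a predicted impact using a +/- k*sigma window.

    pred_offset_cm : predicted sideways distance of the ball's centre
                     from the middle-stump line at the stumps
    sigma_cm       : estimated standard deviation of that prediction
    """
    # The ball touches the wicket if its centre is within half the wicket
    # width plus one ball radius of the middle-stump line.
    limit = half_wicket_cm + ball_radius_cm
    lo = pred_offset_cm - k * sigma_cm
    hi = pred_offset_cm + k * sigma_cm

    if -limit <= lo and hi <= limit:
        return "out"        # whole window hits the stumps
    if hi < -limit or lo > limit:
        return "not out"    # whole window misses the stumps
    return "uncertain"      # window straddles the edge of the wicket


print(lbw_call(0.0, 1.0))    # small sigma, predicted to hit middle stump
print(lbw_call(14.0, 1.0))   # predicted to clip the outside of a stump
print(lbw_call(25.0, 2.0))   # predicted well wide of the wicket
```

A larger sigma (ball struck far in front, pitched close to the pad, or turning sharply) widens the window and pushes more deliveries into the "uncertain" band, which is the reliability measure the email is asking for.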

Followups will be carried in a subsequent post. Blog-related emails can be sent to: dual[no space or dot here]noise AT gmail dot com, or simply send a tweet.
