Showing posts with label cricket. Show all posts

Sunday, January 18, 2015

Strike Rate Analysis of AB de Villiers' Innings of 149(44)

A quick blog on today's amazing knock by AB.

Here's the archive of Cricinfo's online commentary for the cricket match where South African batsman AB de Villiers scored the fastest ever ODI cricket century. Here are the stats. He batted just 44 balls for his 149 runs that included 16 sixes. Here's a graph of his cumulative strike rate throughout the innings in runs per 100 balls.

His highest cumulative strike rate was 400 after the first ball, and thereafter, the lowest it ever reached was 200, after 4 balls. His first fifty took 16 balls, and his second fifty was even faster, coming off just 15 deliveries, giving him the record 31-ball century. He scored no runs off the last two balls he faced, causing his final strike rate to dip below 350 and end at 339.
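For readers who want to reproduce the graph from the commentary archive, here is a minimal Python sketch of the cumulative strike rate calculation. The function name and the four-ball opening sequence are illustrative assumptions (the sequence is only chosen to be consistent with the 400 and 200 figures quoted above, and is not the actual ball-by-ball record); only the 149 off 44 total comes from the scorecard.

```python
def cumulative_strike_rates(runs_per_ball):
    """Return the cumulative strike rate (runs per 100 balls) after each ball faced."""
    rates = []
    total = 0
    for balls_faced, runs in enumerate(runs_per_ball, start=1):
        total += runs
        rates.append(100.0 * total / balls_faced)
    return rates


# Hypothetical opening sequence, NOT the real ball-by-ball data.
opening = [4, 0, 4, 0]
print(cumulative_strike_rates(opening))

# Final strike rate from the figures in the post: 149 runs off 44 balls.
final_rate = 100.0 * 149 / 44
print(round(final_rate))
```

Feeding in the full list of runs scored off each of the 44 balls would give the complete curve.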

AB was at the crease for just 44 minutes. In comparison, the slowest international hundred (in test cricket, which can be very different) in history consumed 557 minutes.

As for instantaneous strike rates during the innings, we can observe a 13-ball stretch (28th to the 40th ball) where he scored 63 runs to move from 82 to 145, at a strike rate of 485 - suggesting that in that period, like Viv Richards, AB was in two minds - whether to hit the bowler for a 4 or 6.

Here's a grainy YouTube video of today's innings.

 

The previous fastest ODI century consumed 36 balls, which yields a strike rate of less than 300. AB's final strike rate in this innings is higher than the expected book-cricket team strike rate of around 325.



Sunday, July 14, 2013

Analytics and Cricket - XI: Using DRS Optimally

The Ashes
The first Ashes cricket test that concluded earlier today in England triggered this post.
(pic source link: guardian.co.uk)

This post is again related to the Decision Review System (DRS) that combines machine and human intelligence to support evidence-driven out/not-out decisions in cricket. The previous cricket-related post can be found here where a reader critiqued an earlier post on the false positives issue of DRS (that saw a lot of visits last week during this cricket match).  It's now apparent that England won the match thanks in part to their superior use of DRS compared to Australia.

DRS
The DRS consists of 3 components:
1. The human (a team of umpires, cameramen, and hardware operators)

2. The hardware (hot-spot, slow-mo cameras, snick-o-meters, etc). These are the data gathering devices.

3. The Analytics and Software (ball-tracking and optimal trajectory prediction, aka 'Hawkeye' based system)

This gives us three separate (but possibly correlated) sources of error:
a. Operator error

b. Hardware error: Technology limitations - resolution, video frame rates, hardware sensitivity, etc. may be inadequate at times for sporting action that occurs as fast as 100 mph or spins at 2000 rpm ...

c. Prediction algorithm error: Given such variations in sporting action, a forecast of the future trajectory of the ball is also subject to uncertainty.

(pic source link: dailymail.co.uk)

A smart user, after sufficient experience with the system, will be able to grasp the strengths and limitations of the system. In test cricket, a team is allowed no more than two unsuccessful DRS reviews per inning. Thus it is a scarce resource that must be cleverly used to maximize benefit.  In fact, the DRS is an example of a situation where the use of a decision support system (DSS) itself involves decision-making under uncertainty using a meta optimization model.

Optimal Usage of DRS
There are several factors that dictate when the trigger must be pulled by a cricket captain to invoke DRS to try and overturn an on-field umpiring decision.
i. The probability of success (p)
ii. The incremental reward, given a successful review (R)
iii. The cost of an unsuccessful review = cost of status quo (set to 0, normalized)
iv. The expected future value of having no more than k reviews still available in the inventory (f(k), concave and increasing in k, so that each additional review adds less value, with f(0) = 0).

Reviewing only based on (ii) is like a "Hail Mary" and banks on hope. On the other hand, paying exclusive attention to (i) may not be the best approach either, since it can result in a captain using up the reviews quickly, reducing the chances of taking advantage of the DRS later when "you need it the most". A person who doesn't use DRS at all (or uses it too late to have an impact) leaves unclaimed reward on the table.

Probability Model
We'll start with a simple model. It's not perfect, or the best, but merely a good starting point for further negotiations.

The value of do-nothing  = f(k).
The value of a DRS review = p[R + f(k)] + (1-p)[f(k-1)] = pR + pf(k) + (1-p)f(k-1).

It is beneficial to go for a review when:
pR > (1-p)[f(k) - f(k-1)] or

p/(1-p) > [f(k) - f(k-1)]/R

i.e., odds must be greater than marginal value of a review / marginal reward

In other words:
it is good to review when the odds of overturning the on-field decision exceed the ratio of the expected cost of losing a DRS review to the expected incremental reward.
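The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this post, and the values in example_f are hypothetical numbers chosen to be concave with f(0) = 0, not estimates from real matches.

```python
def should_review(p, reward, f, k):
    """Apply the rule p*R > (1-p)*[f(k) - f(k-1)] from the model above.

    p      : estimated probability that the review succeeds
    reward : incremental reward R of a successful review
    f      : function giving the future value of holding k reviews
    k      : number of reviews currently in the inventory
    """
    if k <= 0:
        return False  # nothing left to spend
    marginal_value = f(k) - f(k - 1)
    return p * reward > (1.0 - p) * marginal_value


def example_f(k):
    # Hypothetical values: the last remaining review is worth far more
    # than the second one, and f(0) = 0.
    return {0: 0.0, 1: 10.0, 2: 14.0}[k]


# A fifty-fifty call with a modest reward:
print(should_review(0.5, 5.0, example_f, 2))   # two reviews in hand
print(should_review(0.5, 5.0, example_f, 1))   # only one review left
print(should_review(0.5, 12.0, example_f, 1))  # bigger reward, one review left
```

With these made-up numbers the same fifty-fifty call is worth reviewing with two reviews in hand but not with one, unless the reward is large enough to beat the value of the last review.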

Use Case: for fifty-fifty calls (p = 0.5) with a single DRS review in the inventory, the odds equal 1, so you would want to review only if you are convinced that the present reward R is likely to exceed f(1), the value of keeping that last review for the remainder of the innings. For a fixed reward, the RHS increases steeply after the first unsuccessful review due to the concave f. To be really safe, you want to risk a second and final unsuccessful review only when you can trigger a truly game-changing decision that greatly increases the chances of winning the match.  In general, R may be neither a strictly increasing nor a strictly decreasing function of time. This is especially true in limited-overs cricket where a game-changing event can occur very early in the game. However in soccer, baseball, or basketball, R can be reasonably approximated as an increasing function over time, and it makes sense to save the review for the end-game. In any sporting event, including cricket, that is heading for a close finish, it may be beneficial to delay the use of a review.

In the Ashes test, Michael Clarke, the Australian captain, appeared to pay more attention to 'R' and less attention to 'f', and was left without recourse at a crucial stage, and this hurt his team. On the other hand, the England skipper Alastair Cook delayed the use of DRS: The last wicket of a closely contested match fell when the game had reached a climax (R = R_max), and was DRS-induced. Thus, optimally delaying DRS involves constantly assessing and updating risk versus reward, and pulling the trigger when the odds are in your favor.

Analytical Decision Support Systems
A smart organization will be aware of the strengths, weaknesses, and value of DSS-based decisions. In some industries that are characterized by shrinking margins, even small incremental gains in market-share or profitability using DSS can alter the competitive landscape of the market. This motivates an interesting question: If two firms employ the same decision analytics suite provided by the same vendor, do the benefits necessarily cancel out? Or, like in cricket, can one firm do a better job of maximizing value from the DSS to gain a competitive advantage?

Updated July 17, 2013:
It turns out, that wicketkeeper Matt Prior was instrumental in ensuring England's good DRS strategy. As we all know, a good prior saves your posterior during crunch time!

Thursday, December 13, 2012

The King and the Vampire - 2: Cricketing Conundrum

This is the second episode in our 'King Vikram and the Vetaal' (Vampire) series, based on the simple but nice Indian story-telling format. Read the first K&V story here.


Dark was the night and weird the atmosphere. It rained from time to time. Eerie laughter of ghosts rose above the moaning of jackals. Flashes of lightning revealed fearful faces. But King Vikram did not swerve. He climbed the ancient tree once again and brought the corpse down. With the corpse lying astride on his shoulder, he began crossing the desolate cremation ground. "O King, it seems that you are generous in your appreciation for the analytics-based Duckworth-Lewis method used in weather-interrupted cricket matches. But it is better for you to know that there are situations where the D/L method invariably results in complaints, especially in T20 cricket. Let me cite an instance. Pay your attention to my narration. That might bring you some relief as you trudge along," said the vampire that possessed the corpse.

(pic source link: pryas.wordpress.com)

The cricketing vampire went on:
In a recent Australian T20 match, the D/L method did not just perform badly, it actually failed. Here's why:
Team-A batted first, played terribly and made just 69 runs.
Team-B, chasing 70 for a win, got off to a flyer and scored 29 off 2 overs before rain interrupted the match. However, the D/L revised target for Team-B only comes into play when at least 5 overs have been completed by both teams... Anyway, here's what transpired, per cricinfo:
".. Under the Duckworth/Lewis method the target for the Stars (Team-B) was recalculated. The calculation, which itself has been disputed, ensured that the Stars required just six runs from five overs. Even though the Stars had already reached and exceeded the target, given the D/L target had changed when overs were lost play needed to resume to set the revised target. Play resumed at 7.52pm after a minor delay. Hilton Cartwright bowled one ball to Rob Quiney, who allowed it to pass through to the keeper, and the match was over as the Stars had reached their revised target after 2.1 overs."

"So tell me, King Vikram, what is the correct result? Did the Stars win because they achieved the revised target, or should the points be shared because the required five overs were not completed?  Answer me if you can. Should you keep mum though you may know the answers, your head would roll off your shoulders!"

King Vikram was silent for a while, and then spoke: "Vetaal, unlike the last time, this is a tough one, so first consider this counter-factual: 

The Stars continue to play for another 1.4 overs, and are bowled out for 29. If the revised target when they were 29/9 was 30, then Team-A (Scorchers) would have won the match. Therefore, even though the Stars were temporarily ahead of the revised target, that target is not static since the minimum overs were not completed. The D/L based revised target is computed based on two resource constraints, taking into account the runs to be scored and wickets lost. It can change over time and the Scorchers still had a theoretical chance, however small, of winning the match by taking wickets.
However, here's another situation. Suppose Team-B was 29/9 after 2 overs, and the revised 5-over D/L target was 52. Team-B gets to 57/9 in 4.5 overs, hitting the last ball for a six before rains come down and stop the game permanently. In this case, the D/L target cannot increase further, given that Team-B is already 9-down, so Team-A has zero chance of winning the match. Team-B should be declared the winner even though the minimum overs have not been bowled.

1. For the current game, the points have to be shared. This answers your specific question.

2. If a match ends before the minimum overs are completed, the chasing team can be declared the winner only if they have achieved a score that equals or exceeds the highest of all possible revised D/L targets that can occur for the fixed number of overs possible. In this example, Team-B would have been declared the winner if they scored at least 52.

3. However, I suspect that the cricket council will simply enforce the minimum-over rule as a hard constraint that must be satisfied before the match can be decided in favor of one team over the other."

No sooner had King Vikram concluded his answer than the vampire, along with the corpse, gave him the slip.


(pic source link: omshivam.files.wordpress.com)

The Vetaal apparently agreed with Vikram's answer, but do you? If not, explain why. There's no penalty for trying!

Analytics and Cricket - X : Reader's Response to DRS Debate

It's getting increasingly difficult to post on cricket given that the Indian cricket team is getting ripped to shreds by half-decent opposition despite home-ground advantage. Of course, as noted in an earlier post, home courts can significantly increase the chance of a choke, and this may well be happening. Mahendra Singh Dhoni (if by some chance, still remains the captain of the Indian team after the current cricket series) can win a few more tosses if he can exploit this idea. Desperate times call for analytical measures!

Meanwhile, an astute reader emailed a detailed response to the Bayes-theorem based analysis of the Decision Review System (DRS) used in cricket, which was posted on this blog a few months ago. He made some very pertinent points along with some brilliant comments on the game, which led to an informative exchange that will be carried in the next couple of cricket-related posts. Here is our 2x2 color-coded DRS matrix again for reference.



Raghavan notes:

".... I must question some of the steps in your analysis:

1. In your derivation you use  P(RED|OUT) = 0.95.  I think this is true only if all decisions are left to DRS.  You have considered only those decisions that are deemed not out by the umpire and referred.  The 95% number does not hold for these selected cases.  It would be lower.  Here's the rationale:
There is a high degree of correlation between DRS and the umpire's decisions; understandably so, since all those "plumb" decisions are easy for both the umpire and DRS.  Bowlers would rarely review these decisions.  For the 10% or so cases when the umpire rules the batsman not out incorrectly, the DRS would very likely have a lower accuracy than its overall 95%.  


2. If you assume the "red zone" in the picture is sufficiently small compared to 10%, you would get the accuracy of DRS being about 50% for the cases when the umpire incorrectly rules not out.  Now, this needs a bit of explanation.  

Let's assume that whenever the umpire rules out correctly, the DRS also rules correctly (well, at least close to 100% of the time).  Note that this does not include just the referrals, but also all the "easy and obvious" decisions that are not referred.  Since the overall accuracy of DRS is 95%, of the 10% that the umpire incorrectly rules not out, DRS also gets it wrong for half of those 10% cases giving an overall 95% accuracy.  In case the "red zone" corresponding to incorrect OUT decisions of the DRS is not close to zero, but say 2% (which is large in my opinion), the DRS accuracy in the bowler referred cases we are talking of would be 70% rather than 50%.  Still way lower than the 95% overall accuracy. [I have made some approximations here, but the overall logic holds]


3. Now, if you plug 70% instead of 95% in your next steps, you get P(OUT|RED) = 88.6%.  Nothing wrong with this number, except when you compare it with the 90% accuracy of umpires.  It's not apples to apples.  P(OUT|Umpire says OUT) is not 90% if you only considered referred cases.  It's actually a conditional probability:
P(OUT|Umpire says OUT, BOWLER REFERS).  I don't have enough information to estimate this, but I'm sure you'll agree it's lower than 90% since bowlers don't refer randomly.

4. I think the right comparison is between what the final decision would be with and without DRS.
There is no doubt that umpire + DRS referrals improve overall accuracy of decisions.  I admit that false positives would increase marginally, which affects batsmen more than bowlers because of the nature of the game (a batsman has no chance of a comeback after a bad decision, while a bowler does).  But I think it is because of the way Hawk-eye is used today.

5. In my opinion, the main problem with DRS is that its decisions are made to be black and white.  There should be a reliability measure used.  A very rudimentary form of this is currently used in LBW decisions.  For example, if the umpire has ruled out, to be ruled not out by DRS the predicted ball path has to completely miss the stumps.  But if the umpire has ruled not out, the predicted path should show that at least half the ball is within the stumps for the decision to be over-turned.  Eventually, I feel Hawk-eye would be able to estimate the accuracy of its decision.  I'm sure Hawk-eye has statistics on its estimates.  The standard deviation of the estimate would depend on several factors - (1) how far in front of the stumps the ball has struck the pads, (2) how close to the pads the ball has pitched (Hawk-eye needs at least a couple of feet after the bounce to track the changed trajectory), (3) the amount of turn, swing or seam movement observed.

If a standard deviation (sigma) can be estimated, then a window of say +/- 3*sigma could be used as the "region of uncertainty". If the ball is predicted to hit the stumps within this region of uncertainty then the decision should be out. Of course the more complicated it gets to explain to the viewer, the more resistance there would be to be accepted.  But if it is the right way, it will eventually get accepted.  Take DL method for example.  A vast majority of viewers don't understand it fully, but most of them know that it is fair.

6. There's another aspect that needs to be investigated.  It's about how the decision of the on-field umpire is affected by the knowledge that DRS is available.
"




Followups will be carried in a subsequent post. Blog-related emails can be sent to: dual[no space or dot here]noise AT gmail dot com, or simply send a tweet.

Sunday, October 28, 2012

Analytics and Cricket - IX : Book cricket v/s T20 cricket


Introduction
The previous post on cricket in this tab can be found here. We discovered how long a game of snakes and ladders is expected to last a while ago. Calculating the duration of a book-cricket game appears to be relatively simpler. It's a two-player game that used to be popular among kids in India and requires a book (preferably a thick one with strong binding), a pencil, and a sheet of paper for scoring. A player opens a random page, and notes down the last digit on the left (even numbered) page.

 (image linked from krishcricket.com)

A page value of 'zero' indicates that the player is out, and an '8' indicates a single run (or a no-ball). The remaining possibilities in the sample space {2, 4, 6} are counted as runs scored off that 'ball'. A player keeps opening pages at random until they are out.  Here's a sample inning from a simple simulation model of traditional book-cricket:
 6, 2, 2, 1, 4, 4, 1, 2, 1, 2, 4, 2, 1, 4, 6, 6, 6, 4, 0
score is 58 off 19 balls
The counting process terminates when the first zero is encountered. Given this game structure, we try to answer two questions: What is the expected duration of a player's inning, and what the expected team total is (i.e., across 10 individual innings).

Conditional Probability Model
Assume a page is opened at random and the resultant page values are IID (uniform) random variables.

Let p(i) = probability of opening page with value i, where
p(i) = 0 if i is odd, and equals 0.2 otherwise.

D = E[Duration of a single inning]

S = E[Score accumulated over a single inning]

Conditioning on the value of the first page opened, and noting that the counting process resets for non-zero page values:
D = 1*0.2 + (1+D)*4 *0.2
D = 5.
Next, let us compute F, the E[score in a single delivery]:
F = 0.2*(0+2+4+6+1) =  2.6 runs per ball, which yields a healthy strike rate of 260 per 100 balls

S = F*D = 2.6 * 5 = 13 runs per batsman, so we can expect a score of 130 runs in a traditional book-cricket team inning that lasts 50 balls on average.

Introduction of the Free-Hit
The International Cricket Council (ICC) added a free-hit rule to limited overs cricket in 2007. To approximate this rule, we assume that a page ending in '8' results in a no-ball (one run bonus, like before) that also results in a 'free hit' the next delivery, so the player is not out even if the number of the next page opened ends in a zero. This change will make an innings last slightly longer, and the score a little higher. Here's a sample inning (a long one):
1, 6, 1, 0, 1, 4, 2, 2, 4, 6, 1, 6, 2, 4, 2, 4, 6, 6, 6, 4, 1, 1, 6, 0,
score is 76 off 24 balls
Note that the batsman was "dismissed" off the 4th ball but cannot be ruled 'out' because it is a free-hit as a consequence of the previous delivery being a no-ball. The free-hit balls are the ones that immediately follow a '1' (a no-ball) in the sequence above.

D = 1*0.2 + (1+D)*0.2 + (1+D)*0.2 + (1+D)*0.2 + (1+d)*0.2
= 1.0 + 0.6D + 0.2d
where d = E[duration|previous ball was a no-ball]. By conditioning on the current ball:
d = (1 + d)*prob{current ball and previous ball are no-balls} + (1+D)*prob{current ball is not a no-ball but previous ball was a no-ball}
= (1+d)*0.2 + (1+D) * 0.8
d = 1.25+D

D = 1 + 0.6D + 0.2(1.25+D)
D = 6.25


Under the free-hit rule, a team innings in book-cricket lasts 62.5 balls on average, which is 12.5 page turns more than the traditional format. A neat way to calculate S is based on the fact that the free-hit rule only increases the duration of an inning on average, but cannot alter the strike rate that is based on the IID page values, so S = 6.25 * 2.6 = 16.25. To confirm this, let us derive a value for S the hard way by conditioning on the various outcomes of the first page turn:
S = 0*0.2 + (S+2)*0.2 + (S+4)*0.2 + (S+6)*0.2 + (s+1)*0.2
= 2.6 +0.6S + 0.2s.
where s = E[score|current ball is a no-ball] and can be expressed by the following recurrence equation:
s = (1 + s)*prob{next ball is a no-ball} + (r+S)*prob{next ball is not a no-ball}, where
r = E[score in next ball | next ball is not a no-ball]
= 0.25*(0 + 2 + 4 + 6) = 3

Substituting for r, we can now express s in terms of S:

s = (1+s)*0.2 + (3+S) * 0.8
s = 3.25 + S
S = 2.6 + 0.6S + 0.2(3.25+S), which gives S = 16.25, as before.

Under the free-hit rule, the average team total in book cricket is 162.5 runs (32.5 runs more than the total achieved in the traditional format). The average strike rate based on legal deliveries, i.e. excluding no-balls, is 162.5 * 100/(0.8*62.5) = 325 per 100 balls. A Java simulation program yielded the following results:
num trials is 10000000 
average score per team innings is 162.422832
average balls per team innings is 62.477603
average legal balls per team innings is 49.981687
scoring rate per 100 legal balls is 324.9646855657353
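The original Java program is not reproduced here. As a rough stand-in, the following Python sketch simulates the same game under the assumptions stated above: page endings are IID uniform over {0, 2, 4, 6, 8}, an '8' is a no-ball worth one run, and a team innings is 10 individual innings. All function names, the trial count, and the seed are choices made for this sketch.

```python
import random

# Page endings: 0 = out, 8 = no-ball worth 1 run, 2/4/6 = runs scored.
PAGE_DIGITS = [0, 2, 4, 6, 8]


def simulate_batsman(rng, free_hit=True):
    """Return (runs, balls, legal_balls) for one batsman's innings."""
    runs = balls = legal_balls = 0
    protected = False  # True when the previous ball was a no-ball
    while True:
        digit = rng.choice(PAGE_DIGITS)
        balls += 1
        if digit == 8:
            runs += 1
            protected = free_hit  # next ball is a free hit, if the rule is on
            continue
        legal_balls += 1
        if digit == 0 and not protected:
            return runs, balls, legal_balls
        runs += digit  # a protected zero adds nothing and play continues
        protected = False


def simulate_team(rng, free_hit=True, batsmen=10):
    """Sum (runs, balls, legal_balls) over a team innings."""
    totals = [0, 0, 0]
    for _ in range(batsmen):
        for i, value in enumerate(simulate_batsman(rng, free_hit)):
            totals[i] += value
    return tuple(totals)


def averages(trials=20000, free_hit=True, seed=1):
    """Average (score, balls, legal_balls) per team innings."""
    rng = random.Random(seed)
    sums = [0, 0, 0]
    for _ in range(trials):
        for i, value in enumerate(simulate_team(rng, free_hit)):
            sums[i] += value
    return tuple(s / trials for s in sums)


if __name__ == "__main__":
    score, balls, legal = averages()
    print("average score per team innings:", score)
    print("average balls per team innings:", balls)
    print("scoring rate per 100 legal balls:", 100.0 * score / legal)
```

Calling averages(free_hit=False) switches the free-hit rule off, which should land near the 130 runs and 50 balls derived for the traditional format.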

Result: In comparison to real-life T20 cricket (~ 120 balls max per team inning), book-cricket is roughly 50% shorter in duration, but the higher batting strike rate usually yields bigger team totals in book cricket. The fact that we can even rationally compare statistics between these two formats says something about the nature of T20 cricket!

The cost of front-foot no-balls and big wides in T20
We can use the simple conditional probability ideas used to analyze book-cricket to estimate the expected cost of bowling a front-foot no-ball and wide balls in real-life T20 matches by replacing the book-cricket probability model with a more realistic one:

Assume p[0] = 0.25, p[1] = 0.45, p[2] = 0.15, p[3] = 0.05, p[4] = 0.05,  p[5] ~ 0, p[6] = 0.05, p[7, 8, ...] ~ 0.

E[score in a ball] = 0 + 0.45 + 0.3 + 0.15 + 0.2 + 0.3 =1.4

(This probability model yields a reasonable strike rate of 140 per 100 balls.)

E[cost | no ball] = 1 + 1.4 + 1.4 = 3.8

Bowling a front-foot no-ball in T20 matches is almost as bad as giving away a boundary (apart from paying the opportunity cost of having almost no chance of getting a wicket due to the no-ball and the subsequent free-hit).  Similarly,
E[cost | wide-ball down the leg-side] = (5|wide and four byes)*prob{4 byes} + (1| wide but no byes)*prob{no byes} + 1.4.

Assuming a 50% chance of conceding 4 byes, the expected cost is 4.4. On average, a bowler may be marginally better off bowling a potential boundary ball (e.g., bad length) than risk an overly leg-side line that can result in 5 wides and a re-bowl.
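The arithmetic above is easy to replay with different assumptions. Here is a small Python sketch that uses the same assumed per-ball run distribution and the same 50% chance of four byes; the function names are made up for illustration.

```python
# Assumed per-ball run distribution from the post (probabilities sum to 1).
RUN_PROBS = {0: 0.25, 1: 0.45, 2: 0.15, 3: 0.05, 4: 0.05, 6: 0.05}


def expected_runs(probs):
    """Expected runs scored off a single delivery."""
    return sum(runs * p for runs, p in probs.items())


def no_ball_cost(probs):
    """1 penalty run + runs off the no-ball itself + runs off the free hit."""
    per_ball = expected_runs(probs)
    return 1 + per_ball + per_ball


def leg_side_wide_cost(probs, p_four_byes=0.5):
    """5 runs if the wide runs away for four, else 1, plus the re-bowled ball."""
    return 5 * p_four_byes + 1 * (1 - p_four_byes) + expected_runs(probs)


print(round(expected_runs(RUN_PROBS), 2))       # expected runs per ball
print(round(no_ball_cost(RUN_PROBS), 2))        # cost of a front-foot no-ball
print(round(leg_side_wide_cost(RUN_PROBS), 2))  # cost of a leg-side wide
```

Changing RUN_PROBS or p_four_byes shows how sensitive the comparison between a no-ball, a wide, and a boundary ball is to these guesses.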

More sophisticated simulation models based on actual historical data can help analyze more realistic cricketing scenarios and support tactical decision making.

Tuesday, April 17, 2012

Analytics and Cricket - VIII: DRS & Bayes Theorem

In the last post on cricket, we mentioned that the false positive (F+) issue with the Decision Review System (DRS) employed in international cricket could be a deal-killer (see red zone in picture below).

In this post, we work out an illustrative numerical example using a well-known conditional probability model based on reasonable data derived from interviews of ICC personnel to show that the current F+ rate disproportionately reduces the efficacy of the DRS, causing it to operate only marginally more effectively than the human-only (umpire) method, and thus may not be worth the cost of maintenance unless the F+ rate is reduced to a more acceptable level.

For brevity, let's focus on bowler reviews in this example. A bowler will ask for a machine review of an umpire's original decision of not out, hoping to turn that into an 'out'. Umpires in the elite panel are themselves around 90% effective in making the right decision (so on average, only 10% of the subsequent DRS referrals should change the outcome if they work perfectly), so it is really that 10% gap that is the problem.

Today's cricket DRS system is claimed to be around 95% accurate in giving a batsman out, if in fact, the batsman is really out. Suppose the DRS also yields F+ results for just 1% of the bowler reviews, i.e. it gives a batsman 'out' when he is really 'not out' just like the umpire originally said. If 10% of the batsmen subject to bowler reviews are actually out (as obtained in the previous paragraph), what is the probability that a batsman is actually out given that the DRS overturns the umpire's decision to say he is out?

Answer: Let OUT be the event that the batsman reviewed is actually out (its complementary event is NOTOUT), and RED the event that DRS gave him out. The desired probability P(OUT|RED) is obtained using the Bayes formula by:

P(OUT|RED) = P(OUT and RED)/P(RED)
Expanding out the terms, we can write this as
= [P(RED|OUT) x P(OUT)] /
[P(RED|OUT) x P(OUT) + P(RED|NOTOUT) x P(NOTOUT)]

= [0.95 * 0.1] / [0.95*0.1 + 0.01 * 0.9]
= 0.095/0.104 = 91%

Observations
1. Even a 1% F+ rate brings down the true efficacy of DRS, and it is not 95% as the ICC claims. The second term is a combination of F+ rate and human accuracy. Thus
P(OUT|UMPIRE SAYS OUT) = 90%
If the bowler asks for DRS review:
P(OUT|DRS SAYS OUT) = 91%
Not much of an improvement

2. The better the umpires get at their job, the worse the existing DRS will statistically perform. For example, if the umpires improve upon their accuracy by just one percentage point, i.e. to 91%, the conditional accuracy of DRS changes to:
= [0.95 * 0.09] / [0.95*0.09 + 0.01 * 0.91]
= 90%


Thus, the tables are turned now and the DRS makes things worse for batsmen here and we may be better off not using DRS at all even if it is provided free of cost!

This second result may seem puzzling. Why does this happen? If the umpires get better, the frequency of true NOTOUT is 1% higher, and with the F+ rate held constant at 1%, there will be an increase in the total number of false positives over a period of time, in addition to a small decrease in count of true positives, thereby reducing the accuracy rate of the DRS.

You can plug in a variety of numbers to see what the corresponding results are. You can also perform a similar analysis for batsman reviews.
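For anyone who would rather not redo the arithmetic by hand, here is a minimal Python sketch of the same Bayes calculation. The function and parameter names are made up for this post; the inputs are the illustrative rates used above.

```python
def prob_out_given_red(p_red_given_out, false_positive_rate, prior_out):
    """P(OUT|RED) via Bayes' formula.

    p_red_given_out     : P(RED|OUT), chance DRS says out when batsman is out
    false_positive_rate : P(RED|NOTOUT), the F+ rate
    prior_out           : P(OUT) among the reviewed decisions
    """
    true_positives = p_red_given_out * prior_out
    false_positives = false_positive_rate * (1.0 - prior_out)
    return true_positives / (true_positives + false_positives)


# Umpires right 90% of the time, so 10% of reviewed batsmen are actually out.
print(round(prob_out_given_red(0.95, 0.01, 0.10), 3))  # 0.913

# Umpires improve to 91%, so only 9% of reviewed batsmen are actually out.
print(round(prob_out_given_red(0.95, 0.01, 0.09), 3))  # 0.904
```

Lowering false_positive_rate while holding the other inputs fixed shows how quickly the conditional accuracy recovers.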

Recommendations:
1. Significantly cut down on the F+ rate and not just focus purely on increasing true positive rate

2. Improve the quality of original human decisions. This will reduce the dependence on DRS, encourage improvements in the DRS to keep pace, and obviously improve player attitude toward umpires.

3. If a brilliant cricketing instinct filled person like Mahendra Singh Dhoni talks about 'adulteration of human and machine', do think twice about it, he's got a useful math model behind this statement!

Reference: Introduction to Probability Models by Sheldon M. Ross. This example is a variation of an example from this book. Hope I did not mangle it.

Saturday, February 25, 2012

Analytics and Cricket - VII : Does DRS have a False Positive issue?

The last post related to cricket was quite a while ago (that the Indian cricket team has been repeatedly thrashed since then is a mere coincidence). This post focuses again on the Decision Review System (DRS), a technology-aided analytical decision-support system to aid cricket umpires. The toolkit includes a set of multiple video cameras,  heat-sensing 'hot spot' technology, and ball-tracking devices that record the point of impact, as well as an additional set of predictive algorithms to forecast the counterfactual trajectory of the cricket ball (you can be forecasted 'out' in cricket). Despite the best efforts of cricket's custodians, considerable user unease with the DRS persists. In fact, it has been recently acknowledged that the use of the decision support system has had a significant impact on the game (user response: altering playing styles and inducing more 'OUT' decisions from umpires), something which this tab predicted a year ago.  Reasons for discomfort also include the lack of uniformity in its deployment, the incremental dollar cost of the DRS versus incremental returns, and equally importantly from a fan and player perspective, DRS reliability (both real and perceived). This post will focus on the last two issues.

The International Cricket Council (ICC) has focused almost exclusively on improving the technology (e.g. increased number of video frames per second, etc). The main argument here is that while an improvement in the unconditional success rate for the DRS may seem impressive, it would be more helpful if statistics are calculated and presented conditional on the corresponding human decisions made. Toward this, let's look at this MBA-ish 2x2 decision matrix (sorry). Strictly speaking, the terms 'correct' and 'incorrect' in the matrix mean 'almost surely correct' and 'almost surely incorrect', respectively.




1. The ICC has a wonderful set of umpires in their 'elite panel' that referee the most important inter-nation test matches (these elite umpires are a scarce resource, and their globe-trotting schedule optimization is yet another operations research problem - perhaps a good topic for part-8 of this series). Prior to the DRS, the umpires achieved a respectable success rate of more than 90%. Consequently in such situations, the DRS getting it right is a relatively uninteresting event. This situation is denoted as the neutral zone (top-left box). Therefore the focus is on the remaining 7-10% of the time when the decisions are contentious.

2. Clearly the case where the umpire is wrong and the DRS is right (as judged by video and predicted-trajectory evidence) is a win-win for the DRS and players. This is the green-zone (bottom left) and appears to be the exclusive area of ICC's focus as far as technological improvements. However, it is not necessarily desirable to accord top priority to the goal of achieving further improvements in this statistic.

3. The problems arise when the DRS occasionally produces visibly and audibly confounding results. This is represented by the top-right box, the 'high conflict zone'. In some instances, it could be because of technological gaps or operator error (there was a recent example where an umpire whose sole job consisted of watching the TV replay and hitting one of two buttons managed to hit the wrong one). However, in other instances, the predictive component of the DRS that is used to probabilistically judge LBW (leg-before-wicket) 'OUT' decisions appeared to be flawed or incompatible because:

a. The greater the required length (or duration) of the prediction, the noisier the forecasted trajectory.
b. The smaller the observed portion of the ball trajectory available for 'training' (especially after spinning and bouncing off the cricket pitch), the less reliable the prediction.

The umpire's years of prior refereeing experience, and the other human cognitive powers that help him arrive at the decision, are pitted against hardware and algorithmic prediction prowess. The challenge is to account for the many degrees of freedom involving a rotating cricket ball in motion while also taking into account the effect of the cricket pitch and local conditions.

4. There may be rare, irritating cases where despite best efforts, uncertainty prevails and both the umpire and DRS manage to get it wrong (bottom right box).

If the ICC can provide data on the frequency of observations that fall in each of these 4 boxes, we can of course calculate the conditional probability of a correct decision given the DRS response using well-known conditional probability models and compare with the corresponding results for the manual system. For example, how likely is it that the batsman is actually OUT given that the DRS overruled an umpire's original 'NOT OUT' decision? Such analyses help figure out the impact of false positives and false negatives that comprise the conflict zone observations. In particular, the false-positive rate, i.e. the case where a batsman tests positive ('OUT') using a DRS when he is actually NOT OUT, should be minimized given the nature of this sport.
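To make the conditional-versus-unconditional point concrete, here is a minimal Python sketch. The counts are made up purely for illustration (they are not ICC data), the function name is hypothetical, and it assumes a binary OUT/NOT OUT call, so the DRS overrules the umpire exactly in the green and high-conflict boxes.

```python
def drs_conditional_rates(neutral, green, conflict, both_wrong):
    """Summarise the 2x2 umpire-vs-DRS matrix from raw counts.

    neutral    : umpire right, DRS right  (top-left box)
    green      : umpire wrong, DRS right  (bottom-left box)
    conflict   : umpire right, DRS wrong  (top-right box)
    both_wrong : umpire wrong, DRS wrong  (bottom-right box)
    """
    total = neutral + green + conflict + both_wrong
    overrules = green + conflict       # DRS disagrees with the umpire
    umpire_wrong = green + both_wrong  # cases the DRS is meant to rescue
    return {
        "drs_right": (neutral + green) / total,
        "umpire_right": (neutral + conflict) / total,
        "drs_right_given_overrule": green / overrules,
        "drs_right_given_umpire_wrong": green / umpire_wrong,
    }


# Made-up counts for 1000 decisions (NOT real ICC data).
rates = drs_conditional_rates(neutral=900, green=60, conflict=30, both_wrong=10)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With these invented numbers the DRS looks great unconditionally (96% versus the umpire's 93%), yet only two out of three overrules are actually correct. That gap is the kind of thing the unconditional statistic hides.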

Recommendations
The biggest stumbling block appears to be the top-right box (high-conflict zone) that erodes user trust every time the DRS wrongly overrules what appears to be a sound cricketing decision by the umpire. As a priority, the ICC should isolate and eliminate those components that increase the occurrence of such situations. The likely candidates for culling will be the trajectory-predictor and existing flawed versions of 'hot spot'. These innovations should be reintroduced at a later stage only after sufficient improvements have been made (and while also keeping the resultant cost down) to ensure that the expected failure rates are well under control. Viewed from this perspective, a recent decision by the Indian cricket board to do away with the predictive component of the ball-tracking technology is actually the right one.

@dualnoise on twitter

Saturday, March 27, 2010

Analytics and Cricket - II : The IPL effect

This is the second in the series of articles on O.R. and cricket. Click here for the first part, done a while ago.

The Indian Premier League (IPL) is close to becoming the number one Indian global brand - not just the number one sports brand. It has overtaken past colonial stereotypes (such as snake charmers, elephants, and Maharajahs), current pop stereotypes (IT outsourcing brands like Infosys, Wipro, et al, knowledge-brands like the IIT graduate, etc). The two newest franchise teams unveiled in this fledgling three-year old league were purchased for $333M, costing more than a couple of current NHL teams. Sports has become big business, even as the cricket fan in me rebels against this. Several owners have 'Bollywood' connections. Not surprising, given that these movie types make so many expensive flops year after year, the risk level for a cricket venture is surely much lower.

This IPL season is on YouTube now after a pioneering deal with Google, and this experiment serves as a nice dress rehearsal for the search engine company toward more such live streaming ventures in the future. In terms of audience size, it's easily a factor of ten-twenty bigger than that for NCAA basketball. India has a lot of cricket-crazy people. I've provided the YouTube link for my favorite match of the tournament so far: Bangalore v Mumbai. This is the shortest form of cricket played where each innings lasts twenty overs and the entire game is completed in three hours.



We will cover two new analytics-induced innovations observed in this season's IPL.

First, the number of run-outs (roughly analogous to a baseball runner being thrown out before reaching a base) seems to have increased dramatically. Why? It looks like team statisticians have noticed that fielding is a traditionally weak area for teams and that the probability of a direct hit on the stumps is low. This reduces the risk of getting run-out, and the reward for stealing an additional run against statistically poor fielding teams may be well worth the risk. Teams that do not improve their fielding will probably see this hit-probability decrease. Teams will take more chances against you and more members of your team will have the opportunity to showcase their non-athletic, Keystone Kops-like fielding prowess, leading to a deterioration in stats. Conversely, good fielding teams can improve their hit-probability stats and reap the reward in terms of effecting more run-outs. Teams of both kinds can be seen. The ones adopting better fielding standards are at the top of the points table.
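The risk/reward argument can be put as a back-of-the-envelope expected-value calculation. The sketch below is only illustrative: it assumes the batsman is out only on a direct hit, that a stolen run is worth exactly one run, and that losing a wicket can be priced as a hypothetical number of runs (`wicket_cost_runs`), none of which comes from real match data.

```python
def expected_gain(p_hit, wicket_cost_runs):
    """Expected net runs from attempting one risky single.

    p_hit            : probability the fielder hits the stumps directly
    wicket_cost_runs : assumed run-value of losing a wicket (hypothetical)
    """
    return (1.0 - p_hit) * 1.0 - p_hit * wicket_cost_runs


def break_even_hit_probability(wicket_cost_runs):
    """Hit probability at which the risky single has zero expected gain."""
    return 1.0 / (1.0 + wicket_cost_runs)


# With a wicket priced at 5 runs, stealing pays off below a ~17% hit rate.
print(break_even_hit_probability(5))
print(expected_gain(0.10, 5))  # weak fielding side: positive
print(expected_gain(0.25, 5))  # sharp fielding side: negative
```

Under these assumptions, a fielding side has to push its direct-hit rate above the break-even point before batsmen stop taking them on.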

A second analytic innovation is in the form of a special T-20 (twenty-over cricket) bat that is now the most famous mongoose in India (that's the brand name for this bat). It has a handle as long as the blade itself, with the total length of the bat remaining the same. Statistics show that in this form of the game, half a bat is often better than a full one, if optimally designed! Don't believe it? See this YouTube clip of Matt "the bat" Hayden, the first player in the IPL to use this bat. He is certainly not going to be the last.



So why is the mongoose effective? In the most serious form of cricket (test cricket), a full bat is a must. It's a longer game (over 5 days) and the chances of getting out are much, much higher over time and you want a bat as large as a barn door to prevent the ball from disturbing your stumps. From the T20 perspective, the ball travels the longest when it hits the sweet spot of the bat (roughly three-fourths of the way down a bat), and combined with the fact that getting out in T20 is not such a big deal, you end up with the mongoose, which is essentially just a long handle and a reinforced lower half, like a pendulum. It's made of wood just like the traditional bat, just as long, and roughly the same weight. For a given period of time at the crease, you are more likely to get out using the mongoose, but the expected number of runs (specifically in the form of hitting sixers) you could score before that happens can be much higher, thus making it an attractive trade-off in certain T20 match situations.
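Here is a rough way to see that trade-off in numbers. This is a toy model with invented parameters, not measured data: each ball the batsman is dismissed with a fixed probability, and otherwise scores a fixed average number of runs, up to a cap on the balls he could face.

```python
def expected_runs(p_out, runs_per_ball, max_balls):
    """Expected runs when each ball is survived with probability 1 - p_out.

    The batsman scores `runs_per_ball` on every ball he survives, and the
    innings stops at dismissal or after `max_balls` deliveries.
    """
    total = 0.0
    survive = 1.0
    for _ in range(max_balls):
        survive *= 1.0 - p_out          # still in after this ball
        total += survive * runs_per_ball
    return total


# Invented numbers: the mongoose scores faster but gets out twice as often.
full_bat = {"p_out": 0.03, "runs_per_ball": 1.3}
mongoose = {"p_out": 0.06, "runs_per_ball": 2.0}

for balls in (20, 300):
    print(balls,
          round(expected_runs(max_balls=balls, **full_bat), 1),
          round(expected_runs(max_balls=balls, **mongoose), 1))
```

With these made-up inputs the mongoose comes out ahead over a 20-ball T20 cameo, while the full bat wins comfortably when there is no practical limit on balls faced, which is the test-cricket intuition above.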

Friday, December 11, 2009

Analytics in Sport - Cricket and Operations Research - 1

Analytics in Cricket is not a new idea. A lot of OR folks, especially from the so-called Commonwealth nations, would be pleasantly surprised to learn that OR has been an integral part of cricket (in particular, the limited overs versions) for the last 16-odd years. This is because of the official induction of the Duckworth-Lewis rules for weather-interrupted matches into the rulebook. Mr. Duckworth and Mr. Lewis are OR/Math guys. According to cricinfo, the latter is/was the chairman of the western branch of the Operational Research society in the U.K. See this old article in ORMS Today on their work.

Cricket is more than a hundred years old, and is the second-most followed sport on this planet (thanks to more than a billion and a half cricket-mad fans from the Indian subcontinent, including this author). India is the No.1 test cricket team in the world today (after 77 years of hapless performance), and cricket is now big business that is growing in size, what with all the professional leagues like the IPL springing up. Among all sport, cricket mirrors life the most, and its rules suitably reflect this. It's best been described as having a cleverly disguised gentlemanly exterior which hides a series of fierce one-on-one gladiatorial contests of blood, guts, and stamina, which is then wrapped in a chess-game of wits and strategy, and tied up with strings of psychological tactics of 'mental disintegration'.

While D/L rules prescribe revised targets for rain-affected limited-over matches, these rules can also be used as a decision-support tool for teams to figure out an optimal trajectory to achieve the target during a run-chase. More generally, the run-chase problem can also be formulated as a (stochastic) Dynamic Programming problem. Constrained resources include batsmen available (10), and overs available (20 or 50, depending on whether it's a T20 game or a 50-over game), with the objective being to get at least one run more than the opposition before either of these resources is exhausted. At any stage of the game, the teams can tailor their tactics according to this optimal trajectory that can be recalculated after every ball or over.
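As a rough illustration of the stochastic DP idea (this is not the D/L method itself), the sketch below treats the chase state as (runs needed, balls left, wickets in hand) and picks, ball by ball, the batting mode that maximizes the probability of winning. The two batting modes and their per-ball outcome probabilities are entirely hypothetical.

```python
from functools import lru_cache

# Hypothetical per-ball outcome distributions; "W" means a wicket falls.
STRATEGIES = {
    "cautious":   {0: 0.40, 1: 0.45, 2: 0.08, 4: 0.05, "W": 0.02},
    "aggressive": {0: 0.35, 1: 0.25, 2: 0.05, 4: 0.17, 6: 0.10, "W": 0.08},
}


@lru_cache(maxsize=None)
def win_probability(runs_needed, balls_left, wickets_left):
    """Best achievable probability of completing the chase from this state."""
    if runs_needed <= 0:
        return 1.0                      # target reached
    if balls_left == 0 or wickets_left == 0:
        return 0.0                      # a resource has run out
    return max(_value(name, runs_needed, balls_left, wickets_left)
               for name in STRATEGIES)


def _value(name, runs_needed, balls_left, wickets_left):
    """Win probability if `name` is used now and play is optimal afterwards."""
    total = 0.0
    for outcome, prob in STRATEGIES[name].items():
        if outcome == "W":
            total += prob * win_probability(
                runs_needed, balls_left - 1, wickets_left - 1)
        else:
            total += prob * win_probability(
                max(runs_needed - outcome, 0), balls_left - 1, wickets_left)
    return total


def best_strategy(runs_needed, balls_left, wickets_left):
    """Batting mode that maximizes the win probability for the next ball."""
    return max(STRATEGIES,
               key=lambda name: _value(name, runs_needed, balls_left, wickets_left))


# Example: 12 needed off the last over with 3 wickets in hand.
print(best_strategy(12, 6, 3), round(win_probability(12, 6, 3), 3))
```

Because the value function is cached, the recommended mode can be recomputed after every ball, which is the "recalculate the trajectory" idea in the paragraph above. A real model would need outcome probabilities that depend on the batsman, bowler, and field, which this sketch ignores.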

Statistical tools that analyze batting and bowling performances, and that support stuff like SWOT analysis, have regularly been a part of cricket in recent times, much like baseball. A new idea proposed here is to analytically decide on how to make optimal use of the newly introduced review system, similar to the challenge system in tennis and perhaps the NFL. Statistically, cricket umpires tend to get 1 in 10 decisions wrong (very impressive given that there are 10 or more ways of getting out in cricket :-). At the highest skill-levels of cricket, i.e., country-versus-country test cricket that is played over 5 days, a bad decision can result in the team at the receiving end of this decision getting pummelled into a defensive position for a couple of days under the hot sun.

It is quite likely that a probability model can be built around this idea. For example, given that there are 10 wickets available in an inning, using a geometric probability model,
prob(at least one error) = 1 - prob (getting all 10 right) = 1 - (0.9^10) = 0.65.
So there is roughly a 2/3 chance that the average umpire will get at least one decision wrong in a completed inning and this number can increase if the particular umpires in that match are known to be more error-prone. Given that a team is allowed only two (wrong) reviews, the idea is to come up with a reviewing strategy that ensures that you only challenge decisions that maximize your team's expected advantage-level for the remainder of the game. If you use these challenges frivolously, you are left with no recourse later in an inning. If you do not use it at all, you are more than likely to suffer at least one bad decision per inning.

Interestingly, the International Cricket Council (ICC), which is the governing body of cricket, states that with the review system in place, the statistical accuracy rate has improved to roughly 95%. This means that the probability of at least one error in a complete inning is reduced from 65% to 40%, implying that it is statistically more likely than not that there will be no wrong decision in an inning with the review system in place, which allows an on-field umpire to change a decision based on evidence from video and audio footage (upon request from either team).
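The 65% and 40% figures follow from the same one-line calculation. Here is a small Python check, assuming (as the post does) 10 independent decisions per inning with the same per-decision accuracy; the function name is just for illustration.

```python
def prob_at_least_one_error(accuracy, decisions=10):
    """P(at least one wrong call) for independent, equally accurate decisions."""
    return 1.0 - accuracy ** decisions


print(round(prob_at_least_one_error(0.90), 2))  # umpires alone
print(round(prob_at_least_one_error(0.95), 2))  # with the review system
```

The independence assumption is a simplification; errors probably cluster around difficult conditions or particular umpires, which would change these numbers.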

Tennis players at grand-slam events tend to invoke this challenge during key times of the match ('big points'), where you are at a 'cliff', e.g., at break-point. In the past we have seen great players (McEnroe at a certain French Open comes to mind), who've mentally lost matches from winning positions because of what they perceived to be an unjust call. However, doing so only at these points may not be optimal, since a highly-probable bad call at 30-30 by a line-judge is a good candidate for review. In other words, there is both a probability value as well as a consequence-cost (risk/reward) associated with this decision.
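One hedged way to combine the probability and the consequence is a simple expected-value rule: challenge when the expected gain from an overturned call exceeds the expected cost of burning a review. The function and the numbers below are hypothetical, and the "value of a review" would in practice depend on how much of the match is left.

```python
def should_review(p_wrong, gain_if_overturned, value_of_a_review):
    """Challenge if expected gain beats the expected cost of a lost review.

    p_wrong            : your estimate that the original call was wrong
    gain_if_overturned : benefit (in any consistent unit) if the call flips
    value_of_a_review  : assumed worth of keeping the review for later
    """
    expected_gain = p_wrong * gain_if_overturned
    expected_cost = (1.0 - p_wrong) * value_of_a_review
    return expected_gain > expected_cost


print(should_review(0.3, 10, 3))  # big point, doubtful call
print(should_review(0.9, 4, 3))   # ordinary point, almost surely a bad call
print(should_review(0.1, 10, 3))  # big point, but the call was probably right
```

Under these made-up inputs the first two cases are worth a challenge and the third is not, which matches the point that 'big points' alone should not drive the decision.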

In most situations, these decisions have to be made on-the-spot, with limited external-feedback available. Rather than rely wholly on instinct and emotions ("I'm sure I am not out"), it would be nice for the player under the spotlight to have some analytic ammo to go with bat or ball.

Having said all this, the great teams (like the ones from the Caribbean in the 80s) and great players like the late Donald Bradman (who is considered to be the greatest Australian who ever lived), and currently, Sachin Tendulkar of India (who is a shoo-in for the greatest living Indian today), are the ones who raise their game to a higher-level when faced with adversity, and overcome such bad calls. On the other hand, a little bit of O.R could make this task a little less difficult.

- correction added: D/L rules have been used for nearly 16 years now.

Monday, June 15, 2009

T20 world cup crash

India was on par with England after 19.4 overs. Harbhajan then bowled a 5-wider with Yuvraj misfielding. In fact Harbhajan had two of those in an otherwise excellent spell. The match was probably lost there. A dubious selection - Ishant Sharma is an excellent test/ODI bowler who shouldn't be playing T20 cricket - hasn't done much in this format and should have been replaced by Nehra or I.Pathan.

Anyway, it's been an overdose of T20 cricket - even ODIs look attractive now ...

Monday, June 8, 2009

English Cricket momentum

English Humor on cricinfo blogs is back after the t20 victory over Pak. Excerpts:

" ... Physicists among you will know that momentum is the product of mass and velocity. When Rob (Key) propelled himself along the Lord's outfield, those two ingredients were present in abundance. If England can recreate that moment and harness the momentum, they'll win every match for at least the next 10 years .."

Dadaism redefined -
Do people still believe in dada?

another funny sound-bite on tv from the t20wc is the 'yahoooooo' call from the stadium-DJ after every few overs. this event is sponsored by the internet company. And this sporting headline is mildly funny if u read between the lines:
"Boxers off to winning start at Asian Championships"

In the twitter era, headlines are all that's needed. The actual stuff that follows is not going to be read anymore. Finally, a huge thanks to t20wc commentators in England. After the IPL mass-regurgitation, they are under no pressure to do an encore here, and the focus is back on cricket, even if it's just 40 overs.

Monday, June 1, 2009

Women Reporters in Cricket

The last few years have witnessed the emergence of many women who have gotten involved with cricket in some form. Let's start at the bottom and work our way up....

Most Indians cannot forget Mandira Bedi (some, not quickly enough). On the other hand, I did like the tricolor peas pulao in her old TV ad. One of the positive side-effects of the IPL has been in this area. We have a couple of rich 'Bollywood' glam-acts who own stakes in IPL teams. Then we had Ms. Tishani Doshi, a dancer (or is that danseuse?) / poet/journalist born and based in Chennai wo-manning one of the many IPL blogs on cricinfo. While she is no cricket expert, the view from the distaff side made for some interesting reading. A rather unexpected piece of cricket-related writing came from Rebecca Lee, one of the 'mischief gals', hired as cheerleaders for home-team Bangalore for IPL-II. The blog by Ms. Lee was quite interesting, in that it comes from an American, trans-cultural perspective. Furthermore, their comments on the effort level and mental state of the Bangalore team while it put together a nice, long winning streak before the inevitable final tragedy were noteworthy. While I'm not a fan of this whole cheer-leading stuff, I do hope these gals return next season and blog some more.

IPL aside, we have more cricket-aware, serious lady writers/speakers today than ever before. Many of the cricketers who participated in the recent Women's world cup have blogged regularly on cricinfo during that tournament. It was a pleasant surprise to listen to the commentary on the recent Aus-South Africa (men's) ODI cricket series in SA. One of the persons on the commentary panel was Kass Naidoo (I think), and she was pretty good. Certainly better than all the crappy ex-cricketers throwing up en-masse into the mike during the IPL.

A personal favorite is Sharda Ugra, Deputy Editor of India Today. Her regular cricket columns (such as 'free hit') are among the best in the sports journalist business. She calls a chuck a chuck, and reminds me of Mary Carillo (Tennis), albeit less controversial. Among other things, she has worked with John Wright, former Team-India coach on his wonderful book 'Indian Summers' that captures many moments of the renaissance years (2001-2004) of Indian cricket. She is also the winner of the 'best sports writer of the year 2006' (India).

I've probably left out many more, and we'll have to end with this:
Behold, Mandira and Sharda,
meal-ticket and sticky-wicket
there's room for all in desi cricket,
but fitting 'em to rhyme is harda

Thursday, May 7, 2009

The longest millisecond

Suresh Raina's talent is dazzling, as is Rohit Sharma's. The latter took a hat-trick yesterday, while the former came up with an amazing bowling cameo. In an inspired move, Dhoni brought him in to bowl the 15th and the 17th overs (in a match reduced to 18 overs) at a point when Yuvraj and Mahela were belting sixes like crazy. He came up with this gem:
SK Raina 2 overs, 8 runs, no wickets

12 balls without a single boundary when everybody else was getting hit quite easily during the death overs. How did he do it? This tab has a theory.

He appears to have developed a 'feinting' bowling technique - that can be visually explained best by looking at examples like the famous Brazilian soccer player Socrates before he used to take penalties, or the Pakistani hockey player Shabaz's (or Ronaldinho's) deft dodge-moves at the last instant, leaving a defender behind. Every delivery appears to be a game played out in the milliseconds during which the batsman or the bowler blinks first and commits to a move. Raina is able to delay his delivery that millisecond more, and the results are amazing. It has worked so far, and will work until he comes across a batsman who can stay still that fraction longer. Would be interesting to see him bowl to Sehwag who has the ability to stay still the longest and spot the bowler's intent before the rest of the stadium catches on.

Sport at the highest level always appears to be played in these milliseconds.

Sunday, May 3, 2009

IPL-II: Ball of the Tournament

The credit goes to the young Indian, Sudeep Tyagi of Chennai SK, who missed out on the first IPL due to a stress fracture. Cricinfo has a nice piece on him here. The quality of TV commentary has hit a nadir during the IPL and it is painful to see past cricket legends and ex-players donning the role of peanut hawkers. They were so busy playing vending machine that they spent too little time on this ball (cricinfo scoreboard)

AB de Villiers b Tyagi 0 (1 ball)

AB is probably the best of the young international cricketers who have already made their mark in the tough world of Test Cricket. Coming fresh off his success in previous IPL matches, the first ball he received from Tyagi shaped away, but then hit the seam and came in to rattle the castle. Tyagi of course, had just gotten a wicket the previous ball. An outstanding performance on IPL debut and I hope he is consistent enough to play for India in the future. Dhoni and Warne certainly seem to inspire the rookies.

The second ball of the tournament was also bowled in this match:

TM Dilshan b Jakati 13 (13 balls)

- this time by Indian spinner Shadab Jakati, who took out TM Dilshan, another in-form batsman. Both these players were clean bowled on defensive strokes in a T20 match. Good cricket is still alive! Too bad the third-rate commentary (except Bhogle) is too busy selling everything other than the many good cricketing moments (To be precise, 'many' is relative to IPL-I that was played on flat Indian tracks with no bounce). 'DLF Maximum Swindle' and 'Citi Moment of Madness' would be apt for these two morally and financially bankrupt IPL sponsors. sigh.

Home team Bangor, err, I mean Bangalore, has risen from the ashes and has smartly picked an Indian-South African combination. 7 of the 8 teams are now within a point of each other, so it's a dogfight. With classy Indian test players in the form of Kumble and Dravid, B'lore may still make it to the semis - but they will need to win 4 of their remaining 6 contests. For this, they have to keep their current hot streak alive...

And finally, Buchanan must resign. It's not the same without Warne, McGrath, and Gilchrist, is it?

Thursday, April 30, 2009

IPL-II: The Indian cricket perspective

One of the plus points of this IPL version is that it's taking place in South Africa, and therefore it provides us a nice way of evaluating the cricketing quality of non-internationals (Dhoni himself has played very little here). The pitches are more balanced, with mishits seldom going for a six, unlike the Indian edition of IPL last year. The number of skiers has been astonishing and the percentage of high catches that have been dropped, even more so.

The young Indian spinners look good, what with the spin-friendly pitches that seem to abound. On the other hand, the fast bowling department looks a mixed bag. The rookie Kamran Khan probably needs to work on his action. One is hoping for a comeback from the smiling Balaji (he appears to be among the wickets again), while RP Singh looks adequate. Malinga and Nannes appear to be streets ahead, and the Aussies are on international duty and haven't played yet. In terms of batting, Raina in particular looks awesome, and should be picked for Test cricket right away. Rohit Sharma showed that he can bat well anywhere in the world if he puts his mind to it, while Yusuf Pathan is only limited by the air density and the boundary distance. Dhoni will improve his batting, keeping, and captaincy skills from this experience and this will only help India in the long run. The rest of the young guns have been inconsistent and haven't made much of an impact so far.

The most embarrassing aspect of Indian cricket continues to be the fielding. We are unfit for 21st century sport! Ravi Jadeja and Raina are the stand-outs here. The BCCI needs to do some serious remedial work here.

Let's hope that as the IPL progresses, some of the Indian colts can put up their hands and shine, but for now, there are few test match prospects besides Ojha (spin) and Raina (batting).

A couple of other notes - The umpiring has been poor in a few games last week. As predicted earlier on this tab, the ego-ride for the Kolkata team-owner is coming close to its natural end. Hopefully he will sell it off asap to a better owner and walk off with the profits and stay away from cricket forever.

Wednesday, April 29, 2009

A tale of three T20 cricket leagues: ICL, IPL, and the APL

The BCCI has all of a sudden announced an amnesty for the ICL players from India, thereby opening the way for other cricket boards to do the same. This is a very interesting move since the BCCI does not do anything unless it sees money in it. Does the fact that the ICL is in the process of morphing into the American Premier League (APL), courtesy of a Mr. Jay Mir, alarm the IPL (i.e., the BCCI)? Reports indicate that a huge fraction of the ICL players have signed on to the APL, which can be a potential money spinner since it can bring T20 cricket to the U.S. The US cricket association itself was a fractured group for quite a while, and besides, it's a free country here. Unless BCCI offers some sops to us here, the APL may come to stay. U.S passion for security means that ICL players will feel much more comfortable playing here than in India (sadly).

While it's possible that a wolf has turned into a lamb, it will not be surprising if the change of heart is in fact related to the APL, and if so, the BCCI deserves to sweat it out, given the high-handed vengeful way in which decent cricketing folks associated with the ICL were shunned over the world. Perhaps it's payback time, and it will be interesting to see how the ICC and the BCCI sort this mess out.

Saturday, April 18, 2009

IPL cricket second edition - day one

Two of India's greatest batsmen showed their class while the other Indians generally struggled to cope with alien conditions in South Africa, where IPL-II is being staged. The contest is more even due to the nature of the wickets, so the cricket content is going to be more enjoyable than the first edition. Sachin and Dravid anchored successful batting efforts for Mumbai, and my home town, Bangalore, respectively. The interesting moment today was when Dhoni blundered - leaving Murali out of the 11 on a day when Harbhajan bowled really well to Hayden, Kumble took 5-5, and Warne showed his genius. Dravid's classy fifty was marked with his clearly pointing his bat at somebody in the crowd (hopefully asking Hoochman Mallya, the distasteful, pompous owner of his team to just shut up).

Other IPL news involves yet another self-serving vestigial coach, this time in the form of John Buchanan spouting some b.s about not one or two, but 5 captains. One assumes the remaining 6 are vice-captains (Kolkata socialism makes its impact :-). Gavaskar, never one to mince words, pointed it out, and J.B tried to rephrase. It's not that big a deal really. India has played with 5-odd (ex-) captains, while Pak in the 90s played with 6-8 of them. Dhoni routinely lets bowlers set fields.

Monday, April 6, 2009

number 182

No fielder (besides the keeper, of course) had taken more than 181 catches in the history of test cricket until yesterday. Rahul Dravid took a couple yesterday in Wellington, New Zealand to touch 183. All the more impressive considering that Mark Waugh held that record, and he was one outstanding fielder. The unbelievable catch that Waugh held to dismiss VVS Laxman (youtube) during India's run-chase in the Chennai Test of 2001 made me quite sick as an Indian cricket fan.

Hopefully he can shed his hangdog/stonewalling batting method that's crept in since 2006, quite unlike his 'gritty but positive' batting between 2001-06, both in ODIs and tests. Would love to see the real Dravid at least once before he retires...

Courtesy of 'The Hindu', here's a beautiful photo of the impending 182nd brick in the wall !

Wednesday, April 1, 2009

IPL Safari cop-out

The only good news about holding the cricket junk food league outside India is that the young Indian players waiting in the wings will gain exposure to conditions outside the subcontinent. This will definitely help Indian cricket in the long run. Otherwise, India caves in as usual. No surprise. Cave-in as a strategy hasn't worked for, um... about 1000 years now since Ghazni. Yeah, let's give it some more time, who knows.

Saturday, March 21, 2009

Setting the record straight

New Zealand was in fact India's final frontier as far as the era of bad tourists goes. Yesterday, India posted its first test cricket win in NZ in 33 years. With this win, India has won at least one test match on every major cricket playing nation's soil in the last 5-7 years.

Rewind to 2002 so that the plight of the Indian cricket fan can be put in perspective:
India has not won a test match in Pakistan ever
India has not won a test match in the West Indies since 1976
India has not won a test match in England since 1986
India has not won a test match in Australia since 1981
India has not won a test match in New Zealand since 1976
India has not won a test match in South Africa ever
India has not won a test match in Sri Lanka since 1994

By far, the most important victory in Indian cricket was over Pakistan in 2003-2004. It laid to rest several myths both on and off the field, none more notable than Sehwag's treble in the course of which the wizard Saqlain's 'teesra' also went for a six, ending a great career. Saqlain of course broke Indian hearts in that tragi-heroic run chase of '99 in Chennai, with Sachin finally getting that monkey off his back only just a couple of months ago with that dramatic 4th innings century/win against England. India-Pakistan matches tend to make or break some careers. A brief, but incomplete history here:

Zaheer Abbas and Co. terminated the Indian spin era - Bedi, Pras in 1978.
Miandad all but finished off Chetan Sharma's career with that six in 1986.
Wasim Akram ended Srikkanth's test career in 1989.
Sachin all but ended the careers of Akram/Younis after that 2003 world cup match.

Of course, there are several other lesser known players who were sacrificial lambs after a loss to Pakistan or India.