Dualnoise: test cricket

Sunday, July 14, 2013

Analytics and Cricket - XI: Using DRS Optimally

The Ashes
The first Ashes cricket test, which concluded earlier today in England, triggered this post.

(pic source link: guardian.co.uk)

This post is again related to the Decision Review System (DRS) that combines machine and human intelligence to support evidence-driven out/not-out decisions in cricket. The previous cricket-related post can be found here where a reader critiqued an earlier post on the false positives issue of DRS (that saw a lot of visits last week during this cricket match). It's now apparent that England won the match thanks in part to their superior use of DRS compared to Australia.

DRS
The DRS consists of 3 components:
1. The human (a team of umpires, cameramen, and hardware operators)

2. The hardware (hot-spot, slow-mo cameras, snick-o-meters, etc). These are the data gathering devices.

3. The Analytics and Software (ball-tracking and optimal trajectory prediction, aka 'Hawkeye' based system)

This gives us three separate (but possibly correlated) sources of error:
a. Operator error

b. Hardware error: Technology limitations (resolution, video frame rates, hardware sensitivity, etc.) may at times be inadequate for sporting action in which the ball travels as fast as 100 mph or spins at 2000 rpm.

c. Prediction algorithm error: Given such variations in sporting action, a forecast of the future trajectory of the ball is also subject to uncertainty.

(pic source link: dailymail.co.uk)

A smart user, after sufficient experience with the system, will be able to grasp the strengths and limitations of the system. In test cricket, a team is allowed no more than two unsuccessful DRS reviews per inning. Thus it is a scarce resource that must be cleverly used to maximize benefit. In fact, the DRS is an example of a situation where the use of a decision support system (DSS) itself involves decision-making under uncertainty using a meta optimization model.

Optimal Usage of DRS
There are several factors that dictate when the trigger must be pulled by a cricket captain to invoke DRS to try and overturn an on-field umpiring decision.
i. The probability of success (p)
ii. The incremental reward, given a successful review (R)
iii. The cost of an unsuccessful review = cost of status quo (set to 0, normalized)
iv. The expected future value of having no more than k reviews still available in the inventory, f(k) (increasing and concave in k, i.e., with diminishing marginal value, and with f(0) = 0).

Reviewing based only on (ii) is like a "Hail Mary" and banks on hope. On the other hand, paying exclusive attention to (i) may not be the best approach either, since it can result in a captain using up the reviews quickly, reducing the chances of taking advantage of the DRS later when "you need it the most". A captain who doesn't use DRS at all (or uses it too late to have an impact) leaves unclaimed reward on the table.

Probability Model
We'll start with a simple model. It's not perfect, or the best, but merely a good starting point for further negotiations.

The value of do-nothing = f(k).

The value of a DRS review = p[R + f(k)] + (1-p)[f(k-1)] = pR + pf(k) + (1-p)f(k-1).

It is beneficial to go for a review when:
pR > (1-p)[f(k) - f(k-1)] or

p/(1-p) > [f(k) - f(k-1)]/R

i.e., odds must be greater than marginal value of a review / marginal reward

In other words:
it is good to review when the odds of overturning the on-field decision exceed the ratio of the expected cost of losing a DRS review to the expected incremental reward.
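As a rough illustration, the rule above can be written as a small Python function. This is only a sketch of the simple model in this post: the function name and the numbers in the value table are made up for the example, and in practice p, R and f would all be judgment calls.

```python
def should_review(p, reward, f, k):
    """Return True when reviewing beats doing nothing under the simple model.

    p      : estimated probability that the review overturns the decision
    reward : incremental reward R of a successful review
    f      : function giving the future value f(k) of holding k reviews
    k      : reviews currently in hand (must be at least 1)
    """
    if k < 1:
        raise ValueError("no reviews left")
    value_do_nothing = f(k)
    value_review = p * (reward + f(k)) + (1 - p) * f(k - 1)
    return value_review > value_do_nothing


# Hypothetical value table: increasing, concave, f(0) = 0 (illustrative only).
FUTURE_VALUE = {0: 0.0, 1: 3.0, 2: 4.0}
f = FUTURE_VALUE.__getitem__

# Same fifty-fifty call and same reward, but a different number of reviews left.
print(should_review(0.5, 2.0, f, 2))  # marginal cost f(2) - f(1) = 1 is below R
print(should_review(0.5, 2.0, f, 1))  # marginal cost f(1) - f(0) = 3 is above R
```

With these made-up numbers the same call is worth reviewing with two reviews in hand but not with one, which is the effect of the concave f.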

Use Case: for fifty-fifty calls (p = 0.5) with a single DRS review in the inventory, the odds equal 1, so you would want to review only if you are convinced that the present reward is likely to exceed the cost of not having DRS for the remainder of the innings, i.e., R > f(1). For a fixed reward, the RHS increases steeply after the first unsuccessful review due to the concave f. To be really safe, you want to risk a second and final unsuccessful review only when you can trigger a truly game-changing decision that greatly increases the chances of winning the match.

In general, R need be neither a strictly increasing nor a strictly decreasing function of time. This is especially true in limited-overs cricket, where a game-changing event can occur very early in the game. However, in soccer, baseball, or basketball, R can be reasonably approximated as an increasing function of time, and it generally makes sense to save the review for the end-game. In any sporting event, including cricket, that is heading for a close finish, it may be beneficial to delay the use of a review.

In the Ashes test, Michael Clarke, the Australian captain, appeared to pay more attention to 'R' and less attention to 'f', and was left without recourse at a crucial stage, which hurt his team. On the other hand, the England skipper Alastair Cook delayed the use of DRS: the last wicket of a closely contested match fell when the game had reached a climax (R = R_max), and was DRS-induced. Thus, optimally delaying DRS involves constantly assessing and updating risk versus reward, and pulling the trigger when the odds are in your favor.

Analytical Decision Support Systems
A smart organization will be aware of the strengths, weaknesses, and value of DSS-based decisions. In some industries that are characterized by shrinking margins, even small incremental gains in market share or profitability using DSS can alter the competitive landscape of the market. This motivates an interesting question: if two firms employ the same decision analytics suite provided by the same vendor, does it necessarily cancel out? Or, like in cricket, can one firm do a better job of maximizing value from the DSS to gain a competitive advantage?

Updated July 17, 2013:
It turns out that wicketkeeper Matt Prior was instrumental in ensuring England's good DRS strategy. As we all know, a good prior saves your posterior during crunch time!

Sunday, October 28, 2012

Analytics and Cricket - IX : Book cricket v/s T20 cricket

Introduction
The previous post on cricket in this tab can be found here. A while ago, we discovered how long a game of snakes and ladders is expected to last. Calculating the duration of a book-cricket game appears to be relatively simpler. It's a two-player game that used to be popular among kids in India and requires a book (preferably a thick one with strong binding), a pencil, and a sheet of paper for scoring. A player opens a random page, and notes down the last digit of the page number on the left (even-numbered) page.

(image linked from krishcricket.com)

A page value of 'zero' indicates that the player is out, and an '8' indicates a single run (or a no-ball). The remaining possibilities in the sample space {2, 4, 6} are counted as runs scored off that 'ball'. A player keeps opening pages at random until they are out. Here's a sample inning from a simple simulation model of traditional book-cricket:
6, 2, 2, 1, 4, 4, 1, 2, 1, 2, 4, 2, 1, 4, 6, 6, 6, 4, 0
score is 58 off 19 balls
The counting process terminates when the first zero is encountered. Given this game structure, we try to answer two questions: What is the expected duration of a player's inning, and what the expected team total is (i.e., across 10 individual innings).

Conditional Probability Model
Assume a page is opened at random and the resultant page values are IID (uniform) random variables.

Let p(i) = probability of opening page with value i, where
p(i) = 0 if i is odd, and equals 0.2 otherwise.

D = E[Duration of a single inning]

S = E[Score accumulated over a single inning]

Conditioning on the value of the first page opened, and noting that the counting process resets for non-zero page values:
D = 1*0.2 + (1+D)*4 *0.2
⇒ D = 5.
Next, let us compute F, the E[score in a single delivery]:
F = 0.2*(0+2+4+6+1) = 2.6 runs per ball, which yields a healthy strike rate of 260 per 100 balls

S = FD = 13 runs per batsman, so we can expect a score of 130 runs in a traditional book-cricket team inning that lasts 50 balls on average.
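The arithmetic above is easy to check with exact fractions. The snippet below is just a sketch of the traditional model as described in this post (five equally likely even digits, '0' is out, '8' is one run); the variable names are mine.

```python
from fractions import Fraction

# Last digit of the page -> runs scored. '0' means out, '8' counts as one run.
RUNS = {0: 0, 2: 2, 4: 4, 6: 6, 8: 1}
P = Fraction(1, 5)  # each even last digit is assumed equally likely

p_out = P
# D = 1*p_out + (1 + D)*(1 - p_out)  rearranges to  D = 1 / p_out
D = 1 / p_out
# Expected runs off a single page turn
F = sum(P * r for r in RUNS.values())
# Expected score of one batsman
S = F * D

print("balls per batsman:", D)
print("runs per ball:", F)
print("runs per batsman:", S)
print("team total:", 10 * S, "off", 10 * D, "balls")
```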

Introduction of the Free-Hit
The International Cricket Council (ICC) added a free-hit rule to limited overs cricket in 2007. To approximate this rule, we assume that a page ending in '8' results in a no-ball (one run bonus, like before) that also results in a 'free hit' on the next delivery, so the player is not out even if the number of the next page opened ends in a zero. This change will make an innings last slightly longer, and the score a little higher. Here's a sample inning (a long one):
1, 6, 1, 0, 1, 4, 2, 2, 4, 6, 1, 6, 2, 4, 2, 4, 6, 6, 6, 4, 1, 1, 6, 0
score is 76 off 24 balls
Note that the batsman was "dismissed" off the 4th ball but cannot be ruled 'out' because it is a free-hit as a consequence of the previous delivery being a no-ball. The free-hit balls are the ones that immediately follow a '1' (a no-ball) in the sequence above.

D = 1*0.2 + (1+D)*0.2 + (1+D)*0.2 + (1+D)*0.2 + (1+d)*0.2
= 1.0 + 0.6D + 0.2d
where d = E[duration|previous ball was a no-ball]. By conditioning on the current ball:
d = (1 + d)*prob{current ball is a no-ball | previous ball was a no-ball} + (1+D)*prob{current ball is not a no-ball | previous ball was a no-ball}
= (1+d)*0.2 + (1+D) * 0.8
⇒ d = 1.25+D

⇒ D = 1 + 0.6D + 0.2(1.25+D)
⇒ D = 6.25

Under the free-hit rule, a team innings in book-cricket lasts 62.5 balls on average, which is 12.5 page turns more than the traditional format. A neat way to calculate S is based on the fact that the free-hit rule only increases the duration of an inning on average, but cannot alter the strike rate that is based on the IID page values, so S = 6.25 * 2.6 = 16.25. To confirm this, let us derive a value for S the hard way by conditioning on the various outcomes of the first page turn:
S = 0*0.2 + (S+2)*0.2 + (S+4)*0.2 + (S+6)*0.2 + (s+1)*0.2
= 2.6 +0.6S + 0.2s.
where s = E[score|current ball is a no-ball] and can be expressed by the following recurrence equation:
s = (1 + s)*prob{next ball is a no-ball} + (r+S)*prob{next ball is not a no-ball}, where
r = E[score in next ball | next ball is not a no-ball]
= 0.25*(0 + 2 + 4 + 6) = 3

Substituting for r, we can now express s in terms of S:

s = (1+s)*0.2 + (3+S) * 0.8
⇒ s = 3.25 + S

⇒ S = 2.6 + 0.6S + 0.2(3.25+S)
⇒ S = 16.25, as before.
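For anyone who wants to double-check the two pairs of recurrences (D with d, and S with s), here is a small sketch that solves them with exact fractions. It assumes the same book-cricket probabilities as above; the helper names gap_d and gap_s are mine and simply stand for d - D and s - S.

```python
from fractions import Fraction

p_out = Fraction(1, 5)  # page ends in '0'
p_nb = Fraction(1, 5)   # page ends in '8' (no-ball, next ball is a free hit)
F = Fraction(13, 5)     # expected runs per ball: 0.2*(0+2+4+6+1)

# Duration:  d = (1 + d)*p_nb + (1 + D)*(1 - p_nb)   gives   d - D = 1/(1 - p_nb)
gap_d = 1 / (1 - p_nb)
# D = p_out + (1 + D)*(1 - p_out - p_nb) + (1 + d)*p_nb, with d = D + gap_d
D = (1 + p_nb * gap_d) / p_out
d = D + gap_d

# Score: r = expected runs off a ball that is known not to be a no-ball
r = Fraction(0 + 2 + 4 + 6, 4)
# s = (1 + s)*p_nb + (r + S)*(1 - p_nb)   gives   s - S = r + p_nb/(1 - p_nb)
gap_s = r + p_nb / (1 - p_nb)
# S = F + (1 - p_out - p_nb)*S + p_nb*s, with s = S + gap_s
S = (F + p_nb * gap_s) / p_out
s = S + gap_s

print("D =", float(D), " d =", float(d))
print("S =", float(S), " s =", float(s))
```

Working in fractions avoids any floating-point doubt about whether 6.25 and 16.25 are exact.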

Under the free-hit rule, the average team total in book cricket is 162.5 runs (32.5 runs more than the total achieved in the traditional format). The average strike rate based on legal deliveries, i.e. excluding no-balls, is 162.5 * 100/(0.8*62.5) = 325 per 100 balls. A Java simulation program yielded the following results:
num trials is 10000000
average score per team innings is 162.422832
average balls per team innings is 62.477603
average legal balls per team innings is 49.981687
scoring rate per 100 legal balls is 324.9646855657353
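
The Java program itself is not reproduced here. As a stand-in, below is a minimal Python sketch of the same kind of simulation under the free-hit rule as modeled in this post. The function names, the seed, and the (much smaller) number of trials are my own choices, so the figures it prints will differ slightly from the ones above.

```python
import random

# Last digit of the page -> runs scored. '8' is a no-ball worth one run.
RUNS = {0: 0, 2: 2, 4: 4, 6: 6, 8: 1}


def simulate_innings(rng, free_hit=True):
    """Play one batsman's innings; return (runs, balls, legal_balls)."""
    runs = balls = legal = 0
    protected = False  # True when the current ball is a free hit
    while True:
        digit = rng.choice((0, 2, 4, 6, 8))
        balls += 1
        if digit != 8:
            legal += 1
        if digit == 0 and not protected:
            return runs, balls, legal
        runs += RUNS[digit]
        # A no-ball protects the batsman on the next delivery.
        protected = free_hit and digit == 8


def simulate_team(trials, seed=0, free_hit=True):
    """Average runs, balls and legal balls per ten-wicket team innings."""
    rng = random.Random(seed)
    totals = [0, 0, 0]
    for _ in range(trials * 10):
        for i, x in enumerate(simulate_innings(rng, free_hit)):
            totals[i] += x
    return tuple(t / trials for t in totals)


runs, balls, legal = simulate_team(20_000)
print(f"average score per team innings: {runs:.2f}")
print(f"average balls per team innings: {balls:.2f}")
print(f"average legal balls per team innings: {legal:.2f}")
print(f"scoring rate per 100 legal balls: {100 * runs / legal:.1f}")
```

Passing free_hit=False plays the traditional format, which should land near 130 runs off 50 balls.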

Result: In comparison to real-life T20 cricket (~ 120 balls max per team inning), book-cricket is roughly 50% shorter in duration, but the higher batting strike rate usually yields bigger team totals in book cricket. The fact that we can even rationally compare statistics between these two formats says something about the nature of T20 cricket!

The cost of front-foot no-balls and big wides in T20
We can use the simple conditional probability ideas used to analyze book-cricket to estimate the expected cost of bowling a front-foot no-ball and wide balls in real-life T20 matches by replacing the book-cricket probability model with a more realistic one:

Assume p[0] = 0.25, p[1] = 0.45, p[2] = 0.15, p[3] = 0.05, p[4] = 0.05, p[5] ~ 0, p[6] = 0.05, p[7, 8, ...] ~ 0.

E[score in a ball] = 0 + 0.45 + 0.3 + 0.15 + 0.2 + 0.3 =1.4

(This probability model yields a reasonable strike rate of 140 per 100 balls.)

E[cost | no ball] = 1 + 1.4 + 1.4 = 3.8

Bowling a front-foot no-ball in T20 matches is almost as bad as giving away a boundary (apart from paying the opportunity cost of having almost no chance of getting a wicket due to the no-ball and the subsequent free-hit). Similarly,
E[cost | wide-ball down the leg-side] = (5|wide and four byes)*prob{4 byes} + (1| wide but no byes)*prob{no byes} + 1.4.

Assuming a 50% chance of conceding 4 byes, the expected cost is 4.4. On average, a bowler may be marginally better off bowling a potential boundary ball (e.g., bad length) than risk an overly leg-side line that can result in 5 wides and a re-bowl.
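
The two cost estimates are simple enough to put in a few lines of Python, which makes it easy to try other assumptions. The run distribution and the 50% chance of four byes are the assumed values from this post, not measured data, and the function names are only illustrative.

```python
# Assumed per-ball run distribution from the text (probabilities sum to 1).
RUN_PROBS = {0: 0.25, 1: 0.45, 2: 0.15, 3: 0.05, 4: 0.05, 6: 0.05}

runs_per_ball = sum(r * p for r, p in RUN_PROBS.items())


def no_ball_cost(mean_runs):
    """1 penalty run + runs hit off the no-ball + runs off the free hit."""
    return 1 + mean_runs + mean_runs


def leg_side_wide_cost(mean_runs, p_four_byes=0.5):
    """5 if the wide also goes for four byes, else 1, plus the re-bowled ball."""
    return 5 * p_four_byes + 1 * (1 - p_four_byes) + mean_runs


print(f"runs per ball: {runs_per_ball:.2f}")
print(f"cost of a no-ball: {no_ball_cost(runs_per_ball):.2f}")
print(f"cost of a leg-side wide: {leg_side_wide_cost(runs_per_ball):.2f}")
```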

More sophisticated simulation models based on actual historical data can help analyze more realistic cricketing scenarios and support tactical decision making.

Wednesday, October 6, 2010

Video analysis - followup to previous post on "Should Steve Smith have gone for the run-out?"

Updated Oct 7: youtube video.

As can be seen from the video footage of the last few minutes of the test match, Steve Smith came incredibly close to settling the issue by taking the initiative in a moment of cricketing chaos. He was fielding at point, and fired in the throw at a pretty acute angle - a bold gamble. He brushed against legend but it ended up being India's greatest sporting victory. Another great article by Australian sports writer Peter Roebuck, can be read here.

Tuesday, October 5, 2010

OR and cricket: Should Steve Smith have gone for the run-out? - India versus Australia, 2010

Nearly a year ago, the mathletics blog of Dr. Wayne Winston had an interesting analysis of whether Pats coach Belichick should have gone for it on a critical 4th down, and put the coach's decision down to his confidence in Tom Brady. Yesterday, something similar (but far more serious) happened on the last day (D5) of one of the greatest test cricket matches in history.

The result
Check out the amazing scorecard. In the 130-plus-year history of test cricket, there have been only 12 such finishes.

The match situation
It is day 5 of the contest. About 26 hours of play time has passed and we are into the bottom of the 4th and final innings. India is batting and has lost 9 of their 10 wickets on a wicked last-day pitch with widening cracks and the ball spitting off the pitch. Many have gotten out to bouncers. Just an hour ago, India was down and out having lost 8 wickets with 92 runs still to get. A remarkable partnership between two injured players brings the match to a screaming knife edge. Australia requires 1 wicket to win. India needs 6 runs. At the crease is the injured Indian genius VVS Laxman overcoming back spasms with painkillers and with sheer grit, he is attempting to steal a miraculous win. But the one who is taking strike is P Ojha, a spin-bowler and India's no.11 player, who can't really bat. Bowling is Australian fast bowler Mitchell Johnson who can bowl past speeds of 95mph (there are others who can crank it up to 100mph).

The event
Here's cricinfo's description of the third-from-last ball of the match (edited version here):
Johnson to Ojha, 4 runs, 90.5 mph, Lbw Shout And oh boy what we get .. Four Over throws! That looked out. Was there some wood on leather? Oh well ... What an insane little game this is! .. Steve Smith fires the throw and the ball misses the stumps and runs through the vacant covers. No Aussie fielder could back that up. But that throw was on. Had he hit - and he didn't miss by much - Ojha would have been run out. .....

(India wins 2 balls later)

Post-mortem
Many cricket fans have criticized Steve Smith's decision to take a shy at the stumps. The Australian captain himself felt it was the right call and praised the rookie. If he had hit the wicket it would have been match over. Indeed several sports writers have called it a gutsy and worthy call. I know that if it was an Indian fielder instead, he would have been roasted by India's trigger-happy media.

Probability Model
Probability that Australia wins = A. Probability that India wins = 1-A.
There are two scenarios:
1. If the throw hits (probability p), the match is over: Australia win with probability 1.
2. If the throw misses (probability 1-p), there are two sub-outcomes:
2a. If a fielder is backing up, there are no overthrows, and Australia again win with probability A (reset). In this case the throw gives p + (1-p)A = A + p(1-A) >= A, so it can never hurt.
2b. If there is no protection, there are overthrows, and Australia win with probability A' < A.

If (2b) is given:
P(win given throw) = p + (1-p)A'
This is statistically a good decision provided p + (1-p)A' >= A. A simple condition where this holds true would be if he were such a good fielder that statistically he hits the wicket more than 100A % of the time, i.e. if he felt that his chance of hitting the wicket was at least as good as the chance that Australia currently has of winning the game (fixing p = A would result in the LHS being > A).

Let's play with some numbers
1. Let's assume that with every run scored, the chance of success for Australia proportionally drops, so the cost of each overthrow run is A/6. For a worst-case 4 overthrows, this gives A' = A-4A/6 = A/3

2. So Steve's accuracy rate had to satisfy p + (1-p)A/3 >= A, i.e.
p(1-A/3) >= 2A/3, or p >= 2A/(3-A)

3. If we assume that with just a single wicket to get, but only 6 runs to score, it's anybody's game, and set A = 0.5, then Steve's accuracy rate had to be at least 1/2.5 = 40%.

4. On the other hand, if you start with the premise that A is lower, say 25%, then Steve only required an 18% accuracy to justify the throw. In other words, if you perceive that you have little chance of winning, then it is certainly a great idea to take the risk.
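
To play with other values of A, the break-even accuracy can be wrapped in a short Python function. This is only a sketch of the model above: it assumes, as in point 1, that each overthrow run costs Australia a proportional share A/6 of its chances, and the function name and default arguments are mine.

```python
def required_accuracy(A, overthrow_runs=4, runs_needed=6):
    """Minimum hit probability p that makes the throw worthwhile.

    A is Australia's win probability before the throw. A miss with no
    back-up is assumed to cost overthrow_runs/runs_needed of that chance.
    """
    A_miss = A * (1 - overthrow_runs / runs_needed)  # this is A'
    # p + (1 - p)*A' >= A   rearranges to   p >= (A - A') / (1 - A')
    return (A - A_miss) / (1 - A_miss)


for A in (0.25, 0.5, 0.75):
    print(f"A = {A:.2f}: need p >= {required_accuracy(A):.3f}")
```

With the default four overthrows this reduces to 2A/(3-A), so it reproduces the 40% and 18% figures above.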

Modeling Ideas
You can also calculate 'A' in a more sophisticated manner by looking at the competing counting processes to determine P(wicket falls before 6 runs are scored). Each ball is a Bernoulli trial, so the number of balls until a wicket falls follows a geometric distribution. The distribution of runs per ball could be Poisson. India scored 6 runs in the previous 3 overs (18 balls), so purely statistically, you can extrapolate that to say there are about 18 balls to get the last wicket. All said and done, a 50-50 chance is the most practical choice for 'A' here.
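
Here is one way the competing-process idea could be sketched in Python. It is a simplification, not a fitted model: every ball is treated as independent, with a fixed wicket probability and a fixed run distribution, and the numbers in the example call are purely hypothetical placeholders. It also uses a small discrete run table rather than a Poisson.

```python
def prob_wicket_first(runs_needed, p_wicket, run_probs):
    """P(a wicket falls before `runs_needed` runs are scored).

    Each ball is independent: a wicket with probability p_wicket,
    otherwise k runs with probability run_probs[k].
    """
    stay = (1 - p_wicket) * run_probs.get(0, 0.0)  # dot ball: nothing changes
    A = [0.0] * (runs_needed + 1)  # A[0] = 0: the chase is already complete
    for r in range(1, runs_needed + 1):
        # Scoring k >= r runs finishes the chase, so only 0 < k < r matter here.
        later = sum(q * A[r - k] for k, q in run_probs.items() if 0 < k < r)
        A[r] = (p_wicket + (1 - p_wicket) * later) / (1 - stay)
    return A[runs_needed]


# Hypothetical inputs: 5% wicket chance per ball, made-up run distribution.
example_runs = {0: 0.6, 1: 0.3, 2: 0.05, 4: 0.05}
print(prob_wicket_first(6, 0.05, example_runs))
```

Dividing by (1 - stay) handles dot balls, which leave the situation unchanged and would otherwise make the recursion refer to itself.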

Conclusion
If Steve Smith was generally able to hit the stumps 40% of the time (i.e. slightly less than even chance), then it would have been a good call from a statistical point of view. I haven't gotten a chance to review the video to see if it was an easy or a difficult angle from which to hit the wicket, but on average, 40% does look like a fairly high required conversion rate.

Statistically it was not a great call, especially if he knew there was nobody to back up. All the pressure previously built up on the batsman was released. But cricket is played on the field, and making a match-defining split-second decision after 26 hours of exhilarating play is no easy ask. No guts, no glory. It's the Aussie way, and the glorious game of cricket is better off with that choice (especially since India won :-)

Tuesday, July 21, 2009

Test Cricket rises from the Ashes

After the heartbreaking sight of half-empty stadiums during the thoroughly exciting India-Australia test series in India in 2008, it is great to see the support in England for the Ashes. T20 has its place for the instant-fun factor, but test cricket is the real deal. Facing multiple spells of hostile bowling from 'Freddie' Flintoff at speeds up to 95mph, with the crowd against you, is quite a test, and a thrilling spectacle for everybody else. Test cricket at its best is a series of bone-jarring, uncompromising gladiatorial contests involving brute strength as well as subtle guile, waged within a chess-series-like intellectual campaign at the higher level, all of which is encoded using myriad gentlemanly rules of engagement, spread over five days. After all the hard work, you may not get a decisive result at the end of it.

The whites of the uniforms and the greens of the outfield dominate the view, but at its core, cricket is bloody red. More than in any other sport (golf comes close), the injuries to the psyche of a test cricketer are the hardest to recover from, as Greg Chappell said. If you get out, you are out of the test match for a couple of days before a second shot at redemption, if there is one at all. Indeed, test cricket is the most realistic reality show the world has invented. And yes, you can tool for hours watching it. Let's hope for many more riveting contests during the Ashes, and may Test cricket prosper.