Friday, December 11, 2009

Analytics in Sport - Cricket and Operations Research - 1

Analytics in cricket is not a new idea. A lot of OR folks, especially from the so-called Commonwealth nations, would be pleasantly surprised to learn that OR has been an integral part of cricket (in particular, the limited-overs versions) for the last 16-odd years. This is because of the official induction of the Duckworth-Lewis rules for weather-interrupted matches into the rulebook. Mr. Duckworth and Mr. Lewis are OR/Math guys. According to cricinfo, the latter is/was the chairman of the western branch of the Operational Research Society in the U.K. See this old article in ORMS Today on their work.

Cricket is more than a hundred years old, and is the second-most followed sport on this planet (thanks to more than a billion and a half cricket-mad fans from the Indian subcontinent, including this author). India is the No. 1 test cricket team in the world today (after 77 years of hapless performance), and cricket is now a big and growing business, what with all the professional leagues like the IPL springing up. Among all sports, cricket mirrors life the most, and its rules suitably reflect this. It's best been described as having a cleverly disguised gentlemanly exterior which hides a series of fierce one-on-one gladiatorial contests of blood, guts, and stamina, which is then wrapped in a chess-game of wits and strategy, and tied up with strings of psychological tactics of 'mental disintegration'.

While D/L rules prescribe revised targets for rain-affected limited-overs matches, these rules can also be used as a decision-support tool for teams to figure out an optimal trajectory to achieve the target during a run-chase. More generally, the run-chase problem can also be formulated as a (stochastic) Dynamic Programming problem. Constrained resources include batsmen available (10) and overs available (20 or 50, depending on whether it's a T20 game or a 50-over game), with the objective being to get at least one run more than the opposition before either of these resources is exhausted. At any stage of the game, the teams can tailor their tactics according to this optimal trajectory, which can be recalculated after every ball or over.
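To make the Dynamic Programming idea a bit more concrete, here is a minimal sketch in Python. It is not the D/L method. The state is (runs needed, balls left, wickets left), and on each ball the batting side picks whichever tactic gives the better chance of winning. The tactic names and per-ball outcome probabilities in `TACTICS` are made-up numbers for illustration only; a real model would estimate them from match data.

```python
from functools import lru_cache

# Hypothetical per-ball outcome probabilities for two batting tactics.
# Integer keys are runs scored off the ball; "W" means a wicket falls.
# These numbers are illustrative assumptions, not fitted to real data.
TACTICS = {
    "cautious":   {0: 0.45, 1: 0.40, 2: 0.06, 4: 0.06, 6: 0.01, "W": 0.02},
    "aggressive": {0: 0.35, 1: 0.30, 2: 0.08, 4: 0.14, 6: 0.06, "W": 0.07},
}


@lru_cache(maxsize=None)
def win_prob(runs_needed, balls_left, wickets_left):
    """Chance of completing the chase if the best tactic is chosen every ball."""
    if runs_needed <= 0:
        return 1.0  # target reached
    if balls_left == 0 or wickets_left == 0:
        return 0.0  # a resource ran out first
    return max(tactic_value(t, runs_needed, balls_left, wickets_left)
               for t in TACTICS)


def tactic_value(tactic, runs_needed, balls_left, wickets_left):
    """Win probability if `tactic` is used on this ball, then optimal play."""
    total = 0.0
    for outcome, p in TACTICS[tactic].items():
        if outcome == "W":
            total += p * win_prob(runs_needed, balls_left - 1, wickets_left - 1)
        else:
            remaining = max(0, runs_needed - outcome)
            total += p * win_prob(remaining, balls_left - 1, wickets_left)
    return total


def best_tactic(runs_needed, balls_left, wickets_left):
    """Tactic with the highest win probability in the current state."""
    return max(TACTICS,
               key=lambda t: tactic_value(t, runs_needed, balls_left, wickets_left))


if __name__ == "__main__":
    # Example state: 30 needed off the last 3 overs with 4 wickets in hand.
    print(best_tactic(30, 18, 4), round(win_prob(30, 18, 4), 3))
```

Calling `best_tactic` again after every ball with the updated state is the "recalculate the trajectory" step. A full 50-over chase has many more states, but the recursion is the same.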

Statistical tools that analyze batting and bowling performances, and that support stuff like SWOT analysis, have regularly been a part of cricket in recent times, much like baseball. A new idea proposed here is to analytically decide on how to make optimal use of the newly introduced review system, similar to the challenge system in tennis and perhaps the NFL. Statistically, cricket umpires tend to get 1 in 10 decisions wrong (very impressive given that there are 10 or more ways of getting out in cricket :-). At the highest skill-levels of cricket, i.e., country-versus-country test cricket that is played over 5 days, a bad decision can result in the team at the receiving end of this decision getting pummelled into a defensive position for a couple of days under the hot sun.

It is quite likely that a probability model can be built around this idea. For example, given that there are 10 wickets available in an inning, and assuming each decision is independent with a 0.9 chance of being right, a geometric probability model gives
prob(at least one error) = 1 - prob(getting all 10 right) = 1 - (0.9^10) ≈ 0.65.
So there is roughly a 2/3 chance that the average umpire will get at least one decision wrong in a completed inning, and this number can increase if the particular umpires in that match are known to be more error-prone. Given that a team is allowed only two unsuccessful reviews, the idea is to come up with a reviewing strategy that ensures that you only challenge decisions that maximize your team's expected advantage-level for the remainder of the game. If you use these challenges frivolously, you are left with no recourse later in an inning. If you do not use them at all, you are more than likely to suffer at least one bad decision per inning.

Interestingly, the International Cricket Council (ICC), which is the governing body of cricket, states that with the review system in place, the rate of correct decisions has improved to roughly 95%. Under the same model, this means that the probability of at least one error in a complete inning is reduced from 65% to 40% (1 - 0.95^10 ≈ 0.40), implying that it is statistically more likely that there will be no wrong decision in an inning with the review system in place, which allows an on-field umpire to change a decision based on evidence from video and audio footage (upon request from either team).
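The two figures quoted above (65% and 40%) are easy to check with a few lines of Python. The function name is just a label chosen here, and the calculation assumes, as the simple model does, that every decision is independent and that exactly 10 decisions are made in an inning.

```python
def prob_at_least_one_error(accuracy, decisions=10):
    """Chance of one or more wrong calls, assuming independent decisions."""
    return 1.0 - accuracy ** decisions


if __name__ == "__main__":
    # Without reviews (90% accuracy) and with reviews (roughly 95%).
    for accuracy in (0.90, 0.95):
        print(accuracy, round(prob_at_least_one_error(accuracy), 2))
```

Plugging in a lower accuracy for a known error-prone umpire shows how quickly the chance of at least one bad call climbs.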

Tennis players at grand-slam events tend to invoke this challenge during key times of the match ('big points'), where you are at a 'cliff', e.g., at break-point. In the past we have seen great players (McEnroe at a certain French Open comes to mind) who've mentally lost matches from winning positions because of what they perceived to be an unjust call. However, challenging only at these points may not be optimal, since a call at 30-30 that is highly likely to be wrong is also a good candidate for review. In other words, there is both a probability value as well as a consequence-cost (risk/reward) associated with this decision.
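One simple way to combine the probability and the consequence is an expected-value rule: challenge when the expected gain from overturning the call outweighs the expected cost of burning a review. The sketch below is only an illustration of that trade-off. The function and its three inputs are hypothetical, and putting numbers on "gain" and "value of keeping a review" is the hard part that a real model would have to solve.

```python
def should_review(p_wrong, gain_if_overturned, value_of_keeping_review):
    """Challenge if the expected gain beats the expected cost of a lost review.

    p_wrong: estimated chance the original call is wrong (0 to 1).
    gain_if_overturned: benefit to the team if the call is reversed.
    value_of_keeping_review: what an unused review is worth later on,
        lost only when the challenge fails.
    All three inputs are hypothetical quantities in the same arbitrary units.
    """
    expected_gain = p_wrong * gain_if_overturned
    expected_loss = (1.0 - p_wrong) * value_of_keeping_review
    return expected_gain > expected_loss


if __name__ == "__main__":
    # A likely bad call on a modest point can still be worth challenging...
    print(should_review(0.8, 1.0, 2.0))
    # ...while a long-shot challenge on a big point may not be.
    print(should_review(0.1, 10.0, 2.0))
```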

In most situations, these decisions have to be made on-the-spot, with limited external feedback available. Rather than rely wholly on instinct and emotions ("I'm sure I am not out"), it would be nice for the player under the spotlight to have some analytic ammo to go with bat or ball.

Having said all this, the great teams (like the ones from the Caribbean in the 80s) and great players like the late Donald Bradman (who is considered to be the greatest Australian who ever lived) and, currently, Sachin Tendulkar of India (who is a shoo-in for the greatest living Indian today) are the ones who raise their game to a higher level when faced with adversity, and overcome such bad calls. On the other hand, a little bit of O.R. could make this task a little less difficult.

- correction added: D/L rules have been used for nearly 16 years now.