Thursday, December 30, 2010

Analytics and Cricket - IV: The Great Indian Coin Toss

We'll end this year with the humblest of analytical models - the coin toss. It is an important benchmark. After all, if your predictive business model can consistently outperform a coin-toss approach, then that could be a big deal in many practical situations. So what do we make of the Indian cricket captain Mahendra Singh Dhoni's (MS for short) performance with the coin? He's lost 13 of the last 14 trials!

A coin toss can be a big deal in cricket, since a 'win' allows you to decide whether to bat or bowl first. A 'flat' wicket means it's a great one to bat on and make best use of it, and the opposition gets to play on the same pitch after potential wear and tear. A 'sticky wicket' or overcast conditions on a 'green' pitch means bowling first could be a great option since batting will be difficult for the first few hours due to the 'swing' and 'seam' movement potentially available to the bowlers.

Die-hard cricket fans like me and players are among the most superstitious in the world due to the long and complex nature of the game. MS gets blamed for "losing" the toss and he's even asked for tips on improving his record :) Useful analytical models are nice to have, but they could go horribly wrong, especially when applied to cricket ... Before the sports fan begins to question his faith in science and even doubt the fundamental idea of Bernoulli trials and the law of large numbers, we note that if MS had lost 14 tosses in a row, that would have been an extreme "achievement" since the probability of that happening would have been roughly 60 in a million, and that did not happen. Phew! that counts as favorable evidence.

With the India-South Africa cricket series tied at 1-1, and with one test match to go, we have no choice but to seek solace in the scientific estimate that our fearless captain still has a 50% chance of winning the toss in Cape Town. I know that doesn't sound encouraging. But there's got to be a point in time when nature is going to bring that win-loss average back close to 50%. Will that happen in 2011? who knows ...

In 2010, MS had several ways of losing 13 of the 14 tosses. More simply, he had 14 ways of winning exactly one toss. We know that the probability of winning m of n tosses follows the Binomial distribution, and we can find out online here that the chance of losing 13 out of 14 is still tiny, at 0.00085. In other words, the chance of him winning 2 or more tosses in 2010 was greater than 999/1000, and yet that did not happen!
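For readers who would rather check these numbers than trust an online calculator, here is a minimal Python sketch. It assumes a fair coin (p = 0.5) and independent tosses; the function name `binomial_pmf` is just an illustrative choice.

```python
from math import comb

def binomial_pmf(wins: int, tosses: int, p: float = 0.5) -> float:
    """Probability of exactly `wins` successes in `tosses` independent trials."""
    return comb(tosses, wins) * p**wins * (1 - p) ** (tosses - wins)

p_no_win = binomial_pmf(0, 14)        # lost all 14 tosses
p_one_win = binomial_pmf(1, 14)       # lost 13 of 14 tosses
p_two_or_more = 1 - (p_no_win + p_one_win)

print(f"P(0 wins in 14)  = {p_no_win:.6f}")
print(f"P(1 win in 14)   = {p_one_win:.5f}")
print(f"P(2+ wins in 14) = {p_two_or_more:.5f}")
```

The three printed values should line up with the "roughly 60 in a million", "0.00085" and "greater than 999/1000" figures quoted in this post.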

Like most great teams, this current Indian cricket team does not depend much on the outcome of the toss. Put into bat on a green, bouncy wicket under overcast conditions, they still managed to defeat RSA in 4 days and displayed amazing skill and resilience in the process. Still, it wouldn't hurt to begin the final match between No.1 in the world (India) and No.2 in the world (RSA), starting on Jan 2, 2011, by winning the coin-toss. If MS loses that toss, then the probability of this extended streak over 15 trials would be around 0.0004, i.e., 50% less than the already dismal number he is at today. Surely, that's unlikely, right? Let's see. What is the probability of the sequence that ends with him winning the toss on Jan 2, i.e., the chance that he wins exactly 1 of the first 14, and then wins the 15th? Sadly, that's not very different. Delving into the past does not help the Indian sports fan, and talking to statisticians would not help since none of them wants to see such a rare streak end :)
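The 15-toss comparison can be checked with a few lines of Python. This sketch assumes a fair coin and reads "the extended streak" as exactly one win in the first 14 tosses followed by a loss on the 15th. Under that reading (an assumption on my part) the two sequences come out equally likely, because the 15th toss is independent of the first 14. The variable names are illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)

# There are 14 orderings with exactly one win in the first 14 tosses.
one_win_in_14 = 14 * half**14

lose_15th = one_win_in_14 * half  # streak extended: a loss on toss 15
win_15th = one_win_in_14 * half   # streak ended: a win on toss 15

print(f"P(1 win in 14)            = {float(one_win_in_14):.6f}")
print(f"P(... then lose the 15th) = {float(lose_15th):.6f}")
print(f"P(... then win the 15th)  = {float(win_15th):.6f}")
print(f"P(win 15th | first 14)    = {float(win_15th / one_win_in_14)}")
```

Exact fractions are used so that the "50% less" claim shows up as an exact halving rather than a rounded one.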

It is better to look forward to the new year, where 2010 is done and dusted. We can say it again: MS has a 50% chance of winning the next toss, and relatively speaking, that looks so much more promising and simpler to comprehend.

Happy New Year and Go India!

Wednesday, December 8, 2010

The shortest path between OR jobs

Driving from my old job in the Burlington, MA area to Elmsford, NY (near my new job location at Yorktown Heights) took less than 3 hours. It seemed like a race-course full of caffeine-high jihadi drivers after all those leisurely strolls through the somnolent country roads of Maine. I got a newer GPS product (yet another Garmin) from an e-tailer. Having worked on retail pricing during the past four years, and this being a pre-Black Friday deal, I almost reflexively asked for a price match and sure enough - there was 60$ in savings to be had after pushing against some soft constraints. Since this was Garmin's latest version in the series it wasn't discounted on BF, so it turned out to be a pretty decent deal in the end.

The GPS product, on its short maiden voyage from MA to NY decided to take me through no fewer than four interstate highways: I-95, I-90, I-91, and I-84. The route seemed simpler on paper. I quickly realized that newer does not necessarily mean better. The re-routing algorithm is still ancient even though the newer one allegedly takes traffic congestion into account. To avoid extensive re-calculation of the shortest path in real time, the product continues to merely find the quickest way to get back to plan. This is an approach typically used in airline online crew recovery ops (even though fancier global optimization algorithms have been available on paper). In general, this is not a bad idea as long as you don't wander off deep into the reservation. Forcing a recalculation enables you to recover the faster (optimal?) route, and my ETA dropped by about 10 minutes. The newer version has an "EcoRoute" option that allows you to find minimal cost paths, in addition to the standard metrics based on distance and time. Looks like you can also plan a trip having multiple intermediate nodes. That looks like a nice TSP structure. An analysis of these new features makes for an interesting post on another day.
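To make the "back to plan" versus "recalculate" distinction concrete, here is a small Python sketch on a made-up road network. Everything in it is an assumption for illustration: the five nodes, the travel times in minutes, and the rule that "back to plan" means rejoining the planned route at the nearest planned node. It is not a description of what the Garmin firmware actually does.

```python
import heapq

def dijkstra(graph, source):
    """Shortest travel time from `source` to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical undirected network; weights are minutes.
edges = [("A", "B", 10), ("B", "C", 10), ("C", "D", 10), ("X", "B", 5), ("X", "D", 12)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w

plan = ["A", "B", "C", "D"]  # original route; the driver has strayed to X

# Minutes left along the plan from each planned node to the destination.
remaining, total = {}, 0
for i in range(len(plan) - 1, -1, -1):
    remaining[plan[i]] = total
    if i > 0:
        total += graph[plan[i - 1]][plan[i]]

dist_from_x = dijkstra(graph, "X")
rejoin = min(plan, key=lambda node: dist_from_x[node])  # nearest planned node
back_to_plan = dist_from_x[rejoin] + remaining[rejoin]
recalculated = dist_from_x[plan[-1]]

print(f"back to plan via {rejoin}: {back_to_plan} min, full recalculation: {recalculated} min")
```

In this toy case rejoining the plan costs 25 minutes against 12 for a fresh shortest path, which is the same flavor of ETA drop seen after forcing a recalculation.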

Wednesday, November 24, 2010

2G scam followup: True opportunity cost of misallocating scarce resources

Please see the most recent post for the preliminary analysis of this scam. This is a follow up tab posting. Per this article on rediff.com:

" .. The Comptroller & Auditor General has calculated in his official report that the exchequer lost the truly mind-boggling sum of Rs 176,645 crore (Rs 176.64 billion) .. "

So in case there was any well-intentioned doubt that the 1.76*10^12 number was cooked up, it is now very clear that this number is (sadly) official. Actually, I would expect the number to be even higher, when you compare the true opportunity cost (due to a miserably and deliberately bad mis-allocation) relative to the value of optimal allocation.

When we read about scams like this, we realize how important it is that solid OR models be built to perform exploratory studies and simulations be run prior to allocating almost priceless resources. The supreme court of India said that "the 2G scam puts all other scams [in the history of India] to shame". When so many in India are dying of starvation and are homeless, such giga-squandering of public money by a corrupt government is nothing short of a 'monetary holocaust'.

It must be made mandatory for governments and public organizations at any level to conduct an appropriate OR analysis before allocating any scarce resource that belongs to the public. If the government of India had funded an OR group to spend an exaggerated and gigantic (or microscopic if you compare with the final loss) sum of $10 million for an OR analytical study, it would have paid for itself many, many times over. Well-run OR projects typically cost much less while providing incredibly impressive value measured in terms of incremental-benefit/project-cost return ratios (read the Woolsey papers for more on this).

Side note
Statistically, #barkhagate is turning out to be the most continually tweeted phrase in virtual India. Ever. It is trending so hot, you can make a virtual omelet there. Social media is making its presence felt in a very real way wrt real world issues in the largest democracy in the world, and consequently, the manipulative mainstream English media in India that had so far closed ranks on this topic is now being forced to cover this critical news.

Monday, November 22, 2010

Measuring the impact of corruption via OR models - the 2G scam in India

The recent 2G spectrum scam in India has taken corruption to epic levels. Large-scale theft is now being expressed as a percentage of India's GDP for convenience of notation. The amount of taxpayer money siphoned off due to the nefarious actions of certain senior ministers that resulted in an inefficient (non system-optimal) resource allocation wrt the 2G spectrum is estimated at 1760000000000 Rupees (1 US$ ~ 45 Indian Rupees), or 1.76 Trillion Rupees. This seems to be a conservative estimate.

If we compare the value of the corrupt allocation with that of the true system-optimal allocation, I wonder if that loss estimate would be even higher?

This large rupee number is something one usually throws out wildly, except that in this case, it is shockingly close to fact. Furthermore, well-known award-winning cable-news journalists (marketed as fair and balanced) have been implicated by an angry public and audio-tapes have surfaced that seem to allegedly point to their dual role as information-sharing lobbyists, working as mediators between coalition partners of the government to ensure a cover-up, as well as scripting and stage-managing TV shows and news articles to alter public opinion. This has been dubbed 'barkhagate' on the Internet - yet another cliched 'gate' scandal, but this scandal makes Nixon look like an Eagle boy-scout. Twitter-istan is abuzz with #barkhagate.



Obama, during his recent visit to India, referred to the Indian Prime Minister as his 'Guru', partly due to the PM being an economics professor in a past life. Should he now be called the GGuru? He once was an admired man for pioneering India's economic reforms in the 1990s. Sadly, along with that has come scam after scam, and many in India get the feeling that the actual powerful core within the ruling coalition has this 80+ year old ex-professor set up as a fall guy for their series of epic embezzlements (2G is just the latest).

The fair bandwidth resource allocation problem is a very, very interesting OR challenge. Several cool mathematical models, including combinatorial auctions, along with clever Benders decomposition based solution approaches have been invented to solve the resultant discrete optimization formulation (e.g., winner determination problem).

So how does corruption impact such OR models? It is an important as well as an interesting question that deserves more formal attention. If corruption is modeled explicitly within a model, then efficiency, cost-minimization, and revenue maximization are no longer the real objectives. Shadow prices and reduced costs will be misleading. Objective function cost coefficients are inflated or discounted based on the intent of the scam. A machine's throughput may be far less than what shows up on paper due to its unknown, substandard quality. The data will be really messy. Ethics-driven regulations and their corresponding constraints will be missing. By definition, optimization algorithms seek out extreme values and push the envelope. Unethically used, such methods will help maximize corruption.

Dubious organizations may simply place the blame on OR models and the analytics, rather than on the crooked ones who misuse them. Like journalism, whose reputation largely lies in tatters, corruption in analytics will have a devastatingly negative impact on the public perception of mathematicians and OR folks who have won respect as truth-seekers. Once lost, such hard-earned goodwill is almost impossible to regain. As OR people, we have a responsibility, both natural and inherited, to maintain high ethical standards and actively seek the truth (or in OR practice, 'the best obtainable version of the truth' as Carl Bernstein would say). After all, the entire theory of optimization and duality is ultimately based on the notion of fairness and rationality. The insidious noise that undermines fair duality has to be recognized early enough, and must be filtered out.

A question will be posted on OR-exchange to initiate a discussion on this important topic.

Sunday, November 14, 2010

OR practice tip: find and eliminate unnecessary constraints

A great example to illustrate this vitally important piece of practical OR would be this classic 1980s movie scene from 'Police Academy':



Always knew that cutting all those redundant classes at St. Joseph's in the 80s to watch a projection of a linear "English" comedy in the relaxed atmosphere within the bounds of the adjacent Brigade Road theater in Bangalore, India served the dual purpose of preparing one for an OR career. Is OR great or what :-)

There are many practice instances where a customer has been 'blindly' following the constraint "because". Opening the eyes of your customer (especially the upper management) to this fact could be a huge value add on your part. Furthermore, when it comes to optimization models, such insight into the actual business problem sometimes enables us to bypass a strongly NP-Hard MIP and instead work with something relatively simpler, like an integer knapsack formulation.

Take this wonderful real-world example from the book "The Art of Innovation", where IDEO was redesigning a major medical instrument used on heart patients during balloon angioplasty. See the Google books excerpt here. The key observation here was that everybody assumed that the instrument was "supposed" to be operable by one hand. Why? Well, presumably because the old instrument makers marketed it as such, and after many years it became a "design constraint". By noting that the other hand of the operator was pretty much idle while firing up this instrument, the designers were able to eliminate this unnecessary constraint. This led to a much saner and user-friendly design that also helped eliminate the scary 'ratcheting' sound that used to come out of the older instrument as it booted up, which used to scare the gowns off heart patients! The new product eventually ended up as a win-win for both patients and therapists.

In practice, there are constraints, and then there are constraints.

Tuesday, November 9, 2010

MIP feasible completions and the wheel of fortune

First the 'talk of the town'. The miraculous wheel of fortune solution using a single letter.



Initial feedback suggests rigging etc, but to OR'ers this 'hole in one' should not come as a drastic surprise. After all, this occurrence is infrequent but not impossible. One can pose this puzzle as an MIP (or solve using constraint programming), using binary variables to represent letter-choices for every blank, along with additional constraints that ensure that words are selected from a dictionary, while also ensuring that the sentence is grammatically correct, and so on. Of course, it would be a rather unwieldy MIP and a pure generic-solver approach may take forever. However, by exploiting the language structure and leveraging our learning from prior experience with the kinds of phrases that typically show up in WOF, it should be possible to significantly reduce the number of combinations to be explicitly explored. Note that only one such feasible solution is the right answer to the original WOF puzzle and to guarantee this, one usually needs more letters to be exposed to break ties. Isn't that how the miracle workers at Bletchley Park cracked the Enigma codes in WW2 and began solving 'puzzles' fast enough to make that information usable?
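A toy version of the "feasible completions" idea fits in a few lines of Python. This is a sketch, not an MIP: it simply filters a dictionary by the constraints a WOF board imposes. The six-word dictionary, the board pattern, and the function names are all made up for illustration, and grammar constraints are ignored.

```python
def matches(pattern, word, called):
    """True if `word` is a feasible completion of `pattern` ('_' marks a blank)."""
    if len(pattern) != len(word):
        return False
    for shown, letter in zip(pattern, word):
        if shown == "_":
            # A blank cannot hide a letter that has already been called.
            if letter in called:
                return False
        elif shown != letter:
            return False
    return True

def feasible_completions(pattern, dictionary, called):
    return [word for word in dictionary if matches(pattern, word, called)]

toy_dictionary = ["LATE", "LAKE", "LANE", "LIKE", "TAKE", "LACE"]

# With only L and E exposed, five words remain feasible.
print(feasible_completions("L__E", toy_dictionary, {"L", "E"}))
# Exposing K breaks most of the ties; exposing A pins down a single answer.
print(feasible_completions("L_KE", toy_dictionary, {"L", "E", "K"}))
print(feasible_completions("LAKE", toy_dictionary, {"L", "E", "K", "A"}))
```

Each newly exposed letter acts as an extra constraint that shrinks the feasible set, which is why a one-letter solve needs either a lucky guess or strong prior knowledge of typical phrases.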

Real-world MIPs often have such hidden structures that our customers understand far better than pure OR types. I was constantly impressed that the resident real-time crew schedulers at United Airlines would routinely come up with remarkably good quality feasible crew pairings (partial schedules) at the drop of a hat that would make us OR PhDs look so stuffy! Such crew pairings have to satisfy a myriad of FAA and company-negotiated "nonlinear and non-convex" work rules to confirm feasibility.

Another important aspect of real world decision problems is that the line between feasibility and infeasibility is often blurred. For example, look at the snippet from this email I received recently. It's one of those that get forwarded around and comes back to haunt your inbox once every two years.

"
I cdnuolt blveiee that I cluod aulaclty uesdnatnrd what I was rdanieg. The phaonmneal pweor of the hmuan mnid, aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it dseno't mtaetr in what oerdr the ltteres in a word are, the olny iproamtnt tihng is that the frsit and last ltteer be in the rghit pclae. The rset can be a taotl mses and you can still raed it whotuit a pboerlm. This is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the word as a wlohe. Azanmig huh? Yaeh and I awlyas tghuhot slpeling was ipmorantt! If you can raed this forwrad it "

Posed as an MIP, almost every one of these partial solutions (words) is tragically infeasible, and yet, can be perfectly interpreted by the end user in real-time and combined into a wholly understandable paragraph.
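The "extra information" the reader uses can be mimicked crudely in Python. The sketch below relaxes exact spelling to a signature (first letter, sorted middle letters, last letter) and looks scrambled words up by that signature. The tiny vocabulary and the function names are assumptions for illustration; a real decoder would need a full dictionary and a way to break ties.

```python
from collections import defaultdict

def signature(word):
    """First letter + sorted interior letters + last letter."""
    w = word.lower()
    if len(w) <= 3:
        return w  # nothing to scramble in very short words
    return w[0] + "".join(sorted(w[1:-1])) + w[-1]

def build_index(vocabulary):
    index = defaultdict(list)
    for word in vocabulary:
        index[signature(word)].append(word)
    return index

def decode(text, index):
    decoded = []
    for token in text.split():
        candidates = index.get(signature(token), [])
        # Only substitute when the signature identifies a unique word.
        decoded.append(candidates[0] if len(candidates) == 1 else token)
    return " ".join(decoded)

vocabulary = ["the", "human", "mind", "does", "not", "read", "every", "letter"]
index = build_index(vocabulary)
print(decode("the huamn mnid deos not raed ervey lteter", index))
```

Every scrambled token is infeasible under an exact-match constraint, yet becomes trivially feasible once the constraint is relaxed to what the reader actually needs.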

The point is that many abstract MIPs may be hard to solve, but in almost all real life instances, there is plenty of additional information from the real world that typically helps us generate meaningful business answers via such math models. If we do it the right way, NP-Hardness rarely leads to a business issue. The combination of good OR skills, MIP solvers, and domain expertise can solve complex business decision problems in real-life quickly enough to let customers gain a tangible competitive advantage.

Monday, November 1, 2010

Power-point has no place in an analytics presentation

Most of us have heard about the 'paralysis by power point' in the US Army and how it has resulted in miscommunication and a lack of attention to detail. The display of statistics and results has become a scientific discipline in itself, and for us O.R./analytics practitioners, there is much to learn, and quickly.

Most of us in the world of O.R. run our optimization models, simulations and statistical programs and once we are done, we pay scant attention to how it is presented to an executive or non-technical audience. Boring and static charts, mind-numbing M x N matrices of numbers culled from spreadsheets accurate to the 3rd decimal place, all embedded within power-point slide after lifeless slide only serve to underwhelm the audience. Worse, it threatens to undo all the months of hard work we OR types have put in and undermine the cool results we obtained. The audience tends to shut down and fall asleep after the first couple of ppt slides. The art of the analytical presentation is by far the most neglected aspect at O.R. graduate programs, where unlike the real world, a PhD (candidate) only presents results to another PhD, and then mostly within the same department.

O.R. does not end with model building and numerical results. It ends only when we can de-mystify analytics so our customers can truly comprehend what all this means to them in the limited amount of time we have to make our case. Toward this, smart people are coming up with innovative ways of displaying data, results, and statistics. For example, you may not grasp what "4.3689 meters" really means, but if I told you "twice the height of Kareem Abdul-Jabbar", that would give you a better picture.

Let's look at three great examples of presentations of analytical and statistical content.

Exhibit One: Hans Rosling, founder of gapminder, doing a presentation that, in less than 20 minutes of power-packed slides and animation, gives the audience a fantastic and insightful overview of socio-economic and standard-of-living data for the world from the past (all the way from 1858) to the present. He then extrapolates this information to predict future economic prospects of key Asian countries (India, China, Japan) relative to the U.S. and the U.K. Watch it till the end. There is a wealth of useful information packed into each slide that integrates into a vivid narrative that is easy to understand. Within a few minutes, he has the audience eating out of his hand.



Exhibit 2: This is a simpler one that in a single picture shows the true size of Africa in a way that most of us immediately grasp. The 'relative size' approach again works well. As a side note, the way the different countries fit into the continent of Africa seems to be a great approximate solution to the corresponding non-convex set-packing problem!

Exhibit 3: The well known single chart of Napoleon's disastrous Russian campaign of 1812-1813. Recognized by many as the "best statistical graphic ever drawn". It tells you pretty much everything relevant to the topic. Now imagine the mayhem that would have been caused by using 57 power-point slides filled with numbers and separate charts for attrition, time, temperatures, geography, etc. to show this same thing.

Monday, October 25, 2010

Here we go again! Bees and TSP

First the original story today doing the rounds on the Internet on bees solving a TSP while traversing a sequence of flowers for nectar. Their approach resembles the ant-colony / swarm-optimization approach, and while this 'bee story' is truly astounding, the author further states that computers have to explore every possible path to select the best and that takes days. Obviously, they did not hear about Operations Research, and the fantastic work done on inventing efficient algorithms for solving gigantic and difficult TSP problems to provable (near-) global optimality in quick time. The fantastic book on TSP by Dr. Applegate, Dr. Bixby, et al. gives us a fascinating blow-by-blow account of the work on this topic from the TSP past to the present. You can even try out their Concorde solver. The state-of-the-art is truly amazing and based on decades of breakthrough ideas.

Yet, article after article that seem to come out every couple of years assumes that if a problem is NP-Hard, then it is almost "impossible to solve efficiently in practice" and enumeration is the only way. Nope. A pure application of software programming and hardware strategies certainly does not work or scale. OR leads the way.

An incredible amount of success in OR practice has come via successfully tackling precisely such large-scale and ultra-complex NP-Hard problems. Large-scale TSP-structured problems have been routinely solved in Supply-chain planning, vehicle routing, aircraft routing, etc. Not only are problems solved well and quickly, but we also know how well and how much improvement is still possible. These small optimality gaps matter. A TSP solution that is 1% closer to optimality may save a company another million dollars.
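Here is a small Python sketch of what "knowing the gap" means. It runs a nearest-neighbor heuristic on seven made-up points and compares the result with the true optimum found by brute force. The coordinates and function names are assumptions for illustration. Brute force only works here because the instance is tiny; serious solvers such as Concorde do not enumerate tours but prove bounds, and in practice the gap is measured against such a bound rather than against a known optimum.

```python
from itertools import permutations
from math import dist

def tour_length(order, points):
    """Length of the closed tour that visits `points` in the given order."""
    n = len(order)
    return sum(dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n))

def nearest_neighbor(points):
    """Greedy heuristic: always drive to the closest unvisited point."""
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: dist(points[last], points[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def brute_force_optimum(points):
    """Exhaustive search with the start fixed at point 0; tiny instances only."""
    rest = min(
        permutations(range(1, len(points))),
        key=lambda perm: tour_length((0,) + perm, points),
    )
    return [0, *rest]

# Hypothetical stop locations.
points = [(0, 0), (2, 6), (5, 1), (6, 5), (8, 2), (1, 3), (7, 7)]

heuristic_len = tour_length(nearest_neighbor(points), points)
optimal_len = tour_length(brute_force_optimum(points), points)
gap_pct = 100 * (heuristic_len - optimal_len) / optimal_len

print(f"heuristic: {heuristic_len:.2f}, optimal: {optimal_len:.2f}, gap: {gap_pct:.1f}%")
```

Without the second number, the first one tells you nothing about how much money is still on the table.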

Heuristics of unknown quality based on bees and ants are nice, but they don't really tell you if there is another solution that could save your company another 10%. And if you were to change your input parameters, it may return an entirely different solution that makes such approaches largely impractical for use within products. And adding a constraint that prohibits certain paths may well make such approaches useless. This Tab has discussed the practical problems with using such heuristics in previous posts here, and you will find more in Dr. Trick's OR Blog when TSP was "solved" last year by creatures with even smaller brains, i.e. bacteria :-)

There's even TSP art. Check this out as well. Let's give O.R. and humans some credit please!

Saturday, October 23, 2010

Maximizing Dignity: The Traveling Angel Problem

Can one person feed 400 hungry mouths in the temple city of Madurai in South India, every day of the year?

Mr. Krishnan is a good human being. In today's world, we have to start looking for those first before we start looking for heroes. I was happy to see that he was nominated for the "CNN hero of the year" award for this work. Hopefully this publicity will translate into more help for his operation. The talented Mr. K, at the age of 22, was all set to be a five-star chef in Switzerland seven years ago, but chucked all that when he saw a hungry old man in his home town doing the unthinkable. Since then, without taking a single day off, he is up at 4am to cook tasty and fresh vegetarian food. He then drives around the town of Madurai (most famous for its amazing Meenakshi Amman Kovil - the temple of a 1000 pillars) feeding the destitute and the hungry. He often hand-feeds them and gives them haircuts along with a meal. His focus is on restoring the dignity of a human being. Is that not fundamental? Simply put, in this temple city, Krishnan is the Goddess Annapoorani to these castaway people. Contribute generously to the Sakthi foundation's Akshaya trust if you can (paypal accepted too. cool). Even a small amount goes a long way.



O.R. is the far less important part here, but one wonders how this amazing person chooses his 125 mile driving route? He has a limited quantity of freshly prepared food stocked in his van that has to reach 400 desperately hungry mouths all over his city as effectively and efficiently as possible. OR should be helping in such practical situations. We can't always be for and about profit-maximization models for corporations, weapons and target acquisition for the military, or abstract academic problems. The logistics of Mr. K's 'meals on wheels' operation contains a classical vehicle routing / traveling salesman problem element. A more efficient operation and utilization of scarce resources means that more people can be fed and more shattered lives can be rebuilt and dignity restored. The google map of the city of Madurai is shown below.



Who will optimize Krishnan's supply chain?

Monday, October 18, 2010

Chakravala - Decision Analytics in Ancient India

More than two thousand years ago, many priests in India had to double up as mathematicians. They practiced the Sanathana Dharma, 'the eternal way of ethical living' (or Hinduism as it is popularly known today). Hinduism is richly influenced by nature, as well as the earthly and celestial elements, and fire rituals were quite important in those times. Careful attention was paid to the geometry of the altar, since different shapes were required depending on the objective of the ritual. This naturally gave rise to analytics, and a 'textbook' in those days (800-200 BCE) was the 'Sulba Sutras' to help figure out the correct angles and lengths to optimally design them (that led to the discovery of Pythagorean triples and trigonometry, among other things). Inevitably, the beautiful natural patterns inherent in numbers awoke the inner geek in some of these priests.

Among the many famous mathematicians who carried forward this rich Vedic tradition was an astronomer named Brahmagupta (~ 600 CE). He extensively explored solutions to linear Diophantine equations that are central to integer programming today. Of course, his bigger distinction is for 'much ado about nothing'. He is known to be the first human being to clearly define, publish, and use zero as a number! He also explored Diophantine equations of the second degree, and came up with ideas that led to a recursive, iterative solution method for such equations; again something that is very useful in modern numerical optimization. He generalized an idea discovered by Diophantus and used this to achieve some success in finding solutions to Pell's equation. This was generalized to the Chakravala (Sanskrit for 'cyclic') algorithm by Jayadeva (950CE) and Bhaskara II (1100CE). A key subroutine at an iteration involves a neat rational scaling operation, followed by the solving of a simple discrete optimization problem that finds an integer m, such that it minimizes |m^2 - N|/k, where N and k are input parameters. Furthermore, they recognize that such problems have degenerate solutions, and in some cases, find minimal integer feasible solutions using this approach. This attention to detail toward handling numerical issues stands out. Recognizing and tackling degeneracy is at the very heart of modern decision-analytics practice!

The Chakravala turns out to be an easy-to-use iterative method to find good approximations for square roots of integers. Indeed, this approach has been recognized for its ingenuity and 'careful simplicity' that allows us to work with well-conditioned real numbers; ideas that we in the OR community know are critical to matrix refactoring within a successful dual simplex implementation, for example.
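For the curious, here is a compact Python sketch of the Chakravala iteration for Pell's equation x^2 - N*y^2 = 1. It follows the usual textbook statement of the method rather than any particular historical source: keep a triple (a, b, k) with a^2 - N*b^2 = k, pick a positive integer m such that k divides a + b*m and |m^2 - N| is as small as possible, then rescale. The function name, the starting triple, and the way m is searched for are my own choices.

```python
from math import isqrt

def chakravala(N):
    """Return (x, y) with x*x - N*y*y == 1 for a non-square positive integer N."""
    root = isqrt(N)
    if root * root == N:
        raise ValueError("N must not be a perfect square")

    # Start from a^2 - N*1^2 = k with a as close to sqrt(N) as possible.
    a = root if abs(root * root - N) <= abs((root + 1) ** 2 - N) else root + 1
    b = 1
    k = a * a - N

    while k != 1:
        abs_k = abs(k)
        # Residue class of m modulo |k| that makes (a + b*m) divisible by k.
        m0 = next(m for m in range(abs_k) if (a + b * m) % abs_k == 0)
        # The best m is one of the two members of that class flanking sqrt(N).
        m_low = root - ((root - m0) % abs_k)
        m_high = m_low + abs_k
        if m_low > 0 and abs(m_low**2 - N) <= abs(m_high**2 - N):
            m = m_low
        else:
            m = m_high
        # Rational scaling step; every division below is exact.
        a, b, k = (a * m + N * b) // abs_k, (a + b * m) // abs_k, (m * m - N) // k

    return a, b

x, y = chakravala(61)
print(x, y, x * x - 61 * y * y)
print(x / y, 61 ** 0.5)  # x/y approximates sqrt(61)
```

N = 61 is the classic stress test: the smallest solution has ten-digit x and nine-digit y, yet the loop reaches it in a handful of iterations, and the ratio x/y agrees with the square root of 61 to many decimal places.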

Tuesday, October 12, 2010

Popular Mechanics and the Frankenstein design principle

Popular Mechanics is a favorite magazine for the sheer variety of mechanical gizmos it throws up inside its pages (check out the Jaguar supercar). It is a permanent fixture in many auto-service shop waiting rooms where we spend a good part of a lifetime. The October 2010 issue of Popular Mechanics was a pleasant surprise - not one but two OR-related topics were covered. The first one was a short article on optimizing waiting experience in queues. The metriclastic US population (is that a new word?) that shuns neat divisions by 10, rightly resents having to spell Q using 5 letters, one of which is Q, and simply calls it a 'line'. On the other hand, 'line theory' is not too informative. Pop-mech doesn't mention OR explicitly, but we know that queuing theory and OR are inseparable.

The queuing article (online version here), among other things, mentions that a smart researcher in Taiwan, Pen-Yuan Liao, derived an equation to compute a 'Balking Index' that tells you when and how many customers are likely to flee a long line and 'defect' to a better one in a multi-Q system. Obviously, information like this helps determine optimal staffing levels to meet the required service levels, minimize costs, and improve the customer experience. Apart from just analyzing people standing in line, queuing models have many diverse applications and are a whole field of study. There are also some expert comments in the magazine article by Dr. Richard Larson from MIT. His name would be familiar to the OR fraternity.
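To give a flavor of how queuing models feed staffing decisions, here is a minimal Python sketch based on the standard Erlang C formula for an M/M/c queue. To be clear, this is not Liao's Balking Index, which the article does not spell out. It is the textbook model, with no balking at all, and the arrival rate, service rate and service target are made-up numbers.

```python
from math import factorial

def erlang_c(servers, offered_load):
    """Probability that an arriving customer must wait in an M/M/c queue."""
    if offered_load >= servers:
        return 1.0  # unstable: the line grows without bound
    top = (offered_load**servers / factorial(servers)) * servers / (servers - offered_load)
    bottom = sum(offered_load**k / factorial(k) for k in range(servers)) + top
    return top / bottom

def min_servers(arrival_rate, service_rate, max_wait_prob):
    """Smallest number of servers keeping P(wait) at or below the target."""
    offered_load = arrival_rate / service_rate
    servers = 1
    while erlang_c(servers, offered_load) > max_wait_prob:
        servers += 1
    return servers

# Hypothetical checkout: 40 customers/hour arrive, each clerk serves 12/hour,
# and we want at most 20% of customers to wait at all.
staff = min_servers(arrival_rate=40, service_rate=12, max_wait_prob=0.2)
print(staff, round(erlang_c(staff, 40 / 12), 3))
```

A balking model would go one step further and let the arrival stream itself shrink as the line gets longer, which is presumably where an index like Liao's comes in.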

The second article talks about risk management in the context of the Gulf-Coast oil spill. This tab's summary of the article from an OR perspective is this: When it comes to designing and operating complex systems, there needs to be a greater emphasis on managing conditional expectations associated with low-probability high-consequence events. This is in addition to tracking traditional risk metrics (minimizing expected cost, probability of failure, etc), i.e. we should be tracking multiple risk objectives, something I recall working on many years ago as a grad student.
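A tiny numerical illustration of why the conditional expectation deserves its own dashboard entry, in Python. The three loss scenarios and their probabilities are entirely made up, as is the threshold that defines a "catastrophe".

```python
# Hypothetical annual loss scenarios: (probability, loss in $ millions).
scenarios = [(0.989, 1.0), (0.010, 50.0), (0.001, 20000.0)]

def expected_loss(scenarios):
    return sum(p * loss for p, loss in scenarios)

def conditional_expected_loss(scenarios, threshold):
    """Expected loss given that the loss is at least `threshold`."""
    tail = [(p, loss) for p, loss in scenarios if loss >= threshold]
    tail_prob = sum(p for p, _ in tail)
    return sum(p * loss for p, loss in tail) / tail_prob, tail_prob

mean = expected_loss(scenarios)
tail_mean, tail_prob = conditional_expected_loss(scenarios, threshold=1000.0)

print(f"expected loss:              {mean:.3f}")
print(f"P(catastrophe):             {tail_prob:.3f}")
print(f"expected loss | catastrophe: {tail_mean:.0f}")
```

The expected loss of about 21.5 looks manageable, and the failure probability of one in a thousand looks reassuring, yet the loss conditional on the rare event is roughly a thousand times the average. Tracking only the first two numbers hides the third.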

The Frankenstein principle
Dr. Petroski, a civil engineering professor at Duke University, is quoted in the article: "When you have a robust system, you tend to relax". And it's proven to be true, sadly. BP continually pushed the risk-envelope to boost profits without paying adequate attention to the associated safety trade-off. In part, this was due to a false sense of safety in a historically robust system. There's always a first time, and shockingly, there was no plan 'B' in the event of a catastrophic failure. Bhopal, Chernobyl, Gulf coast and a few more like these have occurred in just the last 30 years. BP engineers were left having to prove that components would fail rather than answer the question "is it safe to operate?" - two completely different propositions. This practice of hastening project completion by placing an unfair burden of proof on the scientists and engineers may be widespread.

The Frankenstein design principle extends Murphy's law. It simply states that if for some crazy reason, you want to build something monstrously complex, then at least design it assuming apriori that at some point it will fall apart and come back to snack on your aposteriori.

Wednesday, October 6, 2010

Video analysis - followup to previous post on "Should Steve Smith have gone for the run-out?"

Updated Oct 7: youtube video.

As can be seen from the video footage of the last few minutes of the test match, Steve Smith came incredibly close to settling the issue by taking the initiative in a moment of cricketing chaos. He was fielding at point, and fired in the throw at a pretty acute angle - a bold gamble. He brushed against legend but it ended up being India's greatest sporting victory. Another great article by Australian sports writer Peter Roebuck, can be read here.


Tuesday, October 5, 2010

OR and cricket: Should Steve Smith have gone for the run-out? - India versus Australia, 2010

Nearly a year ago, the mathletics blog of Dr. Wayne Winston had an interesting analysis of whether Pats coach Belichick should have gone for it on a critical 4th down, and put the coach's decision down to his confidence in Tom Brady. Yesterday, something similar (but far more serious) happened on the last day (D5) of one of the greatest test cricket matches in history.

The result
Check out the amazing scorecard. In the 120-year history of test cricket, there have been only 12 such finishes.

The match situation
It is day 5 of the contest. About 26 hours of play time has passed and we are into the bottom of the 4th and final innings. India is batting and has lost 9 of their 10 wickets on a wicked last-day pitch with widening cracks and the ball spitting off the pitch. Many have gotten out to bouncers. Just an hour ago, India was down and out having lost 8 wickets with 92 runs still to get. A remarkable partnership between two injured players brings the match to a screaming knife edge. Australia requires 1 wicket to win. India needs 6 runs. At the crease is the injured Indian genius VVS Laxman overcoming back spasms with painkillers and with sheer grit, he is attempting to steal a miraculous win. But the one who is taking strike is P Ojha, a spin-bowler and India's no.11 player, who can't really bat. Bowling is Australian fast bowler Mitchell Johnson who can bowl past speeds of 95mph (there are others who can crank it up to 100mph).

The event
Here's cricinfo's description of the third-from-last ball of the match (edited version here):
Johnson to Ojha, 4 runs, 90.5 mph, Lbw Shout And oh boy what we get .. Four Over throws! That looked out. Was there some wood on leather? Oh well ... What an insane little game this is! .. Steve Smith fires the throw and the ball misses the stumps and runs through the vacant covers. No Aussie fielder could back that up. But that throw was on. Had he hit - and he didn't miss by much - Ojha would have been run out. .....

(India wins 2 balls later)

Post-mortem
Many cricket fans have criticized Steve Smith's decision to take a shy at the stumps. The Australian captain himself felt it was the right call and praised the rookie. If he had hit the wicket it would have been match over. Indeed several sports writers have called it a gutsy and worthy call. I know that if it was an Indian fielder instead, he would have been roasted by India's trigger-happy media.

Probability Model
Probability that Australia wins = A. Probability that India wins = 1-A.
There are two scenarios:
1. If successful hit (probability p), then match over, Australia win with probability 1.
2. If miss (probability 1-p) then there are two sub-outcomes:
2a. if fielder backing up, no overthrows, and Australia win again with probability A (reset).
2b. if no protection, then overthrows, and Australia win with probability A' < A.

If (2a) is given:
p(win given throw) = p + (1-p)A = A + p(1-A) > A, so the throw costs nothing.

If (2b) is given:
p(win given throw) = p + (1-p)A'
This is statistically a good decision provided p + (1-p)A' >= A. A simple sufficient condition is that he is such a good fielder that, statistically, he hits the wicket at least 100A % of the time, i.e. his chance of hitting the wicket is at least as good as the chance that Australia currently has of winning the game (setting p = A makes the LHS equal to A + (1-A)A', which is at least A).

Let's play with some numbers
1. Let's assume that with every run scored, the chance of success for Australia proportionally drops, so the cost of each overthrow run is A/6. For a worst-case 4 overthrows, this gives A' = A-4A/6 = A/3

2. So Steve's accuracy rate had to satisfy:
p + (1-p)A/3 >= A, i.e. p(1-A/3) >= 2A/3, or p >= 2A/(3-A)

3. If we assume that with just a single wicket to get, but only 6 runs to score, it's anybody's game, and set A = 0.5, then Steve's accuracy rate had to be at least 1/2.5 = 40%.

4. On the other hand, if you start with the premise that A is lower, say 25%, then Steve only required an 18% accuracy to justify the throw. In other words, if you perceive that you have little chance of winning, then it is certainly a great idea to take the risk.
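
The break-even arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name min_accuracy and its parameters are made up for illustration, and it keeps the assumption from step 1 that each overthrow run costs Australia A/6 of its winning chance.

```python
def min_accuracy(A, overthrows=4, runs_needed=6):
    """Smallest hit probability p that satisfies p + (1 - p) * A' >= A.

    Assumes the linear penalty from step 1:
    A' = A * (1 - overthrows / runs_needed).
    """
    A_prime = A * (1 - overthrows / runs_needed)
    # p + (1 - p) * A' >= A  rearranges to  p >= (A - A') / (1 - A')
    return (A - A_prime) / (1 - A_prime)


for A in (0.5, 0.25):
    print(f"A = {A:.2f}: required accuracy >= {min_accuracy(A):.1%}")
```

With the default 4 overthrows and 6 runs needed, this reduces to p >= 2A/(3-A), giving 40% for A = 0.5 and about 18% for A = 0.25. Changing the overthrows argument shows how much cheaper the gamble gets if only a single is conceded.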

Modeling Ideas
You can also calculate 'A' in a more sophisticated manner by looking at the two competing counting processes to determine P(wicket falls before 6 runs are scored). Each ball is a Bernoulli trial, so the number of balls until a wicket falls follows a geometric distribution. The distribution of runs per ball could be Poisson. India scored 6 runs in the previous 3 overs (18 balls), so purely statistically, you can extrapolate that to say Australia has about 18 balls to get the last wicket. All said and done, a 50-50 chance is the most practical choice for 'A' here.
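
One way to make this idea concrete is a small recursion on the number of runs India still needs. The sketch below is an illustration under loud assumptions, not a calibrated model: every ball independently takes a wicket with a fixed probability, a non-wicket ball yields a Poisson number of runs, there is no limit on the number of balls, and the 1-in-18 wicket chance per ball is simply assumed. The function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """P(K = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def prob_wicket_first(runs_needed, wicket_prob, mean_runs):
    """P(a wicket falls before `runs_needed` runs are scored).

    Hypothetical model: each ball independently produces a wicket with
    probability `wicket_prob`; otherwise it yields Poisson(mean_runs) runs.
    There is no limit on the number of balls.
    """
    p0 = poisson_pmf(0, mean_runs)
    # f[r]: chance the bowling side wins when r runs are still needed; f[0] = 0
    f = [0.0] * (runs_needed + 1)
    for r in range(1, runs_needed + 1):
        # Balls scoring k >= r runs finish the chase and contribute nothing.
        scoring = sum(poisson_pmf(k, mean_runs) * f[r - k] for k in range(1, r))
        # Dot balls leave the state unchanged, hence the division.
        f[r] = (wicket_prob + (1 - wicket_prob) * scoring) / (1 - (1 - wicket_prob) * p0)
    return f[runs_needed]


# India scored 6 runs off the previous 18 balls, i.e. 1/3 run per ball.
# The 1-in-18 wicket chance per ball is an assumed value, not a measured one.
A = prob_wicket_first(runs_needed=6, wicket_prob=1 / 18, mean_runs=6 / 18)
print(f"Estimated A = {A:.2f}")
```

Varying wicket_prob and mean_runs gives a feel for how sensitive 'A' is to these guesses, which is one reason a flat 50-50 remains a defensible practical choice.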

Conclusion
If Steve Smith was generally able to hit the stumps 40% of the time (i.e. slightly less than even chance), then it would have been a good call from a statistical point of view. I haven't gotten a chance to review the video to see if it was an easy versus difficult angle to hit the wicket, but on average, 40% does look like a fairly high required conversion rate.

Statistically it was not a great call especially if he knew there was nobody to back up. All the pressure previously built up on the batsman was released. But cricket is played on the field and making a match-defining split-second decision after 26 hours of exhilarating play is no easy ask. No guts, no glory. It's the Aussie way and the glorious game of cricket is better off with that choice, (especially since India won :-)

Monday, October 4, 2010

Is this umbrella optimally designed?

Building retail pricing products for a living, I spend an inordinate amount of time analyzing online deals. And you get to see lots of innovative new gizmos on sale. The umbrella shown in the video below was on sale and caught my attention.


It was introduced in the market a couple of years ago and has won multiple design awards. It's aerodynamic and looks like a bicycle helmet extended into an umbrella. I checked out the wind tunnel tests - impressive. It doesn't turn inside out. It's got an eye-protective design built-in and also provides better frontal vision. If you are a geek, you'd feel that you would never want to be caught under the old dome again. It is well-marketed. Check out these product stress test videos. I mean the 300 Spartans could have survived using the Senz as a shield.

Let's apply some practical OR tests.

First look at the actual human feedback at Amazon.com. There aren't too many data points sadly. Some like it and some don't. Next we look at the analytics hidden inside the design of the umbrella. We also examine its external functionality. A really nice analytical innovation allows the umbrella to not fight the wind but mildly orient itself along the path of least resistance. Several stress tests empirically prove this.

Conclusion: Handles wind really well and better vision, so you don't feel like you are in a tent.

However, there's something missing in these tests. Let us turn to robust optimization analysis. All the talk is about gale force winds, wind tunnels, etc. What about some simple vertical rain at terminal velocity? How would it perform in Cherrapunji or the nearby town of Mawsynram, in the Indian state of Meghalaya, which received 1000 inches of rain in 1985? How does it handle those good old pesky, incessant drizzles? How about sleet and randomized rain loads? How about the inevitable sidewalk drenching from the tires of a car on a wet road?

Conclusion - It appears that it does not perform the basic, everyday function of an "umbrella for a rainy day" as robustly as it tackles wind. There are a couple of reviews on Amazon which highlight this basic limitation.

Question: In a randomized rain situation, can I run in a straight line with an aerodynamic umbrella like the Senz to introduce a favorable directional bias in the forces and improve its rain-performance?

Note that when it is steadily raining, and without an umbrella of any kind, the amount of water that your body/dress absorbs is practically the same, whether you walk or run to cover a fixed distance.

In many places in the world, people regularly use an umbrella to block out the hot sun. Since the total area of the Senz is smaller than that of a traditional umbrella, it covers less.

Conclusion: less protection against the sun.

Finally, the most interesting comment came from a person who said two people couldn't fit under this umbrella. It's going to be a disastrous end to a date when your partner is under your aerodynamic umbrella and getting all wet in the rain.

Final conclusion: The old dome is on average, more robust for most normal rain/sun/slush situations, and is much less expensive. The inner geek will compel you to get a Senz for the cool wind analytics. Or if you are some weather channel tv journo who shows up in hurricane/tornado hit areas. But guys, don't take the Senz with your girlfriend on a rainy day, and certainly not your wife (and most certainly not both - under any umbrella. sorry, couldn't resist).

Thursday, September 30, 2010

An OR analysis of the Ayodhya Temple-Mosque Verdict in India

Q. What happened in India earlier today when three judges delivered a judgment on a 60-year old court case associated with the mother of all controversial religious property disputes, with roughly a billion people anxiously waiting to see if justice was going to be done?
A. Nothing.

It doesn't sound like a big deal to folks outside India, but roughly 20 years ago, 2000+ people died in riots related to this very dispute when one group tried to take matters into their own hands. This dispute probably goes back to around 1528, a couple of years after Babur, a chieftain from Uzbekistan, won a battle at Panipat in Northern India to found the well-known Mughal dynasty. It is probable that one of his generals destroyed a Hindu temple and built a mosque. We can't be 100% sure, but there are several pieces of anecdotal and archaeological evidence that indicate this. If you disregard this evidence, there is a chance that a mosque was built on the pre-existing ruins of a temple. And the contention is that this is not just any Hindu temple - it is the sacred birth place of the legendary Hindu God Rama in Ayodhya, the hero of the Indian epic, the Ramayana. There is also a possibility that the mosque-like structure could have been built at any point in time until around 1776, when a clear written description of this mosque shows up in records.

Modern day litigation started around 1885 and continues to this day!
For quite some time (decades) in the 19th century, both Hindus and Muslims offered prayers in the same complex without much issue, but a riot in 1934 (trust the Brits to screw things up all over the world!) put paid to this, and since that point it's been a contentious issue. The biggest democracy in the world saw a new political party come to power in the 1990s partly based on this issue, and to this day that party has a major presence on the Indian electoral map.

So what did the judges do today? Well, they awarded 1/3rd the land to each of the three parties (A 'mainstream' Hindu group, a Sunni-Muslim group, and an ascetic Hindu religious sect called the Nirmohis). Nobody had any legal documents to prove ownership obviously. On average, nobody was truly happy, nobody was truly sad, and most importantly, nobody died in India today because of this.

This Solomon-esque verdict looks good from a practical OR perspective. If the objective is to minimize the expected value of the maximum discomfort for any religious group in India, and given that there is some finite probability that each group can lay claim to the property over a substantial amount of time during the last few hundred years (!), then this fractional solution of x = 1/n to this stochastic 3-SAT-like problem appears to be a sound compromise. Besides, it also reminds me of the initial interior solution used in Karmarkar's algorithm! It allows each group to build their place of worship and exercise their right to religious freedom in democratic India. Rather than being totally indifferent to religious aspirations (i.e. bluntly secular), the judges have tried to reasonably accommodate them (i.e. multi-religious), and this made a huge difference. An all-or-nothing binary solution would imply that certain (reasonable) probabilities that favor one group are being completely disregarded, which in turn would probably result in riots at some point in time, with a little bit of encouragement from India's famously left-wing English language media, or a right-wing group that backs any of the parties to the case.

A bit of history here.

The judges consisted of two Hindus and one Muslim, and I personally believe that they did an amazing job in arriving at a practical solution to a confounding, ill-defined decision problem that directly impacts a billion or more people, and indirectly, many more. It's not a perfect legal job; it's not bullet-proof; it can be contested in India's supreme court, but from an OR perspective, it's a stable solution that can work on the ground. Any substantial change to this 1/n formula may make the solution look more legal or technically more 'jurisprudentially' polished, but it is also going to cause havoc in real life.

In 1946, three Indian patriots - a Hindu, a Sikh, and a Muslim who waged an honorable military campaign for India's independence against the Brits as part of the gallant Subhash Chandra Bose's Indian National Army (a la George Washington) - were charged with treason (well of course it had to be!). However, the public outcry against that trial was so overwhelming that they eventually were freed. That heroic INA was formed against Mahatma Gandhi's wishes of a non-violent struggle, but nevertheless, that incident did unite the people of India and Bose is a much revered figure in India to this day (not to be confused with the one who makes cool speakers or the one who shows up among subatomic particles - Bosons).

Hopefully, the three Indians in today's trial have managed to do something similar. Time will tell.

Thursday, August 26, 2010

OR Theory, as opposed to OR Practice

An interesting exchange is under way at, where else, OR-exchange on "will OR take over the world?" I would suggest reading up there and posting your views to keep the discussion going. Speaking as one of the participants: a suggestion was made that OR has gone well beyond its mandate and overtaken the world (which gives me a convenient excuse to link to the good old pomo generator. We should build an OR-theory version some time :-), and that there was a crying need to get back to terra firma.

Sadly, OR practice has very little to do with OR theory now-a-days. It doesn't matter if the department is part of the school of engineering, humanities, or management. Yes, theory has its place. OR theory is particularly enchanting, but an equal part of OR-academia has to be about training kids to think about practical solutions for the real world today, where OR enables a better understanding of real problems for real human beings. This balance is skewed today.

In short, academic OR (today) is almost all about answers, but in OR practice, the 'value-add' comes from asking the right questions. Given the right question, finding a good prescriptive answer in practice is, relatively speaking, a piece of cake (NP-Hard or not, bleeding-edge solver or not). Going to the customer and saying they have an unsolvable problem scenario because the solver that sliced thru your complex MIP model said it was 'infeasible' is just not done. Playing the good OR guy and walking your customer through a well-intentioned slide show analyzing an irreducible inconsistent subsystem to explain it is not much better. All that we have done at that point is show how ineffective our efforts have been! After all, the customer was in business for decades without OR and did pretty well. So a first step in OR practice is to somehow get the customer to tutor us on how they optimized their business well enough to feed their families in this world (when we were happily experimenting with n-dimensional representations in hyperspace until our advisor kicked us out of school with a PhD), so we at least don't make complete fools of ourselves. A second step would be to apply common sense, but then OR programs rarely get ranked and papers seldom get accepted based on such uncool stuff.

Monday, July 26, 2010

Analytics in Acadia

Living close to the Acadia national park has turned me into a local tourist guide for friends and relatives who consider any visit to Maine incomplete unless they spend a day there. By applying OR and data analysis, I've kinda figured out the right days to visit Bar Harbor in summer and escape the tourist crowd. On days having a very high bliss factor, an OR person can relax there and think about the naturally elegant analytics hidden all around.


The main scenic route (the park loop road) is a nice 'hard-coded' feasible solution to the TSP that requires us to touch all the popular spots in the park while constraining the grade in the resultant road to be within a safe level, while also reducing the amount of earth to be moved. Of course, too short a path is not very enjoyable, while too long a path will find tourists driving relatively fast to skip the repetitiveness, so the total length of the Hamiltonian circuit should ideally be within some desired bound.


As we think about path lengths, there is an equally interesting observation one can make about the beautiful Maine coastline while driving along the circuit - it certainly looks long. Now, if you look at the map of the US, the California coast seems much longer in comparison. However, if you take into account the bays and undulations and increase the level of detail, the Maine coastline is comfortably longer - measured at that finer scale, the coast of Maine is more than 2000 miles long (or so I heard). The fractal dimension of the Boothbay, ME coastline was calculated to be 1.27.
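
The way measured length grows with the level of detail can be made concrete with Richardson's empirical relation, L(s) = M * s^(1 - D), where s is the ruler length and D the fractal dimension. Here is a minimal sketch using the D = 1.27 quoted above; the constant M cancels when two ruler sizes are compared, and the function name is made up for illustration.

```python
def length_ratio(ruler_shrink_factor, fractal_dim):
    """Factor by which the measured coastline length grows when the ruler
    is made `ruler_shrink_factor` times shorter, per L(s) = M * s**(1 - D)."""
    return ruler_shrink_factor ** (fractal_dim - 1)


D = 1.27  # Boothbay, ME figure quoted in the post
for shrink in (10, 100, 1000):
    print(f"ruler {shrink}x shorter -> measured length x{length_ratio(shrink, D):.2f}")
```

So under this relation, every tenfold increase in map detail stretches the measured Boothbay coast by a factor of roughly 1.9, whereas a perfectly smooth coast (D = 1) would not grow at all.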

Per Wikipedia, 'Maine' is the only single-syllable state name, and Maine is also the only state to share a border with just one other US state (New Hampshire), which means that we require just two colors to map this region of the US. We share a border with New Brunswick, Canada, whose DOT, along with Remsoft, Inc., were Edelman finalists this year. I attended their presentation at the INFORMS practice conference, and it was a really fine one, and I hope our US DOTs learn from them to ensure that the economic stimulus money is optimally spent.

Sunday, June 27, 2010

Oracle Crosstalk 2010: Yes, OR can help retail

Analytics is a hot topic for retailers. Several presentations during this conference returned to this theme, which is a huge positive sign for OR in this industry. However, how to apply science smartly to the tons of sales data being collected (and retailers collect this like holy dust nowadays) and to automate robust decision making is another question altogether - 99% of the retailers do not have the right skill sets to pull this off. It is not easy building this system in-house from the ground up, and companies like Oracle Retail can do this better and faster, allowing retailers to focus on their core business areas. Few, if any, were even aware of 'OR' (is OR by any name still OR?). On the other hand, the general consensus was that revenue and supply chain optimization is a must to compete in this tough economy (honey, I shrunk the margins). A panel of investment experts with representatives from major financial institutions, both government and private, felt that in addition to top-line growth, being able to optimize the use of scarce resources to achieve operational excellence is one of the keys for survival in this new economy.

The level of hype surrounding the virtual region that lies at the intersection of social networking and mobile technology is staggering with numbers going off the charts (literally). How can OR grab a share of the science pie here? For example, I liked the way Wet Seal has been innovating and surely this area is going to see new and innovative applications being developed soon. Experts from Google and other tech-heavy companies weighed in on this as well and all signs point to an interesting retail shopping experience in the near future. Get your scan-ready smart-phone out and interactively and collaboratively shop in a real store with your virtual friends. The line between brick-and-mortar stores and the virtual shopping world is rapidly blurring.

On a side note, it was great visiting Chicago after a few years. The single most beautiful landmark there is the dazzling Swaminarayan Mandir (loosely translates to 'temple') in Bartlett. Hand-sculpted from Italian marble and Turkish limestone by thousands of Indian craftsmen, this Mandir does not rely on machine-tools and does not contain a single metal part or fastener, in keeping with the sacred Indian tradition of Mandir architecture. Interlocking stones are used throughout, with individual stones weighing from a few grams to 5.2 tons. Pretty amazing. It is open to the public and is a must-see for visitors. The gardens and fountains surrounding the temple are well-maintained and pleasing to the eye. Inside the complex, smartly designed fiber-optic lighting and hidden underground heating add an invisible high-tech layer to this construction, and combine with the silence that must be observed within the ethereal sanctum-sanctorum to create a deeply personal and spiritual experience. The food served in the community center in the basement is excellent, and is strictly vegetarian, of course.

Friday, June 18, 2010

Oracle Cross-talk, June 22-24, Chicago - strong OR presence

Operations Research at Oracle Retail will have a strong presence in this year's edition of Oracle Cross-talk, to be held next week from June 22-24 in Chicago, where (paraphrasing the press release) " ... retail executives and retail industry experts from around the world will share insights, demonstrate innovations and better understand the role of technology in new retail business strategies...".

There are a few analytics-heavy sessions planned, and being an attendee as well as an 'optimization facilitator' this time, I hope to spread the OR gospel and gain insight into the progress of OR-usage in this industry over the last 5 years, the level of user acceptance, and value added.

Sunday, June 13, 2010

The Reverse What-if: more hard-won insight into OR practice

The real fun in O.R practice is when your optimization product (say a recommendation system) is evaluated by its first customer. Usually this is the point where you throw away the last 6 months of your textbook work and build something that your customer can actually use rather than what you wanted to build.

If you are in the business of providing software-as-a-service, you can do several iterations with a customer and that practice is more forgiving, but I am learning the hard way that building an air-tight OR product and then handing it off to a customer is mighty hard and you only have one or maybe two chances to get it right.

(note: some dramatic license exercised for illustrative purposes)
Today's tab focuses on the 'what-if', the practice of walking a customer thru a choreographed sequence of optimization scenarios. An obvious choice is to start with a somewhat unconstrained model, say, only having boxed variables, and then show the user or the consultant how the solution changes as you throw in, one by one, all those beautiful constraints your system can handle. At the end of this process, if everything goes well, you end up with a real and optimized solution that business can use and will save them many $$.

Unfortunately, this approach of "increased complexity" rarely works. The procedure described above is nice if you are trying to teach a student about OR. The average business user doesn't really care about this. He/she probably understands real-life solutions much better than you. He/she wants your system to work well and maybe help get a bonus at the end of the fiscal year. In fact, you are better off looking at its dual, the reverse what-if.

The reverse what-if, as you would guess, starts with the best approximation of the customer's reality and then you navigate from there. Otherwise, the customer has little patience for all the n-1 steps in the 'straight' what-if which represent imaginary solutions he doesn't care about. The n'th step brought him to reality, which he already understands better than you, so you haven't helped him much.

The reverse what-if perfectly ties in with Rosenthal and Brown's practical philosophy of starting with a legacy solution. Your recommendation system should be able to represent the legacy recommendation, else your product is not going to earn your customer's vote. Even if they purchased it, they may end up just using your GUI and throwing your analytics into the trash. Unless you are providing an order of magnitude improvement in business value (in which case, you should start your own company!), revolutionary solutions are a hard sell. Yes, a customer values seemingly non-intuitive solutions that your OR unlocks to make him more money, but to truly ensure user-adoption, you need to demonstrate this improvement starting from the legacy solution.

It also goes to user experience. In fact, it's about avoiding home-sickness. Being able to return to something familiar is critical for user acceptance. How many times have we initially cursed websites that suddenly redesigned and changed everything, so that you can't find your favorite menus in the same place? Windows OS always provides us the option of returning to something that resembles the previous release, even though a million bugs have been fixed since then. And we gradually outgrow the old if the new performs as advertised. I still use my favorite music player Winamp's classic skin. Feels like home.

Comments welcome.

Wednesday, June 9, 2010

June sports roundup with an O.R eye

May was spent writing and rewriting analytical specification documents for some exciting new retail O.R products, and by the time it was done, it was June, which tends to be the best month of the year for international sports fans. Apart from the main course of cricket, there's the NBA finals, which looks like going the distance, and the tennis at Roland Garros and Wimbledon, where baseline power-duels are becoming increasingly boring to watch. We had the spelling bee during the weekend, where Anamika Veeramani ('stromuhr') won this year's edition, while another Indian-American Aadith Moorthy won the other bee - The National Geographic ("where would you speak Tswana?"). Amazingly, both had perfect scores throughout the competition. The spelling bee in particular seemed more interesting this year. Unlike the past where gifted kids with 10 TB of RAM seemed to do a "total enumeration" by memorizing entire dictionaries, this time the words were chosen such that you invariably had to apply analytical methods - 'word root signatures' - and construct an optimal prediction from pieces of noisy data, without a computer and within the time limit. Consequently, total enumeration failed this time, which made for better TV viewing on ESPN, apart from being a small victory for O.R.

It is increasingly important to read about and experience other cultures, with so many words coming from all corners of the world. Any fan of the Asterix comic book series would have spelled 'Menhir' right, while anybody who's dined at a Punjabi Indian restaurant would have drunk 'Lassi' for lunch. On the other hand, the word expert who walks the kids through the definitions needs to do his own homework. Either he needs to brush up his pronunciations of these new words, or there needs to be more than one expert doing the talking (only French-rooted words ever seem to get pronounced right). More and more analytic job descriptions require you to work harmoniously with cross-cultural teams in this new world. Now, in addition to knowing your AMPL, you have to correctly spell or pronounce the interviewer's name (without the middle initial, to make it easy), Aransanipalai K. Ananthapadmanabhan, to get the job. Good luck.

Talking about rainbow warriors, the FIFA football world cup starts in a couple of days in the diverse nation of South Africa (yes, Invictus is based on a true story). A whole bunch of analytical work has been devoted to the deadly business of penalty kicks, which seem to determine winners in recent times. A conclusion is that a spot-kick taker can increase the odds in his favor by acting upon the recommendations from this analysis. It is not a total crap-shoot, and some teams are consistently better at this, e.g., blood and guts Germany, while others such as the soccer hooligans from across the pond are abysmal. Of course, once the goalkeeper and the striker both start applying analytics-based prayer techniques at this sports ritual, it's going to cancel out and we are going back to square one. Until then, sports analytics witch-doctors bearing stochastic gifts can ride the gravy train.

update: also see here for more on the science of penalty kicks.

Monday, May 3, 2010

Analytics and Cricket - III : Forecast Models gone wild

The cricket 20-20 world cup is going on in the West Indies, and a potentially great match between the West Indies and England was cut short today by rain. Upon resumption after the rain stopped, the chasing team's target was reduced using the D/L rules from 192 @ 9.6 runs per over to just 60 in the 6 overs that were possible, with all ten wickets available. What a farce! As mentioned in an earlier post (as well as in ORMS today by the creators - Operations Research (O.R.) guys whose names appear in the title of this post), the D/L model is used to forecast the runs target for the second team. This is a fantastic analytical model that works splendidly for the 50-over format. Why? Because it was adopted about 20 years after 50-over cricket was popularized, and D/L had plenty of varied data to calibrate their model and estimate goodness of fit. On the other hand, everybody assumed that the same model would work like a charm for the 20-over format, since D/L works with the % of overs remaining, etc., so it's just a case of using a different multiplier, right? Wrong.

Nobody in the International Cricket Council (ICC) bothered to even do a cursory analysis of how this model would perform in T20 games - a typical O.R. case study of blind trust in a black-box solution that works fine in normal conditions but fails when the problem slightly changes. And thus we recognize a weak spot in this model. It is going to take time to gather the data needed for better calibration, but what do we do until then?

Like any parameter estimation problem in statistics, O.R. and econometrics, this one also requires a significant number of good-quality, non-collinear historical observations to work really well. The three years so far have been insufficient. Is 7 more years of international T20 cricket sufficient? 17 more years? While T20 is also cricket (at least when Sachin or Mahela bat), the dynamics are quite different from the 50-over format. Teams are 'all-out' or close to all-out far less frequently compared to the 50-over game, and every cricket fan knows that a wicket in a 50-over game is disproportionately more valuable compared to a wicket in a 20-over game. Does the loss of 2 wickets in a T20 game equal in value the loss of one wicket in a 50-over game? As we begin to think about this, we realize that the risk-reward-resource model used by D/L could be quite different, what with just 120 balls per innings. Or it could be the same in principle and it's just a simple recalibration. On the other hand, a T20 over is 5% of an innings, compared to 2% for the longer format. Does this huge reduction in overs cause some boundary condition effects that need to be accounted for? Is there a possibility that we never find the amount of good quality data in my lifetime to make this same model work reliably for T20 games? I think it is time to look inside the model and confirm first if the fundamental assumptions and modeling constructs continue to hold in a T20 situation. Clearly with the 3 years of data we have had so far, it appears to be off the mark. In fact, even in the 50-over game, the model is known to have some bias that favors the team batting second, which however, is not severe enough to warrant replacement. However, we should be looking at fundamental modeling extensions if we find intrinsic problems with the D/L model applied to T20.

Cricket is perhaps the most unpredictable of all sports and is called the game of 'glorious uncertainties'. I just saw an international team lose 5 wickets in a single over and yet end up winning the match comfortably. It's also embraced a modern and sophisticated O.R. solution to weather-interrupted matches, but please, let's get our modeling straight. This kind of uncertainty is great for the O.R. person in me, but not at all enjoyable as a cricket fan, and unlike business, cricket is far too serious to be left totally to O.R. types like me.

Thursday, April 29, 2010

Can OR provide strategy for the World Chess Championship Players?

Among the very many great sporting events that remain hidden away from the island of the USA is the ongoing battle for the World Championship in Chess. The classy defending champion, Vishy Anand, is a sentimental favorite, given that he's based in Chennai, India (Madras) where I studied. The challenger is the Bulgarian Veselin Topalov, who, before the title bout started, had a slight head-to-head advantage over Anand. To add to this, the volcano in Iceland meant that Vishy had to endure a several-day road trip across Europe to get to the venue in time, after the chess authorities granted him only a one-day extension instead of the three-day break he asked for. Vishy promptly lost the first game, but won two of the next three to open a slender one-point lead. For those who remember the Fischer-Korchnoi-Karpov-Kasparov days in the cold war era, 21st century chess still remains an incredible mental sport where supreme ego, psychological gamesmanship, and sharp analytical intellects clash to create some amazing drama.

A great blog to cover chess is maintained by Susan Polgar (one of the famous trio of Polgar sisters from Hungary) now residing in Texas. An incredible talent herself, she won an under-11 girls chess competition in her country undefeated at the age of 4, and is arguably the world's greatest female player. She asks the question - How should Topalov plan his strategy for the remaining games?

The first to reach 6.5 points in this 12-game series wins, and with Anand at 2.5 currently, a risky approach may cost Topalov many games, whereas a placid approach may enable Anand to force some quick draws (0.5 points each). I wonder what statistics, OR, and game theory have to say with regard to the optimal policy to adopt for either player?

If you are far behind in points, then it may pay to throw caution to the winds, since there is little to lose, while in the current situation, the risk and reward are still somewhat balanced.

Does a player with a lead of one point or more simply play to force quick and safe draws?

Interesting questions. Some answers would be nice.
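Here is one way an OR type might start sketching an answer: a tiny dynamic program over (games left, current lead). It is only a toy under loud assumptions. The two playing styles and their win/draw/loss probabilities in `STYLES` are made-up numbers, the opponent is treated as passive, and a tied match is scored as a coin flip. None of this comes from real chess data.

```python
from functools import lru_cache

# Hypothetical (win, draw, loss) probabilities per game for the player who
# chooses the style. Illustrative assumptions only, not estimates.
STYLES = {
    "risky": (0.30, 0.35, 0.35),
    "safe": (0.10, 0.80, 0.10),
}


def style_value(games_left, lead, style):
    """Match-win probability if `style` is played now and optimally afterwards."""
    win, draw, loss = STYLES[style]
    return (
        win * match_win_prob(games_left - 1, lead + 2)
        + draw * match_win_prob(games_left - 1, lead)
        + loss * match_win_prob(games_left - 1, lead - 2)
    )


@lru_cache(maxsize=None)
def match_win_prob(games_left, lead):
    """Best achievable probability of winning the match.

    `lead` is in half-points (+2 means one full point ahead).
    A tied match counts as 0.5, i.e. a coin-flip tie-break (an assumption).
    """
    if games_left == 0:
        if lead > 0:
            return 1.0
        if lead < 0:
            return 0.0
        return 0.5
    return max(style_value(games_left, lead, s) for s in STYLES)


def best_style(games_left, lead):
    """Style that maximizes the match-win probability in this state."""
    return max(STYLES, key=lambda s: style_value(games_left, lead, s))


# Topalov's situation in the post: 8 games left, one full point behind.
print(best_style(8, -2), round(match_win_prob(8, -2), 3))
```

With these made-up numbers, a player one point behind with a single game left should gamble, while a player one point ahead should play safe. Whether that carries over to real grandmaster play depends entirely on how realistic the assumed probabilities are.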

correction (April 30) - Susan Polgar may not even be the best chess player in her family, let alone the world :-) That credit probably goes to her sister Judit Polgar.

Wednesday, April 21, 2010

The Informs Practice Conference and the OR think tank misses a trick or two

How can Informs make the Practice conference even better? An advantage of being an unofficial reporter is that I can avoid self-congratulatory blog posts and actually criticize without any sugar coating - in the hope that we get out of our comfort zone and make this an even better event next year.

Clearly some things were out of their control. All the OR folks in the world would not have been able to predict the impact of a volcano in Iceland on the travel plans of overseas visitors to the conference. Also, Dr. Michael Trick, whose pioneering web page on O.R. was the main source of information as well as inspiration for graduate students like me in the 1990s, and motivated me to join this exciting field, was missing, and one can't fault Informs for this. I was really looking forward to shaking his hand and thanking him for his service. 'Marketing in Online Social Spaces,' by Kevin Geraghty, Vice President, Research & Analytics, of 360i was a really good one (somehow I forgot to cover this in my daily conference tab). Kevin was providing an example of marketing campaigns using social networking data. He found out (using completely public domain tools!) that in the OR blog world, to the surprise of many, a certain Lieutenant in the Navy had more 'online friends' than Dr. Trick, so if one were to promote some hypothetical OR product, then he should be chosen as a first reviewer, assuming that those friends were OR types rather than 'sailors'. He also obtained other funny personal trivia from the public domain, which I'll just leave out.

A second peeve I had was the highly limited lunch and dinner options for vegetarians (Two boiled asparagus roots and a turdy-looking cuboid of tofu does not an Edelman banquet make!). This should not be difficult to fix. If this doesn't change, I frankly don't see much point in shelling out two grand and semi-starving for most of the conference. Thankfully, due to the purely individual initiative of the obviously superb Hilton staff, I was not totally inconvenienced. Kudos to those guys. They got their 'hospitality management OR' right.

Third, attendees should be able to obtain access to the video archive of the talks. Static slides don't cut it anymore. Unlike academic conferences, the value of practice-oriented conferences often lies in what is said in between slides.

On the positive side, the posters were a big hit. One can engage the presenters informally and in 5-10 minutes get a high-level idea of what their innovation is about. And the good thing is that you can visit them in your own time and network too. For example, I found out that the Sandia Labs in beautiful New Mexico has this really cool Python-based modeling language (PYOMO?) that they used for stochastic programming. Can't wait to try it out. At the MPL booth, I found out that they are making software available for free on a Windows environment. In tandem with COIN-OR (which they package MPL with, I think), you have a solid modeling and optimization package, free!

My best talks in no particular order - Sanjay Saigal (Intechne) on uncertainty, Jeffrey Camm (Univ of Cincinnati) on practical OR, Kevin Geraghty (360i) on social networking, John Osborne (Kroger) on OR innovation against all corporate odds, and any Edelman presentation.

overall grade: 7/10

I'll end on a warning note. The bottom line goal is that if people are thinking analytics, then O.R. should not be far off from their thoughts. Well, so far, O.R. has been losing this battle on many fronts. Clearly, we do not want to lose our existing membership in any OR-friendly industries (most representative of the ones who showed up). However, we should be doing much more to attract members from the non-traditional, emerging industries (very few of those). At the end of the day, OR is an applied field, and while the analytics turf can be defended in journals, textbooks, and conferences, it can only be won in hard-fought battles by in-the-trenches OR foot soldiers, who need to be well-equipped and trained to build innovative, scalable, practical products and solutions for real-world problems in the 21st century - one that is increasingly going to be marked by many terabytes of noisy data. When we start with "min z = c.x + y: ax <= b", O.R. academic programs should first be teaching how and where to get the "a, b, c" in this and what it really means, rather than taking a short-cut straight to 'x, y, z' in the abstract world, like we have been doing the last few decades. If you have other questions, ping me and I'll tab it here ...

Tuesday, April 20, 2010

Informs Practice Conference 2010 - Day 3

The day started off with an encore presentation by the Edelman winner - Indeval. Interesting talk. The theoretical stuff was not particularly interesting, but the fact that they got some OR stuff to work in real time in a mission-critical system involving billions of dollars is really cool. Next, I managed to attend a couple of optimization-focused talks on Approximate Dynamic Programming at Schneider Trucking, followed by 'the practice of the alternative' by Dr. Jeffrey Camm from the U of Cincinnati. He is from the Brown-Rosenthal school of practical OR, which I heartily subscribe to as well, and it was probably the most informative talk of the meet for me.

The rest of the day was devoted to the energy industry. We had another plenary by Richard O'Neill, Chief Economic Advisor to the Federal Energy Regulatory Commission. This guy went into some depth on electrical circuits and mixed integer programs. Quite unexpected, but it was great for us optimization practitioners. Then I got to listen to more presentations on energy-related topics including analytics for the smart-grid.

All in all, it was an enjoyable conference, even if one can attend only 10-12 of the 80 presentations on offer. Great location, excellent hotel service. Good job, Informs!

Monday, April 19, 2010

Informs Practice Conference 2010 - Day 2

The day started off with a plenary by a senior guy at Walt Disney. Equally interestingly, he worked at PeopleExpress decades ago, now part of Airline Revenue Management folklore. The key takeaway was that smart OR ultimately improves the odds in your favor by one or two percentage points, and that is a really big deal. Following that, there was an incredible variety of interesting topics to choose from, many of which were scheduled at the same time. So I tried to avoid MBAs, vendors, as well as academic types and listen to the in-the-trenches practice guys. The first one was the head of R&D at Kroger, a group that's 2 years old in a 126-year-old company. This talk focused on how to cut thru the (126 years of) red tape to get genuinely valuable work done. Very interesting. Quotes included "you should be willing to bet your job that your project idea works ..." and a need for passion. Everybody's hand in the audience went up when he asked how many people liked their jobs. Not surprising. Practical OR is fun.

The next interesting talk was by Dr. Sanjay Saigal on probability management. He is a non-conformist and funny, and he put on a real show, and I really wished this talk had continued for another 15-20 mins. Great topic.

All in all, I missed several great talks. If anything, the practice conference has an abundance of riches in terms of the high-quality content presented. I'm distraught that I may have to skip a talk by the uber-brilliant Dr. Ellis Johnson tomorrow to catch another one at the same time that is equally exciting and pertinent to my current line of work.

One of the 'birds of a feather' discussions in the evening focused on the role of O.R. in analytics. I've already talked about the identity crisis facing OR'ers in a prior post, and INFORMS, as well as OR academic programs, should act soon to fix this gap. The master of ceremonies for the Edelman awards later mentioned (or paraphrased) that OR is the most important invisible profession in the world today.

Finally, I sat in on an Edelman finalist presentation by the New Brunswick Department of Transportation, Canada, since they were my sentimental pick - NB is just three hours further east of my place in Eastern Maine. In the end, the bankers won it. Interestingly, almost every single entry featured a company partnering with a university or an OR software vendor.

Today, I managed to spot two OR all-time greats, Dr. Cynthia Barnhart, and Peter Kolesar. Too bad I did not get a chance to interact with them, given that they were involved as an Edelman judge, and finalist, respectively.

Sunday, April 18, 2010

Informs Practice Conference 2010 - Day 1

Getting from North Eastern Maine to Orlando involved going thru Detroit. For some reason, this US carrier seems to 'dynamically' assign gates at DTW to arriving aircraft, so we "arrived" 30 minutes ahead of schedule, but actually got to a gate 30 minutes later. This is not the first time it's happened. Anyway, the weather in Orlando is great compared to Maine, which was in the low 40s when I left ...

The workshops on Day 1 were quite useful. Forio Business Solutions had some nice system dynamics tools for building snazzy looking web-simulations. I managed to get through one Markdown Optimization example, simulating different price elasticities. The next workshop was enjoyable as well as informative. Getting to see the legendary Dr. Bixby in person was cool. Gurobi 3.0 now has a parallel barrier solver in place, and I verified that this one is deterministic. Their dev team is sure keeping a fast pace of major releases, their benchmark results continue to impress, and I resolved to learn Python. Finally, the third workshop was with OPTMODEL, SAS's versatile modeling and optimization language / procedure. They displayed some nice decomposition approaches to a Kidney exchange problem and an ATM optimization problem, all deployed within OPTMODEL. I felt that the Kidney exchange model (KEM) could have benefited from some specialized TSP subtour constraints, but then again, some nice work on display by the young OR experts from this company.

It was nice to catch up with old airline colleagues, and INFORMS had some vegetarian food, thankfully. Finally, it was nice to meet Dr. Ravi Ahuja, another O.R. giant, in person. These were the stand-out moments for day 1 - a rare chance of interacting with the stalwarts of our discipline in person.

Monday, April 12, 2010

Doogie, Darwin, Dowry, and the TSP

A couple of teenagers from the U.S. visited the beautiful IIT campus in Madras (Chennai), India in 1989-90. They were not there to attend the popular collegiate cultural festival 'Mardi Gras', as it was known back in those days, but to present a research paper on AIDS. They happened to be brothers, Balamurali Ambati and Jayakrishna Ambati, who completed medical school at a fairly young age. Per Wikipedia, BA graduated from the Mount Sinai school of medicine at the age of 13, and became a qualified doctor at 17 in 1995.

Today, the Ambani brothers hog the media space in India as they seek to become richer, but for a brief while in the 1990s, the elder Ambati brother got entangled in a 'dowry harassment' scandal. Dowry harassment reports were big news in India, with the per-capita dowry-deaths in line with the number of 'murder for insurance' cases in the US, or the wife-beating cases in Switzerland. Anyway, reports indicate that the case fell apart after the bride's father was recorded on tape trying to extort a few hundred big ones in blackmail money. Unfortunately for the elder brother, it looks like he had to cool his heels in India until this case was wholly resolved, losing a good two years in the "youngest achiever" race, which has since become an idiotic, even deadly craze in Southern India. This is in contrast with the more comical approach in Northern India and Pakistan, where many kids are 2-5 years older than their official age. If you were that skinny, baby-faced runt in a middle school in Bangalore, he would be that guy with the stubble in the last bench, and the captain of your school's football (soccer) and (field-) hockey teams.

Pardon the digression. Around the time the Ambati brothers visited the IITM campus (A former student reminisces here) to talk about AIDS, they were also the primary authors of this published paper on the traveling salesman problem. The title is exciting, but a tad misleading, in that it hints at a polynomial time algorithm for the NP-Hard TSP. It resembles a randomized heuristic approach based on the theory of natural selection, and appears to possess good computational properties, and has been cited more than once in followup research in this area. On the other hand, I don't think even Doogie did any OR work, real or fictional.

Sunday, April 11, 2010

All Set for the INFORMS Practice Conference

Wonders never cease. One advantage of working for a solvent company is that it provides a rare chance of attending a major conference within the U.S. The INFORMS practice conference seemed like a good choice. Besides, the annual INFORMS conference is a few months away, and one never knows how the travel budget is gonna change. A greedy approach works better here... It's been eons since the previous conference - not surprising if you spent dog-years in the mostly-bankrupt airline industry.

Given the short notice, I'm presenting absolutely nothing. I get four days off from work to listen to cool OR guys talk, and the plan is just to learn as much as possible and be an on-site reporter. Please email me at shivaram (dot) subramanian (at) gmail.com, if you are interested in talking OR during the meet. I will be posting daily tabs of the conference here, so watch this space. If you would have liked to be at the conference but could not make it, please email me any topics you would like me to cover here, and I will do my best. As always, any tips on the optimal way to cover conferences are welcome.

For those practitioners interested in the costs involved, here's the lowdown. Airfare is about $400. A 3-4 day hotel stay is about $700-900 (now I know how it feels to be on the receiving side of Pricing optimization). I'm staying an additional day (Sunday) to take advantage of the technology workshops and network. The registration fee for non-members is about $900. The total cost, including daily expenses, is in the ballpark of $2500. It remains to be seen if the feedback and new ideas that one can get out of this outweigh these costs. Last year's Edelman work was fantastic, and hopefully this year will be just as good.

Saturday, March 27, 2010

Analytics and Cricket - II : The IPL effect

This is the second in the series of articles on O.R. and cricket. Click here for the first part, done a while ago.

The Indian Premier League (IPL) is close to becoming the number one Indian global brand - not just the number one sports brand. It has overtaken past colonial stereotypes (such as snake charmers, elephants, and Maharajahs) and current pop stereotypes (IT outsourcing brands like Infosys, Wipro, et al, knowledge-brands like the IIT graduate, etc). The two newest franchise teams unveiled in this fledgling three-year-old league were purchased for $333M, costing more than a couple of current NHL teams. Sports has become big business, even as the cricket fan in me rebels against this. Several owners have 'Bollywood' connections. Not surprising: given that these movie types make so many expensive flops year after year, the risk level for a cricket venture is surely much lower.

This IPL season is on YouTube now after a pioneering deal with Google, and this experiment serves as a nice dress rehearsal for the search engine company toward more such live streaming ventures in the future. In terms of audience size, it's easily a factor of ten-twenty bigger than that for NCAA basketball. India has a lot of cricket-crazy people. I've provided the YouTube link for my favorite match of the tournament so far: Bangalore v Mumbai. This is the shortest form of cricket played where each innings lasts twenty overs and the entire game is completed in three hours.



We will cover two new analytics-induced innovations observed in this season's IPL.

First, the number of attempted risky runs (and with it the number of run-outs, where a batsman doesn't make it to the crease in time, somewhat like a baseball runner being thrown out at a base) seems to have increased dramatically. Why? It looks like team statisticians have noticed that fielding is a traditionally weak area for many teams, so the probability of a direct hit on the stumps is low. This reduces the risk of getting run-out, and the reward for stealing an additional run against statistically poor fielding teams may be well worth the risk. Teams that do not improve their fielding will probably see this hit-probability decrease further. Teams will take more chances against you and more members in your team will have the opportunity to showcase their non-athletic, Keystone Kops-like fielding prowess, leading to a deterioration in stats. Conversely, good fielding teams can improve their hit-probability stats and reap the reward in terms of effecting more run-outs. Teams of both kinds can be seen. The ones adopting better fielding standards are at the top of the points table.
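The risk-reward argument above fits in a few lines of arithmetic. This is a back-of-the-envelope sketch, not what any team statistician actually runs: the idea of pricing a lost wicket in runs (`wicket_cost_in_runs`) and the numbers used below are my own illustrative assumptions.

```python
def expected_gain_from_extra_run(p_run_out, wicket_cost_in_runs, run_value=1.0):
    """Expected net runs from attempting a risky extra run.

    p_run_out: probability the fielding side effects the run-out.
    wicket_cost_in_runs: assumed value of a lost wicket, expressed in runs.
    """
    return (1.0 - p_run_out) * run_value - p_run_out * wicket_cost_in_runs


def break_even_run_out_probability(wicket_cost_in_runs, run_value=1.0):
    """Run-out probability at which the attempt has zero expected gain."""
    return run_value / (run_value + wicket_cost_in_runs)


# Hypothetical case: a wicket is "worth" 9 runs at this stage of the innings.
print(break_even_run_out_probability(9.0))      # attempt pays off below this
print(expected_gain_from_extra_run(0.05, 9.0))  # weak fielding side
print(expected_gain_from_extra_run(0.20, 9.0))  # sharp fielding side
```

Under these assumptions the break-even hit probability is 10%: against a side that hits the stumps 5% of the time the extra run is worth taking, and against a side that hits 20% of the time it is not. Since a wicket is worth less in T20 than in the 50-over game, the break-even probability rises, which is consistent with teams taking more chances.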

A second analytic innovation comes in the form of a special T-20 (twenty-over cricket) bat, which is now the most famous mongoose in India (that's the brand name for this bat). It has a handle as long as the blade itself, with the total length of the bat unchanged. Statistics show that in this form of the game, half a bat is often better than a full one, if optimally designed! Don't believe it? See this YouTube clip of Matt "the bat" Hayden, the first player in the IPL to use this bat. He is certainly not going to be the last.



So why is the mongoose effective? In the most serious form of cricket (test cricket), a full bat is a must. It's a longer game (over 5 days), the chances of getting out are much, much higher over time, and you want a bat as large as a barn door to prevent the ball from disturbing your stumps. From the T20 perspective, the ball travels the longest when it hits the sweet spot of the bat (roughly three-fourths of the way down a bat), and combined with the fact that getting out in T20 is not such a big deal, you end up with the mongoose, which is essentially just a long handle and a reinforced lower half, like a pendulum. It's made of wood just like the traditional bat, just as long, and roughly the same weight. For a given period of time at the crease, you are more likely to get out using the mongoose, but the expected number of runs (specifically in the form of hitting sixers) you could score before that happens can be much higher, thus making it an attractive trade-off in certain T20 match situations.
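To see why the trade-off only works in "certain" situations, here is a toy calculation. Everything in it is assumed for illustration: each ball independently gets the batsman out with a fixed probability, otherwise he scores a fixed average number of runs, and the per-ball numbers for the two bats are invented. It also ignores what the lost wicket costs the rest of the team.

```python
def expected_runs(balls_left, p_out_per_ball, runs_per_surviving_ball):
    """Expected runs one batsman adds over the next `balls_left` balls.

    On each ball he is out with probability p_out_per_ball (innings over,
    no runs from that ball); otherwise he scores runs_per_surviving_ball.
    """
    total = 0.0
    still_in = 1.0  # probability he has survived every ball so far
    for _ in range(balls_left):
        still_in *= 1.0 - p_out_per_ball
        total += still_in * runs_per_surviving_ball
    return total


# Hypothetical per-ball profiles (made-up numbers, not measured data).
FULL_BAT = (0.03, 1.3)   # safer, slower scoring
MONGOOSE = (0.06, 1.9)   # riskier, faster scoring

for balls in (12, 120):
    print(balls,
          round(expected_runs(balls, *FULL_BAT), 1),
          round(expected_runs(balls, *MONGOOSE), 1))
```

With these invented profiles, the mongoose comes out ahead when only a couple of overs remain, while the full bat wins comfortably over a full 120-ball innings, which matches the intuition that it is a situational weapon rather than a replacement.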