
Saturday, August 30, 2014

The Best Decisions are Optimally Delayed

The lessons learned from the last few years of practice have convinced me that analytics and OR (Operations Research), or at least MyOR, is mainly about learning the art and science of engineering an optimally delayed response. Good analytics produces an optimally delayed response.

But why introduce a delay in the first place? Isn't faster always better? The 'science of better' does not always mean the 'science of faster'. From age-old proverbs we find that between 'haste makes waste' (the knee-jerk reaction) and being 'too clever by half' lies 'look before you leap'. If we view Einstein from an OR perspective: "Make things as simple as possible, but not simpler", the reason seems clearer. We must make situation-aware and contextually optimal decisions as fast as possible, or as slow as necessary, but no faster and no slower; i.e., there exists a nonzero optimal delay for every response decision. It is a middle path between a quick-and-shallow suboptimal answer and a slow-and-unwieldy 'over-optimized' recipe. Of course, one must work hard during this delay to maximize the impact of the response, and put Parkinson's law to good use, as suggested below:
See this old post on 'optimally delaying an apology' to maximize benefit to the recipient, or recall the best players of every sport being able to delay their response by those few milliseconds to produce moments of magic, or Gen. Eisenhower delaying the call to launch D-Day. In the same way, a good OR/analytics practitioner will instinctively seek an optimal delay. For an example of this idea within an industrial setting, read this excellent article by IBM Distinguished Engineer J. F. Puget on taxicab dispatching that he shared in response to the above tweet. Implication: If your analytics system is responding faster than necessary, then slow it down a bit to identify smarter decision options. The 'slower' version of this statement is more obvious and is a widely used elevator pitch to sell high-performance analytics and optimization tools.

The history of the 'optimal delay' is many thousands of years old, going back to the writing of the world's longest dharmic poem, the Mahabharata, which also includes within it the Bhagavad Gita, one of the many sacred texts of Hinduism.

(pic link: http://www.indianetzone.com)

The story about how this epic came to be written is as follows:
Krishna Dvaipayana (the Veda Vyasa) wanted a fast but error-free encoding of the epic that could be told and re-told to generations thereafter. The only feasible candidate for this task was the elephant-faced and much beloved Ganesha, the Indian god of wisdom and knowledge, and remover of obstacles. The clever Ganesha agreed to be the amanuensis for this massive project on the condition that he must never be delayed by the narrator, and must be able to complete the entire epic in ceaseless flow. Veda Vyasa accepted but had his own counter-condition: Ganesha should first grasp the meaning of what he was writing down. This resulted in a brilliant equilibrium.

Veda Vyasa composed the Mahabharata in beautiful, often dense verse that Ganesha had to decipher and comprehend even as he was writing it down, without ever lifting from the palm leaves the piece of his tusk that he had broken off to write with. If Ganesha was too slow, it would potentially give Vyasa the opportunity to increase the density and frequency of incoming verses, which might overload even his divine cognitive rate. If he went too fast, he would risk violating Vyasa's constraint. Similarly, if Vyasa was too slow, he would violate Ganesha's constraint. If he went too fast, his verse would lose density and risk becoming error-prone, and of course, Ganesha would then not have to think much and could perhaps write it down even faster. Imagine, if you will, a Poisson arrival of verses from Vyasa divinely balanced by the exponentially distributed comprehension times of Ganesha. Writer and composer optimally delayed each other to produce the greatest epic filled with wisdom ever known: written without pause, without error, in spell-binding Sanskrit verse, and flowing ceaselessly to this day.
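(An aside for the queueing-inclined: the analogy above is essentially an M/M/1 queue, and a few lines of code suffice to play with it. A minimal simulation sketch via Lindley's recursion; the rates below are made-up illustration values, not scripture.)

import random

def avg_wait(lam, mu, n=200_000, seed=7):
    # Mean time a 'verse' waits before the single server starts on it
    # (Lindley's recursion for a FIFO single-server queue).
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(n):
        gap = rng.expovariate(lam)       # time until the next verse arrives
        service = rng.expovariate(mu)    # comprehension time for a verse
        wait = max(0.0, wait + service - gap)
        total += wait
    return total / n

for lam in (0.5, 0.8, 0.95):
    mu = 1.0   # theory for M/M/1: Wq = lam / (mu * (mu - lam))
    print(f"rho={lam/mu:.2f}  simulated={avg_wait(lam, mu):.2f}  "
          f"theory={lam / (mu * (mu - lam)):.2f}")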

I can think of no better way to celebrate Ganesha Chaturthi than to recall and apply this lesson in everyday life.

Monday, July 1, 2013

Are Operations Researchers Eulerian Decision Makers?

It is interesting to read Venkatesh Rao's book 'Tempo' (and the blog) from an Operations Research (OR) perspective and explore its ideas on human decision making. To better comprehend the themes in the book, I've been trying to map its examples to familiar OR-type problems. For example, timing the purchase of consumer products at Amazon.com over a period of time while also managing to satisfy the free-shipping threshold to the extent possible feels like playing a game of Tetris where we try to optimally delay the inevitable ("entropic optimization").

A recent Tempo blog post compares Lagrangian versus Eulerian decision makers (let's call them LDMs and EDMs). These two approaches appear to have their origin as choices of frames of reference for studying fluid flow. An LDM tracks a single 'parcel' of flow through its entire journey, while the EDM examines many parcels that flow through a particular location. Let's apply this to an airline OR (crew scheduling) scenario: an LDM follows a particular crew operating a sequence of flights through a network before returning to her/his base, while the EDM looks at many crews arriving and departing through a particular airport. LDMs track flow-paths, while EDMs zoom in on activities within a specific node in the network. Another example from airline network revenue optimization: an LDM may focus on a passenger, and the revenue accumulated, flowing through an itinerary consisting of multiple flights, whereas the EDM zooms in on the variable price and fixed cost associated with a seat on a specific flight that is part of many passenger itineraries.

Intuitively, it feels as if LDMs focus on the clearly visible primal decision variables, whereas EDMs focus on the latent, underlying dual problem. Column generation as an iterative optimization approach works well when the EDMs compute and transmit useful 'dual' signals from specific locations (e.g., the rate at which pilots arrive and depart at their node) to LDMs, who in turn collect and apply this updated information to stitch together new and improved flow-paths across the network (i.e., columns), which in turn ensures more balanced and smoother flows across the nodes (i.e., rows). For some problem classes, if we solve the dual problem, our primal decisions are also at hand, and if the dual is unsolvable, the primal is likely to be problematic as well. Conjecture: Lagrangian and Eulerian decision makers build mental models that represent (informal) primal and dual formulations of the same problem. Depending on the context, one or the other approach may be more convenient or effective.
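A minimal sketch of this primal/dual pairing on a textbook toy LP, assuming a reasonably recent scipy (>= 1.7) whose HiGHS backend exposes dual values as 'marginals':

import numpy as np
from scipy.optimize import linprog

# Primal ("Lagrangian") view: pick activity levels x to maximize
# 3*x1 + 5*x2 subject to three resource capacities. linprog
# minimizes, so we negate the objective.
c = np.array([-3.0, -5.0])
A_ub = np.array([[1.0, 0.0],    # resource 1:  x1          <= 4
                 [0.0, 2.0],    # resource 2:  2*x2        <= 12
                 [3.0, 2.0]])   # resource 3:  3*x1 + 2*x2 <= 18
b_ub = np.array([4.0, 12.0, 18.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")

x = res.x                   # the primal ("flow-path") decisions
y = -res.ineqlin.marginals  # dual ("Eulerian") view: shadow price per resource
print("primal x* =", x)     # -> [2. 6.]
print("dual   y* =", y)     # -> [0.  1.5 1. ]
# Strong duality: the optimal primal value equals b'y.
print("obj =", -res.fun, " b'y =", b_ub @ y)   # both 36.0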

The ability to extract information from noisy dual signals is useful in solving complex or large-scale industrial optimization problems. For example, in the airline revenue optimization example, it is known to be more convenient and reliable for airline revenue/scheduling planners to make decisions based on the shadow prices of flights rather than rely on the optimized prices of the millions of possible itineraries. Dual methods are insightful and can help solve a decision problem where a frontal assault fails. To the best of my knowledge, the default out-of-the-box method invoked by the leading mathematical optimization software packages today for solving linear programs is their dual simplex algorithm.

In general, Eulerian decision making, or as this post sees it, 'dual-driven decision making', may be the more attractive first-choice approach for human decision making. The Tempo blog notes:
"... When flow gets turbulent, the fluid mixes a lot. To properly follow a “parcel”, you have to let it expand as flow lines diverge and churn. This means there is more fluid in your parcel than you started with, more “noise.” Eventually you are trying to analyze world hunger — the entire body of fluid.
But the Eulerian static parcel stays the same size. It just bleeds causal structure and gets more entropic. The action gets a lot more random and choppy, but still tractable in size. It is also easier to shrink what you’re paying attention to when things get complex — it’s called focusing – than it is to reduce the ambition of a model you’re tracking (generally called pruning)..."


Monday, November 26, 2012

Review of WVLPSolver: A Java-based Linear Programming Library

Pure Java-based optimization tools released under benign licenses are popular in industrial applications with embedded decision analytics. They reduce maintenance effort and deployment costs, keeping IT managers, analytical support engineers, and developers happy. No platform-specific issues to deal with. No license and patent infringement headaches... It's a win-win for everybody, provided these 'free' analytical tools withstand the rigors of a production environment.

Recap: This tab summarized a set of pure Java tools a while ago here, talked about a DIY Java solver here that gets you to within a factor of 10-20 of CPLEX on combinatorial LPs, and chatted about "GooPLEX v1.0" some years ago here.

The new WVLPSolver library is introduced here (courtesy: Bharath Krishnan, data scientist at CQuotient). This new solver is worth a look because its v1.0 appears to be better thought out than the initial version of 'Gooplex'. In particular, WVLPSolver seems to employ a revised simplex scheme that would allow it to work with a minimal simplex tableau (see an old but remarkable (if extreme) illustration of this advantage). Per the introductory post, it employs the reliable Colt library, and is released under the Apache license. These three factors alone may allow this tool to become a useful workhorse in specific situations ("think small"). Great work by the developers, and I hope they continue to improve upon this initial release.

Big Data produces a variety of decision problems (see a recent post related to this topic). In some situations, depending on the problem structure, you may end up needing to solve tons of small but independent optimization problems, and in other cases, relatively few giant monoliths where a considerable amount of sophistication is required to produce consistently good answers. In the former case, an IT manager has to decide whether to budget for a million-instance licensed version of an industrial-strength solver that works cross-platform, hack a heuristic that generates "user-optimal solutions" but may end up confusing users, or go with a nice, pure Java LP brew (a toy sketch of that first case follows below). Each option has its advantages and disadvantages depending on the context. For analytics start-ups that offer solutions having a relatively small OR footprint, the last option may sound appealing.
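The first case is easy to prototype: fan the small instances out to worker processes and let any lightweight LP routine chew through them. A toy sketch with randomly generated filler data, using scipy as a stand-in for whichever solver you end up choosing:

import numpy as np
from multiprocessing import Pool
from scipy.optimize import linprog

def solve_one(seed):
    # Each instance is a tiny, independent LP (hypothetical data).
    rng = np.random.default_rng(seed)
    n, m = 8, 5
    c = rng.uniform(1, 10, n)           # profits to maximize
    A = rng.uniform(0, 1, (m, n))       # resource usage
    b = rng.uniform(5, 10, m)           # capacities
    res = linprog(-c, A_ub=A, b_ub=b, bounds=(0, 1), method="highs")
    return seed, -res.fun

if __name__ == "__main__":
    with Pool() as pool:                # one solve per worker, no shared state
        results = pool.map(solve_one, range(1000))
    print("solved", len(results), "instances; first:", results[0])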

Tuesday, October 23, 2012

The Gaussian Hare, the Laplacian Tortoise, and Big Data

Alternative Optimal Solutions
A recurrent theme of this tab is to highlight an important contribution of O.R decision modeling: alerting us to the presence of alternative optimal solutions (AOS). Prior posts relating to AOS can be found here and here.

Customers unfamiliar or uncomfortable with jagged-edged linear programming (LP) models often seek refuge within the smooth world of classical calculus (a world now known to have been initially crafted by Kerala mathematicians, but I digress). Here, alpine slopes of gracefully changing functions invariably allow you to rapidly ski down to a uniquely best solution. Like the truism "Football is a simple game; 22 men chase a ball for 90 minutes and at the end, the Germans win", an applied-math legend is that the Gaussian hare invariably trumps the Laplacian tortoise. Unless, of course, modern-day optimization solvers and decision science come into play and allow Laplace to overcome the various complications posed by non-smoothness. (Image below linked from http://www.econ.uiuc.edu)
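The race is easy to stage in a few lines: least squares is the Gaussian hare (smooth, closed form), least absolute deviations is the Laplacian tortoise, and the tortoise's non-smooth objective is exactly a small linear program. A sketch on toy data with one gross outlier, with scipy's HiGHS backend as the LP solver:

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, x.size)
y[-1] += 40.0                     # one wild outlier

# Hare: the L2 fit has a closed form.
a2, b2 = np.polyfit(x, y, 1)

# Tortoise: the L1 fit as an LP over z = (a, b, e_1..e_n),
# minimizing sum(e_i) subject to |y_i - a*x_i - b| <= e_i.
n = x.size
c = np.r_[0.0, 0.0, np.ones(n)]
I = np.eye(n)
A_ub = np.vstack([np.c_[-x, -np.ones(n), -I],    #   y - a*x - b  <= e
                  np.c_[ x,  np.ones(n), -I]])   # -(y - a*x - b) <= e
b_ub = np.r_[-y, y]
bounds = [(None, None), (None, None)] + [(0, None)] * n
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
a1, b1 = res.x[:2]

print(f"L2 (hare):     slope={a2:.2f} intercept={b2:.2f}")  # dragged by the outlier
print(f"L1 (tortoise): slope={a1:.2f} intercept={b1:.2f}")  # close to (2, 1)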


Degeneracy
Well-specified decision models in a variety of industrial situations admit plenty of good-quality answers, so the problem is usually one of having 'too many' rather than too few options (my favorite academic researchers tackle increasingly difficult unsolved problems, whereas my favorite OR practitioners compete on identifying the easiest unsolved problems). A fundamental rule of thumb of practical decision optimization modeling is to advantageously exploit and defuse this often-deadly problem of 'degeneracy' that characterizes practical LP formulations. A reasonably skilled analyst can turn this potential numerical liability of the LP model into a business-analytical asset, as follows.

AOS often hide in plain sight
The presence of alternative answers forces us to revisit our translation of business rules into an LP and devote some time toward gainfully analyzing the differences in the potential business impact of these seemingly equal solutions. The customer can use this feedback to re-examine and further refine his/her business goals and priorities. This process of iteratively improving the design specification provides valuable insight to customers, and helps set up a richer, more practical, and more robust optimization model. Not that a similar exercise is impossible to accomplish using smooth approximations - just that the flexibility afforded by an LP model is often superior, and the tool kits for analyzing LP solutions have gotten amazingly better over the years.

Means and Ends
It just doesn't make sense to crunch through 21st-century "Big Data" analytics using bleeding-edge data mining, econometric, and machine learning methods on the one hand, and on the other hand, downgrade to 19th-century techniques or random black-box methods to manage the underlying decision optimization problems, producing severely suboptimal and non-robust solutions. Using such shortcuts because "a quick one-iteration improvement is all that is needed" brings along some risky side effects and potentially leaves a big chunk of 'Big Data' value on the table. Do everybody a favor and upgrade to a Laplacian tortoise (e.g., CPLEX) and you will be surprised to see how fast it runs, especially on Big Data.

Thursday, August 2, 2012

Optimal Location of Speed Limit Signs

One of the nice things about the relatively more recent automobile GPS products is that they are able to inform us of the current speed limit. Sometimes, when we take our eyes off the road for a second to grab a soda, we may have driven past a speed limit sign and be unaware of the new speed limit. Of course, there are times when the GPS unit itself does not display the speed limit for certain areas, and at other times, it is off by 10 mph, perhaps due to recent road updates. Like any decision support system, the GPS unit cannot be a fail-safe backup for user negligence.



The problem of optimally locating speed limit signs on a network must have been studied and solved a long time ago, especially since the analysis of traffic and transportation networks has long been a popular research area among ORMS folks. Perhaps there exists a practical combinatorial optimization problem in determining minimal/safest/least-ambiguous ways of locating speed signs in the presence of scarce resources and budget limits. A partial list of assumptions and constraints includes:
- Speed limits change in discrete quantities of 5mph or 10 mph
- Every driver will treat the last speed limit sign (or update) they saw on the network as the prevailing speed limit
- Every driver at every point in the road network must be in unanimous agreement on what the speed limit is. This is the ideal situation.
- In practice, perhaps the above requirement can be relaxed to stipulate that any non-trivial ambiguity must be resolved within a certain time or distance threshold whose value is location-specific
- Different types of speed limit signs are possible - they may be time-dependent (day/night) as well as location-dependent (school, bridge, tunnel)
- Speed limit values can be fixed or variable ("smart roads"). This requirement can potentially inject a dynamic optimization aspect into the problem.

Perhaps a greedy method may be sufficient to generate a good answer to the (fixed-value) speed-limit signpost location problem; a toy sketch follows below. There are, of course, numerous other related location optimization problems on road networks: locating advertisement hoardings, direction/information signs, emergency vehicles, detours, etc. All these models appear to be well studied in the literature. The proliferation of smart-phones can also have an impact on future 'smart road network' design in general. All in all, location, location, location science continues to be a very interesting sub-area of Operations Research.
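Here is a toy sketch of that greedy idea, under two simplifying assumptions (both hypothetical): a sign is mandatory at every speed-limit change point, and reminder signs are spaced so that no stretch between consecutive signs exceeds a policy threshold:

import math

def place_signs(road_length, change_points, max_gap):
    # Greedy toy: a sign at every limit change, plus the fewest evenly
    # spaced reminders so no gap between signs exceeds max_gap (so a
    # driver who missed a sign is never 'stale' for long).
    mandatory = sorted({0.0, *change_points, road_length})
    signs = []
    for lo, hi in zip(mandatory, mandatory[1:]):
        signs.append(lo)
        extra = math.ceil((hi - lo) / max_gap) - 1    # reminders needed
        step = (hi - lo) / (extra + 1)
        signs.extend(lo + k * step for k in range(1, extra + 1))
    signs.append(road_length)
    return signs

# Hypothetical 20-mile road with limit changes at miles 6 and 13, and a
# policy that no gap between signs may exceed 4 miles.
print([round(s, 1) for s in place_signs(20.0, [6.0, 13.0], 4.0)])
# -> [0.0, 3.0, 6.0, 9.5, 13.0, 16.5, 20.0]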

Sunday, July 22, 2012

Ekthetikophobia: The Fear of the Exponential

More specifically, the fear of having to solve an NP-Hard optimization problem to save your job. 

Do you or anyone else in your organization exhibit symptoms of Ekthetikophobia?

Sample Responses by Ekthetikophobes
1. Resign: Capitulation is by far the most common response, especially if the person does not have an ORMS background and is oblivious to the 'science of better'. Throw hands in the air, lose faith in humanity, and slurp spoonfuls of the nearest meta-heuristic algorithmic prescription provided by bees, ants, bacteria, squeaky wheels, mutant turtles, ... any nature cure that uses pseudo-random numbers. This response is best captured by the Hindi proverb "Naach Na Jaane, Aangan Tedha", i.e., a bad dancer blames the uneven floor.

2. Controlled Panic. This is pretty typical of a generation of OR practitioners weaned on CPLEX. Like MDs trying to decode a deadly new strain of flu, the overriding urge is to throw money at the problem by ordering the latest versions of the baddest bunch of industrial strength optimization solvers in the market with matching supercomputing accessories to run massive benchmarks, and generally scare the pants off their IT managers who work with depression-era budgets.

3. Code. This is relatively more common among programming gurus and follows exactly one rule: if you throw sufficiently well-written code at any problem, it must work. This is so crazy that these guys may just be on to something here.

4. Publish. The median response from E.phobic research folks is to put this exhibit on a pedestal for everybody to gaze at. It's akin to a biologist discovering a new species or an astronomer sighting a new planet that shouldn't have been there. Half of these people (surely OR types) will go on to demonstrate the monstrosity of this new problem based on diabolical worst-case instances that even a God who plays dice would not inflict on mankind. The other, more elegant half (theoretical CS E.phobes) propose 'factor of 2' approximations that cover all remaining difficult instances missed by the Muggles, yet are just as useful practically, before declaring victory. Then, over the next two decades: keep shaving off that factor; rinse and repeat; it's a veritable cottage industry.

Exaggerations aside, ranked at the top of the most systematic, resourceful, innovative, and practically useful responses toward 'new' NP-Hard optimization problems must be the approach adopted by Dantzig, Fulkerson, and Johnson (1954) to tackle a 49-city Traveling Salesman Problem using 1950s computing technology and fearless minds that dared lasso the exponential.

July 22: minor fix: added missing text.

Tuesday, July 17, 2012

Alternative Optimal Solutions to the Coin Mix Problem

The solution to the 'optimal change' problem analyzed in the previous post is further examined from a practice perspective. Again, we use CPLEX to do this since it is quite convenient for such analyses.

Problem 1 is an integer knapsack problem that is known to be theoretically NP-Hard but fairly easy to solve in practice using Dynamic Programming. For this specific coin instance, enumeration with simple pruning rules that exploit the problem structure would work too.
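A minimal DP sketch for that Problem-1 flavor (make exact change with minimum total weight), using the published US Mint coin weights in grams:

GRAMS = {1: 2.500, 5: 5.000, 10: 2.268, 25: 5.670}   # US Mint specs

def min_weight_change(v):
    # best[a] = minimum grams needed to make exactly a cents.
    INF = float("inf")
    best = [0.0] + [INF] * v
    take = [0] * (v + 1)          # last coin used at each amount
    for a in range(1, v + 1):
        for coin, w in GRAMS.items():
            if coin <= a and best[a - coin] + w < best[a]:
                best[a], take[a] = best[a - coin] + w, coin
    mix, a = {}, v
    while a:                      # walk back to recover the coin mix
        mix[take[a]] = mix.get(take[a], 0) + 1
        a -= take[a]
    return best[v], mix

print(min_weight_change(96))   # ~24.046g; ties mean several optimal mixes (see below)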

Globally Robust versus Scenario-Optimal
Problem 2 is relatively trickier. It attempts to minimize the maximum weight across the feasible solutions to each of the change scenarios. The value of the 10-coin min-weight solution in the previous post,
Pennies:Quarters:Dimes:Nickels::4:3:2:1, is slightly more than a dollar ($1.04). This coin mix will be able to satisfy any random change request in the range 1-100c. Note: if we only cared about change between 1-99c, an optimal solution (35.412g, 99c) is PQDN::4241.
However, while an optimal solution to Problem 2 is robust in terms of this ability to meet any request, it is not necessarily optimal in terms of meeting specific coin requests. For example, if we only ever use change for a 3pm $1.68 cup of Starbucks in the cafeteria, then the 4-3-2-1 solution may not be the best. However, if, by chance, we choose to downsize to a smaller $1.29 cup one day, then the 68c-tailored solution may not work.
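Claims like 'may not work' are cheap to verify with a brute-force feasibility check over a pocketful of coins (a hypothetical helper; coin values and amounts in cents):

from itertools import product

def can_make(amount, mix):
    # Can this pocketful of coins pay `amount` exactly?
    # mix maps coin value -> count on hand; fine for small mixes.
    coins, counts = zip(*mix.items())
    return any(sum(c * k for c, k in zip(coins, combo)) == amount
               for combo in product(*(range(n + 1) for n in counts)))

print(can_make(68, {1: 4, 25: 3, 10: 2, 5: 1}))  # True: the robust 4-3-2-1 mix
print(can_make(29, {1: 3, 25: 2, 10: 1, 5: 1}))  # False: a 68c-tailored mix can fail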

First Among Equals
We only used a single goal within the optimization formulations: either min-sum, min-count, or min-weight. As mentioned in a recent post, practical optimization models almost never go into production carrying just a single goal. Consider these alternative optimal solutions for 96c:

value  #coins  weight(g)  p n d q
 96       9     24.046    1 0 7 1
 96       6     24.046    1 0 2 3

The second solution has the same weight but fewer quarters + dimes, and does the same job a little more efficiently. On the other hand, the first solution perhaps allows greater flexibility in terms of meeting other change requests (e.g. 30c, 40c).  

Continuing along this path,
a) if we set the lower bound on the number of dimes in Problem 2 to 7, we obtain the following 13-coin alternative to the 4-3-2-1 answer, having the same weight and value (36.546g, $1.04): PQDN::4171


b) If we want a least-value solution to Problem 2, the optimal objective function value is $1.00, at a weight of 37.912g, only about 1.4g more than the min-weight solution:
PQDN::5241 or PQDN::5091
(If you ever change a dollar bill in the future, do so in a robust manner via this mnemonic: PQDN 5241 :)


Interestingly, these $1 solutions include five pennies (a feasible 4-penny solution has a value of $1.04 or more).

Depending on the context, one of these solutions will be preferable to the others, and this preference can be quantified by suitably incorporating secondary and tertiary goals within the objective function (a sketch follows below). In contrast, a pure constraint satisfaction model will solely focus on generating feasible solutions, ignoring any preference. These principles are the cornerstone of many mission-critical optimal planning systems and operational decision support applications deployed across industries.
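One way to encode such a 'first among equals' preference, extending the DP sketch above: make the DP state a (weight, coin count) tuple, so that Python's tuple comparison yields a lexicographic objective with minimum weight first and fewest coins as the tie-breaker (weights in integer milligrams so ties are exact):

MILLIGRAMS = {1: 2500, 5: 5000, 10: 2268, 25: 5670}

def lex_change(v):
    INF = (float("inf"), 0)
    best = [(0, 0)] + [INF] * v   # best[a] = (min mg, fewest coins) for a cents
    take = [0] * (v + 1)
    for a in range(1, v + 1):
        for coin, w in MILLIGRAMS.items():
            if coin <= a:
                cand = (best[a - coin][0] + w, best[a - coin][1] + 1)
                if cand < best[a]:            # lexicographic comparison
                    best[a], take[a] = cand, coin
    mix, a = {}, v
    while a:
        mix[take[a]] = mix.get(take[a], 0) + 1
        a -= take[a]
    return best[v], mix

print(lex_change(96))  # -> ((24046, 6), {1: 1, 10: 2, 25: 3}): the 6-coin optimum wins the tie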

Saturday, June 30, 2012

Alternative Optimal Solutions and Combinatorial Risk

Optimization in practice is usually not just about setting up a well-behaved model with an objective function and finding 'the optimal answer'. While that is an interesting exercise, the real 'value add' comes from the subsequent process of recognizing the business reality that a practical decision problem typically has many answers. Consequently, analyzing alternative optimal solutions in a way that is useful to the client is quite important. In other words, practical optimization is more often about analyzing feasible alternatives that initially appear to be equal to us, but in reality have vastly different qualities from our client's point of view (note: returning 'infeasible' is not really an option, and showing why our model returned 'infeasible' is only slightly more useful).  As we initiate a dialogue with our client to understand these differences and the context in which some alternatives are better than the others, we can see our lab model gradually transform into a useful business analytics tool.

Cutting across industries, I have not yet come across a single optimization problem deployed in production that does not have multiple objectives. Every seller loves to maximize profit, but not at the cost of losing out on volume or sales dollars in the process. Over time, the number and priority of such considerations change. For example, in the airline industry, it is not uncommon for large-scale crew schedule planning problems to have hundreds of different goals and priorities. The richer the solution space, the greater the number of goals, it seems. In fact, optimizing just a single measure is risky because such a gain ("extreme point") almost always comes at the expense of other metrics that haven't been included within the analysis yet. Which leads us to:

Combinatorial Risk
This hidden problem of 'risk', even within a deterministic modeling context, is exacerbated in combinatorial (or global) optimization situations. Here, our model analyzes multiple inter-linked decisions and can produce solutions that radically differ from current practice and look great numerically, but in reality can potentially hurt our client's business if actually used. 'Locally optimal' does not always and automatically mean 'inefficient'. Like globalization, combinatorial or global optimization-based holistic decision making can and does bring in more efficiency and profitability compared to that obtained by combining multiple locally optimal decisions, when things go as per forecast. On the other hand, if the alternative (near-)optimal scenarios, along with their corresponding risk of failure, are not well mapped out and thought through, the resultant machine-generated combinatorial solution can cascade the risk of a bad decision through the system.


Part-3 here.

Wednesday, May 23, 2012

Optimizing your Microwave Oven Performance using an Appalam

A far-out post on 'optimization' and 'parameter estimation'.

No two microwave ovens behave the same way even if they have the same power rating and capacity, and over time they show increased randomness in terms of energy usage and solution quality. Each one has its idiosyncrasies, and as you move between apartments over the years, it is irritating to adjust to a new oven. After you invest all that time in figuring out that the optimal setting to warm your cold coffee in the previous home was 50 seconds, using the same timing with the new one results in a lot of spilled coffee. Furthermore, objects get heated slightly quicker when placed in certain locations of the oven. What is a quick way to optimize your vessel placement and minimize your turn-around time? Enter the Appalam.

The Appalam, or the Pappad as it is called in Northern India, is a circular, flattened, wafer-thin dry mixture of lentils and spices. It is an almost fat-free, delicious snack if you microwave it, although it can also be fried. In particular, the best brand for this test is the Lijjat Pappad, which is (or used to be) handmade. The Lijjat company rose from a tiny all-women cooperative start-up in rural India (based on Mahatma Gandhi's principles of self-reliance, theirs is a remarkable and inspiring success story) and is still going strong, producing outstanding Pappads in a variety of flavors.
 

The idea is pretty simple: The Lijjat Pappad (LP) has a relatively large surface area that covers 75-100% of the circular glass tray in most residential microwaves. Microwave the LP for about 60 seconds and note which of its parts get heated up first (visually seen via a change in color) and how the cooking progresses. Within 60 seconds, you should be able to figure out the hottest and the coldest spots in the oven. In my current residence, the middle of the oven turned out to be the coldest, which was the exact opposite of the result in my prior residence. Of course, this is not intended to be a universal setting and only applies to a subset of foods being heated up.

Why the LP works relatively well for such a test:

1. The consistency of the LP mix is quite remarkable. It is neither too thick to resist microwaving, nor too thin, and it very rarely exhibits any significant warping even after 120 seconds of microwaving (it will simply get carbonized before it warps).

2. The material naturally does not conduct heat well and convection doesn't help much either, so the localized heating effects show up visibly.

3. You can make a meal of your experimental subject once your test is complete. No test goes to waste!

Tuesday, April 17, 2012

Analytics and Cricket - VIII: DRS & Bayes Theorem

In the last post on cricket, we mentioned that the false positive (F+) issue with the Decision Review System (DRS) employed in international cricket could be a deal-killer (see red zone in picture below).

In this post, we work out an illustrative numerical example using a well-known conditional probability model, based on reasonable data derived from interviews of ICC personnel, to show that the current F+ rate disproportionately reduces the efficacy of the DRS, causing it to operate only marginally more effectively than the human-only (umpire) method; it thus may not be worth the cost of maintenance unless the F+ rate is reduced to a more acceptable level.

For brevity, let's focus on bowler reviews in this example. A bowler will ask for a machine review of an umpire's original decision of not out, hoping to turn that into an 'out'. Umpires in the elite panel are themselves around 90% effective in making the right decision (so on average, only 10% of the subsequent DRS referrals should change the outcome if they work perfectly), so it is really that 10% gap that is the problem.

Today's cricket DRS system is claimed to be around 95% accurate in giving a batsman out, if in fact, the batsman is really out. Suppose the DRS also yields F+ results for just 1% of the bowler reviews, i.e. it gives a batsman 'out' when he is really 'not out' just like the umpire originally said. If 10% of the batsmen subject to bowler reviews are actually out (as obtained in the previous paragraph), what is the probability that a batsman is actually out given that the DRS overturns the umpire's decision to say he is out?

Answer: Let OUT be the event that the batsman reviewed is actually out (its complementary event is NOTOUT), and RED the event that DRS gave him out. The desired probability P(OUT|RED) is obtained using the Bayes formula by:

P(OUT|RED) = P(OUT ∩ RED)/P(RED)
Expanding out the terms, we can write this as
= [P(RED|OUT) x P(OUT)] /
[P(RED|OUT) x P(OUT) + P(RED|NOTOUT) x P(NOTOUT)]

= [0.95 * 0.1] / [0.95*0.1 + 0.01 * 0.9]
= 0.095/0.104 = 91%

Observations
1. Even a 1% F+ rate brings down the true efficacy of the DRS: it is not 95% as the ICC claims. The second term in the denominator combines the F+ rate and human accuracy. Thus
P(OUT|UMPIRE SAYS OUT) = 90%
If the bowler asks for a DRS review:
P(OUT|DRS SAYS OUT) = 91%
Not much of an improvement.

2. The better the umpires get at their job, the worse the existing DRS will statistically perform. For example, if the umpires improve upon their accuracy by just one percentage point, i.e., to 91%, the conditional accuracy of DRS changes to:
= [0.95 * 0.09] / [0.95*0.09 + 0.01 * 0.91]
= 90%


Thus the tables are turned: the DRS now makes things worse for batsmen, and we may be better off not using it at all, even if it is provided free of cost!

This second result may seem puzzling. Why does this happen? If the umpires get better, the frequency of true NOTOUT among reviews is 1% higher, and with the F+ rate held constant at 1%, there will be an increase in the total number of false positives over a period of time, in addition to a small decrease in the count of true positives, thereby reducing the accuracy rate of the DRS.

You can plug in a variety of numbers to see what the corresponding results are. You can also perform a similar analysis for batsman reviews.
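A plug-in calculator for exactly that; the first two calls reproduce the 91% and 90% figures above, and the third shows what a tenfold cut in the F+ rate would buy:

def p_out_given_red(sensitivity, false_pos, prior_out):
    # P(OUT | DRS says out) for a bowler review, by Bayes' formula.
    num = sensitivity * prior_out
    return num / (num + false_pos * (1.0 - prior_out))

print(round(p_out_given_red(0.95, 0.01, 0.10), 3))   # 0.913
print(round(p_out_given_red(0.95, 0.01, 0.09), 3))   # 0.904: better umpires, worse DRS edge
print(round(p_out_given_red(0.95, 0.001, 0.10), 3))  # 0.991: cut the F+ rate tenfold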

Recommendations:
1. Significantly cut down on the F+ rate, rather than focusing purely on increasing the true positive rate

2. Improve the quality of original human decisions. This will reduce the dependence on DRS, encourage improvements in the DRS to keep pace, and obviously improve player attitude toward umpires.

3. If a person with brilliant cricketing instincts like Mahendra Singh Dhoni talks about the 'adulteration of human and machine', do think twice about it; he's got a useful math model behind that statement!

Reference: Introduction to Probability Models by Sheldon M. Ross. This example is a variation of an example from this book. Hope I did not mangle it.

Saturday, February 25, 2012

Analytics and Cricket - VII : Does DRS have a False Positive issue?

The last post related to cricket was quite a while ago (that the Indian cricket team has been repeatedly thrashed since then is a mere coincidence). This post focuses again on the Decision Review System (DRS), a technology-aided analytical decision-support system to aid cricket umpires. The toolkit includes a set of multiple video cameras, heat-sensing 'hot spot' technology, and ball-tracking devices that record the point of impact, as well as an additional set of predictive algorithms to forecast the counterfactual trajectory of the cricket ball (you can be forecast 'out' in cricket). Despite the best efforts of cricket's custodians, considerable user unease with the DRS persists. In fact, it has recently been acknowledged that the use of the decision support system has had a significant impact on the game (user response: altering playing styles and inducing more 'OUT' decisions from umpires), something this tab predicted a year ago. Reasons for discomfort also include the lack of uniformity in its deployment, the incremental dollar cost of the DRS versus incremental returns, and, equally importantly from a fan and player perspective, DRS reliability (both real and perceived). This post will focus on the last two issues.

The International Cricket Council (ICC) has focused almost exclusively on improving the technology (e.g., an increased number of video frames per second). The main argument here is that while an improvement in the unconditional success rate of the DRS may seem impressive, it would be more helpful if statistics were calculated and presented conditional on the corresponding human decisions. Toward this, let's look at this MBA-ish 2x2 decision matrix (sorry). Strictly speaking, the terms 'correct' and 'incorrect' in the matrix mean 'almost surely correct' and 'almost surely incorrect', respectively.




1. The ICC has a wonderful set of umpires in their 'elite panel' that referee the most important inter-nation test matches (these elite umpires are a scarce resource, and their globe-trotting schedule optimization is yet another operations research problem - perhaps a good topic for part-8 of this series). Prior to the DRS, the umpires achieved a respectable success rate of more than 90%. Consequently in such situations, the DRS getting it right is a relatively uninteresting event. This situation is denoted as the neutral zone (top-left box). Therefore the focus is on the remaining 7-10% of the time when the decisions are contentious.

2. Clearly, the case where the umpire is wrong and the DRS is right (as judged by video and predicted-trajectory evidence) is a win-win for the DRS and players. This is the green zone (bottom left) and appears to be the exclusive area of the ICC's focus as far as technological improvements go. However, it is not necessarily desirable to accord top priority to the goal of achieving further improvements in this statistic.

3. The problems arise when the DRS occasionally produces visibly and audibly confounding results. This is represented by the top-right box, the 'high conflict zone'. In some instances, it could be because of technological gaps or operator error (there was a recent example where an umpire whose sole job consisted of watching the TV replay and hitting one of two buttons managed to hit the wrong one). However, in other instances, the predictive component of the DRS that is used to probabilistically judge LBW (leg-before-wicket) 'OUT' decisions appeared to be flawed or incompatible because:

a. The greater the required length (or duration) of the prediction, the noisier the forecasted trajectory.
b. The smaller the observed portion of the ball trajectory available for 'training' (especially after the ball spins and bounces off the cricket pitch), the less reliable the prediction.

The umpire's years of prior refereeing experience, and the other human cognitive powers that help him arrive at a decision, are pitted against hardware and algorithmic prediction prowess. The challenge is to be aware of the many degrees of freedom of a rotating cricket ball in motion while also taking into account the effect of the cricket pitch and local conditions.

4. There may be rare, irritating cases where, despite best efforts, uncertainty prevails and both the umpire and the DRS manage to get it wrong (bottom-right box).

If the ICC can provide data on the frequency of observations that fall in each of these 4 boxes, we can of course calculate the conditional probability of a correct decision given the DRS response, using well-known conditional probability models, and compare with the corresponding results for the manual system. For example, how likely is it that the batsman is actually OUT given that the DRS overruled an umpire's original 'NOT OUT' decision? Such analyses help figure out the impact of the false positives and false negatives that comprise the conflict-zone observations. In particular, the false-positive rate, i.e., the case where a batsman tests positive ('OUT') using the DRS when he is actually NOT OUT, should be minimized given the nature of this sport.

Recommendations
The biggest stumbling block appears to be the top-right box (high-conflict zone), which erodes user trust every time the DRS wrongly overrules what appears to be a sound cricketing decision by the umpire. As a priority, the ICC should isolate and eliminate those components that increase the occurrence of such situations. The likely candidates for culling will be the trajectory predictor and existing flawed versions of 'hot spot'. These innovations should be reintroduced at a later stage only after sufficient improvements have been made (while also keeping the resultant cost down) to ensure that the expected failure rates are well under control. Viewed from this perspective, a recent decision by the Indian cricket board to do away with the predictive component of the ball-tracking technology is actually the right one.


Sunday, December 11, 2011

Solver precision: a consumer perspective

This is a quick post motivated by a useful question posed in a nice blog on OR software that wondered about the need for so many decimal points of precision in commercial solvers. As Prof. Paul Rubin touched upon in his brief response, there are always going to be a few numerically unstable instances that may justify this level of precision. The question as well as the response was quite instructive, and this post mostly adds context.

Speaking from the perspective of a consumer who deploys such solvers, this quest for precision is not just an academic exercise. Ill-conditioned instances are inevitable in certain business situations and are not (yet) rare. A customer (e.g., a store manager using an algorithmic pricing software product) routinely stress tests their newly purchased or updated decision-support system that has such a solver hidden inside it. Customers will sometimes specify extreme input values to gain a 'human feel' for the responsiveness of the product that their management asked them to use. In fact, even routine instances can occasionally defeat the best of data-driven scaling strategies, and the old nemesis, degeneracy, always lurks in ambush.
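For readers who want to see how an innocent-looking instance defeats finite precision, the classic demonstration is the Hilbert matrix: the computed solution has a tiny residual yet a large error. A quick numpy/scipy sketch:

import numpy as np
from scipy.linalg import hilbert

n = 12
A = hilbert(n)                  # notoriously ill-conditioned
x_true = np.ones(n)
b = A @ x_true

x = np.linalg.solve(A, b)
print("condition number ~ %.1e" % np.linalg.cond(A))
print("residual ||Ax - b|| = %.1e" % np.linalg.norm(A @ x - b))   # tiny
print("error    ||x - x*|| = %.1e" % np.linalg.norm(x - x_true))  # not tiny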

Such solvers also have to be robust enough to work 'off the box' for millions of varied instances (to the extent possible) across industries and across changing business models within any given industry. Being 'optimally precise' may pay off in terms of reduced support and maintenance costs for the vendor who writes the solver code, as well as for the consumer who deploys it, and fewer headaches for the end-user, the customer.

With the information available in journals and the Internet, it is realistically possible for a DIY practitioner to assemble a reasonably fast and robust solver to handle their own company's LPs, but the task of building high-precision, enterprise-level MIP solvers is best left to the specialists.

Tuesday, November 29, 2011

American Airlines Chapter 11: An OR Opportunity within a crisis

The news of American Airlines' Chapter 11 may sadden some members of the OR community, given the airline's pioneering work in revenue management apart from its amazing impact on overall ORMS practice in the airline industry. However, this crisis is also an opportunity for O.R. The company will be taking a close look at its business model as well as its more tactical operations, and ringing in a chain of meaningful improvements. Speaking from a similar 'Chapter 11' experience during the last decade, this filing represents an opportunity that OR practitioners must grab. This situation highlights what this tab has talked about before: having a well-trained ORMS team with domain expertise is invaluable, and I'm sure AA has retained a lot of their experts.

Specific example
The renegotiation of the existing collective bargaining agreements (CBAs) with various unions is a great application example that can generate significant 'true dollar' returns. OR can be gainfully used to analyze rule changes that significantly improve efficiency while also improving the quality of work life of stakeholders, to produce win-win deals. During 2003-2006, OR folks at a similar legacy US carrier were able to quantify and successfully redefine a set of highly complex work rules that ensured better utilization and compensation without sacrificing quality of work life. What's more, such feedback can help prevent CBA negotiations from stalling and win some trust from both sides of the table. In such situations, the management is likely to be more flexible in 'opening up' old rules, allowing the full use of powerful solvers and large-scale optimization models to determine the maximum level of improvement possible by replacing such legacy constraints with more sensible ones. Every (deterministic) dollar saved during such stressful times is precious and highly appreciated. In short, Chapter 11 is a time when OR folks on the ground can make their presence felt and directly contribute to the bottom line. A friend in need is OR indeed.

update: some typos fixed

Sunday, November 13, 2011

The best solver award in a decision support role goes to ..

Major ORMS conferences typically coincide with new releases of existing commercial optimization software products like CPLEX and Gurobi, as well as the introduction of new ones. In the last few days, we have seen the entry of an exciting new solver product, Sulum Optimization. Dr. Bixby's eagerly awaited updates on the state of the solver universe speak of truly breathtaking improvements. So how important are such off-the-shelf solvers in practice?

A complementary 'state of the union' question worth asking is: how much progress have humans achieved in analyzing decision problem formulations in practice? Glancing through the journal papers of the last three decades and comparing the work with the approaches taken by researchers in the 1940s-1970s, I couldn't help but feel that there has been no improvement, perhaps even a retrogression, along this dimension. It's tough to quantify the answer, but we can try to describe reasonable boundary conditions.

1. A terrible formulation is one which cannot be readily analyzed to determine an acceptable solution to the original business problem. This is not the same as 'finding a feasible solution to the mathematical optimization model', although these questions are somewhat related. Yet, a whole lot more time and publicity is directed toward the latter question. Apparently quite a few MI-Hard (mission impossibly hard) problems have been discovered out there, especially in the last three decades. These instances are so mysterious that even an acceptable solution is elusive despite the obviously massive progress in solvers and hardware. What does this tell us? An acceptable solution is either already available or can be found using OR / business insight. I certainly haven't yet come across such an MI-Hard instance across multiple industries.

2. A viable formulation is one which you can analyze to find an excellent answer to the original problem by partial or complete enumeration, within a reasonable amount of time. In other words, you are doing your job as a practitioner well if your design simply avoids crappy solutions to the original problem, leaving you with the task of scanning just the few remaining good ones. This process is merely a refinement of the 'acceptability analysis' in (1) (Watson would ask: what is 'constraint programming'?). Not only is the process of being a 'human solver' quite enjoyable, it also goes a long way in convincing your employer and customers that a competitor cannot simply buy their own solver and achieve parity. Plus, it builds street-cred for you and OR. On the other hand, a few journals love to have MI-Hard problems subjected to 'shock and awe' within their pages.

Summary
There may be rare exploratory formulations that require the full power of such solvers to generate insight when very little is practically known about the original problem. In general, solvers are useful in practice like health insurance is useful in health care. Some feel that buying the best or most expensive health insurance will somehow translate into an equivalent improvement and stability in our health, forgetting that fundamental choices like exercise, lifestyle, and diet are far more important factors. In fact, if we find that we are using our health insurance a lot, it is all the more important to question those fundamental choices.

'Which is the best solver today' is an interesting question, but the best decision model that you build will be one that most certainly does not depend on the answer.

Friday, January 7, 2011

About a microanalytical startup with 'OR Inside'

Continuing with the new year theme, we take a first look at CQuotient, an analytics-driven start-up in the retail industry based in the Boston area. I just finished reading this interview on a business info site. It's interesting to hear what the founder and CEO Dr. Ramakrishnan has to say (he's also got a blog listed on the roll at the bottom right of this tab. I don't expect frequent updates for a while :). A couple of things were eye-openers.

They seem to be among the very first to sharply focus on individual customer behavior. Are we seeing some of the first practically viable applications of microanalytical (copyright, 2011 :) techniques this year? Retail science normally thrives on aggregating individual customers into sufficiently big bunches so that the law of large numbers kicks in. Then you can reliably analyze statistical and econometric models to realistically predict and optimize based on these high-level purchase patterns. A retail microanalytical approach that drills down to the individual customer level looks pretty challenging to pull off in reality, but looking at the team assembled at CQuotient and the computing power available today, I wouldn't be surprised if they are onto something here.

Next, CQ will provide an 'optimal prescriptive' answer to a retailer. This convinces me that they have an application with "OR Inside", and their 'coolness quotient' just went up :). Rather than just dump a bunch of charts and qualitative insights on a tired, caffeine-deprived store-manager type and wish them good luck, CQ seems to take it a step further and provides optimal decision recommendations to the retailer. Practical decision analytics can give you a pretty powerful edge since it can potentially eliminate or minimize a lot of costly guesswork. In the retail industry, which is characterized by wafer-thin margins and brutal competition, such OR-based innovations can be a big deal.

A minor grouse is that the word "OR" doesn't show up in the interview, but the content shows that all the good stuff is likely to be hidden inside. The scope for OR in the new world remains undiminished, especially if somebody is brave enough to dip their hands in messy data and put their money where their model is!

Sunday, November 1, 2009

On decisioneering and dealing with sneering detractors

Part of an O.R. practitioner's job involves selling O.R. to non-believers in the organization. Yet many of us in the O.R. comfort zone are firm non-believers that such non-believers even exist. After all, isn't the 'science of better', or its applied counterpart 'decisioneering', self-explanatory? It isn't. The 'analytics' bandwagon is going to ensure that. Last time we looked at the identity crisis facing the poor OR guy. Today, we'll examine more related aspects.

When we say a product has got 'O.R. inside', what do we really mean? Is it because it's been autographed by that lost O.R. scientist whose owlish ^oo^ spectacles always make you think 'infinite loop'? Or is it the bullet-proof C++ code of O.R. algorithms, the fiendishly reformulated optimization model, or the brand-new, low-latency, 16M$, 32-node, 64-bit, 128-GB SMP RAM parallel machine (yummy!) that smashes through all your Lagrangian subproblems in a jiffy? Or perhaps it's all in the GUROBI or CPLEX solvers that implement the fundamental algorithms?

The old bilateral debate of man v machine, in this context, starts with 'Math v Programming', and in true O.R. fashion, cascades into some NP-complete combinatorial debate. Heh. The obvious answer to many may be 'all of the above', but call me biased - I feel that it's the well-trained O.R. grad, her/his model, and the solution approach that seal the deal here. Everything else is essentially a commodity that can be quickly purchased, and therefore forms the supporting cast. (The real answer, of course, is 'none of the above'. It's the PowerPoint decks that made it all happen.)

Seriously, a practitioner has to have all the soft skills to ensure that O.R. gets some small share of credit in such projects, especially when things go right. After all, when it fails, it's because of the O.R. inside. It's because of you. Everything else was purchased and works just fine! Suddenly, you alone know which constraint is hurting profits the most, or why a few more discrete variables kill run-times, or whether the exponential service-time assumption holds. Which brings me to probabilistic 'OR inside' models in practice (more on that another day). By design, such a model is going to give you 'wrong' answers some of the time - unlike deterministic models that provide the illusion of correctness all the time. Good luck selling that!