Sunday, December 11, 2011

Solver precision: a consumer perspective

This is a quick post motivated by a useful question posed in this nice blog on OR software that wondered about the need for so many decimal-points of precision in commercial solvers. As Prof. Paul Rubin touched upon in his brief response, there's always going to be a few numerically-unstable instances that may justify this level of precision. The question as well as the response was quite instructive and this post mostly adds context.

Speaking from the perspective of a consumer who deploys  such solvers, this quest for precision is not just an academic exercise. Ill-conditioned instances are inevitable in certain business situations and are not (yet) rare. A customer (e.g. a store manager using an algorithmic pricing software product) routinely stress tests their newly-purchased/updated decision-support system that has such a solver hidden inside it. Customers will sometimes specify extreme input values to gain a 'human feel' for the responsiveness of the product that their management asked them to use. In fact, even routine instances can occasionally defeat the best of data-driven scaling strategies, and the old nemesis, degeneracy, always lurks in ambush.

Such solvers also have to be robust enough to work 'off the box' for millions of varied instances (to the extent possible) across industries and changing business models within any given industry. Being 'optimally precise' may pay off in terms of reduced support and maintenance costs for the vendor who writes the solver code, as well as for the consumer who deploys it, and less headaches for the end-user, the customer.

With the information available in journals and the Internet, it is realistically possible for a DIY practitioner to assemble a reasonably fast and robust solver to handle their own company's LPs, but the task of building high-precision, enterprise-level MIP solvers is best left to the specialists.

Thursday, December 1, 2011

OR models for discouraging criminal intent

Dr. Subramaniam Swamy, the brilliant lawyer who exposed the 2G scam in India is trying to raise awareness about this issue amongst the Indian public via a series of talks around the country. In one of his talks, he white-boards a simple but useful high-level model (shown below) to determine the level of financial penalty that should be imposed to deter future scams. This is especially important in the case of crooked politicians who tend to be thick-skinned and are willing to ride out their years in jail in order to enjoy their ill-gotten stash after 'serving out their term'. Conditioning on the chances of getting caught (p),

E[reward] = (1-p)R - p.D

where R = reward from the scam, and D = penalty to be paid if you are caught (primary decision variable). Swamy's point was that, in the Indian context, even though 'p' is relatively small in value for various reasons, all is not lost if 'D' can be made sufficiently large (proportional to 'R' times the odds of evasion) to ensure that the LHS value is unattractive enough.

Let's apply this model to the roads of India, where traffic violations are the norm. A given O-D (origin-destination) path is formed by several links, where link (i) is associated with a probability p(i), reward R(i), and penalty D(i). Crooks try to find the path of least resistance, e.g, smallest cumulative 'p' or max cumulative E[reward]. The police can counter this by a solving p-median type problem (this is a different 'p') where they can optimally position their scarce resource (traffic cops) to maximize an aggregate measure of expected penalty based on link traffic volumes. Given budget restrictions, these scarce resources have become even more scarce. However, it appears that in many other parts of the world, police departments have been smart enough to create a multiplier effect via an 'illusion of ubiquity' achieved via a 'ghost archetype' that periodically and randomly enforces the law in prominent locations in a particularly visible and pitiless manner. Recognizing the 'human element', in this case, the nonlinear 'customer response' to certain deployment scenarios can lead to a solution that increases the perceived probability p(i) for certain key links in the transportation network (beyond its actual statistical rate) without increasing actual capacity. Similarly, the presence of crooked 'bribe friendly' cops at busy intersections can lead to a nonlinear drop in public trust and perceived p(i), regardless of an increase in overall capacity.

To summarize, OR based decision support models can be effective in making a dent in corruption via analytically driven decisions that help maximize the perceived penalty for criminal/corrupt acts that in turn, leads to an actual incremental reduction in the relevant crime/scam rate, despite a scarcity of resources.

Tuesday, November 29, 2011

American Airlines Chapter 11: An OR Opportunity within a crisis

The news of American Airlines' Chapter11 may sadden some members of the OR community given their pioneering work in revenue management apart from their amazing impact on the overall ORMS practice in the airline industry. However, this crisis is also an opportunity for O.R. The company will be taking a close look at their business model as well as their more tactical operations and ring in a chain of meaningful improvements. Speaking from a similar 'chapter 11' experience during the last decade, this filing represents an opportunity for OR practitioners that must be grabbed. This situation highlights what this tab has talked about before: having a well trained ORMS team with domain expertise is invaluable and i'm sure AA has retained a lot of their experts.

Specific example
The renegotiation of the existing collective bargaining agreements (CBAs) with various unions is a great application example that can generate significant 'true dollar' returns. OR can be gainfully used to analyze rule changes that significantly improve efficiency while also improving the quality of work life of stakeholders to produce win-win deals. During 2003-2006, OR folks at a similar legacy US carrier were able to quantify and successfully redefine a set of highly complex work rules that ensured better utilization and compensation without sacrificing quality of work life. What's more, such feedback can help prevent CBA negotiations from stalling and win some trust from both sides of the table. In such situations, the management is likely to be more flexible in 'opening up' old rules, allowing the full use of powerful solvers and large-scale optimization models to determine the maximum level of improvement possible by replacing such legacy constraints with more sensible ones. Every  (deterministic) dollar saved during such stressful times is precious and is highly appreciated. In short, Chapter-11 is a time when OR folks on the ground can make their presence felt and directly contribute to the bottom-line. A friend in need is OR indeed.

update: some typos fixed

Sunday, November 13, 2011

The best solver award in a decision support role goes to ..

Major ORMS conferences typically coincide with new releases of existing commercial optimization software products like CPLEX and Gurobi, as well as the introduction of new ones. In the last few days, we have seen the entry of an exciting new solver product, Sulum Optimization. Dr.Bixby's eagerly awaited updates on the state of the solver universe speak of truly breathtaking improvements. So how important are such off-the-shelf solvers in practice?

A complementary 'state of the union' question that is worth asking is 'How much progress has been achieved by humans as far as analyzing decision problem formulations in practice ?' While glancing through the journal papers in the last three decades and comparing the work with the approaches taken by researchers in the 1940s-1970s, I couldn't help but feel that there is a non-improvement, perhaps even a retrogression along this dimension. It's tough to quantify the answer, but we can try to describe reasonable boundary conditions.

1. A terrible formulation is one which cannot be readily analyzed to determine an acceptable solution to the original business problem. This is not the same as 'finding a feasible solution to the mathematical optimization model', although these questions are somewhat related. Yet, a whole lot more time and publicity is directed toward the latter question. Apparently quite a few MI-Hard (mission impossibly hard) problems have been discovered out there, especially in the last three decades. These instances are so mysterious that even an acceptable solution is elusive despite the obviously massive progress in solvers and hardware. What does this tell us? An acceptable solution is either already available or can be found using OR / business insight. I certainly haven't yet come across such a MI-Hard instance across multiple industries.

2. A viable formulation is one which you can analyze to find an excellent answer to the original problem by partial or complete enumeration, within a reasonable amount of time. In other words, you are doing your job as a practitioner well if your design simply avoids crappy solutions to the original problem, leaving you with the task of scanning just the few remaining good ones. This process is merely a refinement of the 'acceptability analysis' in (1) (Watson would ask what is 'constraint programming' ?). Not only is the process of being a 'human solver' quite enjoyable, it also goes a long way in convincing your employer and a customer that a competitor cannot simply buy their own solver and achieve parity. Plus it builds street-cred for you and OR. On the other hand, a few journals love to have MI-Hard problems subjected to 'shock and awe' within their pages.

There may be rare exploratory formulations that require the full power of such solvers to generate insight when very little is practically known about the original problem. In general, solvers are useful in practice like health insurance is useful in health care. Some feel that buying the best or most expensive health insurance will somehow translate into an equivalent improvement and stability in our health, forgetting that fundamental choices like exercise, lifestyle, and diet are far more important factors. In fact, if we find that we are using our health insurance a lot, it is all the more important to question those fundamental choices.

'Which is the best solver today' is a interesting question, but the best decision model that you build will be one that most certainly does not depend on the answer.

Friday, October 14, 2011

Long-distance convergence

The first picture tracks the achieved percentage reduction in a certain minimization objective function value (a shortest path distance) relative to it's initial value, over time.

Not monotone convergence, but still pretty good, and the relative gap finally goes close enough to zero (within a tolerance of 0.5%). Most would be happy to see this solution performance on any O.R decision problem instance in practice.

The next picture tracks the raw values corresponding to the first plot - the shortest path in miles, between the location of my ORMS job and that of my wife's (non ORMS job). The time scale is in years. This also happens to be the commute distance (logarithmic scale).

Average convergence rate = 581 miles/year
Total cost incurred = 27,500 mile-years (counting from year "-2")

Sometimes, superlinear convergence feels terribly slow, but in this era of labor restrictions, we'll chalk this down in the 'wins' column.

Saturday, October 8, 2011

Jumbo decision models to protect the environment

As the population increase in India leads to scarce resources becoming scarcer, villagers in rural India increasingly encroach on jungle areas and vice-versa. In Orissa, the government has decided to take the help of elephants to manage this conflict.

This is a true story.

These magnificent, emotional multi-tuskers with a long trunk and an even longer memory, are much loved and venerated in India. Loved this particular song sequence as a kid.

Elephants require relatively larger habitats to survive, and in recent times, we increasingly hear of elephants and other wild animals in India going on the rampage among civilian populations located close to their territory.

To tackle this problem, the government of Orissa (a beautiful, culturally rich state in eastern India) came up with a novel control method based on self-governance: train 50 Col. Hathis and position them at vantage points. These smart jumbos serve dual purposes:

Online: actively discourage militant pachyderms from venturing too far out

Offline: speed up the learning curve for newbie wild elephants undergoing training.

An initial pilot has been successful.  Read this brief but amazing report for the real costs involved, benefits that can be realized, and the encouraging preliminary results.An interesting O.R. challenge is to maximize the effectiveness of this method without adding additional capital expenses. How should the forest officers position these limited number of smart elephants to maximize their effectiveness in controlling the herds? For example: given p 'Kunki' elephants, we can determine an optimal positioning of these elephants by solving the corresponding p-median like location-allocation problem to better control intrusions, cover more (convex?) territory, or utilize fewer Kunkis to effectively accomplish the same task.

Is this a first example of how OR and elephants can combine to help protect the environment?


A Jumbo-sized thanks to the brilliant and multi-talented Vijayendra Mohanty for inspiring this post.

This post is humbly dedicated to Shehla Masood (1973-2011).

How long is a game of snakes and ladders?

The detailed answer can be found here. You can stop right now or roll the dice and play on ...

The game originated in India more than 800 years ago and is still popular there. Other board games (Chaturangam or chess, and Pachisi or Ludo) also originated here, and a future post will explore the familiar stochastic OR topic of 'a gambler's ruin' in ancient India. Here's a picture of a native SNL board. The original Indian name of the SNL game translates to 'the path to salvation' and was supposed to be morally educational as well as enjoyable for kids. It's actually pretty profound.

From an Operations Research perspective, the game can be modeled as a Markov Chain. The moral lessons for kids in the Markovian property of SNL seems to be that no matter how terrible a path you took to get to a certain point, you retain the same possibility of salvation by consistently performing good deeds. Note the absence of any non-Markovian historical baggage of original sin. Similarly, a lifetime of virtuousness can be temporarily undone by a single big act of indiscretion that can literally bring you back to square one. As kids, we always felt nervous about the long snake lurking around the 99th square waiting to yank us down.

Per the paper (linked above) published in 1993, simulations indicated that the average number of moves for a 10X10 square game of SNL was around 39. More precisely, using the Markov Chain state transition equations, the authors (Althoen, King, and Schilling) calculate the exact expected number of moves in the "Milton Bradley version of Chutes and Ladders" to be equal to this impressive but twitter-unfriendly ratio:


or approximately 39.2. Other interesting results based on a sensitivity analysis of this value wrt adding or dropping snakes and/or ladders include:
- a six-sided die based game of SNL with no snakes or ladders lasts around 33 moves on average. A unit snake seemingly prolongs the game more than a ladder can speed it up. I felt that pretty acutely as a kid. Yet again, in true Steve Jobsian fashion (or maybe not), it turns out that the seemingly idle SNL time in Bangalore, India was a solid foundation for a career in O.R practice.

Wednesday, September 21, 2011

Anatomy of a scam

Once, even twice is a coincidence. But three is a pattern worthy of a second look.

Exhibit 1: The financial monoliths that rode the cash wave in an ocean of American tax payer money have pretty much gotten away scot-free. Their criminal greed has been marketed as a simple combination of market upredictability, non-robust math models, and corporate irresponsibility.

Exhibit 2: The response from the Pakistani government after the double-tap delivered to Osama Bin Laden a few hundred yards away from their West Point, was to admit 'gross incompetence' rather than confess to any willing participation in hiding a notorious fugitive. No punishment and continued pouring of billions of dollars down the drain. They happily take out a full page Ad in the Wall Street Journal on the 11th of this month to celebrate.

Exhibit 3: The government of India is for all practical purposes run by the Darth Vader-like Italian-born Sonia Maino, who has propped up an equally complicit 80+ year old mute puppet as the prime minister to take the heat. The current regime that has ruled India for 50 of its 60-odd post-independence years is neck deep in a series of corruption scandals and midnight arrests of peaceful anti-corruption activists. The most blatant of these is the so-called '2G scam', where billions of dollars (1.86 trillion ₹) worth of public money in form of lucrative bandwidth was all but given away to friends and family. Only the most junior ministers (belonging to the coalition-party :) are in jail. Their defense is rather innovative but inevitably based on this same theme - 'negligence' and 'uncertainty', rather than admitting to any criminal wrong-doing or fraud.

Case 3 is an interesting example. This tab has already touched upon it twice before and arguments outlined turned out to be in line with what the Harvard-affiliated anti-corruption lawyer Dr. Subramanyam Swami used in his own article to describe the reasons for the mess.

The defense in all three examples of colossal fraud essentially argue that they merely maintained the 'status quo', claiming ignorance of the true value of doing the (obviously) right thing. Their second line of attack is to plead down the severity of the charge all the way to a misdemeanor. In exhibit 3, this is being done along the following flimsy lines "since the true value of the resource can only be determined if an auction had actually taken place, the figures quoted are cooked up by vested interests". Two factors go against such an argument:

1. This figure was calculated by a government agency (!) - the Comptroller and Auditor General (CAG).

2. Unless the CAG has performed a rigorous math-based analysis and run simulations to determine the maximal revenue obtainable from selling a scarce resource in a gigantic market like India, the quoted figure that was based on an average or a reasonable auction scenario is more likely to be a lower bound on the true cost of the swindle.

How dependable are 'opportunity cost', and more generally such "if" based decision models? A paper that I co-authored as a student of civil engineering many years ago happens to be based on this idea and was used in highway resource planning in Virginia. Often times, business value of an analytics idea can only be viably demonstrated by calculating 'what would have happened to a set of past outcomes had this OR method been used instead". On the other hand, if a researcher were to build up a ladder that consists of several degrees of conditional dependence to arrive at a final value, then such chain-of-events driven claims have to closely scrutinized to ensure that we simply do not end up with 'noise'.

Sunday, September 18, 2011

Book Review: Five Point Someone

5.someone is not a book about Operations Research. It is the title of a wildly successful first book by Chetan Bhagat in 2004. He is now India's most successful English paperback novelist. While 5.s is primarily an entertaining waiting-room read, it is also an indictment of India's higher education system that has so successfully strangled creativity. Nowhere is this 'success' more visible than in India's elite Institutes of Technology (IIT) that have incredibly low acceptance rates. While the IITs have indeed become a great global brand name, they have also failed to meet India's genuine domestic engineering and technological R&D needs. Having scraped through this very system a couple of decades ago (only recently have the resultant PTSD symptoms gone away!), the urge to relive the past was non-existent until recently when the hilarious movie version came out and commented on here. Also found some O.R connections in the plot, which is the main reason for this post.

The story is about three brilliant high-school students "who never came second in their class ever" until they make it into the IIT's mechanical engineering program, where their Grade-Point Average (GPA) is in the 5.0/10.0 region that relegates them to the back-benches. The story is about the combined desperate attempts of these 'three idiots' to push that magic number up into a respectable region before they graduate, while at the same time, allocating a respectable amount of time for all-night weed-Vodka sessions with Pink-Floyd on the rooftop of their dormitory building. The resulting series of comic failures within this GPA mouse race (since "rats are smart") includes a near-suicidal jump from the 9th floor of a campus building. Furthermore, one of the three 'idiots' is dating the daughter of their arch nemesis, the Mech HoD (see 'translations' at the end of this post) who happens to teach Indem (i.e. Operations Research and Management Science).

The story took an interesting turn for me when i discovered that the villain of the piece is an OR guy. His first scene in the book has him lecturing about scarce resource scheduling and management in a garment-tailoring factory setting. The resultant mathematical analysis leads to 'optimization equations' which at least one of the idiots finds extremely interesting. Great. However, the smartest of the three idiots differs and criticizes this abstract reduction of human effort to equations saying that 'these are people we are talking about, not robots'. Nevertheless, this doesn't discourage them from optimally pooling their resources and partitioning their classroom attendance to reduce their effort by 66% while also maintaining their GPA. They also estimate the probability-weighted cost of traffic cop fines if they were in fact caught triple-riding on the motor scooter owned by one of them to be far less than the accrued benefits.

Response from Critics
The style of writing is informal and filled with campus-slang Indian English, which may affect a few sensibilities and result in some wincing for those who read 'seriously good' contemporary Indian authors like Aatish Taseer. The author (then a rank amateur and an engineering techie) typed the whole thing up himself using MS Word. The book is available at Amazon, where the critical ratings, mostly based on all too-serious analysis of the language style and delivery, are at odds with the unprecedented popularity of the substance of the book within India itself. One reason is that most Indian students are required to be reasonably fluent in 2-4 Indian languages (apart from English, C, C++, Java, ...) in addition to being exposed to dangerously high radiation levels of math and science. Consequently, they never get a chance or feel the need to master any one of these languages until it is too late (this tab is a classic example). The book can be enjoyable if you can cut through the almost blog-like writing style as well as some awkwardly written situations, and let your inner undergrad-rebel escape. The incidents described in the book are something that engineering students across India, both in and out of the IIT system, can readily identify with. The U.S education system with its growing and almost exclusive emphasis on standardized testing and credentialing (i.e the "degree"), along with a reduction in the overall number of quality tech jobs, may well be on its way there too.

Mech = Mechanical Engineering
HoD = Head of Department
Indem = Industrial Engineering and Management
Mugger = One who excels in the art of rote-learning
Ragging = Hazing
Arbit = Arbitrary
Parantha = An unleavened Indian flat bread.

Update 9/19/2011: usual typos and more OR ideas in the book.
Update 1/19/2013: Just watched Paper Chase (1973). Let's just say that the plot and some of the scenes described in Chetan Bhagat's book resemble those in the movie, with a low probability of this being a coincidence. (Interestingly, CB was displeased with the low-key credit given to his book in the Bollywood version.)

Saturday, September 10, 2011

In Memoriam: 10 years after O.R's Apollo-13 mission

Two years ago, this tab recognized the amazing but well-hidden contribution of OR practitioners to the post 9-11 recovery plans at US airlines. Does your company have the right people to handle such an emergency? It's almost ten years to the day since OR's finest hour, and after journeying through a couple of different industries since then, my personal conclusion is that it was not a coincidence that those US airlines that best overcame this nightmare also had on their payroll seasoned OR experts.
So it's worth revisiting this topic in a Q&A form to examine the impact that OR can have during such situations. It is not intended to be an 'OR hagiography' - the stock of 'traditional' OR has in fact shrunk drastically since 2001, while some different new avenues have opened up.

Q. Was OR really critical to the post 9-11 schedule recovery effort?
Absolutely. The key ingredient to finding a solution to this 'mother of all one-time decision problems' was the presence of a significant number of experienced personnel in the airline with strong OR skills. why? Large airlines that operate thousands of flights on a daily basis use expensive aircraft that burn precious jet fuel, and are crewed by tens of thousands of trained professionals both on and off the ground. All this requires a well-synchronized and manageable 'program of activities', i.e., schedule of tasks that airplanes and crews follow. Every tiny portion of schedule involves complex combinatorial and constrained decision making to make the best use of these scarce and expensive resources. Nobody in an airline understands and applies the fundamental science behind this (yes, there is one) better than OR folks.

Q. Other than the tragic occasion itself, what was so unique and challenging about this problem in particular?
The scope: mammoth - hundreds of planes and thousands of crews stranded across North America in all kinds of airports.

The stakes: huge - the quicker an airline could get back on its feet to resume normal operations, the less money it bled (the slide into eventual bankruptcy began around the same time period).

The technical challenge: very high, but this was just a part of the problem. Research to design a brand new decision feasibility and optimization model that would be actually operational, and manage a gazillion possibilities that satisfy several new constraints, work well within a brand new implementation that gets around tons of IT and engineering issues within a couple of days - then to be used for a few days before being discarded for ever. This was the perfect R&D storm.

The task: a two-stage problem that would first bring our planes and crews back to their domiciles and then get them back to a regular operational mode through the end of the month.

Q. I'm intrigued. What the heck is "OR" and what kind of OR skills were needed to weather this perfect storm?
I forgot. O.R stands for Operations Research. OR folks tend to be reclusive but do make some noise about duality once in a while. As a rule, we generally believe in hiding our best achievements as well as the real name for our field. Dissident, mediocre OR types believe that we ever so often give away our flagship award ('Edelman') to bungling senior non-OR execs (who then hire OR grads to fix the problems they created), and we all know how well they have run this planet the past decade. Sour grapes aside, OR folks deal with the science and engineering of better. For example, we try our best not to tell you how to run your business, but we can and will tell you how you could run it more efficiently and effectively. Visit for details.

To the second part of your question: Airline-OR research experience combined with solid implementation skills was the need of the hour. During the high-flying days, airline R&D intellectual property in certain business areas was 2-3 years ahead of the best output from any university. Consequently, the OR crisis team was able to quickly take apart existing business-critical decision systems (that OR guys built and managed in those days) and extract big chunks off the million line monolith of mess-C code for reuse within this one-time model.

Q. I'm now an OR expert, so i'm jumping gears. Computationally, what optimization technique was crucial?
Column generation (and emacs) turned out to be the most important weapon in the arsenal 10 years ago. We were already routinely solving mammoth monthly planning instances and steadily pushing the envelope through intense research into improved algorithms as well as million$ cutting edge parallel hardware that airlines could afford then, so multiplying the problem size by another trillion did not really matter in the end. Now there is flexibility for you.

Q. I never realized that OR, in the right hands, and for the right reasons, would be so powerful and practical. Should every company maintain an OR team?
Any medium to large sized company, and not just airlines, would be well advised to maintain and hold on to an experienced and skilled OR team having some implementation experience. OR is ultimately about practical problem solving rather than an consultancy exercise that leaves critical (and insightful) details as an exercise to the reader. Routine planning problems may well be handled using OR-commodity software packages, but the day that even a mild Apollo-13 like decision problem threatens to disrupt your company's fundamental operations and plans, nobody can help put Humpty-Dumpty together again (and in an optimal and efficient manner) like an OR team can; seen this happen just ten years ago.

Thanks, OR guy. I heard the analytics was the real flavor of this decade and ...
Sorry to cut you short, we are analytics too, for a long time now. We change our names like winamp skins.


9/13/11: fixed typos

Friday, August 19, 2011

On some decision optimization models for micro-credit planning and operations

Thanks to modern-day micro-credit pioneer Muhammad Yunus and e-commerce, it's become much easier to invest in and provide loans to entrepreneurs who do not have access to formal credit channels. Encouraged by Mr. Yunus' success story, India witnessed a bunch of new micro-credit banking start ups around 2006. Today, most of them have failed or barely survive, but is a different story. It operates like a non-profit organization that attracts 'social investors' and if success is measured in terms of the number of people helped, their track record has been amazing so far. There are lots of mischievous as well as helpful NGOs (non-governmental organizations) in India. RangDe is one of the good guys! Kiva is another well known, really good, global micro-credit organization.

Repayment rate
RangDe's payback statistics are pretty good. For example, if the rate of repayment R = 100%, and your (one-time) loan amount (support-level) is A, then the average number of people helped at this support-level after N-1 additional investments without injecting additional funds, is of course N and you still have your principal amount (plus any interest). If R is less than 1, then the upper bound on the eventual number of 'persons helped' at this support level before the money disappears would be = 1/(1-R). For example, if the average repayment rate is 90%, then you can expect to cover no more than 10 'A'-sized loans over the lifetime of this the initial amount. After 2 years, my experimental small initial sum of 55$ via paypal has funded rural Indian entrepreneurs five times and counting. Overall, RangDe's repayment rate cited on their website is 98.5% (Kiva's is 98.84%).

Pooled Disbursement Models
Loan payments have to be rounded to the nearest 100 ₹ (roughly 2$ increments). The investor has the option of selecting their loan recipient. If they did not avail of that option, then RangDe could best match the total pool of funds (S) accountable at the start of each day (say) to the uncovered demands for that day. They have to decide the integer amount X(i) in 100 ₹ increments that will be allocated to entrepreneur i. A simple knapsack version of this task would be to maximize total value of these decisions
Max Σ V(X(i))
Σ X(i) ≤ S

V(X) could be a metric that depends on, say, the time-critical need for the loan, whether the incremental X added fully completes the loan requirement and gets the recipient off the 'waiting list', 'credit score' of the recipient, etc. Additional constraints can be added to capture other business rules and governmental/accounting regulations governing the disbursement of funds, perhaps leading to a Mixed-Integer Programming (MIP) formulation that can be solved using CPLEX.

However, RangDe can still use such fund pooling optimization models as long as the investor is not allowed to specify an arbitrarily tight time-frame for the funds to be allocated; This appears to be the case, so some flexibility in the pooling of funds to cover more time-prioritized demands is possible. But can we strictly say that similar replacement monies will be available to cover original, intended recipients on time?

Risk-Managed Coverage Models
My 55$ can be used to safely satisfy today's general needs if it can be guaranteed that this amount will be unconditionally available toward covering my target loan D(i) in time. RangDe probably cannot guarantee this, but what if a service-level tolerance, say, 99% was allowable (i suspect this level may be bounded by the repayment rate)? RangDe could provide this option to the investor upfront who can then check a box to allow their money to be used to satisfy more time-critical needs with a small probability that it may not be possible to fund their chosen target on time.

The supply available to meet critical needs can be augmented toward additional at-risk allocations (Y) that can be made at some cost functional C(Y). For sufficiently large positive values of C, Y=0 is optimal and the model reduces to the risk-free disbursement formulation.

Max Σ V(X(i)) - Σ C(Y(j))
Σ D(X(i)) ≤ S + Σ Y(j)

The formulation should satisfy these requirements:
1. No funds at risk will be touched unnecessarily.
2. A fund at-risk will be allocated only if the value from using it exceeds the risk-cost.
3. RangDe has to calibrate this cost function to ensure that the desired service level is maintained over time, and could transparently display the service level profile over time to maintain investor confidence.

The focus of this post is not on building the perfect OR model; rather, one hopes this motivates research in the exciting area of micro-credit optimization and analytics. I'm sure there are many more aspects to micro-credit planning and operations that can be improved using OR approaches. Here is an opportunity to tangibly contribute to the community and maybe even make modest profits to go with it :)

Historical note
"Rang De" literally means "to color". Historically, it is popular as the words in the patriotic song immortalized by India's revered freedom fighter Bhagat Singh as he walked to the gallows in 1931.

Saturday, July 30, 2011

Error-reduction codes hidden in Vedic chants

The Vedas are timeless. Indic scholars do not ascribe a particular time to its origin because it has always "been there" - the Vedas are considered to be a constantly evolving treasure of knowledge, wisdom, science, math, and 'best practices' since time immemorial. Modern researchers' requirement to assign a measurable date to an event and slot it into their finite history model results in a 'start date' for some random snapshot of the Vedas that is anywhere between 2-5K years prior to the common era.

The Vedas are in Sanskrit
The Vedas contains a wealth of useful guidelines and ethical principles to ensure a productive lifestyle that is in harmony with the surroundings. Indeed, the word as well as the religion 'Hinduism' that is in use today is an inaccurate artifact created by colonizing Europeans (update, 2013: Term 'Hindu' is much more ancient). The correct term is Sanathana Dharma, a phrase that very roughly translates as "the right way for good guys (and gals) to lead their life". There is no equivalent for Dharma in English, and there is no word for organized religion in Sanskrit to the best of my knowledge (update, 2013: highly limited knowledge).

Fidelity of the Vedas
From an algorithmic and modeling perspective, an amazing fact is that the Vedas have been orally transmitted and remembered using Vedic Chants across millennia, and what we hear today is probably what was first chanted a very, very, long time ago. In a world that is so accustomed to written proof, this approach and it's implications can be a bit tough to grasp initially.

Vedic Chants and its Impact
Per Wikipedia "The UNESCO proclaimed the tradition of Vedic chant a Masterpiece of the Oral and Intangible Heritage of Humanity on November 7, 2003." The Vedas are to be listened to, and not primarily intended to be a reading text. The textual meaning of the Sanskrit words in the Vedas are but a part of its benefit, but the way it is combined together and chanted in a specific manner is equally if not more important. It is this entire structure that has been remarkably preserved across eras and onslaughts including Greek, Mongolian, Central Asian, and European invaders. There were no "heretic" writings for them to easily destroy and thanks to these error codes, India has been able to maintain an unbroken chain of culture and thus retain the most ancient of its connections to its roots.

Here's a digital sample of a Vedic chant. Some may notice a passing resemblance to mystic pagan, Gaelic, or Gregorian Latin chants.

Combinatorial Aspects
I'm betting those wise old Rishis (Indian sages) would have dug OR for its practicality, so they sure would have loved to chant:
"operations research is the science and engineering of better". We'll drop the 'engineering' part for the sake of brevity.

The meaning and pronunciation of each Sanskrit word stands unambiguously by itself and is written just like it sounds (unlike, say English). Sanskrit is characterized by precision. It requires no punctuation marks. Orally, this is accomplished by chanting each word in the passage distinctly, one after the other (a "/" indicates a brief pause).

Code 1: operations/research/is/the/science/of/better

However, even a few hundred cycles is likely to induce errors (In English: where's the comma, pause, syllable emphasis, etc), so they must have decided to add a second layer that chains a word to the next one:

Code 2: operations research/research is/is the/the science/science of/of better

Each word is now spoken twice in order, thereby introducing some redundancy, while also explaining how pairs of words are to be spoken in combination. Remember, the pitch and tone for a particular word may depend on one or more words that precede or follow any given word. Sure enough, another Rishi decided to take this a step further by chaining three words to introduce further redundancy and handle cases where the ebbs and flows in the chants persist a bit longer.

Yet another Rishi put words together in an apparently strange way to create:

Braid Code: operations research research operations operations research/research is is research research is /is the the is is the/the science science the the science/science of of science science of/of better better of of better

The number of times a particular word is repeated is 6. The phrases seem to be chanted in a forward/reverse/ forward format. However, this is a moot point in Sanskrit because switching the order of the words in a sentence does not change the meaning! In other words, each section in Code 2 is chanted thrice in different orders, but they reinforce the exact same meaning each time. Indeed, this also ensures that Rishis who loved to swap words around could now do so (even inadvertently) without fear of messing up the cadence. Pretty cool. Those who've followed this tab for a while would have noticed that the words in this tab are often hopelessly out of order. It's not easy to downgrade to English and write acceptable OR journal papers :)

There are a few more codes, but the pièce de résistance is this particular "bell code" which also happens to be the most pleasing to the ear:

Bell code: operations research research operations operations research is is research operations operations research is/
research is is research research is the the is research research is the/
is the the is is the science science the is is the science/
the science science the the science of of science the the science of/... and so on

The number of times a particular word is spoken in this code is 10, i.e. a factor of 10 redundancy. A sample of the Bell-code chant (the example chosen is the "peace mantra") can be found toward the bottom of this site. Search for the phrase "Ghana Paatha". The first 2.5 minutes is the same mantra that was chanted in the first example, but is now done by a different individual. Comparing the variations in the two performances gives us an idea of the challenge faced by those Rishis eons ago.

If the order of words doesn't matter, there's going to be an exponential number of ways of speaking a sentence correctly without altering the meaning (although some choices are more preferable and popular). But this also serves as a natural fault-tolerant mechanism. Each ordering is an alternative feasible solution that results in at least the correct decoding of the meaning of the passage; yet another reason why the treasures of the Vedas would have been lost to the world without the magical medium of Sanskrit. The Rishis appear to have managed these permutations by carefully designing the chants in sync with phrases to make it easier to orally encode and decode a massive amount of information across generations without a single written note. Indeed, they may have thought of this as an infinite horizon problem given the cyclical time concept of Indian philosophy. They appear to have converged on a limited number of simple encoding rules that appear to be quite effective as well as practically robust. It is estimated that it would take a month or two of continuous chanting to cover the entire bell-code version of the Vedas!

Present Day
We know that there's an inherent connection between linguistics and computer science. If Jeopardy were to be played in Sanskrit, would IBM's Watson have been more accurate and less dependent on guesswork?

These ideas also seem to be used in reinforcing learning among kids. Songs and rhymes with repetitive themes appear to be relatively easier to remember over long time periods. The holiday song in the previous post is a good example. If we ever wondered what drives otherwise fun-loving Indian kids to memorize entire lexicons to collect endless Spelling Bee trophies, and why there is so much emphasis on rote learning in India, now we know its because of Shruti and her dual, Smriti.

It would be nice if our business customers were to present all their business rules, priorities, and goals in bell-code Sanskrit verse on a CD rather than Word text and Power-point charts to avoid the endless headaches arising from the inevitable losses in translation!

Credits and Disclaimer
The sources for this post include numerous websites and individuals. It's tough to list each one, so a big thanks to all who are working hard to preserve this part of the world heritage in its original form and context. Being neither a Sanskrit or an English (or pretty much any language) expert, errors, inadequacies, and misinterpretations, if any, in this post are of course, entirely mine.

update: some typos fixed.
2nd update: January 2013

Wednesday, July 27, 2011

A tough MIP for the winter

The 7.33 Days of C.Xmas

On the first day of C.Xmas,
mipsolver sent to me
A node-zero bound and a tiny branch-and-bound tree.

On the second day of C.Xmas,
mipsolver sent to me
two depth-first searches that dove ...,
and came back empty to a growing branch-and-bound tree.

On the third day of C.Xmas,
mipsolver sent to me
3 French-invented cuts,
2 'doves',
and a tighter bound, trimming the branch and bound tree.

On the fourth day of C.Xmas,
mipsolver sent to me
4 call-back birds,
3 French cuts,
2 doves,
and a still-infeasible branch-and-bound tree.

On the fifth day of C.Xmas,
mipsolver sent to me
5 hi-flying rounding heuristics,
4 call-back birds,
3 French cuts,
2 doves,
all tossed into a gigantic branch-and-bound tree.

On the sixth day of C.Xmas,
mipsolver sent to me
6 more goose eggs,
5 rounding heuristics,
4 call-back birds,
3 French cuts,
2 doves,
and still no incumbent in that branch-and-bound tree.

On the seventh day of C.Xmas,
mipsolver sent to me
7 types of warnings,
6 more goose eggs,
5 rounding heuristics,
4 call-back birds,
3 French cuts,
2 doves,
all mocking me from the overflowing branch-and-bound tree.

8am after my last day of C.Xmas,
manager said to me,
(this song is exponentially grating)
7 days of computing,
6 paying customers a-waiting,
5 days of test-match cricket watching,
4 P-series server-snatching,
3 more tough MIPs in your job batching,
2 minutes for u to clear your desk of every bird-dropping,
and climb that branch-and-bound tree.

Disclaimer: Purely in jest, to state the obvious. Claims of any statistically significant correlation to any real MIP solvers, real managers, real holidays, and virtually anything else that is real is just as unimaginative as this song.

Saturday, July 16, 2011

Where are the new OR innovations coming from?

The telephone appears to be increasingly rejected in favor of returning to Morse-code like telegraphic tweets and talk-free 'radio' text messages. Is OR going through a similar cycle? In a prior tab, empirical evidence was provided to suggest that the so-called age of analytics did not really start at Y2K; like the telegraph and the radio, its always been there and only gone digital now.

Our airline customers who pioneered the construction of scalable OR-embedded infrastructure to manage complex revenue and cost issues have seemingly run out of similar low-hanging fruit. Attending a recent INFORMS conference felt like reading a classifieds ad: "thoroughly impressive solutions seek unanswered practical questions to justify time and expense". This does not mean that we are not innovating - far from it; it's just not from within the OR community, where we continue to indulge in our dual laundering cycle of model building and tool polishing. Occasionally we hunt for non-existent (or worse, gullible and real) customers to pay us good money to take the resultant code-scrap off our hands. After all, OR is merely the science of 'better'. It is our customer who did all the hard work of taking it all the way from 'nothing' to 'good'.

The world of retail is one example of a margin-starved, data-rich industry that is driven by a realistic necessity to innovate. Our retail customer has been gratefully but carefully adapting practically useful resource optimization techniques from the airline world. By carefully refining these methods in the demanding retail context, they are generating a bunch of new analytical 'best practices'. Combined with some good old sales techniques, these approaches appear to be on the verge of reigniting similar innovations in other industrial sectors.

On a related note, OR resembles the fast-drawing gunman of the wild west legend whose niche skills are desperately sought after and bid for by a town threatened by outlaws, but one who also becomes a liability for the town once peace has been established. We should avoid outliving our welcome and be more proactive in seeking new 'towns'.

Tuesday, July 5, 2011

Goodbye, Maine.

There's much forest in Maine, more than any other state in the US. If you randomly parachute into ME, the odds of landing in mother earth's lap is about 1:8, which improves marginally in the winter. People love the outdoor life. This is the land of Acadia, a place for camps and hikes, and is so full of summer life. Yet ME is also the resting place for fallen heroes who have always arrived quietly from distant lands - Korea, Vietnam and Iraq. On rare and lucky occasions at Bangor Airport, you get to see the tired but elated troops walk safely home after a tough tour of duty in another one of those quagmires. And there are other days, when the same faces wait to cross the pond on the orders issued by generals and politicians. Bangor, is an important logistical point for such overseas operations. You can't get a decent runway more north-east in the mainland US.

Nothing sensational happens in Maine, and yet the first events at dawn on 9/11 unfolded at the Portland airport, and at dusk, a small airport-town in Newfoundland, just north of Maine improvised splendidly as they hosted several international flights that were ordered to land as soon as they touched North America. We border just one US state, and I cheered in vain for the 2010 Canadian Edelman team who turned out to be from the 'hood (New Brunswick).

Maine is a strong blue state which elected not one but two red senators, grumbled about it, and then decided to elect a red governor and grumbled even more. ME actually voted against FDR in 1936. Tough times makes for tough decisions. ME always scrapes the bottom in terms of the other green - business-friendliness, and most Mainers struggle through these tough times, as people seem to put up more and more of their possessions 'for sale'. Yet it's hard to get angry at people here after a bad day at the office - even the cable TV operators here are so damn friendly and helpful. The post-woman delivers mail flawlessly in her USPS car on her route, and come winter, very politely asks me to do something about her postbox, even though she knows the answer. Her nemesis is the ubiquitous snow-plow truck whose demon drivers perfectly take out every postbox on their TSP routes for fun. Its a game where Stephen King's creatures rule the streets of Bangor at night, and blissful peace returns at daybreak. There is this duality about Maine that will be missed.

Thursday, June 30, 2011

Worst practices: gender and literacy shaping

Demand shaping is an OR-rich technique of using control levers (e.g. pricing in a supply chain) to induce a demand shift away from scarce items, or toward products in excess supply. If you know the shadow price of constrained inventory (e.g. an airline seat in a particular market), you can use that information to make useful quantitative control decisions that maximize utilization and drive profits.

We look at a couple of related examples of demand shaping done for all the wrong reasons. While I present a couple of egregious examples from India, this is most certainly not a unique Indic problem. Hopefully such examples with motivate a deeper OR study into this worldwide and systemic human rights violation pattern in the human biological chain.

Example 1: gender shaping
Urban India is in the midst of a pop-culture-driven left-liberal movement for men and women that addresses a lot of new-age social issues (e.g. right to fashion, right to a fairer skin tone), while almost completely neglecting the more fundamental 'right to life' issues. This isn't a pro-life v pro-choice discussion. While there's valid arguments on both sides in that debate, this is about the nihilistic practice of gender shaping practiced in several parts of the world, including, but not limited to the English-speaking literati of India, where the sex of the unborn can seal its fate. Since a positive identification is possible only after many months into pregnancy, the term used in India for such gender shaping is close to the truth: female infanticide, i.e. murdering baby girls (for monetary reasons). The gender shapers have succeeded in reducing 'demand for excess inventory' by more than 160 million in just the last few years. In fact, as researchers have argued, since the viability of the female fetus is probabilistically higher, that 'lost-sales number' is merely a lower bound. India has banned sex-determination tests to reverse the tide, but the practice is still rampant.

One of Russel Ackoff's OR consulting projects in India was focused on the primal problem of over-population which motivates the second example. A section in his classic book examines why demand-shaping there (in the benign form of family planning) was relatively unsuccessful. Filtering through all the noise, it now appears that gender-shaping is merely it's evil dual twin.

Example 2: literacy shaping
President Obama has more than once pleaded for US students to shift toward increased enrollment in science and math "STEM" courses to be able to effectively compete with their Indian and Chinese counterparts (A quick look at the recent results of the national spelling-bee and national geographic math/geography contests will tell u why). This was widely published, but where? A Google search shows that this was carried prominently by many newspapers in Asia (a classic example of a complementary "cross elasticity" effect in pricing analytics). Not that students in those countries need any more motivation. The utter bankruptcy in governance exhibited by the dynastic ruling political party (which ironically calls itself "the congress party") that has ruled India for more than 75% of the time post-independence, has driven the country to near ruin. The number of good-quality higher and primary education institutions has been practically stagnant for decades. Rather than do the actual work of building more schools and colleges with all that money pouring in from the liberalized economy, the education ministry has adopted an easier, but dastardly three-pronged literacy shaping approach:
a. decrease the rate of construction of primary and high schools to control the resultant demand for higher education

b. Increase the score threshold for acceptance into the existing higher education institutions.

c. Misuse the well-intentioned affirmative action 'quotas' to drive demand toward politically influential families.

(a) effectively ensures that the poorest and marginalized communities of India cannot really avail of the illusory affirmative action-fenced inventory segments. The literacy shapers further shift demand toward those who wield power via (b) and (c). For the last 3-4 years, we have had farcical situations where, in some parts of India, the score threshold (i.e. the shadow price for a scarce university seat) was adjusted to 100%! In US terms, it means that only those who have a picture perfect SAT score are eligible to apply to a reasonably good college.

That brilliant lady professor from India who shined at your local OR conference is one of the lucky ones who managed to evade the dual traps of gender and literacy shaping.

Friday, June 24, 2011

Duality in music

Regular dual noise service is interrupted to provide some dual music for a change. Check out this piece by the music band 'Tirtha', aptly titled 'Duality', I kid you not:

Being a bit biased toward instruments with strings attached, fast-forward to about 5:50 and listen. That is India's most versatile guitarist, Prasanna, playing. To listeners from India (South India, specifically), Carnatic music's distinctive 'microtone' notes ring through pretty clearly, whereas Jazz aficionados will enjoy the 'swing' i suppose. He seems to be playing both of these, but then, it is not what radio-stations label either as 'fusion' (not quite imaginative), or 'new age' (out of ignorance?). Both Carnatic and Jazz music are original, classical art forms with a rich history and a dedicated fan-base. The former is native to S.India and the latter, apparently is America's only true native art-form. Equally, he is playing neither - purists are more likely come to this conclusion after detecting some 'noise' intertwined with the specific form of music they swear by.

From an OR perspective, i would like to think that the two seemingly unrelated musical forms in this piece achieve the same objective, follow certain well-defined rules and satisfy certain well-understood and pleasing properties, and depending on how you 'look' at it, you can call it Carnatic or Jazz. Duality - there is more than one way of getting things done. Whether you are a South-Indian who is just beginning to discover Jazz, or a Jazz-fan who is touched by the Kalyani Ragam for the first time, it's a great two-for-one deal.

And there is one more connection that might explain the name of the musical piece. One ancestor of Prasanna was the mathematical genius Srinivasa Ramanujan, the 'man who knew infinity'. Signing off with an old solo piece by Prasanna, composed when he was an undergrad at the Indian Institute of Technology, Madras:

Wednesday, June 8, 2011

Open-source Java tools for OR/BA projects

A list of open-source Java-based resources for optimization and business analytics is presented in this post. There's surprisingly few of these going around. COIN-OR has relatively few or no pure Java tools, but if you have heard of any, please post.

Object-oriented, algorithmic, decision-driven analysis of a complex business system having many moving parts, is an effective approach, and Java, a solid delivery option. Add to this the inherent cross-platform interoperability and performance improvements in recent times, and Java becomes a pareto-optimal deployment choice in many project instances.

This is by no means an exhaustive list, and yes, it's missing all those heavy-duty MINLP and SOCP Cadillac solvers, but there's always some solid mileage in your old Chevy if you're a good driver (whatever that means)

LP solver: "Gooplex": Google's 'toy' implementation that is well-written and thus extensible (Apache commons) to a more sophisticated DIY simplex implementation. An old post on this topic here. Great for solving one or a million tiny LPs.

NLP: L-BFGS for large-scale unconstrained nonlinear optimization. This is a translation of the original FORTRAN version written by Jorge Nocedal. The license for this tool appears to be fairly relaxed (LGPL), but do your own checks. The bound-constrained L-BGFS-B is not 'free' for commercial use.

Trivia: Which is supposedly the fastest pure Java LP solver 'in the world'?

Predictive analytics
Colt: high-performance tools for linear algebra, matrix methods, OLS, etc.
Jama: A light-weight package for linear algebra
OpenForecast: Moving averages, multivariate regression, etc.
Wikipedia's meta list of numerical Java tools
Libsvm: SVM for classification, etc.

Sunday, May 29, 2011

April 1963 discussion: OR v Analytics

While digging through old scientific articles trying to ascertain when O.R gained a solid foothold in India, I stumbled on this gem from 48 years ago. The pdf version of this article that was published by "the defense documentation center for scientific and technical information, Cameron Station, Alexandria, Virginia" can be found here. The article is interesting for a variety of reasons. Among other things, it critically appraises the work of Russel Ackoff and Mahalanobis (about whom this non-blog will comment on in a later post) on using OR models for solving nation-planning problems in developing countries.

Words from the original document are in italics and any emphasis below in 'bold' is mine.

Ackoff asserted that a large role was both feasible and desirable. He predicted that extremely high returns would result from addressing national planning problems with operations research techniques in these countries. In striking contrast to this viewpoint, the ORSA president (Dr. Charles Hitch) at that time, stressed the risks of over-selling what OR has to offer the underdeveloped countries at the level of national planning, and apparently wanted OR'ers to focus on tactical and commercial applications at the project and industry level since 'OR is the art of suboptimizing' On the other hand, the document also notes that since then, Hitch has been a pioneer in adapting OR to national defense problems in the United States as Assistant Secretary of Defense (Comptroller). Hmm.

Other dissenters similarly commented on the characteristics of problems to which operations research can be most successfully applied, e.g., abundant and reliable data, a well-structured model, and a clear, reduceable objective function. [Dorfman] concluded that the conditions that are most propitious for the use of operations research tend to occur in "routine and technical problems" at lower and middling levels. The document includes examples that apparently point out the hazards of applying operations research techniques too quickly and broadly.

Salient features of this document include:
1. An open and frank conversational style of writing
2. The presence of constructive and sharp dissent without disparaging the worth or the author of that prior contribution
3. A strong emphasis on the practical method, which really distinguishes OR from other disciplines

- all three of which is mostly missing in the 'sterile' articles that we see published in recent times. When we arrive at the last paragraph that precedes the main body of work in this document (which consists of copious amounts of what we would almost certainly classify as 'analytics' today, e.g., causal statistical regression models, etc), we read this:

First, I am not particularly concerned with whether it might be more appropriate to apply the labels "econometrics," or "systems analysis," rather than "operations research," to one or both of these examples. Methodological purists may find it preferable to fit the examples into one or the other of these categories, but for my purpose, what we are concerned with is the application of quantitative analytical techniques to decision problems in the underdeveloped countries. From this standpoint, econometrics applied to practical, policy problems Is operations research.

At least for me, this argument amicably settles the non-debate on the dual noises emanating from our tribe on 'OR v analytics', but of course, it is wishful thinking that this practical discussion will fix the larger contemporary issue that is centered on 'brand labeling' and 'image perception' than anything that practically and directly affects our customer base, which should have been our #1 priority. An inclusive dual identity for a person is fast becoming the norm in this non-homogenous, 'globalized' planet, and to paraphrase Shakespeare, OR by any another name would add the same value to our customers.

The author of this informative 1963 document is Dr. Charles Wolf, Jr. A brief bio of this distinguished gentleman, and another one can be found here. By sheer coincidence, the May Informs blog challenge appears to be about 'OR and Analytics'.

Friday, May 27, 2011

Memorable battle scenes on film - spot the O.R connection

This post consists of 'tragic battle' scenes from old classics. The aim is to find a popular O.R theme (however flimsy) in the clip, as we honor those who fought for a free world and laid down their lives so that we could continue to build O.R models, unconstrained. There are certainly many other war-movies with stronger military and civilian-OR themes, but these are five sentimental favorites. Enjoy the memorial day weekend as you think of other OR-themed movies!

5. Dirty Dozen (1967) - Jim Brown's incomplete TSP. What a ripper. A politically incorrect movie by today's "lofty" standards.

4. Von Ryan's Express (1965) Sinatra finds a feasible train route through a hostile network, but doesn't make it.

3. Haqeeqat (1964) - Final scene. The Indian army fails to build a robust supply-chain that is required for high-altitude warfare against the Maoists. Still remains the greatest Indian war movie ever made. An incredibly touching rendition of 'Kar Chale' by the great Mohammad Rafi that tears listeners up every time.

2. Enigma (2001) . Alan Turing, tragic genius, WW2 Bletchley park. OK, this is a relatively new movie, but the point is that critical 'analytics' battles have been fought by humble OR-types who saved countless lives and certainly did not nit-pick over whether it was 'OR' or 'analytics'.

1. Sholay (1975) Amitabh's fatal attempt to sever the critical link in the network and save lives succeeds. Favorite all-time movie scene in the most popular Indian movie ever made to date.

Monday, May 16, 2011

Prescriptive Analytics: Optimize the history you are about to create

An important part of business analytics / OR practice is to assess the impact of a prediction-driven decision-support (PDS) system on:

a) the end-users aka consumers
Data-driven analytical prescriptions are based on perturbing a predictive model, which in turn is (usually) based on the observed collective consumer response to the same or similar products offered in the past. If the PDS recommends a clearly obvious pattern of decisions that differ from the past, it can change the behavior of even non-savvy customers, the cascading effects of which can be disruptive to the product provider's business. With all these mobile apps, the population of non-savvy customers is shrinking every day. Therefore, being proactive in designing the PDS to account for this feedback can be important.

b) the PDS
Some times, the decisions that the PDS recommends today (based on yesterday's history), become part of tomorrow's history, which in turn drives the predictor. However, we can be proactive in designing a PDS that is more likely to 'make' a friendlier history down the line. Furthermore, the 'inventory' of history you need to stock up on to calibrate your predictor can also be minimized. History, like gasoline, is often a scarce resource in OR practice.

Monday, May 9, 2011

Will the associates of #OBL move east?

Will the discovery of OBL in a posh hill-station in Pakistan increase the probability of finding the rest of his crew members? If so, will it be optimal for the other members to relocate from their current hideout to a safer place to minimize their chances of discovery but at the risk of briefly surfacing? What is the conditional probability that they are also in eastern Pakistan, given that OBL was found in that area? Does it increase or decrease? If Al-Q initially wanted to avoid an 'all eggs in one basket' situation, then they must have spread out and chosen to hide in places comfortably far away from each other. But on the other hand, if it was 'every rat is on his own', then they may have in fact have ended up, either independently or in collusion, gravitating toward the same geographical area.

It's my 2 cents worth of arguments based on just a flimsy data point that the latter case seems likely now. That they are hiding in places where they least expect the US to pop-in unannounced (Pakistan was never a concern, it now seems). A few hundred yards from a major military academy so far away from Afghanistan must have seemed pretty safe and for just for this reason alone, the raid achieved ultimate surprise. If this is the case, then there's probably a couple more in the vicinity. Pakistan-occupied Kashmir seems like another safe haven. It's further east, and most of the terrorist camps, whose recruits so predictably hit India after the winter snow melts to open up the passes, are located there. Furthermore, the element of surprise is gone for most part. Pakistan is unlikely to welcome further chopper incursions unless they are extremely well compensated.

So an Al-Q card-carrying member who lives west of Abottabad may feel the need to head further east. If he gets into India (which is just 60 miles away from Abbotabad!) through the porous border, there's 1.2 billion people to blend into.

Saturday, May 7, 2011

Informs Northeast Conference 2011 - summary

The INFORMS Northeast conference 2011 was held at U Mass, Amherst yesterday and today. There were many wonderful presentations from students and practitioners alike and it did not feel too crowded either. There was good representation from the OR-strong schools in the Northeast. In terms of industry participation, bigwigs IBM, Oracle (Retail), GE, the US Military, and a host of other interesting LLCs participated, giving students some valuable exposure to some real world business analytics and OR.

My personal favorites:

- The poster session: One example: Using OR to optimally do vehicle routing and sequencing to expedite power restoration after a major power loss in the network at multiple locations. Pretty innovative. and cool.

- Dr. Simchi Levi's plenary on the effectiveness of long chains in improving flexibility - Innovation in supply chain optimization continues...

- Optimal use of Airspace/runway capacity at congested airports - There were three talks featuring young OR/MS faculty members - I suspect we will hear a lot more about them in the coming years

- Agent based simulation by GE corporate research to optimally and practically manage a terribly complex 10-year project involving river dredging. There is something intuitively appealing about ABS and its utterly object-oriented approach to modeling and I do hope I get to use that idea somewhere.

Other comments:
- students (and some others as well) need to be doing more scientific-graphical presentations. More pictures - not the clip-art clutter / video junk type, but a well-thought out, informative visualization of scientific results that makes your approach more transparent to an audience that doesn't necessarily share your background. Mind numbing equation after equation is a kill-joy. Why tell when you can show?

- Of course, great work by the main chair, Dr. Hari J. of U Mass (that's thirty letters in a name, like his blogspot address). The 'health care and OR' sessions were quite packed (there were even some bonafide MD's presenting analytical stuff!) and hopefully this means that a lot more innovative work is on the way in this important area.

Nice work, and hope for an encore next year.

p.s: GPS sub-optimal response reconfirmed
This issue was mentioned on this non-blog a few months before, and now we have reconfirmation that for a brief time period, the GPS routing algorithm (Garmin, my favorite brand) does not always recalculate the optimal path if you deviate from the chosen route. The best route from Elmsford, NY to Amherst MA per google maps appears to be any one of three arc-disjoint pareto-optimal paths (practically). For a period of 10 minutes when I was off the GPS-recommended route in pursuit of an alternative optimal route (that was less congested historically), I was asked to get back to plan and every rejected exit increased my ETA. Finally, the GPS thingy gave up and recalculated from scratch and suddenly the ETA plunged to a value pretty close to the original value, as expected.

This strategy to 'get back to plan' is not uncommon and I don't consider this a defect (not yet). At large airlines, real-time aircraft routers and dispatchers (aka irregular ops) typically utilize such a scheme to minimize any unintended cascading disruptions introduced by wholesale re-optimization.

Tuesday, April 19, 2011

Is your query being queried?

In the world of OR practice, it is almost always the case that a solution approach for a particular analytical problem in one industry is mathematically equivalent to something that was solved in a completely different context three decades ago. To avoid re-inventing the wheel, a first and basic step to rule this in or out would be to Google this (for example). However, there are some unexpected occasions when a query that combines some known well-documented methods and ideas returns very few hits. These situations are interesting because it hints at a probability that the attempted method/approach could be something that is practically new or rare. Perhaps you are asking an original and specific question in a context that doesn't have a well-documented answer. Could this potentially be an 'intellectual property' ? I get excited, but ...

On the other hand, these are exactly the queries that could also interest a search engine provider - technical words/phrases that on their own generate a shipload of responses, but in combination, yield no more than a handful. I get a bit worried. Will my query be flagged and put in a 'queue in a cloud' somewhere, waiting to be inspected by man and machine? Am I going overboard here? Two decades of explaining OR to non-OR people can do that. But didn't that Watson thingy provide the right questions to answers in a jiffy without even using the Internet? Maybe I'm just better off reinventing the wheel and doing 'original research'. Let the lawyers sort it out.

After three weeks of blissful soaking in the joy of India's successful world cup campaign that culminated in the biggest victory party in the world atop cloud nine, it's back to square one.

Sunday, March 20, 2011

Impact of a decision-support system on user behavior

An interesting aspect of OR and business analytics practice is the chance to observe the impact of a decision support system (DSS) on paying customers who deploy it in real-life situations to improve business decisions. A good DSS tends to make users more innovative - they feel more secure that they have 'science' and automation working for them, and try to maximize benefits and progress toward becoming super-users.

A robust DSS should anticipate and react practically to a wide range of user inputs. In particular, if the DSS is based on a historical data-driven analytical model, inputs that are within the historical range would produce outputs roughly based on 'interpolations' - more reliable. On the other hand, an input not seen in prior history is (again, roughly speaking) equivalent to operating a heavy machine outside the functional limits, and may produce extrapolations that may make analytical sense, but little business sense. The latter input scenario makes the DSS look 'silly' because the business-savvy user would have provided a far better manual answer for such new inputs, because unlike the DSS that is just a computer program, they 'know'.

A bad DSS is one that is not trusted by users. There's more to it that good analytics. They may turn off the analytical portions and just use the automating features. Usually, it is equally if not more important to get the non-analytical portions right to ensure that customers are always getting some value for money. Then there's my pet 'timely failure theory' gleaned the hard way. If "infeasible" is an utterly unavoidable option in a DSS output for what ever reason, then make sure that such failures are immediate. Nothing irritates twitter-era humans more than having to wait 20 minutes to find out that nothing is going to happen.

In the ongoing cricket world-cup, the new umpire decision review system (UDRS) that predicts 'outs' using the Hawk-eye ball-tracking technology is turning out to be an example of a good DSS. The machine-free accuracy rate of umpires was 92%, which jumps to 98% with DSS-assisted decisions. If the DSS determines that the predicted result is marginal, the original decision of the umpire stands.

In the past, if the users (umpires) had a slight doubt, they would err on the side of caution. In cricket, that would amount to a 'not-out' ruling in favor of the batsman. That behavior appears to have changed a bit. Now, the decisions are slightly more 'aggressive' i.e., in favor of what the DSS decision is likely to be if a TV-referral were to be made. If the DSS is a more accurate predictor (which it appears to be), then this is likely to be a positive change in user-behavior, on average. A big chunk of the 6% error reduction may well have come from a reduction in the number of 'false not-outs'. Given that cricket today is skewed in the favor of batsmen, this is a welcome development.

On the other hand, cricket administrators have to work hard on defining a simple and effective operating range for the DSS. Otherwise, detractors (batsmen!) will use this as an excuse to get the DSS banished. Similarly, the time taken by the incumbent process to process and return a response to an input review is far too long . A batsman who is going to ruled 'out' by a damn machine after playing an idiotic shot doesn't want the global television audience to see replay after embarrassing replay of his moment of madness.

Sunday, March 13, 2011

Analytics and Cricket - VI: Improving the Cricket World-cup Schedule

The second-most watched single-sport tournament on the planet (after the soccer world cup) is underway in the subcontinent. The ICC cricket world cup is held once every 4 years and the latest edition is being jointly hosted by India, Sri Lanka, and Bangladesh. The original list also included Pakistan, but a militant attack on the Sri-Lankan national cricket team's bus there two years ago led to safety concerns. Fueled by the Indian economy, cricket is now a multi-billion dollar sport and the creation of a professional city-franchise based sports league in India three years ago is the latest example of this phenomenon. For some reason, cricket has always had an intimate connection with Operations Research (search for cricket in the archive). Today's post presents an overview of the potential benefits of using OR-enhanced scheduling in cricket.

The 2011 edition stretches over six weeks, and considering the fact that the number of teams who qualified for the finals is only 14, the duration of the tournament appears to be far too long, and has come in for considerable criticism. For example, the FIFA soccer world cup with more than twice the number of teams in the fray was completed in one month. There have been quite a few instances where the idling time for teams between any two games has been close to one week. Given this (and to minimize the number of boring mismatches), a recommendation for the 2015 world cup has been to reduce the number of teams from 14 to 10.

The 2011 WC format
The 2011 tournament schedule can be found here. We provide a quick overview of the format and 'constraints' here. The teams have been divided into two groups of 7. Each team in a group plays every other team in its own group. The top 4 teams from each group after the league phase enter the knock-out stage that includes 4 quarter-finals, 2 semis, and a grand finale. Each game lasts up to 8 hours, and is typically played in a day-night format, starting around 2pm and ending by 10pm. The majority of the revenue comes from TV ads. The media rights for this edition were sold for $2 Billion to ESPN. League matches involving host nations (and India in particular) garner huge ratings. Marquee match-ups are typically scheduled during weekends. Given the magnitude of revenue at stake, one can guess that even a 1% improvement to the schedule in terms of increased viewership and reduced player fatigue (thereby resulting in better match quality for fans) can add a lot of value.

A sample of hard and soft constraints
A minimum gap of two days between successive matches is a must to ensure adequate time for rest and travel. Minimizing total travel cost and idle time appears to be desirable. Host nations play their games on home turf to the extent possible. Day games (9am start) are also possible, if the morning fog/mist/dew factors are not overwhelming. On the other hand, locations with consistent dew problems at night are better suited for day matches to ensure a more fair contest. Reserve days are required for games that are washed-out due to inclement weather during the knock-out stage of the tournament, and any feasible schedule must take that into account. Of course, the OR-driven Duckworth-Lewis rules for rain-interrupted games are used to maximize the chances of getting a fair contest under the circumstances.

The Big Picture
The global cricket season is itself pretty packed and the overall schedule has come in for much criticism due to player burn out as well as overselling the game. As a cricket fan, it is pretty obvious that the status-quo is so dismal that better scheduling can maximize long-term revenue while also minimizing burn-out. Overselling is a major issue, not only because it kills the golden goose (long-term fan involvement in the game), but the number of inconsequential games being played has lead to match-fixing and spot-fixing (similar to point-shaving in basketball). The money involved in illegal betting is mind-blowing, and many suspect that it already is or could become another source of income for terrorist groups operating in the Af-Pak area. One does not expect OR to help resolve all these issues, but it can certainly be applied to some of the key ones it is well suited for, and the cascading positive effects can make a difference to the overall situation.

Potential OR Approaches
The Traveling Tournament Problem (TTP) popularized by Dr. Mike Trick at CMU appears to be a good starting point for improving the schedule for future cricket tournaments, including the world cup and the IPL, and ultimately, the global cricket season itself. Exact or heuristic approaches that combine constraint programming with MIP models appear to be well-suited to such problems given the complex and 'idiosyncratic nature' of some the scheduling constraints that tend to be imposed in cricket, as well the value added by even small scheduling improvements.

Looking Forward: The 2015 World Cup and Beyond
The next edition with be the "Anzac" world cup, hosted jointly by Australia and New Zealand. The distances between some cities that are likely to host some of the matches can be enormous (e.g. Perth and Sydney are more than 4000 miles apart) or relatively tiny (e.g. intra-NZ games, or Australian east-coast games (Melbourne-Adelaide is less than 800 miles), and the value that can be obtained by adopting optimized schedules can be significant.

I hope this post presents a reasonable high-level overview of cricket tournament schedules and motivates interested OR'ers to further investigate this problem. As a cricket tragic and OR professional, I would be happy to contribute toward any such effort.

(To be submitted as an entry to the March-2011 Informs Blog challenge: OR and Sports)

Thursday, March 3, 2011

Low Probability High Consequence Sporting Wins

LPHC events are always interesting since the implications (conditional expected cost), given the occurrence of the event, tend to have cascading side-effects. In risk-management analytics (e.g. optimal Hazmat routing), limiting the conditional expected risk turns out to be important. On the other hand, in the realm of sport, LPHC events are often quite desirable. Who doesn't love to hear and re-hear those stories about back-to-the-wall fight-backs and come-from-nowhere wins? However, among such magic moments, only a small subset have long-term and wide-ranging implications. Often, we have to wait for years or decades to see how the story unravels and how far the ripple effects go.

Yesterday's massive sporting upset of England by Ireland in the cricket world cup - yet another (truly) sensational match in my hometown within a week - has the potential to fall under this category for a variety of reasons. First some highlights of Kevin O'Brien's amazing counter-attack.

USA beat England in the 1950 soccer world cup - an incredible upset, but one that ultimately did not make much of a difference to the sport (arguably). On the other hand, Joe Namath's 'guaranteed' win in Super-bowl III could be termed a LPHC event since it seems to have played a big part in strengthening the NFL brand by creating a gripping storyline for the future. Or more recently, the 'Invictus' rugby story of South African Springboks that united a nation on the verge of being torn apart by the after-effects of the inhuman apartheid - yes, it is a true story. Sports, like politics is local and everybody has their favorite LPHC picks. For brevity, the top two on my short-list of LPHC sporting upsets are:

1. The 1980 miracle on ice. The uplifting impact of this story on people who haven't even stepped on ice, and the inspiring convergence of the sequence of events, imho, transcends sport, nationality, and time.

2. Was the 1980s the last and greatest decade of pure sporting action around the world, across all sport? India's cricket win over the invincible Caribbean world champions in the 1983 world cup final. If I recall, the odds of team-India winning the world-cup was something like 50, 000 to 1. A eleven-person nobody-group of not particularly athletic cricketers, speaking different languages, practicing many different religions, and coming from different backgrounds, yet each proudly representing a poverty-ridden, but democratic nation of 800 million hopefuls. Competing against the supremely powerful, atrociously talented, all-conquering team that was the West Indies. Watch the trailer to the wonderful documentary film tribute to this Windies team ("Fire in Babylon").

Notice the almost subdued celebrations in those days, and compare with the over-the-top ones by Indian cricketers today!

Incredible though that day was, nobody predicted the after-effects.Events on that London day started a three-decade long perfect cricketing storm that is now a multi-billion dollar professional sports-and-media franchise with a billion-person advertising market, and growing.

Ireland as a nation is financially reeling. Many people are moving out of Ireland, bringing back memories of the potato famine days in the 19th century. Cricketing-wise, they may not even be allowed to compete in the next world cup in 2015. As if all this wasn't enough, they still have to overcome (?) the proverbial luck of the Irish. Can they do the impossible?