
Sunday, February 8, 2015

RT if you agree, FAV if you don't: Does it work?

A simple way of binary polling on twitter that reaches a large number of voters, especially if the polling handle (@H) is sufficiently popular, is to get followers to re-tweet (RT) the message if they agree with the proposition P, and 'favorite' (fav) it if they disagree.


In the above poll, while Kejri is as phoney as they come, the result shouldn't be this skewed.

(This is just a cursory analysis)

Suppose m followers of @H RT, and n 'fav'. This gives us an m/n ratio among @H's followers. Next:
a) The message gets forwarded to the followers of the m "RT" handles
b) The message is not forwarded to the followers of n "fav" handles

and this RT'ing process continues. As long as the 'agree' nodes reached via (a) represent random samples, this poll can work reasonably well despite only the ayes propagating the message, i.e., the RT probability of the followers of an 'agree' handle (@YES) should not be influenced by @YES' RT (i.e., there are no significant 'vote-banks'). If birds of the same feather flock together, this assumption may not hold, and we are likely to see a disproportionate number of 'agree' votes. Conversely, when the poll reaches a 'disagree' handle whose followers are also likely to disagree, the message is not forwarded to those followers, so many of the 'disagree' nodes of the social network are skipped. The final result may be skewed in favor of 'agree'. Of course, the nays need not just 'fav'. They can create their own separate tweet that links this poll, and share it with their followers. Perhaps @H can send out a modified tweet:

'Proposition P: RT if agree. Fav and share this msg if u don't'.

This additional step may restore some balance. However, the burden of extra work falls on the 'fav' group, and only the enthusiastic 'fav' folks are likely to oblige. The final result may remain skewed in favor of 'agree' but could yield a better result compared to the unmodified polling tweet.

Alternatively, @H could send out two independent tweets, the first requesting an RT for agree, and the second, an RT for 'disagree'. Will this 'RT competition' work better? (RT if you agree, RT if you don't :)
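To see how the 'birds of a feather' assumption above can skew the measured ratio, here is a toy Monte Carlo sketch in Python. Every parameter (follower counts, reach per RT, the 'homophily' probability) is a made-up illustration, not an estimate of real Twitter behavior: only 'agree' voters forward the poll, and each exposed follower shares the opinion of the handle they follow with the given probability.

import random

def simulate_poll(n_followers=5000, p_agree=0.5, homophily=0.8,
                  reach_per_rt=10, rounds=2, seed=0):
    """Toy model of 'RT if agree, fav if not' polling.

    Round 0: @H's own followers vote and form an unbiased sample.
    After that, only RTs propagate: each 'agree' voter exposes
    reach_per_rt of their followers to the poll, and those followers
    agree with the handle they follow with probability `homophily`.
    'Disagree' voters only fav, so their followers never see the poll.
    """
    rng = random.Random(seed)
    frontier = [rng.random() < p_agree for _ in range(n_followers)]
    rts = favs = 0
    for _ in range(rounds):
        next_frontier = []
        for agrees in frontier:
            if agrees:
                rts += 1
                for _ in range(reach_per_rt):
                    next_frontier.append(rng.random() < homophily)
            else:
                favs += 1          # a fav does not forward the message
        frontier = next_frontier
    return rts, favs

for h in (0.5, 0.7, 0.9):
    rts, favs = simulate_poll(homophily=h)
    print(f"true agree share = 0.50, homophily = {h}: "
          f"measured share = {rts / (rts + favs):.2f}")

With homophily at 0.5 (followers behave like random samples), the measured share stays close to the true 50%; as homophily rises, the 'agree' share balloons, which is exactly the vote-bank effect described above.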

Saturday, August 30, 2014

The Best Decisions are Optimally Delayed

The lessons learned from the last few years of practice have convinced me that analytics and OR (OR = Operations Research), or at least MyOR, is mainly about learning the art and science of engineering an optimally delayed response. Good analytics produces an optimally delayed response.

But why introduce a delay in the first place? Isn't faster always better? 'Science of better' does not always mean 'science of faster'. From age-old proverbs we find that between 'haste makes waste' (the knee-jerk reaction) and being 'too clever by half' lies 'look before you leap'. If we view Einstein from an OR perspective: "Make things as simple as possible, but not simpler", the reason becomes clearer. We must make situation-aware and contextually-optimal decisions as fast as possible, or as slow as necessary, but not faster or slower, i.e., there exists a nonzero optimal delay for every response decision. It is a middle path between a quick-and-shallow suboptimal answer and a slow-and-unwieldy 'over-optimized' recipe. Of course, one must work hard during this delay to maximize the impact of the response, and put Parkinson's law to good use, as suggested below:
See this old post on 'optimally delaying an apology' to maximize benefit to the recipient, or recall the best players of every sport being able to delay their response by those few milliseconds to produce moments of magic, or Gen. Eisenhower delaying the call to launch D-Day. In the same way, a good OR/analytics practitioner will instinctively seek an optimal delay. For an example of this idea within an industrial setting, read this excellent article by IBM Distinguished Engineer J. F. Puget on taxicab dispatching that he shared in response to the above tweet. Implication: If your analytics system is responding faster than necessary, then slow it down a bit to identify smarter decision options. The 'slower' version of this statement is more obvious and is a widely used elevator pitch to sell high-performance analytics and optimization tools.

The history of the 'optimal delay' goes back many thousands of years, to the writing of the world's longest dharmic poem, the Mahabharata, which also includes within it the Bhagavad Gita, one of the many sacred texts of Hinduism.

(pic link: http://www.indianetzone.com)

The story about how this epic came to be written is as follows:
Krishna Dvaipayana (the Veda Vyasa) wanted a fast but error-free encoding of the epic that could be told and re-told to generations thereafter. The only feasible candidate for this task was the elephant-faced and much beloved Ganesha, the Indian god of wisdom and knowledge, and remover of obstacles. The clever Ganesha agreed to be the amanuensis for this massive project on the condition that he must never be delayed by the narrator, and must be able to complete the entire epic in ceaseless flow. Veda Vyasa accepted but had his own counter-condition: Ganesha should first grasp the meaning of what he was writing down. This resulted in a brilliant equilibrium.

Veda Vyasa composed the Mahabharata in beautiful, often dense verse that Ganesha had to decipher and comprehend even as he wrote it down, without once lifting from the palm leaves the piece of his tusk he had broken off to use as a stylus. If Ganesha was too slow, it would give Vyasa the opportunity to increase the density and frequency of incoming verses, which might overload even his divine cognitive rate. If he went too fast, he would risk violating Vyasa's constraint. Similarly, if Vyasa was too slow, he would violate Ganesha's constraint. If he went too fast, his verse would lose density and risk becoming error-prone, and of course, Ganesha would then not have to think much and would perhaps write it down even faster. Imagine, if you will, a Poisson arrival of verses from Vyasa divinely balanced by the exponentially distributed comprehension times of Ganesha. Writer and scribe optimally delayed each other to produce the greatest integral epic of wisdom ever known: written in spell-binding Sanskrit verse, without error, and flowing ceaselessly to this day without pause.

I can think of no better way to celebrate Ganesha Chathurthi than to recall and apply this lesson in everyday life.

Monday, November 4, 2013

Operations Research for the SmartGrid - 2: Optimization Problems

This sequel to part-1 provides an overview of a few specific optimization models employed within three different Smart-Grid areas. I found these samples to be interesting from a practitioner's perspective. INFORMS publishes a lot of good papers in these areas, and is an excellent source of references.

Managing Electric-Vehicle (EV) Charging within a Smart-Grid
EV System Perspective: 
EV problems are fun. Researchers in Hong Kong have investigated efficient online charging algorithms for EVs without future information, where EV charging is coordinated to minimize total energy cost. EVs arrive at the charging station randomly, with random charging demands that must be fulfilled before their departure time. Suppose that N EVs arrive during a time period T, indexed from 1 to N according to their arrival order.  The optimal charging scheduling problem minimizes a total (convex) generation cost over time T, by determining the best charging rate for EV i at time t, x(i, t), subject to various constraints.

The difficulty of efficient charging control comes mainly from the uncertainty of each EV's charging profile. Prior approaches assume that the arrival time and charging demand of an EV are known to the charging station prior to its arrival. However, the HK team have come up with an optimal online charging algorithm that schedules EV charging based only on the information of EVs that have already arrived at the charging station. They show that their approach incurs less than 14% extra cost relative to an optimal offline algorithm, and that this gap can potentially be reduced even further.
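As a concrete (and heavily simplified) illustration of the offline version of this scheduling problem, here is a small sketch using SciPy. The instance, the quadratic generation-cost function, and all the numbers below are assumptions made purely for illustration; the HK paper's actual online algorithm and cost model are not reproduced here.

import numpy as np
from scipy.optimize import minimize

# Toy instance: 3 EVs, 6 time slots; every number below is made up.
T = 6
evs = [  # (arrival slot, departure slot, energy demand, max charging rate)
    (0, 4, 6.0, 3.0),
    (1, 5, 4.0, 3.0),
    (2, 6, 5.0, 3.0),
]
N = len(evs)

def total_cost(x_flat):
    """Convex generation cost of the aggregate load (here: sum of squares)."""
    load = x_flat.reshape(N, T).sum(axis=0)
    return float(np.sum(load ** 2))

# Rate limits x(i, t) inside each EV's availability window, zero outside it.
bounds = [(0.0, rmax if a <= t < d else 0.0)
          for (a, d, _, rmax) in evs for t in range(T)]

# Each EV's charging demand must be met before it departs.
constraints = [{"type": "eq",
                "fun": lambda x, i=i, dem=dem: x.reshape(N, T)[i].sum() - dem}
               for i, (_, _, dem, _) in enumerate(evs)]

# Feasible starting point: spread each demand evenly over its window.
x0 = np.zeros((N, T))
for i, (a, d, dem, _) in enumerate(evs):
    x0[i, a:d] = dem / (d - a)

res = minimize(total_cost, x0.ravel(), method="SLSQP",
               bounds=bounds, constraints=constraints)
print("total cost :", round(res.fun, 2))
print("x(i, t)    :\n", np.round(res.x.reshape(N, T), 2))

The solver spreads the charging so that the aggregate load is as flat as possible, which is the intuition behind minimizing a convex generation cost.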

EV User Perspective
Cool routing algorithms can also be developed from a user perspective. One interesting work I came across takes a more holistic and rigorous view of the simple Tesla routing problem that I blogged about a while ago. Researchers in Europe look at a routing subproblem within the more general context of managing congestion at EV charging stations and minimizing the impact of EVs on grid performance. They employ an algorithm to compute the best charging points to visit based on the estimated travel time to all charging points (shortest-path, and reachability based on available battery power) and the point-specific charging cost. Input: O-D locations, initial battery level, desired charging level for the EV user, and the preferred time of departure. The optimal route, the subset of intermediate charging points, and the slot at which the EV is going to charge are returned as output.

Demand Response and Pricing
Revenue Management and supply chain analytics folks will be pretty familiar with this area. Smart devices can be programmed to automatically (with human overrides) respond to price changes by reducing or rescheduling electricity usage. A couple of differences here from standard RM/SCM problems: i) unlike supply chains that move manufactured widgets through warehouses and distribution centers, it is quite difficult to efficiently store and "ship" electricity using batteries, although the technologies are getting better each day. Thus, the electricity we use was probably produced less than a second ago, and marginal costs can spike during peak periods; ii) electricity demand tends to be incredibly inelastic, making it stubbornly resistant to pricing changes, unlike, say, smart-phones.

We noticed that prior approaches that apply peak-hour "congestion" pricing tended to 'migrate' rather than mitigate peaks. By carefully combining peak and off-peak pricing with accurate short-term load forecasting, and jointly optimizing the entire price profile, it is possible to proactively flatten the overall predicted load profile by inducing customers to make small shifts in their usage. Even a small peak reduction or shift during high-load days can result in substantial savings. In fact, our experiment using actual Smart-Grid data showed that even a half a percentage point peak reduction using optimization could potentially lead to more than a 25% reduction in cost, which can benefit both the customers and the utility companies.
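Here is a stylized sketch of the 'jointly optimize the whole price profile' idea, written as a small linear program with SciPy: choose an hourly price multiplier so that the price-responsive load has the lowest possible peak, while keeping the average multiplier at 1. The base load, the elasticity value, and the bounds are invented for illustration, and the linear own-price response is a gross simplification of real demand-response behavior.

import numpy as np
from scipy.optimize import linprog

# Toy hourly base-load forecast (MW) and own-price elasticity; all made up.
base = np.array([50, 48, 47, 55, 70, 95, 110, 120, 115, 90, 70, 60], float)
T = len(base)
e = -0.2                      # a 1% price increase trims demand by 0.2%

# Variables: hourly price multipliers m_1..m_T, plus the peak load z.
# Minimize z subject to  base_t * (1 + e*(m_t - 1)) <= z  for every hour,
# with the average multiplier held at 1 (a crude revenue-neutrality proxy).
c = np.zeros(T + 1); c[-1] = 1.0
A_ub = np.zeros((T, T + 1)); b_ub = np.zeros(T)
for t in range(T):
    A_ub[t, t] = base[t] * e          # base_t*e*m_t - z <= base_t*(e - 1)
    A_ub[t, -1] = -1.0
    b_ub[t] = base[t] * (e - 1.0)
A_eq = np.concatenate([np.ones(T), [0.0]]).reshape(1, -1)
b_eq = [float(T)]
bounds = [(0.8, 1.2)] * T + [(0.0, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
m = res.x[:T]
new_load = base * (1 + e * (m - 1))
print("peak before:", base.max(), " peak after:", round(new_load.max(), 1))

Even in this toy, the optimizer raises prices a little during peak hours and lowers them off-peak, shaving the peak without touching total priced hours, which is the flavor of the joint-profile optimization described above.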

Smart-Grid Control
Among the variety of problems solved here, researchers are also looking at the security-constrained optimal power flow, which aims to minimize the total cost of system operation while satisfying certain contingency constraints. This smart-grid formulation extends the standard optimal power flow (OPF) problem, which determines a cost-minimal generation schedule while satisfying hourly demands, energy and environmental limits, and network security goals. Optimization methods used here include Benders decomposition, as well as Lagrangian multiplier techniques. Some of the newer variations employ distributed algorithms that are designed to work on a massive scale.

These are just a few samples that caught my attention. There are a variety of other problem areas being addressed (e.g. batteries, renewable energy sources, micro-grids), all of which involve some type of optimization.

Saturday, October 26, 2013

Operations Research for the SmartGrid - 1

A popular joke in my undergrad campus at IIT-Madras used to be "why is the large water tank in our campus not used? Answer: the design engineers did not take the weight of water into account". The legend may be as real as the croc in the campus lake, but newspaper reports a few days ago quoted a US government spokesperson saying that the 'health insurance website worked correctly, but just did not take the volume into account'. I'm sure a lot more attention was paid to the voters within the sophisticated analytical models used during the 2012 elections. Volume was not a problem then, somehow. Actions reflect priorities, as Gandhi said. So what are the priority areas in Smart-Grid research?

I recently attended the IEEE SmartGridComm 2013 international conference in the beautiful city of Vancouver, Canada. (A very brief historical tangent: from my Indian-American immigrant perspective, Vancouver is also a somber reminder of the discrimination once practiced by the US and Canadian governments, exemplified by the Komagata Maru incident.) The paper presentations were refereed entries, uniformly of high quality, and largely focused on the dizzying science and technology associated with the various elements of the smart-grid (electric vehicles, batteries, wind, solar, communications, security, ...). Marry this with 'Big Data' and you get the convoluted buzz of two 'hyperbolic' distributions. Personally speaking, the glaring problem was this: the tech part felt overcooked, and the human part somewhat overlooked, save for this five-minute talk and the excellent keynote talks, which emphasized the latter (a favorite keynote comment described the important and immediate practical problem of 'transmission optimization' as the drunken uncle of the smart grid - largely ignored, but full of smart ideas). Several others at the conference shared this opinion: the single most important component of the Smart-Grid remains the people for whom it is being built in the first place. If anything, understanding their behavior and impact is more important than ever before.

The world of electrical system modeling is full of elegant math that manages electrons flowing through circuits obediently as dictated by the equations. These models match up relatively well with reality (even imaginary numbers work here). In contrast, real-world ORMS projects usually begin with people's real and changing requirements, and culminate in finding lasting solutions for real people, using noisy and incomplete SmallData. Unlike widgets, packets, and electrons, accurately modeling human response largely remains an open challenge, and the temptation to simply ignore this component of the SmartGrid is strong. However, the empirical, perhaps paradoxical, lesson I've learned the hard way is that the more effectively we want to mechanize, automate, and optimize systems by reducing or eliminating manual intervention (i.e. save humans from humans, a la Asimov's robots), the more practically important it becomes for our optimization models to take into account the behavior of, and the implications for, all the human stakeholders, upfront. Be it workforce scheduling, Big Data analytics, or the SmartGrid, an ahimsa-based multi-objective approach that also minimizes harm or maximizes benefit to the human element and blends harmoniously with the environment is likely to be more sustainable. Which is another way of saying: the SmartGrid is one heck of an OR opportunity and I'm glad to be a small part of this journey.

The next part of this series will review some interesting SmartGrid optimization problems.

Wednesday, September 4, 2013

Sine-Generator in the Aryabhatiya, 1500 years ago

Came across this post by Rajiv Malhotra on facebook, and am posting a screenshot.  More research and findings continue to trickle in about the discoveries and creations of ancient Indian mathematicians, and this sloka (verse) is not an isolated example. The focus of this brief post is on the design of the sloka from the data compression optimization aspect.

Brevity and conciseness were important to ancient Indian scientists and linguists for various reasons. Their success in optimally designing such slokas to be memorized, by maximizing the density of information carried per unit syllable while also (and this is important) ensuring that the sloka is logically sequenced, well constructed, and relatively easy to memorize, is utterly remarkable. In particular, the affection for the minimal among those Sanskrit grammarians is legendary. There is an oft-repeated saying in ancient India that the joy experienced by a grammarian who manages to achieve a half-syllable reduction is equaled only by the joy at the birth of a child. The choice of the objective function drives the design approach. In contrast with the previous example, we have the Vedic chants in Sanskrit, the world's oldest unbroken surviving oral tradition, where redundancy was deliberately added to minimize error in oral transmission (an achievement that may have contributed toward India having the world's oldest, unbroken civilization). In this case, what was memorized and repeated was much, much longer than the original verses in order to optimize information and audio fidelity.


(Mysore is a city in Karnataka, India, whose state language is Kannada)

Update (Sept 5, 2013)
The English transcript of the sloka can be found here in the Nature Journal (thanks to Srikrishna for sharing this link).


Update 1: December 15, 2013
A superb post at http://hindufocus.wordpress.com on a compressed Sanskrit verse for remembering the value of Pi to 32 decimal places.
"..... Here is an actual sutra of spiritual content, as well as secular mathematical significance.
gopi bhagya madhuvrata
srngiso dadhi sandhiga
khala jivita khatava
gala hala rasandara
..... The translation is as follows:
O Lord anointed with the yogurt of the milkmaids’ worship (Krishna), O savior of the fallen, O master of Shiva, please protect me."

Wednesday, July 31, 2013

86400X speedup?

I once read a research paper that stated that their customized nonlinear solver reduced computational time for a particular problem class from days to seconds, i.e., something like an 86400X speedup. 
(pic source link: espn.go.com)

Digging a little deeper, it seems the authors did not notice prior work that solved similar sized instances of a more difficult discrete nonlinear case, using an analogous CPLEX-based approach, in a few seconds to a few hours in the worst case. Even a conservative 'from several minutes to a few seconds' mean-improvement is impressive (~100X faster). After deleting complicating side-constraints and relaxing integrality restrictions, the resulting continuous relaxation can indeed be solved really quickly compared to the original problem.

Amartya Sen recently confessed to pulling numbers out of thin air to grab people's attention, lending credence to the claims of his detractors. I hope O(claims) does not turn into a total marketing game in the future.

Saturday, July 27, 2013

Exhaustive Search for #orms books in South India

Blog#2 from India.

A bookstore I've regularly visited over the last couple of decades is the 150+ year old Higginbothams on Mount Road, Madras (Chennai).

(pic source link: wikipedia)

My first ORMS textbook: 'Linear Programming and Network Flows' by Bazaraa, Jarvis, and Sherali was purchased here. In recent years, finding the latest ORMS books here has become tedious. The Chennai Metro rail construction has adversely impacted the parking area in front. Yesterday's search spanned the entire second floor, and ORMS/Business Analytics books were found in the following sections: Management (Management Science books), Operational Management (OM, Supply Chain), Statistics (a slew of Intro2ORMS books, probability models, queuing theory), Linear Algebra (Linear Programming), Computer Science (computational complexity, graph theory, combinatorics), ... Perhaps this situation is a reflection of the field of ORMS itself. It's been a multidisciplinary area since the beginning.

Sapna book house in Bangalore has a good collection of computer science oriented material.
(pic source link: images.touristlink.com)

SBH once boasted of a huge collection of technical books, and it's still a pretty decent place to look for books relating to business analytics, data mining, and machine learning.

Gangarams is another well-known book store in Bangalore.
(pic source link: dnaindia.com)

I've found very few ORMS books here in the past, and I plan to skip it this time around.

Of late, flipkart.com, the Amazon-like online bookstore, offers a massive assortment of books, including ORMS titles. It currently ships only to Indian addresses, accepts credit cards, and offers free home delivery within 2-7 business days. Despite these benefits, online shopping denies one the simple joy of a leisurely enumeration search in the afternoon, wading through stacks of old, dusty tomes in search of that one hidden gem. In a totally random location at Higginbothams was a copy of 'Optimization for Machine Learning'.

Sunday, July 14, 2013

Analytics and Cricket - XI: Using DRS Optimally

The Ashes
The first Ashes cricket test, which concluded earlier today in England, triggered this post.
(pic source link: guardian.co.uk)

This post is again related to the Decision Review System (DRS) that combines machine and human intelligence to support evidence-driven out/not-out decisions in cricket. The previous cricket-related post can be found here where a reader critiqued an earlier post on the false positives issue of DRS (that saw a lot of visits last week during this cricket match).  It's now apparent that England won the match thanks in part to their superior use of DRS compared to Australia.

DRS
The DRS consists of 3 components:
1. The human (a team of umpires, cameramen, and hardware operators)

2. The hardware (hot-spot, slow-mo cameras, snick-o-meters, etc). These are the data gathering devices.

3. The Analytics and Software (ball-tracking and optimal trajectory prediction, aka 'Hawkeye' based system)

This gives us three separate (but possibly correlated) sources of error:
a. Operator error

b. Hardware error: Technology limitations - resolution, video frame rates, hardware sensitivity, etc. may be inadequate at times for sporting action that occurs as fast as 100 mph or spin at 2000 rpm ...

c. Prediction algorithm error: Given such variations in sporting action, a forecast of the future trajectory of the ball is also subject to uncertainty.

(pic source link: dailymail.co.uk)

A smart user, after sufficient experience with the system, will be able to grasp its strengths and limitations. In test cricket, a team is allowed no more than two unsuccessful DRS reviews per innings. Thus it is a scarce resource that must be cleverly used to maximize benefit. In fact, the DRS is an example of a situation where the use of a decision support system (DSS) itself involves decision-making under uncertainty using a meta optimization model.

Optimal Usage of DRS
There are several factors that dictate when the trigger must be pulled by a cricket captain to invoke DRS to try and overturn an on-field umpiring decision.
i. The probability of success (p)
ii. The incremental reward, given a successful review (R)
iii. The cost of an unsuccessful review = cost of status quo (set to 0, normalized)
iv. The expected future value of having k reviews still available in the inventory (a concave, increasing f(k), with f(0) = 0).

Reviewing based only on (ii) is like a "Hail Mary" and banks on hope. On the other hand, paying exclusive attention to (i) may not be the best approach either, since it can result in a captain using up the reviews quickly, reducing the chances of taking advantage of the DRS later when "you need it the most". A person who doesn't use DRS at all (or uses it too late to have an impact) leaves unclaimed reward on the table.

Probability Model
We'll start with a simple model. It's not perfect, or the best, but merely a good starting point for further negotiations.

The value of do-nothing  = f(k).
The value of a DRS review = p[R + f(k)] + (1-p)[f(k-1)] = pR + pf(k) + (1-p)f(k-1).

It is beneficial to go for a review when:
pR > (1-p)[f(k) - f(k-1)] or

p/(1-p) > [f(k) - f(k-1)]/R

i.e., the odds must be greater than (marginal value of retaining a review) / (incremental reward)

In other words:
it is good to review when the odds of overturning the on-field decision exceed the ratio of the expected cost of losing a DRS review to the expected incremental reward.

Use Case: for fifty-fifty calls (p = 0.5) with a single DRS review in the inventory, you would want to review only if you are convinced that the present reward is likely to exceed the cost of being left without DRS for the remainder of the innings. For a fixed reward, the RHS increases steeply after the first unsuccessful review due to the concave f. To be really safe, you want to risk a second and final unsuccessful review only when you can trigger a truly game-changing decision that greatly increases the chances of winning the match. In general, R may be neither a strictly increasing nor a decreasing function of time. This is especially true in limited-overs cricket, where a game-changing event can occur very early in the game. However, in soccer, baseball, or basketball, R can be reasonably approximated as an increasing function over time. In general, it makes sense to save the review for the end-game. In any sporting event, including cricket, that is heading for a close finish, it may be beneficial to delay the use of a review.
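The decision rule above is easy to code up. In the sketch below, the shape of f and the value of R are invented purely for illustration; the only substance is the odds-versus-marginal-value comparison derived earlier.

def should_review(p, R, f, k):
    """Review only when the odds of overturning the decision exceed the
    ratio of the marginal value of holding on to a review to the reward.

    p : probability the on-field decision is overturned
    R : incremental reward if the review succeeds
    f : f(k), expected future value of having k reviews left
        (concave, increasing, f(0) = 0)
    k : number of unsuccessful reviews still available
    """
    odds = p / (1.0 - p)
    marginal_value = f(k) - f(k - 1)
    return odds > marginal_value / R

# Illustrative concave inventory value and reward (numbers are made up):
f = lambda k: [0.0, 1.0, 1.6][k]              # f(0)=0, f(1)=1, f(2)=1.6
print(should_review(p=0.5, R=1.0, f=f, k=2))  # True: 2nd review is cheap to spend
print(should_review(p=0.5, R=1.0, f=f, k=1))  # False: the last review is precious

With both reviews in hand, a fifty-fifty call clears the bar; with only one left, the same call does not, matching the use case above.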

In the Ashes test, Michael Clarke, the Australian captain, appeared to pay more attention to 'R' and less attention to 'f', and was left without recourse at a crucial stage, which hurt his team. On the other hand, the England skipper Alastair Cook delayed the use of DRS: the last wicket of a closely contested match fell when the game had reached a climax (R = R_max), and was DRS-induced. Thus, optimally delaying DRS involves constantly assessing and updating risk versus reward, and pulling the trigger when the odds are in your favor.

Analytical Decision Support Systems
A smart organization will be aware of the strengths, weaknesses, and value of DSS-based decisions. In some industries characterized by shrinking margins, even small incremental gains in market share or profitability from a DSS can alter the competitive landscape of the market. This motivates an interesting question: if two firms employ the same decision analytics suite provided by the same vendor, does the advantage necessarily cancel out? Or, as in cricket, can one firm do a better job of maximizing value from the DSS to gain a competitive advantage?

Updated July 17, 2013:
It turns out that wicketkeeper Matt Prior was instrumental in ensuring England's good DRS strategy. As we all know, a good prior saves your posterior during crunch time!

Thursday, June 27, 2013

Bengal Holocaust: The Analytics of Mahalanobis - 3

Part-1: Storm Chaser. Part-2: Traveling Surveyor.

The Forgotten Holocaust
When the Nazi death camps in Europe were discovered by the allied forces toward the end of WW2, General Eisenhower ensured that many people, including allied troops as well as local civilians got a good look at those concentration camps, and had the evidence documented for posterity. Today, after more than six decades, the belief that the Jewish holocaust actually happened (with a probability of 1.0) is accepted by most of the world. A key step in such situations is to record as much evidence, and obtain multiple and independent verification of events in a scientific and unbiased manner, when the relevant facts still have a fresh time-stamp and eye witnesses are alive and willing to go on the record.

While the Jewish population, and others in Europe were being cleansed from the world of the fictional master-race of Aryans (another byproduct of the discredited Aryan Invasion Theory (AIT), which was initially concocted by a few 19th century German researchers analyzing Sanskrit texts, and later 'weaponized' by the British Raj for empire building), occupied Bengal in India was dying of starvation and disease. Some historians have attributed the deaths of these 3-6 million innocent Bengalis in the 1940s to the British occupation policy - one that was inspired by the very same AIT that the Nazis whom they were fighting, also subscribed to. However, there was neither an Eisenhower nor his resources to allow an exhaustive audio-video-text recording of the facts surrounding this tragedy in Bengal. Millions of innocent Indians were claimed by that holocaust 70 years ago, and much of the world knows very little about it.

Let's look at the data.

The Bengal of Mahalanobis, 1943
The British empire's occupation model ensured the conversion of India from a thriving manufacturing and knowledge hub into an agrarian, subsistence economy. India turned into a supplier of slave-wage labor as well as a market for Britain's finished goods. If this wasn't enough, the empire monopolized the mother-of-all drug running operations whose supply chain originated in Bengal. Bengal's native economy was badly dented when the second world war arrived.
 
An old map of India (possibly) in the 1940s. Bengal is situated in the eastern part of India.
(pic source link: gypsyscholar.com)

Mahalanobis had just published reports of his pioneering work on optimized random sampling to estimate the Jute crop in Bengal (an effort that ranks among the earliest victories for the field of Operations Research). A reading of his research papers points to a person who was a dispassionate collector of data and details, a stubbornly methodical person (his detailed breakdown of costs incurred during the jute crop sampling optimization project is mind-boggling), and a person who was enthusiastic about, and skilled in, the application of analytics to solving real problems.

Mahalanobis was requested by a representative of the occupied people to do something to bring to light the facts surrounding the Bengal holocaust. Thus it transpired that an Operations Researcher / Applied Statistician, and not a military general, was entrusted with the job of recording the facts of the Bengal holocaust that occurred between 1942-1944.

An increasing number of historians (and many blogs) today are coming around to the view held by many in Bengal - that this crisis was an inevitable outcome of British Raj policy. Going one step further, Madhushree Mukerjee has painstakingly compiled evidence and data that implicates the British Raj and the one person she felt was most responsible, Winston Churchill, in her recent book "Churchill's Secret War". The evidence is disturbing, and readers can make up their own minds.

(pic source link: http://www.marymartin.com)

The focus of this post is not on Churchill but to try and grasp the estimated scale and dimensions of the tragedy brought to light by the systematic findings of Mahalanobis' randomized survey.

The work of Mahalanobis
Mahalanobis applied for and, after considerable delay, obtained a grant from the Government to conduct his analysis. Time was of the essence, and he again employed his cost-effective yet accurate methods based on random sampling (an overview is provided in part-2 of this series), interviewing families located in different parts of Bengal. Approximately 16,000 randomly selected families from 386 villages were surveyed between July 1944 and February 1945. Detailed statistics were collected, including: loss of life in the family by age and gender; the mortgaging and/or sale, in part or in full, of farm land as well as of the cattle used to plough it; profession; economic status; etc. Bengal was divided into regions, classified by the degree to which they were affected by the famine. The survey design took this into account, and weighted averages were calculated to avoid over- or under-reporting of mortality rates.

Official forecasts of food supply and demand were of pretty poor quality and, as mentioned in the prior post, amounted to unreliable 'bad guesswork'. The survey report noted that Bengal was suffering from a food inventory deficit well before the crisis of 1943: the net annual import was 100K tons on average, and up to a million and a half tons during individual years, over the seven-year period 1933-39, i.e., even before WW2 started, from which point things could only have gotten worse. Data also showed that the pre-1943 rate of land sales was rising in a land where the primary occupation was agriculture and 76% of the family-owned farm land was already at or below the subsistence level. A good proportion of the cattle that was sold was not repurchased by native farmers but by outsiders (possibly for slaughter to supply meat to military personnel). All indicators pointed to an already desperate situation that left Bengal totally vulnerable to any supply-demand shock, which inevitably arrives sooner or later. In this case, the supply shocks arrived in the form of imperial Japanese troops storming Burma (today's Myanmar), and the apparent failure of the rice crop. Demand (and hence price) spikes cannot be ruled out either.

1943 was apocalyptic for Bengal. The Mahalanobis report measured the change in the already terrible economic indicators, as well as the increase in the number of destitutes. These numbers are shocking and point to the total absence of effective government intervention: A 300% increase in economic deterioration and 1200% increase in the rate of destitution (with young women affected the most) during the famine even as Britain appeared to stockpile food for itself. Bengal was a victim of depraved indifference of the worst kind.

(picture source link: boydom.com)

Fatalities
The sampling survey attempted to obtain the number of family members who had lost their lives during the food shortage. Mahalanobis estimated a mortality rate of 5.0% for men and 5.6% for women in an estimated 1943 population of around 61 million (derived from the 1941 census). By design, the survey excluded: infant and toddler deaths, individuals without families, and entire families that either perished or relocated out of Bengal. Mahalanobis was never provided funds to repeat this sample survey, so the fatalities in 1944 could not be estimated.

The next step was to establish a counter-factual: what would have been the 'normal' mortality rate had the disaster not taken place? He chose 1931 as the baseline year, since data was available. That rate was 4.0%.  Madhusree's book notes that the normal mortality rate for India then was 2.1%, so Bengal's 1931 rate was already nearly double that number - a stunning statistic. Note that a different counterfactual will give us a different 'missing number' value. 

Note: This post does not provide numbers from Amartya Sen's analysis because his numerical results have been shown to be shaky in multiple instances upon further examination (see this old dualnoise post for example, as well as external references 1, and 2).

Some of the reasonable numbers quoted by historians today are derived from Mahalanobis' carefully-designed and controlled sampling study. For example, Madhusree, after correcting for infant deaths, and noting the symmetrical distribution of mortality rates around December 1943, arrived at an estimate of 3 million incremental deaths during 1943-44. The total number of incremental fatalities across the war years is likely to be higher and comes eerily close to the number of Jews murdered in Europe by the Nazis. Also, if the reference counterfactual is taken as India's normal mortality rate (2.1%) to include the abnormal situation in Bengal during those years, this number further shoots up. Birth rates in 1943-44 dropped significantly as well. Based on these calculations, it is plausible that Bengal lost around 10% of its population during the war years.

The random sampling methodology for estimating the supply of jute that earned much revenue for Bengal, and for determining the food crop produce that fed it, was eventually re-used to estimate the number of Bengalis who died without having anything more left to sell or eat. Mahalanobis' carefully chosen words in his final summation read:
"... The famine of 1943 was thus not an accident like an earthquake or a flood, but the culmination of economic changes that were going on even in normal times."

Postscript
A remarkable finding about the millions of Indians who were left to starve to death in Bengal: there was no cannibalism anywhere.

Selected References
1. Several Sankhya journal articles of the 1930s-40s that cover the relevant works done in the ISI including:

'Mortality in Bengal in 1943'

'The Bengal Famine' - reprinted from 'The Asiatic Review', 1946.

'Report on the Bengal Crop Survey, 1944-45'

'The Sample Census of the Area Under Jute in Bengal in 1940'

'An Estimate of the Rural Indebtedness of Bengal', 1934

'Elasticity of Wheat in 1935 India'

'Indian Statistical Institute: Numbers and Beyond, 1931–47'

2. Madhusree Mukerjee, "Churchill's Secret War", pp. 266-273.

3. http://www.bowbrick.org.uk/Famine%20pages/famine.htm. Also see the 'key documents on the famine' page.

4. The New York Review of Books: "Did Churchill let them starve?", and the resultant exchange with Amartya Sen.

5. Madhusree's interview at harpers.org.

Updated June 29: mildly edited for brevity.

Sunday, June 16, 2013

Traveling Surveyor: Contributions of Mahalanobis to Analytics - 2

Bengal
This post is the second in a series of blogs based on the work done by P. C. Mahalanobis in the area of statistics, analytics, and operations research between 1930-1960. We move north-east from the location of our prior post on storm-flood forecasting (Orissa) to Bengal: a land of science, wisdom, and dharma that gave to the world a Vivekananda whose thoughts deeply affected Gandhi's contribution to India's freedom struggle, who in turn shaped the work of Martin Luther King, Jr., and Nelson Mandela, and thus the civil liberties of a significant population of the world.

Much of the discussion here is gleaned from ISI archives, websites, and various papers from Sankhya, ISI's flagship journal. Those interested in a more detailed and accurate analysis of this work are referred directly to ISI's journal material.

Jute
Bengal (including Bangladesh, formerly East Bengal) produces much of the world's jute. India is the largest producer and consumer of Jute today, followed by Bangladesh. Jute is an incredibly useful crop and has been a significant contributor to Bengal's revenue for a long time.  Here are some contemporary pictures of standing Jute crop in Bengal.



(pics source: informedfarmers.com)

Why this work is important
Prior to 1947, when India was still occupied by the British Raj (there's a very relevant reason for bringing this up, but we'll get to that later), forecasting the supply of this valuable and lucrative crop in Bengal was largely a product of bad guesswork. Like most other sectors in India, the agricultural sector too is highly decentralized, which means a myriad of tiny farms growing jute and other crops, all of which had to be surveyed if one wanted an exact, enumerated production number. Mahalanobis came up with an alternative in the 1930s-40s using methods derived from statistics and a field that is now termed 'Operations Research': a scarce-resource-optimized method for accurate crop forecasting. Today, the Government of India employs sophisticated remote sensing, including a satellite survey system, to improve crop forecasts, but the methods developed then are still relevant and valuable. The seminal work of Mahalanobis in developing an optimal sample-based survey is also interesting to read from a practitioner's perspective. The combination of ideas employed in the work done in the 1930s includes data analysis, statistical modeling, pilot studies, scarce-resource allocation, and mathematical optimization, and it ranks among the great achievements in the practice of Operations Research and Analytics.

A map of undivided Bengal, circa 1850 C. E. (source: http://jrahman.files.wordpress.com)

Motivation
Jute and cotton were two of the most important exports out of India after the manufacturing sector was crippled by the British Raj - 24% of the total revenue between 1927-37 came from jute. Estimating the total jute produce in Bengal up until the 1940s was largely a matter of guesswork; ad-hoc estimates provided by the administrative chain of the British Raj produced wildly varying numbers. Like other parts of India, cultivation in Bengal was decentralized and spread over nearly 100 million small farms, on average less than half an acre in area, spread over more than 60,000 sq. miles. Jute was grown in a subset of these farms. Furthermore, the cultivation lifecycle of jute is very short - about two months from planting to harvesting. Consequently, even if the administration was willing to cough up the expenses for an enumeration survey, covering all these farms within 8-9 weeks would be extremely expensive, if not impossible. Add to this the fact that many plots (30%) that cultivated jute also cultivated other crops in parallel. Thus, while in theory we can expect a total enumeration to give us near-zero error, in practice, allotting multi-crop areas to jute and other human-induced errors would introduce noise. In fact, the report states that the biggest negative associated with an enumerative survey was not the prohibitive cost but its unreliability, and this motivated Mahalanobis to develop and implement an alternative approach that accomplished the task at a fraction of the cost and time, and at a higher level of accuracy, using random sampling.

Random Sampling
The nearly 100M jute farms were spread over Bengal in a non-homogeneous manner. Some areas were densely cultivated, some sparsely. The approach was to partition the total area into zones i = 1, ..., n (with area A_i) that were internally homogeneous (kinda like the way finite element analysis is used in structural engineering). Within each zone, a number of areas or grids were selected and sampled at random. If a sufficient number of such grids are sampled, the average proportion of area under jute within a zone (J_i) can be estimated, which allows us to predict the total area under jute = sum(i) A_i * J_i.
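A minimal sketch of this stratified estimator on synthetic data (the zones, jute fractions, and noise model below are all made up; the only point is the sum(i) A_i * J_i computation):

import random

def estimate_jute_area(zones, grids_per_zone=50, seed=1):
    """Stratified random-sampling estimate of total area under jute.

    zones : list of (A_i, true_jute_fraction) pairs; the true fraction is
            only used here to generate synthetic grid observations.
    Returns the estimate sum_i A_i * J_i, where J_i is the sampled mean
    proportion of jute in zone i.
    """
    rng = random.Random(seed)
    total = 0.0
    for area, true_frac in zones:
        # Sample grids; each observed proportion is noisy around the truth.
        samples = [min(1.0, max(0.0, rng.gauss(true_frac, 0.1)))
                   for _ in range(grids_per_zone)]
        J_i = sum(samples) / len(samples)
        total += area * J_i
    return total

# Synthetic zones (area in sq. miles, jute fraction); purely illustrative.
zones = [(8000, 0.35), (15000, 0.10), (20000, 0.05), (17000, 0.20)]
true_total = sum(a * f for a, f in zones)
print("true area under jute :", round(true_total))
print("sampled estimate     :", round(estimate_jute_area(zones)))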

Decision variables
1. The partition of the total area into approximately homogeneous zones
2. The number of random samples within a zone
3. The area of a sample

For simplicity, we assume that the first decision of partitioning the area within Bengal is an external input and thus our focus is on optimizing the remaining two decisions.

Constraints
1. The cost of the whole operation depends on the second and third set of decision variables.  For a given budget, if the area of an individual sample is large, then the number of samples has to be reduced, and thus the samples would be more spread out and further away from each other.

2. The achievable precision (variance) varies similarly. If the sample area is large, the per-sample variance is smaller, but cost considerations limited the number of such large-samples, and this can hurt the overall variance accumulated across the zone. On the other hand, a smaller area in tandem with a larger number of such small-samples affects precision in an opposite manner.

Nonlinear Optimization Problem
Given either cost or precision as a hard constraint, select the sampling area and the number of random samples to maximize precision, or minimize cost.

Mahalanobis' approach attempts to model the change in variance and cost as continuous functions of the two decision variable sets. Once these functions are at hand, a local optimum is obtained using a derivative based Lagrange-multiplier method. Mahalanobis used this approach to tabulate the achievable precision for a range of cost levels.
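Here is a tiny numerical sketch in the same spirit: pick the number of samples and the sample area to minimize variance under a cost budget. The functional forms and constants below are assumptions chosen only for illustration, not Mahalanobis' fitted variance and cost curves, and a modern nonlinear solver stands in for his Lagrange-multiplier calculation.

import numpy as np
from scipy.optimize import minimize

# Stylized cost and variance functions; all constants are made up.
c_survey, c_travel, v0, budget = 2.0, 15.0, 100.0, 500.0

def cost(x):
    n, a = x                       # n samples, each of area a
    return c_survey * n * a + c_travel * np.sqrt(n)   # survey + travel

def variance(x):
    n, a = x
    return v0 / (n * np.sqrt(a))   # more/larger samples -> lower variance

res = minimize(variance, x0=[50.0, 1.0],
               bounds=[(1.0, 2000.0), (0.1, 10.0)],
               constraints=[{"type": "ineq",
                             "fun": lambda x: budget - cost(x)}],
               method="SLSQP")
n_opt, a_opt = res.x
print(f"samples ~ {n_opt:.0f}, sample area ~ {a_opt:.2f}, "
      f"cost = {cost(res.x):.1f}, variance = {variance(res.x):.3f}")

The budget constraint binds at the optimum, and the solver trades off many small samples against fewer large ones, which is exactly the tension described in the two 'constraints' listed earlier.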

An exploratory, small-scale (pilot) survey was initially conducted at a small expense as a proof-of-concept and proof-of-technology validation of the methodology prior to embarking on a full-scale project. This type of an approach is now widely adopted in many business analytics projects.

The effect of the decision variables on variance can be calculated using theoretical methods. However, human-induced errors were also common, and Mahalanobis used the idea of interpenetrating half-sample pairs, where two groups independently arrived at jute area estimates for a given location. There are many important details here that are left out for brevity. The cost calculation is detailed and empirical and depends on the nature of the survey, and among other things includes:
a. cost of staying and surveying at a given site - this depends on the size of the sampled area and time spent
b. cost of traveling from sample to sample - this depends on the distances between the chosen samples and the sequence in which they are visited.

Again, we have left out a humongous amount of cost calculation that was done. Reading the reports that came out of this work, one is amazed by the time and effort devoted to meticulously tabulating the various costs, going beyond 'ball-park' estimates to produce an accurate cost function. For example, cost calculation (b) depends on the solution to the corresponding traveling salesman problem.

The TSP
One of the many reports that came out of this project notes:
(source: Sankhya journal, 1940)

This cost calculation is reviewed by Applegate, Bixby, et al. in their book on the TSP and in Bill Cook's 'In Pursuit of the Traveling Salesman'. A literature review of this TSP in these books mentions that researchers later showed that the expected length of the optimal tour is approximately between (0.707, 1.27) times the square root of the number of samples visited in a unit square, so Mahalanobis' 1930s estimate was a remarkably good choice.
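A quick and rough way to see that constant-times-sqrt(n) behaviour is to scatter n random points in a unit square and measure a greedy nearest-neighbour tour; that greedy tour is only an upper bound on the optimal tour, but its ratio to sqrt(n) typically lands inside the (0.707, 1.27) band quoted above. A small sketch:

import math
import random

def nearest_neighbour_tour(points):
    """Greedy tour length through all points, returning to the start.
    This only gives an upper bound on the optimal tour length."""
    unvisited = points[1:]
    tour = [points[0]]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda p: math.dist(last, p))
        unvisited.remove(nxt)
        tour.append(nxt)
    length = sum(math.dist(tour[i], tour[i + 1]) for i in range(len(tour) - 1))
    return length + math.dist(tour[-1], tour[0])

random.seed(0)
for n in (100, 400, 900):
    pts = [(random.random(), random.random()) for _ in range(n)]
    print(f"n={n:4d}  tour length / sqrt(n) = "
          f"{nearest_neighbour_tour(pts) / math.sqrt(n):.3f}")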

Results and Business Impact
The cost- and precision-controlled random sampling approach proved to be revolutionary. It achieved greater precision at a fraction of the cost.  Specifically, the margin of error was +/- 2%, and the cost was 1/15 of an enumeration census that was performed the same year and found to be less accurate compared to the random sampling approach. Thus the benefit and return-on-investment of this analytical approach was successfully demonstrated in practice, which received widespread recognition and was later embraced by the Government of independent India for many nationwide surveys.

Prelude to Part-3: The Bengal Holocaust
Within a couple of years of the successful demonstration and publication of this work, Mahalanobis' Bengal lost between 2-6 million people to starvation and disease between 1942-1945, triggered in part possibly by a failure of the rice crop. The British Raj, locked in a grim Atlantic battle during WW2, may have suppressed reports and figures. It appears that most of the world, and even a vast majority of Indians, to this day remain unaware of the reality behind this event. How does one obtain a reasonable estimate of casualties due to this disaster? Who was responsible, and how? A recent book has brought this controversy into the open, and it appears that Mahalanobis (and his statistical sampling method) may have played a critical part in solving this puzzle.

To be continued.

Monday, May 20, 2013

Solve TRP, Drive Tesla

So the awesome and hyped Tesla Model S is beating Mercedes, BMW, and Audi in sales despite the $70K price tag. One problem is the current lack of charging stations required to recharge over long trips. CNNMoney reports:
"CNNMoney took a Tesla Model S on a test drive from Washington D.C. to Boston. The all-electric car drove 450 miles with two stops to recharge at Tesla's Supercharger stations in Delaware and Connecticut"



A second problem is that unlike conventional gasoline cars, which can be refueled very quickly, charging can take a relatively long time and adds significantly to your total travel time. In the near future, as the number of Teslas increases, the number of charging stations is likely to go up too. However, they will remain a scarce resource for some time. A future Tesla trip from New York to California will require many recharging stops, reminiscent of the old stagecoach problem of the Gold Rush era, which is used to illustrate the principles of Dynamic Programming.

Update: May30, 2013
Popular Mechanics reports:
"...the California-based EV automaker plans to expand its fast-charging "supercharger" network across the continental U.S. During today's announcement, Musk said New York to Los Angeles trips would be viable as soon as this winter. 

"When you get in a car you have the ability to go almost anywhere," Musk said. "That's what the supercharger network will do." 

The rollout will begin as soon as this summer, with the number of stations tripling to 27 nationwide by the end of June. In addition to increasing supercharger density along the already-established California and Mid-Atlantic coasts, Tesla will debut chargers in a handful of new metropolitan areas within the next few weeks, including Austin/Houston, Portland/Seattle, Boulder/Denver, and Chicago/Milwaukee. By the end of the year, Musk expects to have "a couple hundred" superchargers spread across the U.S. and southern Canada, with as little as 80 to 90 miles between stations along the Atlantic and Pacific coasts..."

Tesla Routing Problem
Given a road network G having a limited number of charging stations located at specific nodes, the Tesla Routing Problem (TRP) will have to:
Find the shortest feasible time path from origin O to Destination D in network G.

Feasibility requires the car to visit enough charging stations in a timely manner. Optimality requires that the sequence of visits be carefully selected such that the total time spent is kept to a minimum.

Total travel time = driving time + recharge time

This involves two decisions:
D1. Which recharging stations to visit en-route?
D2. How long to recharge at every pit stop?

Assumptions:
A1. The longer you recharge, the longer your available range (to a limit).
A2. All recharging stations are similar and have infinite capacity (no queuing!)
A3. Range is deterministic

Update on A2: The Tesla can charge on any outlet, but the time to recharge can vary depending on the source.

Clearly, optimally solving the TRP involves finding a solution to a Traveling Salesman Problem even in the special case of instantaneous recharge (Y = 0).

(Updated September 2013: As mentioned in the comments, the general case is not a TSP in the network of charging stations. Here, the assumption is that stations are sparsely located and we visit each one along the way. The title is changed to reflect this distinction).

This is no different from the (milder) challenge faced by conventional car drivers in unfamiliar areas where gas stations may be hard to come by (see this old post). Charge cannot be carried in jerry-cans. The shortage of charging stations does make every long Tesla trip a non-trivial problem to manage. Tesla owners have to deal with an NP-Hard Optimization Problem (heh) on every such trip, but in practice, it should not be difficult to find good quality, feasible TRP solutions.

The recharge time makes the TRP a bit more interesting. If two stations are close to each other, we don't have to recharge fully, and just need the minimum to reach the next stop - or perhaps we recharge fully, which allows us to reach a station that is further away but turns out to be more time-effective overall. Clearly, these two decisions are interlinked and may need to be jointly optimized. D1 is a binary decision variable X(i) - we either visit or don't visit the station at node i - whereas D2 is a semi-continuous variable Y(i) conditional on D1:

Y(i) <= U.X(i)
where U = maximum recharge time.
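To make D1/D2 concrete, here is a small dynamic-programming sketch for the special case discussed in the September update below: stations strung sparsely along a single route, so the only decisions are how much range to add at each stop (charging zero at a stop is the same as skipping it). Every number, the linear charging-rate model, and the battery discretization are assumptions made purely for illustration.

import math

# Toy instance along one route (all numbers are made up): leg_miles[i] is the
# distance from stop i to stop i+1; stop 0 is the origin, the last stop is
# the destination, and every intermediate stop is a supercharger.
leg_miles   = [120, 180, 150, 90]
speed_mph   = 60.0
range_full  = 240.0      # miles of range on a full battery
charge_rate = 3.0        # miles of range added per minute of charging
step        = 10.0       # battery-level discretization (miles)

def trp_time(leg_miles, start_range=range_full):
    """Minimum total time (minutes) = driving + recharging, via dynamic
    programming over states (stop, remaining range)."""
    levels = [l * step for l in range(int(range_full / step) + 1)]
    INF = math.inf
    best = {r: INF for r in levels}   # time to reach current stop with range r
    best[min(levels, key=lambda x: abs(x - start_range))] = 0.0
    for dist in leg_miles:
        drive = 60.0 * dist / speed_mph
        new_best = {r: INF for r in levels}
        for r, t in best.items():
            if t == INF:
                continue
            for target in levels:     # D2: how much range to hold when leaving
                if target < r or target < dist:
                    continue          # no discharging; must reach the next stop
                charge = (target - r) / charge_rate
                arrive = math.floor((target - dist) / step) * step  # round down
                cand = t + charge + drive
                if cand < new_best[arrive]:
                    new_best[arrive] = cand
        best = new_best
    return min(best.values())

print(f"best total time ~ {trp_time(leg_miles):.0f} minutes")

On this toy route the DP charges just enough, as late as is safe, which is the 'minimum to reach the next stop versus fill up now' trade-off described above.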

Another interesting and maybe even useful problem for Tesla-makers to solve is where to add new charging stations on the network G over time. What is the minimum number and location of stations required to guarantee feasible solutions to any TRP in the 48 states of continental US? How will stations deal with congestion in the future? and so on ...

I haven't driven a Tesla yet but I wouldn't mind relocating my office to inside one and telecommute.

Sunday, April 21, 2013

Fun with Coins: Optimizing The OR Cafe Price Menu

We've looked at a few coin-optimization models here before. ... We'll look at another one today. We'll try to price the menu at OR cafe. You can look at the detailed problem statement and the associated sample data here. As always, there's no claim that the method described here is a 'best' or even a 'correct' way to manage such a problem. This post serves as a useful starting point for analyzing this fun problem with coins and change.

Basically, the problem we are analyzing in this post is to change the existing prices of items on the menu by tiny amounts (+/- 2 cents) to maximize the number of (future) customer bills that end in zero or five cents (i.e., $x.05, $x.10, $x.15, ..., $x.95) after (13%) tax. Assuming that past transactions (200 of them) are a good indicator of future purchase patterns, we can select a portion of this data (say 60%) to 'train' our pricing model (MIP) on, and then evaluate its performance on the remaining 40% of the transactions.

Model
1. Decision variables: There are five pricing possibilities for the recommended price of an item p(i) = {p0(i)-2, p0(i)-1, p0(i), p0(i)+1, p0(i)+2}, where p0(i) = original price of the item in the menu.

2. Bill(transaction j) b(j) = sum(i) Aij.p(i)*(1+T/100)
where:
a. Aij = number of units of item (i) purchased in historical transaction (j)
b. T = percentage sales tax charged

3. The minimum (l(j)) and maximum (u(j)) values for b can be determined by setting all prices to their lower and upper bounds, respectively. We assume that fractional cents are rounded to the nearest cent. This gives a pre-computable range of feasible b values V(j) = {l(j), l(j)+1, ..., u(j)}.

4. Objective: Maximize the number of times b ends in 0 or 5 (using boolean indicator variable 'z' and 0-1 cost 'c') in the training set by selecting one price from the 5 possibilities for each item

CAFEMIP: Max sum(k,j) c(k,j)z(k,j)
subject to:
ii)  sum(i) Aij.p(i)*(1+T/100) = sum(k)V(k, j).z(k, j) for all j
iii) SOS-1 constraints: for the prices p(i), ensuring that exactly one of the five prices is picked for each item in the menu, and for z(j), ensuring that exactly one bill realization is active for each transaction.

Round-off Issue
The LHS of constraint 4(ii) can contain fractional cents, whereas the RHS only admits discrete values. This situation can be overcome as follows:
b(j) = sum(i) Aij.p(i) + sum(i) Aij.p(i)(T/100) = b'(j) + tax,
where b'(j) is integer. Every realization of b'(j) (=V'(j)) can be mapped to our 0-1 objective function contribution after accounting for the additional tax it generates. Thus, it is sufficient to retain the integer components of 4(ii) and rewrite it free of round-off error as:

sum(i) Aij.p(i) = sum(k)V'(k, j).z(k, j) for all j

CAFEMIP was optimized using CPLEX 12.x and implemented in Java. As a secondary objective, a tiny penalty was added to minimize the deviation of prices from the original value to avoid unnecessary price changes or revenue shifts.
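For readers who want to poke at the formulation, here is a minimal, self-contained sketch of the same model written in Python with the open-source PuLP modeller and its bundled CBC solver (the actual run above used CPLEX from Java). The three-item menu and four transactions below are made up; only the structure (price-choice SOS-1, bill-value indicators z, and the 0/1 'ends in 0 or 5' objective) mirrors CAFEMIP.

from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum, value

TAX = 0.13
# Tiny made-up menu (prices in cents) and transactions (item -> quantity).
base_price = {"coffee": 157, "donut": 95, "bagel": 124}
transactions = [
    {"coffee": 1, "donut": 1},
    {"coffee": 2},
    {"bagel": 1, "coffee": 1},
    {"donut": 2, "bagel": 1},
]
deltas = [-2, -1, 0, 1, 2]             # allowed price moves, in cents

def nice(total_cents):
    """1 if the after-tax bill ends in 0 or 5 cents, else 0."""
    return 1 if int(total_cents * (1 + TAX) + 0.5) % 5 == 0 else 0

prob = LpProblem("CAFEMIP_sketch", LpMaximize)

# Price choice: exactly one delta per item (the SOS-1 style constraint).
y = {(i, d): LpVariable(f"y_{i}_{d + 2}", cat=LpBinary)
     for i in base_price for d in deltas}
for i in base_price:
    prob += lpSum(y[i, d] for d in deltas) == 1
price = {i: lpSum((base_price[i] + d) * y[i, d] for d in deltas)
         for i in base_price}

# One bill-value indicator per transaction and feasible pre-tax total (cents).
obj_terms = []
for j, basket in enumerate(transactions):
    lo = sum(q * (base_price[i] - 2) for i, q in basket.items())
    hi = sum(q * (base_price[i] + 2) for i, q in basket.items())
    z = {v: LpVariable(f"z_{j}_{v}", cat=LpBinary) for v in range(lo, hi + 1)}
    prob += lpSum(z.values()) == 1
    prob += lpSum(q * price[i] for i, q in basket.items()) == \
            lpSum(v * z[v] for v in z)
    obj_terms += [nice(v) * z[v] for v in z]

prob += lpSum(obj_terms)               # maximize the number of "nice" bills
prob.solve()
print("bills ending in 0 or 5:", int(value(prob.objective)), "of", len(transactions))
for i in base_price:
    new_p = sum((base_price[i] + d) * int(value(y[i, d])) for d in deltas)
    print(f"{i}: {base_price[i]} -> {new_p} cents")

The sketch omits the secondary deviation penalty mentioned above; adding it would just mean subtracting a tiny weighted sum of |price move| terms from the objective.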

Results
1. Optimized over the entire 200 transactions (no hold-out sample). We count the number of times the cafe needed to round to the nearest 0 or 5 using the prices before and after optimization.

Roundings needed: original price:   135/200
Roundings needed: optimized price: 35/200
While the results on training data are really good, we do not know how well CAFEMIP would do on hidden data, so we re-run CPLEX on a subset of the data.

2. Optimized over the first 120 transactions only, and tested on the last (hidden) 80 transactions.
training data: original  cost =     77/120
training data: optimized cost = 20/120

--- CPLEX log ----

Tried aggregator 1 time.
MIP Presolve eliminated 34 rows and 17 columns.
Reduced MIP has 230 rows, 847 columns, and 1824 nonzeros.
Reduced MIP has 835 binaries, 0 generals, 0 SOSs, and 0 indicators.
Probing fixed 68 vars, tightened 0 bounds.
Probing time =    0.01 sec.
Tried aggregator 1 time.
MIP Presolve eliminated 4 rows and 207 columns.
MIP Presolve modified 14 coefficients.
Reduced MIP has 226 rows, 640 columns, and 1666 nonzeros.
Reduced MIP has 629 binaries, 11 generals, 0 SOSs, and 0 indicators.
Presolve time =    0.06 sec.
Probing time =    0.02 sec.
Clique table members: 3071.
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: deterministic, using up to 4 threads.
Root relaxation solution time =    0.00 sec.

        Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer     Best Node    ItCnt     Gap

      0     0       13.3095    48                     13.3095       92         
*     0+    0                           73.0000       13.3095       92   81.77%
      0     0       18.8333    13       73.0000     Cuts: 130      189   74.20%
      0     0       20.0000     2       73.0000      Cuts: 22      220   72.60%
*     0+    0                           20.0000       20.0000      220    0.00%
      0     0        cutoff             20.0000       20.0000      220    0.00%
Elapsed real time =   0.16 sec. (tree size =  0.00 MB, solutions = 2)

Clique cuts applied:  34
Cover cuts applied:  35
Implied bound cuts applied:  4
Mixed integer rounding cuts applied:  7
Zero-half cuts applied:  7
Gomory fractional cuts applied:  28

Root node processing (before b&c):
  Real time             =    0.09
Parallel b&c, 4 threads:
  Real time             =    0.00
  Sync time (average)   =    0.00
  Wait time (average)   =    0.00
                          -------
Total (root+branch&cut) =    0.09 sec.
CPLEX objval = 20.000000000000025

Item        original price   new price
XS Coffee   $1.24            $1.24
S Coffee    $1.33            $1.33
M Coffee    $1.57            $1.55
L Coffee    $1.71            $1.73
XL Coffee   $1.91            $1.90
S Tea       $1.33            $1.32
M Tea       $1.52            $1.51
L Tea       $1.71            $1.73
Donut       $0.95            $0.97
Bagel       $1.24            $1.24
Muffin      $1.30            $1.28
Cookie      $0.75            $0.75


(prices in red indicate a changed recommendation when optimizing over the partial sample instead of the entire history)

These prices were used to recompute the bills in the hidden sample:
hidden data: original  cost =   58/80 (~72%)
hidden data: optimized cost = 17/80 (~21%)
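For completeness, a helper along the following lines (again an illustrative sketch, not the actual code) can recompute the after-tax bills for any set of transactions and count how many would still need rounding; running it on the held-out 80 transactions with the original and the optimized prices gives the two counts above.

// prices are in cents; A[j][i] = units of item i in transaction j; taxRate = 0.13 here.
static int roundingsNeeded(int[] prices, int[][] A, double taxRate) {
    int count = 0;
    for (int[] txn : A) {
        int preTax = 0;
        for (int i = 0; i < prices.length; i++)
            preTax += txn[i] * prices[i];
        long afterTax = Math.round(preTax * (1.0 + taxRate));  // nearest cent
        if (afterTax % 5 != 0) count++;                        // bill does not end in 0 or 5
    }
    return count;
}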

Thus, we would have been able to significantly reduce the number of roundings on the hidden sample if we used the CAFEMIP prices instead of the original values. If the customer buying pattern remains the same, we can expect these prices to continue to do a good job. 

Sunday, April 14, 2013

Optimally Managing Ladybugs

My daughter loves ladybugs. However, when it comes to cleverness, many have noted how bees and ants (which seem to have a special place in Indian hearts :) "solve" traveling salesman and shortest path instances that matter to them (see this interesting link as well). In contrast, a ladybug's decision analytics appears to be somewhat shaky. Ladybugs don't seem to have the kind of auto-GPS that bees and ants do (at least in the context of this post), and seem to get confused and stray from their optimal path.



(pic source)

Recently, a number of LBs invaded my residence, and I spent a lot of time trying to shoo them away and determine how they got inside in the first place so that I could block their entry. Then, I came across this website. Apparently, ladybugs prefer humidity and warmth and try to enter homes as winter begins. OK. However, a good number also enter the house come springtime, which was puzzling. It seems some of the LBs goof up: ladybugs roosting under windows outside the house enter the house, with a certain probability, instead of enjoying springtime outside. Once inside, they suffer from dehydration and die (unless they figure out the route to the bathroom or kitchen). If we try to force them out, they get stressed and shed 'yellow blood' that can stain walls. The optimal ladybug solution for this time of the year is to simply open the windows, so the ones who came inside by mistake can leave. It's a win-win for both parties, and it actually worked. On the other hand, during winter, we have to try to provide a cozy, alternative home in the backyard and insulate the house well to keep them out.

Monday, February 4, 2013

Optimally Shoveling Snow

This post attempts to compare the efficiency of various approaches used to clear a rectangular area (length L, breadth B, with L > B) of snow, using a good old manual shovel of width w, always moving in straight lines. We assume that the area is flat (zero gradient) and that h inches of snow has fallen uniformly over the field. We also ignore the capacity of the shovel for now. The total snow that must be banked at the borders of the field is constant = LBh. Also, the total working distance traversed ~ LB/w.

Snow shoveling can be a grueling exercise for some, and public health departments always publish safe shoveling tips during the winter season. This is especially important for people with potential heart problems or back problems. Hopefully analyses like these will help (I'm sure people have looked at this before, but just in case ...).


Our objectives are to minimize:
1. Total work done = total snow moved × distance moved.
2. Number of stop-starts, each incurring a fixed unit cost k.
3. Total work time in cold weather.





Model:
Given a strip of width w and length x, and assuming we are shoveling along x toward a bank at the far end, the snow nearest the starting point must be moved the maximum distance, and this distance reduces linearly to near-zero at the end of the strip. Thus (introducing an integral sign in this blog, yikes):
a) total work done = ∫ w·s ds (from 0 to x) = wx²/2
b) Unit stop-start cost = k
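Making the units explicit (a small derivation, assuming the snow depth h and density are constant and absorbed into the unit of effort): the slice of snow at distance s from the bank has size proportional to w·ds and must travel a distance s, so

W(x) = \int_0^x w\,s\,\mathrm{d}s = \frac{w x^2}{2}.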

Approach 1: Shoveling east-to-west (along L, banking at the west end) or bi-directional traversing:
total number of passes =  B/w
total work done = (B/w)·(wL²/2) + k·(B/w) = BL²/2 + kB/w



Note that the effort level is independent of the shovel width. A wider shovel requires fewer passes but more effort per pass.

Approach 2: Shoveling south-to-north (along B, banking at north end) or traversing in both directions:
total number of passes = L/w
total cost = LB²/2 + kL/w

Thus, the second approach reduces the total work done by a factor of L/B. However, this reduction is achieved at the expense of more stops (which may not be a bad thing from a health perspective; k may be negative or zero for some). If L = 2B, the second approach requires 50% less effort, but requires twice as many passes.
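A quick check with L = 2B: approach 1 costs BL²/2 = 2B³ in shoveling effort (plus stop-starts), while approach 2 costs LB²/2 = B³, i.e., half the effort, with L/w = 2B/w passes instead of B/w.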

Approach 3: I observed that my Canadian neighbor divides up the rectangle into two areas, always starts from the middle, and creates banks on both sides. If we bank the snow at the south and north ends:
total number of passes =  2L/w
total cost = LB²/4 + 2kL/w

The third approach is twice as effort-efficient as the second, at the expense of making twice as many banks (and stop-starts), plus an additional penalty for walking back to the mid-point after each pass. On the other hand, you are also less likely to run into capacity problems, since the material handled per pass is the lowest. We can also verify that, among all starting points for approach-3, the mid-point (B/2) yields the minimal effort: starting at a distance s from the south end, the per-column effort is proportional to s² + (B-s)², which is minimized at s = B/2. In general, any one of these approaches can become preferable in terms of total cost, depending on the chosen objective function. For example, we could modify approach-3 so that snow lying close to the west or east ends of the field (within a distance < B/2 of them) is banked at those ends, at the expense of adding more stop-starts.
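To see how the trade-offs play out numerically, here is a small illustrative helper (a sketch under the effort model above; the field dimensions and stop-start cost k in main are made-up numbers):

--- shoveling comparison sketch (Java) ---

public class ShovelSketch {
    // Effort + stop-start totals for the three approaches; snow depth and density
    // are folded into the units, k is the cost per stop-start.
    static double approach1(double L, double B, double w, double k) {
        return B * L * L / 2 + k * B / w;      // B/w passes along L, banking at one end
    }
    static double approach2(double L, double B, double w, double k) {
        return L * B * B / 2 + k * L / w;      // L/w passes along B, banking at one end
    }
    static double approach3(double L, double B, double w, double k) {
        return L * B * B / 4 + 2 * k * L / w;  // 2L/w half-passes from the mid-point
    }
    public static void main(String[] args) {
        double L = 20, B = 10, w = 0.5, k = 1; // illustrative numbers
        System.out.println("approach 1: " + approach1(L, B, w, k)); // 2000 + 20 = 2020
        System.out.println("approach 2: " + approach2(L, B, w, k)); // 1000 + 40 = 1040
        System.out.println("approach 3: " + approach3(L, B, w, k)); //  500 + 80 =  580
    }
}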


Time Considerations
If the shape of the field changes, the approaches may change, but it appears that a good principle to follow is to limit the material moved between successive bankings. In particular, we want to identify the shortest Euclidean path from every point in the field to the nearest feasible bank, and, purely from an effort perspective, shovel snow along a path to a feasible bank that is optimal for all points on that path. This seems to reduce the working duration as well:
Suppose we maintain constant momentum (i.e., mass × velocity = constant) during a pass; then our velocity will be inversely proportional to the amount of snow accumulated in front of the shovel (i.e., ∝ 1/(wx)). Total time per pass ∝ ∫ w·s ds (from 0 to x) = wx²/2, which is proportional to the effort.

For such a velocity model, approach-3 also appears to be optimal from the time perspective. If we assume constant velocity, then the three approaches take the same working time, ignoring the shovel-free travel time required for approach-3. Among these alternative time-optimal approaches, the third approach does the best on the effort-objective.
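Writing the constant-momentum argument out (with c denoting the constant momentum, an illustrative constant):

T(x) = \int_0^x \frac{\mathrm{d}s}{v(s)}, \qquad v(s) = \frac{c}{w\,s} \;\Rightarrow\; T(x) = \frac{w x^2}{2c},

which is proportional to the effort, as claimed.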


Lawn Mowing
Non-propelled, walking lawn mowers with bags? Regardless of the traversing method adopted, our effort (per our chosen model) increases as the square of the distance traveled until we empty the cut grass. If we adopt approach-2 or 3, we have more banking (emptying) opportunities. However, approach-3 may result in a mowing pattern that won't look so nice, so approach-2 may be preferable. Operating a self-propelled mower with built-in mulching may reduce our effort considerably.

Update 1: Post "Nemo" Shoveling, February 9, 2013
A truly optimal solution is to have a kind and wonderful neighbor. After three hours of efficient shoveling (h = 24 inches) and 40% completion, Kevin zipped in with his snow plough and finished the remainder in 5 minutes before going off to help others. I now have to find the best way to thank him.

Update 2: (WSJ, via @ThaddeusKTSim), February 10
 (source link: online.wsj.com)