Saturday, March 30, 2013

Synchronized Scheduling Case Study - Part 2

The case study was introduced in the previous post. Read the introduction here.

.... Despite their suboptimal approach, the schedulers of the LEG carrier did some nice work with their synchronization approach.

2SAT versus 3SAT
There is a jump in decision complexity when moving from a two-body conflict resolution problem to three-body conflict-resolution. Intuitively, we can observe this in linear programming (LP) problems. LPs limited to two variables per constraint are easier to solve. The dual (two constraints per variable) is a generalized network flow problem that can be solved using purely combinatorial methods. Intuitively, one can adopt an alternating heuristic that iteratively fixes one variable and optimizes the resulting univariate problem. In computational complexity theory, SAT-2 is easier to manage compared to SAT-3.

Rather than trying to simultaneously juggle flight attendant (FA), pilot, and aircraft maintenance schedules, the LEG schedulers adopted an simplification. They combined the two human elements into a single 'crew' scheduling entity. This can be achieved via "co-pairing", i.e., by building common pairings (partial schedules) for FAs and pilots employing X = {XPILOT ∪ XFA}, i.e., every pairing would be feasible to both pilot and FA contracts and FAA regulations. Column generation allows you incredible flexibility in accommodating such a change. This flexibility is invaluable when hit by a perfect storm. Now, they could work with a two-variable per constraint structure so that their telephonic "alternating heuristic" negotiations could converge in a timely manner. Unfortunately, such a beneficial reduction in complexity was accompanied by inefficiency. The approach erred inordinately on the conservative side. The resources were now too tightly integrated, resulting in a cost spike due to under-utilization of expensive resources. LEG's R&D team was invited to resolve this problem.

Integrated Crew and Aircraft Routing (ICAR)
The OR team quite naturally proposed an integrated scheduling approach that would reduce costs, improve reliability of operations, and also enable tactical synchronization of flight connections. However, the number of possible connections to be evaluated is O(n2) for a n-flight schedule. Methods in the literature are typically data-agnostic; sophisticated alternatives to the straightforward approach designed to perform an acceptable job even in the worst case. Luckily, LEG's OR team had to solve only LEG's problem and not that of every future airline and space shuttle.

The Beauty and the Brute Force
Given the problem size spread over a five-day period, there are a few million candidates to synchronize in the theoretical worst case. However, our focus is only on same and overnight connections, which reduces the candidate list to a few hundred thousand. When we recognize that we only have to focus on the set T of connections of the same sub-fleet type at the same airport (duh), and exploiting a few other contextual restrictions, this final number is data-driven. It turned out that there were no more than tens of thousands of synchronization possibilities to deal with, 99% less than the theoretical bound. This finding allowed the possibility of cheerfully discarding (as we found out a couple of years later when some of the first journal papers on ICAR came out) all the neat out-flanking theorems and approximations presented in the literature.  The method chosen may score low on aesthetics but was effective and flexible in managing the full blown ICAR instances in context. There is a childlike joy in discovering that enumeration solves your practical business problem - akin to finding out that this emperor has very little clothes. This basic counting step gave the team a clear goal.

Problem Statement (at last)
Given a flight schedule, jointly construct feasible crew pairings and aircraft strings such that their scheduling intersection (elements within the connection set T) are optimally synchronized.

And, yes, there were a few TSP (Traveling Salesman Problem) instances to be managed within ICAR. To be continued....

Sunday, March 24, 2013

Synchronized Scheduling Case Study - Part 1

When a low-cost airline (LCA) entered a short-haul aviation market many, many years ago, a legacy airline carrier (LEG) decided to compete with LCA on its own turf by matching the LCA's fare, frequency and super-quick aircraft turn-around times on ground ('turns' for short). This required a high degree of coordination between LEG's in-flight crews (pilots and flight-attendants), ground-crew, airplane maintenance planners, and the airline schedulers (including their scheduling optimization system). In particular, the utilization of the pilots and the flight-attendants had to be maximized. The challenge however was that all these different resources were part of separate organizational structures and were governed by different contractual rules, and the LEG idea eventually turned out to be cost-prohibitive and unreliable due to poor synchronization of resources.  On the other hand, the Operations Research (O. R) team built a specialized scheduling system for LEG (far ahead of the literature available at that time) that would've achieved the needful. Unfortunately, the silo-ed organizational structure hindered adoption.  Result:  Unlike the legendary revenue optimization contest where the legacy carrier triumphed, the tables were turned here, and the LCA flourished. The low-cost LEG idea was resuscitated many years later in a different context and market, and did a better job using the lessons learned from this early setback. In the next few posts, we will compare the two approaches used to synchronize these resource scheduling tasks and summarize the lessons learned.

Airline Scheduling Basics
Given an aircraft schedule, the resource planning is done sequentially in order of their profitability (or cost) to the airline. Thus:
LEG flight schedule → pilot schedule→ flight attendant schedule → aircraft maintenance routing

The cost-critical task here is the construction of "pairings" (or "strings" for aircraft), which denote partial (unassigned) solutions that represents partial tours of a resource. For LEG pilots (set of two persons), and FAs (set of three persons), this represents a tour in a time-space network that starts from a crew domicile (a small subset of airports in the LAC schedule), traverses a bunch of different airports over 1-5 duty days with intervening periods of rest known as layovers, before returning to the home domicile, while satisfying a myriad of contractual and FAA rules for work hours, rest time, etc. In short, the cost and constraints that determine each pairing is an extremely complicated nonlinear, non-smooth, non-convex, nonplussing mixture of on-duty + rest + reliability + crew work life balance + other path-dependent factors. 

Good quality pairings can be created using a dynamic column-generation procedure. A column here represents a pairing, and during an optimization run, thousands of pairings are generated and selected from among the many trillions of possibilities using a "branch and price" scheme. Eventually a subset of the generated columns are selected such that every flight in the time-space network is covered by exactly one pairing, and such that the total cost is minimized (where the actual 'cost' being optimized is a combination of a hundred-odd objective function components).

Note: An airline crew-pairing solver's set partitioning problem is arguably the easiest optimization model in the world to represent on paper:
{SPP: Minimize cx : Ax = 1, x binary, x ∈ X}

In contrast, the real-life computer code used to manage this simple model can easily run in the order of many hundred thousand lines (largely due to the complicated and detailed nature of X), with expensive parallel-processing machines, and an embarrassment of CPLEX licenses to match. Why? A small percentage point reduction in crew pairing cost (cx) can easily recover all these infrastructure costs and then some. Since crews get paid the max of planned and actual costs, these numbers are actual bottom-line savings. Furthermore, such scheduling tools can be mission-critical systems that run 365 days a year, requiring O.R product support staff to be on call like primary care doctors. 

Solution Approach
Downstream systems, and not of immediate interest to this post, combine these pairings into actual monthly schedules with pilot, first officer, and flight attendants assigned by solving yet another set-partitioning problem. Thus cost-effectively scheduling even a single resource requires a sophisticated solution approach. In those days, academic literature often contained interesting approaches that looked good on paper but proved to non-implementable because of the highly context-driven and data-driven nature of this problem that requires developers to pay careful attention to problem size, exploit special structures in SPP and X, and keep in mind certain carrier-specific objectives. On the other hand, interested readers can refer to this journal paper - the first one I read on the job. It is also the #1 cited INFORMS paper based on a recent ranking. Today, this portion of airline planning is typically outsourced to specialized airline crew-scheduling software vendors.

Pairings and Strings
A set of pilot pairings tell you, for any given LEG flight: the in-bound flight that the pilots arrived in, and the out-bound flight on which they will depart. Of course, if a LEG flight schedule is built with lots of tight turns (say 30 minutes or less) at an airport, then it makes sense from the point of view of operational reliability and cost, for the crews to stay with, and depart on the same airplane, assuming of course, that the aircraft routers will assign the same aircraft (tail number) to these flights. This is a resource synchronization rule: Keep crews and aircraft together to the extent possible while also keeping costs low and satisfying all contractual duty, rest, and maintenance rules.

The FA scheduling system can take pilot pairings as inputs to try and keep the FAs with the pilots. Unfortunately, since these crews have different operating hours and safety rules give the varying nature of their job descriptions, very often, pilots and FAs have to break up and depart on different flights. This is an operational nightmare when you are trying to work with tight turns to compete with LCA, and actual costs shoot up if you 'over-synchronize'. To make matters worse, the strings (partial aircraft maintenance routing solutions) also have to be tightly woven with crew pairings. It's a planners nightmare.  Believe it or not, LEG folks actually built much of these schedules separately ("legacy approach"), with each scheduler anticipating what the other person will do, often coordinating via phone-calls in this multi-person game. Result: an expensive and sometimes operationally brittle schedule that increased operational costs and ate up some customer goodwill, which eventually resulted in the plug being pulled. In the next part, we'll see how LEG's OR team analyzed this problem.

to be continued .....

Sunday, March 10, 2013

Conflict Resolution - 3: Contextual Optimization

Asimov's Zeroth Law
We continue this discussion from where we left off a few weeks ago: robot-ethics and how Asimov's robots resolved conflicts. A key update to Asimov's original three laws is the inclusion of the zeroth law (from Wikipedia):

"A robot may not harm humanity, or, by inaction, allow humanity to come to harm."

A fundamental difference between this law and the others is its abstract specification, with little clue on how it will be implemented. Furthermore, by giving this law the highest priority, Asenion robots are designed to first and foremost safeguard 'humanity', while also minimizing injury to individual humans and themselves as secondary and tertiary objectives. A robot is given the difficult computational task of proving that the cost of hurting a human is less than humanitarian benefit derived from an alternative action, and must do so within a finite amount of time.

The Universal Conflict-Resolution Model
Immanuel Kant's 'categorical imperative' is an example of an universal conflict resolution model that furiously strives to be context-free, and one that has greatly influenced western thought. Similarly, the Ten-Commandments' "Thou shall not kill" is absolute. Any machine that includes this hard constraint would be unable to kill, even defensively in order to protect a large number of humans under threat. Kantian rules are easy to 'encode-and-forget' within machines and systems since one does not have to ever worry about the context of its application. As we saw in the previous post, the robotic laws are context-free and Kantian in design. The original three laws operated as hard, must-satisfy constraints and necessary conditions. Per Wikipedia, Kant's

"perfect duties are those that are blameworthy if not met, as they are a basic required duty for a human being."

The problem of course is that (see prior post), the rigidity of all-hard rules is not practical and later versions of Asenion robots appear to additionally operate based on the concept of Kant's imperfect duty (again from Wikipedia):

"unlike perfect duties, you do not attract blame should you not complete an imperfect duty but you shall receive praise for it should you complete it, as you have gone beyond the basic duties and taken duty upon yourself"

Thus, imperfect duties are soft, rather than hard constraints, and the aim is to maximally satisfy (minimally violate) requirements. However, the optimization weights that trade off the degree of importance assigned to each of the Asenion robot's 'imperfect duties' are hard-coded, and additional context-specific inputs are required from humans to achieve a satisfactory result in resolving a dilemma. The robots have no authority to perform context-specific conflict resolutions on their own.

How then does this zeroth law practically work in Asimov stories? It's Kantian abstraction is incomprehensible to all except a couple of enlightened robots (with telepathic ability, no less). In general, the zeroth law remains useful only on paper for most of the robots.

Contextual Ethics: The Indian Way
Rajiv Malhotra's path-breaking book "Being Different: Indian Challenge to Western Universalism" provides a fascinating contrast between the traditional Indian approach of 'contextual ethics' (CE) that arose from its Dharmic thought system, and the 'context free ethics' that largely guides the western approach (for an earlier post based on his work, see here. Some of the methods in this post are applications of ideas in this book). From an optimization perspective, we can think of a CE-embedded robot as one that maximally satisfies a combination of hard, soft, and firm constraints, where a 'firm' constraint refer to a hard constraint that is minimally and temporarily relaxed, depending on the specific context of the dilemma, and for the benefit of the 'greatest good', at the expense of incurring a context-specific penalty. This flexibility must not be confused with moral relativism - where a set of soft constraints are tactically and optimally manipulated according to context to maximize some convenient self-serving objective. 'Optimally timing an apology' can be thought of as an example of a non self-serving, contextual optimization model.  It is worth doing a deep dive into this concept, by reviewing some passages in the aforementioned book:

"The frequently leveled charge of moral relativism against this contextual morality is inaccurate, because the conduct and motive are considered consequential in judging the ultimate value of statements.

.... Dharmic ethics are formulated in response to the situation and context of the problem in a way that makes Western ethics seem unduly codified, rigid, monolithic and even simplistic. A.K. Ramanujan, in his influential essay 'Is There an Indian Way of Thinking?', uses the terms 'context-free' and 'context-sensitive' to contrast the West and India in their respective approaches to ethics: "Cultures may be said to have overall tendencies to idealize, and think in terms of, either the context-free or the context-sensitive kind of rules. Actual behavior may be more complex, though the rules they think with are a crucial factor in guiding the behavior. In cultures like India's, the context-sensitive kind of rule is the preferred formulation" ....

.... Dharmic traditions, on the other hand, have long sought to arrive at truth by balancing universal truths and acts with those that can be determined only in the context in which they occur. Dharmic cultures have thus evolved to become comfortable with complexity and nuance, rejecting notions of the absolute and rigid ideals of morality and conduct....

....dharmic thought offers both universal and contextual poles – not just the latter, as that would be tantamount to moral relativism."

The dharmic approach lies in between an "all-soft constraint" and the Kantian "hard and soft constraint" approach to decision optimization.

Applying contextual optimization
Asimov's telepathic robot Giskard formulates and solves a probabilistic optimization problem where it trades off the opportunity cost (in terms of human lives) against the expected benefit to humanity. However the degree of uncertainty in this conflict-resolution model is too high and the robot eventually crashes. This episode comes across as an example of applying the CE approach to resolve a dilemma. The great Indian epics - the Ramayana and the Mahabharata, contain several brilliantly narrated instances of contextual conflict-resolution. Indian sci-fi movie buffs would not be surprised to know that George Lucas' Star Wars was inspired by the Ramayana.

Contextual optimization in the specific area of 'mathematical decision support software' would mean: allowing the rules of engagement to be configurable depending on the context. For regular users, advanced settings are greyed out, with only universal (default) rules enabled. Only super users, who are well-trained and comprehend the nature and consequences of the beast, get to work with 'firm' constraints, and on rare occasions. For example, an airline crew schedule optimization system should be configured to satisfy contractual and FAA rules, except during emergencies (e.g. post 9/11 recovery) where 'crew welfare' is only achievable by overriding one or more of these rules. Practical decision support systems should be carefully designed to allow such controlled contextual optimization.

Amending the Zeroth Law: The Dharmic Robot
The four laws do not quite protect the rest of the cosmos (e.g., from humanity) given their anthropocentric nature. From an Indian point of view, this gap can be closed by modifying the zeroth law based on the contextual ethics of dharma. Rajiv Malhotra, in his book, provides the etymology and a working definition of dharma:

"Dharma has the Sanskrit root dhri, which means 'that which upholds' or 'that without which nothing can stand' or 'that which maintains the stability and harmony of the universe'. Dharma encompasses the natural, innate behaviour of things, duty, law, ethics, virtue, etc. For example, the laws of physics describe current human understanding of the dharma of physical systems. Every entity in the cosmos has its particular dharma – from the electron, which has the dharma to move in a certain manner, to the clouds, galaxies, plants, insects, and of course, man. Dharma has no equivalent in the Western lexicon."

In such a framework, Asimov's laws would delineate a robot's various dharmas. At the highest level, we can require that a robot abide by the following fundamental law, that is based on an ancient Indian text:

"Non-harming is a robot's highest priority, except in the defense of dharma"

Conflict-resolution is always performed by first applying this highest dharmic principle and customizing it to the specific context. Note that by operating on the fundamental dharmic principle of least harm, a robot would usually satisfy Asimov's zeroth law, albeit in a context-specific manner, while also being in harmony with the original laws, as well as any new laws that get written in the future. Interestingly, the Hippocratic oath of medical doctors is based on a similar idea that represents a non-negative bound: "do good, or at least no harm".  If complex systems, new drugs, etc., are designed by always keeping this fundamental principle in context, it may well minimize the risk of catastrophic failure.