Wednesday, July 22, 2015

How many songs does your favorite internet radio station hold?

Can we even measure the number of songs 'stocked' by our favorite internet radio station (IRS) ? Douglas Hubbard who wrote "How to Measure Anything" would say yes. At least, we can get a reasonable and useful estimate. In fact, it may be tough to identify what an 'exact' answer is here since new songs may be continually added, and less popular ones deleted over time.

This problem appears to resemble another one: How many fish are there in a particular lake? As DH notes, just because we may never see all the fish the lake or hear every song in the IRS, it does not automatically mean that we cannot estimate its size. As part of my self-education in Indian logic, I understand this as follows: when we cannot directly perceive and verify (pratyaksha pramaana), we can try to infer this unobserved truth (~anumaana). Sometimes, we can use comparison (~upamaana) to solve this problem. Here are examples of historically significant events in India where estimating unobserved quantities turned out to be important:

1. Estimating the size of native volunteers ready to take on the occupying British Raj as part of 'Operation Red Lotus' in 1857 (covert sampling and proportions)

2. Estimating the number of refugees inside the Red Fort (without entering) after the 1947 partition of India (comparison of proportions)

3. Estimating the size of jute crop in 1930s Bengal (random sampling)

4. The number of civilians killed in the Bengali holocaust of 1943 (random sampling). The blog carries excerpts, but you should read Madhusree Mukerjee's book for an in-depth analysis.

Using the overlapping sample approach and the fish example in DH's book: catch some fish (say 1000), tag them, and let them swim happily around the lake. Again catch fish (say, 1000) and count the proportion of tagged fish (say 5%), indicating that we tagged roughly 5% of all the fish, giving us an estimate of about 20k. We can also calculate, say, a 90% confidence interval for the chosen sample size. Which brings me to the reason for this blog. I noticed that too many songs on the Ilayaraaja radio station offered by Google-music repeated during my office commutes last week. No problem re-listening to Ilayaraaja musical scores like this one (based on Shamukhapriya ragam is my semi-educated guess).



20 songs were mentally tagged on day-1 and released back into the radio station, and 20% of those tagged songs were streamed back to me on day-2. This experiment was repeated a couple of times, yielding similar results, indicating that Google-music's Ilayaraaja station roughly currently stocks about a hundred songs, and I will soon be hearing only repeats.


Sunday, July 5, 2015

Finding an optimal parking spot for shopping

Here is the original discussion at 'Punk Rock OR'.  The blog below studies a parking objective in the shopping context, and where the entrance to the shopping complex and the exit are at different locations. The cost to be minimized consists of at least two components:
(objective-a) the 'work' done searching for a good parking spot, as well as
(objective-b) the total physical work done after parking. This includes the work done while walking to the entrance, as well as the work done after shopping and exiting the building. Let's focus on objective-b here, given that objective-a has been analyzed before, using the following notation.

Parking coordinates (O, D), where
O = walking distance from the parking spot to the entrance, and
D = walking distance from the exit to the parking spot.
M = mass of the person,
S = expected mass of purchased items, and
(μ, g) = constants denoting the coefficient of friction during walking, and the acceleration due to gravity, respectively.

Work done = force * distance = μg * mass * distance (A prior blog shows how to optimize our in-store shopping route.)

Objective is to minimize: M*O + (M+S)*D

A minimal work objective suggests these simple rules:
1) Heavy shopping: park close to the exit. 
2) Light shopping: park to minimize total walk.

Thus, the goodness of a parking spot depends on the shopping context. Let us figure out a good parking spot for the shopping scenario below.

Linear Store, Manhattan Distances
Consider a long, 'linear' store along the x-axis, with the exit @ x = 0, and entrance @ x = L. Assume 'Manhattan' walking distances ( 1) and that prior customers selected spots nearest the building along y-axis, and picked their spots along the x-axis based on individual preferences. Given this, the distance we can expect to walk to/from the store along the y-axis is already (near) minimal. This leaves us with a (non-convex) one-dimensional search for open locations x(i) along the x-axis over three regions. Let us split this task into two steps. 

Step-1: Find the best spot within each region
A) x(i) = -a  0 
Minimize M(O+D)+S*D = M(L+a +a) + Sa = ML + (2M+S)a*

B) 0 ≤ x(i) = b ≤ L
Minimize M(L-b + b) + Sb = ML +Sb*

C)  x(i) = L+c  L
Minimize M(c+L+c) + S(L+c) = ML+SL + (2M+S)c*

The optimal x* within each of these regions is the one closest to the exit. This is independent of S and L, and identifiable by greedy search.

Step-2: Finding the least cost spot among (a*, b*, c*)
The worst spot in B is no worse than c*, independent of S and L, yielding a binary choice: pick a* or b*.
x(i)  0: incremental cost = (2M+S)a*, a* ≥ 0
x(i) ≥ 0: incremental cost = Sb*, b* ≥ 0

For small values of S, b* dominates and is optimal for light shopping or window shoppers. As S increases and becomes comparable to M, b* is preferable when:
b*/a*  ≤ 1 +2(M/S)

Even if we 'shop till we drop' and carry our own weight, b* can be as much as 3a* and still dominate.

Practically speaking, b* is a solid bet, but when B is full, the shopper is forced to choose between disjoint regions A and C.
x(i)  0: incremental cost = (2M+S)a*
x(i) ≥ L: incremental cost = SL + (2M+S)c*

shoppers would prefer a* to c* unless:
(a*-c*) > L/[2(M/S)+1]

Scenarios:
i) Between spots equidistant from the entrance and exit, a* dominates c*, independent of S and L. 
ii) For the 'carry our own weight' scenario, a* is preferable when
(a*-c*)   L/3
iii) For unmanageably large S, a* remains preferable if:
(a*-c*)  L

We can now draw some conclusions.

Finding a Minimum-Work Parking Spot
Preferred order of parking is:
b* > a* > c*

which is out of order. The actual search pattern we employ may depend on whether the aisles in the parking lot are along the x- or y-axis. 

Light/medium shopping context
Clockwise, starting from the exit, scan B, park if feasible. Else scan C, and park if feasible. Else, turn around choose a spot in A. 

Heavy shopping context
Clockwise, starting from the exit, scan B, park if feasible. Else turn around, scan A, and park if feasible. Else, choose a spot in C.