8

I have been discussing this problem with a coworker for a few days now and neither of us has made any headway on it. I would appreciate any help with a possible solution, or a suggestion of a book on related subject matter. The problem is as follows:

I usually park my car near the doors of a convenience store, but I constantly forget where my car is parked. Let's say that my car is parked somewhere on the real axis. Let's also assume that the probability distribution of my car's location is a normal distribution centered at zero. Starting at zero, I will walk along the real axis until I either reach my car's position or turn around and walk back in the other direction.

Given that I am extremely lazy, what is the optimal search strategy that minimizes the expected distance I have to walk?

Steve U
  • 5
    Wait a few days and you should receive a ticket for parking illegally. They usually tell you where you can find your towed car. – qqo Dec 01 '14 at 04:48
  • 2
    I don’t think the exact turn sequence is known, but the question is addressed in the references here, specifically the papers by Beck. https://en.wikipedia.org/wiki/Linear_search_problem – Steve Kass Dec 01 '14 at 05:08

1 Answer

4

First, I'll set aside the fact that a complete search of the real line takes an infinite amount of time, so your question really depends on the maximum length you are willing to cover. Let's codify this by saying that you're going to accept an error rate of 1% (so in 1% of cases, you stop before you find your car).

Let's look at a few special cases:

  1. You flip a fair coin. If it is heads, you go towards $\infty$, otherwise you go towards $-\infty$. This will have an average success rate of 50%, so it is not acceptable at the 1% level.

  2. Walk in one direction until the tail probability equals 0.5%, then turn around and head in the other direction until that tail probability is 0.5%. Now, your expected success rate is 99%.

Option 2 is optimal because it maximizes the "density per unit length" of the overall search path: you spend more time searching in higher-probability regions, without "deadheading" as much as you would if you made multiple turnarounds while searching. We want to minimize turnarounds while still covering the highest-probability regions in our search.
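
One way to probe this claim numerically is a small Monte Carlo sketch. It assumes a standard normal car location and the 0.5% tail cutoff from option 2, and compares the single-turnaround path with a hypothetical zigzag path (turning points at 1, then -2, then the two cutoffs) that was picked only for illustration. The helper names `walk_distance` and `mean_distance` are made up for this example.

```python
import random
from statistics import NormalDist


def walk_distance(car, turns):
    """Distance walked from 0 until reaching `car`, visiting `turns` in order.

    Returns None if the path never passes over the car.
    """
    pos, walked = 0.0, 0.0
    for t in turns:
        # The car is found if it lies on the segment from pos to t.
        if min(pos, t) <= car <= max(pos, t):
            return walked + abs(car - pos)
        walked += abs(t - pos)
        pos = t
    return None


def mean_distance(turns, cars):
    """Mean distance over the cases where the car is found, plus the found rate."""
    found = [d for d in (walk_distance(c, turns) for c in cars) if d is not None]
    return sum(found) / len(found), len(found) / len(cars)


# Cutoff where the upper tail probability is 0.5% (about 2.58 standard deviations).
z = NormalDist().inv_cdf(0.995)

rng = random.Random(0)  # fixed seed so the run is repeatable
cars = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Option 2: all the way right, then all the way left.
one_turn_mean, one_turn_rate = mean_distance([z, -z], cars)
# Illustrative zigzag (an assumption, not a derived optimum).
zigzag_mean, zigzag_rate = mean_distance([1.0, -2.0, z, -z], cars)

print(f"one turn: mean distance {one_turn_mean:.3f}, found {one_turn_rate:.3%}")
print(f"zigzag:   mean distance {zigzag_mean:.3f}, found {zigzag_rate:.3%}")
```

Both paths cover the same interval, so they find the car in exactly the same cases; only the mean distance over those cases differs.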

  • Does this still hold if you try to optimize P(finding car) in a fixed amount of time? (arguably a more realistic model than heading off to infinity) – djechlin Jan 02 '15 at 17:13
  • @AAA Yes it does, the asymptotic case just makes the explanation complete, but you are correct. –  Jan 02 '15 at 17:20
  • I think the "real life" problem is your model changes as you don't find your car. – djechlin Jan 02 '15 at 17:42
  • @AAA yep, that is why I included an acceptable "error" rate...which basically says I am willing to accept an $x$% risk of not finding my car. In my example, there is a 1% chance you don't find your car. –  Jan 02 '15 at 17:44
  • OK, how would you minimize(expected time) of finding the car with P=1? (which should be finite I think/hope) – djechlin Jan 02 '15 at 17:45
  • Your first sentence neglects the fact that the probability a reasonable search algorithm will have to go very far drops off much faster than the distance increases. So this isn't insoluble/nonsensical/necessarily infinite to solve with P=1 and perform a "complete search," it's just the usual unintuitiveness of infinite quantities and finite probabilities/EVs. – djechlin Jan 02 '15 at 17:49
  • @AAA given that there is no guarantee you will find your car, you cannot calculate an expected search time $E[T]$ in the real-world case since $P(T=\infty)>0$. In the theoretical case of unbounded search time, it will have a finite value. –  Jan 02 '15 at 17:50
  • @Eupraxis1981: Can you say more about 2. being optimal? I'm not convinced. Maybe it's better to go through some of the higher probability region on one side, but not necessarily all; then at some threshold turn around and go back through the region you've already searched and go on the other side for a while; etc. – paw88789 Jan 02 '15 at 17:52
  • @AAA thus, you could say that for the case of fixed search time $t$, my algorithm minimizes $E[T|T<t]$, which eliminates the cases where you never find your car. –  Jan 02 '15 at 17:53
  • @paw88789 what we want to avoid is "deadheading", which is searching already searched regions (they do nothing but waste time). Thus, we want to minimize our deadheading distance subject to our desired coverage probability. Imagine that instead of turning around once, you turned around multiple times, say N, each time going a little farther than the previous time. Then the distances between you and your last turnaround will be covered up to N times over, whereas in my method, all distances are covered at most twice. –  Jan 02 '15 at 18:03
  • @paw88789 therefore, it's tempting to turn around and get into the higher probability region on the other side of 0, yet it comes with a "turn-around" penalty that is equal to the distance you've just covered, so the more you turn around, the more penalty distance you incur. –  Jan 02 '15 at 18:06
  • @Eupraxis1981: But going further in what is likely the wrong direction is even worse than 'deadheading' since now you will have to go even farther to find the car if it's on the other side. (I tried a simplified discrete version of the problem, and found that going back and forth could be better than doing all your searching in one direction first.) – paw88789 Jan 02 '15 at 18:07
  • 2
    @Eupraxis1981: As an extreme case, consider that you are going to make so sure you find your car, you are willing to search up to $100$ standard deviations from the mean in either direction. Surely you wouldn't go all the way out to $100$ standard deviations in one direction...after a while you'd decide that you were all but certain to find your car on the other side. – paw88789 Jan 02 '15 at 18:09
  • @paw88789 the analytical solution is rather involved, but imagine you decide to search by going r units to the right, then backtracking 2r units to the left, then 3r units to the right, etc. Now, assuming you are dealing with a standard normal distribution of car locations, how far will you have traveled, on average, before you find your car? Intuitively, there will be many cases where it is rather close to 0, so you find it faster with a two-sided search. However, in the cases where it is farther away, your search distance increases rapidly. –  Jan 02 '15 at 18:28
  • @paw88789 it may be better to try to minimize the median search time, since the search time distribution will be rather heavy-tailed. But you raise a good point...mean optimality may not lead to the best practical algorithm. No argument there. –  Jan 02 '15 at 18:30
  • I don't think that's it. Let's follow your argument and say instead of 99% sure, it's x% sure, where x% corresponds to search 100 s.d.'s out in either direction. If you're 50 s.d's out to the left (or something more like 2 or 3, really), you're something like 99.9n% sure to find the car if you just backtracked 50 s.d.s and searched just the first one or two s.ds to the left. o/w you could walk 50 s.d.s more, even though you're 99.9n% sure you'll just backtrack those too. I think your algorithm is implicitly assuming a unit distribution of finding the car, not a normal one. – djechlin Jan 02 '15 at 20:22
  • when you say "density per unit length" you're assuming all density is created equal, i.e. a unit distribution, and the algo does not spend much time at all in "higher probability regions", because it spends a lot of time traversing many standard deviations out, in very, very low probability regions. Even an algo where you go out 2 s.d.s, then fling back all the way in the other direction, then fling all the way in the first direction again, paying an extra deadhead/double traversal, clearly has a better time EV. – djechlin Jan 02 '15 at 20:31
  • @AAA how about trying a simulation of both strategies using a normally distributed car location and seeing which one minimizes the average distance searched to find the car? I didn't do this, but if backtracking a lot is beneficial, then imagine that you go 1 ft to the right, then 2 ft to the left, then 3 ft to the right. How much ground have you covered, and what probability have you captured? In fact, the limiting case involves an infinite number of switchbacks, which results in an infinite search length. –  Jan 02 '15 at 20:52
  • @AAA also, I understand that if you are way out, there is little chance of finding your car, hence the pre-determined cutoff...if you want to be 99% sure that turning around is a good idea, then the tail probability needs to be <1%, right? In my example, you would simply be 99.5% sure that you should turn around to find your car. –  Jan 02 '15 at 20:55
  • @Eupraxis1981 you're still assuming unit. Say 99%. Normal curves drop off really fast. So you get up to, something like, 98%, on the right side, walking 50 units. And up to 98.3% walking 100 units. After about 1000 units you reach 98.9%. But when you were at 50 units, you probably should have turned back 50 units, gotten up to 98.3% sure on the left side in 50 units, to now be 98.3% sure for both sides, then worry about units 50-2000 on either side. Walking an extra 50-100 units is of course trivial compared to completing the -2000 to +2000 search space. – djechlin Jan 02 '15 at 22:09
  • You are solving a different problem. You're actually optimizing how long it takes you to fail for a fixed percent assuming you will fail. Say % =90 and the car is in one of $-4, -3, -2, -1, 1, 2, 3$ with probabilities $0.80, 0, 0, 0, 0.03, 0.03, 0.04$. Your algo will search the right branch first since it's shorter, then the left. In the case the car is nowhere (10% of the time) it's optimal. But most of the time, it's really better to go to the 80% on the left and risk backtracking 33% as much, but not have to backtrack at all 80% of the time instead of 10% of the time. – djechlin Jan 02 '15 at 22:17
  • i.e. you're trying to optimally traverse a 99% interval, in which case no backtracking is best. But the expected value of time or steps to reach the car is very bad, compared to algorithms that backtrack sometimes, even though should the car not be in the first 99%, yours will be better. But that's a big assumption to make. In fact, it's only true 1% of the time... and it's, though valid, a poor model of the real situation to ignore that fact. – djechlin Jan 02 '15 at 22:22
  • ^ up two comments, "risk backtracking 33% more". I'm trying to say the extra backtracking is slightly worse should you have to do it, but you won't have to backtrack at all far, far more often. – djechlin Jan 02 '15 at 22:24
  • posted here http://math.stackexchange.com/questions/1089070/the-find-my-car-problem-proper-interpretation-and-solution – djechlin Jan 02 '15 at 23:19
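
The discrete example from the comments (car at $-4$ with probability 0.80, at 1, 2, 3 with probabilities 0.03, 0.03, 0.04, and nowhere with probability 0.10) is small enough to work out exactly. A minimal sketch, with `expected_distance` as a hypothetical helper, comparing the two search orders over the cases where the car is found:

```python
# Positions with nonzero probability, as given in the comment;
# the remaining 0.10 is the case where the car is nowhere on the path.
probs = {-4: 0.80, 1: 0.03, 2: 0.03, 3: 0.04}


def expected_distance(turns, probs):
    """Mean distance walked, conditional on finding the car.

    The walker starts at 0 and visits the turning points in `turns` in order.
    """
    total, mass = 0.0, 0.0
    for car, p in probs.items():
        pos, walked = 0, 0
        for t in turns:
            # The car is found if it lies on the segment from pos to t.
            if min(pos, t) <= car <= max(pos, t):
                total += p * (walked + abs(car - pos))
                mass += p
                break
            walked += abs(t - pos)
            pos = t
    return total / mass


right_first = expected_distance([3, -4], probs)  # shorter branch first
left_first = expected_distance([-4, 3], probs)   # high-probability branch first

print(f"right first: {right_first:.2f}")  # prints "right first: 9.12"
print(f"left first:  {left_first:.2f}")   # prints "left first:  4.68"
```

Under these assumptions, starting with the high-probability side roughly halves the mean distance, even though it costs one extra unit of backtracking when the car turns out to be on the right.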