2

As in the title, I am trying to find the largest element (i.e. the least upper bound) of a very large set of integers. Importantly, I do not have direct access to the full list of integers, but I do have a function $f(n)$ which returns true if $n$ is in the set and false otherwise. The function $f(n)$ is expensive, and I would like to minimize the number of calls I make to it.

The integers may or may not be consecutive, and there may be large gaps between them (i.e. the set might be sparse or dense). No upper bound on the largest integer in the set is known in advance; in theory it can be arbitrarily large.

Is there a well-trodden algorithm for doing this? My inkling is to take some kind of random sample to estimate the density, and then try to find the upper bound to within some certainty. However, I'm not sure how to bound my initial sample properly, or which distribution I should assume the integers follow based on that sample.

Thanks.

R. Granton

2 Answers

1

Without more a priori information on the distribution, no algorithm can work, because you can never be sure that there is no $n$ in the set larger than those already tried.


The best (?) you could do is an exhaustive search: this guarantees that you will someday find the maximum $n$, though you will never know that you have found it.
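To make this concrete, here is a minimal sketch of such a search, cut off after a fixed number of calls. It assumes the integers are non-negative and are tried in the order 0, 1, 2, ...; the names `largest_seen` and `budget` are made up for illustration. Whatever it returns is only a lower bound on the true maximum.

```python
from typing import Callable, Optional


def largest_seen(f: Callable[[int], bool], budget: int) -> Optional[int]:
    """Try n = 0, 1, ..., budget - 1 and return the largest n with f(n) true.

    Assumes non-negative integers (an assumption, not stated in the question).
    Returns None if no member was found within the budget. The result is only
    a lower bound on the maximum: a larger member may lie beyond the budget.
    """
    best = None
    for n in range(budget):
        if f(n):  # one expensive membership call per candidate
            best = n
    return best


# Hypothetical set used only to exercise the sketch.
members = {3, 17, 42}
print(largest_seen(lambda n: n in members, 20))   # stops too early, sees 17
print(largest_seen(lambda n: n in members, 100))  # large enough, sees 42
```

With a budget of 20 the search reports 17 and has no way to tell that 42 is still out there, which is exactly the problem described above.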

0

Tricky.

As an approximation, assume that each number $n$ is in the set with probability $p(n)$, and that the sum of $p(n)$ over all integers $n$ is finite, so we may guess that the set is finite.

You'd have to check enough values of $n$ to make some reasonable assumptions about $p(n)$, and then, based on those assumptions, make a reasonable guess about the upper bound.

But let's say that $p(n) = 1/n^2$. The probability that there is any element $n \ge N$ is then about $1/N$, since $\sum_{n \ge N} 1/n^2 \approx 1/N$. So there is no element at or above 10 with probability about 90%, none at or above 100 with probability about 99%, and so on. On the other hand, your checks will likely show that 1 is an element and find no others, and that observation is compatible with a gazillion other choices of $p(n)$.
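The $1/N$ figure can be checked numerically. The sketch below adds one assumption that is not stated above: each $n$ is included independently of the others. Under that assumption the probability that no $n$ with $N \le n \le M$ is in the set is the product of $1 - 1/n^2$ over that range, which telescopes to $(N-1)(M+1)/(NM)$ and tends to $1 - 1/N$ as $M$ grows. The function names are made up for illustration.

```python
from fractions import Fraction


def prob_no_element_in(N: int, M: int) -> Fraction:
    """Probability that no n with N <= n <= M is in the set.

    Assumes each n is included independently with probability 1/n^2
    (independence is an added assumption for this sketch).
    """
    prob = Fraction(1)
    for n in range(N, M + 1):
        prob *= 1 - Fraction(1, n * n)
    return prob


def prob_no_element_at_or_above(N: int) -> Fraction:
    """Limit of the product above as M -> infinity, for N >= 2.

    Each factor is (n-1)(n+1)/n^2, so the product telescopes to (N-1)/N.
    """
    return Fraction(N - 1, N)


for N in (10, 100):
    finite = prob_no_element_in(N, 10_000)
    limit = prob_no_element_at_or_above(N)
    print(N, float(finite), float(limit))
```

The finite products sit just above 0.9 and 0.99, matching the 90% and 99% figures. The catch remains that this only helps once $p(n)$ is known, and a handful of membership checks will not pin it down.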

Your chances are much better if you have some knowledge about the distribution.

gnasher729