
An elegant program for a bitstring is the shortest program on a universal Turing machine that outputs this bitstring. According to Kolmogorov complexity, the length of the elegant program is independent of the Turing machine implementation.

Solomonoff induction uses the elegant program for a bitstring to predict the next digit. This is a universal prior given the minimal assumption the output is generated by a Turing machine.

We can use this insight to build a machine learning algorithm called a finite Solomonoff learner. The difference between a finite Solomonoff learner and the original Solomonoff induction algorithm is that the finite learner does not have access to all elegant programs. No algorithm can generate all the elegant programs, so the ones the finite learner uses must be stored in memory. With a finite amount of memory, only a limited set of elegant programs can be stored and used by the finite learner for prediction.

The limit exists because there are only finitely many programs of length L or shorter, so only finitely many bitstrings can have an elegant program that short. Consider a bitstring consisting entirely of 1s: once it becomes long enough, its elegant program cannot be of length L or shorter. If L is the amount of memory available, then eventually the elegant programs will all be longer than L, and none will fit in memory.
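To make the counting step concrete, here is the arithmetic as a short sketch. It assumes programs are binary strings and counts the empty string as a program; both are conventions chosen here for illustration.

$$\#\{p : |p| \le L\} = \sum_{k=0}^{L} 2^k = 2^{L+1} - 1$$

For L = 7 this gives 2^8 - 1 = 255 candidate programs, so at most 255 distinct bitstrings can have an elegant program of 7 bits or fewer.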

Now let's assume we have a very long string of 1s, and we remove one digit to make a prediction problem.

As a concrete example, our bitstring is:

11111111111111111111

We remove a 1 at random:

111111111111111_1111

The learner must figure out the most likely digit to go in the empty spot.

For a given amount of memory we can make the bitstring of 1s long enough that its elegant program cannot fit in memory. In this case, the finite Solomonoff learner will not be able to access the elegant program for the bitstring, and will thus be incapable of predicting the digit that goes in the empty slot.

To continue the example, assume the elegant program that generates the 1s is:

10001011

Furthermore, the amount of memory available is 7 bits. Consequently, the 8 bit elegant program cannot be stored in memory, and the finite learner cannot figure out what goes in the empty slot.

On the other hand, regardless of how long the bitstring of 1s becomes, a human will have no problem identifying the missing digit. A human has finite memory and cannot access all elegant programs to make predictions. Despite having the same handicap as a finite learner, the human can outperform the finite learner infinitely often.

Does this demonstrate Solomonoff learning is less powerful than human learning?

yters

1 Answer


Your suppositions are wrong. A finite Solomonoff learner can learn the all-1's string. There does exist a learning algorithm that can output a program for the all-1's string.

Consequently, this example does not demonstrate that Solomonoff learners are necessarily worse than a human being. They might be; but this example doesn't prove that.

For instance, consider the following learning algorithm: if the input is all 1s, then output a program that only outputs 1s (e.g., 10001011 in your example); otherwise do something arbitrary. This is a trivial algorithm, but it demonstrates that there does exist a learning algorithm that can learn that sequence. In other words, you've given an example of an input sequence that both a human and a finite Solomonoff learner can learn.
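A minimal sketch of that trivial algorithm, to show how little it needs. This is not Solomonoff induction. The names `GAP`, `learn`, and `fill_gap` are hypothetical, a Python function stands in for "a program that only outputs 1s", and the fallback of predicting 0 is the arbitrary choice.

```python
from typing import Callable

GAP = "_"  # hypothetical marker for the removed digit


def all_ones_program(index: int) -> str:
    """Stand-in for a program that only outputs 1s."""
    return "1"


def arbitrary_program(index: int) -> str:
    """Arbitrary fallback; predicting '0' has no special meaning."""
    return "0"


def learn(observed: str) -> Callable[[int], str]:
    """Return a hypothesis (a program) for the observed sequence."""
    known = [bit for bit in observed if bit != GAP]
    if all(bit == "1" for bit in known):
        return all_ones_program
    return arbitrary_program


def fill_gap(observed: str) -> str:
    """Fill every gap using the learned program's output at that position."""
    program = learn(observed)
    return "".join(
        program(i) if bit == GAP else bit for i, bit in enumerate(observed)
    )


if __name__ == "__main__":
    print(fill_gap("1" * 15 + GAP + "1" * 4))
```

The learner scans the input one digit at a time and never stores the sequence's elegant program, so its behavior does not depend on how long the run of 1s is.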

What you might have shown is that if we restrict the learner to a very limited amount of memory, then the learner might not be able to learn some things that a human can. But that's a rather uninteresting conclusion. Of course we know that if you severely restrict the memory of the computer, that limits what it can compute. For instance, if we limit the memory of the algorithm to only 7 bits, then the algorithm must act like a finite-state machine with at most 128 states, which severely limits the patterns it can find. A human can learn sequences that can't be recognized by a finite-state machine with 128 states. That's not surprising, and doesn't really have any interesting implications for whether finite Solomonoff learners are worse than human learning.
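To illustrate the 128-state limit, here is a small sketch of one particular 7-bit device: a counter that tallies 1s and wraps around when it runs out of states. The counter design and the names `MEMORY_BITS`, `STATE_COUNT`, and `count_ones_with_limited_memory` are assumptions made for this example. It shows one way such a machine loses information; it is not a proof about every 128-state machine.

```python
MEMORY_BITS = 7
STATE_COUNT = 2 ** MEMORY_BITS  # 7 bits can distinguish 128 states


def count_ones_with_limited_memory(bits: str) -> int:
    """Count 1s using only a 7-bit counter, which wraps around at 128."""
    counter = 0
    for bit in bits:
        if bit == "1":
            counter = (counter + 1) % STATE_COUNT
    return counter


if __name__ == "__main__":
    short_run = "1" * 5
    long_run = "1" * (5 + STATE_COUNT)
    # Both runs leave the counter in the same state, so this device
    # cannot tell them apart.
    print(count_ones_with_limited_memory(short_run))
    print(count_ones_with_limited_memory(long_run))
```

A run of 5 ones and a run of 133 ones end in the same counter state, so any pattern that depends on telling those lengths apart is out of reach for this device.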

There's no point in comparing a human to a computer algorithm that has been massively handicapped. To make the comparison interesting, you must allow the computer algorithm to have reasonable access to the resources that any computer would have; otherwise you have trivialized the question. This means allowing the computer learner a non-trivial amount of memory. And once you do that, the learner no longer has any difficulties with your sequence.


There are other issues in your post. If you are assuming that the only possible approach is to try all possible algorithms until you find one that works, that's not the case. More seriously, the problem of "given a sequence, find the elegant program for it" is not computable, so if you require the learner to find the shortest possible program, you've created an impossible problem. Finally, you say that "According to Kolmogorov complexity, the length of the elegant program is independent of the Turing machine implementation.", but that's not correct. What is true is that changing the universal Turing machine will change the length by at most an additive constant... but if you care about constants, that might be important.
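For reference, the usual statement of that last point (the invariance theorem) can be written as follows. The symbols $K_U$, $K_V$, and $c_{U,V}$ are notation assumed here, not taken from the question.

$$|K_U(x) - K_V(x)| \le c_{U,V} \quad \text{for all bitstrings } x$$

Here $K_U(x)$ is the length of the elegant program for $x$ on universal machine $U$, and the constant $c_{U,V}$ depends on the two machines but not on $x$.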

D.W.