
Some chap said the following:

Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin.

That's always been taken to mean that you can't generate true random numbers with just a computer. And he said it when computers were the equivalent size of a single Intel 8080 microprocessor (~6,000 valves). Computers have become more complex, and I believe that von Neumann's statement may no longer be true. Consider that a purely software-only algorithm is impossible: all software runs on physical hardware. True random number generators and their entropy sources are also made of hardware.

This Java fragment put into a loop:

      file.writeByte((byte) (System.nanoTime() & 0xff));

can create a data file which I've represented as an image:

[image: nanoimage]
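(For reproducibility, a complete version of that loop might look like this; the class name, file name, and stream type are my own assumptions:)

    import java.io.DataOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class NanoNoise {
        public static void main(String[] args) throws IOException {
            try (DataOutputStream file = new DataOutputStream(
                    new FileOutputStream("nano.raw"))) {
                // 250,000 bytes: one byte per pixel of the image above.
                for (int i = 0; i < 250_000; i++) {
                    // Keep only the low 8 bits of the nanosecond timer.
                    file.writeByte((byte) (System.nanoTime() & 0xff));
                }
            }
        }
    }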

You can see structure, but with a lot of randomness as well. The thing of interest is that this PNG file is 232 KB in size, yet contains 250,000 greyscale pixels, and the PNG compression level was maximum. That's a size reduction of only ~7%, i.e. the data is fairly incompressible. What's also interesting is that the file is unique: every generation of this file is a slightly different pattern, with similar ~7% compressibility. I highlight this as it's critical to my argument. That's ~7 bits/byte of entropy. That will of course reduce with a stronger compression algorithm, but not to anything near 0 bits/byte. A better impression can be had by taking the above image and substituting its colour map for a random one:-

[image: randomised nanoimage]

Most of the structure (in the top half) disappears, as it was just sequences of similar but marginally different values. Is this a true entropy source, created just by executing a Java program on a multi-tasking operating system? Not a uniformly distributed random number generator, but the entropy source for one? An entropy source built of software, running on physical hardware that just happens to be a PC.
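(Returning to the compressibility claim above: you can reproduce a rough version of the test without an image tool. A sketch using java.util.zip at maximum level; the class name and buffer sizes are my own choices, and deflate is only a stand-in for PNG's compressor:)

    import java.util.zip.Deflater;

    public class CompressRatio {
        public static void main(String[] args) {
            byte[] data = new byte[250_000];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) (System.nanoTime() & 0xff);
            }
            // Deflate at maximum level and report the size ratio, as a
            // rough (upper-bound) stand-in for the PNG experiment.
            Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
            deflater.setInput(data);
            deflater.finish();
            byte[] out = new byte[data.length + 1024];
            int compressed = deflater.deflate(out);
            System.out.printf("%d -> %d bytes (%.1f%%)%n",
                    data.length, compressed, 100.0 * compressed / data.length);
        }
    }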

Supplemental

In order to confirm that every image generates fresh entropy, without a fixed pattern common to all, 10 consecutive images were generated. These were then concatenated and compressed with the strongest archiver I can get to compile (paq8px). This process will eliminate all common data, including autocorrelation, leaving only the changes/entropy.

The concatenated file compressed to ~66%, which leads to an entropy rate of ~5.3 bits/byte, or ~10.5 Mbits/image. A surprising amount of entropy $\smile$
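(For clarity, the arithmetic behind the rate: a file that compresses to ~66% of its original size retains at most $0.66 \times 8 \approx 5.3$ bits of entropy per byte.)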

Supplemental 2

There have been negative comments that my entropy-by-compression test methodology is flawed, only giving a loose upper-bound estimate. So I've now run the concatenated file through NIST's official cryptographic entropy assessment tool, SP800-90B_EntropyAssessment. This is as good as it gets for non-IID entropy measurement. This is the report (sorry this question is getting long, but the issue is complex):-

Running non-IID tests...

Entropic statistic estimates:
Most Common Value Estimate = 7.88411
Collision Test Estimate = 6.44961
Markov Test Estimate = 5.61735
Compression Test Estimate = 6.65691
t-Tuple Test Estimate = 7.40114
Longest Repeated Substring Test Estimate = 8.00305

Predictor estimates:
Multi Most Common in Window (MultiMCW) Test: 100% complete
    Correct: 3816
    P_avg (global): 0.00397508
    P_run (local): 0.00216675
Multi Most Common in Window (MultiMCW) Test = 7.9748
Lag Test: 100% complete
    Correct: 3974
    P_avg (global): 0.00413607
    P_run (local): 0.00216675
Lag Prediction Test = 7.91752
MultiMMC Test: 100% complete
    Correct: 3913
    P_avg (global): 0.00407383
    P_run (local): 0.00216675
Multi Markov Model with Counting (MultiMMC) Prediction Test = 7.9394
LZ78Y Test: 99% complete
    Correct: 3866
    P_avg (global): 0.00402593
    P_run (local): 0.00216675
LZ78Y Prediction Test = 7.95646
Min Entropy: 5.61735

The result is that NIST believes that I have generated 5.6 bits/byte of entropy. My DIY compression estimate puts this at 5.3 bits/byte, marginally more conservative.

The evidence seems to support the notion that a computer just running software can generate real entropy, and that von Neumann was wrong (but perhaps correct for his time).


I offer the following references that might support my claim:-

  • Are there any stochastic models of non determinism in the rate of program execution?
  • WCET Analysis of Probabilistic Hard Real-Time Systems
  • Is there a software algorithm that can generate a non-deterministic chaos pattern? (and the relevance of chaotic effects)
  • Parallels with the Quantum entropic uncertainty principle
  • Aleksey Shipilëv's blog entry regarding the chaotic behaviour of nanoTime(). His scatter plot is not dissimilar to mine.

Paul Uszak

12 Answers


If you're using some hardware source of entropy/randomness, you're not "attempting to generate randomness by deterministic means" (my emphasis). If you're not using any hardware source of entropy/randomness, then a more powerful computer just means you can commit more sins per second.

David Richerby

Just because you can't see a pattern doesn't mean that no pattern exists. Just because a compression algorithm can't find a pattern doesn't mean that no pattern exists. Compression algorithms are not silver bullets that can magically measure the true entropy of a source; all they give you is an upper bound on the amount of entropy. (Similarly, the NIST test also gives you only an upper bound.) Chaos is not randomness.

It takes a more detailed analysis and examination to start to get some confidence in the quality of randomness obtained in this way.

There are reasons to think that we can probably obtain some amount of randomness by exploiting clock jitter and the drift between two hardware clocks, but it's delicate and tricky, so you have to be careful. I would not recommend trying to implement your own. Instead, I would suggest you use a high-quality source of entropy (usually implemented in most modern operating systems). For more details, see also Wikipedia, haveged, and https://crypto.stackexchange.com/q/48302/351 (which it seems you are already aware of).
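In Java, for example, reaching for the platform-provided source is one line; a minimal sketch (SecureRandom is a real JDK class, the surrounding class name is mine):

    import java.security.SecureRandom;

    public class OsEntropy {
        public static void main(String[] args) {
            // SecureRandom is seeded from the OS entropy pool (e.g.
            // /dev/urandom on Linux), which already mixes timing jitter,
            // interrupts, and other hardware events.
            SecureRandom rng = new SecureRandom();
            byte[] buf = new byte[16];
            rng.nextBytes(buf);
            for (byte b : buf) {
                System.out.printf("%02x", b);
            }
            System.out.println();
        }
    }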

Lastly, a comment on your opener:

"Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin."

That's always taken to mean that you can't generate true random numbers with just a computer.

No, that's not how it is usually taken, and it's not what it is saying. It's saying you can't generate true random numbers by deterministic means. Whether you can do it on a computer depends on whether the computer is deterministic or not. If the computer is deterministic, or your program uses only deterministic operations, you can't. However, many computers contain non-deterministic elements, and if your program uses them, more detailed analysis is needed before you can decide whether they can be used to generate random numbers. In your case nanoTime() is non-deterministic.

D.W.

I've always understood the quote to mean that a deterministic algorithm has a fixed amount of entropy, and although the output can appear "random" it can't contain more entropy than the inputs provide. From this perspective, we see that your algorithm smuggles in entropy via System.nanoTime() - most definitions of a "deterministic" algorithm would disallow calling this function.

The quote - while pithy - is essentially a tautology. There's nothing there to disprove and there's no evolution of hardware possible that can make it no longer true. It's not about hardware, it's about the definition of a deterministic algorithm. He's simply observing that determinism and randomness are incompatible. For any deterministic algorithm, its entire behavior is predicted by its starting conditions. If you think you've found an exception, you're misunderstanding what it means to be deterministic.

It is true that a process running on a shared computer with a complex series of caches and which receives various network and hardware inputs has access to much more entropy than one running on simple, isolated, dedicated hardware. But if that process accesses that entropy it is no longer deterministic and so the quote doesn't apply.

bmm6o

Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin.

When you interpret "living in a state of sin" as "doing nonsense", then it's perfectly right.

What you did was use a rather slow method, System.nanoTime(), to generate rather weak randomness. You measured some

... entropy rate of ~5.3 bits/byte ...

but this is just the upper bound. All you can ever get is an upper bound. The real entropy may be orders of magnitude smaller.

Try instead filling the array using a cryptographic hash like MD5. Compute a sequence like md5(0), md5(1), ... (taking one or more bytes from each value; it doesn't matter how many). You'll get no compression at all (yes, MD5 is broken, but still good enough to produce incompressible data).

We can say that there's no entropy at all, yet you'd measure 8 bits/byte.
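A minimal sketch of that experiment (the class name, file name, and the 250,000-byte total, chosen to match the question's images, are my own assumptions):

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    public class Md5Stream {
        public static void main(String[] args)
                throws IOException, NoSuchAlgorithmException {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            try (FileOutputStream out = new FileOutputStream("md5stream.bin")) {
                // md5(0), md5(1), ...: fully deterministic (zero entropy),
                // yet the output is incompressible and "measures" ~8 bits/byte.
                for (long i = 0; i < 250_000 / 16; i++) {
                    out.write(md5.digest(ByteBuffer.allocate(8).putLong(i).array()));
                }
            }
        }
    }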

When you really need something random, you not only have to use a HW source, you also have to know a sure lower bound on how much entropy it really produces. While there most probably is some randomness in nanoTime(), I'm unaware of any non-trivial lower bound on it.

When you need randomness for cryptography, you really have to resort to something provided by your OS, your language, or a good library. Such providers collect entropy from multiple sources and/or dedicated HW, and quite some work has been put into their entropy estimates.

Note that you usually need hardly any entropy. A good (deterministic) PRNG initialized with a few random bytes is usable for cryptography, and therefore also for everything else.

maaartinus

I thought I'd chime in on the meaning of "random". Most answers here are talking about the output of random processes, compared to the output of deterministic processes. That's a perfectly good meaning of "random", but it's not the only one.

One problem with the outputs of random processes is that they're hard to distinguish from the outputs of deterministic processes: they don't contain a "record" of how random their source was. An extreme example of this is a famous XKCD comic where a random number generator always returns 4, with a code comment claiming that it's random because it came from a die roll.

An alternative approach to defining "randomness", called Kolmogorov complexity, is based on the data itself, regardless of how it was generated. The Kolmogorov complexity of some data (e.g. a sequence of numbers) is the length of the shortest computer program which outputs that data: data is "more random" if it has a higher Kolmogorov complexity.
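As an extreme illustration (my own example): a string of a million zeros has very low Kolmogorov complexity, because a tiny program outputs it:

    public class Zeros {
        public static void main(String[] args) {
            // A million characters of output from a tiny program: the
            // string's Kolmogorov complexity is at most the length of this
            // source code, so it is highly non-random in the Kolmogorov sense.
            for (int i = 0; i < 1_000_000; i++) {
                System.out.print('0');
            }
        }
    }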

Your use of compression algorithms like PNG, and comparing the length before and after compression, is similar to the idea of Kolmogorov complexity. However, Kolmogorov complexity allows data to be encoded as a program in any Turing-complete programming language, rather than a limited format like PNG; "decompressing" such encodings (programs) is done by running them, which may take an arbitrary amount of time and memory (e.g. more than is available in our puny universe).

Rice's theorem tells us that we can't, in general, distinguish between programs which loop forever and programs which output our data. Hence it's very hard to find the Kolmogorov complexity of some data: if we write down a program which generates that data, there may actually be a shorter program (i.e. a lower complexity), but we didn't spot it because we couldn't distinguish it from an infinite loop. Kolmogorov complexity is hence uncomputable, although if we knew the Busy-Beaver numbers we could compute it by using those to bound the amount of time that we check each program.

In the case of your example data, finding its Kolmogorov complexity (i.e. its "intrinsic randomness") would require finding the shortest deterministic program which outputs that same byte sequence, and taking its length.

Now we can answer your question from the point of view of Kolmogorov complexity, and we find that the quote is correct: we cannot generate random numbers (high Kolmogorov complexity) by deterministic means.

Why not? Let's imagine that we write a small computer program and we use it to generate a sequence of random numbers. One of the following situations must apply:

  • We generate a vast amount of output. However, since we know that this output is generated by a small program, the output (by definition) has low Kolmogorov complexity, and hence it's not "random" in this sense.
  • We generate so few numbers that writing them all down would take about the same, or even fewer, bits than writing down our short generating program. In this case, the numbers are relatively incompressible, which indicates that they're quite random in the Kolmogorov sense. However, since the amount of output is comparable to what we put in (the source code for the program), it's fair to say that the program didn't "generate" the randomness, we did by choosing that program. After all, in this case our generating program might as well have just been a list of these exact numbers (e.g. print([...])).

In either case, we're not "generating" more randomness than we put in (the "randomness" of our generating program's source code). We might try to work around this by using a longer generating program, to avoid the output having a short generator, but there are only two ways to do that:

  • Systematically "bloat" the code in some way. However, Kolmogorov complexity doesn't care about the particular program that we used to generate the data: it only cares about whichever generating program is the smallest. Systematic bloat doesn't add Kolmogorov complexity, because such patterns in the code can themselves be generated with a very small amount of code. For example if we take run(shortGenerator) and add a whole load of systematic bloat to get run(bloatedGenerator), a short generator still exists of the form run(addBloat(shortGenerator)).
  • Add bloat non-systematically, i.e. without any patterns, so that an addBloat function would have to end up being just as bloated as the code itself. However, being so devoid of patterns is exactly what makes something random (high Kolmogorov complexity). Hence bloating the generating program in this way does increase the randomness (Kolmogorov complexity) of the output, but it also increases the amount of randomness (Kolmogorov complexity) that we have to provide in the form of source code. Hence it's still us who are providing the "randomness" and not the program. In the above example of just writing print([...]), adding non-systematic bloat is equivalent to just writing more "random" numbers in that hard-coded list.
Warbo

Compression isn't an accurate test of randomness, and nor is looking at an image and saying "that looks random".

Randomness is tested by empirical methods. There are in fact suites of specially designed software/algorithms for testing randomness, for example TestU01 and the Diehard tests.

Furthermore, your image is in fact a 1D string of numbers mapped onto a 2D space, and thus isn't a good representation of certain patterns that can appear.

If you were to examine your image pixel by pixel, you would most likely find many short patterns of increasing value before a sudden drop. If you were to create a graph with the x value being the sample number and the y value being the value obtained from the 'random' function, you would most likely find that your data in fact looks like a sawtooth wave:

[image: sawtooth wave]

This is the pattern made by values that increase under modular arithmetic, of which your computation is an example: time increasing at a near-constant rate, with the & 0xFF acting as mod 256.
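You can check this yourself by dumping (sample, value) pairs and plotting them (a sketch; the class name is mine, and the exact shape depends on your JVM and OS):

    public class SawtoothDump {
        public static void main(String[] args) {
            // Dump (sample number, value) pairs as CSV; plotting them
            // typically shows a sawtooth: nanoTime() climbs at a roughly
            // constant rate while & 0xFF wraps it modulo 256.
            for (int i = 0; i < 1_000; i++) {
                System.out.println(i + "," + (System.nanoTime() & 0xff));
            }
        }
    }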

Pharap

You are confusing the concept of random numbers with "numbers that appear to be random."

To understand von Neumann's quote, we have to understand what it means to "generate random numbers." Warbo's answer links an excellent XKCD comic to this end.

When we talk about random numbers, we're not talking about the values themselves. Obviously a 4 is no more random than a 3. We are talking about a third party's ability to predict this value better than random chance. A random number is one which is not predictable. Sometimes we'll add conditions to this. A cryptographically secure pseudo-random number generator (CSPRNG) generates numbers which cannot be predicted better than random chance if an attacker does not know the seed/key, but if we're talking about truly random numbers (not pseudo-random), it's usually defined to be a number that is not predictable, even with complete knowledge of the system, including any keys.

Now, your example, as many have pointed out, is not deterministic. The program does not specify what value comes out of System.nanoTime(). Thus it is not in the same class as using a CSPRNG to generate pseudo random numbers. The former may be nondeterministic while the latter is deterministic if the value of the key is deterministic. The former contains operations which are not defined to have deterministic values.

However, you will note that I said it may be nondeterministic. Be aware that System.nanoTime() is not designed to provide values for this purpose. It may or may not be sufficiently nondeterministic. An application might adjust the system clock such that calls to System.nanoTime() all occur on multiples of 256 nanoseconds (or close to it). Or you may be working in JavaScript, where the recent Spectre exploits have led major browsers to intentionally decrease the resolution of their timers. In these cases, your "random numbers" may become highly predictable in environments that you did not plan for.
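You can probe how your own platform behaves with a quick granularity check (a sketch; the class name is mine, and results vary by JVM and OS):

    public class NanoGranularity {
        public static void main(String[] args) {
            // Sample nanoTime() until it changes, a few times, to see the
            // timer's effective resolution on this JVM/OS combination.
            for (int i = 0; i < 5; i++) {
                long t0 = System.nanoTime();
                long t1;
                do {
                    t1 = System.nanoTime();
                } while (t1 == t0);
                System.out.println("tick: " + (t1 - t0) + " ns");
            }
        }
    }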

  • So generating random numbers with deterministic processes... sin.
  • Generating random numbers with dedicated random hardware... not sin.
  • Generating random numbers with nondeterministic aspects of computers... maybe sin.

It all depends on what you intend. If you are encrypting your love letters to Sponge Bob so that your sister cannot read them, the demands placed on your so-called-random-numbers are pretty low. System.nanoTime() used as you did is probably good enough. If you're protecting nuclear secrets against an advanced foreign State which is actively seeking them, you may want to consider using hardware that's designed to be up to the challenge.

Cort Ammon

I don't think you have understood the claim. The point is that if there is a deterministic procedure for generating a 'random' number series (or anything, really), then finding the pattern is merely the task of finding this procedure!

Hence, there always exists a deterministic method to predict the next integer. This is precisely what we don't expect to happen if we assume randomness!

Any sufficiently complex deterministicity is indistinguishable from stochasticity.

--From Wrzlprmft's user page

Hence, even if something looks random, why on earth would we model it as 'random' if we have a deterministic procedure to generate it?

This, I think, is the key problem. You have only shown some form of indistinguishability of the PRNG and 'true randomness'.

However, it does not follow that these concepts are equal. In particular, randomness is a mathematical, theoretical concept. We have already shown above that, in theory, treating a PRNG as 'true randomness' leads to a contradiction. Hence, they cannot be equal.

Discrete lizard

I think that others have already pointed this out, but it was not emphasized that much, so let me also add to the discussion.

As others have already pointed out, there's the issue of measuring the entropy. Compression algorithms may tell you something, but they are source-agnostic. Since you know more about how the data was generated, you could probably construct a much better algorithm to compress it, which means the true entropy is much lower.

Furthermore, you're somewhat conflating the meanings of the phrases "on a computer" and "deterministic". You certainly can perform nondeterministic operations on a computer.

In fact, you just did, though it's not that apparent at first glance.

A typical deterministic algorithm for random number generation is a PRNG such as a linear congruential generator. PRNGs are stateful. Inner state means less entropy, since the next state is determined by the previous one. I won't delve into that; it's probably obvious to you. The important point is that a fully deterministic algorithm depends only on its previous state, whatever that may be.
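For instance, here is a minimal LCG sketch (the multiplier and increment are the well-known Numerical Recipes constants; the seed and output width are arbitrary choices of mine):

    public class Lcg {
        public static void main(String[] args) {
            // Linear congruential generator:
            //   next = (a * state + c) mod 2^32
            // Same seed in => same "random" sequence out: no fresh entropy.
            long state = 42;  // all the "randomness" there will ever be
            for (int i = 0; i < 10; i++) {
                state = (1664525L * state + 1013904223L) & 0xFFFFFFFFL;
                System.out.println(state >>> 24);  // emit the top byte
            }
        }
    }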

Now look at your algorithm. What is it based on? How much state do you have? Is it deterministic?

  file.writeByte((byte) (System.nanoTime() & 0xff));

Let's ignore file.write and any issues of flushing buffers or waiting for I/O (did you try adding heavy noise to the hard-drive cables for a moment? No? Hey, you could. And hey, it'd be nondeterministic then! :) ), and let's focus on the source; it's more important.

The time is some kind of state. It varies, but most of it is the same. That's why you tried to circumvent it and took & 0xFF to drop most of the state. But you have not dropped it all; some state from the previous read may leak into the next one, so it's certainly not fully nondeterministic. *)

But we're not interested in that. To "prove" that the quote is wrong:

Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin.

you need to disprove it by deterministic means.
So what we are interested in is: is your algorithm certainly fully deterministic?

...and it's obvious that it is not.

  System.nanoTime() & 0xff

That's a time measurement. Time, and measurement. The measurement part might make it deterministic, if the value were cached. I assume it is not, otherwise this function would make no sense. So, if it's read on the fly from the source, we have a time-based value. Since (I again assume) you've not run this on single-task dedicated hardware, you may have context switching kicking in sometimes. And even on single-task dedicated hardware, the time measurement may still not be deterministic, because of temperature/humidity drifts in the time source, bus clock timings, etc.

I totally agree that I'm exaggerating here. Drifts won't be large enough to have much impact (though for real nanotime they could be). More importantly, nanotime is meant to be fast. It doesn't read from a real-time source. It's based on the processor's internal instruction/cycle count. That is actually deterministic, if you ensure no context switches occur.

My point is, it may actually be very hard to run a truly 100% deterministic algorithm if you base it on time, and you have no right to disprove that quote unless you have fully deterministic means.

*) Interestingly, you probably could increase the actual randomness if you went the hardcore way. Do & 0x01, bit by bit, and thread-wait a noticeable time before reading each bit. Generating data that way would be ridiculously slow, but I would actually argue that it could be considered almost truly random, if and only if you're running on a non-RTOS, and also only if each 'noticeable time' is long enough to ensure that the underlying OS either went to sleep or context-switched to another task.

quetzalcoatl

I think the answer you need starts with this comment you yourself made in reply to another answer:

The pattern is a result of the interplay of Java, the JVM, the OS, the CPU + caches, the hard disk, the Trance music I was streaming that consumes CPU/RAM cycles, and everything in between. The pattern simply arises from the one line of Java code inside a for/next loop. A significant part of the entropy comes from the underlying hardware circuits.

You already realize this, I think: you didn't use deterministic means to create the pattern.

You used a computer, a non-negligible part of which is deterministic, but the entropy came from external non-deterministic (or at least, non-deterministic for all practical purposes at the moment) sources: you or the external world interacting with the computer (and, to a lesser extent, any physical imperfections in the computer hardware that might affect the timings of things).

This, by the way, is a big part of how modern operating systems seed the random number generators they make available to programs: by harnessing entropy from interactions with their hardware and the user, which we hope are not predictable to an attacker.

By the way, external-world entropy is actually a problem that has to be dealt with to this day, even in otherwise well-coded cryptography. Computers that have predictable behaviour at boot and during their runtime, such as those with read-only storage or which boot from the network, which have a predictable network environment (either not attached to a network, or with a workload low enough that everything is delivered within a reliable amount of time), and which run the same limited set of software with roughly consistent behaviour, might grossly over-estimate the entropy they are getting from these assumed-to-be-unpredictable components. They can end up generating far more predictable numbers than you'd get on a typical workstation that's doing all sorts of other stuff for you (streaming music, syncing with Dropbox, whatever) in the background.

I think most answers are getting focused on whether checking the last eight bits of time measurements in nanoseconds taken in a loop is a good way to harvest that entropy. This is a very important question to properly answer before you use the method in your example as a random number generation scheme in practice, but it's a separate question from what I think you are asking about.

mtraceur

Just for the record, the actual quote uses the word "arithmetical" rather than "deterministic", and is from John von Neumann's "Various Techniques Used in Connection With Random Digits", published in a symposium proceedings on the Monte Carlo Method in June 1951; the symposium itself was held in July 1949.

Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin. For, as has been pointed out several times, there is no such thing as a random number — there are only methods to produce random numbers, and a strict arithmetic procedure of course is not such a method. (It is true that a problem that we suspect of being solvable by random methods may be solvable by some rigorously defined sequence, but this is a deeper mathematical question than we can now go into.) We are here dealing with mere "cooking recipes" for making digits; probably they can not be justified, but should merely be judged by their results. Some statistical study of the digits generated by a given recipe should be made, but exhaustive tests are impractical. If the digits work well on one problem, they seem usually to be successful with others of the same type.


You do have to look at the context. This was during a time when Monte Carlo simulation and computers were available only to very high-end (and usually defence-related) laboratories, and the RAND Corporation had created some kind of electric circuit ("A random frequency pulse was gated by a constant frequency pulse, about once a second, providing on the average about 100,000 pulses in 1 second....") to generate a million digits that were later published as a book. Arithmetical methods such as the middle-square method ("A ten-digit number (we took 1,111,111,111) is squared and a middle ten is selected; i.e. the right five digits are dropped and the next ten retained to give 'random' digits and to continue the process.") were the "state of the art".
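A sketch of that middle-square recipe (my own code; BigInteger because the square of a ten-digit number can run to twenty digits):

    import java.math.BigInteger;

    public class MiddleSquare {
        public static void main(String[] args) {
            // von Neumann's middle-square recipe: square the ten-digit state,
            // drop the right five digits, keep the next ten as the new state.
            BigInteger state = new BigInteger("1111111111");
            BigInteger tenDigits = BigInteger.TEN.pow(10);
            for (int i = 0; i < 5; i++) {
                state = state.multiply(state)
                             .divide(BigInteger.TEN.pow(5)) // drop right five
                             .mod(tenDigits);               // keep middle ten
                System.out.println(state);
            }
        }
    }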

Today we've got lots more statistical tests for randomness and lots more "recipes" for generating numbers deterministically that pass statistical tests with a certain threshold of confidence, along with more rigorous notions of information theory.

If you use the terminology precisely, deterministic pseudorandom number generation is valid to meet the requirements of certain uses. Sufficiently long output sequences of pseudorandom number generators will fail statistical tests, which is why you have to apply these sequences appropriately.

The term "random" bandied about by itself is not precise. (As is the term "sin".)

Jason S

To add to previous answers, here's an easy way to think about this question.

It's all about the difference between random and deterministic. We'll come to von Neumann and what he was saying afterwards.

Random numbers

A true random number generator would have no pattern, not even hidden in the background, which we could use to predict the next number given the sequence so far. In an ideal world, you could know everything there is to know in the physical universe, and about the system, nanosecond by nanosecond, and it still would be useless to try and predict the next number produced.

That's an ideal case. In practical terms, we get there by mixing together many sources that are "not bad approximations" to random, or that are truly random, or that mathematically mix things up enough that you can prove they get very close to unpredictable, with no bias towards any specific numbers or patterns.

  • "Good" sources are things similar to waiting for a radioactive decay process, or other quantum process that's inherently unpredictable. Output from a heat sensitive semiconductor. Random noise in a diode or other electrical material. Counting photons from the sun.

  • Mixed into this, we can also add some sources that we consider "not bad", which help because they have no connection to the others: waiting for the next mouse click or network packet; the last bit of microtime on the next file write; the output of a "known but mathematically pretty random" pseudorandom number generator function; entropy retained from previous uses of random numbers.

The aim here is to get a number which still cannot be predicted, whatever you know about the universe, and which is statistically as likely to be this as that, with no mathematically detectable pattern, bias or predictability, and no correlation to an event that could be monitored and used for prediction. (Or, if it is correlated with an event, the connection is made incredibly tenuous, such as using "only the nanosecond digit of the time of the last mouse click".)

Deterministic numbers

Mathematicians can prove things about formulae and functions. So it's possible to prove that a function, when repeatedly called, doesn't give any bias or preference to any pattern, other than the simple pattern "these are the outputs of that function if repeatedly called".

So for example, if you pick a number, say, between 1 and 10 million, write it in binary, and "hash" it repeatedly, you'll get a pretty random-looking sequence of digits. It's almost random - but it isn't actually random at all. Given the algorithm and any state, you can predict what the next number will be.

We call it "pseudorandom" because it looks and seems to be mainly random, even if it isn't.
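A sketch of that idea, with SHA-256 standing in for "hash" (the seed and iteration count are arbitrary choices of mine):

    import java.math.BigInteger;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    public class HashChain {
        public static void main(String[] args) throws NoSuchAlgorithmException {
            // Pick a seed, then hash it repeatedly: the output looks random,
            // but anyone who knows the seed and the algorithm can reproduce
            // every byte, so it is pseudorandom, not random.
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] state = BigInteger.valueOf(1_234_567).toByteArray();
            for (int i = 0; i < 4; i++) {
                state = sha.digest(state);
                System.out.println(new BigInteger(1, state).toString(16));
            }
        }
    }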

Here's a good example. Think about this sequence of 3-digit "random numbers": 983, 367, 336, 244, 065, 664, 308, 602, 139, 494, 639, 522, 473, 719, 070, 217. Let's say that I tell you I can generate a million numbers the same way. You can pass them to a statistician, who will confirm (say) that they are distributed equally, or whatever it may be. There's no obvious predictable pattern. They look pretty random, right? But now I tell you that they are actually

the 500th+ digit of Pi, grouped in 3s.

Suddenly, however random the digits of Pi may be, you can immediately predict that the next 2 numbers will be 986 and 094.

To be clear, I don't know exactly how random the digits of Pi are; it will have been studied, and the answer is well known. But the point is this: in principle, the same conclusion holds for any source that is produced by following a deterministic process.

In between

In between the two are a whole range of "things that look random, and are often random to some degree". The more randomness and near-randomness one can mix in, the less prone the output is to having any pattern detected or any value predicted, mathematically.

Back to von Neumann and your question

As you can see, deterministic outputs might look random, and might even, statistically, be randomly distributed. They might even use "secret" or fast-changing data that we have no realistic hope of knowing. But as long as the process is deterministic, the numbers can still never truly be random. They can only be "close enough to random that we're happy to forget the difference".

That's the meaning of the quote you gave. A deterministic process just cannot give random numbers. It can only give numbers that seem to be, and behave quite like, random numbers.

We can now rephrase your question like this: "My (or any modern) computer's output can look and behave totally randomly, does that mean von Neumann's quote is now outdated and incorrect?"

The problem is still this: even if your computer's output may look and behave randomly, it still may not be truly random. If it's calculated purely deterministically, that means there is nothing that wasn't predictable cause-and-effect about getting the next number (that's what "deterministic" means in this sense). We start with some existing data (known), we apply a known process (complex or messy or whatever), and out comes what seems to be a new "random number". But it's not random, because the process was deterministic.

If you say that your method will include a true hardware random generator, to fix that (like a random number generated from radioactive decay or noise in a semiconductor), then your answer could now be random - but your method by definition is no longer deterministic, precisely because you can't predict the outputs (or effects) given the inputs/initial data (causes) any more.

Von Neumann wins both ways, almost by definition!

Stilez