
Here, entropy is some measure of the degree of randomness/disorder in a given set of numbers: $S = \{a_1, a_2, \ldots, a_i\}$

For example, the set $S_{high} = \{4,0,2,5,8,3,7,2,5\}$ has a high degree of randomness/disorder.

And the set $S_{low} = \{4,4,4,4,5,5,5,5,5\}$ has a low degree of randomness/disorder.

I am aware of information entropy $IE$, which applies to probability distributions (and quantifies the amount of information, which is related to randomness/disorder, contained in a probability distribution):

$$IE = \sum_i p_i \log\frac{1}{p_i}$$

However, I simply have numbers. I can, of course, convert these numbers to an empirical probability distribution:

$S_{low} = \{4,4,4,4,5,5,5,5,5\} \;\longrightarrow\; p(4) = \tfrac{4}{9},\quad p(5) = \tfrac{5}{9}$

The process of converting the numbers into an empirical probability distribution, so that the $IE$ formula above can be applied, is not differentiable (at least it seems that way to me).
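To make the conversion concrete, here is a minimal sketch (the base-10 logarithm is an assumption on my part; it reproduces the 0.298 value quoted in the comments below). The hard binning in `np.unique` is exactly the non-differentiable step:

```python
import numpy as np

def empirical_ie(s, base=10):
    """Empirical distribution of s, then IE = sum_i p_i * log(1/p_i).

    Base 10 is assumed here because it reproduces the 0.298 figure
    quoted in the comments; the question's formula leaves the base open.
    """
    values, counts = np.unique(s, return_counts=True)  # hard, non-differentiable binning
    p = counts / counts.sum()
    return np.sum(p * np.log(1.0 / p)) / np.log(base)

S_low = [4, 4, 4, 4, 5, 5, 5, 5, 5]
print(empirical_ie(S_low))  # ~0.298
```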

So I wonder, is there any differentiable function that can take a set of raw numbers, and approximate the "entropy" of those numbers in the sense described above?
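One way to get such a function, not discussed in the answers below, is the standard trick of replacing the hard counting step with a kernel-smoothed "soft histogram". A minimal sketch, where the bin `centers`, the bandwidth `sigma`, and the log base are all my own assumptions:

```python
import numpy as np

def soft_entropy(a, centers, sigma=0.5, base=10):
    """Differentiable surrogate for the empirical IE above.

    Hard counting is replaced by Gaussian "soft" membership of each
    number in a fixed grid of bin centers, so the result is smooth in
    the inputs a_1, ..., a_n. `centers`, `sigma`, and the base-10 log
    are modelling choices, not part of the original question.
    """
    a = np.asarray(a, dtype=float)
    # w[i, j]: soft membership of sample i in bin j
    w = np.exp(-((a[:, None] - centers[None, :]) ** 2) / (2 * sigma**2))
    w = w / w.sum(axis=1, keepdims=True)   # each sample contributes total mass 1
    p = w.mean(axis=0)                     # smoothed empirical distribution
    p = np.clip(p, 1e-12, None)            # guard against log(0)
    return np.sum(p * np.log(1.0 / p)) / np.log(base)

centers = np.arange(10.0)
print(soft_entropy([4, 4, 4, 4, 5, 5, 5, 5, 5], centers, sigma=0.3))  # small (low disorder)
print(soft_entropy([4, 0, 2, 5, 8, 3, 7, 2, 5], centers, sigma=0.3))  # larger (high disorder)
```

As `sigma` shrinks toward 0 this approaches the hard empirical $IE$ (about 0.298 for $S_{low}$ in base 10), while staying differentiable with respect to each $a_i$.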

  • You need to clarify what you mean by Entropy http://en.wikipedia.org/wiki/Statistical_randomness – Dale M Mar 06 '15 at 04:29
  • @DaleM We can consider entropy to be the formula I gave above for $IE$ applied to the empirical probability distribution that can be constructed from the set of numbers $S$ – killajoule Mar 06 '15 at 04:31
  • Well then, your entropy will always be zero because your numbers that are used to construct the probability distribution will always perfectly match it. – Dale M Mar 06 '15 at 04:32
  • @DaleM that's not true. If I plug the empirical probability distribution for $S_{low}$ (which I give in the question) into the $IE$ formula, I get 0.298 (which is a measure of the information, or the related quantity, disorder, in $S_{low}$) – killajoule Mar 06 '15 at 04:34
  • "differentiable" with respect to what? – leonbloy Mar 06 '15 at 13:45
  • @leonbloy Good question. Differentiable with respect to the inputs to the function which are the numbers in the set. I.e. $S={a_1,a_2,...,a_i}$ and $f(a_1,a_2,...,a_i) \approx IE$ so differentiable with respect to $a_i$ – killajoule Mar 06 '15 at 20:46
  • Furthermore, you should not speak of sets $\{\cdots\}$, if order matters, but of tuples, or lists $(\cdots)$ – leonbloy Mar 06 '15 at 21:28
  • Hi. You may want to check out my approach at http://math.stackexchange.com/questions/1368567/empirical-entropy – mathreadler Aug 16 '15 at 09:54

2 Answers


You don't have sets, but strings $S$ of digits. If these digits are not just symbols but represent numerical values, you could take a discrete Fourier transform of $S$ concatenated with its reverse, in order to remove unwanted boundary effects. Chaotic behavior of $S$ will be reflected in the "middle" Fourier coefficients being large. The latter phenomenon is the discrete version of the fact that the Fourier coefficients of a periodic analog function $f$ tend to zero at a speed depending on the smoothness of $f$.
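For concreteness, a rough sketch of this idea (which coefficients count as "middle", and the normalization, are my own choices, not part of the answer):

```python
import numpy as np

def middle_spectrum_share(s):
    """DFT of s concatenated with its reverse; large magnitudes in the
    "middle" (high-frequency) coefficients indicate disorder.
    The cutoff at the middle half and the normalization are assumptions.
    """
    s = np.asarray(s, dtype=float)
    sym = np.concatenate([s, s[::-1]])        # reverse-concatenate to remove boundary effects
    mags = np.abs(np.fft.fft(sym))
    n = len(sym)
    middle = mags[n // 4 : 3 * n // 4]        # coefficients around the Nyquist frequency
    return middle.sum() / mags[1:].sum()      # fraction of non-DC magnitude at high frequencies

print(middle_spectrum_share([4, 4, 4, 4, 5, 5, 5, 5, 5]))  # smoother, smaller
print(middle_spectrum_share([4, 0, 2, 5, 8, 3, 7, 2, 5]))  # chaotic, larger
```

Incidentally, this statistic is differentiable in the entries of $S$ wherever the coefficients are nonzero, since the DFT is linear, which is relevant to the question's requirement.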


Just to add to Blatter's answer:

The concept you're looking for is neither "entropy" nor "randomness". Randomness is not a property of a set of numbers but of a source (that is, the source is unpredictable for you, though it might be predictable for someone else). For example, the numbers $\{4,0,2,5,8,3,7,2,5\}$ might predictably come from the solution of the equation $$x\; (x^2 - 7 x + 10)^2\; (x^4 - 22 x^3 + 173 x^2 - 572 x + 672) = 0,$$ and the sequence $\{4,4,4,4,5,5,5,5,5\}$ might have been unpredictably produced by a random number generator.
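For the curious, here is a quick check (my own sketch; it just expands the factored polynomial and solves it numerically) that the equation above really does have the "random-looking" multiset as its solutions:

```python
import numpy as np

# The three factors from the equation, as coefficient arrays (highest degree first)
f1 = [1, 0]                          # x
f2 = [1, -7, 10]                     # x^2 - 7x + 10
f3 = [1, -22, 173, -572, 672]        # x^4 - 22x^3 + 173x^2 - 572x + 672

poly = np.polymul(np.polymul(f1, np.polymul(f2, f2)), f3)
print(sorted(np.roots(poly).real.round(6)))
# [0.0, 2.0, 2.0, 3.0, 4.0, 5.0, 5.0, 7.0, 8.0]  -> the multiset {4,0,2,5,8,3,7,2,5}
```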

Blatter's answer looks for periodicities in your sequence. That might be what you were looking for, or maybe not.

– pglpm