3

This is a paraphrasing of a statement from Data Mining by Witten et al., 4th edition, on section 1.6.

Consider the set $S = \{1, \dots, 288\}$.

A rule is defined as a one-element singleton set whose element is from $S$, e.g., $\{1\}$, $\{2\}$, etc..

The question: how many sets with no more than $14$ rules can be created?

My work: There are $288$ such rules. The result should then be given by $$\sum_{i=1}^{14}\binom{288}{i} \sim 2.36 \times 10^{23}\text{.}$$

See here for the computation.

The correct answer: Witten et al. states that the answer is $\sim 2.7 \times 10^{34}$ with no explanation of the derivation.

Hopefully my terminology is correct. If someone knows this material well and I have something incorrect in my problem setup, please let me know.

Further context which might be helpful: this statement arises from looking at a data set of weather variables. We have five variables per observation, for which each variable has $4$, $4$, $3$, $3$, and $2$ variables. Thus, there are $4 \times 4 \times 3 \times 3 \times 2 = 288$ possibilities for each rule, according to Witten et al. It is possible for a variable to not participate in a rule, but every rule must have at least one variable.

If we restrict the rule set to contain no more than $14$ rules [based on a previous example], there are around $2.7 \times 10^{34}$ possible different rule sets.

The following slides were taken from www.cse.unt.edu/~nielsen/classes/CSCE5380/Slides/Chapter1.pptx:

enter image description here enter image description here enter image description here

Clarinetist
  • 20,278
  • 10
  • 72
  • 137
  • He appears to have calculated $288^{14} \sim 2.7\times 10^{34}$. There are 288 possible rules, so he's calculated the number of sequences of length 14 whose elements are rules. – Albert Steppi Dec 18 '16 at 04:35
  • @AlbertSteppi So then, the correct answer for the question posed would be $\sum_{i=1}^{14}288^{i}$? – Clarinetist Dec 18 '16 at 04:42
  • @AlbertSteppi I would also imagine that there are redundancies which are supposed to removed as well... as ${1, 2}$ should be the same as ${2, 1}$, for example. – Clarinetist Dec 18 '16 at 04:45
  • 1
    I think your original answer is correct by my interpretation of rule sets. Witten seems to mean something different by this term. I don't think this calculation is important to the material in the text so I think it's safe to ignore it. – Albert Steppi Dec 18 '16 at 04:45
  • @AlbertSteppi Thank you! – Clarinetist Dec 18 '16 at 04:46

0 Answers0