-1

At first I thought I could just represent them with the set of natural numbers, because each bit string represents some natural number. However, doing this would mean the strings '010' and '00010' are the same, which they are not.

  • 1
    What do you mean by represent? Do you mean, find some well-known or understandably constructed set that is bijective with the set of finite-length binary strings? – Isky Mathews May 05 '18 at 19:58
  • 1
    If so, how does $2^\omega$ work for you? – Isky Mathews May 05 '18 at 19:58
  • 1
    Perhaps an extension to a subset of $\Bbb Q$, something like $\frac n{(n+1)^k}$ for a number $n$ with $k$ leading zeroes? – abiessu May 05 '18 at 19:59
  • 2
    A notation for it is $\{0,1\}^*,$ where the star means Kleene star. Which is $$\bigcup _{i = 0}^{\infty}\{0,1\}^i,$$ and $\{0,1\}^i$ is the set of ordered $i$-tuples of elements from $\{0,1\}$ – Phicar May 05 '18 at 19:59
  • 1
    How about making each 0 in a string 1 and each 1 in a string 2 and then interpreting the string as a list of exponents of primes in the unique factorisation of some number, e.g. $010$ goes to $121$ and so $2^1\times3^2\times5^1 = 90$ or $00010$ goes to $11121$ and thus $2^1\times3^1\times5^1\times7^2\times11^1$... – Isky Mathews May 05 '18 at 20:02
  • I really like the $2^\omega$ and the prime number representation (it's kinda like Gödel numbering), but I think the Kleene star is probably the standard notation for this set. – Ozaner Hansha May 05 '18 at 20:04
  • 3
    @IskyMathews "$2^\omega$" denotes the set of infinite binary strings. – Noah Schweber May 05 '18 at 20:06
  • 1
    @Noah Schweber: Thank you! That was a silly typo. – Isky Mathews May 05 '18 at 20:09
  • 1
    @OzanerHansha If you're looking for a notation for naming this set, "$2^{<\omega}$" is also used (it's the standard in mathematical logic, for example). – Noah Schweber May 05 '18 at 20:10
  • I see. I guess $\{0,1\}^+$ ($\{0,1\}^*$ includes the empty string) is more computer science-y and $2^{<\omega}$ is more set theory-y. Thanks! – Ozaner Hansha May 05 '18 at 20:13

3 Answers

2

Here's one easy way to biject the finite binary strings to the natural numbers (let's say we're thinking of the natural numbers as not including zero):

  • Step $1$: put a "$1$" on the left.

  • Step $2$: view the resulting string as a number in binary as usual.

E.g. "$010$" turns into "$1010$," which is ten. The empty string meanwhile turns into the string $1$, which then turns into the number one.


If we want to get $0$ in the mix too, we can just add "Step $3$: subtract $1$" to the above.
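
The two steps above can be sketched in Python. This is only an illustration of the "prepend a 1" idea; the function names `encode` and `decode` are made up for this example, and bit strings are assumed to be ordinary Python strings containing only the characters `0` and `1`.

```python
def encode(bits: str) -> int:
    """Map a finite bit string to a positive integer.

    Step 1: put a '1' on the left. Step 2: read the result as binary.
    """
    if any(c not in "01" for c in bits):
        raise ValueError("expected a string of '0' and '1' characters")
    return int("1" + bits, 2)


def decode(n: int) -> str:
    """Inverse map: write n in binary and drop the leading '1'."""
    if n < 1:
        raise ValueError("expected a positive integer")
    # bin(n) has the form '0b1...'; skip '0b' and the leading 1.
    return bin(n)[3:]


print(encode("010"))    # 10
print(encode("00010"))  # 34, so '010' and '00010' stay distinct
print(encode(""))       # 1
print(repr(decode(10))) # '010'
```

Subtracting 1 from the result of `encode` (and adding 1 before `decode`) gives the variant that also hits $0$.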

Noah Schweber
  • As commented in the question, "the strings 010 and 00010 are not the same". So, how to generate 0, 00, 000, ...? – Peter Krauss Mar 20 '25 at 16:00
  • About bijection, there are more bit strings than natural numbers (like the Cantor set). Take the set $X_k$ of all bit strings of lengths zero to k: we have $|X_k|=2^{k+1}-1$, because you must count 00, 000, 01, 001 etc. as distinct elements. – Peter Krauss Mar 20 '25 at 16:06
  • @PeterKrauss Re: your first comment, note that I'm adding a 1 to the left of each string. So e.g. the string 0 would turn into the (binary) number 10, 00 would turn into 100, etc. And your second comment is just wrong: even though the "counts at each finite stage" look different, the cardinalities of $\mathbb{N}$ and $2^{<\omega}$ are the same. – Noah Schweber Mar 20 '25 at 16:25
  • Incidentally, the misconception in your second comment is basically the same "limit-vs.-process" issue as in (comments to) this old question of yours. – Noah Schweber Mar 20 '25 at 16:40
  • Sorry, I didn't read it carefully. Only later did I realize that I had already used the "protect with a leading 1" algorithm that you describe, to express bit strings more economically in 64-bit integers (where the first bit is the sign and must be zero for the value to be positive). In PostgreSQL the direct cast of VARBIT x, (b'01'||x)::bit(64)::bigint, results in a big number because the bits are copied to the left. You can copy the bit string to the right using overlay( b'0'::bit(64) PLACING (b'1' || x) FROM 64-length(x) )::bigint. – Peter Krauss Mar 22 '25 at 04:18
0

(this answer is a Wiki, you can edit and improve it!)

The non-numerical nature of bit strings calls for some context. In pure mathematics there are alternative ways to do both of the following:

  • Define a bit string, and
  • Define the set $X$ of all finite bit strings.

PS: as commented by @NoahSchweber, the "standard name" of $X$ in mathematical logic is $2^{<\omega}$.

Defining bit string

In computing, bits (binary digits) are not numbers but "raw symbols", typically written 0 and 1.

A bit string s is a sequence of zero or more bits. For example 010 and 00010 are distinct elements of $X$. In particular s=``, the empty string, is an element of $X$.

We can map bit strings to sequences of numbers from the domain $D=\{0,1\}$:
  ``≡(); 0≡(0); 1≡(1); 00010≡(0,0,0,1,0).

Bit strings are equipped with the concatenation operation. If a and b are bit strings, then the concatenation $c=a||b$ is also a bit string.

PS: the empty string is important because it is the neutral element of the concatenation.
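
As a small illustration of the sequence view and of concatenation, here is a sketch in Python. The names `EMPTY`, `from_text` and `concat` are hypothetical helpers chosen for this example; a bit string is modelled as a tuple of the integers 0 and 1.

```python
EMPTY = ()  # the empty bit string, written `` in the text


def from_text(text: str) -> tuple:
    """Turn a written bit string such as '00010' into a sequence over D = {0, 1}."""
    if any(c not in "01" for c in text):
        raise ValueError("expected a string of '0' and '1' characters")
    return tuple(int(c) for c in text)


def concat(a: tuple, b: tuple) -> tuple:
    """Concatenation a || b of two bit strings."""
    return a + b


a = from_text("00010")
print(a)                       # (0, 0, 0, 1, 0)
print(a == from_text("010"))   # False: leading zeros matter
print(concat(EMPTY, a) == a)   # True: the empty string is neutral
print(concat(a, EMPTY) == a)   # True
```

Tuples are used here instead of integers precisely so that `010` and `00010` remain different objects.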

Defining X by Kleene star

As commented by @Phicar, the easiest way to define the set of all bit strings is using the Kleene star:
 $\{0,1\}^*$

It is a "generator expression", producing elements by concatenation. Every element it produces has finite length, but the lengths are unbounded, and the notation itself does not state finiteness explicitly, as requested.

Back to the Kleene star definition:
  $X = \bigcup _{k \in \mathbb{N}}\{0,1\}^k ~ \equiv ~ \{0,1\}^*$

We can be more explicit, limiting the length of the "finite bit strings" to at most k:
  $X_k = \bigcup _{i = 0}^{k}\{0,1\}^i$

This definition seems to be the requested answer.

Note. We can rewrite it using recursion, to emphasize the hierarchy in k:
  $X_{k} = \{0,1\}^k \cup X_{k-1}$ for $k \ge 1$, with $X_0 = \{0,1\}^0$.


NOTES, EXAMPLES AND ILLUSTRATIONS

Examples of $\{0,1\}^i$, the set of all bit strings of length i:

$\{0,1\}^0$ = {``}
$\{0,1\}^1$ = {0,1}
$\{0,1\}^2$ = {00,01,10,11}

Examples of $X_k$, the set of all bit strings with no more than k bits, with respective cardinalities:

$X_0$ = {``};   $|X_0|=1$.

$X_1$ = {``, 0, 1};   $|X_1|=3$.

$X_2$ = {``, 0, 00, 01, 1, 10, 11};   $|X_2|=7$.

$X_3$ = {``, 0, 00, 000, 001, 01, 010, 011, 1, 10, 100, 101, 11, 110, 111};   $|X_3|=15$.

  Note. The sequence of cardinalities (1,3,7,15,...) is OEIS sequence A000225 ($2^n-1$).
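
The examples above can be reproduced with a short Python sketch. The function names `fixed_length` and `strings_up_to` are invented for this illustration, and bit strings are written as ordinary Python strings, with `''` playing the role of the empty string.

```python
from itertools import product


def fixed_length(i: int) -> set:
    """All bit strings of length exactly i, i.e. {0,1}^i."""
    return {"".join(p) for p in product("01", repeat=i)}


def strings_up_to(k: int) -> set:
    """X_k: all bit strings with no more than k bits."""
    result = set()
    for i in range(k + 1):
        result |= fixed_length(i)
    return result


print(sorted(fixed_length(2)))   # ['00', '01', '10', '11']
print(sorted(strings_up_to(2)))  # ['', '0', '00', '01', '1', '10', '11']
print([len(strings_up_to(k)) for k in range(4)])  # [1, 3, 7, 15]
```

Note that each call only ever builds one finite set $X_k$; the set of all finite bit strings is the union of these sets over every k, which no single call enumerates.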

Pictorial view of the hierarchical structure suggested by the recursive definition. It can be associated with the Cantor set picture and the binary tree:

[Figure: hierarchy of bit strings drawn as a binary tree]

Notes about cardinality:

For fixed-length bit strings, $|\{0,1\}^k|=2^k$, so $|X_k|=2^k+|X_{k-1}|$. Then, by induction:

  $|X_k|=2^{k+1}-1$.
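
For readers who want the induction spelled out, here is one way to write it, using only the recurrence $|X_k|=2^k+|X_{k-1}|$ and the base case $|X_0|=1$. The base case holds because $|X_0| = 1 = 2^{1}-1$. For the step, assume $|X_{k-1}| = 2^{k}-1$; then

$$|X_k| = 2^k + |X_{k-1}| = 2^k + \left(2^{k}-1\right) = 2^{k+1}-1.$$

Equivalently, unrolling the recurrence gives a finite geometric sum:

$$|X_k| = \sum_{i=0}^{k} 2^i = 2^{k+1}-1.$$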


Comments under discussion:

As with the curious cardinality of the Cantor set, the value of $|X_{\infty}|$ seems uncountable.

Using the bijection from @NoahSchweber's answer, we can say that the (countable) infinity of the cardinality of the natural numbers overflows by 1 bit (to the uncountable infinity of the Cantor set).

Peter Krauss
  • This isn't an answer: the OP isn't asking about $X_\infty$ but rather $\bigcup_{k\in\mathbb{N}}X_k$. (Also, it's "Kleene star," and it's literally the same thing that regular expressions use for this.) – Noah Schweber Mar 20 '25 at 19:36
  • Hi @NoahSchweber, sorry, you are confusing a small "curiosity note about cardinality" (at the end of the text) with the answer: it is only a note. I added the phrase "under discussion" to the cardinality note. I also made the answer a Wiki, so now you can edit it. I revised the English and the LaTeX, and removed the section about regular expressions... It looks better. English is not my language (nor is mathematics), please consider it again. – Peter Krauss Mar 22 '25 at 04:35
0

I am late to this question but, since it appeared in my feed: the most obvious way to represent all of the finite bit strings is as functions on the natural numbers that are zero almost everywhere.

John Douma