
I've found that if I don't understand the etymology behind a CS/programming term, it usually means that I've missed or misunderstood some important underlying concept.

I don't understand why the Kleene star is also called the Kleene closure. Is it related to closures in programming, a function with bound non-local variables?

... on reflection, maybe it is because it allows an open ended set to be written in a closed expression form?

... well in good old rubber-duck-explaining fashion, I'm now guessing that is it, but would still welcome an authoritative answer.

mallardz

3 Answers

A set is closed under some operator if the result of applying the operator to things in the set is always in the set. For example, the natural numbers are closed under addition because, whenever $n$ and $m$ are natural numbers, $n+m$ is a natural number. On the other hand, the naturals are not closed under subtraction since, for example, $3-5$ is not a natural number.

The closure of a set $S$ under some operator is the smallest set containing $S$ that is closed under the operator. For example, the closure of the natural numbers under subtraction is the integers; the closure of the natural numbers under addition is just the natural numbers, since the set is already closed.
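To make the two definitions concrete, here is a minimal Python sketch for finite sets. The helper names `is_closed` and `closure`, and the choice of addition modulo 12 as the operator, are my own assumptions for illustration (the examples above use ordinary addition and subtraction, whose closures are infinite and so cannot be enumerated this way).

```python
from itertools import product

def is_closed(elements, op):
    """Return True if op(a, b) stays inside `elements` for every pair."""
    return all(op(a, b) in elements for a, b in product(elements, repeat=2))

def closure(seed, op):
    """Smallest superset of `seed` that is closed under the binary `op`.

    Only terminates when that closure is finite.
    """
    closed = set(seed)
    while True:
        new = {op(a, b) for a, b in product(closed, repeat=2)} - closed
        if not new:
            return closed
        closed |= new

def add_mod_12(a, b):
    """Hypothetical operator chosen so that the closure is finite."""
    return (a + b) % 12

print(is_closed({3}, add_mod_12))        # False: 3 + 3 = 6 is missing
print(sorted(closure({3}, add_mod_12)))  # [0, 3, 6, 9]
```

The loop keeps adding results until nothing new appears, which is exactly "the smallest set containing $S$ that is closed under the operator".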

So, "Kleene closure" is not an alternative name for "Kleene star". The Kleene star is the operator; the Kleene closure of a set is the closure of that set under the operator.
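As a rough illustration, the star operator can be computed on a finite language if we cut words off at some length. The function name `kleene_star` and the bound `max_len` are assumptions of this sketch, not standard names; the true $L^*$ is infinite whenever $L$ contains a non-empty word.

```python
def kleene_star(language, max_len):
    """Words of language* that have length <= max_len."""
    result = {""}      # the empty word is always in L*
    frontier = {""}    # words found in the previous round
    while frontier:
        frontier = {
            w + u
            for w in frontier
            for u in language
            if len(w + u) <= max_len
        } - result
        result |= frontier
    return result

star = kleene_star({"ab"}, 4)
print(sorted(star, key=len))           # ['', 'ab', 'abab']
print(kleene_star(star, 4) == star)    # True: applying the star again adds nothing
```

The second line of output is the "closed under the operator" property in miniature: once you have $L^*$, applying the star again gives the same set.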

David Richerby

In a nutshell

The name Kleene closure is clearly intended to mean closure under some string operation.

However, careful analysis (thanks to a critical comment by the OP, mallardz) shows that the Kleene star cannot be closure under concatenation; that closure corresponds instead to the Kleene plus operator.

The Kleene star operator actually corresponds to a closure under the power operation derived from concatenation.

The name Kleene star comes from the syntactic representation of the operation with a star *, while closure is what it does.

This is further explained below.
Recall that closure in general, and Kleene star in particular, is an operation on sets, here on sets of strings, i.e. on languages. This will be used in the explanation.

Closure of a subset under an operation always defined

A set $C$ is closed under some $n$-ary operation $f$ iff $f$ is defined for every $n$-tuple of arguments in $C$ and the result always lies in $C$: $\{f(c_1,\ldots,c_n)\mid c_1,\ldots,c_n \in C\}\subseteq C$.

By extending $f$ to sets of values in the usual way, i.e. $$f(S_1,\ldots,S_n)=\{f(s_1,\ldots,s_n)\mid s_i\in S_i,\ 1\leq i\leq n\}$$
we can rewrite the condition as a set inclusion:
$$f(C,\ldots,C)\subseteq C$$

For a domain (or set) $D$ with an operation $f$ that is always defined on $D$, and a set $S\subseteq D$, the closure of $S$ under $f$ is the smallest set $S_f$ containing $S$ that is closed under $f$, i.e. that satisfies $\{f(s_1,\ldots,s_n)\mid s_1,\ldots,s_n \in S_f\}\subseteq S_f$.

More tersely, the closure of $S$ under $f$ may be defined by a set equation:

$$S_f \text{ is the smallest set such that } S_f=S\cup f(S_f,\ldots,S_f)$$

This is an example of a least fixed-point definition, often used in semantics, and also used in formal languages. A context-free grammar can be seen as a system of language equations (i.e. string set equations), where the non-terminals stand for language variables. The least fixed-point solution associates a language to each variable, and the language thus associated to the initial symbol is the one defined by the CF grammar.
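To illustrate the grammar-as-equation view, here is a small sketch that solves one language equation by iterating from the empty language. The grammar $S \to aSb \mid \epsilon$, the function name, and the length bound `max_len` are all hypothetical choices of mine, picked only to keep the sets finite.

```python
def solve_grammar_equation(max_len):
    """Least solution of X = {""} | {"a" + w + "b" for w in X}, up to max_len.

    This is the language equation of the hypothetical grammar S -> a S b | ε.
    """
    x = set()  # start from the empty language, the bottom element
    while True:
        step = {""} | {"a" + w + "b" for w in x if len(w) + 2 <= max_len}
        if step == x:   # nothing changes any more: a fixed point
            return x
        x = step

print(sorted(solve_grammar_equation(6), key=len))  # ['', 'ab', 'aabb', 'aaabbb']
```

Each round applies the right-hand side of the equation once; because the iteration starts from the empty set and only ever grows, the fixed point it reaches is the least one (within the bound).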

Extending the concept

The closure as defined above is only intended to extend a subset $S$ into a minimal set $S_f$ on which the operation $f$ is always defined and never leads outside $S_f$.

As remarked by the OP mallardz, this is not a sufficient explanation, since it will not include the empty word $\epsilon$ in $S_f$ when it is not already in $S$. Indeed this closure corresponds to the definition of the Kleene plus + and not to the Kleene star *.
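The point about $\epsilon$ is easy to check on a small example. The sketch below closes a finite language under concatenation, truncated at a length bound; the names `concat` and `closure_under_concat`, the bound, and the sample language $\{a, bc\}$ are assumptions made for the illustration.

```python
def concat(a, b, max_len):
    """Set-level concatenation, keeping only words of length <= max_len."""
    return {x + y for x in a for y in b if len(x + y) <= max_len}

def closure_under_concat(seed, max_len):
    """Least X containing seed with concat(X, X) inside X, up to max_len."""
    closed = {w for w in seed if len(w) <= max_len}
    while True:
        bigger = closed | concat(closed, closed, max_len)
        if bigger == closed:
            return closed
        closed = bigger

plus = closure_under_concat({"a", "bc"}, 3)
print(sorted(plus, key=lambda w: (len(w), w)))  # ['a', 'aa', 'bc', 'aaa', 'abc', 'bca']
print("" in plus)                               # False: the empty word never appears
```

Concatenating non-empty words can never produce the empty word, so the result is the (truncated) Kleene plus rather than the Kleene star.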

Actually, the idea of closure can be extended, or considered in different ways.

  1. Extension to other algebraic properties

    One way to extend it (though it is no longer called closure) considers more generally an extension to a set $S_f$ having specific algebraic properties with respect to the operation $f$.

    If you define $S_f$ as the smallest set containing $S$ that is a Monoid for the binary function $f$, then you require both closure and a neutral element which is the empty word $\epsilon$.

  2. Extension through a derived operation

    There is a second way which is more properly a closure issue. When you define the closure of $S\subset D$, you can consider it with respect to some of the arguments, while you allow values from the whole set $D$ for the other arguments.

    Considering (for simplicity) a binary function $f$ over $D$, you can define $S_{f,1}$ as the smallest set containing $S$ that satisfies: $$\{f(s_1,s_2)\mid s_1\in S_{f,1},\ s_2\in D\}\subseteq S_{f,1}$$

    or with set notation:

    $$S_{f,1} \text{ is the smallest set such that } S\subseteq S_{f,1} \text{ and } f(S_{f,1},D)\subseteq S_{f,1}$$

    This also makes sense when the arguments do not belong to the same set. Then you may have closure with respect to some arguments in one set, while considering all possible values for the other arguments (many variations are possible).

    Given a Monoid $(M,f,\epsilon)$ $-$ for example the monoid of strings with concatenation $-$ where $f$ is an associative binary operation on the elements of the set $M$ with an identity element $\epsilon$, you can define the powers of an element $u\in M$ as: $$\forall u\in M.\; u^0=\epsilon\; \text{ and }\; \forall n\in\mathbb N\; u^n=f(u,u^{n-1})$$

    This exponentiation $u^n$ is an operation that takes as argument an element of $M$ and a non-negative integer of $\mathbb N_0$.

    However, the natural extension of this operation to subsets of $M$ is not the usual one which would be, for a given value of $n$, $U^n=\{u^n\mid u\in U\}$. It should rather take into account the original definition of $u^n$ from the operation $f$, which would give: $$\left\{ \begin{array}{l} U^0=\{u^0\mid u\in U\}=\{\epsilon\}\\ \forall n\in\mathbb N,\; U^n=f(U,U^{n-1}) \end{array} \right.$$ so as to be consistent with the natural extension of the operation $f$ to subsets of $M$.

    Now we can define the closure $U_{\wedge,1}$ of $U\subseteq M$ for the first argument of the power operation, as indicated above with the set notation, as: $$U_{\wedge,1} \text{ is the smallest set such that } U\subseteq U_{\wedge,1} \text{ and } (U_{\wedge,1})^n\subseteq U_{\wedge,1} \text{ for all } n\in\mathbb N_0$$

    And this does give us the Kleene star operation when the construction is applied to the concatenation operation of the free monoid of strings.

    To be completely honest, I am not sure I have not been cheating. But a definition is only what you make it, and that was the only way I found to actually turn the Kleene star into a closure. I may be trying too hard.
    Comments are welcome.
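For what it is worth, the power-based construction in item 2 can be tried out on a small example. This is only a sketch under my own assumptions: words are truncated at a length bound `max_len`, and the names `concat`, `power` and `star_via_powers` are made up for the illustration.

```python
def concat(a, b, max_len):
    """Set-level concatenation, keeping only words of length <= max_len."""
    return {x + y for x in a for y in b if len(x + y) <= max_len}

def power(u, n, max_len):
    """U^n with U^0 = {""} and U^n = concat(U, U^(n-1))."""
    result = {""}
    for _ in range(n):
        result = concat(u, result, max_len)
    return result

def star_via_powers(u, max_len):
    """Union of U^0, U^1, ..., U^max_len, truncated at max_len.

    A word of length <= max_len needs at most max_len non-empty factors,
    so higher powers contribute nothing new below the bound.
    """
    return set().union(*(power(u, n, max_len) for n in range(max_len + 1)))

star = star_via_powers({"a", "bc"}, 3)
print(sorted(star, key=lambda w: (len(w), w)))
# ['', 'a', 'aa', 'bc', 'aaa', 'abc', 'bca']

# The result is closed under every power, including the 0th (which forces ε in).
print(all(power(star, n, 3) <= star for n in range(4)))  # True
```

The only difference from closure under plain concatenation is the 0th power, which contributes $\{\epsilon\}$.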

Closing a set under an operation that is not always defined

This is a slightly different view and use of the concept of closure. This view is not really answering the question, but it seems good to keep it in mind to avoid some possible confusions.

The above assumes that the function $f$ is always defined in the reference set $D$. That may not always be the case. Then closure can also be a mathematical technique to extend a set so that some operation will always be defined. The way it works in practice is as follows:

  • start with the set $D$ where $f$ is not always defined;

  • build another set $D'$ constructed from elements of $D$, with an operation $f'$ that is always defined, such that you can ...

  • show that there is an isomorphism between $D$ and a subset of $D'$ that is such that $f$ is the image of $f'$ restricted to that subset.

Then the set $D'$ with the operation $f'$ is a closed extension of $D$ with $f$.

That is how integers are built from natural numbers, considering the set of pairs of natural numbers quotiented by an equivalence relation (two pairs are equivalent iff the two elements are in the same order and have the same difference).
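A minimal sketch of that pair construction, assuming the pair $(a, b)$ is read as "$a$ minus $b$" and that each class is represented by the pair with a zero component (both are conventions I chose; the function names are hypothetical too):

```python
def normalize(pair):
    """Canonical representative of the class of (a, b): one component is 0."""
    a, b = pair
    m = min(a, b)
    return (a - m, b - m)  # m <= a and m <= b, so this stays in the naturals

def equivalent(p, q):
    """(a, b) ~ (c, d) iff a + d == c + b, stated without any subtraction."""
    return p[0] + q[1] == q[0] + p[1]

def add(p, q):
    """Componentwise addition of pairs."""
    return normalize((p[0] + q[0], p[1] + q[1]))

def subtract(p, q):
    """Always defined: subtracting (c, d) is adding the swapped pair (d, c)."""
    return add(p, (q[1], q[0]))

def embed(n):
    """Image of the natural number n in the new set."""
    return (n, 0)

print(subtract(embed(3), embed(5)))    # (0, 2), which plays the role of -2
print(add(embed(2), embed(3)) == embed(5))  # True: addition agrees on the naturals
```

Subtraction, which was only partially defined on the naturals, is now total on the pairs, and the naturals sit inside the new set via `embed`.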

This is also how rationals can be built from the integers.

And this is how classical reals can be built from the rationals, though the construction is more complex.

babou

Another meaning of closure, which is more general than the meaning explained by David Richerby, is any operator $\ast\colon X \to X$ on a poset $X$ that satisfies the following axioms:

  1. $x \leq x^\ast$
  2. $x \leq y \longrightarrow x^\ast \leq y^\ast$
  3. $(x^\ast)^\ast = x^\ast$

The classical example is topological closure, which also satisfies $\bot^* = \bot$ and $(x \lor y)^* = x^* \lor y^*$, two properties not satisfied by the Kleene star.

The poset in the case of the Kleene star is the poset of all sets of words: $X=2^{\Sigma^*}$. If $x,y \subseteq \Sigma^*$ then $x \leq y$ if $x \subseteq y$. The axioms of closure then state that

  1. $L \subseteq L^\ast$
  2. $L_1 \subseteq L_2 \longrightarrow L_1^* \subseteq L_2^*$
  3. $(L^*)^* = L^*$

The Kleene plus operator also satisfies these axioms, so is also a closure operator under this definition.
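These axioms can be spot-checked by brute force on small languages. The sketch below is only a sanity check, not a proof: it truncates words at an assumed bound `MAX_LEN`, tests all 128 languages made of words of length at most 2 over $\{a, b\}$, and uses names (`plus`, `star`, `is_closure_operator`) that I made up.

```python
from itertools import combinations, product

MAX_LEN = 4  # assumed bound so that every language stays finite

def plus(language, max_len=MAX_LEN):
    """Words of L+ with length <= max_len (closure under concatenation)."""
    closed = {w for w in language if len(w) <= max_len}
    while True:
        new = {x + y for x in closed for y in closed
               if len(x + y) <= max_len} - closed
        if not new:
            return frozenset(closed)
        closed |= new

def star(language, max_len=MAX_LEN):
    """Words of L* with length <= max_len: L+ together with the empty word."""
    return plus(language, max_len) | {""}

def is_closure_operator(op, languages):
    """Check the three axioms on the given finite family of languages."""
    image = {lang: op(lang) for lang in languages}
    extensive = all(lang <= image[lang] for lang in languages)
    monotone = all(image[a] <= image[b]
                   for a in languages for b in languages if a <= b)
    idempotent = all(op(image[lang]) == image[lang] for lang in languages)
    return extensive and monotone and idempotent

words = ["".join(p) for n in range(3) for p in product("ab", repeat=n)]
languages = [frozenset(c) for r in range(len(words) + 1)
             for c in combinations(words, r)]

print(is_closure_operator(star, languages))  # True
print(is_closure_operator(plus, languages))  # True
print(star(frozenset()) == frozenset())      # False: the star of ∅ is {ε}
```

The last line is the failure of $\bot^* = \bot$ mentioned above: the star of the empty language contains the empty word.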

Yuval Filmus