4

A 'relation' is an object with a body that's a set of tuples, and a header that's a set of attribute identifiers. This sort of object is used in the theory of 'relational databases'. A relation is like a table in a spreadsheet, but the collection of columns ("attributes") is a set, and the collection of rows ("tuples") is a set.

Let relation $A$ have attributes $X_1, X_2, \dots, X_m, Y_1, Y_2, \dots, Y_n$ and $B$ have attributes $Y_1, Y_2, \dots, Y_n, Z_1, Z_2, \dots, Z_p$. Thus, $A$ and $B$ can be joined on a shared set of attributes, $Y_i$.

Now consider $X, Y, Z$ as three composite attributes (thinking of each group of attributes as a single attribute). Then the natural join of $A$ and $B$, A JOIN B is a relation with heading $X,Y,Z$ and a body consisting of all tuples ($x$, $y$, $z$) such that a tuple appears in $A$ with $X$ value $x$ (i.e. the attributes not shared with B) and $Y$ value $y$ (i.e. the shared key attributes), and a tuple appears in $B$ with $Y$ value $y$ and $Z$ value $z$. [Quoting some from C.J. Date's IDBS.]

Can we prove that natural join is associative? That is, a tuple $t$ is in A JOIN (B JOIN C) iff $t$ is in (A JOIN B) JOIN C?

Hatshepsut
  • 1,364
  • There are quite a few things about this question I don't understand. Please reread the second paragraph and correct possible errors. Also, what are composite attributes? Are all relations ternary? What is an $X$-value? – martin.koeberl Mar 18 '17 at 02:51
  • @martin.koeberl Added some information, hopefully its clearer? – Hatshepsut Mar 18 '17 at 02:59
  • Do you mean $B$ to have attributes $Y_1,Y_2,\dots, Y_n,Z_1,\dots, Z_p$? If I understand it correctly, $X=X_1\times\dots\times X_n$ and similarly for $Y$ and $Z$, and $A\subseteq X\times Y$ and similarly for $B$? – martin.koeberl Mar 18 '17 at 03:01
  • @martin.koeberl ah sorry - a typo, yes! Fixed now. – Hatshepsut Mar 18 '17 at 03:03

1 Answers1

2

I'll try to stay with your terminology. However, I use $\in$ to say that a tuple is in the body of a relation, and I'll write $\circ$ for the join (see below). So, assume $C$ is a relation with attributes $Z_1,\dots,Z_p,U_1,\dots,U_l$. Let's write $X,Y,Z,U$ for the composite attributes. We want to show that $A\circ(B\circ C)=(A\circ B)\circ C$.

Assume that $(x,y,z,u)\in A\circ(B\circ C)$. That means that $(x,y)\in A$ and $(y,z,u)\in B\circ C$ and thus $(x,y)\in A$ and $(y,z)\in B$ and $(z,u)\in C$. So we have $(x,y,z)\in A\circ B$ and thus $(x,y,z,u)\in (A\circ B)\circ C$. The other direction is analogous.

Two comments are in order here:

  • Note that really we have, if $(x,y,z)\in A\circ B$ and $(z,u)\in C$, that $((x,y),z,u)\in (A\circ B)\circ C$, and similarly $(x,y,(z,u))\in A\circ (B\circ C)$. Depending on how you define tuples, these might not the same. I don't know whether this subtleties are important for you.
  • Commonly, if $A,B$ are binary relations, one defines $A\circ B=\{(x,y)\mid \exists z((x,z)\in B\wedge (z,y)\in A)\}$ (this might appear confusing, but think of the definition for functions), but I thought this is close enough to the join to justify the abuse of notation. If you want to check your understanding, it might be a good exercise to show that this also is associative.