Identical systematic encoding for original view and BCH view Reed Solomon code - reference?

Question

I'm looking for a reference (if it exists) for an identical systematic encoding original view (codeword is a sequence of values) and BCH view (codeword is a sequence of coefficients) for $GF(q)$, $RS(q-1,k)$ code as defined in the Wiki article:

original view systematic encoding

BCH view systemaic encoding

These are the parameters I've found for identical systematic encoding:
Original view using data points $\alpha^0$, $\alpha^1$, ..., $\alpha^{q-2}$ .
BCH view using generator roots $\alpha^{k+0}$, $\alpha^{k+1}$, ..., $\alpha^{q-2}$

The ordering of original view codeword values corresponds to increasing powers of $\alpha$, left to right, while the ordering of BCH view coefficients is most significant term first, left to right. Using the data points and generator roots mentioned above, the resulting codewords for the same message will be identical.

Shortened codewords can be implemented by using full codewords with leading zeroes.

Example, $RS(15,9)$ using field $GF(2^4)$ : $x^4 + x + 1$. Original view systematic encoding using data points $\alpha^0$ ... $\alpha^{14}$, in hex {1,2,4,8,3,6,c,b,5,a,7,e,f,d,9}, BCH view systematic encoding using generator roots $\alpha^9$ ... $\alpha^{14}$, ..., in hex{a,7,e,f,d,9}. The result is an identical systematic encoding matrix:

 1 0 0 0 0 0 0 0 0 c a c 3 9 7
 0 1 0 0 0 0 0 0 0 2 f 8 5 9 f
 0 0 1 0 0 0 0 0 0 8 e 7 a b 2
 0 0 0 1 0 0 0 0 0 b f 5 1 b 5
 0 0 0 0 1 0 0 0 0 9 f 6 a a 3
 0 0 0 0 0 1 0 0 0 7 4 8 3 2 3
 0 0 0 0 0 0 1 0 0 7 a 3 d b b
 0 0 0 0 0 0 0 1 0 d 5 7 d 1 f
 0 0 0 0 0 0 0 0 1 8 1 d 5 3 a

Dilip Sarwate · Accepted Answer · 2024-11-06T18:08:46.107

Original formulation Reed-Solomon codes:
Let $k$ and $n$ be integers such that $0 < k < n \leq q$ where $q$ is a prime or prime power. Let $\alpha_0, \alpha_1,\cdots, \alpha_{n-1}$ denote $n$ distinct elements of $\mathbb F_q$. The original description of Reed-Solomon codes is that data $\mathbf u = [u_0, u_1, \cdots, u_{k-1}] \in \mathbb F_q^k$ is encoded into a codeword $\mathbf C^{(\mathbf u)} = [C_0^{(\mathbf u)}, C_1^{(\mathbf u)}, \cdots , C_{n-1}^{(\mathbf u)}]\in \mathbb F_q^n.$ If we call $u(x) = u_0+u_1x+\cdots + u_{k-1}x^{k-1}$ the data polynomial, then we have that $$ C_j^{(\mathbf u)} = \sum_{i=0}^{k-1}u_i\alpha_j^i = u(\alpha^j);\quad \mathbf C^{(\mathbf u)} = [u(\alpha_0), u(\alpha_1), \cdots, u(\alpha_{n-1})]\tag{1}.$$ The set of $q^k$ codewords is a linear nonsystematic $[n,k, d=n-k+1]$ Reed-Solomon code, a $k$-dimensional subspace of $\mathbb F_q^n.$ The generator matrix of this code is the $k\times n$ Vandermonde matrix: $$\mathbf G = \left[\begin{matrix}1 & 1 & \cdots & 1\\ \alpha_0 &\alpha_1 & \cdots & \alpha_{n-1} \\ \alpha_0^2 &\alpha_1^2 & \cdots & \alpha_{n-1}^2\\ \vdots & \vdots & \ddots & \vdots\\ \alpha_0^{k-1} &\alpha_1^{k-1} & \cdots & \alpha_{n-1}^{k-1} \end{matrix} \right], \tag{2}$$ and $\mathbf C^{(\mathbf u)} = \mathbf{uG}.$

Original formulation $[q-1,k]$ Reed-Solomon code is a cyclic code.
We now consider the special case $n=q-1$ and $\alpha_j = \alpha^j, 0 \leq j \leq q-2$ where $\alpha$ is a primitive element (a.k.a. a generator of the multiplicative group $\mathbb F_q^*$) of $\mathbb F_q$. Note that the $\alpha_i$'s are no longer arbitrarily chosen distinct elements of $\mathbb F_q$, but all the nonzero elements of $\mathbb F_q$ arranged in a specific order. This $[q-1,k]$ "original definition" Reed-Solomon code is a cyclic code meaning that if $\mathbf C$ is any codeword in this code, so are all the cyclic shifts of this codeword also codewords in this Reed-Solomon code. Consider a generic codeword $\mathbf C^{(\mathbf u)} = [u(\alpha^0), u(\alpha^1), \cdots, u(\alpha^{q-2})]$. Its left cyclic shift by one place is $$[u(\alpha^1), \alpha^2, \cdots, u(\alpha^{q-2}), u(\alpha^0)].$$ We can express this vector as $$[u(\alpha^1), \alpha^2, \cdots, u(\alpha^{q-2}), u(\alpha^0)] = \mathbf C^{\hat{\mathbf u}} = [\hat u(\alpha^0), \hat u(\alpha^1), \cdots, \hat u(\alpha^{q-2})]$$ where \begin{align}\hat{u}(x) &= u(\alpha x) = u_0 + (u_1\alpha) x + \cdots + (u_{k-1}\alpha^{k-1}) x^{k-1}\\ \hat{\mathbf{u}} &= [u_0, u_1\alpha, u_2\alpha^2, \cdots, u_{k-1}\alpha^{k-1}] .\end{align} Note that $$\hat{u}(\alpha^0) = \hat{u}(1) = \sum_{i=0}^{k-1} u_i \alpha^i = u(\alpha^1)$$ exactly as we want it to be. More generally, $$\hat{u}(\alpha^j) = \sum_{i=0}^{k-1} \hat{u}_i (\alpha^j)^i = \sum_{i=0}^{k-1} (u_i\alpha^i) \alpha^{ji} = u(\alpha^{j+1}).$$ Thus, the cyclic shift by one place of codeword $\mathbf C^{(\mathbf u)}$ is the codeword $\mathbf C^{(\hat{\mathbf u})}$. Additional left cyclic shifts just give more iterations of this construction. Moral: This $[q-1,k]$ "original definition" Reed-Solomon code is a cyclic code.

Original formulation $[q-1,k]$ cyclic Reed-Solomon code is a BCH code.
So, this specific $[q-1,k]$ "original definition" Reed-Solomon code is in fact a cyclic code even though we are not describing its encoding in the usual way that cyclic codes are described. Is this cyclic code a BCH code? Yes; in fact, it is a (primitive) narrow-sense BCH code. To understand this claim, let's first associate a codeword polynomial $C(z)$ with each codeword $\mathbf C$ of this code.

$$C(z) = C_0 + C_1z^{1} + \cdots + C_{q-2}z^{q-2}.\tag{3}$$

Let's think of the codeword $\mathbf C^{(\mathbf u)}$ as the finite-field Fourier transform of the extended information vector $\mathbf U = [u_0, u_1, \cdots, u_{k-1}, 0, 0, \cdots, 0]$ of length $q-1$. Thus, $$\mathbf C^{(\mathbf u)}_j = \sum_{i=0}^{q-2} U_i\alpha^{ij} = \sum_{i=0}^{k-1} u_i\alpha^{ij} = u(\alpha^j) $$

The $j$-th element of the inverse Fourier transform of $\mathbf C^{(\mathbf u)}$ is $$U_j = \frac{1}{q-1} \sum_{i=0}^{q-2} \mathbf C_i^{(\mathbf u)}\alpha^{-ij} = -\sum_{i=0}^{q-2} \mathbf C_i^{(\mathbf u)}\alpha^{-ij} = - C^{\mathbf u}(\alpha^{-j}) = -C^{\mathbf u}(\alpha^{q-1-j}).$$ But $U_j = 0$ for $j= k, k+1, \cdots, q-2$ and so we see that the codeword polynomial of each codeword in the $[q-1,k]$ "original formulation" Reed-Solomon code has $$\alpha^{q-1-k}, \alpha^{q-1-(k+1)}, \cdots, \alpha^{q-1-(q-3)}, \alpha^{q-1-(q-2)}\tag{4}$$ as zeros. Since $\alpha^{q-1} = 1$, note for future reference that the list of zeroes can also be written as $$\alpha^{-k}, \alpha^{-(k+1)}, \alpha^{-(k+2)}, \cdots, \alpha^{-(q-2)} \tag{4*}$$ Re-ordering the list of zeroes, we see that each codeword polynomial is a multiple of $$g(z) = \prod_{i=1}^{q-1-k}(z-\alpha^i) = \prod_{i=1}^{d-1}(z-\alpha^i).\tag{5}$$ The generator polynomial of degree $q-1-k$ has as roots $d-1$ consecutive roots $\alpha, \alpha^2, \cdots, \alpha^{d-1}$ and so we see that the $[q-1,k,d]$ original formulation Reed-Solomon code over $F_q$ is also a narrow-sense primitive BCH code over $\mathbb F_q$. The original formulation Reed-Solomon code is not a systematic code, and neither is this BCH code a systematic code.

The list of roots of $g(z)$ stated above doesn't match the list in the OP's question. Why?
$\alpha,\alpha^2, \cdots, \alpha^{q-1-k}$, the list of roots of $g(z)$ stated above doesn't match the list $\alpha^k, \alpha^{k+1}, \cdots, \alpha^{q-2}$ stated in the OP's question. The reason for this is that the conventional notation used by coding practitioners is different from the notation used above. By default, in all of the above and especially $(3)$, the unstated assumption is that $C_0$ is the first symbol transmitted, followed by $C_1$, then $C_2$, etc. and finally ending with the transmission of $C_{q-1}$ as the last symbol transmitted. Coding practitioners prefer to think that the codeword polynomial is $$\tilde C(z) = C_0z^{q-2} + C_1z^{q-3} + \cdots + C_{q-3}z^{1} = C_{q-2} = z^{q-2}C(z^{-1}).\tag{6}$$

The codeword symbols are still being transmitted as $C_0$ is the first symbol transmitted, followed by $C_1$, then $C_2$, etc. and finally ending with the transmission of $C_{q-1}$ as the last symbol transmitted; all that has changed is the codeword polynomial that the coding practitioner is associating with this codeword.

For any polynomial $g(z)$, the zeroes of the reversed polynomial $\tilde g(z) = z^{deg~g}g(z^{-1})$ are the reciprocals of the zeroes of $g(z)$. So, with this convention, the zeroes of the generator polynomial $\tilde g(z)$ are the reciprocals $\alpha^k, \alpha^{k+1}, \cdots, \alpha^{q-2}$ of the zeroes of $g(z)$ as stated in $(4*)$. This list matches the OP's claim exactly.

Why do coding practitioners use reversed polynomials ?
The first thing that a BCH decoder does is to compute the syndrome which uses polynomial evaluation at the $n$-th roots of unity. Other computations also require polynomial evaluations, and in such cases, it is very convenient to have the polynomial coefficients become available in descending order of degree so that Horner's method for polynomial evaluation can be used. It is more troublesome to have the coefficients become available in ascending order. This convention extends even to the data polynomials for systematic BCH encoders. If the data enters the encoder in order $u_0$ followed by $u_1$ etc, it is best to define the data polynomial as $\tilde u(z) = z^{k-1}u(z^{-1})$ which is the reverse of the data polynomial defined earlier. The transmitted codeword then is $$[\tilde u_{k-1}, \tilde u_{k-2}, \cdots, \tilde u_0, n-k \text{ parity symbols} ] = [u_0, u_1, \cdots, u_{k-1}, n-k \text{ parity symbols } ],$$ that is, the data symbols appear in the transmitted codeword in exactly the same order as they entered the encoder. The transmitted codeword polynomial is $$\tilde C(z) = z^{n-k}\tilde u(z) + \cdots = u_0z^{n-1} + u_1z^{n-2} + \cdots + u_{k-1}z^{n-k} + \cdots $$ To find the rest of the codeword polynomial coefficients, what the encoder does is to divide $u_0z^{n-1} + u_1z^{n-2} + \cdots + u_{k-1}z^{n-k}$ by the generator polynomial to get a remainder polynomial $r(z)$ of degree $n-k-1$ (and hence $n-k$ coefficients). The codeword polynomial then can be written as $$\tilde C(z) = z^{n-k}\tilde u(z) + \cdots = u_0z^{n-1} + u_1z^{n-2} + \cdots + u_{k-1}z^{n-k} - r(z) $$ and is a multiple of the generator polynomial $\tilde g(z)$.

Finally, an answer to the OP's observation:
The set of codewords comprising the $[q-1,k]$ BCH code described above is exactly the same as the set of codewords comprising the original formulation Reed-Solomon code with "evaluation points" $\alpha^0, \alpha^1, \alpha^2, \cdots, \alpha^{q-2}$. Now, there exists a degree-$(k-1)$ data polynomial $v(x)$ such that $v(\alpha^i) = u_i, 0 \leq i \leq k-1$, which can be found via Lagrange interpolation as suggested in the Wikipedia article on Reed-Solomon codes that is cited by the OP. Then, \begin{array}{lllllllll}\mathbf C^{(\mathbf v)} &= [v(\alpha^0)& v(\alpha^1)& \cdots & v(\alpha^{k-1})& v(\alpha^k)& \cdots & v(\alpha^{q-2})]\\ &= [u_0 & u_1& \cdots& u_{k-1} &v(\alpha^k)& \cdots& v(\alpha^{q-2})] \end{array} is indeed a codeword in the original formulation Reed-Solomon code, as well as a codeword in the BCH code whose generator polynomial has zeroes $\alpha^k, \alpha^{k+1}, \cdots, \alpha^{q-2}$ exactly as found by the OP. What I don't have is what the OP seeks: a reference for where this proof can be found in the literature.

There is a video I could use for a reference in addition to your answer. Skip to 6:51 into dual view of RS codes by Mary Wootters. The video shows that if a non-systematic original view encoded RS codeword of values is viewed as coefficients to a polynomial $c(x)$, $c(x) = 0$ for $x$ = {$\gamma$, $\gamma^2$, ..., $\gamma^{n-k}$}, with $n = q-1$, and using $\alpha$, roots are {$\alpha$, $\alpha^2$, ..., $\alpha^{-k}$}. Reversing the codeword results in roots {$\alpha^k$, $\alpha^{k+1}$ , ... , $\alpha^{q-2}$}. Thank for the help with this. — rcgldr, Jan 25 '25 at 14:51
@rcgldr Since comments are often not read, I suggest that you convert your comment into an answer, and possibly even accept your own answer since it is the clearest answer to "I'm looking for a reference...." — Dilip Sarwate, Jan 25 '25 at 19:58
The reference would only be needed for something like the Wikipedia article, but I don't plan on updating it. I looked at an old textbook, Error Correction Codes by Peterson and Weldon (copyrights 1961, 1972), and it refers to RS codes as a sub-class of BCH codes (no mention of "original" view). It also shows a hardware diagram for systematic encoding, so systematic BCH encoding occurred around that time. My observation with the parameters used in Wootter's video is that systematic encoding for original and BCH view produce the same codewords. — rcgldr, Jan 25 '25 at 20:07

Identical systematic encoding for original view and BCH view Reed Solomon code - reference?

1 Answers1

Linked