
I am a PhD student working in applied mathematics. I am studying different kinds of binary matrices with uniform column weight and prescribed inner products between their columns. I am particularly interested in binary matrices arising from narrow-sense primitive binary BCH codes. In the first part I describe what has already been proved, or can be proved, about the BCH code I am dealing with. What I want proved is presented at the end as a Conjecture.

Let "$[495]$-BCH code" denote the narrow-sense primitive binary BCH code of length $1023$ with designed minimum distance $495$. Let "$[496]$-BCH code" denote the even-parity subcode of the $[495]$-BCH code, consisting of all its code vectors of even weight. Let $a_j$ denote the number of code vectors of weight $j$ in this even-parity BCH code. Please note that the statements written in "italic" font already have a mathematical proof.

Now, from T. Kasami's work, the weight distribution of the $[496]$-BCH code is as follows:

$$a_0 = 1,\qquad a_{496} = 16368,\qquad a_{512} = 1023,\qquad a_{528} = 15376.$$

Now, let $\mathcal{H} := \{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3,\ldots,\mathbf{b}_{16368}\}$ denote the set of all code vectors of weight $496$, let $\mathcal{G}:= \{\mathbf{c}_1 , \mathbf{c}_2 , \mathbf{c}_3 ,\ldots, \mathbf{c}_{1023} \}$ denote the set of all code vectors of weight $512$, and let $\mathcal{T}:= \{\mathbf{d}_1 , \mathbf{d}_2 , \mathbf{d}_3 ,\ldots, \mathbf{d}_{15376}\}$ denote the set of all code vectors of weight $528$.

It follows from this weight distribution that if $\mathbf{b}_i, \mathbf{b}_j \in \mathcal{H}$ are two distinct code vectors (i.e. $i \neq j$), then their inner product over the field of real numbers can take only one of the three values $248, 240, 232$. This is easily proved, since the weight of $\mathbf{b}_i \oplus \mathbf{b}_j$ equals $496 + 496 - 2\langle \mathbf{b}_i , \mathbf{b}_j \rangle$ and must be one of $496, 512, 528$. Note that $\langle \mathbf{b}_i , \mathbf{b}_j \rangle = 248 \implies \mathbf{b}_i \oplus \mathbf{b}_j \in \mathcal{H}$, $\langle \mathbf{b}_i , \mathbf{b}_j \rangle = 240 \implies \mathbf{b}_i \oplus \mathbf{b}_j \in \mathcal{G}$, $\langle \mathbf{b}_i , \mathbf{b}_j \rangle = 232 \implies \mathbf{b}_i \oplus \mathbf{b}_j \in \mathcal{T}$, where $\oplus$ is element-wise addition modulo $2$. Now define the matrix $B$ as $$B:= \big[\mathbf{b}_1 \big| \mathbf{b}_2 \big| \mathbf{b}_3 \big| \cdots \big| \mathbf{b}_{16368}\big],$$ that is, the columns of $B$ are all the code vectors of weight $496$. A simple calculation shows that each row of $B$ contains exactly $7936$ ones, and I have a proof of this as well.
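The arithmetic behind these facts can be sketched in a few lines of Python. This is only a numerical sanity check, not a proof. The helper name `common_ones` is made up for illustration, and the row count assumes that every row of $B$ has the same number of ones (which cyclic invariance of the code suggests).

```python
# Weight distribution quoted from Kasami's result above.
weights = {0: 1, 496: 16368, 512: 1023, 528: 15376}
total_words = sum(weights.values())  # should equal 2**15

def common_ones(w1, w2, w_sum):
    """Real inner product <x, y> of binary vectors with weights w1, w2
    whose mod-2 sum has weight w_sum: wt(x + y) = w1 + w2 - 2<x, y>."""
    return (w1 + w2 - w_sum) // 2

# Possible inner products of two distinct weight-496 words.
inner_values = [common_ones(496, 496, w) for w in (496, 512, 528)]

# Ones per row of B, assuming all 1023 rows have equal weight:
# (number of columns) * (column weight) / (number of rows).
ones_per_row, remainder = divmod(16368 * 496, 1023)

print(total_words, inner_values, ones_per_row, remainder)
```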

Now I explain my finding about the above BCH code, for which I do not have a mathematical proof:

Conjecture 1: Let $j\in [16368] := \{1,2,3,\ldots,16368\}$ be arbitrary. Define the set $S$ as $$ S:= \{ \mathbf{b}_j \oplus \mathbf{b}_i : i \in [16368]\setminus \{j\} \}.$$ Then the set $S$ can be partitioned into three pairwise disjoint subsets $\mathcal{O}, \mathcal{P}, \mathcal{Q}$ such that $\big|\mathcal{O}\big| = 8400,\ \big|\mathcal{P}\big| = 527,\ \big|\mathcal{Q}\big| = 7440$, and in addition $\mathcal{O}\subset \mathcal{H},\ \mathcal{P}\subset \mathcal{G}$, and $\mathcal{Q}\subset \mathcal{T}$.
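The conjectured sizes can be checked against two counting identities. The following Python sketch is a consistency check only, under the assumption that the row weight $7936$ stated above holds. It counts $\sum_{i\neq j}\langle \mathbf{b}_j,\mathbf{b}_i\rangle$ position by position: each of the $496$ ones of $\mathbf{b}_j$ meets $7936-1$ ones from other columns. The function name `solve_given_middle` is hypothetical.

```python
N_WORDS, WEIGHT, ROW_ONES = 16368, 496, 7936
conjectured = {248: 8400, 240: 527, 232: 7440}  # inner product -> count

# Identity 1: the three classes together contain all other columns.
total_pairs = N_WORDS - 1
counts_ok = sum(conjectured.values()) == total_pairs

# Identity 2: total inner product with b_j, counted by position.
total_inner = WEIGHT * (ROW_ONES - 1)
inner_ok = sum(v * c for v, c in conjectured.items()) == total_inner

def solve_given_middle(p):
    """If the number p of sums of weight 512 is taken as given, the two
    identities form a 2x2 linear system for the other two counts."""
    rest = total_pairs - p
    rest_inner = total_inner - 240 * p
    o, rem = divmod(rest_inner - 232 * rest, 248 - 232)
    return (o, rest - o) if rem == 0 else None

print(counts_ok, inner_ok, solve_given_middle(527))
```

Both identities hold for the conjectured values. With $527$ as input the system returns the other two sizes, so the three numbers are at least mutually consistent. This is not a proof unless the row weight and the count $527$ are established independently.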

I have tried a lot but cannot find a direct way of proving Conjecture 1.

  • I'm fairly sure this particular BCH-code can be usefully described by using the trace function, which may or may not help. But I am not sure I have understood every detail of the question. The binary words of this code are elements of the vector space $\Bbb{F}_2^{1023}$. This means that if $x=(x_1,x_2,\ldots,x_{1023})$ and $y=(y_1,y_2,\ldots,y_{1023})$ are any two such words, then their inner product $$\langle x, y\rangle:=\sum_{i=1}^{1023}x_iy_i$$ is naturally an element of the binary field $\Bbb{F}_2$, and can thus only take values $0$ or $1$. – Jyrki Lahtonen Jul 21 '24 at 05:05
  • I guess you wanted to count the number of their common zeros (which is a natural number in the range $[0,1023]$). This is usually not called an inner product. However, it is an interesting quantity (for example for the purposes of coding theory). The tool for studying them is to map binary vectors into vectors of real numbers in $\{+1,-1\}^{1023}\subset\Bbb{R}^{1023}$ and use the inner product of real vectors. – Jyrki Lahtonen Jul 21 '24 at 05:09
  • Which often leads to simple questions about [tag:exponential-sums]. I need to recall a number of facts, but I'm somewhat optimistic about those tools saying something about your problem. – Jyrki Lahtonen Jul 21 '24 at 05:11
  • Oh, the mapping from $\Bbb{F}_2$ to $\{+1,-1\}$ is to simply map $0\mapsto +1$, $1\mapsto -1$. Or $x\mapsto (-1)^x$ if you like. The latter generalizes to a (canonical) additive character of an extension field. The relevance is that the usual inner product of real vectors then calculates what you seem to want to calculate. If $x$ and $y$ (binary vectors of length $\ell$) differ at $d$ positions, and $s(x)$, $s(y)$ are the corresponding $\pm1$-vectors, then the inner product $$\langle s(x),s(y)\rangle=\ell-2d.$$ – Jyrki Lahtonen Jul 21 '24 at 05:25
  • ^ ..... the number of their common ones... – Jyrki Lahtonen Jul 22 '24 at 04:33
  • This code is one whose weight enumerator was discovered by Kasami in 1966; his work was reported in an unpublished 1967 report from the Coordinated Science Laboratory of the University of Illinois at Urbana-Champaign, and at a small combinatorics conference. It was publicized (and extended) by Elwyn Berlekamp in Chapter 16 of his 1968 book Algebraic Coding Theory. See also lots of references to "the small set of Kasami sequences" which are the cycle representatives of the $33$ cycles of period $1023$ from this cyclic code. (Berlekamp was my PhD advisor and I was with CSL 1973-2010) – Dilip Sarwate Jul 22 '24 at 14:02
  • @JyrkiLahtonen ; I am sorry! I gave too many unnecessary details which might have confused you. Actually, $\langle x, y \rangle $ is the standard inner product and not over the binary field. So, $\langle x, y \rangle \in \mathbb{Z}_+$. – Dark Forest Jul 22 '24 at 17:19
  • @JyrkiLahtonen ; I have the mathematical proof for their common number of ones. Please read my question from the point where I started writing in bold letters. – Dark Forest Jul 22 '24 at 17:23
  • I was fairly certain that this is the $0/1$ version of the small Kasami code, but did not remember whether that is also a BCH-code. Thanks for the confirmation, @DilipSarwate. I will try and solve the OP's question with the trace representation. Cannot commit to a schedule, unfortunately. – Jyrki Lahtonen Jul 22 '24 at 17:50
  • @DilipSarwate ; I am aware of Kasami's work and therefore I am not asking for a proof of the weight enumerator of the BCH code. I am sorry for not clearly describing my problem statement. Actually, the statements written in "italic" font do already have a mathematical proof. What I am not able to prove is described in the "header" of my question. – Dark Forest Jul 22 '24 at 18:00
  • @JyrkiLahtonen ; The code I have mentioned is a BCH code whose weight distribution was proved by T. Kasami. Please try to prove what I have written under the header; I have a mathematical proof of all the other statements I made in my question. – Dark Forest Jul 22 '24 at 18:06
  • @DarkForest, please edit your question stating precisely what you want proved, maybe as a separate first paragraph. – kodlu Jul 22 '24 at 21:11
  • @kodlu ; I have edited my question. What I want to be proved is now presented as Conjecture 1 in the last part of my question. I am sorry that I can't present Conjecture 1 in the first paragraph, because I had to give sufficient mathematical details first which can be useful in proving Conjecture 1. – Dark Forest Jul 22 '24 at 22:20
  • The standard inner product takes values in $\Bbb{F}_2$. At least in coding theory. After all, you can't expect natural numbers to come out if you add elements of $\Bbb{F}_2$ together. In the same vein: the sum of two words of a binary code is automatically the componentwise addition in $\Bbb{F}_2$. What else could it be in the vector space $\Bbb{F}_2^n$?? In other words, you don't need to use $\oplus$ or some such notation to emphasize modulo two arithmetic. I realize this may not be second nature to you, but it takes me extra mental effort to think it could be something else. – Jyrki Lahtonen Jul 23 '24 at 06:06
  • @JyrkiLahtonen ; I apologise if I didn't make it clear that when I write $\langle a, b \rangle$ I mean the inner product defined over the field of real numbers and not over $\mathbb{F}_2$. In other words, $\langle a, b \rangle$ gives the number of common ones of the code vectors $a$ and $b$. – Dark Forest Jul 23 '24 at 10:31
  • Sorry about being more than a bit edgy. Yeah, the tools I could immediately recall only reproduced Kasami's result. I had a vague recollection of having seen a description of the low weight words in terms of those exponential sums, but couldn't find it. Maybe a more roundabout method will work? Thinking... – Jyrki Lahtonen Jul 23 '24 at 15:55
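
The two notions of inner product discussed in the comments can be compared numerically. The Python sketch below uses random binary vectors (not codewords) of length $1023$. It checks the identity $\langle s(x),s(y)\rangle=\ell-2d$ for the $\pm1$ images, and the relation between the number of common ones and the Hamming distance $d$. The helper names `to_pm1`, `dot` and `hamming` are illustrative only.

```python
import random

def to_pm1(x):
    """Map a 0/1 vector to a +1/-1 vector via b -> (-1)**b."""
    return [(-1) ** b for b in x]

def dot(a, b):
    """Ordinary real inner product."""
    return sum(p * q for p, q in zip(a, b))

def hamming(x, y):
    """Number of positions where x and y differ."""
    return sum(p != q for p, q in zip(x, y))

random.seed(0)  # arbitrary seed, just for reproducibility
ell = 1023
x = [random.randint(0, 1) for _ in range(ell)]
y = [random.randint(0, 1) for _ in range(ell)]
d = hamming(x, y)

pm1_identity = dot(to_pm1(x), to_pm1(y)) == ell - 2 * d
# Common ones of the 0/1 vectors: wt(x) + wt(y) - d = 2 <x, y>.
common_ones_identity = 2 * dot(x, y) == sum(x) + sum(y) - d
print(pm1_identity, common_ones_identity)
```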

2 Answers


Work in progress:

Perhaps looking at the (nonzero) cycles of this $[1023,15,496]$ cyclic code will help. The code's parity-check polynomial has two factors: a primitive polynomial $h_{10}(x)$ of degree $10$ and a primitive polynomial $h_5(x)$ of degree $5$. The $2^{15}-1 = 32,767$ nonzero codewords can be partitioned into $32$ cycles of $1023$ codewords each, and one cycle of $31$ codewords.

The single cycle of $31$ codewords is just the nonzero codewords of the $[31,5,16]$ shortened and expurgated first-order Reed-Muller code (of length $31$ and parity-check polynomial $h_5(x)$) that have been repeated $33$ times to make the length $31\times 33=1023$ and weight $16\times 33 = 528$. Let $\mathbf v$ denote a cycle representative and $\mathsf T$ denote the "cyclic-shift-by-one-place" operator. Then, the $31$ codewords are $\mathbf v$, $\mathsf T\mathbf v$, $\mathsf T^2\mathbf v$, $\mathsf T^3\mathbf v, \cdots$, $\mathsf T^{30}\mathbf v$.

The 32 cycles of 1023 codewords are comprised of

  • $1$ cycle of weight $512$
    This cycle consists of the nonzero codewords of the $[1023,10,512]$ shortened and expurgated first-order Reed-Muller code of length $1023$ whose parity-check polynomial is $h_{10}(x)$. Let $\mathbf u$ denote a cycle representative. Then the codewords comprising this cycle are $\mathbf u$, $\mathsf T\mathbf u$, $\mathsf T^2\mathbf u$, $\mathsf T^3\mathbf u, \cdots$, $\mathsf T^{1022}\mathbf u$.

  • $16$ cycles of weight $496$
    Each cycle representative is of the form $\mathbf u + \mathsf T^c\mathbf v$ for some $c \in C$ where $C \subset \{0,1,2, \cdots, 30\}$ has $16$ elements. The codewords comprising this cycle are $\mathbf u + \mathsf T^c\mathbf v,$ $\mathsf T(\mathbf u + \mathsf T^c\mathbf v) = \mathsf T\mathbf u + \mathsf T^{c+1}\mathbf v,$ $\mathsf T^2\mathbf u + \mathsf T^{c+2}\mathbf v$, $\mathsf T^3\mathbf u + \mathsf T^{c+3}\mathbf v, \cdots$, $\mathsf T^{1022}\mathbf u + \mathsf T^{c+1022}\mathbf v$. Note that since $\mathbf v$ is of period 31, the exponent of $\mathsf T$ can be reduced modulo $31$, that is, the $j$-th codeword in the cycle can be expressed as $\mathsf T^j\mathbf u + \mathsf T^{(c+j)\bmod 31}\mathbf v$.

  • $15$ cycles of weight $528$
    Each cycle representative is of the form $\mathbf u + \mathsf T^{c^\prime}\mathbf v$ for some $c^\prime \in C^\prime$ where $C^\prime = \{0,1,2, \cdots, 30\} -C$ has $15$ elements. Similar remarks apply with respect to the codewords comprising each cycle.

Let's check the statements numerically: $16\times 1023 = 16368 = a_{496}$, $15\times 1023 + 31 = 15376 = a_{528}$ (the extra $31$ codewords being those of the short cycle), and of course $1\times 1023 = 1023 = a_{512}$.

The set of integers $\{0,1,2, \cdots ,30\}$ can be partitioned into $6$ cyclotomic cosets of size $5$ and $\left\{0\right\}$. Since $|C| = 16, |C^\prime| = 15$, I conjecture that $C$ is the union of three cyclotomic cosets plus $\left\{0\right\}$ while $C^\prime$ is the union of the other three cyclotomic cosets.
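The partition into cyclotomic cosets can be listed explicitly. The short Python sketch below computes the $2$-cyclotomic cosets modulo $31$. It says nothing about which cosets actually make up $C$, and the function name `cyclotomic_cosets` is just an illustrative helper.

```python
def cyclotomic_cosets(n, q=2):
    """Partition {0, ..., n-1} into orbits of s -> q*s mod n."""
    seen, cosets = set(), []
    for s in range(n):
        if s in seen:
            continue
        coset, t = [], s
        while t not in coset:
            coset.append(t)
            t = (t * q) % n
        seen.update(coset)
        cosets.append(sorted(coset))
    return cosets

cosets = cyclotomic_cosets(31)
for c in cosets:
    print(c)
```

The output consists of $\{0\}$ and six cosets of size $5$, the first being $\{1,2,4,8,16\}$.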

Let us now consider the $16368$ codewords of weight $496$ but instead of labeling them with integers $1$ through $16368$ as the OP has done, we arrange them into a $16\times 1023$ array with the codeword in the $i$-th row and $j$-th column being $\mathbf b_{i,j} =\mathsf T^j\mathbf u + \mathsf T^{(c_i+j)\bmod 31}\mathbf v$. Note that each of $\mathbf v$, $\mathsf T\mathbf v$, $\mathsf T^2\mathbf v$, $\mathsf T^3\mathbf v, \cdots$, $\mathsf T^{30}\mathbf v$ occurs in exactly $33$ sums in each row.

After these preliminaries, let's take up the OP's question: Add $\mathbf b_{r,s}$ to all the other codewords in the $16\times 1023$ array. This results in $16367$ codewords of the $[1023,15,496]$ code of which $527$ have weight $512$, $8400$ have weight $496$ and $7440$ have weight $528$. Why?

Well, the $527$ codewords of weight $512$ are easily explained. $\mathbf b_{r,s}$ is of the form $\mathsf T^s\mathbf u + \mathsf T^\ell \mathbf v$ for some $\ell\in \{0,1,\cdots,30\}$ and so when we add $\mathbf b_{r,s}$ to a $\mathbf b_{r,s^\prime}$ (same row), in $32$ instances, the two $\mathsf T^\ell \mathbf v$'s cancel out leaving us with the sum of two cyclic shifts of $\mathbf u$ which is just another shift of $\mathbf u$, i.e. a codeword of weight $512$. In all other rows, this cancellation occurs in $33$ places, giving us that the modified array has $32+ 15\times 33 =527$ codewords of weight $512$.
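
The count of cancellations can be reproduced with a small Python sketch. The actual offset set $C$ is not known here, so the list `C` below is a hypothetical placeholder. Only the fact that the $16$ offsets are distinct residues modulo $31$ matters for the count. The function name `count_cancellations` is likewise made up.

```python
PERIOD, LENGTH = 31, 1023

# Hypothetical offsets c_0, ..., c_15: any 16 distinct residues mod 31.
C = list(range(16))

def count_cancellations(C, r, s):
    """Count array positions (i, j) != (r, s) whose v-component
    T^((c_i + j) mod 31) v equals that of b_{r,s}."""
    target = (C[r] + s) % PERIOD
    return sum(
        1
        for i, c in enumerate(C)
        for j in range(LENGTH)
        if (i, j) != (r, s) and (c + j) % PERIOD == target
    )

# The count does not depend on which entry b_{r,s} is chosen.
counts = {count_cancellations(C, r, s) for r in (0, 7, 15) for s in (0, 1, 500, 1022)}
print(counts)
```

Since the offsets are distinct, a cancellation in a different row never occurs in the same column $s$, so each cancelling sum is a sum of two different shifts of $\mathbf u$.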

Dilip Sarwate

I suspect that Dilip's approach may be more likely to bear fruit, but I am adding the following more algebraic point of view to the mix anyway. I will study Dilip's description later in the fond hope that one of us can take it further :-)


Let $F=\Bbb{F}_{2^n}=GF(2^n)$ be the finite field with $q=2^n$ elements, and let $E=\Bbb{F}_{2^m}=GF(Q), m=2n, Q=q^2,$ be its unique quadratic extension. The case $n=5$ is relevant here, but I want to take a look at the more general $n$ for the time being. Let $$tr^n_1:F\to \Bbb{F}_2, x\mapsto \sum_{i=0}^{n-1}x^{2^i}$$ be the (absolute) trace function of $F$, $$ tr^m_1:E\to \Bbb{F}_2, x\mapsto \sum_{i=0}^{m-1}x^{2^i} $$ the absolute trace of $E$, and $tr^m_n:E\to F, x\mapsto x+x^q$ the relative trace. It is well known that $tr^m_1=tr^n_1\circ tr^m_n$ (called transitivity of the trace). It is also well known that the words of the Kasami code can be thought of as the following functions $$ c(\alpha,\beta):E^*\to \Bbb{F}_2, x\mapsto tr^m_1(\alpha x)+tr^n_1(\beta x^{q+1}),\tag{1} $$ where $\alpha$ ranges over $E$ (so $2^m$ choices) and $\beta$ ranges over $F$ ($2^n$ choices). Altogether there are $2^{m+n}=2^{3n}$ codewords (when $n=5$ we get a code of $2^{15}$ words).

It is easy to see that the choice $\beta=0$ gives exactly the $2^m$ words of the (shortened) first order Reed-Muller code $R(1,m)^*$ (aka the simplex code): the zero word for $\alpha=0$, and $2^m-1$ words of weight $Q/2=2^{m-1}$ for $\alpha\neq0$. The rest of the words have weights that deviate from the expected $Q/2$ by $q/2$ in either direction. I want to characterize the pairs $(\alpha,\beta)$ that yield the words of low weight $=(Q-q)/2$.

To that end I use the direct product decomposition (note to the experts: Niho's method for studying certain correlation functions is not unlike this, but that method won't work as cleanly as it does here) $$ E^*=F^*\times S,\tag{2} $$ where $$S=\{\alpha\in E\mid \alpha^{q+1}=1\}$$ is cyclic of order $q+1$, and is the $E/F$ analogue of the unit circle of the complex plane $\Bbb{C}/\Bbb{R}$. Observe that $F^*$ is cyclic of order $q-1$, and as $q$ is even, $\gcd(q-1,q+1)=1$, whence their product is direct, and all of $E^*$.

For all $y\in F$ we have $y^q=y$, and for all $\eta\in S$ we have $\eta^q=\eta^{-1}$. Due to the decomposition $(2)$ we can write $x=y\eta$ in $(1)$, and let $y$ range over $F^*$, and $\eta$ range over the unit circle $S$. We end up with $$ c(\alpha,\beta)(y\eta)=tr^n_1(y[\alpha\eta+\alpha^q\eta^{-1}])+tr^n_1(\beta y^2) =tr^n_1(y[\sqrt{\beta}+\alpha\eta+\alpha^q\eta^{-1}]),\tag{3} $$ where I used (in addition to the transitivity of the trace and the earlier results) the fact that $tr^n_1(y^2)=tr^n_1(y)$ for all $y\in F$.
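
Spelled out, the two rewriting steps behind $(3)$ are as follows. This is a sketch that uses only $y^q=y$, $\eta^{q+1}=1$, transitivity of the trace, and $tr^n_1(z^2)=tr^n_1(z)$ applied to $z=\sqrt{\beta}\,y\in F$:

$$\begin{aligned} tr^m_1(\alpha y\eta) &= tr^n_1\big(\alpha y\eta+(\alpha y\eta)^q\big) = tr^n_1\big(y[\alpha\eta+\alpha^q\eta^{-1}]\big),\\ tr^n_1\big(\beta (y\eta)^{q+1}\big) &= tr^n_1\big(\beta y^{q+1}\eta^{q+1}\big) = tr^n_1(\beta y^2) = tr^n_1\big((\sqrt{\beta}\,y)^2\big) = tr^n_1(\sqrt{\beta}\,y). \end{aligned}$$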

Let's look at $(3)$ by fixing a value of $\eta$, with $y$ still ranging over $F^*$. That subword (with $q-1$ bits) is a word of the Reed-Muller code $R(1,n)^*$. It is the all zeros word if and only if the quantity in square brackets is equal to zero, and a word of weight $q/2$ otherwise. Working under the earlier assumption that $\beta\neq0$, we see (taking Kasami's results into account) that the bracket quantity vanishes either for no choice of $\eta$ or for exactly two choices of $\eta$; the latter case is the one of interest to us, as it yields the low weight words.
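
As a worked count (assuming, as stated, that each of the $q+1$ subwords is either the zero word or has weight $q/2$): if the bracket vanishes for exactly $k$ values of $\eta\in S$, then the weight of $c(\alpha,\beta)$ is $(q+1-k)\,q/2$, so

$$k=2:\ (q-1)\frac q2=\frac{Q-q}2,\qquad k=0:\ (q+1)\frac q2=\frac{Q+q}2.$$

For $n=5$ (so $q=32$, $Q=1024$) these are $31\cdot 16=496$ and $33\cdot 16=528$, matching the two non-central weights in Kasami's distribution.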

Next I rewrite the square bracket coefficient using the variable $s=\alpha^{-(q-1)/2}\eta$ (assuming $\alpha\neq0$). Observe that $\alpha^{-(q-1)/2}\in S$ for all $\alpha\neq0$. Rewriting then gives that $$[\sqrt{\beta}+\alpha\eta+\alpha^q\eta^{-1}]=0$$ if and only if $$ s+\frac1s=\sqrt{\beta/\alpha^{q+1}}. $$ Here $\sqrt{\beta/\alpha^{q+1}}\in F^*$. The quadratic equation $$ x+\frac1x=\sqrt{\beta/\alpha^{q+1}}\tag{4} $$ thus has its coefficients in $F$. By the well known solvability criterion the roots of $(4)$ are in $F$ if and only if $tr^n_1(\alpha^{q+1}/\beta)=0$. So if $tr^n_1(\alpha^{q+1}/\beta)=1$, then the solutions of $(4)$ will be in the quadratic extension field $E$. Furthermore, the roots of $(4)$ are always each other's reciprocals, and in the latter case also $F$-conjugates. These force the solutions into the subgroup $S$. Observe that the solution $x=1$ will not pop out as $\beta\neq0$.

Conclusion. The word $c(\alpha,\beta)$ is of the minimal weight $(Q-q)/2$ if and only if $\beta\neq0$ and $$tr^n_1(\alpha^{q+1}/\beta)=1.$$
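
The Conclusion can be checked by brute force in a smaller case. The Python sketch below uses $n=3$, $m=6$ (code length $63$) rather than the relevant $n=5$, simply to keep the run time short. The modulus $x^6+x+1$ for $E$ and all function names are choices made for this illustration, not part of the argument above.

```python
M, N = 6, 3              # illustrative small case: E = GF(2^6), F = GF(2^3)
POLY = 0b1000011         # x^6 + x + 1, assumed irreducible modulus for E
Q, q = 1 << M, 1 << N

def mul(a, b):
    """Multiply two elements of E given as bit vectors."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:        # degree reached M: reduce modulo POLY
            a ^= POLY
    return r

def power(a, e):
    """Square-and-multiply exponentiation in E."""
    r = 1
    while e:
        if e & 1:
            r = mul(r, a)
        a = mul(a, a)
        e >>= 1
    return r

def trace(x, k):
    """x + x^2 + ... + x^(2^(k-1)); this is 0 or 1 when x lies in GF(2^k)."""
    t = 0
    for _ in range(k):
        t ^= x
        x = mul(x, x)
    return t

F = [x for x in range(Q) if power(x, q) == x]     # the subfield F inside E
NORM = [power(x, q + 1) for x in range(Q)]        # x^(q+1), always in F

def weight(alpha, beta):
    """Hamming weight of the word c(alpha, beta) from (1)."""
    return sum(trace(mul(alpha, x), M) ^ trace(mul(beta, NORM[x]), N)
               for x in range(1, Q))

def predicted_low(alpha, beta):
    """The criterion of the Conclusion: beta != 0 and tr(alpha^(q+1)/beta) = 1."""
    if beta == 0:
        return False
    return trace(mul(NORM[alpha], power(beta, Q - 2)), N) == 1

ok, low_count = True, 0
for alpha in range(Q):
    for beta in F:
        low = predicted_low(alpha, beta)
        ok = ok and ((weight(alpha, beta) == (Q - q) // 2) == low)
        low_count += low
print(ok, low_count)
```

In this small case the criterion matches the brute-force weights, and it selects $(q-1)\cdot\frac q2\cdot(q+1)=252$ pairs. The same product for $n=5$ is $31\cdot16\cdot33=16368$, which agrees with $a_{496}$.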


That's where I am at the moment. A possible continuation is then to study how often the two conditions in my Conclusion are satisfied by the sum of two pairs $(\alpha,\beta)$ that both meet them (one of the pairs being fixed). Hopefully more to follow :-)

Jyrki Lahtonen