autokey script gone wrong!

Question

I tried to write a script to encode a txt with the autokey cipher. Because I misunderstood the cipher, I ended up creating a different cipher which has nothing to do with autokey:

first I encoded the txt with a key using the Vigenère cipher, which gave me a result as long as the original txt
then I encoded the original txt with that first result as the key, again using the Vigenère cipher

How can I get the original txt back, given the original key?

$$\begin{array}{} \text{txt} &+ &\text{key} &\to &\text{etxt}\\ \text{txt} &+ &\text{etxt} &\to &\text{final result} \end{array} $$

_{Editor's note: the + stands for the Vigenère cipher algorithm}

fgrieu · Answer 1 · 2018-09-01T19:47:30.843

Recipe sans math

Given a non-empty $\text{final result}$ obtained as in the question, and $\text{key}$, we can find $\text{txt}$ as follows (see full answer for assumptions made and justification):

Extend $\text{key}$ to the same length as $\text{final result}$, by repeating $\text{key}$ as necessary, then truncating.
Form pairs of letters at same index in $\text{final result}$ and $\text{key}$, and decode per the following table, where $\text{key}$ determines the line, and $\text{final result}$ determines the column. The intersection should contain two possible letters for $\text{txt}$.
We now have two choices for each letter of $\text{txt}$ (or are certain that $\text{final result}$ was not obtained as in the question, if any -- was encountered). In the former case, redundancy in $\text{txt}$ is the only way to disambiguate.

   A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z final result
A  AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ --
B  -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ
C  MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY --
D  -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY
E  LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX --
F  -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX
G  KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW --
H  -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW
I  JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV --
J  -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV
K  IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU --
L  -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT -- HU
M  HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT --
N  -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS -- GT
O  GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS --
P  -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER -- FS
Q  FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER --
R  -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ -- ER
S  ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ --
T  -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP -- DQ
U  DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP --
V  -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO -- CP
W  CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO --
X  -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN -- BO
Y  BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN --
Z  -- BO -- CP -- DQ -- ER -- FS -- GT -- HU -- IV -- JW -- KX -- LY -- MZ -- AN
key

For example, given $\text{final result}=\mathtt{HOS}$ and $\text{key}=\mathtt{DOG}$:

for the 1st letter we use the line for $\mathtt D$ and the column for $\mathtt H$; we read CP there, thus know that the 1st letter of $\text{txt}$ is $\mathtt C$ or $\mathtt P$
for the 2nd letter we use the line for $\mathtt O$ and the column for $\mathtt O$; we read AN there, thus know that the 2nd letter of $\text{txt}$ is $\mathtt A$ or $\mathtt N$
for the 3rd letter we use the line for $\mathtt G$ and the column for $\mathtt S$; we read GT there, thus know that the 3rd letter of $\text{txt}$ is $\mathtt G$ or $\mathtt T$

Thus $\text{txt}$ can be $\mathtt{CAT}$. In English, that's the most common of the $2^3=8$ possibilities, which also include $\mathtt{PAT}$ (a chess term), $\mathtt{PNG}$ (initials of an image format), and 5 others.
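The recipe can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the original answer: the names `build_table`, `candidate_pairs` and `all_candidates` are made up here, input is assumed to consist of upper-case letters A to Z only, and the table is rebuilt by brute force (encoding every letter under every key letter as in the question) rather than copied from above.

```python
from itertools import cycle, product
from string import ascii_uppercase as LETTERS

def build_table():
    """Map (key letter, final-result letter) -> the txt letters that produce it."""
    table = {}
    for k, key_letter in enumerate(LETTERS):
        for t, txt_letter in enumerate(LETTERS):
            e = (t + k) % 26  # first pass:  txt + key  -> etxt
            f = (t + e) % 26  # second pass: txt + etxt -> final result
            cell = (key_letter, LETTERS[f])
            table[cell] = table.get(cell, "") + txt_letter
    return table

TABLE = build_table()

def candidate_pairs(final_result, key):
    """Return, for each position, the two possible txt letters (e.g. 'CP')."""
    pairs = []
    # cycle(key) repeats the key; zip truncates it to the length of final_result
    for f_letter, k_letter in zip(final_result, cycle(key)):
        cell = TABLE.get((k_letter, f_letter))
        if cell is None:  # corresponds to a '--' entry in the table
            raise ValueError("final result was not obtained as in the question")
        pairs.append(cell)
    return pairs

def all_candidates(final_result, key):
    """Enumerate every txt consistent with final_result and key."""
    return ["".join(choice) for choice in product(*candidate_pairs(final_result, key))]

print(candidate_pairs("HOS", "DOG"))
print(all_candidates("HOS", "DOG"))
```

Picking the right candidate among those returned by `all_candidates` still relies on redundancy in the text, for instance a dictionary check.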

Detailed answer

I'll assume the usual definition of the Vigenère cipher, even though that's reportedly not the cipher promoted by Vigenère, which is closer to autokey. Consequently:

$\text{txt}$, $\text{key}$, $\text{etxt}$ and $\text{final result}$ are strings of letters $\mathtt A$ to $\mathtt Z$, which will be identified with the integers $0$ to $25$ without explicit mention.
$\text{txt}$ and $\text{etxt}$ are unknown strings, while $\text{key}$ and $\text{final result}$ are known. They all are of equal length (with $\text{key}$ extended by repetition as necessary).
The question's $\,+\,$ is not regular addition, so I'll write it $\;\widetilde+\;$ to avoid confusion. It holds that
- $\;\widetilde+\;$ applied to individual letters stands for addition modulo $26$, so that $\mathtt R\;\widetilde+\;\mathtt M=\mathtt D$, because $\mathtt R$ maps to $17$, $\mathtt M$ maps to $12$, $17+12\bmod26$ is $3$, and $3$ maps back to $\mathtt D$. See this table.
- $\;\widetilde+\;$ applied to two strings of letters of equal length performs $\;\widetilde+\;$ on letters of same rank in each string, producing a string of the same length.

For example, if $\text{txt}$ is $\mathtt{CAT}$ and $\text{key}$ is $\mathtt{DOG}$, then the question's equation becomes $$\begin{array}{llllllllll} \text{txt}\;\widetilde+\;\text{key} &=&\text{etxt} &=&\mathtt{CAT}\;\widetilde+\;\mathtt{DOG}&=&\mathtt{FOZ}\\ \text{txt}\;\widetilde+\;\text{etxt}&=&\text{final result}&=&\mathtt{CAT}\;\widetilde+\;\mathtt{FOZ}&=&\mathtt{HOS} \end{array} $$
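The two equations can be reproduced with a few lines of Python. This is a minimal sketch, assuming upper-case A to Z strings; `vig_add` is a hypothetical helper name for $\;\widetilde+\;$, and it repeats its second argument as needed.

```python
from itertools import cycle

def vig_add(text, key):
    """Letter-wise addition modulo 26, with A=0 ... Z=25; key is repeated as needed."""
    base = ord("A")
    return "".join(
        chr((ord(t) - base + ord(k) - base) % 26 + base)
        for t, k in zip(text, cycle(key))
    )

etxt = vig_add("CAT", "DOG")          # first equation:  txt + key
final_result = vig_add("CAT", etxt)   # second equation: txt + etxt
print(etxt, final_result)
```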

To solve for $\text{txt}$, we eliminate $\text{etxt}$ in the second equation by substitution with the value assigned to $\text{etxt}$ per the first equation. The equations are equivalent to $\text{txt}\;\widetilde+\;(\text{txt}\;\widetilde+\;\text{key})\,=\,\text{final result}$.

This implies that for each $t$, $k$ and $f$ designating letters of same rank in $\text{txt}$, $\text{key}$ and $\text{final result}$, we have $t\;\widetilde+\;(t\;\widetilde+\;k)=f$. Per usual rules of (modular) arithmetic in the ring $(\Bbb Z_{26},+,\times)$, including associativity and distributivity, this is $(2t+k\bmod26)=f$, and is equivalent to $2t\equiv f-k\pmod{26}$.

The modulus $26$ is the product of the distinct primes $2$ and $13$. Therefore, by the Chinese Remainder Theorem, the above equation is equivalent to $2t\equiv f-k\pmod2$ and $2t\equiv f-k\pmod{13}$. The first congruence does not constrain $t$ (its left side is $0$ modulo $2$); it only requires $f$ and $k$ to be of the same parity. The second determines $t$ modulo $13$: since $2^{-1}\bmod13=7$, it gives $t\equiv7(f-k)\pmod{13}$. Thus, for given $f$ and $k$ of the same parity, the two solutions for $t$ are $$t_0=7(f-k)\bmod 13\quad\mathsf{and}\quad t_1=t_0+13$$
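As a sanity check, the closed form can be compared against exhaustive search over all $26\times26$ pairs $(f,k)$. The sketch below is illustrative only; `solve_letter` is a name chosen here, and letters are represented directly as integers $0$ to $25$.

```python
def solve_letter(f, k):
    """Return all t in 0..25 with 2*t + k ≡ f (mod 26), in increasing order."""
    if (f - k) % 2:            # f and k of different parity: no solution
        return []
    t0 = 7 * (f - k) % 13      # 7 is the inverse of 2 modulo 13
    return [t0, t0 + 13]

# Compare with brute force for every possible pair of letters.
for f in range(26):
    for k in range(26):
        brute = [t for t in range(26) if (2 * t + k) % 26 == f]
        assert solve_letter(f, k) == brute

print(solve_letter(7, 3))
```

Python's `%` operator returns a result in $0$ to $12$ here even when `f - k` is negative, which matches the $\bmod$ notation used in the formula.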

The recipe's table is a pre-computation of this formula. For example, given $f=7=\mathtt H$ and $k=3=\mathtt D$, we get that $t$ is one of $t_0=7(7-3)\bmod13=28\bmod13=2=\mathtt C$ or $t_1=2+13=15=\mathtt P$, and the table has CP for line D column H.
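For illustration, the table rows can be regenerated from that formula. This sketch uses made-up helper names (`cell`, `table_row`) and prints one line per key letter, with `--` where $f$ and $k$ differ in parity.

```python
from string import ascii_uppercase as LETTERS

def cell(k, f):
    """Table entry for key letter index k (line) and final-result index f (column)."""
    if (f - k) % 2:
        return "--"
    t0 = 7 * (f - k) % 13
    return LETTERS[t0] + LETTERS[t0 + 13]

def table_row(key_letter):
    """The 26 entries of the line for key_letter, columns A to Z."""
    k = LETTERS.index(key_letter)
    return [cell(k, f) for f in range(26)]

for key_letter in LETTERS:
    print(f"{key_letter}  {' '.join(table_row(key_letter))}")
```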

Notations for modular arithmetic

By definition, for positive integer $m$ and any integers $a$ and $b$

the notation $b\equiv a\pmod m$ means that $a-b$ is a multiple of $m$
the notation $b=a\bmod m$ additionally means that $0\le b<m$
_{Note: in the latter notation, there is neither an $\equiv$ sign, nor an opening parenthesis immediately on the left of $\bmod$}
equivalently to the previous statement, $a\bmod m$ is the integer defined as
- for non-negative $a$, the remainder of the Euclidean division of $a$ by $m$
- for negative $a$, the integer $m-1-((-a-1)\bmod m)$
the notation $b\equiv a^{-1}\pmod m$ means that $a\,b\equiv1\pmod m$
the notation $b=a^{-1}\bmod m$ additionally means that $0\le b<m$.
_{Note: $b$ is called the multiplicative inverse of $a$. It exists when the Greatest Common Divisor of $m$ and $|a|$ is $1$. For small integers, $b$ can be found from $a$ and $m$ by trial and error. More generally, the GCD and $b$ can be computed using the extended Euclidean algorithm, or its half-extended variant.}
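A possible Python rendering of these notations is sketched below. The function names `egcd` and `mod_inverse` are chosen here for illustration; the comparison with the built-in `pow(a, -1, m)` assumes Python 3.8 or later. For a positive modulus, Python's `%` already yields the $a\bmod m$ of the definitions above, including for negative $a$.

```python
def egcd(a, b):
    """Extended Euclidean algorithm: return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    x0, x1, y0, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def mod_inverse(a, m):
    """Return b with 0 <= b < m and a*b ≡ 1 (mod m); raise if gcd(a, m) != 1."""
    g, x, _ = egcd(a % m, m)
    if g != 1:
        raise ValueError(f"{a} has no inverse modulo {m}")
    return x % m

print((-4) % 13)            # a mod m for a negative a
print(mod_inverse(2, 13))   # the inverse used in the detailed answer
```

Calling `mod_inverse(2, 26)` raises `ValueError`, which reflects why the equation $2t\equiv f-k\pmod{26}$ cannot be solved by a plain division by $2$.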
