While I understand the principle of bit-slicing, several papers mention byte-sliced AES implementations (see e.g. Homomorphic Evaluation of the AES Circuit and Fast Implementations of AES on Various Platforms).

However, I don't clearly understand how byte-slicing works. In particular, the papers mentioned above state that:

  • 16 blocks are processed in parallel
  • The permutations in ShiftRows/MixColumns are now "for free"

Could someone explain how byte-slicing works in the case of AES, and how it allows 16 blocks to be processed in parallel without computing ShiftRows?
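
To make the question concrete, here is a minimal Python sketch of the data layout that "byte-sliced" is usually taken to mean. It is not taken from either paper. It assumes that the 16 state bytes are indexed in column-major order (byte i sits at row i % 4, column i // 4), and the function names `byte_slice`, `unslice`, `shift_rows_block` and `shift_rows_sliced` are hypothetical, chosen only for illustration. Slice i collects byte i of each of the 16 blocks, so ShiftRows only changes which slice is which.

```python
# Sketch only. Assumption: state bytes are indexed 0..15 in column-major
# order, so byte index i sits at row i % 4, column i // 4.

def byte_slice(blocks):
    """Transpose 16 blocks of 16 bytes: slice i holds byte i of every block."""
    assert len(blocks) == 16 and all(len(b) == 16 for b in blocks)
    return [bytes(block[i] for block in blocks) for i in range(16)]


def unslice(slices):
    """Inverse transpose: rebuild the 16 blocks from the 16 slices."""
    return [bytes(s[j] for s in slices) for j in range(16)]


def shift_rows_source(i):
    """Index of the input byte that ShiftRows moves to position i."""
    row, col = i % 4, i // 4
    return row + 4 * ((col + row) % 4)


def shift_rows_block(block):
    """Reference ShiftRows on a single 16-byte block."""
    return bytes(block[shift_rows_source(i)] for i in range(16))


def shift_rows_sliced(slices):
    """ShiftRows on sliced data: reorders whole slices, touches no byte."""
    return [slices[shift_rows_source(i)] for i in range(16)]


# 16 distinct example blocks (arbitrary test data).
blocks = [bytes((16 * j + i) % 256 for i in range(16)) for j in range(16)]
sliced = shift_rows_sliced(byte_slice(blocks))
assert unslice(sliced) == [shift_rows_block(b) for b in blocks]
```

If each slice is held in its own 128-bit register (or its own ciphertext in the homomorphic setting), `shift_rows_sliced` is just a relabelling of registers. That would presumably explain why the permutation costs nothing in unrolled code, but confirmation of this reading is part of what I am asking for.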

Raoul722