12

From what I understand after testing the Crypto-JS file here:

http://crypto-js.googlecode.com/svn/tags/3.1.2/build/rollups/aes.js

AES creates encrypted strings that contain uppercase and lowercase letters, digits, slashes (/), plus signs (+) and equals signs (=).

Do AES encryption strings only contain the characters mentioned above, or can they contain any character from UTF-8?

Thanks in advance. I know this question might seem a bit dumb, but I haven't been able to find the answer after online research.

Howard Butler

2 Answers

17

AES does not operate on or produce characters — it has no knowledge or care of any particular character encoding. AES and other modern block ciphers accept and output arrays of bytes. The same concept applies to the key, and (in block modes that require one), the initialization vector.

How (and if) you choose to encode the output is up to you. For storing in a file or database, there is typically no need to encode the output – byte buffers work just fine. For showing the result to a user, things like hex-encoding are frequently used. For transport mediums that require only 7-bit clean ASCII, you can encode with Base64.
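To make the encoding choices above concrete, here is a minimal Python sketch. It does not run AES at all: `ciphertext` is just 32 random bytes standing in for cipher output (an assumption for illustration), and the variable names are mine.

```python
import base64
import os

# Stand-in for AES output: 32 random bytes (assumption; no encryption happens here).
ciphertext = os.urandom(32)

# Hex: two characters per byte, alphabet 0-9 and a-f.
hex_text = ciphertext.hex()

# Base64: four characters per three bytes, alphabet A-Z, a-z, 0-9, '+', '/', with '=' padding.
b64_text = base64.b64encode(ciphertext).decode("ascii")

print(hex_text)
print(b64_text)

# Both encodings are reversible, so the original bytes are always recoverable.
assert bytes.fromhex(hex_text) == ciphertext
assert base64.b64decode(b64_text) == ciphertext
```

The bytes are the actual ciphertext; the hex or Base64 string is only one way of writing them down.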

That said, CryptoJS appears to automatically encode outputs with Base64. I question the wisdom of this as a default (along with several others made by the CryptoJS authors), but that's the situation as it stands.

Stephen Touset
10

AES is a block cipher that operates on 128-bit blocks. For messages (plaintexts) of any other size, AES is used in some mode of operation: for example CBC, which treats the message as a sequence of 128-bit blocks (plus padding if required), or a mode like CTR, which turns AES into a stream cipher.

In any case, the ciphertext that AES outputs under a particular mode of operation is a bitstring (a sequence of bytes) whose size is that of the input message plus padding, if any. As such, it has nothing to do with characters in any encoding.
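As a rough illustration of the size statement, the following Python sketch computes padded lengths. It assumes PKCS#7 padding for the CBC case, which is a common choice but is my assumption and not something stated above; the function names are hypothetical, and no IV or other metadata is counted.

```python
BLOCK_SIZE = 16  # AES block size in bytes (128 bits)


def pkcs7_pad(message: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Pad to a multiple of block_size; a full block is added if already aligned."""
    pad_len = block_size - (len(message) % block_size)
    return message + bytes([pad_len]) * pad_len


def cbc_ciphertext_length(message_length: int) -> int:
    """Ciphertext length in bytes for CBC with PKCS#7 padding (assumed scheme)."""
    return (message_length // BLOCK_SIZE + 1) * BLOCK_SIZE


def ctr_ciphertext_length(message_length: int) -> int:
    """CTR behaves like a stream cipher, so no padding is needed."""
    return message_length


for n in (0, 5, 16, 17):
    print(n, cbc_ciphertext_length(n), ctr_ciphertext_length(n))
```

Either way the result is a count of bytes, not of characters.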

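How you display, store or transmit the ciphertext (the bitstring/sequence of bytes), and thus the set of characters you see, depends on the encoding you apply to those bytes, e.g., hex or Base64. Raw ciphertext bytes are in general not valid UTF-8 text, so they need such a binary-to-text encoding before they can be handled as a string.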

For any given message (bitstring), the ciphertext produced by AES can be any possible bitstring, i.e., all bit patterns of the size of the ciphertext are possible. For instance, let us for simplicity assume the message to be 128 bit and let it be $0^{128}$, then the ciphertext can be any element of $\{0,1\}^{128}$. Consequently, when using a character encoding for the ciphertext, any character (bit pattern) supported by the chosen character encoding may appear somewhere in an AES ciphertext.
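One practical consequence of "any bit pattern is possible" is that raw ciphertext cannot safely be treated as text. The Python sketch below illustrates this with hand-picked byte strings in place of real cipher output (an assumption for illustration); `is_valid_utf8` is a hypothetical helper name.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Every possible byte value, as could occur somewhere in a ciphertext.
all_bytes = bytes(range(256))

print(is_valid_utf8(b"hello"))      # plain ASCII is valid UTF-8
print(is_valid_utf8(all_bytes))     # contains bytes such as 0xFF that UTF-8 never uses
print(is_valid_utf8(b"\xff\xfe"))   # a short invalid sequence
```

This is why ciphertext is normally kept as bytes or passed through hex or Base64 before being shown or sent as a string.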

In your case, the ciphertext is Base64 encoded, and thus you observe exactly this set of characters when displaying it. This particular encoding is a common choice, since it avoids encoding issues when storing ciphertexts or transmitting them between different platforms/systems.
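The observation in the question can be checked with a small Python sketch. Again, random bytes stand in for ciphertext (my assumption; CryptoJS itself is not involved), and the names are hypothetical.

```python
import base64
import os
import string

# The standard Base64 alphabet plus the '=' padding character: 65 symbols in total.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")


def base64_chars(data: bytes) -> set:
    """Return the set of characters used when data is Base64 encoded."""
    return set(base64.b64encode(data).decode("ascii"))


# Collect the characters seen over many random 32-byte "ciphertexts".
seen = set()
for _ in range(1000):
    seen |= base64_chars(os.urandom(32))

# Whatever the bytes are, the encoded text never leaves the Base64 alphabet.
assert seen <= BASE64_ALPHABET
print(len(seen), "distinct characters observed")
```

So the letters, digits, "+", "/" and "=" are a property of Base64, not of AES.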

DrLecter