After reading this question/answer I thought I'd try implementing SHA-256 for my own education. My initial thought for converting the input into a number was to use a line of code like sum([ord(character) for character in input_string]). I quickly realized this is a terrible idea, because it greatly reduces the entropy of the input by mapping a lot of highly varied strings to a relatively small number of integers. My second thought was to get a hex representation of the string and then convert that hex string to an integer in base 16:

import binascii
# hexlify requires bytes in Python 3, so encode the string first
input_integer = int(binascii.hexlify('hello world'.encode('utf-8')), 16)

However, since I'm not an expert in hashing algorithms, there may be something wrong with my second implementation that I'm not aware of.
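To make the problem with the character-sum idea concrete, here is a minimal sketch. The name char_sum is only illustrative and not part of any library.

```python
def char_sum(text):
    """Naive mapping: add up the code points of every character."""
    return sum(ord(character) for character in text)


# Anagrams always collide, because addition ignores character order.
print(char_sum("listen") == char_sum("silent"))

# Unrelated strings collide too: 'a' + 'd' and 'b' + 'c' both sum to 197.
print(char_sum("ad"), char_sum("bc"))
```

Any two strings with the same multiset of characters, or merely the same total, end up as the same number before the hash even starts.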

How do you convert a string to a number to be used in a hash algorithm?

For reference, my code is written in Python.

John

1 Answer

The input format of the data doesn't affect the security of the algorithm.

Any format that can be represented by a sequence of bits will be fine. You can use raw ASCII byte values, 6-bit Base64 values, or anything else that is in binary form.

If you convert your string into ASCII bytes, for example, you'll end up with a sequence of 8-bit numbers. Concatenate them in order and there's your input.
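As a rough sketch of that conversion in Python 3 (assuming the input is plain ASCII; the variable names are only illustrative), the string can be encoded to bytes and, if an integer is really wanted, joined big-endian:

```python
import binascii

message = "hello world"
data = message.encode("ascii")  # one 8-bit value per character

# Joining the bytes big-endian gives a single large integer...
as_int = int.from_bytes(data, "big")

# ...which matches the hexlify-based value from the question.
assert as_int == int(binascii.hexlify(data), 16)

# The same data as an explicit bit string, 8 bits per byte.
bits = "".join(format(byte, "08b") for byte in data)
print(len(bits))  # 11 bytes * 8 bits
```

One caveat with the integer form: leading zero bytes vanish, so b'\x00a' and b'a' map to the same integer. Keeping the data as bytes avoids that ambiguity.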

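Since SHA-256 operates on 32-bit words, it's convenient to hold the data as bytes (4 bytes = 32 bits). After converting your input to a sequence of bits, your first task is padding: append a single 1 bit, append 0 bits until the length is congruent to 448 modulo 512, and then append the original bit length of the input data as a 64-bit big-endian integer. For an input shorter than 448 bits this gives you a single 512-bit block; longer inputs are padded out to the next multiple of 512 bits.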
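The padding step could look something like this for byte-aligned input. This is a minimal sketch, not a full SHA-256; sha256_pad is a hypothetical helper name, and it assumes the message is a whole number of bytes.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a byte-aligned message to a multiple of 64 bytes (512 bits)."""
    bit_length = len(message) * 8

    # A 1 bit followed by seven 0 bits is the single byte 0x80.
    padded = message + b"\x80"

    # Zero-fill until the length is 56 bytes (448 bits) modulo 64.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)

    # Append the original length in bits as a 64-bit big-endian integer.
    padded += bit_length.to_bytes(8, "big")
    return padded


block = sha256_pad(b"hello world")
print(len(block) * 8)  # total padded size in bits
```

An input of 56 bytes or more no longer fits in one block alongside the length field, so the padding spills into a second 512-bit block.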

I'm sure you can handle the rest.

adelphus