I'm looking for high-speed SHA-256 implementations, and specifically, ones with low latency; that is, the time between when you submit the message block, and when the output (or internal state) is produced.

I've googled it, and the fastest I've found appears to take circa 70nsec; does anyone know of something faster?

Now:

  • I'm not interested in parallel implementations (which can obviously do far more than 1 hash every 70nsec); I am specifically looking for a single high-speed hash.

  • An ASIC implementation (or even a proposal that gives a plausible outline of how fast it would be) would be quite acceptable.

  • We can assume that the preimage fits within a single SHA-256 block (or that we're just doing the compression function itself).

  • An alternative way of answering this would be 'how fast can we compute SHA-256 on a single large message'; can we do something faster than about 1GByte/second?

  • I specifically asked about SHA-256; however, if someone has similar figures for SHA-3 or AES, I'd be interested in hearing those as well.

Thanks!

poncho

2 Answers

Ryzen supports special instructions for SHA-256, achieving 1.9 cycles per byte (cpb) on long messages according to eBACS (graph). Assuming a 4 GHz clock, this corresponds to 2.1 GB/s, 17 Gbit/s, or 30 ns per block.

From the 22.5 cpb value for an 8-byte message (which fits in a single block after padding), we get 180 cycles per block, which is 2.8 cpb or 45 ns for a single unpadded 64-byte block.
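
The arithmetic above can be reproduced with a short sketch. The 4 GHz clock is the same assumption as in the text, and the function and variable names are made up for illustration; the cpb figures are the eBACS values quoted above.

```python
# Convert cycles-per-byte (cpb) figures into throughput and per-block latency.
# CLOCK_HZ is an assumed clock rate (4 GHz), not a measured value.
CLOCK_HZ = 4e9
BLOCK_BYTES = 64  # SHA-256 block size in bytes


def throughput_gbytes_per_s(cycles_per_byte, clock_hz=CLOCK_HZ):
    """Bytes hashed per second, in units of 10^9 bytes."""
    return clock_hz / cycles_per_byte / 1e9


def latency_ns(cycles, clock_hz=CLOCK_HZ):
    """Wall-clock time for the given number of cycles, in nanoseconds."""
    return cycles / clock_hz * 1e9


# Long messages: 1.9 cpb.
long_rate_gbytes = throughput_gbytes_per_s(1.9)   # about 2.1 GB/s
long_rate_gbits = long_rate_gbytes * 8            # about 17 Gbit/s
long_block_ns = latency_ns(1.9 * BLOCK_BYTES)     # about 30 ns per block

# Short message: 22.5 cpb measured on 8 bytes, i.e. one compression call.
single_block_cycles = 22.5 * 8                         # 180 cycles
single_block_cpb = single_block_cycles / BLOCK_BYTES   # about 2.8 cpb
single_block_ns = latency_ns(single_block_cycles)      # 45 ns

print(f"long messages: {long_rate_gbytes:.2f} GB/s, "
      f"{long_rate_gbits:.1f} Gbit/s, {long_block_ns:.1f} ns/block")
print(f"single block:  {single_block_cycles:.0f} cycles, "
      f"{single_block_cpb:.2f} cpb, {single_block_ns:.1f} ns")
```

The gap between 30 ns and 45 ns is the difference between steady-state throughput on a long message and the latency of a single isolated block, which is the figure the question asks about.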

CodesInChaos

For SHA-256 I found the following: http://www.heliontech.com/downloads/fast_hash_asic_datasheet.pdf

It appears to me that the device computes hashes quickly rather than in parallel. If so, they claim an SHA-256 rate of 2327 Mbps. This is an ASIC implementation, as you requested. I have no prior experience with this product, so I would recommend contacting the company before purchasing one, to verify that it does what you want. (Edit: I originally misread Gb as GB; 2327 Mbps is only about 0.29 GByte/s.)
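
To compare the datasheet figure with the numbers in the question, here is a rough conversion sketch. It assumes the quoted 2327 Mbps means 2327 × 10^6 bits per second and that blocks of one message are processed strictly one after another, so the time per block is simply the block size divided by the rate; the datasheet may define things differently, and the names below are illustrative only.

```python
# Convert a quoted hash rate in Mbit/s into GByte/s and time per SHA-256 block.
# Assumption: no overlap between consecutive blocks of the same message.
RATE_MBPS = 2327   # figure quoted from the datasheet
BLOCK_BITS = 512   # SHA-256 block size in bits

rate_bits_per_s = RATE_MBPS * 1e6
gbytes_per_s = rate_bits_per_s / 8 / 1e9            # about 0.29 GByte/s
ns_per_block = BLOCK_BITS / rate_bits_per_s * 1e9   # about 220 ns

print(f"{gbytes_per_s:.3f} GByte/s, {ns_per_block:.0f} ns per block")
```

Under these assumptions the device would be slower than both the 70 ns and the 1 GByte/second figures mentioned in the question.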

For SHA-3, slide 4 of this presentation:

http://csrc.nist.gov/groups/ST/hash/sha-3/Round2/Aug2010/documents/presentations/SAVAS_Efficient_HW_Implementations_of_3SHA3Candidates.pdf

claims that speeds of 21.23 Gbps can be achieved in hardware for SHA-3. This is from 2010; however, it was part of the data NIST used to select its finalists. Please note that the speed shown for BLAKE is for the version of BLAKE submitted to the NIST SHA-3 competition, not BLAKE2, which is significantly faster.

The main Keccak documentation (version 1.2), http://keccak.noekeon.org/Keccak-main-1.2.pdf, details building the hardware needed for SHA-3 and lists a throughput of 12.4 Gbit/s on the specified hardware (see the first entry in the table in section 7.4.1).

(From my comment above) I know this is not the question you asked, but you can use BLAKE2bp for high speeds. BLAKE2b is near your desired 1 GByte/s in software implementations, and BLAKE2bp parallelizes the hashing of a single message (rather than computing multiple hashes at once), further increasing the speed. BLAKE2bp could also be implemented in hardware for even higher speeds.

Ninja_Coder