
Is a memory-hard proof-of-work scheme necessarily resistant to speedups from custom ASICs?

Background: Bitcoin uses a proof-of-work scheme based on SHA-256 hashing, and the scheme is compute-bound. Initially, people mined on their PCs or GPUs. Eventually, though, it became clear that a custom ASIC (an application-specific hardware chip) can compute SHA-256 hashes far faster than any general-purpose CPU, and therefore mine far more efficiently and more rapidly, so people built such ASICs. Today ASICs have roughly a 100x advantage over mining on your PC, so PC-based mining is not very competitive.

Some folks have proposed fixing this by designing a memory-hard proof-of-work scheme. A memory-hard scheme is one that fundamentally requires some minimum amount of memory (e.g., 1GB of RAM) to solve efficiently, and where there are no useful time-memory tradeoffs. These are hard to design, but suppose we were able to construct one. For instance, Cuckoo Cycle is one plausible attempt at such a scheme. Suppose we use Cuckoo Cycle, or we find another scheme. I've seen it argued that such a scheme would end the monopoly of ASICs and thus make mining more democratic, because ASICs can speed up computation-bound tasks but not memory-bound tasks.
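For concreteness, here is a toy sketch (my own illustration, not Cuckoo Cycle or any deployed scheme, and not claimed to be secure or free of time-memory tradeoffs) of the shape such a function might take: fill a large buffer from the header and nonce, then make a long chain of data-dependent accesses into it, so that a solver with less memory is forced into expensive recomputation.

```python
import hashlib

MEM_WORDS = 1 << 16   # toy size for illustration; a real scheme would use ~1 GB
ROUNDS = 1 << 16      # toy number of data-dependent probes

def toy_memory_hard_pow(header: bytes, nonce: int) -> bytes:
    """Toy memory-hard PoW candidate (illustrative only, not a secure design)."""
    seed = hashlib.sha256(header + nonce.to_bytes(8, "little")).digest()
    # Phase 1: fill a large buffer deterministically from the seed.
    buf = [int.from_bytes(hashlib.sha256(seed + i.to_bytes(8, "little")).digest()[:8],
                          "little")
           for i in range(MEM_WORDS)]
    # Phase 2: data-dependent walk; each address depends on the value just read,
    # so the accesses cannot be predicted, prefetched, or usefully cached.
    acc = int.from_bytes(seed[8:16], "little")
    idx = int.from_bytes(seed[:8], "little") % MEM_WORDS
    for _ in range(ROUNDS):
        acc ^= buf[idx]
        idx = (acc * 6364136223846793005 + 12345) % MEM_WORDS
    return hashlib.sha256(acc.to_bytes(8, "little")).digest()

# A miner would search for a nonce whose output falls below a difficulty target.
```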

Here's my question. Is this argument correct? More specifically, is it true that memory-hardness is enough to ensure that ASICs won't have much of an advantage over general-purpose CPUs? What prevents an attacker from building a custom ASIC and buying off-the-shelf DRAM chips, and building systems that pair each ASIC with a DRAM chip?

— D.W.

3 Answers


What prevents an attacker from building a custom ASIC and buying off-the-shelf DRAM chips, and building systems that pair each ASIC with a DRAM chip?

DRAM is already pretty well optimized for random memory accesses per second per dollar. Since a memory-bound PoW spends more time waiting for memory than doing computation, there's little point in using an ASIC to speed up the computation. The bottleneck is DRAM latency; the ASIC is just a very expensive way to spend a larger fraction of the time waiting for memory...
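As a purely illustrative back-of-the-envelope check (my numbers, not the answer's): suppose each PoW step costs about $100\,\mathrm{ns}$ of DRAM latency plus $5\,\mathrm{ns}$ of hashing on a CPU, and a hypothetical ASIC drives the hashing part to essentially zero. The overall speedup is then only about
$$\frac{100\,\mathrm{ns}+5\,\mathrm{ns}}{100\,\mathrm{ns}+0\,\mathrm{ns}}\approx 1.05,$$
i.e. about 5%, versus the roughly 100x advantage ASICs enjoy on the compute-bound SHA-256 PoW.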


Also see related discussion at reddit.com/r/Bitcoin/. For reference purposes (and in the unlikely case Reddit shuts its doors), here’s a copy of my post there:

Hashcash functions must be cheap to verify to achieve their purpose.

And thus memory hardness can only be achieved by going beyond the hashcash proof of work, as I wrote about in “Beyond the Hashcash Proof-of-Work (there’s more to mining than hashing)”, published at cryptorials.io.

The desirability of a memory-hard PoW for a high-market-cap cryptocurrency is an open question with no easy answers, and one that deserves more study.

I think it could be desirable if it can radically change the economics of mining, to the point where mining is profitable for no one. That means that even the (hypothetically) most energy-efficient ASIC with custom memory technology would still not have a positive ROI, for the following reasons:

  1. Fabrication costs are going to be much larger than for whatever memory technology is used in commodity computing devices, which still enjoy much larger economies of scale.

  2. Electricity costs, while lower than for commodity memory, are unlikely to be more than an order of magnitude lower.

and the main reason:

  3. Huge numbers of people could be willing to mine at a loss, just as they are happy to play the lottery. This requires a very low barrier to entry, such as one click to install a mining app on your phone that mines overnight while charging. It helps to know that your mining efficiency is no more than (roughly) an order of magnitude worse than with custom hardware.

Of course I'm not claiming this is a likely scenario, but I think it's at least imaginable...

Another possibility to consider is having two PoWs, one compute-bound and one memory-bound, splitting the blocks between them. Mining the compute-bound one would be (barely) profitable with ASICs, while mining the memory-bound one would be unprofitable, but would help decentralization.
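A minimal sketch of how such a split might be encoded at the consensus level (a hypothetical rule of my own, purely to illustrate the idea):

```python
def required_pow(height: int) -> str:
    """Hypothetical consensus rule: alternate the required PoW type by block height.

    Even-height blocks must carry a compute-bound proof (SHA-256-style),
    odd-height blocks a memory-bound one (e.g. a Cuckoo Cycle-style proof).
    """
    return "compute-bound" if height % 2 == 0 else "memory-bound"
```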

Still, unprofitable mining is by no means a requirement for making memory-hard PoWs desirable.

It is conceivable that any memory technology developed to make the PoW's memory accesses as energy-efficient as possible will benefit many other classes of computation, especially in mobile, solar-powered, and other low-power settings, and thus inevitably lead to commoditization.

This argument works best for PoWs that do as little computation as possible apart from memory accesses, and I believe this is where Cuckoo Cycle's edge trimming shines.
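For readers unfamiliar with edge trimming: Cuckoo Cycle asks the miner to find a 42-cycle in a large pseudo-random bipartite graph, and since an edge with a degree-1 endpoint can never lie on a cycle, the vast bulk of edges can be discarded with nothing more than counting. The sketch below is a simplification of that idea (it uses a generic keyed hash instead of Cuckoo Cycle's siphash and ignores all of the bitmap and counter tricks that make real implementations fast):

```python
import hashlib
from collections import Counter

def edge(header: bytes, index: int, num_nodes: int):
    """Derive one bipartite edge (u, v) from the header and an edge index.
    Simplified stand-in for Cuckoo Cycle's siphash-based edge generation."""
    h = hashlib.blake2b(header + index.to_bytes(4, "little"), digest_size=8).digest()
    u = int.from_bytes(h[:4], "little") % num_nodes
    v = int.from_bytes(h[4:], "little") % num_nodes
    return u, v

def trim(header: bytes, num_edges: int, num_nodes: int, rounds: int = 10):
    """Repeatedly drop edges with a degree-1 endpoint; such edges cannot lie on a cycle."""
    edges = [edge(header, i, num_nodes) for i in range(num_edges)]
    for _ in range(rounds):
        deg_u = Counter(u for u, _ in edges)
        deg_v = Counter(v for _, v in edges)
        edges = [(u, v) for u, v in edges if deg_u[u] > 1 and deg_v[v] > 1]
    return edges  # the small set of surviving edges is then searched for a 42-cycle
```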

— John Tromp

Yes, the argument is largely correct. A good memory-hard proof-of-work scheme can be fairly resistant to speedup using ASICs, if designed around a good primitive like Argon2 and parametrized appropriately; in particular, with a large fraction of its cost spent in un-cacheable accesses to enough memory that DRAM is the only economical choice.
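As a concrete illustration of "parametrized appropriately" (my own sketch, assuming the Python argon2-cffi package; the answer does not prescribe any particular library or parameters), the memory_cost parameter is what forces every evaluation through a large DRAM-resident working set:

```python
from argon2.low_level import hash_secret_raw, Type

# Illustrative parameters: 1 GiB of memory per evaluation, a single pass.
MEMORY_KIB = 1 << 20      # 2^20 KiB = 1 GiB
TIME_COST = 1
PARALLELISM = 1

def pow_ok(header: bytes, nonce: int, target: int) -> bool:
    """Check one candidate nonce with a memory-hard hash (sketch, not a real coin)."""
    digest = hash_secret_raw(
        secret=header + nonce.to_bytes(8, "little"),
        salt=b"pow-demo-salt-16",   # fixed 16-byte salt; fine for a PoW illustration
        time_cost=TIME_COST,
        memory_cost=MEMORY_KIB,
        parallelism=PARALLELISM,
        hash_len=32,
        type=Type.ID,
    )
    return int.from_bytes(digest, "big") < target
```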


The idea with this strategy is that the (hash rate)/(operating cost) ratio of an ASIC cracker can be limited by the DRAM power draw, and/or its (hash rate)/(building cost) ratio can be limited by the DRAM cost. From both standpoints, ASIC-based designs are not much better than their CPU-based counterparts, and we can find plausible bounds on how much better ASICs can be.

With a suitably large investment, using today's technology, an ideal ASIC design would put the cracker on the same die as the DRAM. This is not far-fetched: there are suppliers of design tools and logic cells for the special cost-optimized DRAM process. Compared to a cracker using a commodity CPU and DRAM, that entirely saves the CPU and the interconnect to DRAM, and most of the cost of logic operations (both in power and in building cost), and possibly reduces the time spent in memory accesses by, I'd guess, at most 3 binary orders of magnitude (a factor of $2^3 = 8$), due to the combination of lower propagation and logic delays and an optimized choice of DRAM cells (these can be much faster than the ones in mainstream DRAM, but the silicon area and power per cell also increase, evening things out).

All in all, I'd say that the above ASIC design is no more than 16 times more cost-effective than custom boards using commodity CPUs and DRAM (in either purchasing or operating cost, ignoring the uncertainties in silicon and energy procurement); and probably much less attractive than that for an adversary with limited investment capability, which realistically will be an important consideration even for a three-letter agency. That's for a memory-hard entropy-stretching function parametrized towards making the cost of memory accesses (rather than logic operations) the limiting factor, requiring more DRAM than can economically be replaced with SRAM (to be on the safe side, say at least $2^{32}$ bits, or about half a gigabyte, per instance of the function), and with an access pattern to DRAM that defies any caching strategy (one way the latter is achieved is by using computed, random-like addresses).
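To make the arithmetic explicit (this is my reading of the estimate above, not a claim made in the answer): the memory accesses themselves speed up by at most $2^3 = 8$, and folding in the savings on the CPU, interconnect, and logic contributes at most roughly another factor of 2, which is consistent with the stated ceiling of about $8 \times 2 = 16$ in cost-effectiveness.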

Note: In the above, I use estimations based on DRAM memory accesses because (in my opinion) that gives the tightest bound on the benefit ASICs can hope to achieve compared to more COTS cracker designs. But this is not a recommendation to minimize the number of logic operations when parametrizing a state-of-the-art memory-hard entropy-stretching function; on the contrary, the sensible thing to do is to pump the memory-related parameters up to the point where they eat, say, 2/3 of the affordable cost (including using some sizable fraction of a gigabyte per core), and then pump the computation-related parameters up to what's affordable; this makes designing an ASIC cracker able to jump over both hurdles less likely.

— fgrieu

Memory-hard proofs-of-work: are they ASIC-resistant?

Theoretically, the answer is a clear “no”.

Given enough resources (read: invested time and money) and the appropriate knowledge (ASICs don't grow on trees; they have to be designed), all currently known and/or published "memory-hard PoW" solutions could be rendered futile. But theory ends where the real world starts…

More specifically, is it true that memory-hardness is enough to ensure that ASICs won't have much of an advantage over general-purpose CPUs?

Your "feel" seems to be correct, as no memory-hard design has managed "to be enough" to ensure that ASICs won't have much of an advantage over general-purpose CPUs… at least, not so far.

What prevents an attacker from building a custom ASIC and buying off-the-shelf DRAM chips, and building systems that pair each ASIC with a DRAM chip?

Allow me to step up onto my soapbox while I dive into that…

Practically, the real show-stopper for ASICs in the realm of cryptocurrency is and always will be "production cost". Remember that an ASIC is a dedicated piece of hardware: it only solves a single, specific task and isn't (re)programmable like a personal computer or a handheld device.

Production costs for ASICs can quickly run into a few million euros/US dollars due to things like the design of such chips, their (clean-room) manufacturing, and, last but not least, the low yields that come with ASICs (industry feedback suggests around 50% of the manufactured chips don't work and are destroyed instead of being used or sold).

Long story short: to create an ASIC for whatever cryptocurrency, you have to have financial backing and a clear break-even point within reach. If you don't have those two cornerstones, memory-hardness is the least of your problems, and you had better go back to your drawing board and rethink your plan.

Getting back to your questions: if we assume that your attacker has access to the resources described above, such an attacker would be able to work his or her way well beyond the limits of a specific "memory-hard" PoW implementation.

The logical question arising from this (and somewhat implied by your question) is: is memory-hardness "the" solution towards ASIC resistance? Not really, for reasons of which I have only described a few (to prevent this answer from becoming a small book).

Potential solution to the problem…

Besides that, I'd like to point out that, based on the above, there seems to be a solution to the problem. It clearly leaves the path of "memory-hard PoW algos", but I think it should be mentioned in this context: instead of chasing some dragon, one could simply exploit the usual problems faced by attackers, namely the amount of resources needed, and push things beyond feasibility… in this case economic feasibility, by pushing the break-even point out of the opponent's financial reach.

Keeping in mind that ASICs, once designed and manufactured, cannot be recoded, transformed, or otherwise reused, a simple but effective attack vector against such attackers emerges: frequently change the PoW implementation, swapping one algorithm design for another. To keep up with such changes, an attacker who relies on (let's just call it) "the magic of ASICs" would need to restart his or her efforts over and over again upon each swap of algorithm designs. Even in theory, this quickly becomes more and more infeasible in several respects (economics, time, etc.). In the end, an attacker has to decide whether it makes sense to invest more than the expected return, or to give up fighting ever-changing cryptocurrency hashing algorithms and instead focus on economically more interesting targets.
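As a rough sketch of what such rotation could look like at the consensus layer (entirely hypothetical, with stand-in hash functions rather than real mining algorithms):

```python
import hashlib

# Hypothetical rotation schedule: swap the PoW algorithm every EPOCH_LENGTH blocks.
EPOCH_LENGTH = 100_000
ALGORITHMS = [hashlib.sha256, hashlib.sha3_256, hashlib.blake2b]  # illustrative stand-ins

def pow_hash(height: int, header: bytes) -> bytes:
    """Pick the PoW hash for a block from its height, forcing a new ASIC design per epoch."""
    algo = ALGORITHMS[(height // EPOCH_LENGTH) % len(ALGORITHMS)]
    return algo(header).digest()
```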

Memory-hard PoW may have its place, but currently that’s not the front seat…

I can't deny that this potential solution could benefit from different memory-hard algorithm designs as well, as those would change ASIC design requirements even more than usual. Yet, this is a "could" and not a "would". All in all, memory-hard PoW algorithms currently don't provide the amount of protection against ASICs that "marketing speak" tries to sell to cryptocurrency users (including miners). Surely, we can't ignore that algorithms like Litecoin's scrypt and things like CryptoNight were able to slow down the birth rate of leaner and meaner ASICs, at least for a few short moments (because they changed the hardware and/or resource requirements via algorithm design and combination).

Yet, up until now, every fixed algorithm implementation, memory-hard or not, is slowly being rendered void, just like some formerly cryptographically secure things we all used to love: MD5, RC4, and SHA-1. Therefore, I'd say it's not worth focusing on memory-hardness in relation to PoW.

Instead, memory-hardness should be kept in mind as a potential factor when designing the next hashing implementations, and only be implemented if it really makes sense. (After all, cryptocurrencies, just like everything else cryptography-related, are also influenced by "implementation speed expectations" as well as by "resources available to the average, benign user in contrast to potential attackers".)

Missing proofs, practical disproofs, and the future of it all…

If there were a working memory-hard PoW (in relation to ASIC resistance, providing equal chances when mining etc.), you would be looking at a near-perfect cryptocurrency. As soon as you see one, feel free to ping me… as far as I can see when looking around, we're still a few years away from something like that. Chances are it doesn't exist, but we can't know for sure yet, since up until now no one has been able to prove that; just as no one has been able to prove that memory-hard implementations can actually prevent ASIC (or FPGA) implementations, or at least pull their effectiveness down to a near-CPU level. Instead of providing the needed/wanted/expected proof, a small truckload of theories and papers has already been practically proven wrong.

Nevertheless, assuming cryptocurrency is here to stay (which the most recent EU tax decisions in relation to cryptocurrencies definitely underline), we might get to a usable and fair PoW one day… even if there's a good chance that "memory-hardness" will not be a prime factor of the solution. After all, attackers only get smarter and attacks only get better. The search for better and more flexible solutions to known problems has been ongoing for a pretty long time now (as a random example, see the 1995 paper "An SRAM-programmable field-configurable memory" as well as bordering patents), and might push "memory-hardness" into an abandoned corner unless CPU technology, as well as desktop systems as a whole, can make some similar jumps forward soon.

Looking at military crypto, it's a wonder we still tend to rely on CPUs instead of dedicated ASICs and speedy FPGAs. This especially hits one in the face when thinking about cryptocurrency mining: while PoW resembles a high-speed race with a few sprinkles of luck to go with it, some still seem to hope they'll win that race against fellow miners (and potential attackers) while riding a "memory-hard" bicycle. Personally, I sincerely doubt memory-hardness by itself will ever be able to provide the fair chance those people are hoping for.

$\color{lightgray}{∎}$

— Mike Edward Moras