
Consider a program written in a common language such as C, with no explicit parallel constructs. Once it is compiled to an executable, it will run serially, even if the machine has multiple processors.

Now, suppose we wanted, theoretically, to write a "parallel compiler": one that analyzes such a program, figures out which operations are safe to execute concurrently, and outputs an executable that runs in parallel on multiple processors, such that, no matter the timing between the threads, the program ends up with the same results as it would under the serial compiler.

I am guessing that such a compiler, if possible at all, would need a very long time to run; I suspect the problem "parallelize an arbitrary program" must be at least NP-hard (possibly requiring exponential time). Is that known, and if so, is it true, and if so, can you point me to a published argument?


Let me try to formalize the problem better. Assume each instruction emitted by the compiler executes in unit time, and that the processors communicate and access memory instantly. Then, for a given number of processors, since some instructions depend on others, there is in theory a shortest possible time in which the program can complete.

Let's say we want our compiler, not necessarily to reach this theoretical limit, but to emit instructions that take at most 2x the theoretical limit.

Raphael
user54388

2 Answers


It is not clear what exactly you want of that compiler, so let me explore several possibilities.

Optimize the parallelism

This is impossible. Producing a program with optimal sequential runtime is already uncomputable (a standard reduction from the halting problem shows this: an optimizer that could always find the fastest equivalent program could decide whether arbitrary code is ever executed), and a similar proof works for parallel runtime. The proof also extends to any notion of "almost optimal", such as being within a constant factor of the optimum.

Thus, any notion of parallelization in this direction is not even in NP, and it's not even close.

Get some parallelism

Without defining exactly what "some" means, we cannot classify the problem in terms of complexity. There are certainly things modern compilers can do, for instance vectorizing suitable for-loops -- but all of these are certainly not NP-hard to detect (or compilers would not do them).

I'm certain that you can carefully define some parallelization feature that is NP-complete to detect. I'd look especially towards coarse-grained parallelism, though exploiting it requires serious rewriting of the code.

Do anything

That's trivial and hence boring. For instance, standard data-flow analyses will identify (sub)sequences of statements that are independent and can thus be reordered or, in principle, be executed in parallel. This will get you some very fine-grained parallelism that may not even gain you a speed-up, and will certainly not scale to many processors, but well.

Some languages have features for telling the compiler that two things are independent, e.g. method calls, and that way lies potential gain. Then again, it's easy for the compiler.

Raphael

You'd need to define exactly what you mean by "parallelize". Besides, finding a parallelization (which is presumably what a compiler is asked to do) is a search problem, while NP is a class of decision problems. Search problems can't be "NP-complete"; they are a different kettle of fish. Sure, problems in NP often have corresponding search problems (see for example Bellare's "Decision vs search"), and if one is hard, so is the other. Let's be loose with language and talk about "NP-hard search problems" when the corresponding decision problem is NP-hard.

Compilers "solve" NP-hard search problems as a matter of course, all over the place. But they usually only consider a subset of the full problem (lower-hanging fruit, if you will), use approximate solutions, or outright heuristics that "usually work on the programs seen in practice". A fascinating glimpse of the problems tackled and how they are addressed is given by one of LLVM's researchers, John Regehr, in his blog.

vonbrand