Is what the all-zeroes instruction does considered in machine instruction design?

Question

I remember using a simulated machine with a minimal instruction set in university. In particular, I remember that the instruction consisting of all zeroes was a LOAD instruction. So if you made a mistake and began executing empty memory, it would simply keep executing until it ran out of address space.

x86 appears to decode zeroed memory as an ADD, so its behaviour is similar to the example above.

On the 6502, on the other hand, all zeroes corresponds to the BRK instruction, which triggers a software interrupt, so executing empty memory quickly puts the machine in a known state.
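To make the contrast concrete, here is a minimal sketch of the two behaviours described above. It models a purely hypothetical toy machine, not x86 or the 6502: every instruction is one byte long, and the `zero_opcode_traps` flag (a name invented for this illustration) selects whether opcode 0 is a harmless instruction or a trap.

```python
def run(memory, zero_opcode_traps):
    """Run a hypothetical one-byte-per-instruction machine.

    Returns (instructions_executed, reason_for_stopping).
    """
    pc = 0
    steps = 0
    while pc < len(memory):
        opcode = memory[pc]
        if opcode == 0 and zero_opcode_traps:
            # BRK-like design: the all-zeroes opcode stops execution at once.
            return steps, "trap"
        # LOAD-like design (and every other opcode in this toy model):
        # treated as a harmless instruction, so execution just continues.
        steps += 1
        pc += 1
    return steps, "ran off end of memory"


empty = bytes(256)  # 256 bytes of zeroed memory
print(run(empty, zero_opcode_traps=False))  # (256, 'ran off end of memory')
print(run(empty, zero_opcode_traps=True))   # (0, 'trap')
```

In the first design the mistake is only noticed once the whole address space has been consumed; in the second it is caught before a single instruction completes.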

My understanding is that 6502 was designed to be written by humans, whereas (at least modern) x86 is designed for compilers to output. Historically, when a machine instruction set is designed, how much attention has been paid to the specific representation each instruction has, given that it can have real effects? I've focused on what the all-zeroes instruction does because I expect that would be the most common instruction to be accidentally executed, but I'd be interested in considerations relating to other instruction representations as well.

score 2 · Accepted Answer · answered Nov 03 '17 at 03:12

In general, ease of decoding and code density seem to be given more emphasis when designing an instruction encoding than catching the accidental execution of data (though the danger of NOP slides is well known).

Catching the all-zeroes case is rather inexpensive, and a number of ISAs specify that it will always be an illegal instruction. Two examples:

RISC-V (The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Version 2.1): "We reserve all-zero instructions to be illegal instructions to help trap attempts to execute zero-ed or non-existent portions of the memory space. The all-zero value should not be redefined in any non-standard extension. Similarly, we reserve instructions with all bits set to 1 (corresponding to very long instructions in the RISC-V variable-length encoding scheme) as illegal to capture another common value seen in non-existent memory regions."

Power (Power ISA, Version 2.07): "An instruction consisting entirely of binary 0s is guaranteed always to be an illegal instruction. This increases the probability that an attempt to execute data or uninitialized storage will result in the invocation of the system illegal instruction error handler."
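As a rough illustration of why this check is cheap, the reservation quoted above amounts to two comparisons on the fetched word. The sketch below is not taken from any real decoder; the function name `is_reserved_illegal` and the fixed-width `width` parameter are assumptions made for the example.

```python
def is_reserved_illegal(word: int, width: int = 32) -> bool:
    """Return True if `word` is all zeroes or all ones at the given bit width.

    Hypothetical helper: it models only the two reserved patterns, not a
    full instruction decoder.
    """
    mask = (1 << width) - 1
    word &= mask
    return word == 0 or word == mask


print(is_reserved_illegal(0x00000000))  # True: zeroed memory
print(is_reserved_illegal(0xFFFFFFFF))  # True: e.g. erased or non-existent memory
print(is_reserved_illegal(0x00000013))  # False: the RISC-V NOP (addi x0, x0, 0)
```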

One might expect memory access protection mechanisms to prevent accidental execution of data, but special-casing the all-zero instruction is an inexpensive means of providing a slightly deeper defense. (In addition, some embedded processors do not provide any memory protection.)

Mitch Alsup (a semi-retired computer architect) has designed a RISC-inspired ISA that has a somewhat broad range of illegal instructions meant to catch executing data. From a comp.arch post: "And in particular, opcodes near integer zero are decoded as unimplemented, and Opcodes near FP one (~10**-5 to ~10**5) are also decoded as unimplemented; in both positive and negative senses." "Thus jumping into data is highly likely to result in attempting to execute an unimplemented instruction."
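To see why "near FP one" maps to a compact region of encodings, the sketch below prints the most significant byte of several values in the quoted range when stored as IEEE 754 single precision. Looking at the top 8 bits is an assumption made purely for illustration; the quoted post does not say which bits form the opcode field, and `high_byte` is a hypothetical helper.

```python
import struct


def high_byte(x: float) -> int:
    """Most significant byte of x encoded as a big-endian IEEE 754 float32."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 24


# Endpoints and midpoint of the quoted range, in both signs.
for x in (1e-5, 1.0, 1e5, -1e-5, -1.0, -1e5):
    print(f"{x:g} -> 0x{high_byte(x):02X}")
```

Under these assumptions, positive values in the range have a high byte between 0x37 and 0x47, and negative values between 0xB7 and 0xC7. Reserving a few values of the leading bits is therefore enough to reject most floating-point data of ordinary magnitude.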
