Suppose I have code such as
#include <stdint.h>

namespace
{
    struct Thing {
        uint8_t a;
        uint8_t b;
        uint8_t c;
        uint8_t d;
    };

    auto& magicRegister = *reinterpret_cast<volatile uint8_t*>(0x1234);
    auto magicMemory = reinterpret_cast<Thing*>(0x2000); // 0x2000 - 0x20ff
}

int main()
{
    magicMemory[0] = { .a = 1, .b = 2, .c = 3, .d = 4 };
    magicRegister = 0xff; // Transfer
    magicMemory[0].d = 5;
    magicRegister = 0xff; // Transfer
}
in which writing 0xff to the byte at address 0x1234 makes the system copy 256 bytes from the region starting at 0x2000 to otherwise inaccessible memory. The contents of the memory region only matter at the moment the transfer takes place.
How can I guarantee that GCC's optimization won't mess up the order of operations?
I see a few options.
- Make magicMemory volatile: I dislike this, because the contents of the memory region don't matter until the transfer is initiated.
- Add asm("" ::: "memory"); before each transfer: simple and effective, but may prevent some unrelated code from being properly optimized.
int main()
{
    magicMemory[0] = { .a = 1, .b = 2, .c = 3, .d = 4 };
    asm("" ::: "memory");
    magicRegister = 0xff; // Transfer
    magicMemory[0].d = 5;
    asm("" ::: "memory");
    magicRegister = 0xff; // Transfer
}
- Same as #2, but add
asm(""
:
: "m"(*reinterpret_cast<uint8_t*>(0x2000)),
"m"(*reinterpret_cast<uint8_t*>(0x2001)),
"m"(*reinterpret_cast<uint8_t*>(0x2002)),
// ...
"m"(*reinterpret_cast<uint8_t*>(0x20fd)),
"m"(*reinterpret_cast<uint8_t*>(0x20fe)),
"m"(*reinterpret_cast<uint8_t*>(0x20ff)));
instead: stupid, but technically should work. Sadly, it crashes the GCC port I'm using.
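A more compact variant of the per-byte operand list might be a single "m" input whose type is an array spanning the whole region. GCC's Extended Asm documentation describes casting a pointer to a sized array type for this purpose. The sketch below is untested on the port in question and makes several assumptions: fakeRegion, fakeRegister and inaccessibleCopy are host-side stand-ins for the real addresses and the hardware copy, and transfer() is a hypothetical helper name. It also relies on GCC keeping the asm statement ordered before the volatile register store, which I believe holds in practice but have not verified here.

```cpp
#include <cstdint>
#include <cstring>

namespace
{
    struct Thing {
        uint8_t a;
        uint8_t b;
        uint8_t c;
        uint8_t d;
    };

    // Host-side stand-ins (assumptions for this sketch). On the real target
    // these would be the fixed addresses 0x2000 and 0x1234.
    Thing fakeRegion[64];           // 64 * 4 = 256 bytes
    volatile uint8_t fakeRegister;
    uint8_t inaccessibleCopy[256];  // simulates the destination of the transfer

    auto magicMemory = fakeRegion;  // decays to Thing*
    auto& magicRegister = fakeRegister;

    // Hypothetical helper: declare the whole region as an input, then trigger.
    inline void transfer()
    {
        // One "m" input covering all 256 bytes tells GCC that this statement
        // reads the region, so pending stores to it must be completed first.
        // There is no "memory" clobber, so other objects are not constrained.
        asm volatile(""
                     :
                     : "m"(*reinterpret_cast<const uint8_t (*)[256]>(magicMemory)));
        magicRegister = 0xff; // Transfer

        // Simulation only: the real hardware performs this copy by itself.
        std::memcpy(inaccessibleCopy, fakeRegion, sizeof inaccessibleCopy);
    }
}
```

With this helper, main would assign magicMemory[0] = Thing{1, 2, 3, 4}, call transfer(), set magicMemory[0].d = 5, and call transfer() again. Whether the array-typed operand avoids the crash seen with 256 separate operands is something that would have to be tried on the actual GCC port.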
EDIT: What if I also had this:
uint8_t unrelatedVariable;

int main()
{
    unrelatedVariable = 5; // This is redundant and should be optimized away
    // same as before
    unrelatedVariable = 6;
}
Is it possible to fix the ordering of accesses to magicRegister and magicMemory without affecting unrelatedVariable?