45

While reading Keywords That Aren't (or, Comments by Another Name) by Herb Sutter I came across these lines:

That's right, some keywords are semantically equivalent to whitespace, a glorified comment.

And

We've seen why the C++ language treats keywords as reserved words, and we've seen two keywords —auto and register — that make no semantic difference whatsoever to a C++ program. Don't use them; they're just whitespace anyway, and there are faster ways to type whitespace.

If keywords like auto (at least before C++11) and register are of no value, why were they created and used?

If it makes no difference to put register before a variable declaration, consider this C program:

#include<stdio.h>
int main(){
    register int a = 15;
    printf("%d\n%d\n",&a,a);
    return 0;
}

Why does the above program give an error?

test_register.c: In function ‘main’:
test_register.c:4:2: error: address of register variable ‘a’ requested
  printf("%d\n%d\n",&a,a);

The following program works in C++.

#include<iostream>
int main(){
    register int a = 15;
    std::cout<<&a<<'\n'<<a;
    return 0;
}
ayushgp
  • 5
    For compatibility with the C of the time? – juanchopanza May 26 '16 at 09:18
  • 42
    They were of value in the past, when the compiler was not clever enough to decide about register on its own – M.M May 26 '16 at 09:19
  • @M.M @juanchopanza is `register` still relevant in C? – ayushgp May 26 '16 at 09:22
  • 5
    It's not relevant in modern C. Possibly it is if you are using an extremely ancient compiler – M.M May 26 '16 at 09:23
  • duplicate of [“register” keyword in C?](http://stackoverflow.com/q/578202/995714)? It's in C but the result in C++ would be similar – phuclv May 26 '16 at 09:27
  • @M.M I have edited the question. If it isn't relevant then why does it give an error? – ayushgp May 26 '16 at 09:36
  • 9
    @ayushgp `&` cannot be used on a variable declared as `register` – M.M May 26 '16 at 09:40
  • The program in your edit gives an error because it's a C program, not a C++ one. Rename it to `test_register.cpp` and compile it using a C++ compiler and it should be accepted. – Theodoros Chatzigiannakis May 26 '16 at 09:42
  • 1
    TL;DR for historical reasons that made sense back in 1970 but are completely irrelevant today. – n. m. could be an AI May 26 '16 at 09:44
  • @TheodorosChatzigiannakis that's weird. It works in C++. Even with arrays. But both of these fail in C. – ayushgp May 26 '16 at 09:46
  • 7
    @ayushgp I don't think it's weird at all, considering the stated meanings. In C, the address of a register variable can't be taken, because the variable is genuinely reserved for register storage. In C++ it *can* be taken, because the register keyword doesn't mean anything (yet). And it goes without saying that a C frontend will respect C rules, while a C++ frontend will respect C++ rules. – Theodoros Chatzigiannakis May 26 '16 at 09:48
  • @TheodorosChatzigiannakis But as stated [here](http://stackoverflow.com/questions/578202/register-keyword-in-c), the register keyword in modern C compilers doesn't make any difference, I find the behaviour of C weird. – ayushgp May 26 '16 at 09:51
  • 2
    @ayushgp In the C11 spec, 6.7.1p6 it says that *"the extent to which such suggestions are effective is implementation-defined"* and in footnote 121 it says *"whether or not addressable storage is actually used, the address of any part of an object declared with storage-class specifier register cannot be computed, either explicitly or implicitly"*. So it doesn't matter if it actually goes to a register. If it's declared as register, then the address can't be taken. – Theodoros Chatzigiannakis May 26 '16 at 09:53
  • @TheodorosChatzigiannakis so register is just a keyword in C++ while its behaviour is defined in C. I think you should add this C spec reference to your answer. – ayushgp May 26 '16 at 09:56
  • @TheodorosChatzigiannakis in C `register` does not mean "genuinely reserved for register storage". It may have in the 1970s but not any more. – M.M May 26 '16 at 10:02
  • @M.M "reserved" may have been bad wording for that comment, but I can't edit it anymore. What I meant is that `register` keyword imposes the restrictions anyway, and the implementation may or may not go ahead and actually put it in a register (in the sense of it "reserves the right" to do it, regardless of whether it actually does it). – Theodoros Chatzigiannakis May 26 '16 at 10:04
  • 3
    OK. another relevant point is the 'Itanium clause'; on a system with no trap representations, `int a,b; &a; b = a;` is well-defined, but `int a; b = a;` may cause a hardware trap due to the fact that Itanium registers have a trap state called "Not A Thing" – M.M May 26 '16 at 10:06
  • "in C++ is just a keyword without a special meaning while in C it is used to hint the compiler to store the variable in a register" no it's useless in C for over a decade [Is the register keyword still used?](http://stackoverflow.com/q/10675072/995714), https://msdn.microsoft.com/en-us/library/482s4fy9.aspx – phuclv May 26 '16 at 10:21
  • @LưuVĩnhPhúc why do modern C compilers still give an error then? – ayushgp May 26 '16 at 10:23
  • @LưuVĩnhPhúc I think he is trying to make the point that `register` keyword does make a difference (although that difference is not anything to do with whether a variable is stored in a register) – M.M May 26 '16 at 10:39
  • @ayushgp Because the standard still demands it. – Theodoros Chatzigiannakis May 26 '16 at 10:40
  • You don't need "EDIT" in your questions. Every post on Stack Overflow already has [a detailed edit history](http://stackoverflow.com/posts/37456494/revisions) that anyone can view. Stack Overflow isn't a discussion forum; try to confine your questions to *questions,* not discussions. – Robert Harvey May 26 '16 at 16:14
  • Should we mention the GCC explicit reg vars extension here, or is that out of scope for this question? – LThode May 26 '16 at 16:51
  • @RobertHarvey I'll keep that in mind next time. – ayushgp May 26 '16 at 16:52
  • 1
    This is an excellent example of a question that should *not* be tagged as both C and C++, since the behaviour is completely different between languages. This is two (pretty much unrelated) questions in one. – Alex Celeste May 26 '16 at 17:58
  • @M.M: There's nothing special about Itanium in that regard. Given `uint16_t foo(int x) { uint16_t q; ... code that might change q ... ; return q; }` many compilers would generate code that might return a value outside the range 0..65535 if `q` is never written. – supercat May 26 '16 at 19:21
  • Related to: [Replacement for deprecated register keyword C++ 11](http://stackoverflow.com/q/20618008/1708801). The rationale for the deprecation is covered there, so it may help. There is also a great quote on the use in C. – Shafik Yaghmour May 26 '16 at 19:34
  • @supercat on itanium it may generate a hardware trap, whereas the other platforms (assuming no trap representations for int) cannot – M.M May 26 '16 at 21:37
  • @M.M: Having a uint16_t hold a value which may non-deterministically exceed the range 0-65535 is outside the bounds of any defined behavior. If a compiler for any CPU is required to stay on the rails when accessing an uninitialized variable nominally through a pointer, it must generate code which ensures that cannot happen even if the pointer is in-lined to a register access. – supercat May 26 '16 at 21:41
  • @supercat The uint16_t would hold *indeterminate value*, for non-Itanium . [see this thread for discussion](http://stackoverflow.com/questions/25074180/is-aa-or-a-a-undefined-behaviour-if-a-is-not-initialized/) – M.M May 26 '16 at 21:43
  • @M.M: From the point of view of the Standard, a variable must always either hold a value of its type or a trap representation. A number outside the range 0-65535 is not a value of type uint16_t and consequently cannot be anything but a trap representation. – supercat May 26 '16 at 22:13
  • @supercat variable can also hold an *indeterminate value* , according to the Standard. Is discussed on linked thread, I don't want to rehash here in comments – M.M May 26 '16 at 22:20
  • @M.M: I saw nothing on that thread that mentioned the possibility of an uninitialized variable of a type which shouldn't have "room" for any trap representations holding a value outside the range of its type. I was also unimpressed with the DR's notion that multiplying an indeterminate value by zero would yield an indeterminate value. The reason that contagious indeterminate values facilitate propagation is that they allow expressions to be lazily evaluated. If `a=b*c; d=a;` is evaluated when `b` is indeterminate and `c` is zero, and from the compiler's POV nothing changes `b` or `c`... – supercat May 26 '16 at 23:04
  • ...between the above code and the next use of `d`, a compiler could substitute `b*c` for `d`. If `b` were to change for some unexpected reason and `c` wasn't zero, that could cause `d` to behave as a value unequal to `a`. If `c` equals zero, though, then `d` should equal zero even if `b` changes unexpectedly. – supercat May 26 '16 at 23:07
  • note the words *indeterminate value* that appear all over the thread. This is what the value of uninitialized variable is. It does not correspond to any particular representation or any particular range of values. – M.M May 27 '16 at 03:28
  • 1
    @M.M: Categorizing behavior for Indeterminate Value in a fashion which would allow a program to stay on the rails and yet not impede optimization would be tricky, but IMHO very worthwhile. Many algorithms can work nicely with an array which is initialized in such a fashion that each read of an unwritten element may yield a possibly-different Unspecified value, but the Standard presently includes no efficient way of turning an Indeterminate value into an Unspecified one [an action which should merely require placing a limited optimization barrier that will in most cases have zero cost... – supercat Jun 03 '16 at 16:39
  • 1
    ...as opposed to e.g. storing the value in a `volatile`-qualified local variable, storing the address of that variable in a `volatile`-qualified global, and then reading the automatic variable. The compiler would have no way of knowing whether the act of storing the address of the local to a volatile variable might have changed the contents of that local (if the global was mapped to a DMA controller, storing the address could alter memory at that address), and would thus have to do a physical memory read from that local variable. Such an approach would work, but be absurdly inefficient]. – supercat Jun 03 '16 at 16:43

3 Answers

54

register

In C, the register storage class was used as a hint to the compiler, to express that a variable should be preferentially stored in a register. Note that the hint to store a register variable in an actual register may or may not be honored, but in either case the relevant restrictions still apply. See C11, 6.7.1p6 (emphasis mine):

A declaration of an identifier for an object with storage-class specifier register suggests that access to the object be as fast as possible. The extent to which such suggestions are effective is implementation-defined.[footnote 121]

[footnote 121] The implementation may treat any register declaration simply as an auto declaration. However, whether or not addressable storage is actually used, the address of any part of an object declared with storage-class specifier register cannot be computed, either explicitly (by use of the unary & operator as discussed in 6.5.3.2) or implicitly (by converting an array name to a pointer as discussed in 6.3.2.1). Thus, the only operators that can be applied to an array declared with storage-class specifier register are sizeof and _Alignof.
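As a rough illustration of that footnote, here is a minimal C sketch. The function name register_array_bytes and both variables are made up for this example. The commented-out lines are the operations the quoted text rules out; only sizeof is actually applied to the register objects.

```c
#include <stddef.h>

/* Hypothetical helper: shows what C still permits on register objects. */
size_t register_array_bytes(void) {
    register int table[4];    /* never read, so it can stay uninitialized */
    register int scalar = 15;

    /* int *p = &scalar;   explicit address-of: violates the constraint on unary & */
    /* int *q = table;     implicit: array-to-pointer conversion of a register array */

    /* sizeof does not need an address, so both uses are allowed. */
    return sizeof table + sizeof scalar;
}
```

Uncommenting the first line should reproduce the GCC error quoted in the question. For the array case the standard does not require a specific diagnostic, so the exact message may vary between compilers.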

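In C++, register never carried the address-of restriction: it was only an optimization hint, deprecated since C++11 and left as an unused reserved keyword in C++17. It's reasonable to assume that it was kept for syntactic compatibility with C code.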

auto

In C, the auto storage class defines a variable of automatic storage, but it's not usually used since function-local variables are auto by default.

Similarly, it's reasonable to assume that it was initially carried over to C++ for syntactic compatibility only, although in C++11 it got its own meaning (type inference).
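To make the C meaning concrete, here is a small sketch; auto_is_default is a name invented for this example. It has to be compiled as C, because C++11 and later reject auto int now that auto is a type placeholder there.

```c
/* Hypothetical helper: both locals have automatic storage duration. */
int auto_is_default(void) {
    auto int explicit_auto = 5; /* storage class spelled out */
    int implicit_auto = 5;      /* same storage class, by default */
    return explicit_auto == implicit_auto;
}
```

The two declarations are interchangeable in C, which is why auto is rarely written out.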

Theodoros Chatzigiannakis
  • 2
    the auto keyword is because there's no type in the beginning of C, there are only storage classes [Where is the C auto keyword used?](http://stackoverflow.com/q/2192547/995714) – phuclv May 26 '16 at 09:25
  • 3
    @LưuVĩnhPhúc there was always `int` in C. I think `auto` hung around from BCPL and they left it in C to make porting code from BCPL to C easier – M.M May 26 '16 at 10:03
  • @M.M yes I was wrong about that. It's from B which doesn't have `int` keyword – phuclv May 26 '16 at 10:18
  • 1
    M.M: Ironically enough, `auto i = 10;` was allowed in both K&R and C90 in which case type was inferred using the implicit `int` rule. – Grzegorz Szpetkowski May 26 '16 at 10:19
  • 10
    Many, many years ago (late 80's) I saw a developer porting our software; he spent a few hours pasting "register" blindly in front of the first three variables in every function, and after that the whole application ran about 30% faster. – gnasher729 May 26 '16 at 15:07
  • 2
    @gnasher729: hope he submitted a feature request on the compiler, "please write a register allocator". – Steve Jessop May 26 '16 at 17:40
27

register in C served two purposes:

  • Hint to the compiler that the variable should be stored in a register for performance. This use is largely obsolete now.
  • Prevent the programmer from using the variable in ways that would prevent it from being stored in a register. This use is only somewhat obsolete IMO.

This is similar to const, which

  • Hints to the compiler that a variable may be stored in read-only memory.
  • Prevents the programmer from writing to the variable.

As an example, consider this simplistic function:

int sum(const int *values, size_t length) {
    register int acc = 0;
    for (size_t i = 0; i < length; ++i) {
        acc += values[i];
    }
    return acc;
}

The programmer has written register to keep the accumulator off the stack, avoiding a memory write every time it's updated. If the implementation gets changed to something like this:

// Defined in some other translation unit
void add(int *dest, int src);

int sum(const int *values, size_t length) {
    register int acc = 0;
    for (size_t i = 0; i < length; ++i) {
        add(&acc, values[i]);
    }
    return acc;
}

The acc variable can no longer be stored in a register when its address is taken for the add() call, because registers have no address. The compiler will thus flag &acc as an error, letting you know that you've probably destroyed the performance of your code by preventing acc from living in a register.
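If the pointer-taking call really is needed, one possible way to keep the register declaration is to route the call through an addressable temporary. This is only a sketch under assumptions: sum_via_temp and tmp are names invented here, and add() is given a local stand-in definition so the example compiles on its own.

```c
#include <stddef.h>

/* Stand-in for the add() declared above; assumed to add src into *dest. */
static void add(int *dest, int src) {
    *dest += src;
}

/* Hypothetical variant of sum() that never takes the address of acc. */
int sum_via_temp(const int *values, size_t length) {
    register int acc = 0;
    for (size_t i = 0; i < length; ++i) {
        int tmp = acc;          /* ordinary local, so &tmp is legal */
        add(&tmp, values[i]);   /* &acc here would not compile in C */
        acc = tmp;              /* copy the result back */
    }
    return acc;
}
```

Whether this is any faster than simply dropping register depends on the compiler; the point is only that the restriction applies to acc itself, not to copies of its value.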

This used to be a lot more important in the early days when compilers were dumber and variables would live in a single place for an entire function. Nowadays a variable can spend most of its life in a register, being moved onto the stack only temporarily when its address is taken. That is, this code:

/* Passed by reference for some reason. */
void debug(const int *value);

int sum(const int *values, size_t length) {
    int acc = 0;
    for (size_t i = 0; i < length; ++i) {
        acc += values[i];
    }
    debug(&acc);
    return acc;
}

would have caused acc to live on the stack for the whole function in older compilers. Modern compilers will keep acc in a register until just before the debug() call.

Modern C code does not generally use the register keyword.

Tavian Barnes
  • 2
    IMHO, the `register` keyword could be very useful if it didn't completely forbid the address of a variable from being taken, but allowed the compiler to treat the address thus returned in a fashion similar to a "restrict" pointer. Thus, something like `sscanf(st, "%d", &value);` could, at the compiler's convenience, be converted to `{ int temp; sscanf(st, "%d", &temp); value=temp;}`; if the benefits of treating `value` as though its address had never been taken outweighed the cost of creating the temporary. – supercat May 26 '16 at 19:14
  • @supercat Compilers do that kind of transformation anyway! The only restriction you'd be lifting is that currently two `log(&acc)` calls from within the same function have to see the same address. – Tavian Barnes May 26 '16 at 19:29
  • 2
    Presently, compilers are forbidden from keeping a variable of type `int` in a register while accessing a pointer of type `int*` if it's possible but not certain that the pointer identifies the variable, and likewise for other types. If a variable's address has never been taken, proving that a pointer of type `int*` couldn't possibly access it is trivial. If the address has been exposed to outside code, such proof is often much more difficult if not impossible. – supercat May 26 '16 at 19:47
  • 1
    Further, if an automatic variable's address has never been exposed to outside code, it can be cached in a register across calls to outside code. Once the variable has been exposed to outside code, however, its value must be written out to RAM on every subsequent call to outside code. – supercat May 26 '16 at 19:53
  • @supercat In [this example](https://godbolt.org/g/5HNPXM), both inner loops run with `acc` in `%edx`. It is spilled and restored from the stack around each `bar()` call. You are right though, if there were intervening function calls in the loop (even if they don't touch `bar`!), it has to spill around those too ([example](https://godbolt.org/g/t628L4)). – Tavian Barnes May 26 '16 at 19:57
  • @supercat Surprisingly, `gcc` [still spills `acc`](https://godbolt.org/g/gIAsyF) in the first loop, before the address of `acc` is taken, even at `-O3`. I think it would be okay to run the first loop with `acc` in a register only. – Tavian Barnes May 26 '16 at 20:02
  • Replace the passed-in pointers with double-indirect pointers (accessed via e.g. `a1[0][i]`), and you'll see that passing the address of `acc` to outside code horribly degrades efficiency of the inner loops compared with copying it to a temp, passing the address of that temp, and copying it back. – supercat May 26 '16 at 20:08
  • @TavianBarnes slightly unrelated, but if that's vanilla C, don't name a function `log`, it will silently and painfully destroy any numeric code you have :-( – k_g May 26 '16 at 23:41
  • 2
    I'll note that `register` is still highly relevant in embedded programming. – chrylis -cautiouslyoptimistic- May 27 '16 at 02:49
10

The C99 Rationale provides some more context on the register keyword:

Rationale for International Standard — Programming Languages — C

§6.7.1 Storage-class specifiers

Because the address of a register variable cannot be taken, objects of storage class register effectively exist in a space distinct from other objects. (Functions occupy yet a third address space.) This makes them candidates for optimal placement, the usual reason for declaring registers; but it also makes them candidates for more aggressive optimization.
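A minimal sketch of what "a space distinct from other objects" can mean in practice; scale_in_place and its parameters are invented for this illustration. Since no pointer can legally refer to factor, stores through data cannot change it. An optimizer can usually work out the same thing for any local whose address is never taken, so this shows the guarantee rather than a measurable speed-up.

```c
#include <stddef.h>

/* Hypothetical example: factor cannot be aliased by data. */
void scale_in_place(int *data, size_t n, int factor_in) {
    register int factor = factor_in; /* its address can never be taken */
    for (size_t i = 0; i < n; ++i) {
        data[i] *= factor;           /* this store cannot modify factor */
    }
}
```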

Yu Hao
  • Worth mentioning that compilers can easily tell if the address of a variable is taken or not, so saying `register` doesn't convey any additional information. – user541686 May 27 '16 at 07:00
  • @Mehrdad: If "register" were allowed for global variables, it could provide very useful information. It could also allow very useful information if code were allowed to take the address of register variables *but* the address thus taken would be treated as a restrict-qualified pointer [meaning that a compiler would be allowed to keep a variable in a register across all pointer accesses except those which occur between the time a variable's address is taken and the next time that variable is used by name]. – supercat Jun 03 '16 at 16:34