Is there a particular reason for calling conventions with predetermined order of registers used for passing arguments?

Question

For example, calling `int func( int a, int * b...)` will put `a` and `b` into registers r0, r1 and so on, call the function, and return the result in r0 (remark: I am speaking very generally here, not about any specific processor or calling convention; also, let's assume fast-call passing of arguments in registers).
Now, wouldn't it be better if each function were compiled with its arguments passed in registers that are already preferred for their types (pointers in preferred pointer/base registers, array-like data in vector registers...), instead of following one of a few calling-convention rules that use the more-or-less strict order of function arguments as given in the function prototype?
This way the code might avoid a few instructions used to shuffle arguments between registers before and after such a function call, and also avoid reusing the same basic registers over and over. Of course, this would require some extra data for each function (presumably kept in a database or object files), but it would be beneficial even in the case of dynamic linking.
So, are there any strong arguments against doing something like that in the 21st century, except maybe historical reasons?

I think "presumably kept in a database or object files" pretty well demonstrates the problem with this proposal, at least w.r.t. C and C++. Separate compilation is separate compilation; there's no need for the object files which will be linked to be physically present (or even exist) at the time of compilation. — rici, Sep 11 '22 at 21:39
@rici I agree, but I also think that the benefits of doing something like this might be stronger than the argument you gave. — gox, Sep 11 '22 at 21:54
Having a well defined ABI also makes it possible to mix languages or compilers. — Mark Ransom, Sep 11 '22 at 22:01
@MarkRansom Right, but a new "loose" convention might do it as well. — gox, Sep 11 '22 at 22:04
What CPU do you have in mind that has "preferred pointer/base registers", as opposed to general-purpose registers? What's the C++ syntax for passing "array-like data"? — Igor Tandetnik, Sep 11 '22 at 22:35
@IgorTandetnik In x86 and later, (R/E)SI, (R/E)DI... have some special roles, too (for example, see [here](https://stackoverflow.com/a/23367772/19797016)), although they could probably be changed easily with the ModRM and SIB bytes (see [AMD64](https://www.amd.com/system/files/TechDocs/24594.pdf)). For "array-like data" maybe something like `uint8_t [16]` and similar. — gox, Sep 11 '22 at 23:00
A function like `void f(uint8_t arr[16])` is equivalent to `void f(uint8_t* arr)`; it takes a pointer. — Igor Tandetnik, Sep 12 '22 at 01:22
IMO, it is very naive (sorry) to think that some standardization is possible among the arguments passed to functions. There are a zillion different cases. — , Sep 12 '22 at 07:13
@IgorTandetnik Formally yes, I thought about it more like vector data or a struct that might fit into a vector register (X/Y/ZMM). — gox, Sep 12 '22 at 13:48
@YvesDaoust You are right, it's probably naive, since we are still doing things that were set in the '60s and '70s... sigh. However, the computing power today is way higher, so we could and should have a better description of functions, other than something which is based on our fairly limited textual representation of their prototypes and name mangling. It would help compilers and linkers to better optimize and utilize all the functions and data in libraries etc. — gox, Sep 12 '22 at 14:00
I disagree with you, it is not a matter of computing power, just a matter of feasibility and usefulness. You seem to believe that this could yield significant performance gains: prove it. — , Sep 12 '22 at 14:23
@YvesDaoust I understand that almost no one would change something that is operational and has been refined over decades to gain 1 or 2 percent in efficiency -- standards are never perfect (neither here in programming, nor anywhere else), but provide a known environment. However, compilers sometimes do a terrific job, and sometimes they completely disappoint, as if the compiled code was written by a kid. In my opinion, providing more information would help us to better shape our algorithms and data in the first place, and then help compilers to generate more efficient code. — gox, Sep 12 '22 at 14:39
No, I mean prove it by quantitative reasoning. Also notice that compilers already have good register allocation algorithms (even though optimal register allocation is an NP-complete problem), and they surely would have addressed your idea long ago if there was any opportunity. — , Sep 12 '22 at 16:31
Surely transferring passed parameters to the optimal register at the start of a function is the easiest and most efficient thing a compiler can do. Any efficiency gain is likely to be unmeasurable. — Mark Ransom, Sep 15 '22 at 18:04
@MarkRansom I generally agree, it would probably gain only a few clock cycles here and there compared to the thousands of cycles a typical function takes, especially when registers are renamed in a typical high-end CPU (or maybe the effect on this would be beneficial?). Probably not worth implementing, but... who knows? — gox, Sep 15 '22 at 18:32

icebp · Answer 1 · 2022-09-12T07:14:34.577

For functions that are called from a single place, or that are at the very least static so that the compiler can see every call site (assuming the compiler can prove that the function's address is never passed around as a function pointer), this could be done.

The catch is that almost every time this can be done it is done, but in a slightly different way: the compiler can inline code, and once inlined, the calling convention is no longer needed, because there is no longer a call being made.
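As a minimal sketch of that situation (the names `scale_add` and `process` are hypothetical, chosen only for illustration), consider a helper with internal linkage whose address is never taken. The compiler sees every call site, so it is free to inline the helper or otherwise ignore the platform's argument-register order for it:

```c
#include <stdint.h>

/* Hypothetical helper: static, and its address is never taken, so the
   compiler knows every caller and need not follow the public ABI. */
static int32_t scale_add(int32_t x, const int32_t *bias) {
    return x * 3 + *bias;
}

/* Externally visible function: this one must follow the platform ABI. */
int32_t process(int32_t v) {
    int32_t bias = 7;
    /* An optimizing compiler will typically inline this call, leaving
       no argument passing at all for scale_add. */
    return scale_add(v, &bias);
}
```

Whether the call is actually inlined depends on the compiler and the optimization level; inspecting the generated assembly is the only way to be sure.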

But let's get back to the database idea: you could argue that this has no runtime cost. When generating the code, the compiler checks the database and generates the appropriate code. This doesn't really help in any way. You still have a calling convention, but instead of having one (or a few) that is respected by all code, you now have a different calling convention for each function. Sure, the compiler no longer needs to put the first argument in r0, but it needs to put it in r1 for foo, in r5 for bar, etc. There's still overhead for setting up the proper registers with the proper values. Knowing which registers to restore after such a function call also becomes harder. Calling conventions specify clearly which registers are volatile (so their values may be lost upon returning from a called function) and which are non-volatile (so their values are preserved).

A far more useful feature is to generate the code of the called function in such a way that it uses the registers that already happen to hold those values. This can happen when inlining code.

To add to this, I believe this is what Rust does in Rust-to-Rust calls. As far as I know, the language does not have a fixed calling convention. Instead, the compiler tries to generate code based on which registers already hold the argument values. Unfortunately I can't seem to find any official docs about this, but this rust-lang discussion may be of help.

Going one step further: not all code paths are known at compile time. Think about function pointers. Consider the following code:

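```c
/* Handlers defined elsewhere; all share one signature. */
void foo(int arg1);
void bar(int arg1);
void baz(int arg1);

typedef void (*my_function_ptr_t)(int arg1);

/* The target is chosen at run time, so the call site cannot know it. */
my_function_ptr_t get_function(int value) {
    switch (value) {
        case 0: return foo;
        case 1: return bar;
        default: return baz;
    }
}

void do_some_stuff(int a, int b) {
    my_function_ptr_t handler = get_function(a);
    handler(b);  /* indirect call: one convention must fit foo, bar, and baz */
}
```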

Under the database proposal, foo, bar, and baz can have completely different calling conventions. This means either that you can't have working function pointers anymore, or that the database needs to be accessible at runtime and the compiler has to generate code that consults it at runtime in order to call a function through a function pointer correctly. This can have some serious overhead, for no actual gain.
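The runtime-lookup alternative can be sketched as a toy model in portable C. Everything here is an assumption made for illustration: registers cannot be named in C, so "different conventions" are modelled as different argument slots, and `handler_a`, `handler_b`, `db`, and `call_indirect` are hypothetical names.

```c
#include <stddef.h>

/* Toy model: two "conventions" that differ in which slot carries the
   real argument (standing in for "which register"). */
typedef int (*conv_a_fn)(int arg, int unused);  /* argument in slot 0 */
typedef int (*conv_b_fn)(int unused, int arg);  /* argument in slot 1 */
typedef void (*generic_fn)(void);               /* type-erased pointer */

enum convention { CONV_A, CONV_B };

/* One record of the hypothetical per-function database. */
struct db_entry {
    generic_fn fn;
    enum convention conv;
};

static int handler_a(int arg, int unused) { (void)unused; return arg + 1; }
static int handler_b(int unused, int arg) { (void)unused; return arg * 2; }

static const struct db_entry db[] = {
    { (generic_fn)handler_a, CONV_A },
    { (generic_fn)handler_b, CONV_B },
};

/* Every indirect call must search the database before it can place the
   argument where the callee expects it. */
int call_indirect(generic_fn fn, int arg) {
    for (size_t i = 0; i < sizeof db / sizeof db[0]; ++i) {
        if (db[i].fn != fn)
            continue;
        switch (db[i].conv) {
            case CONV_A: return ((conv_a_fn)fn)(arg, 0);
            case CONV_B: return ((conv_b_fn)fn)(0, arg);
        }
    }
    return -1;  /* not in the database: it cannot be called safely */
}
```

With a single shared convention, the search and the `switch` disappear and the call compiles to one indirect branch, which is the overhead the paragraph above refers to.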

This is by no means an exhaustive list of reasons, but the main idea is this: by having each function expect arguments to be in different registers you replace one calling convention with many calling conventions without gaining anything from it.

Right, this is pretty much the strong argument I asked for -- I didn't think of the function-pointer case you mentioned. However, in such a case the function itself should be declared through the `typedef` as in your example, meaning that the arguments should preferably have a globally set order etc. — gox, Sep 12 '22 at 14:14
