Return by value
So you've been writing some C after using Python or Go or Haskell or pretty much anything other than C and you're jealous of being able to return more than one thing from a function. How do you do it in C?
The standard way is to return one thing (perhaps the "primary" thing you're returning) as the return value, and then have the caller provide pointers for the rest as "output parameters". You could return everything by value in a struct, but it feels like all those copies might be bad, maybe?
Let's check. To reduce inlining confusion, let's split it across multiple object files. So here's the interface:
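/* lib.h */
typedef struct { int a; int b; } Pair;

Pair return_pair(void);           /* both values returned by value */
void fill_pair(Pair* p);          /* one output parameter */
void fill_ints(int* a, int* b);   /* two output parameters */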
And the trivial implementation:
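#include "lib.h"

/* Return both values in a struct, by value. */
Pair return_pair(void) {
  Pair p = { 3, 5 };
  return p;
}

/* Fill in a struct owned by the caller. */
void fill_pair(Pair* p) {
  p->a = 3;
  p->b = 5;
}

/* Fill in two separate ints owned by the caller. */
void fill_ints(int* a, int* b) {
  *a = 3;
  *b = 5;
}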
And finally here's main to run it, including calls to the functions so we can see what work the caller must do.
#include "lib.h"

int main(int argc, char** argv) {
  Pair p1 = return_pair();

  Pair p2;
  fill_pair(&p2);

  int a, b;
  fill_ints(&a, &b);

  /* Use the results so the calls can't be discarded. */
  return p1.a + p2.a + a;
}
And now to a disassembler.
Starting with the last function, fill_ints(). Passing in two pointers means that two registers get addresses put into them:
0x00000000004004e7 <+23>: lea 0x8(%rsp),%rsi
0x00000000004004ec <+28>: lea 0xc(%rsp),%rdi
0x00000000004004f1 <+33>: callq 0x400530 <fill_ints>
and the implementation of fill_ints() fills in the pointees. Pretty much what you'd expect.
Dump of assembler code for function fill_ints:
0x0000000000400530 <+0>: movl $0x3,(%rdi)
0x0000000000400536 <+6>: movl $0x5,(%rsi)
0x000000000040053c <+12>: retq
The fill_pair implementation is similar, but with just one pointer and two offsets.
return_pair is quite different:
Dump of assembler code for function return_pair:
0x0000000000400510 <+0>: movabs $0x500000003,%rax
0x000000000040051a <+10>: retq
Because two ints fit in a 64-bit register, the whole function can be implemented with one immediate load and no memory accesses!
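To see where that immediate comes from, here's a minimal sketch that reinterprets the bytes of a Pair as one 64-bit integer. The helper name pair_bits is made up for illustration, and the result assumes a 4-byte int and a little-endian machine such as x86-64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { int a; int b; } Pair;

/* Hypothetical helper: view the 8 bytes of a Pair as a single 64-bit
   value, which is how the compiler treats it when returning it in %rax.
   Assumes sizeof(int) == 4 and no padding between the fields. */
uint64_t pair_bits(Pair p) {
  uint64_t bits;
  assert(sizeof p == sizeof bits);
  memcpy(&bits, &p, sizeof bits);
  return bits;
}
```

On a little-endian machine, a lands in the low 32 bits and b in the high 32 bits, so pair_bits((Pair){ 3, 5 }) comes out as 0x500000003, the same constant as in the movabs above.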
But surely, you say, that's just because your Pair type is simple. How about pointers? If the second field were a pointer, it wouldn't fit into a single register.
Here's what a pair of an int and a pointer compiles down to:
Dump of assembler code for function return_pair2:
0x0000000000400510 <+0>: mov $0x40062c,%edx
0x0000000000400515 <+5>: mov $0x3,%eax
0x000000000040051a <+10>: retq
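Again no memory references, just registers: the int comes back in %eax and the pointer in %rdx (written here through %edx, since the address fits in 32 bits).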
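The source for return_pair2 isn't shown here, so the following is only a guess at what such a type might look like. The name Pair2, the field s, and the string literal are all assumptions for illustration, not the code that produced the dump.

```c
#include <assert.h>

/* Assumed shape: an int plus a pointer. On x86-64 this is 16 bytes
   (4 for the int, 4 of padding, 8 for the pointer). */
typedef struct { int a; const char* s; } Pair2;

/* Small enough to come back in two registers rather than memory. */
static_assert(sizeof(Pair2) <= 2 * sizeof(void*),
              "Pair2 should fit in two pointer-sized registers");

Pair2 return_pair2(void) {
  Pair2 p = { 3, "five" };  /* the pointer is a link-time constant */
  return p;
}
```

Because the address of the string literal is fixed once the program is linked, the compiler can load it as an immediate, which would explain a dump with two mov instructions and nothing else.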
Ok, how about something that can't fit in multiple registers? Like say a buffer.
typedef struct { int a; int big[1024]; } Pair;

Pair return_pair3(void);
and the associated code:
Pair return_pair3(void) {
  Pair p;
  p.a = 3;
  p.big[0] = 5;
  return p;
}
Here's the dump:
Dump of assembler code for function return_pair3:
0x0000000000400510 <+0>: sub $0xfa0,%rsp
0x0000000000400517 <+7>: mov %rdi,%rax
0x000000000040051a <+10>: movl $0x3,(%rdi)
0x0000000000400520 <+16>: movl $0x5,0x4(%rdi)
0x0000000000400527 <+23>: add $0xfa0,%rsp
0x000000000040052e <+30>: retq
To "return" a large structure, the caller provides stack space for it and passes its address in %rdi; the function fills in the caller's copy and hands the same address back in %rax, sorta like the return value optimization. This code is the same as the code that explicitly passes a pointer. (I don't get why this function adjusts %rsp, it seems like it doesn't even use it...)
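Written out by hand, the transformation looks roughly like the sketch below. The names return_pair3_lowered and caller_sketch are made up for illustration; this is just C that mirrors the dump, not something the compiler emits as source.

```c
#include <assert.h>

typedef struct { int a; int big[1024]; } Pair;

/* Roughly what return_pair3 becomes: the result's address arrives as
   a hidden first argument (%rdi) and is handed back (%rax). */
Pair* return_pair3_lowered(Pair* result) {
  result->a = 3;
  result->big[0] = 5;
  return result;
}

/* Caller side: reserve stack space for the result and pass its address. */
int caller_sketch(void) {
  Pair slot;
  Pair* r = return_pair3_lowered(&slot);
  assert(r == &slot);  /* the callee returns the address it was given */
  return slot.a + slot.big[0];
}
```

The body of return_pair3_lowered is the same pair of stores as the dump: one at offset 0 and one at offset 4 from the pointer the caller supplied.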
In each of these cases, returning by value generates code that is equal to or better than the pointer-based approaches. So why not do it?
Here are some reasons. (Note that I'm avoiding C++ here, which has its own additional complicated rules, as described in the Wikipedia article on the return value optimization.)
- Most importantly, you need to create a new tuple type whenever you want to pass more than one value around. That is inconvenient, especially when the caller already has a variable handy for the value it wants to get back from the function and could just pass its address.
- Passing structures via registers appears to be a newer ABI; gcc has -fpcc-struct-return and -freg-struct-return to select between them. But my system appears to be built with return-via-registers on (it appears that it was introduced into gcc around the year 2000), and even when I manually select returning via memory it just means return_pair and return_pair2 decompose into the behavior of return_pair3.
- If your structure contains any character buffer, the function gains a bunch of checking code due to -fstack-protector, removing the benefit.
- For larger structures, you may have to worry about stack space. But such things don't belong on the stack in the first place; you are working with pointers to them to start with, so functions that fill in those pointers are more convenient anyway.
- (This point and the following were contributed by Jeffrey Yasskin after the post was first published.) If you return several different variables, depending on conditions inside the function, NRVO doesn't kick in. This is often a missed optimization in the compiler, but we still have to deal with it.
- If the return value owns some allocated space, you can often save allocation time by passing in a variable that already has the space allocated.
- Insert your reason here. What else am I missing?
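To make the point about allocated space concrete, here's a small sketch contrasting the two styles. The Buf type and the functions fill_greeting and make_greeting are hypothetical, invented only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer that owns heap memory. */
typedef struct { char* data; size_t len; size_t cap; } Buf;

/* Output-parameter style: reuse whatever capacity the caller's Buf
   already has, and only reallocate when it is too small.
   Returns 0 on success, -1 if allocation fails. */
int fill_greeting(Buf* out, const char* name) {
  static const char prefix[] = "hello, ";
  size_t plen = sizeof prefix - 1;
  size_t nlen = strlen(name);
  size_t need = plen + nlen + 1;  /* +1 for the terminating NUL */
  assert(out != NULL);
  if (need > out->cap) {
    char* grown = realloc(out->data, need);
    if (grown == NULL) return -1;
    out->data = grown;
    out->cap = need;
  }
  memcpy(out->data, prefix, plen);
  memcpy(out->data + plen, name, nlen + 1);
  out->len = plen + nlen;
  return 0;
}

/* Return-by-value style: starts from an empty Buf, so every call
   allocates. On allocation failure the returned data is NULL. */
Buf make_greeting(const char* name) {
  Buf b = { NULL, 0, 0 };
  (void)fill_greeting(&b, name);
  return b;
}
```

A caller that loops over many names can keep one Buf and call fill_greeting repeatedly, paying for allocation only when a name is longer than anything seen so far; with make_greeting it pays on every iteration and has to free each result.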