Friday, September 2, 2011

X86 Calling Convention

Wikipedia defines a calling convention as a scheme for how functions receive parameters from their caller and how they return a result.

Basically, it is part of the compiler ABI and varies across platforms (such as Windows and Linux). It is interesting and useful for debugging (or at least for understanding how debuggers work...).

For example, a simple piece of code:
void f(int arg1, int arg2, int arg3, int arg4, float arg5, int arg6, float arg7,
        float arg8, int arg9, int arg10, int arg11, int arg12)
{
        printf("%d %d %d %d %f %d %f %f %d %d %d %d\n",
                arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8, arg9,
                arg10, arg11, arg12);
}

void main()
{
        f(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12);
}


On i386 Linux, compile the code above and disassemble the result:
[SID@test]$cc a.c -o a
a.c: In function ‘f’:
a.c:4: warning: incompatible implicit declaration of built-in function ‘printf’
[SID@test]$objdump -d a > a.s

In a.s, we can see that main calls f by passing every argument on the stack.
0804842e <main>:
 804842e:  55                      push   %ebp
 804842f:  89 e5                   mov    %esp,%ebp
 8048431:  83 e4 f0                and    $0xfffffff0,%esp
 8048434:  83 ec 30                sub    $0x30,%esp
 8048437:  c7 44 24 2c 0c 00 00    movl   $0xc,0x2c(%esp)
 804843e:  00
 804843f:  c7 44 24 28 0b 00 00    movl   $0xb,0x28(%esp)
 8048446:  00
 8048447:  c7 44 24 24 0a 00 00    movl   $0xa,0x24(%esp)
 804844e:  00
 804844f:  c7 44 24 20 09 00 00    movl   $0x9,0x20(%esp)
 8048456:  00
 8048457:  b8 00 00 00 41          mov    $0x41000000,%eax
 804845c:  89 44 24 1c             mov    %eax,0x1c(%esp)
 8048460:  b8 00 00 e0 40          mov    $0x40e00000,%eax
 8048465:  89 44 24 18             mov    %eax,0x18(%esp)
 8048469:  c7 44 24 14 06 00 00    movl   $0x6,0x14(%esp)
 8048470:  00
 8048471:  b8 00 00 a0 40          mov    $0x40a00000,%eax
 8048476:  89 44 24 10             mov    %eax,0x10(%esp)
 804847a:  c7 44 24 0c 04 00 00    movl   $0x4,0xc(%esp)
 8048481:  00
 8048482:  c7 44 24 08 03 00 00    movl   $0x3,0x8(%esp)
 8048489:  00
 804848a:  c7 44 24 04 02 00 00    movl   $0x2,0x4(%esp)
 8048491:  00
 8048492:  c7 04 24 01 00 00 00    movl   $0x1,(%esp)
 8048499:  e8 26 ff ff ff          call   80483c4 <f>
 804849e:  c9                      leave
 804849f:  c3                      ret

On x86_64 Linux, the same code compiles into the following, where parameters are passed to f in three ways: general-purpose registers (edi, esi, edx, ecx, r8d, r9d), xmm registers (xmm0~xmm2), and the stack.
000000000040054b <main>:
  40054b:  55                      push   %rbp
  40054c:  48 89 e5                mov    %rsp,%rbp
  40054f:  48 83 ec 20             sub    $0x20,%rsp
  400553:  c7 44 24 10 0c 00 00    movl   $0xc,0x10(%rsp)
  40055a:  00
  40055b:  c7 44 24 08 0b 00 00    movl   $0xb,0x8(%rsp)
  400562:  00
  400563:  c7 04 24 0a 00 00 00    movl   $0xa,(%rsp)
  40056a:  41 b9 09 00 00 00       mov    $0x9,%r9d
  400570:  f3 0f 10 15 60 01 00    movss  0x160(%rip),%xmm2        # 4006d8 <__dso_handle+0x30>
  400577:  00
  400578:  f3 0f 10 0d 5c 01 00    movss  0x15c(%rip),%xmm1        # 4006dc <__dso_handle+0x34>
  40057f:  00
  400580:  41 b8 06 00 00 00       mov    $0x6,%r8d
  400586:  f3 0f 10 05 52 01 00    movss  0x152(%rip),%xmm0        # 4006e0 <__dso_handle+0x38>
  40058d:  00
  40058e:  b9 04 00 00 00          mov    $0x4,%ecx
  400593:  ba 03 00 00 00          mov    $0x3,%edx
  400598:  be 02 00 00 00          mov    $0x2,%esi
  40059d:  bf 01 00 00 00          mov    $0x1,%edi
  4005a2:  e8 1d ff ff ff          callq  4004c4 <f>
  4005a7:  c9                      leaveq
  4005a8:  c3                      retq
  4005a9:  90                      nop
  4005aa:  90                      nop
  4005ab:  90                      nop
  4005ac:  90                      nop
  4005ad:  90                      nop
  4005ae:  90                      nop
  4005af:  90                      nop

So why the difference? This is part of the System V AMD64 ABI, which GCC and ICC (the Intel compiler) implement on Linux, BSD and Mac. It specifies that rdi, rsi, rdx, rcx, r8 and r9 are used to pass integer parameters and xmm0-7 are used to pass floating-point parameters.

This leads to another question: why not other registers? On x86_64 there are 16 general-purpose registers that can hold integers (rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8~r15) and 16 xmm registers that can hold floating-point values (xmm0~xmm15). The ABI divides them into volatile and non-volatile registers. Volatile registers are scratch registers that the caller must assume are destroyed across a call. Non-volatile registers are required to retain their values across a function call and must be saved by the callee if it uses them. So volatile registers are naturally suited to passing function arguments, while using non-volatile registers would add overhead (they must be saved and restored).

The calling convention is basically about which registers are volatile or non-volatile, which are reserved for special purposes (parameter passing, frame pointer, stack pointer, etc.), the order of arguments on the stack, who (caller or callee) is responsible for cleaning up the stack, and the stack layout and alignment.

| Architecture | Calling convention name | Operating system, compiler | Parameters in registers | Parameter order on stack | Stack cleanup by | Notes |
|---|---|---|---|---|---|---|
| 64-bit | Microsoft x64 calling convention | Windows (Microsoft compiler, Intel compiler) | rcx/xmm0, rdx/xmm1, r8/xmm2, r9/xmm3 | RTL (C) | caller | Stack aligned on 16 bytes. 32 bytes of shadow space on the stack. The specified 8 registers can only be used for parameters 1, 2, 3 and 4. |
| 64-bit | System V AMD64 ABI convention | Linux, BSD, Mac (GCC, Intel compiler) | rdi, rsi, rdx, rcx, r8, r9, xmm0-7 | RTL (C) | caller | Stack aligned on 16 bytes. Red zone below the stack. |

The table above applies to ordinary function calls, whether in user-space applications or in the kernel. As always, there is an exception, and here it is system calls. A system call traps from user-space context into kernel space and has special requirements for parameter passing:
1. User-level applications pass integer arguments in the register sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.

2. A system-call is done via the syscall instruction. The kernel destroys registers %rcx and %r11.

3. The number of the syscall has to be passed in register %rax.

4. System-calls are limited to six arguments; no argument is passed directly on the stack.

5. Returning from the syscall, register %rax contains the result of the system-call. A value in the range between -4095 and -1 indicates an error; it is -errno.

6. Only values of class INTEGER or class MEMORY are passed to the kernel.

Thursday, September 1, 2011

CLFS 2010

I wasn't able to write down the details about CLFS (China Linux Filesystem and Storage workshop) last year (too busy at that time...). But luckily, I put together these slides to record its major events. Here they are...

Virtual Machine File System

VMFS is an interesting paper. It tells us how VMware builds its high-performance SAN file system. Since I spent the past year writing a file system in the ESX kernel, I feel I am in a good position to understand VMFS's design and intentions. Here are the summary slides about it. I did my best to make sure no confidential information is leaked. All sentences and pictures are either from the paper or from the Internet.


Google Megastore

Reading Google's infrastructure papers is always a joy, and I had that pleasure again recently. Google's Megastore paper presents how Google builds a transactional service on top of Bigtable. Here are the slides I wrote about it...

RCU

RCU had been a mystery to me for quite a long time. I finally had some time to walk through it and understand what happens and why. Here are the slides I presented at August's linuxfb.net (@linuxfb) seminar...
