Depicting Function Inlining by GCC

Inline Function

In C, if a function has only a few lines in its body and the optimization level is high (preferably -O3), gcc may handle that function in ways you might not expect.

The compiler replaces the call to the function with the actual code of the function. This is called inlining.

The size limit below which inlining is performed depends entirely on gcc's heuristics, which work from an estimate of the function's size rather than a count of source lines.

That is not all. In the extreme case, if the small function only computes a value from its input, and that input is known at compile time, gcc will evaluate the call itself and place the resulting value directly in the program instead of the function call.

Sweet, isn’t it?  

Test Program



    #include <stdio.h>

    int sqr(int x)
    {
        int a;  /* unused local variable */
        return x * x;
    }

    int main(void)
    {
        printf("%d\n", sqr(10));
        return 0;
    }

Assembly Code

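To view the assembly code, ask gcc to stop after compilation with -S. The listings below are 32-bit x86; on a 64-bit system, adding -m32 should give comparable output.

    gcc -S -fomit-frame-pointer opt1.c

    less opt1.s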


The assembly code is:

    sqr:
            subl    $16, %esp
            movl    20(%esp), %eax
            imull   20(%esp), %eax
            addl    $16, %esp
            ret
          
    main:
            pushl   %ebp
            movl    %esp, %ebp
            andl    $-16, %esp
            subl    $16, %esp
            movl    $10, (%esp)
            call    sqr
            movl    %eax, 4(%esp)
            movl    $.LC0, (%esp)
            call    printf
            leave
            ret
          
With optimization enabled:

    gcc -S -O3 -fomit-frame-pointer opt1.c

    less opt1.s

The new code is:

    sqr:
            movl    4(%esp), %eax
            imull   %eax, %eax
            ret
    main:
            pushl   %ebp
            movl    %esp, %ebp
            andl    $-16, %esp
            subl    $16, %esp
            movl    $100, 4(%esp)
            movl    $.LC0, (%esp)
            call    printf
            leave
            ret

Here, the function sqr() does something very simple, and its argument is a compile-time constant: the value 10 can never change at runtime. The compiler therefore goes a step beyond inlining. It evaluates the square of 10 at compile time and places the result directly in the program, which is why main now contains `movl $100, 4(%esp)` and no longer calls sqr() at all. A standalone copy of sqr() is still emitted because the function has external linkage and could be called from another source file.