Life Cycle of a Linux Program

Introduction

This is an investigation of the life cycle of a program in a Linux system.

Actually, there are two (at least) meanings of “program life cycle”:

  1. The development life cycle (requirements, design, code, test, deploy…)
  2. The execution life cycle of a program when it is run.

We will discuss the latter.  How is a new program begun?  How is it ended?  (What happens in between is largely up to the program itself.)

The discussion that follows assumes some familiarity with C programming, since our sample program is written in C and we will be examining some C code.  It also assumes some familiarity with the Linux application programming interface and with programming and running programs in a Linux environment.

All source file references are relative to the root of the run-time library source tree and are specific to Fedora 17, the system from which I got this information.

This article describes what happens on an x86 (aka Intel IA-32) processor.  For other processor types, the machine instructions will be different but the concepts are the same.

Sample Program

Here is the program we will be investigating. It is the familiar “hello world” program:

#include <stdio.h>

int main()
{
    printf ("Hello, World!\n");
}

As you can see this program displays a simple message to standard output.

As you can also see, this program uses one C library call, printf. Typically, such functions are not linked into the executable file when the program is built. Instead they are linked (i.e. added to the program) at run time, once the program is started. The library code comes from a separate file, “libc.so”.  (The actual name on my system includes a version number:  libc-2.19.so.)  More complex programs may use many more such shared libraries, each coming from a separate “.so” file. Adding these libraries to your program is known as “late binding” or “run-time binding”. We’ll see how this is done momentarily.

Compiling the Program

We used the following gcc command to build the program:

$ gcc -o hello hello.c

This command creates the executable file, “hello”. This file contains the machine-language code for our program, which we can examine with the objdump command:

$ objdump -d hello

hello:     file format elf32-i386

...

Disassembly of section .text:

 08048350 <_start>:
 8048350:   31 ed                   xor    %ebp,%ebp
 8048352:   5e                      pop    %esi
 8048353:   89 e1                   mov    %esp,%ecx
 8048355:   83 e4 f0                and    $0xfffffff0,%esp
 8048358:   50                      push   %eax
 8048359:   54                      push   %esp
 804835a:   52                      push   %edx
 804835b:   68 e0 84 04 08          push   $0x80484e0
 8048360:   68 70 84 04 08          push   $0x8048470
 8048365:   51                      push   %ecx
 8048366:   56                      push   %esi
 8048367:   68 4d 84 04 08          push   $0x804844d
 804836c:   e8 cf ff ff ff          call   8048340 <__libc_start_main@plt>
 8048371:   f4                      hlt    

...

 0804844d <main>:
 804844d:   55                      push   %ebp
 804844e:   89 e5                   mov    %esp,%ebp
 8048450:   83 e4 f0                and    $0xfffffff0,%esp
 8048453:   83 ec 10                sub    $0x10,%esp
 8048456:   c7 04 24 00 85 04 08    movl   $0x8048500,(%esp)
 804845d:   e8 be fe ff ff          call   8048320 <puts@plt>
 8048462:   c9                      leave  
 8048463:   c3                      ret

(<main> is the main function seen in our source file, hello.c, shown at the beginning of this article. <_start> is the program startup code that we’ll see again shortly.) There is other code as well, which I have deleted from the above output for clarity.

Note that the compiler substituted “puts” for “printf” as an optimization.

In addition to the machine-language instructions for the program, the executable file includes information about the functions to be loaded at run time from the .so libraries:

$ objdump -T  hello

hello:     file format elf32-i386

DYNAMIC SYMBOL TABLE:
00000000      DF *UND*  00000000  GLIBC_2.0   puts
00000000  w   D  *UND*  00000000              __gmon_start__
00000000      DF *UND*  00000000  GLIBC_2.0   __libc_start_main
080484fc g    DO .rodata    00000004  Base        _IO_stdin_used

We see puts as well as __libc_start_main, which we will encounter again shortly, and some other symbols used internally by the C run-time library.

Running the Program — the shell

Normally the program would be started from a shell:

$ ./hello
Hello, World!

Every program needs its own process to run in.  Therefore the shell will fork a child process:

while ((pid = fork ()) < 0 && errno == EAGAIN && forksleep < FORKSLEEP_MAX)
{
     ...handle EAGAIN error
}

The shell, running in the child process,  will then call the execve system function to start the program:


execve (command, args, env);

Of course, not every program is started from a shell.  But whatever program starts ours will very probably call fork and will certainly call one of the exec family of functions.

execve() System Call — the Kernel

In the kernel, the execve system call will create a new memory space for the new program and map the program file into memory.

What do we mean by “map the program file into memory”?  In the early days of computers, before the use of virtual memory, programs were actually “loaded” into memory, meaning the entire program file was copied from some storage device, such as disk, tape, or cards, into memory.  For a large program this could take considerable time.

With the use of virtual memory it is only necessary for the kernel to construct data structures that specify where the various parts of the program should go in memory and where they should come from on disk.  With this mechanism, only the portions of the program that are needed are copied into memory and only once they are needed.  Some portions (such as error recovery routines) may never be needed and are thus never loaded into memory.

In order to map the .so library files mentioned above, the kernel first maps one particular .so file, often called “ld.so” and referred to as “the dynamic loader”, into the process’s memory.

(For more details about how the kernel handles the execve system call, see Understanding the Linux Kernel, 3rd Edition, by Daniel P. Bovet and Marco Cesati, Chapter 20.)

The kernel then begins running the new program, starting with code in ld.so.  This allows the loading of the additional .so files to be done from within ld.so in the user space instead of by the kernel.

How does the kernel actually transfer control to ld.so?  Normally when the kernel finishes a system call it goes through a return sequence which concludes with an iret (interrupt return) instruction or a sysexit instruction.  That instruction restores the process’s next instruction address to the IP (Instruction Pointer) register so that the next time the CPU fetches an instruction it is from the instruction following the system call.

In this case however, the program that executed the execve is no longer in the process’s memory:  it has been replaced by the new program.  So the kernel “diddles” with the stack where the return address was stored such that when the iret or sysexit instruction is executed, control “returns” to the first instruction of the new program which in this case is the instruction labeled _start within ld.so.

The Dynamic Loader — the Start of user-mode execution

ld.so begins with the following assembly language code (defined as the RTLD_START macro in sysdeps/i386/dl-machine.h):

#define RTLD_START asm ("\n\

_start:\n\
# Note that _dl_start gets the parameter in %eax.\n\
movl %esp, %eax\n\
call _dl_start\n\
_dl_start_user:\n\
# Save the user entry point address in %edi.\n\
movl %eax, %edi\n\
# Point %ebx at the GOT.\n\
call 0b\n\
addl $_GLOBAL_OFFSET_TABLE_, %ebx\n\
# See if we were run as a command with the executable file\n\
# name as an extra leading argument.\n\
movl _dl_skip_args@GOTOFF(%ebx), %eax\n\
# Pop the original argument count.\n\
popl %edx\n\
# Adjust the stack pointer to skip _dl_skip_args words.\n\
leal (%esp,%eax,4), %esp\n\
# Subtract _dl_skip_args from argc.\n\
subl %eax, %edx\n\
# Push argc back on the stack.\n\
push %edx\n\
# The special initializer gets called with the stack just\n\
# as the application's entry point will see it; it can\n\
# switch stacks if it moves these contents over.\n\
" RTLD_START_SPECIAL_INIT "\n\
# Load the parameters again.\n\
# (eax, edx, ecx, *--esp) = (_dl_loaded, argc, argv, envp)\n\
movl _rtld_local@GOTOFF(%ebx), %eax\n\
leal 8(%esp,%edx,4), %esi\n\
leal 4(%esp), %ecx\n\
movl %esp, %ebp\n\
# Make sure _dl_init is run with 16 byte aligned stack.\n\
andl $-16, %esp\n\
pushl %eax\n\
pushl %eax\n\
pushl %ebp\n\
pushl %esi\n\
# Clear %ebp, so that even constructors have terminated backchain.\n\
xorl %ebp, %ebp\n\
# Call the function to run the initializers.\n\
call _dl_init_internal@PLT\n\
# Pass our finalizer function to the user in %edx, as per ELF ABI.\n\
leal _dl_fini@GOTOFF(%ebx), %edx\n\
# Restore %esp _start expects.\n\
movl (%esp), %esp\n\
# Jump to the user's entry point.\n\
jmp *%edi\n\

This code begins by calling _dl_start which is written in C and is in debug/glibc-2.15-a316c1f/elf/rtld.c.

We can use strace (which traces system calls) to follow this startup code.  Here is the output from that program:

$ strace ./hello
...
brk(0)                                  = 0x85b0000

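The call to brk(0) is a “trick” to determine the current end of the program’s heap (the “program break”).  Since 0 is never a valid new break, the kernel changes nothing and simply reports the current value.  (The heap is the memory area used for dynamic memory by the program.)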

access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)

This call to access looks for a file called “/etc/ld.so.nohwcap” but the call returns the error, ENOENT, meaning the file does not exist.

mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7704000

This call to mmap2 requests 8K of additional memory from the kernel which maps it into address 0xb7704000.

Note: There is a Linux security feature, called “address space layout randomization”, which is designed to deter certain forms of hacking by making it difficult to predict where code will reside. The result is that memory is allocated in different locations each time the program is run. For example, the above memory area was allocated by the kernel at 0xb7704000. However, on a previous run of the same program on the same system, this memory had been allocated at 0xb77b7000.

access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)

This call looks for another file, “/etc/ld.so.preload”, which also does not exist.

open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=97882, ...}) = 0
mmap2(NULL, 97882, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb76ec000
close(3)                                = 0

The above four system calls result in mapping the file /etc/ld.so.cache into memory at 0xb76ec000.

  • open() opens the file.
  • fstat64() returns, among other things, the size of the file (which will be used by the mmap2 call).
  • mmap2() maps the file into memory.
  • and close() closes the file.

ld.so.cache is a file that contains information about the location of system libraries within the file system.  This file, which was created by the ldconfig utility program, is used to speed up the locating of standard shared libraries.

open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\233\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1754876, ...}) = 0
mmap2(NULL, 1759868, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb753e000
mmap2(0xb76e6000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a8000) = 0xb76e6000
mmap2(0xb76e9000, 10876, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb76e9000
close(3)                                = 0

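The above steps are where the C run-time library, libc.so, is actually mapped into memory. There are three calls to mmap2. The first maps the file, including its executable instructions, as readable and executable. The second remaps the library’s data area (constants that need run-time fix-ups plus initialized global variables) from file offset 0x1a8000 as readable and writable. The third is an anonymous mapping, not backed by the file, that provides zero-filled memory for uninitialized global variables.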

mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb753d000

This system call allocates one more page of memory, immediately below the area where libc.so was just mapped (0xb753e000). This area will be used for various housekeeping information about the program.

set_thread_area({entry_number:-1 -> 6, base_addr:0xb753d940, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0

This call to set_thread_area tells the kernel to set up a TLS (Thread Local Storage) data area. Note that the address is within the memory area most recently allocated.

mprotect(0xb76e6000, 8192, PROT_READ)   = 0
mprotect(0x8049000, 4096, PROT_READ)    = 0
mprotect(0xb7727000, 4096, PROT_READ)   = 0

These mprotect calls change the memory protection to read-only for libc.so’s constants, the program’s constants, and ld.so’s own constants, respectively.

munmap(0xb76ec000, 97882)               = 0

The memory area previously mapped from ld.so.cache is now removed from memory by munmap.

At this point the shared libraries are mapped into memory.  (In our case, there is only one, libc.so.)

Control returns from _dl_start to _start, which falls through to _dl_start_user, which continues with this code previously shown from the RTLD_START macro:

# Pass our finalizer function to the user in %edx, as per ELF ABI.\n\
leal _dl_fini@GOTOFF(%ebx), %edx\n\
# Restore %esp _start expects.\n\
movl (%esp), %esp\n\
# Jump to the user's entry point.\n\
jmp *%edi\n\

The jmp (jump) instruction at the end of the above code transfers to the entry point of our program.  The address of our program’s entry point was returned by _dl_start in the %eax register.  It was previously moved to the %edi register by this code:

# Save the user entry point address in %edi.\n\
movl %eax, %edi\n\

With the jmp instruction, our program now begins at a routine called _start.  (Note this is not the same as the _start function in ld.so; each instance of _start is defined locally within its own module.)

Beginning our Program Code — the C Run-time Library

Upon entry to the program (at _start) the following information has been provided to the program:

  • The command-line arguments and environment variables are loaded into the top end of the stack memory area.
  • The stack pointer is set just below the above data.
  • argc and argv are then pushed onto the stack.   These are the count and address of the command line arguments, respectively.

(The above three steps were done by the kernel as part of the execve processing.)

Our program begins with:


0x8048ba8 <_start>      xor    %ebp,%ebp

The xor instruction shown above (the first instruction of the program) sets the %ebp register to zero. This register is used to keep track of stack frames used by C functions, and setting this value to zero means this is the end of the set of stack frames.

0x8048baa <_start+2>    pop    %esi 
0x8048bab <_start+3>    mov    %esp,%ecx

The above instructions get argc and argv from the stack to the %esi and %ecx registers respectively.

0x8048bad <_start+5>    and    $0xfffffff0,%esp

This makes sure the stack pointer is aligned on a 16-byte boundary, i.e. on an address divisible by 16.

0x8048bb0 <_start+8>    push   %eax 
0x8048bb1 <_start+9>    push   %esp <stack end>
0x8048bb2 <_start+10>   push   %edx  <_dl_fini> [from ld.so]
0x8048bb3 <_start+11>   push   $0x8049340 <__libc_csu_fini>
0x8048bb8 <_start+16>   push   $0x80492a0 <__libc_csu_init>
0x8048bbd <_start+21>   push   %ecx  <argv> [saved above]
0x8048bbe <_start+22>   push   %esi  <argc> [saved above]
0x8048bbf <_start+23>   push   $0x8048ce0 <main>
0x8048bc4 <_start+28>   call   0x8048d00 <__libc_start_main>

The above instructions push the arguments for the subsequent function call onto the stack, and then call __libc_start_main, the first C-language code in the program.

0x8048bc9 <_start+33>   hlt

This instruction would be executed if __libc_start_main returned to its caller, but that should never happen. If it did, hlt (halt) is a privileged instruction and will cause the program to fail.

The following code is from debug/glibc-2.15-a316c1f/csu/libc-start.c.
It shows the entry into __libc_start_main.

/* Note: the fini parameter is ignored here for shared library.  It
   is registered with __cxa_atexit.  This had the disadvantage that
   finalizers were called in more than one place.  */
STATIC int
LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
         int argc, char *__unbounded *__unbounded ubp_av,
#ifdef LIBC_START_MAIN_AUXVEC_ARG
         ElfW(auxv_t) *__unbounded auxvec,
#endif
         __typeof (main) init,
         void (*fini) (void),
         void (*rtld_fini) (void), void *__unbounded stack_end)

__libc_start_main (main=0x8048430 <main>, argc=1, ubp_av=0xbfffefa4, 
    init=0x8048450 <__libc_csu_init>, fini=0x80484c0 <__libc_csu_fini>, 
    rtld_fini=0x42bfaa90 <_dl_fini>, stack_end=0xbfffef9c) at libc-start.c:96

At this point a number of functions are called (not shown) that initialize the C run-time environment.

Beginning the main() Function

Next we see the following code:

 /* Nothing fancy, just call the function. */
 result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
#endif

 exit (result);

The above call to main is where the user’s code is started. (All C programs begin with a function named “main”.)

At this point our program’s code, beginning at main(), starts to execute at last.  Our program does whatever it was programmed to do (in our case print a message as we saw at the top of this article).

Program exit — Back to the C Run-time Library

The C program can terminate either by calling “exit” or by returning from the main function. Our example program (shown at the top of this article) does the latter. In that case the following code is executed from the run-time library after the return from main:

 exit (result);

So either our program calls exit or the run-time library calls it for us after main returns.

Many people think exit is a system call, but it is actually a C library function in debug/glibc-2.15-a316c1f/stdlib/exit.c:

exit (int status)
{
  __run_exit_handlers (status, &__exit_funcs, true);
}

As you can see, exit is very simple; it just calls __run_exit_handlers which looks like this:

/* Call all functions registered with `atexit' and `on_exit',
   in the reverse of the order in which they were registered
   perform stdio cleanup, and terminate program execution with STATUS.  */
void
attribute_hidden
__run_exit_handlers (int status, struct exit_function_list **listp,
             bool run_list_atexit)
{
  /* We do it this way to handle recursive calls to exit () made by
     the functions registered with `atexit' and `on_exit'. We call
     everyone on the list and use the status value in the last
     exit (). */
...
  _exit (status);
}

(The omitted code simply loops through calling a list of functions to handle process cleanup before it exits.)

The _exit() System Call — Back to the Kernel

As you can see above, the last thing this function does is call _exit, which really is a system call, defined in debug/glibc-2.15-a316c1f/sysdeps/unix/sysv/linux/i386/_exit.S:

    .text
    .type   _exit,@function
    .global _exit
_exit:
    movl    4(%esp), %ebx

    /* Try the new syscall first.  */
#ifdef __NR_exit_group
    movl    $__NR_exit_group, %eax
    ENTER_KERNEL
#endif

    /* Not available.  Now the old one.  */
    movl    $__NR_exit, %eax
    /* Don't bother using ENTER_KERNEL here.  If the exit_group
       syscall is not available AT_SYSINFO isn't either.  */
    int     $0x80

    /* This must not fail.  Be sure we don't return.  */
    hlt
    .size   _exit,.-_exit

This code places the intended system call number (__NR_exit_group) into the %eax register to pass it to the kernel.  It then invokes ENTER_KERNEL, which is an assembler macro that expands to this machine-language code:

0x42cc57ad <_exit+9>            call   *%gs:0x10

which calls this machine-language code, the typical sequence for calling a system service in the kernel:

0xb7fff414 <__kernel_vsyscall>          push   %ecx 
0xb7fff415 <__kernel_vsyscall+1>        push   %edx
0xb7fff416 <__kernel_vsyscall+2>        push   %ebp
0xb7fff417 <__kernel_vsyscall+3>        mov    %esp,%ebp
0xb7fff419 <__kernel_vsyscall+5>        sysenter

The sysenter instruction will cause a transition to kernel mode.

Usually the kernel returns to the user’s program when it is finished, but in the case of _exit the kernel does not return. Instead, it deletes the memory space occupied by our program. The process it was running in will be marked “defunct”.

waitpid() — Back to the Shell

The process will be completely deleted once its parent (usually the shell) gathers the defunct process’s completion status:

  pid = waitpid (-1, &status, waitpid_flags);

Once waitpid is called by the process’s parent, both the child process and the program it was running are gone from memory.

Thus the program’s execution life cycle is ended.

Other Ways a Program May Terminate

In addition to calling exit to terminate the process, as was described above, there are alternative ways in which a program may terminate:

  • The process can be abnormally terminated due to an error or an action by another program or by the user.  (In that case the C library cleanup code run by the exit() call will not be performed.)
  • The program can issue another execve call which will start a new program in place of the current program.

Conclusion

Here is a summary of the steps described in this article.

  1. Frequently, but not always, some process (often a shell) forks a new process for the new program.
  2. In the process that is to run the new program, the existing program calls execve.
  3. The kernel releases the old program’s address space and begins building a new address space.
  4. The kernel maps the program into the new address space.
  5. If the program uses dynamic libraries,  then the kernel maps ld.so into the new address space.
  6. If the program uses dynamic libraries, the kernel gives control to ld.so within the process context of the new program.  ld.so then causes any shared libraries to be mapped into memory.
  7. ld.so then transfers control to the new program for the first time at the label, _start.
  8. _start saves some input parameters from the system and then calls __libc_start_main, which initializes the C run-time library.
  9. __libc_start_main calls main, the beginning of the application program code.
  10. The program runs until:
    1. it terminates by calling exit or returning from the main function.  (Continue to step 11.)
    2. it calls execve to begin a new program.  (Go back to step 2.)
    3. the process is abnormally terminated.  (Skip to step 13.)
  11. The C run-time library cleans up.
  12. The run-time library calls _exit, the system call that terminates the process.
  13. The kernel releases the memory and other resources of the just-terminated process.
  14. The parent process issues a wait call for the process that just terminated.
  15. The kernel releases the terminated process’s task structure, the last remnant of the program.