Introduction
This is an investigation of the life cycle of a program in a Linux system.
Actually, there are (at least) two meanings of “program life cycle”:
- The development life cycle (requirements, design, code, test, deploy…)
- The execution life cycle of a program when it is run.
We will discuss the latter. How is a new program begun? How is it ended? (What happens in between is largely up to the program itself.)
The discussion that follows assumes you have some familiarity with C programming, since our sample program was written in C and we will be examining some C code. It also assumes you have some familiarity with the Linux application programming interface and with programming and running programs in a Linux environment.
All source file references are relative to the root of the run-time library source tree and are specific to Fedora 17, the system from which I got this information.
This article describes what happens on an x86 (aka Intel IA-32) processor. For other processor types, the machine instructions will be different but the concepts are the same.
Sample Program
Here is the program we will be investigating. It is the familiar “hello world” program:
#include <stdio.h>
int main()
{
printf ("Hello, World!\n");
}
As you can see this program displays a simple message to standard output.
As you can also see, this program uses one C library call, printf. Typically, such functions are not linked into the executable file during the compilation process. Instead they are linked (i.e. added to the program) at run time, once the program is started. The library code comes from a separate file, “libc.so”. (The actual name on my system includes a version number: libc-2.19.so.) More complex programs may have many more such shared libraries, each coming from a separate “.so” file. Adding these libraries to your program at run time (as opposed to compile time) is known as “late binding” or “run-time binding”. We’ll see how this is done momentarily.
Compiling the Program
We used the following gcc command to build the program:
$ gcc -o hello hello.c
This command creates the executable file, “hello”. This file contains the machine-language code for our program which we can examine with the objdump command:
$ objdump -d hello
hello: file format elf32-i386
...
Disassembly of section .text:
08048350 <_start>:
8048350: 31 ed xor %ebp,%ebp
8048352: 5e pop %esi
8048353: 89 e1 mov %esp,%ecx
8048355: 83 e4 f0 and $0xfffffff0,%esp
8048358: 50 push %eax
8048359: 54 push %esp
804835a: 52 push %edx
804835b: 68 e0 84 04 08 push $0x80484e0
8048360: 68 70 84 04 08 push $0x8048470
8048365: 51 push %ecx
8048366: 56 push %esi
8048367: 68 4d 84 04 08 push $0x804844d
804836c: e8 cf ff ff ff call 8048340 <__libc_start_main@plt>
8048371: f4 hlt
...
0804844d <main>:
804844d: 55 push %ebp
804844e: 89 e5 mov %esp,%ebp
8048450: 83 e4 f0 and $0xfffffff0,%esp
8048453: 83 ec 10 sub $0x10,%esp
8048456: c7 04 24 00 85 04 08 movl $0x8048500,(%esp)
804845d: e8 be fe ff ff call 8048320 <puts@plt>
8048462: c9 leave
8048463: c3 ret
(<main> is the main function seen in our source file, hello.c, shown at the beginning of this article. <_start> is the program startup code that we’ll see again shortly.) There is other code as well, which I have deleted from the above output for clarity.
Note that the compiler substituted “puts” for “printf” as an optimization.
In addition to the machine-language instructions for the program, the executable file includes information about the functions to be loaded at run time from the .so libraries:
$ objdump -T hello
hello: file format elf32-i386
DYNAMIC SYMBOL TABLE:
00000000 DF *UND* 00000000 GLIBC_2.0 puts
00000000 w D *UND* 00000000 __gmon_start__
00000000 DF *UND* 00000000 GLIBC_2.0 __libc_start_main
080484fc g DO .rodata 00000004 Base _IO_stdin_used
We see puts as well as __libc_start_main, which we will encounter again shortly, and some other functions used internally by the C run-time library.
Running the Program — the shell
Normally the program would be started from a shell:
$ ./hello
Hello, World!
Every program needs its own process to run in. As I like to say, “Every program needs a process and every process needs a program.” Therefore the shell needs to create a new process in which to run hello, so it will fork a child process:
while ((pid = fork ()) < 0 && errno == EAGAIN && forksleep < FORKSLEEP_MAX)
{
...handle EAGAIN error
}
The shell, running in the child process, will then call the execve system function to start the program:
execve (command, args, env);
Of course, not every program is started from a shell. Whatever program does start ours, though, it will very probably call fork and will certainly call one of the exec family of system calls.
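The fork-then-exec pattern described above can be sketched in a few lines of C. This is a minimal illustration, not code from any shell; the helper name run_program is hypothetical, and the parent simply waits for the child rather than doing anything else in the meantime.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;   /* the caller's environment, passed on unchanged */

/* Hypothetical helper (not taken from any shell's source): run the program
   at `path` in a child process and return its exit status, or -1 on error. */
int run_program(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid = fork();              /* create the child process */
    if (pid < 0)
        return -1;                   /* fork failed; no child exists */

    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execve(path, argv, environ);
        _exit(127);                  /* reached only if execve failed */
    }

    /* Parent: wait for the child to terminate. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_program("/bin/sh", args) with args set to { "sh", "-c", "exit 7", NULL } should return 7, the status the child shell exited with.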
execve() System Call — the Kernel
In the kernel, the execve system call will first check the execve parameters for errors. If none are found it will then delete the address space of the calling process. (In our example, this address space contains the shell. Note that the process’s parent also contains the shell, which will continue to run independently.)
The kernel will then create a new memory space for the new program and map the program file into memory.
What do we mean by “map the program file into memory”? In the early days of computers, before the use of virtual memory, programs were actually “loaded” into memory, meaning the entire program file was copied from some storage device, such as disk, tape, or cards, into memory. For a large program this could take considerable time.
With the use of virtual memory it is only necessary for the kernel to construct data structures that specify where the various parts of the program should go in memory and where they should come from on disk. With this mechanism, only the portions of the program that are needed are copied into memory and only once they are needed. Some portions (such as error recovery routines) may never be needed and are thus never loaded into memory.
In order to map the .so library files, mentioned above, into memory, the kernel maps one .so file, referred to as “the dynamic loader”, into the process’s memory. (In almost all cases the file being started (./hello in our example) is in a format called ELF (Executable and Linking Format). In the case of ELF files, the identity of the dynamic loader .so file is specified within the executable file’s header.
For more details about how the kernel handles the execve system call, see Understanding the Linux Kernel, 3rd Edition, by Daniel P. Bovet and Marco Cesati, Chapter 20.)
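To make the remark about the ELF header concrete: the loader's path is recorded in a program header entry of type PT_INTERP. The following is a rough sketch that reads it back out of a file. The function name elf_interpreter is hypothetical, and the code assumes the file has the same word size and byte order as the program reading it.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <link.h>      /* ElfW(): selects Elf32_* or Elf64_* for this build */
#include <string.h>
#include <unistd.h>

/* Copy the dynamic loader path recorded in the ELF file `path` into `buf`.
   Returns 1 if found, 0 if the file names no loader (e.g. it is statically
   linked), -1 on error or if the file's word size differs from this build. */
int elf_interpreter(const char *path, char *buf, size_t buflen)
{
    assert(path != NULL && buf != NULL && buflen > 0);
    int result = -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ElfW(Ehdr) eh;
    if (pread(fd, &eh, sizeof eh, 0) != (ssize_t)sizeof eh ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32))
        goto out;

    result = 0;
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        ElfW(Phdr) ph;
        off_t where = (off_t)(eh.e_phoff + (ElfW(Off))i * eh.e_phentsize);
        if (pread(fd, &ph, sizeof ph, where) != (ssize_t)sizeof ph) {
            result = -1;
            break;
        }
        if (ph.p_type != PT_INTERP)
            continue;
        /* p_filesz includes the terminating NUL of the path string. */
        if (ph.p_filesz == 0 || ph.p_filesz > buflen ||
            pread(fd, buf, ph.p_filesz, (off_t)ph.p_offset)
                != (ssize_t)ph.p_filesz) {
            result = -1;
            break;
        }
        buf[ph.p_filesz - 1] = '\0';
        result = 1;
        break;
    }
out:
    close(fd);
    return result;
}
```

Run against a dynamically linked executable, this should yield a path to the system's dynamic loader; the exact path varies by distribution and architecture.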
The kernel then begins running the new program, starting with code in the dynamic loader .so file. This allows the loading of the additional .so files to be done from within the dynamic loader, running in the user space instead of by the kernel.
How does the kernel actually transfer control to the dynamic loader? Normally when the kernel finishes a system call it goes through a return sequence which concludes with an iret (interrupt return) instruction or a sysexit instruction. That instruction restores the process’s next instruction address to the IP (Instruction Pointer) register so that the next time the CPU fetches an instruction it is from the instruction following the system call.
In this case however, the program that executed the execve is no longer in the process’s memory: it has been replaced by the new program. So the kernel “diddles” with the stack where the return address was stored such that when the iret or sysexit instruction is executed, control “returns” to the first instruction of the new program which in this case is the instruction labeled _start within the dynamic loader.
The Dynamic Loader — the Start of User-mode Execution
The dynamic loader begins with the following assembly language code (defined as the RTLD_START macro in sysdeps/i386/dl-machine.h):
#define RTLD_START asm ("\n\
…
_start:\n\
# Note that _dl_start gets the parameter in %eax.\n\
movl %esp, %eax\n\
call _dl_start\n\
_dl_start_user:\n\
# Save the user entry point address in %edi.\n\
movl %eax, %edi\n\
# Point %ebx at the GOT.\n\
call 0b\n\
addl $_GLOBAL_OFFSET_TABLE_, %ebx\n\
# See if we were run as a command with the executable file\n\
# name as an extra leading argument.\n\
movl _dl_skip_args@GOTOFF(%ebx), %eax\n\
# Pop the original argument count.\n\
popl %edx\n\
# Adjust the stack pointer to skip _dl_skip_args words.\n\
leal (%esp,%eax,4), %esp\n\
# Subtract _dl_skip_args from argc.\n\
subl %eax, %edx\n\
# Push argc back on the stack.\n\
push %edx\n\
# The special initializer gets called with the stack just\n\
# as the application's entry point will see it; it can\n\
# switch stacks if it moves these contents over.\n\
" RTLD_START_SPECIAL_INIT "\n\
# Load the parameters again.\n\
# (eax, edx, ecx, *--esp) = (_dl_loaded, argc, argv, envp)\n\
movl _rtld_local@GOTOFF(%ebx), %eax\n\
leal 8(%esp,%edx,4), %esi\n\
leal 4(%esp), %ecx\n\
movl %esp, %ebp\n\
# Make sure _dl_init is run with 16 byte aligned stack.\n\
andl $-16, %esp\n\
pushl %eax\n\
pushl %eax\n\
pushl %ebp\n\
pushl %esi\n\
# Clear %ebp, so that even constructors have terminated backchain.\n\
xorl %ebp, %ebp\n\
# Call the function to run the initializers.\n\
call _dl_init_internal@PLT\n\
# Pass our finalizer function to the user in %edx, as per ELF ABI.\n\
leal _dl_fini@GOTOFF(%ebx), %edx\n\
# Restore %esp _start expects.\n\
movl (%esp), %esp\n\
# Jump to the user's entry point.\n\
jmp *%edi\n\
This code begins by calling _dl_start which is written in C and is in debug/glibc-2.15-a316c1f/elf/rtld.c.
We can use strace (which traces system calls) to follow this startup code. Here is the output from that program:
$ strace ./hello ...
brk(0) = 0x85b0000
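The call to brk(0) is a “trick” to determine the current end of the program’s heap (the “program break”). Since 0 is never a valid new value for the break, the kernel changes nothing and simply returns the current value. (The heap is the memory area used for dynamic memory by the program.)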
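From C, the same query is normally made through the library function sbrk rather than the raw system call (the glibc brk wrapper returns only 0 or -1, not the break itself). A small sketch, with current_break as a hypothetical helper name:

```c
#define _DEFAULT_SOURCE   /* expose sbrk() under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Return the current program break (the end of the heap) without moving it. */
uintptr_t current_break(void)
{
    void *brk_now = sbrk(0);         /* an increment of 0 only queries */
    assert(brk_now != (void *)-1);
    return (uintptr_t)brk_now;
}
```

Because of address space layout randomization, the value returned will usually differ from one run to the next.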
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
This call to access looks for a file called “/etc/ld.so.nohwcap” but the call returns the error, ENOENT, meaning the file does not exist.
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7704000
This call to mmap2 requests 8K of additional memory from the kernel which maps it into address 0xb7704000.
Note: There is a Linux security feature, called “address space layout randomization”, which is designed to deter certain forms of hacking by making it difficult to predict where code will reside. The result is that memory is allocated in different locations each time the program is run. For example, the above memory area was allocated by the kernel at 0xb7704000. However, on a previous run of the same program on the same system, this memory had been allocated at 0xb77b7000.
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
This call looks for another file, “/etc/ld.so.preload”, which also does not exist.
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=97882, ...}) = 0
mmap2(NULL, 97882, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb76ec000
close(3) = 0
The above four system calls result in mapping the file /etc/ld.so.cache into memory at 0xb76ec000.
- open() opens the file.
- fstat64() returns, among other things, the size of the file (which will be used by the mmap2 call).
- mmap2() maps the file into memory.
- and close() closes the file.
ld.so.cache is a file that contains information about the location of system libraries within the file system. This file, which was created by the ldconfig utility program, is used to speed up the locating of standard shared libraries.
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\233\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1754876, ...}) = 0
mmap2(NULL, 1759868, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb753e000
mmap2(0xb76e6000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a8000) = 0xb76e6000
mmap2(0xb76e9000, 10876, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb76e9000
close(3) = 0
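The above steps are where the C run-time library, libc.so, is actually mapped into memory. There are three calls to mmap2: the first maps the file with read and execute permission (the executable instructions and constant data), the second maps the library’s writable data from file offset 0x1a8000, and the third creates an anonymous mapping (one not backed by the file) for global variables that start out as zero.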
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb753d000
This system call adds an additional memory area immediately below the area where libc.so was just mapped (0xb753d000 is one 4096-byte page below 0xb753e000). This area will be used for various housekeeping information about the program.
set_thread_area({entry_number:-1 -> 6, base_addr:0xb753d940, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
This call to set_thread_area tells the kernel to set up a TLS (Thread Local Storage) data area. Note that the address is within the memory area most recently allocated.
mprotect(0xb76e6000, 8192, PROT_READ) = 0
mprotect(0x8049000, 4096, PROT_READ) = 0
mprotect(0xb7727000, 4096, PROT_READ) = 0
These mprotect calls change the memory protection to read-only for libc.so’s constants, the program’s constants, and ld.so’s own constants, respectively.
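The pattern behind these calls (fill in some memory while it is writable, then lock it down) can be reproduced in user code. The following sketch uses one anonymous page rather than part of a mapped library; the function name make_readonly_page is hypothetical.

```c
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Allocate one anonymous page, copy `text` into it, then make it read-only.
   Returns the page, or NULL on failure. */
const char *make_readonly_page(const char *text)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(text != NULL && page > 0 && strlen(text) < (size_t)page);

    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    strcpy(p, text);                 /* still writable at this point */

    if (mprotect(p, (size_t)page, PROT_READ) != 0) {
        munmap(p, (size_t)page);
        return NULL;
    }
    return p;                        /* a write to *p would now raise SIGSEGV */
}
```

After the mprotect call the contents can still be read, but any attempt to modify them terminates the program with a segmentation fault, which is the point of protecting constants this way.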
munmap(0xb76ec000, 97882) = 0
The memory area previously mapped from ld.so.cache is now removed from memory by munmap.
At this point the shared libraries are mapped into memory. (In our case, there is only one, libc.so.)
Control returns from _dl_start to _start, which falls through to _dl_start_user. After running the initializers (the call to _dl_init_internal), it continues with this code, previously shown from the RTLD_START macro:
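A running program can ask the dynamic loader which objects ended up mapped into its address space. The sketch below uses dl_iterate_phdr, a GNU extension provided by glibc; the names count_object and list_loaded_objects are hypothetical.

```c
#define _GNU_SOURCE       /* dl_iterate_phdr() is a GNU extension */
#include <assert.h>
#include <link.h>
#include <stdio.h>

/* Callback: called once per loaded object (main program, libc, ld.so, ...). */
static int count_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size;
    assert(info != NULL && data != NULL);
    int *count = data;
    /* The main program is reported with an empty name. */
    const char *name = (info->dlpi_name && info->dlpi_name[0])
                           ? info->dlpi_name : "(main program)";
    printf("%p  %s\n", (void *)info->dlpi_addr, name);
    (*count)++;
    return 0;                        /* 0 means "keep iterating" */
}

/* Print every object mapped into this process; return how many there are. */
int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(count_object, &count);
    return count;
}
```

For a dynamically linked hello program one would expect to see entries for the program itself, libc.so and the dynamic loader, at addresses that change from run to run.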
# Pass our finalizer function to the user in %edx, as per ELF ABI.\n\
leal _dl_fini@GOTOFF(%ebx), %edx\n\
# Restore %esp _start expects.\n\
movl (%esp), %esp\n\
# Jump to the user's entry point.\n\
jmp *%edi\n\
The jmp (jump) instruction at the end of the above code transfers to the entry point of our program. The entry point’s address was returned by _dl_start in the %eax register and was then moved to the %edi register by this code:
# Save the user entry point address in %edi.\n\
movl %eax, %edi\n\
With the jmp instruction, our program now begins at a routine called _start. (Note this is not the same as the _start function in ld.so; each instance of _start is defined locally within its own module.)
Beginning our Program Code — the C Run-time Library
Upon entry to the program (at _start) the following information has been provided to the program:
- The command-line arguments and environment variables are loaded into the top end of the stack memory area.
- The stack pointer is set just below the above data.
- argc and argv are then pushed onto the stack. These are the count and address of the command line arguments, respectively.
(The above three steps were done by the kernel as part of the execve processing.)
Our program begins with:
0x8048ba8 <_start> xor %ebp,%ebp
The xor instruction shown above (the first instruction of the program) sets the %ebp register to zero. This register is used to keep track of stack frames used by C functions, and setting this value to zero means this is the end of the set of stack frames.
0x8048baa <_start+2> pop %esi
0x8048bab <_start+3> mov %esp,%ecx
The above instructions get argc and argv from the stack to the %esi and %ecx registers respectively.
0x8048bad <_start+5> and $0xfffffff0,%esp
This makes sure the stack pointer is on a 16-byte boundary, i.e. on an address divisible by 16.
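The effect of the and instruction is easy to reproduce in C: clearing the low four bits of an address rounds it down to a multiple of 16. The helper name align_down_16 and the sample address are chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round a 32-bit address down to a multiple of 16, as
   `and $0xfffffff0,%esp` does. Clearing the low four bits can only lower
   the address, which is the safe direction for a stack that grows downward. */
uint32_t align_down_16(uint32_t addr)
{
    uint32_t aligned = addr & 0xfffffff0u;
    assert(aligned % 16 == 0 && aligned <= addr);
    return aligned;
}
```

For example, align_down_16(0xbffff7dc) gives 0xbffff7d0, while an address that is already aligned is returned unchanged.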
0x8048bb0 <_start+8> push %eax
0x8048bb1 <_start+9> push %esp <stack end>
0x8048bb2 <_start+10> push %edx <_dl_fini>
0x8048bb3 <_start+11> push $0x8049340 <__libc_csu_fini>
0x8048bb8 <_start+16> push $0x80492a0 <__libc_csu_init>
0x8048bbd <_start+21> push %ecx <argv> [saved above]
0x8048bbe <_start+22> push %esi <argc> [saved above]
0x8048bbf <_start+23> push $0x8048ce0 <main>
0x8048bc4 <_start+28> call 0x8048d00 <__libc_start_main>
The above instructions push the arguments for the subsequent function call onto the stack, and then call __libc_start_main, the first C-language code in the program.
0x8048bc9 <_start+33> hlt
This instruction would be executed if __libc_start_main returned to its caller, but that should never happen. If it did, hlt (halt) is a privileged instruction and will cause the program to fail.
The following code is from debug/glibc-2.15-a316c1f/csu/libc-start.c.
It shows the entry into __libc_start_main.
/* Note: the fini parameter is ignored here for shared library. It
is registered with __cxa_atexit. This had the disadvantage that
finalizers were called in more than one place. */
STATIC int
LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
int argc, char *__unbounded *__unbounded ubp_av,
#ifdef LIBC_START_MAIN_AUXVEC_ARG
ElfW(auxv_t) *__unbounded auxvec,
#endif
__typeof (main) init,
void (*fini) (void),
void (*rtld_fini) (void), void *__unbounded stack_end)
At this point a number of functions are called (not shown) that initialize the C run-time environment.
Beginning the main() Function
Next we see the following code:
/* Nothing fancy, just call the function. */
result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
#endif
exit (result);
The above call to main is where the user’s code is started. (All C programs begin with a function named “main”.)
At this point our program’s code, beginning at main() starts to execute at last. Our program does whatever it was programmed to do (in our case print a message as we saw at the top of this article).
Program exit — Back to the C Run-time Library
The C program can terminate either by calling “exit” or by returning from the main function. Our example program (shown at the top of this article) does the latter. In that case the following code is executed from the run-time library after the return from main:
exit (result);
So either our program calls exit or the run-time library calls it for us after main returns.
Many people think exit is a system call, but it is actually a C library function in debug/glibc-2.15-a316c1f/stdlib/exit.c:
exit (int status)
{
__run_exit_handlers (status, &__exit_funcs, true);
}
As you can see, exit is very simple; it just calls __run_exit_handlers which looks like this:
/* Call all functions registered with `atexit' and `on_exit',
in the reverse of the order in which they were registered
perform stdio cleanup, and terminate program execution with STATUS. */
void
attribute_hidden
__run_exit_handlers (int status, struct exit_function_list **listp,
bool run_list_atexit)
{
/* We do it this way to handle recursive calls to exit () made by
the functions registered with `atexit' and `on_exit'. We call
everyone on the list and use the status value in the last
exit (). */
...
_exit (status);
}
(The omitted code simply loops through calling a list of functions to handle process cleanup before it exits.)
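The reverse-order rule mentioned in the comment above can be observed from an ordinary program. In this sketch a child process registers two handlers with atexit and then calls exit; each handler reports through a pipe so the parent can see the order in which they ran. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;           /* write end of the pipe, set in child */

static void report(char c) { if (write(report_fd, &c, 1) != 1) _exit(1); }
static void first_registered(void)  { report('1'); }
static void second_registered(void) { report('2'); }

/* Fork a child that registers two handlers and calls exit(); store the
   order in which they ran in `out` (room for 3 bytes). Returns 0 or -1. */
int record_exit_order(char *out)
{
    assert(out != NULL);
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(first_registered);
        atexit(second_registered);
        exit(0);                     /* runs the handlers, then calls _exit */
    }

    close(fds[1]);
    size_t got = 0;
    ssize_t n;
    while (got < 2 && (n = read(fds[0], out + got, 2 - got)) > 0)
        got += (size_t)n;
    out[got] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return 0;
}
```

The string recorded should be "21": the handler registered last runs first.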
The _exit() System Call — Back to the Kernel
As you can see above, the last thing this function does is call _exit, which really is a system call, defined in debug/glibc-2.15-a316c1f/sysdeps/unix/sysv/linux/i386/_exit.S:
.text
.type _exit,@function
.global _exit
_exit:
movl 4(%esp), %ebx
/* Try the new syscall first. */
#ifdef __NR_exit_group
movl $__NR_exit_group, %eax
ENTER_KERNEL
#endif
/* Not available. Now the old one. */
movl $__NR_exit, %eax
/* Don't bother using ENTER_KERNEL here. If the exit_group
syscall is not available AT_SYSINFO isn't either. */
int $0x80
/* This must not fail. Be sure we don't return. */
hlt
.size _exit,.-_exit
This code places the intended system call number (__NR_exit_group) into the %eax register to pass it to the kernel. It then invokes ENTER_KERNEL, which is an assembler macro that expands to this machine-language code:
0x42cc57ad <_exit+9> call *%gs:0x10
which calls this machine-language code, the typical sequence for calling a system service in the kernel:
0xb7fff414 <__kernel_vsyscall> push %ecx
0xb7fff415 <__kernel_vsyscall+1> push %edx
0xb7fff416 <__kernel_vsyscall+2> push %ebp
0xb7fff417 <__kernel_vsyscall+3> mov %esp,%ebp
0xb7fff419 <__kernel_vsyscall+5> sysenter
The sysenter instruction will cause a transition to kernel mode.
Usually the kernel returns to the user’s program when it is finished, but in the case of _exit the kernel does not return. Instead, it deletes the memory space occupied by our program. The process it was running in will be marked “defunct”.
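The same system call can be issued from C without going through exit or _exit, using the generic syscall wrapper that glibc provides. This is only a sketch to show that the call number and status are all the kernel needs; the function name exit_via_raw_syscall is hypothetical, and the call is made in a child process so the caller survives.

```c
#define _GNU_SOURCE       /* expose syscall() */
#include <assert.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Terminate a child with the raw exit_group system call, bypassing the C
   library's exit(); return the status the parent collects, or -1 on error. */
int exit_via_raw_syscall(int status)
{
    assert(status >= 0 && status <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        syscall(SYS_exit_group, status);   /* does not return */
        _exit(status);                     /* fallback, normally not reached */
    }

    int wstatus;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Because the child never passes through exit, none of the C library cleanup (handlers registered with atexit, flushing of stdio buffers) takes place in it.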
waitpid() — Back to the Shell
Everything said above starting with “execve() System Call” was performed in the child process started by the fork() system call. Meanwhile the parent process (the shell in our example) will continue processing. In general the parent will want to wait for the child to finish. To do that, the shell calls the system service, waitpid:
pid = waitpid (-1, &status, waitpid_flags);
Once waitpid returns to the process’s parent, both the child process and the program it was running are gone from memory.
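The “defunct” state can be observed directly. In the sketch below the parent first waits with the WNOWAIT flag, which reports that the child has terminated but leaves it unreaped; at that point the child still exists as far as kill is concerned. Only after a normal waitpid does it disappear. The function name is hypothetical, and the code assumes SIGCHLD has its default disposition.

```c
#define _DEFAULT_SOURCE   /* expose waitid() and WNOWAIT */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a terminated child stayed visible until it was reaped and
   disappeared afterwards, 0 if not, -1 on error. */
int child_lingers_until_reaped(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);                    /* child terminates immediately */

    /* Block until the child has exited, but leave it unreaped (WNOWAIT). */
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0)
        return -1;
    assert(info.si_pid == pid);

    /* Signal 0 only tests for existence: the defunct child still counts. */
    int visible_before = (kill(pid, 0) == 0);

    /* Reap the child; the kernel can now discard its process entry. */
    if (waitpid(pid, NULL, 0) != pid)
        return -1;

    int gone_after = (kill(pid, 0) == -1 && errno == ESRCH);
    return visible_before && gone_after;
}
```

A parent that never calls one of the wait functions leaves such defunct entries behind until it exits itself.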
Thus the program’s execution life cycle is ended.
Other Ways a Program May Terminate
In addition to calling exit to terminate the process, as was described above, there are alternative ways in which a program may terminate:
- The process can be abnormally terminated due to an error or action by another program or by the user. (In that case the C library cleanup code that exit() would have run is not performed.)
- The program can issue another execve call which will start a new program in place of the current program.
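A parent can tell normal from abnormal termination by examining the status that waitpid returns. The following sketch forks a child that either exits normally or is killed by a signal; the function name how_child_ended and the choice of exit status 3 and SIGKILL are arbitrary, made only for illustration.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that ends normally (kill_it == 0, exit status 3) or is killed
   by SIGKILL (kill_it != 0). Returns the exit status for a normal exit, or
   the negated signal number for abnormal termination; -1000 on error. */
int how_child_ended(int kill_it)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1000;
    if (pid == 0) {
        if (kill_it)
            raise(SIGKILL);          /* abnormal end: no exit handlers run */
        exit(3);                     /* normal end through the C library */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1000;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    assert(WIFSIGNALED(status));
    return -WTERMSIG(status);
}
```

This is the same information a shell uses to report that a command was “Killed” rather than printing nothing for a normal exit.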
Conclusion
Here is a summary of the steps described in this article.
1. Frequently, but not always, some process (often a shell) forks a new process for the new program.
2. In the process that is to run the new program, the existing program calls execve.
3. The kernel releases the old program’s address space and begins building a new address space.
4. The kernel maps the program into the new address space.
5. If the program uses dynamic libraries, then the kernel maps ld.so into the new address space.
6. If the program uses dynamic libraries, the kernel gives control to ld.so within the process context of the new program. ld.so then causes any shared libraries to be mapped into memory.
7. ld.so then transfers control to the new program for the first time at the label, _start.
8. _start saves some input parameters from the system and then calls __libc_start_main, which initializes the C run-time library.
9. __libc_start_main calls main, the beginning of the application program code.
10. The program runs until:
    - it terminates by calling exit or returning from the main function. (Continue to step 11.)
    - it calls execve to begin a new program. (Go back to step 2.)
    - the process is abnormally terminated. (Skip to step 13.)
11. The C run-time library cleans up.
12. The run-time library calls _exit, the system call that terminates the process.
13. The kernel releases the memory and other resources of the just-terminated process.
14. The parent process issues a wait call for the process that just terminated.
15. The kernel releases the terminated process’s task structure, the last remnant of the program.
I just ran across this web page, which is older than mine and also analyzes the ubiquitous “hello world” program; in this case it looks at what happens when the program calls printf().
http://www.osteras.info/personal/2013/10/11/hello-world-analysis.html