I want to write about how a C program becomes a process.
The lifecycle of C code, from being written in an IDE to becoming an executable, goes through the steps below.
Preprocessing : is the first pass of any C compilation. It processes include files, conditional compilation directives and macros.
Compilation : is the second pass. It takes the output of the preprocessor (the source code with includes and macros already expanded) and generates assembly source code.
Assembly : is the third stage of compilation. It takes the assembly source code and translates it into machine code, optionally producing an assembly listing with offsets. The assembler output is stored in an object file.
Linking : is the final stage of compilation. It takes one or more object files or libraries as input and combines them to produce a single (usually executable) file. In doing so, it resolves references to external symbols, assigns final addresses to procedures/functions and variables, and revises code and data to reflect new addresses (a process called relocation).
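To make the four stages concrete, here is a small sketch. The file name stages.c, the SQUARE macro and the area_of_square function are made up for illustration. The gcc options shown in the comments (-E, -S, -c) stop the toolchain after each stage so the intermediate output can be inspected.

```c
/* stages.c: a hypothetical file used to watch each stage of the build.
 *
 *   gcc -E stages.c -o stages.i   preprocess only (includes and macros expanded)
 *   gcc -S stages.c -o stages.s   stop after compilation (assembly source)
 *   gcc -c stages.c -o stages.o   stop after assembly (object file)
 *
 * Linking stages.o into an executable would additionally need a main()
 * from this or another object file.
 */
#include <stdio.h>              /* include file: pulled in by the preprocessor */

#define SQUARE(x) ((x) * (x))   /* macro: expanded by the preprocessor */

int area_of_square(int side)
{
#ifdef DEBUG                    /* conditional compilation: enable with gcc -DDEBUG */
    fprintf(stderr, "area_of_square(%d)\n", side);
#endif
    /* After preprocessing this line reads: return ((side) * (side)); */
    return SQUARE(side);
}
```

Running only the first command and opening stages.i is a quick way to see that the macro and the #ifdef block are gone before the compiler proper ever runs.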
An executable can be built from several source files, each of which is compiled and assembled into its own object file independently of the others.
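As a sketch of separate compilation, the block below shows what would normally live in two files, here called app.c and util.c. Both file names and the functions add and add_three are hypothetical. They are shown in a single block only so that the example is self-contained.

```c
/* ---- what app.c would contain ---------------------------------------- */
/* Only a declaration of add is visible here. When app.c is compiled on
 * its own (gcc -c app.c), app.o holds an unresolved reference to add. */
extern int add(int a, int b);

int add_three(int a, int b, int c)
{
    return add(add(a, b), c);
}

/* ---- what util.c would contain --------------------------------------- */
/* The definition. gcc -c util.c produces util.o, which defines add. */
int add(int a, int b)
{
    return a + b;
}

/* The linker resolves the reference when the objects are combined:
 *   gcc app.o util.o -o app
 * (a main() is also needed in one of the objects to get a runnable program).
 */
```

If only util.c changes, only util.o has to be rebuilt before linking again, which is the practical benefit of compiling each file independently.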
- In a typical system, a number of programs will be running. Each program relies on a number of functions, some of which are standard C library functions such as printf(), malloc() and strcpy(), and some of which are non-standard or user-defined functions.
- If every program uses the standard C library and carries its own copy of it, the same library code is duplicated in every executable. This wastes disk space and memory and degrades efficiency and performance.
- Since the C library is common to all of them, it is better to have each program reference a single shared instance of the library instead of containing its own copy.
- This is decided during the linking process: some objects are linked at link time (static linking), whereas others are linked at run time (deferred/dynamic linking).
- The term ‘statically linked’ means that the program and the particular library that it’s linked against are combined together by the linker at link time.
- This means that the binding between the program and the particular library is fixed and known at link time, before the program runs. It also means that we can’t change this binding unless we re-link the program with a new version of the library.
- Programs that are linked statically are linked against archives of objects (libraries) that typically have the extension of .a. An example of such a collection of objects is the standard C library, libc.a.
- You might consider linking a program statically, for example, when you aren’t sure whether the correct version of a library will be available at runtime, or when you are testing a new version of a library that you don’t yet want to install as shared.
- For gcc, the -static option can be used during the compilation/linking of the program.
gcc -static filename.c -o filename
- The drawback of this technique is that the executable is quite large, because all the library code it needs has to be copied into it.
- The term ‘dynamically linked’ means that the program and the particular library it references are not combined together by the linker at link time.
- Instead, the linker places information into the executable that tells the loader which shared object module the code is in and which runtime linker should be used to find and bind the references.
- This means that the binding between the program and the shared object is done at runtime: before the program starts, the appropriate shared objects are found and bound.
- This type of program is called a partially bound executable, because it isn’t fully resolved. The linker, at link time, didn’t cause all the referenced symbols in the program to be associated with specific code from the library.
- Instead, the linker simply said something like: “This program calls some functions within a particular shared object, so I’ll just make a note of which shared object these functions are in, and continue on”.
- Symbols from the shared objects are only checked for validity, to ensure that they exist somewhere; they are not yet combined into the program.
- The linker records in the executable which external libraries it found the missing symbols in. Effectively, this defers the binding until runtime.
- Programs that are linked dynamically are linked against shared objects that have the extension .so. An example of such an object is the shared object version of the standard C library, libc.so.
- The advantages of deferring the linking of some objects/modules from the static linking step until they are actually needed (at run time) include:
- Program files (on disk) become much smaller because they need not hold all the necessary text and data segments. Smaller files are also easier to copy and distribute.
- Standard libraries can be upgraded or patched without every program having to be re-linked. This requires an agreed module-naming convention, such as a version specification, that lets the dynamic linker find the newest installed module. Furthermore, libraries, including dynamically linked libraries (DLLs), are distributed in binary form (no source), and when you change your program you only have to recompile the file that was changed.
- Software vendors need only provide the library modules that are required. Additional runtime linking functions allow programs to load and link only the required modules programmatically.
- In combination with virtual memory, dynamic linking permits two or more processes to share read-only executable modules such as the standard C library. Using this technique, only one copy of a module needs to be resident in memory at any time, and multiple processes can each execute this shared (read-only) code. This results in considerable memory savings, although it demands an efficient swapping policy.
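The runtime linking functions mentioned above can be sketched with the POSIX dlopen() family. This is only an illustration under stated assumptions: the helper name cos_via_dlopen is made up, and the library name "libm.so.6" assumes a glibc-based Linux system. Other platforms name the math library differently.

```c
#include <dlfcn.h>    /* dlopen, dlsym, dlclose (POSIX) */
#include <stddef.h>   /* NULL */

/* Loads the math library at run time and calls its cos() through a pointer.
 * Returns 0 on success and stores the result in *out; returns -1 if the
 * library or the symbol cannot be found.
 * Assumption: the library is installed as "libm.so.6" (glibc on Linux). */
int cos_via_dlopen(double x, double *out)
{
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL)
        return -1;

    /* Look the symbol up by name; nothing was bound to it at link time. */
    double (*cos_fn)(double);
    *(void **)(&cos_fn) = dlsym(handle, "cos");
    if (cos_fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = cos_fn(x);
    dlclose(handle);   /* drop our reference to the shared object */
    return 0;
}
```

On some systems the program has to be linked with -ldl for these functions to be available. The point of the sketch is that the program names the module and the symbol as strings and binds to them only when the call is actually made.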