A compiler is a program that translates source code into object code that a specific central processing unit (CPU) can execute. The act of translating source code into object code is known as compilation. The term typically refers to translating source code from a high-level programming language (such as C++) into a low-level language (such as machine code) to create an executable program. Conversely, when a low-level language is converted into a high-level language, the process is called decompilation.
Phases of a compiler
A compiler works in phases, which keeps its design modular and helps ensure that source input is transformed correctly into target output. The phases are as follows:
1. Lexical Analyzer
It is also called a scanner. The lexical analyzer converts the sequence of characters in the source code into a sequence of tokens. Token patterns are defined by regular expressions, which the lexical analyzer uses to recognize each token. It also reports lexical errors and discards comments and whitespace.
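As an illustration, a scanner driven by regular expressions might look like the following minimal sketch. The token classes (NUMBER, IDENT, OP), the `#` comment syntax, and the `tokenize` function are assumptions made up for a tiny expression language, not part of any particular compiler.

```python
import re

# Hypothetical token classes for a tiny expression language (illustrative only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("COMMENT", r"#[^\n]*"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),  # any other single character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) tokens; raise on an unrecognized character."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind in ("COMMENT", "SKIP"):
            continue  # comments and whitespace are discarded
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens
```

For example, `tokenize("x = 3 + 42  # add")` yields five tokens (an identifier, two operators, and two numbers), while the comment and the spaces are dropped.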
2. Syntax Analyzer
The syntax analyzer, also called a parser, takes the tokens one by one and uses a context-free grammar to construct the parse tree, which shows whether the token sequence conforms to the grammar of the language. A syntax error is reported if the input is not in accordance with the grammar.
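One common way to build such a tree by hand is recursive descent, sketched below. The grammar (expr, term, factor), the (kind, text) token shape, and the nested-tuple tree are assumptions chosen for illustration; the result is a simplified tree, closer to an abstract syntax tree than to a full parse tree.

```python
# Assumed grammar (illustrative):
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | IDENT | '(' expr ')'

def parse(tokens):
    """Parse a list of (kind, text) tokens into a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():
        kind, text = advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "IDENT":
            return ("var", text)
        if (kind, text) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree
```

Because `term` is called from inside `expr`, multiplication binds tighter than addition. Input that does not fit the grammar, such as a trailing `+`, raises `SyntaxError`.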
3. Semantic Analyzer
The semantic analyzer verifies that the parse tree constructed by the syntax analyzer is meaningful. It performs type checking, label checking, and flow control checking.
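Type checking, for instance, can be sketched as a walk over the tree that computes a type for each node. The node shapes ("num", "str", "var", and binary operators as nested tuples), the two type names, and the `env` mapping of declared variables are all assumptions for illustration.

```python
def check_types(node, env):
    """Return the type name ('int' or 'str') of an expression tree, or raise an error."""
    tag = node[0]
    if tag == "num":
        return "int"
    if tag == "str":
        return "str"
    if tag == "var":
        if node[1] not in env:
            raise NameError(f"undeclared variable {node[1]!r}")
        return env[node[1]]
    # Otherwise the node is a binary operation: (operator, left, right).
    op, left, right = node
    left_type = check_types(left, env)
    right_type = check_types(right, env)
    if left_type != right_type:
        raise TypeError(
            f"operands of {op!r} have mismatched types: {left_type} and {right_type}"
        )
    if left_type == "str" and op != "+":
        raise TypeError(f"operator {op!r} is not defined for strings")
    return left_type
```

A tree can be syntactically valid and still fail here: adding a number to a string parses without error under the grammar, but this check rejects it.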
4. Intermediate Code Generator
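The intermediate code generator produces intermediate code, a representation that sits between the source language and machine language and is not tied to a particular machine. Intermediate code is converted into machine language in the last two phases, which are where platform-dependent decisions are made.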
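Three-address code is one widely used intermediate form: each instruction has at most one operator and stores its result in a temporary. The sketch below flattens an expression tree into such instructions. The nested-tuple tree shape, the temporary names `t1`, `t2`, and the function name are assumptions for illustration.

```python
def to_three_address(tree):
    """Flatten a nested-tuple expression tree into three-address instructions.

    Returns (code, result), where code is a list of
    (target, left_operand, operator, right_operand) tuples and result is
    the name that holds the final value.
    """
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if node[0] == "num":
            return str(node[1])
        if node[0] == "var":
            return node[1]
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        counter += 1
        temp = f"t{counter}"  # fresh temporary for this sub-expression
        code.append((temp, left_name, op, right_name))
        return temp

    result = emit(tree)
    return code, result
```

For the tree of `a + b * 2`, this produces `t1 = b * 2` followed by `t2 = a + t1`, neither of which mentions registers or any specific instruction set.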
5. Code Optimizer
The code optimizer transforms the code so that it consumes fewer resources and runs faster. The meaning of the code being transformed is not altered.
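Constant folding is a simple example of a meaning-preserving optimization: sub-expressions whose operands are all known at compile time are evaluated once by the compiler instead of at run time. The sketch below assumes the same illustrative nested-tuple tree shape and deliberately leaves division alone to sidestep division by zero.

```python
import operator

# Operators this sketch is willing to evaluate at compile time.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(node):
    """Return an equivalent tree with constant sub-expressions pre-computed."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node
    left = fold_constants(left)
    right = fold_constants(right)
    if op in FOLDABLE and left[0] == "num" and right[0] == "num":
        return ("num", FOLDABLE[op](left[1], right[1]))
    return (op, left, right)
```

Applied to the tree of `x + 2 * 3`, the result is the tree of `x + 6`: one multiplication fewer at run time, with the same value for every `x`.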
6. Target Code Generator
This is the final phase of compilation. The target code generator writes code that a machine can understand and also performs register allocation and instruction selection. The output depends on the target machine and its assembler. The optimized code is converted into machine code, which forms the input to the linker and loader.
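To make register allocation and instruction selection concrete, here is a rough sketch that emits assembly-like text for a made-up register machine. The mnemonics (LOAD, LOADI, ADD, SUB, MUL, DIV), the register names, and the naive allocation strategy are all invented for illustration; real code generators use far more careful algorithms and spill values to memory when registers run out.

```python
# Instruction selection table for a hypothetical machine (not a real ISA).
INSTRUCTIONS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate(tree, num_registers=4):
    """Emit assembly-like lines for a nested-tuple expression tree.

    Returns (lines, result_register).
    """
    lines = []
    # Free-register stack, ordered so that r0 is handed out first.
    free = [f"r{i}" for i in range(num_registers - 1, -1, -1)]

    def gen(node):
        if node[0] in ("num", "var"):
            if not free:
                raise RuntimeError("out of registers (spilling is not implemented)")
            reg = free.pop()
            mnemonic = "LOADI" if node[0] == "num" else "LOAD"
            lines.append(f"{mnemonic} {reg}, {node[1]}")
            return reg
        op, left, right = node
        left_reg = gen(left)
        right_reg = gen(right)
        # The result overwrites the left operand's register.
        lines.append(f"{INSTRUCTIONS[op]} {left_reg}, {left_reg}, {right_reg}")
        free.append(right_reg)  # the right operand's register can be reused
        return left_reg

    result = gen(tree)
    return lines, result
```

The same tree would yield different output for a different target, which is why this phase is the one most closely tied to the platform.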
Types of compilers
There are many types of compilers, such as:
- Cross compiler: It is capable of creating code for a platform other than the one on which the compiler is running. The compiled program runs on a computer that has a different operating system or CPU from the one on which the compiler runs.
- Source-to-source compiler: Also known as a transcompiler, it translates source code written in one programming language into source code of another programming language.