Saturday, 15 October 2016

Compiler Terminology1

checking for meaning
This is an alternative phrase for semantic analysis.
comment
Part of a source program which is intended for a human reader and which is ignored by the compiler. As a minimum, all variables and all routines should have explanatory comments. A comment may attempt to explain nearby code, but problems ensue if the code is updated and the comment is not. Ideally, comments should tell WHAT code does, not HOW; the code should be self-explanatory.
compilation
The act of translation performed by a compiler.
compiler
A computer program that translates a computer program written in one computer language (called the source language) into an equivalent program written in another computer language (called the target language).
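As a minimal sketch of the compiler entry above, the following Python function translates a tiny source language (arithmetic expressions written as nested tuples) into an equivalent program for a hypothetical stack machine. The tuple format, the instruction names (PUSH, ADD, SUB, MUL, DIV) and the function name compile_expr are all assumptions made for illustration; no real compiler is this simple.

```python
def compile_expr(expr):
    """Translate a nested-tuple expression into stack-machine instructions."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, left, right = expr
    opcodes = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
    # Post-order: emit code for both operands first, then the operator.
    return compile_expr(left) + compile_expr(right) + [(opcodes[op],)]


# Source program: (1 + 2) * 3
print(compile_expr(("*", ("+", 1, 2), 3)))
```

Note that nothing is calculated here: the output is another program, which still has to be run by something else.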
compile-time
The period during which a program is being compiled, as distinct from run-time when a program is actually running. Certain errors can be detected at compile-time (e.g. misspelt identifier) while other errors may not be detectable until run-time (e.g. division by 0).
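The distinction can be sketched in Python, whose built-in compile() translates source text without running it. The helper name classify is an assumption for illustration. Note also that Python itself only reports a misspelt identifier at run-time, so a malformed statement stands in here as the compile-time error.

```python
def classify(source):
    """Report whether a snippet fails at compile-time, at run-time, or not at all."""
    try:
        code = compile(source, "<example>", "exec")  # translation only, nothing runs
    except SyntaxError:
        return "compile-time error"
    try:
        exec(code, {})  # now the program is actually running
    except Exception:
        return "run-time error"
    return "ok"


print(classify("x = (1 + "))  # malformed source text
print(classify("x = 1 / 0"))  # well-formed, but fails when executed
print(classify("x = 1 + 1"))
```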
GNU
A recursive acronym for "GNU's Not Unix". GNU is an independently written Unix lookalike, which can be combined with the Linux kernel to produce GNU/Linux - a useful free operating system.
GPL
GNU General Public Licence. Much copyrighted free software, including GNU/Linux, is available under this licence. The general intention is to ensure that such software remains freely available for anyone to use and/or modify.
identifier
A name used in a program. Normally some (possibly unpronounceable) combination of letters and digits which starts with a letter. Some programming languages may allow other characters such as underline '_' or hash '#'. Identifiers may be defined by the programmer or they may be pre-defined by the language (e.g. 'sqrt' is often a pre-defined identifier for referring to a 'square-root' function). Older programming languages used to limit identifiers to a maximum of 6 characters - the resulting abbreviations tended to make programs more difficult to understand. Many programming languages allow identifiers of almost any length, though some may only take the first 32 characters into account. There are two main styles of writing long identifiers: either now_for_some_long_name or NowForSomeLongName.
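The usual rule for identifiers can be written as a regular expression. The sketch below assumes one particular rule (a letter followed by any mix of letters, digits and underlines); real languages differ in the details, and the names IDENTIFIER and is_identifier are made up for this example.

```python
import re

# Assumed rule: a letter, then any mix of letters, digits and underlines.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_identifier(text):
    """Return True if the whole of text is a valid identifier under the assumed rule."""
    return IDENTIFIER.fullmatch(text) is not None


for name in ["sqrt", "now_for_some_long_name", "NowForSomeLongName", "2fast", "a-b"]:
    print(name, is_identifier(name))
```

The first three names are accepted; "2fast" is rejected because it starts with a digit, and "a-b" because '-' is not an allowed character.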
implementation language
This is the programming language in which the compiler or interpreter is written. It might be the same as either the source language or the target language.
intermediate language
Some moderately low-level language used as an interface between the front-end and the back-end of a compiler.
interpreter
A computer program which examines a computer program written in some source language and carries out the actions required by that program more or less directly, without translating it into some other language. An interpreted program may well run more slowly than a compiled one, especially if there is a lot of calculation.
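As a minimal sketch of an interpreter, the function below examines a program (an arithmetic expression held as nested tuples) and carries out its actions directly, producing a result rather than a translated program. The tuple format and the names OPS and interpret are assumptions for illustration.

```python
import operator

# Map each operator symbol to the action the interpreter performs for it.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def interpret(expr):
    """Evaluate a nested-tuple expression directly, without translating it."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](interpret(left), interpret(right))


# Source program: (1 + 2) * 3
print(interpret(("*", ("+", 1, 2), 3)))
```

Every time the expression is evaluated, the interpreter has to walk the whole structure again, which is one reason interpretation tends to be slower.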

J to R

just-in-time compilation
A cross between a compiler and an interpreter. The source program is translated to machine language while it is running, and the translated code is executed immediately. The underlying machine runs the translated code directly, whereas an interpreter carries out the program's actions itself.
keyword
Many programming languages reserve some identifiers as keywords for use when indicating the structure of a program, e.g. 'if' is often used to indicate some conditional code. Languages such as Pascal/C/C++ have around 50 reserved keywords, Fortran doesn't have any, COBOL has around 300.
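Python is one language that publishes its own keyword list, through the standard keyword module. The short sketch below uses it to tell a reserved word from an ordinary identifier; the wrapper name is_reserved is an assumption for illustration, and the exact number of keywords depends on the Python version.

```python
import keyword


def is_reserved(word):
    """Return True if word is a reserved keyword in the running Python version."""
    return keyword.iskeyword(word)


print(is_reserved("if"))     # a keyword
print(is_reserved("sqrt"))   # an ordinary identifier
print(len(keyword.kwlist))   # how many keywords this Python version reserves
```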
lexical analysis
This is an alternative name for scanning or tokenisation. The function of lexical analysis is to scan the source program (a sequence of characters arranged on lines) and convert it to a sequence of valid tokens. Any comments are usually removed at this stage as well. Syntax analysis is then responsible for checking that the tokens form a valid sequence.
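A minimal sketch of lexical analysis, assuming a made-up source language with numbers, identifiers, a few operator symbols and comments that start with '#'. The token names and the function tokenise are hypothetical; the point is only to show characters going in and tokens coming out, with comments and white space dropped.

```python
import re

# Each token kind and the pattern that recognises it (assumed for this sketch).
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9_]*"),
    ("SYMBOL", r"<=|>=|[-+*/=<>()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenise(source):
    """Convert source text to a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"invalid character {source[pos]!r} at position {pos}")
        # Comments and white space are recognised but not passed on.
        if match.lastgroup not in ("COMMENT", "SKIP"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenise("Result <= x1 + 42  # compare"))
```

The scanner only checks that each token is well formed. It would happily accept "+ + 42 <=", since deciding whether the tokens are in a sensible order is left to syntax analysis.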
machine language
This is the lowest-level language. It consists of just binary digits, which the processor executes directly. Programmers wrote in it directly only in the earliest days of computing, for example to create the first compilers.
parsing
This is an alternative name for syntax analysis.
pass
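A single scan of the source text (or of some intermediate form of it) during compilation. The number of passes is the number of times that the source text has to be scanned or rescanned in order to compile the program. Due to limited main memory, some early compilers had a large number of passes (about 60).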
pragma
This is sometimes referred to as a significant comment. The intention is that of passing directives to the compiler or the preprocessor at various points in the source program. Such directives might indicate whether speed or space is more important, or if certain checks should be suppressed, or if a particular routine is to be timed, etc.
preprocessor
A program that takes text and performs lexical conversions on it. The conversions may include macro substitution, conditional inclusion, and inclusion of other files. This can be used to write platform-independent code by excluding source files that are not necessary on certain platforms without changing the build instructions.
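A rough sketch of two of these conversions, macro substitution and conditional inclusion, is given below. It assumes C-like directive names (#define, #ifdef, #endif) and leaves out file inclusion altogether; the function name preprocess and the sample text are made up for illustration, and a real preprocessor handles many more cases.

```python
import re


def preprocess(text, defined=()):
    """Apply #define, #ifdef and #endif to text and return the resulting lines."""
    macros = {name: "" for name in defined}
    active = [True]  # stack of flags: is the current region included?
    output = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        directive = parts[0] if parts else ""
        if directive == "#ifdef":
            active.append(active[-1] and parts[1] in macros)
        elif directive == "#endif":
            active.pop()
        elif not active[-1]:
            continue  # inside an excluded region
        elif directive == "#define":
            macros[parts[1]] = parts[2] if len(parts) > 2 else ""
        else:
            # Replace whole-word occurrences of each macro name.
            for name, value in macros.items():
                line = re.sub(rf"\b{re.escape(name)}\b", lambda m, v=value: v, line)
            output.append(line)
    return "\n".join(output)


source = """#define SIZE 10
#ifdef WINDOWS
use_windows_api()
#endif
buffer = allocate(SIZE)"""
print(preprocess(source))
print(preprocess(source, defined=("WINDOWS",)))
```

The same source text yields different output depending on whether WINDOWS is defined, which is how one file can serve several platforms.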
reserved word
This is an alternative phrase for keyword.
run-time
The period when a program is running/being executed/doing some hopefully useful work, as distinct from compile-time when the program is being translated.
scanning
This is an alternative word for lexical analysis or tokenisation.
semantic analysis
The function of semantic analysis is to check that the source program is meaningful. Note that a program can have a valid meaning and still be incorrect if it doesn't do what was really intended.
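One common semantic check is that every name is given a value before it is used. The sketch below applies that single check to straight-line Python assignments, using the standard ast module to do the parsing. The function name undefined_names and the restriction to simple assignments are assumptions made to keep the example short.

```python
import ast


def undefined_names(source):
    """Return names that are read before being assigned in straight-line code."""
    defined = set()
    problems = []
    for statement in ast.parse(source).body:
        # Assumption: each statement is a simple assignment such as `x = y + 1`.
        if not isinstance(statement, ast.Assign):
            raise ValueError("this sketch only handles simple assignments")
        for node in ast.walk(statement.value):
            if isinstance(node, ast.Name) and node.id not in defined:
                problems.append(node.id)
        for target in statement.targets:
            if isinstance(target, ast.Name):
                defined.add(target.id)
    return problems


# Grammatically fine, but 'hieght' is misspelt and so has no meaning.
print(undefined_names("width = 3\narea = width * hieght"))
```

A program that passes this check can still be wrong: "area = width + height" is meaningful, but is not the area.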
source language
The language accepted as input by a compiler, and translated/compiled into a target language. It is normally a high-level language, written using some mixture of English words and mathematical notation.
syntax analysis
This is an alternative name for parsing. The function of syntax analysis is to check that the source program is grammatically correct, i.e. that we have a valid sequence of tokens. Checking that the source program actually means something is the job of semantic analysis.
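A minimal sketch of syntax analysis, assuming a small made-up grammar for arithmetic expressions: an expression is terms joined by '+' or '-', a term is factors joined by '*' or '/', and a factor is a name, a number or a bracketed expression. The checker below uses recursive descent, which is one of several parsing techniques; the function name is_valid_expression and the use of plain strings as tokens are assumptions for illustration.

```python
def is_valid_expression(tokens):
    """Check a list of token strings against a small arithmetic grammar."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        token = peek()
        if token == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif token is not None and token.isalnum():
            advance()  # a name or a number
        else:
            raise SyntaxError(f"unexpected token {token!r}")

    def term():
        factor()
        while peek() in ("*", "/"):
            advance()
            factor()

    def expr():
        term()
        while peek() in ("+", "-"):
            advance()
            term()

    try:
        expr()
    except SyntaxError:
        return False
    return pos == len(tokens)  # every token must have been consumed


print(is_valid_expression(["(", "a", "+", "2", ")", "*", "b"]))
print(is_valid_expression(["a", "+", "*", "b"]))
```

The first sequence is accepted and the second is rejected, although every individual token in it is valid.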
target language
The language in which the output of a compiler is written. It is normally a low-level language such as assembler, written with somewhat cryptic abbreviations for machine instructions, but may instead be machine code for some actual or virtual computer.
token
A fundamental symbol as processed by syntax analysis. A token may be an identifier 'Result', a reserved keyword 'if', a compound symbol '<=', or a single character '+'.
tokenisation
This is an alternative word for lexical analysis or scanning. It means 'conversion to tokens'.
