
An Introduction to GCC
for the GNU Compilers gcc and g++

Brian Gough
Foreword by Richard M. Stallman

A catalogue record for this book is available from the British Library.
First printing, March 2004 (7/3/2004).
Published by Network Theory Limited.
15 Royal Park
Bristol
BS8 3AL
United Kingdom
Email: info@network-theory.co.uk
ISBN 0-9541617-9-3
Further information about this book is available from
http://www.network-theory.co.uk/gcc/intro/
Cover Image: From a layout of a fast, energy-efficient hardware stack.(1)
Image created with the free Electric VLSI design system by Steven Rubin
of Static Free Software (www.staticfreesoft.com). Static Free Software
provides support for Electric to the electronics design industry.
Copyright © 2004 Network Theory Ltd.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2
or any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover Texts being “A Network Theory
Manual”, and with the Back-Cover Texts as in (a) below. A copy of
the license is included in the section entitled “GNU Free Documentation
License”.
(a) The Back-Cover Text is: “The development of this manual was funded
entirely by Network Theory Ltd. Copies published by Network Theory
Ltd raise money for more free documentation.”
The Texinfo source for this manual may be obtained from:
http://www.network-theory.co.uk/gcc/intro/src/
(1) “A Fast and Energy-Efficient Stack” by J. Ebergen, D. Finchelstein, R. Kao,
J. Lexau and R. Hopkins.


Table of Contents

Foreword . . . 1

1 Introduction . . . 3
  1.1 A brief history of GCC . . . 3
  1.2 Major features of GCC . . . 4
  1.3 Programming in C and C++ . . . 4
  1.4 Conventions used in this manual . . . 5

2 Compiling a C program . . . 7
  2.1 Compiling a simple C program . . . 7
  2.2 Finding errors in a simple program . . . 8
  2.3 Compiling multiple source files . . . 9
  2.4 Compiling files independently . . . 10
    2.4.1 Creating object files from source files . . . 11
    2.4.2 Creating executables from object files . . . 11
    2.4.3 Link order of object files . . . 12
  2.5 Recompiling and relinking . . . 13
  2.6 Linking with external libraries . . . 14
    2.6.1 Link order of libraries . . . 15
  2.7 Using library header files . . . 16

3 Compilation options . . . 19
  3.1 Setting search paths . . . 19
    3.1.1 Search path example . . . 20
    3.1.2 Environment variables . . . 21
    3.1.3 Extended search paths . . . 22
  3.2 Shared libraries and static libraries . . . 23
  3.3 C language standards . . . 25
    3.3.1 ANSI/ISO . . . 26
    3.3.2 Strict ANSI/ISO . . . 28
    3.3.3 Selecting specific standards . . . 28
  3.4 Warning options in -Wall . . . 29
  3.5 Additional warning options . . . 30

4 Using the preprocessor . . . 35
  4.1 Defining macros . . . 35
  4.2 Macros with values . . . 36
  4.3 Preprocessing source files . . . 38

5 Compiling for debugging . . . 41
  5.1 Examining core files . . . 41
  5.2 Displaying a backtrace . . . 43

6 Compiling with optimization . . . 45
  6.1 Source-level optimization . . . 45
    6.1.1 Common subexpression elimination . . . 45
    6.1.2 Function inlining . . . 46
  6.2 Speed-space tradeoffs . . . 47
    6.2.1 Loop unrolling . . . 47
  6.3 Scheduling . . . 49
  6.4 Optimization levels . . . 49
  6.5 Examples . . . 50
  6.6 Optimization and debugging . . . 52
  6.7 Optimization and compiler warnings . . . 53

7 Compiling a C++ program . . . 55
  7.1 Compiling a simple C++ program . . . 55
  7.2 Using the C++ standard library . . . 56
  7.3 Templates . . . 57
    7.3.1 Using C++ standard library templates . . . 57
    7.3.2 Providing your own templates . . . 58
    7.3.3 Explicit template instantiation . . . 60
    7.3.4 The export keyword . . . 61

8 Platform-specific options . . . 63
  8.1 Intel and AMD x86 options . . . 63
  8.2 DEC Alpha options . . . 64
  8.3 SPARC options . . . 65
  8.4 POWER/PowerPC options . . . 65
  8.5 Multi-architecture support . . . 66

9 Troubleshooting . . . 69
  9.1 Help for command-line options . . . 69
  9.2 Version numbers . . . 69
  9.3 Verbose compilation . . . 70

10 Compiler-related tools . . . 73
  10.1 Creating a library with the GNU archiver . . . 73
  10.2 Using the profiler gprof . . . 75
  10.3 Coverage testing with gcov . . . 77

11 How the compiler works . . . 81
  11.1 An overview of the compilation process . . . 81
  11.2 The preprocessor . . . 81
  11.3 The compiler . . . 82
  11.4 The assembler . . . 83
  11.5 The linker . . . 83

12 Examining compiled files . . . 85
  12.1 Identifying files . . . 85
  12.2 Examining the symbol table . . . 86
  12.3 Finding dynamically linked libraries . . . 86

13 Getting help . . . 89

Further reading . . . 91
Acknowledgements . . . 93
Other books from the publisher . . . 95
Free software organizations . . . 97
GNU Free Documentation License . . . 99
  ADDENDUM: How to use this License for your documents . . . 104
Index . . . 105


Foreword
This foreword has been kindly contributed by Richard M. Stallman, the
principal author of GCC and founder of the GNU Project.
This book is a guide to getting started with GCC, the GNU Compiler
Collection. It will tell you how to use GCC as a programming tool. GCC
is a programming tool, that’s true—but it is also something more. It is
part of a 20-year campaign for freedom for computer users.
We all want good software, but what does it mean for software to
be “good”? Convenient features and reliability are what it means to be
technically good, but that is not enough. Good software must also be
ethically good: it has to respect the users’ freedom.
As a user of software, you should have the right to run it as you see
fit, the right to study the source code and then change it as you see fit,
the right to redistribute copies of it to others, and the right to publish a
modified version so that you can contribute to building the community.
When a program respects your freedom in this way, we call it free software.
Before GCC, there were other compilers for C, Fortran, Ada, etc. But
they were not free software; you could not use them in freedom. I wrote
GCC so we could use a compiler without giving up our freedom.
A compiler alone is not enough—to use a computer system, you need
a whole operating system. In 1983, all operating systems for modern computers were non-free. To remedy this, in 1984 I began developing the
GNU operating system, a Unix-like system that would be free software.
Developing GCC was one part of developing GNU.
By the early 90s, the nearly-finished GNU operating system was completed by the addition of a kernel, Linux, that became free software in
1992. The combined GNU/Linux operating system has achieved the goal
of making it possible to use a computer in freedom. But freedom is never
automatically secure, and we need to work to defend it. The Free Software
Movement needs your support.
Richard M. Stallman
February 2004


1 Introduction
The purpose of this book is to explain the use of the GNU C and C++
compilers, gcc and g++. After reading this book you should understand
how to compile a program, and how to use basic compiler options for
optimization and debugging. This book does not attempt to teach the C
or C++ languages themselves, since this material can be found in many
other places (see [Further reading], page 91).
Experienced programmers who are familiar with other systems, but
new to the GNU compilers, can skip the early sections of the chapters
“Compiling a C program”, “Using the preprocessor” and “Compiling a
C++ program”. The remaining sections and chapters should provide a
good overview of the features of GCC for those who already know how to use
other compilers.

1.1 A brief history of GCC
The original author of the GNU C Compiler (GCC) is Richard Stallman,
the founder of the GNU Project.
The GNU project was started in 1984 to create a complete Unix-like
operating system as free software, in order to promote freedom and cooperation among computer users and programmers. Every Unix-like operating system needs a C compiler, and as there were no free compilers in
existence at that time, the GNU Project had to develop one from scratch.
The work was funded by donations from individuals and companies to the
Free Software Foundation, a non-profit organization set up to support the
work of the GNU Project.
The first release of GCC was made in 1987. This was a significant
breakthrough, being the first portable ANSI C optimizing compiler released as free software. Since that time GCC has become one of the most
important tools in the development of free software.
A major revision of the compiler came with the 2.0 series in 1992,
which added the ability to compile C++. In 1997 an experimental branch
of the compiler (EGCS) was created, to improve optimization and C++
support. Following this work, EGCS was adopted as the new main-line of
GCC development, and these features became widely available in the 3.0
release of GCC in 2001.
Over time GCC has been extended to support many additional languages, including Fortran, Ada, Java and Objective-C. The acronym
GCC is now used to refer to the “GNU Compiler Collection”. Its development is guided by the GCC Steering Committee, a group composed
of representatives from GCC user communities in industry, research and
academia.

1.2 Major features of GCC
This section describes some of the most important features of GCC.
First of all, GCC is a portable compiler—it runs on most platforms
available today, and can produce output for many types of processors. In
addition to the processors used in personal computers, it also supports
microcontrollers, DSPs and 64-bit CPUs.
GCC is not only a native compiler—it can also cross-compile any program, producing executable files for a different system from the one used
by GCC itself. This allows software to be compiled for embedded systems
which are not capable of running a compiler. GCC is written in C with
a strong focus on portability, and can compile itself, so it can be adapted
to new systems easily.
GCC has multiple language frontends, for parsing different languages.
Programs in each language can be compiled, or cross-compiled, for any
architecture. For example, an Ada program can be compiled for a microcontroller, or a C program for a supercomputer.
GCC has a modular design, allowing support for new languages and
architectures to be added. Adding a new language front-end to GCC
enables the use of that language on any architecture, provided that the
necessary run-time facilities (such as libraries) are available. Similarly,
adding support for a new architecture makes it available to all languages.
Finally, and most importantly, GCC is free software, distributed under
the GNU General Public License (GNU GPL).(1) This means you have
the freedom to use and to modify GCC, as with all GNU software. If you
need support for a new type of CPU, a new language, or a new feature
you can add it yourself, or hire someone to enhance GCC for you. You
can hire someone to fix a bug if it is important for your work.
Furthermore, you have the freedom to share any enhancements you
make to GCC. As a result of this freedom you can also make use of
enhancements to GCC developed by others. The many features offered
by GCC today show how this freedom to cooperate works to benefit you,
and everyone else who uses GCC.

(1) For details see the license file ‘COPYING’ distributed with GCC.

1.3 Programming in C and C++
C and C++ are languages that allow direct access to the computer’s memory. Historically, they have been used for writing low-level systems software, and applications where high performance or control over resource
usage is critical. However, great care is required to ensure that memory is accessed correctly, to avoid corrupting other data-structures. This
book describes techniques that will help in detecting potential errors during compilation, but the risk in using languages like C or C++ can never
be eliminated.
In addition to C and C++ the GNU Project also provides other high-level languages, such as GNU Common Lisp (gcl), GNU Smalltalk (gst),
the GNU Scheme extension language (guile) and the GNU Compiler for
Java (gcj). These languages do not allow the user to access memory
directly, eliminating the possibility of memory access errors. They are a
safer alternative to C and C++ for many applications.

1.4 Conventions used in this manual
This manual contains many examples which can be typed at the keyboard.
A command entered at the terminal is shown like this,
$ command
followed by its output. For example:
$ echo "hello world"
hello world
The first character on the line is the terminal prompt, and should not be
typed. The dollar sign ‘$’ is used as the standard prompt in this manual,
although some systems may use a different character.
When a command in an example is too long to fit in a single line it is
wrapped and then indented on subsequent lines, like this:
$ echo "an example of a line which is too long to fit
    in this manual"
When entered at the keyboard, the entire command should be typed on
a single line.
The example source files used in this manual can be downloaded from
the publisher’s website,(2) or entered by hand using any text editor, such
as the standard GNU editor, emacs. The example compilation commands
use gcc and g++ as the names of the GNU C and C++ compilers, and cc
to refer to other compilers. The example programs should work with any
version of GCC. Any command-line options which are only available in
recent versions of GCC are noted in the text.

(2) See http://www.network-theory.co.uk/gcc/intro/

The examples assume the use of a GNU operating system—there may
be minor differences in the output on other systems. Some non-essential
and verbose system-dependent output messages (such as very long system
paths) have been edited in the examples for brevity. The commands for
setting environment variables use the syntax of the standard GNU shell
(bash), and should work with any version of the Bourne shell.

2 Compiling a C program
This chapter describes how to compile C programs using gcc. Programs
can be compiled from a single source file or from multiple source files, and
may use system libraries and header files.
Compilation refers to the process of converting a program from the
textual source code, in a programming language such as C or C++, into
machine code, the sequence of 1’s and 0’s used to control the central
processing unit (CPU) of the computer. This machine code is then stored
in a file known as an executable file, sometimes referred to as a binary
file.

2.1 Compiling a simple C program
The classic example program for the C language is Hello World. Here is
the source code for our version of the program:
#include <stdio.h>

int
main (void)
{
  printf ("Hello, world!\n");
  return 0;
}
We will assume that the source code is stored in a file called ‘hello.c’.
To compile the file ‘hello.c’ with gcc, use the following command:
$ gcc -Wall hello.c -o hello
This compiles the source code in ‘hello.c’ to machine code and stores
it in an executable file ‘hello’. The output file for the machine code is
specified using the ‘-o’ option. This option is usually given as the last
argument on the command line. If it is omitted, the output is written to
a default file called ‘a.out’.
Note that if a file with the same name as the executable file already
exists in the current directory it will be overwritten.
The option ‘-Wall’ turns on all the most commonly-used compiler
warnings—it is recommended that you always use this option! There are
many other warning options which will be discussed in later chapters, but
‘-Wall’ is the most important. GCC will not produce any warnings unless
they are enabled. Compiler warnings are an essential aid in detecting
problems when programming in C and C++.
In this case, the compiler does not produce any warnings with the
‘-Wall’ option, since the program is completely valid. Source code which
does not produce any warnings is said to compile cleanly.
To run the program, type the path name of the executable like this:
$ ./hello
Hello, world!
This loads the executable file into memory and causes the CPU to begin
executing the instructions contained within it. The path ./ refers to the
current directory, so ./hello loads and runs the executable file ‘hello’
located in the current directory.

2.2 Finding errors in a simple program
As mentioned above, compiler warnings are an essential aid when programming in C and C++. To demonstrate this, the program below contains a subtle error: it uses the function printf incorrectly, by specifying
a floating-point format ‘%f’ for an integer value:
#include <stdio.h>

int
main (void)
{
  printf ("Two plus two is %f\n", 4);
  return 0;
}
This error is not obvious at first sight, but can be detected by the compiler
if the warning option ‘-Wall’ has been enabled.
Compiling the program above, ‘bad.c’, with the warning option
‘-Wall’ produces the following message:
$ gcc -Wall bad.c -o bad
bad.c: In function ‘main’:
bad.c:6: warning: double format, different
    type arg (arg 2)
This indicates that a format string has been used incorrectly in the file
‘bad.c’ at line 6. The messages produced by GCC always have the form
file:line-number:message. The compiler distinguishes between error messages, which prevent successful compilation, and warning messages which
indicate possible problems (but do not stop the program from compiling).
In this case, the correct format specifier would have been ‘%d’ (the
allowed format specifiers for printf can be found in any general book on
C, such as the GNU C Library Reference Manual, see [Further reading],
page 91).
Without the warning option ‘-Wall’ the program appears to compile
cleanly, but produces incorrect results:
$ gcc bad.c -o bad
$ ./bad
Two plus two is 2.585495    (incorrect output)
The incorrect format specifier causes the output to be corrupted, because
the function printf is passed an integer instead of a floating-point number. Integers and floating-point numbers are stored in different formats
in memory, and generally occupy different numbers of bytes, leading to a
spurious result. The actual output shown above may differ, depending on
the specific platform and environment.
Clearly, it is very dangerous to develop a program without checking
for compiler warnings. If there are any functions which are not used
correctly they can cause the program to crash, or to produce incorrect
results. Turning on the compiler warning option ‘-Wall’ will catch many
of the commonest errors which occur in C programming.

2.3 Compiling multiple source files
A program can be split up into multiple files. This makes it easier to edit
and understand, especially in the case of large programs—it also allows
the individual parts to be compiled independently.
In the following example we will split up the program Hello World into
three files: ‘main.c’, ‘hello_fn.c’ and the header file ‘hello.h’. Here is
the main program ‘main.c’:
#include "hello.h"

int
main (void)
{
  hello ("world");
  return 0;
}
The original call to the printf system function in the previous program
‘hello.c’ has been replaced by a call to a new external function hello,
which we will define in a separate file ‘hello_fn.c’.
The main program also includes the header file ‘hello.h’ which will
contain the declaration of the function hello. The declaration is used
to ensure that the types of the arguments and return value match up
correctly between the function call and the function definition. We no
longer need to include the system header file ‘stdio.h’ in ‘main.c’ to
declare the function printf, since the file ‘main.c’ does not call printf
directly.
The declaration in ‘hello.h’ is a single line specifying the prototype
of the function hello:
void hello (const char * name);
The definition of the function hello itself is contained in the file
‘hello_fn.c’:
#include <stdio.h>
#include "hello.h"

void
hello (const char * name)
{
  printf ("Hello, %s!\n", name);
}
This function prints the message “Hello, name!” using its argument as
the value of name.
Incidentally, the difference between the two forms of the include statement #include "FILE.h" and #include <FILE.h> is that the former
searches for ‘FILE.h’ in the current directory before looking in the system header file directories. The include statement #include <FILE.h>
searches the system header files, but does not look in the current directory by default.
To compile these source files with gcc, use the following command:
$ gcc -Wall main.c hello_fn.c -o newhello
In this case, we use the ‘-o’ option to specify a different output file for
the executable, ‘newhello’. Note that the header file ‘hello.h’ is not
specified in the list of files on the command line. The directive #include
"hello.h" in the source files instructs the compiler to include it automatically at the appropriate points.
To run the program, type the path name of the executable:
$ ./newhello
Hello, world!
All the parts of the program have been combined into a single executable
file, which produces the same result as the executable created from the
single source file used earlier.

2.4 Compiling files independently
If a program is stored in a single file then any change to an individual
function requires the whole program to be recompiled to produce a new
executable. The recompilation of large source files can be very time-consuming.
When programs are stored in independent source files, only the files
which have changed need to be recompiled after the source code has been
modified. In this approach, the source files are compiled separately and
then linked together—a two stage process. In the first stage, a file is
compiled without creating an executable. The result is referred to as an
object file, and has the extension ‘.o’ when using GCC.
In the second stage, the object files are merged together by a separate
program called the linker. The linker combines all the object files together
to create a single executable.
An object file contains machine code where any references to the memory addresses of functions (or variables) in other files are left undefined.
This allows source files to be compiled without direct reference to each
other. The linker fills in these missing addresses when it produces the
executable.

2.4.1 Creating object files from source files
The command-line option ‘-c’ is used to compile a source file to an object
file. For example, the following command will compile the source file
‘main.c’ to an object file:
$ gcc -Wall -c main.c
This produces an object file ‘main.o’ containing the machine code for the
main function. It contains a reference to the external function hello, but
the corresponding memory address is left undefined in the object file at
this stage (it will be filled in later by linking).
The corresponding command for compiling the hello function in the
source file ‘hello_fn.c’ is:
$ gcc -Wall -c hello_fn.c
This produces the object file ‘hello_fn.o’.
Note that there is no need to use the option ‘-o’ to specify the name
of the output file in this case. When compiling with ‘-c’ the compiler
automatically creates an object file whose name is the same as the source
file, with ‘.o’ instead of the original extension.
There is no need to put the header file ‘hello.h’ on the command line,
since it is automatically included by the #include statements in ‘main.c’
and ‘hello_fn.c’.

2.4.2 Creating executables from object files
The final step in creating an executable file is to use gcc to link the object
files together and fill in the missing addresses of external functions. To
link object files together, they are simply listed on the command line:
$ gcc main.o hello_fn.o -o hello
This is one of the few occasions where there is no need to use the ‘-Wall’
warning option, since the individual source files have already been successfully compiled to object code. Once the source files have been compiled,
linking is an unambiguous process which either succeeds or fails (it fails
only if there are references which cannot be resolved).
To perform the linking step gcc uses the linker ld, which is a separate
program. On GNU systems the GNU linker, GNU ld, is used. Other
systems may use the GNU linker with GCC, or may have their own linkers.
The linker itself will be discussed later (see Chapter 11 [How the compiler
works], page 81). By running the linker, gcc creates an executable file
from the object files.
The resulting executable file can now be run:
$ ./hello
Hello, world!
It produces the same output as the version of the program using a single
source file in the previous section.

2.4.3 Link order of object files
On Unix-like systems, the traditional behavior of compilers and linkers
is to search for external functions from left to right in the object files
specified on the command line. This means that the object file which
contains the definition of a function should appear after any files which
call that function.
In this case, the file ‘hello_fn.o’ containing the function hello should
be specified after ‘main.o’ itself, since main calls hello:
$ gcc main.o hello_fn.o -o hello    (correct order)
With some compilers or linkers the opposite ordering would result in an
error,
$ cc hello_fn.o main.o -o hello    (incorrect order)
main.o: In function ‘main’:
main.o(.text+0xf): undefined reference to ‘hello’
because there is no object file containing hello after ‘main.o’.
Most current compilers and linkers will search all object files, regardless of order, but since not all compilers do this it is best to follow the
convention of listing each object file before the files that define the functions it calls.
This is worth keeping in mind if you ever encounter unexpected problems with undefined references, and all the necessary object files appear
to be present on the command line.

2.5 Recompiling and relinking
To show how source files can be compiled independently we will edit the
main program ‘main.c’ and modify it to print a greeting to everyone
instead of world:
#include "hello.h"

int
main (void)
{
  hello ("everyone"); /* changed from "world" */
  return 0;
}
The updated file ‘main.c’ can now be recompiled with the following command:
$ gcc -Wall -c main.c
This produces a new object file ‘main.o’. There is no need to create a
new object file for ‘hello_fn.c’, since that file and the related files that
it depends on, such as header files, have not changed.
The new object file can be relinked with the hello function to create
a new executable file:
$ gcc main.o hello_fn.o -o hello
The resulting executable ‘hello’ now uses the new main function to produce the following output:
$ ./hello
Hello, everyone!
Note that only the file ‘main.c’ has been recompiled, and then relinked
with the existing object file for the hello function. If the file ‘hello_fn.c’
had been modified instead, we could have recompiled ‘hello_fn.c’ to
create a new object file ‘hello_fn.o’ and relinked this with the existing
file ‘main.o’.(1)
In general, linking is faster than compilation: in a large project with
many source files, recompiling only those that have been modified can
make a significant saving. The process of recompiling only the modified
files in a project can be automated using GNU Make (see [Further reading], page 91).

(1) If the prototype of a function has changed, it is necessary to modify and
recompile all of the other source files which use it.
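
The decision that a tool such as GNU Make automates can be sketched in C as a comparison of file modification times. The following is a minimal illustration only, assuming a POSIX system that provides stat. The function name needs_rebuild is hypothetical, and this is not how GNU Make itself is implemented (Make also takes header file dependencies into account, for example).

```c
#include <sys/stat.h>

/* Return 1 if the object file OBJ is missing or older than the source
   file SRC (so SRC should be recompiled), 0 if OBJ is up to date, and
   -1 if SRC itself cannot be found.  Hypothetical helper for
   illustration; assumes a POSIX system. */
int
needs_rebuild (const char *src, const char *obj)
{
  struct stat s, o;

  if (stat (src, &s) != 0)
    return -1;                  /* no source file */
  if (stat (obj, &o) != 0)
    return 1;                   /* object file does not exist yet */
  return s.st_mtime > o.st_mtime;
}
```

For example, after ‘main.c’ has been edited as above, a call such as needs_rebuild ("main.c", "main.o") would be expected to return 1 until ‘main.c’ is recompiled.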

2.6 Linking with external libraries
A library is a collection of precompiled object files which can be linked
into programs. The most common use of libraries is to provide system
functions, such as the square root function sqrt found in the C math
library.
Libraries are typically stored in special archive files with the extension
‘.a’, referred to as static libraries. They are created from object files with
a separate tool, the GNU archiver ar, and used by the linker to resolve
references to functions at link time. We will see later how to create
libraries using the ar command (see Chapter 10 [Compiler-related tools],
page 73). For simplicity, only static libraries are covered in this section—
dynamic linking at runtime using shared libraries will be described in the
next chapter.
The standard system libraries are usually found in the directories
‘/usr/lib’ and ‘/lib’.(2) For example, the C math library is typically
stored in the file ‘/usr/lib/libm.a’ on Unix-like systems. The corresponding prototype declarations for the functions in this library are given
in the header file ‘/usr/include/math.h’. The C standard library itself
is stored in ‘/usr/lib/libc.a’ and contains functions specified in the
ANSI/ISO C standard, such as ‘printf’—this library is linked by default
for every C program.

(2) On systems supporting both 64 and 32-bit executables the 64-bit versions
of the libraries will often be stored in ‘/usr/lib64’ and ‘/lib64’, with the
32-bit versions in ‘/usr/lib’ and ‘/lib’.

Here is an example program which makes a call to the external function
sqrt in the math library ‘libm.a’:
#include <math.h>
#include <stdio.h>
int
main (void)
{
  double x = sqrt (2.0);
  printf ("The square root of 2.0 is %f\n", x);
  return 0;
}
Trying to create an executable from this source file alone causes the compiler to give an error at the link stage:
$ gcc -Wall calc.c -o calc
/tmp/ccbR6Ojm.o: In function ‘main’:
/tmp/ccbR6Ojm.o(.text+0x19): undefined reference
to ‘sqrt’
The problem is that the reference to the sqrt function cannot be resolved
without the external math library ‘libm.a’. The function sqrt is not defined in the program or the default library ‘libc.a’, and the compiler does
not link to the file ‘libm.a’ unless it is explicitly selected. Incidentally,
the file mentioned in the error message ‘/tmp/ccbR6Ojm.o’ is a temporary
object file created by the compiler from ‘calc.c’, in order to carry out
the linking process.
To enable the compiler to link the sqrt function to the main program ‘calc.c’ we need to supply the library ‘libm.a’. One obvious but
cumbersome way to do this is to specify it explicitly on the command line:
$ gcc -Wall calc.c /usr/lib/libm.a -o calc
The library ‘libm.a’ contains object files for all the mathematical functions, such as sin, cos, exp, log and sqrt. The linker searches through
these to find the object file containing the sqrt function.
Once the object file for the sqrt function has been found, the main
program can be linked and a complete executable produced:
$ ./calc
The square root of 2.0 is 1.414214
The executable file includes the machine code for the main function and
the machine code for the sqrt function, copied from the corresponding
object file in the library ‘libm.a’.
To avoid the need to specify long paths on the command line, the
compiler provides a short-cut option ‘-l’ for linking against libraries. For
example, the following command,
$ gcc -Wall calc.c -lm -o calc
is equivalent to the original command above using the full library name
‘/usr/lib/libm.a’.
In general, the compiler option ‘-lNAME’ will attempt to link object
files with a library file ‘libNAME.a’ in the standard library directories.
Additional directories can be specified with command-line options and environment variables, to be discussed shortly. A large program will typically
use many ‘-l’ options to link libraries such as the math library, graphics
libraries and networking libraries.

2.6.1 Link order of libraries
The ordering of libraries on the command line follows the same convention as for object files: they are searched from left to right, so a library
containing the definition of a function should appear after any source
files or object files which use it. This includes libraries specified with the
short-cut ‘-l’ option, as shown in the following command:
$ gcc -Wall calc.c -lm -o calc
(correct order)
With some compilers the opposite ordering (placing the ‘-lm’ option before the file which uses it) would result in an error,
$ cc -Wall -lm calc.c -o calc
(incorrect order)
main.o: In function ‘main’:
main.o(.text+0xf): undefined reference to ‘sqrt’
because there is no library or object file containing sqrt after ‘calc.c’.
The option ‘-lm’ should appear after the file ‘calc.c’.
When several libraries are being used, the same convention should
be followed for the libraries themselves. A library which calls an external function defined in another library should appear before the library
containing the function.
For example, a program ‘data.c’ using the GNU Linear Programming
library ‘libglpk.a’, which in turn uses the math library ‘libm.a’, should
be compiled as,
$ gcc -Wall data.c -lglpk -lm
since the object files in ‘libglpk.a’ use functions defined in ‘libm.a’.
As for object files, most current compilers will search all libraries,
regardless of order. However, since not all compilers do this it is best to
follow the convention of ordering libraries from left to right.

2.7 Using library header files
When using a library it is essential to include the appropriate header
files, in order to declare the function arguments and return values with
the correct types. Without declarations, the arguments of a function can
be passed with the wrong type, causing corrupted results.
The following example shows another program which makes a function
call to the C math library. In this case, the function pow is used to compute
the cube of two (2 raised to the power of 3):
#include <stdio.h>
int
main (void)
{
  double x = pow (2.0, 3.0);
  printf ("Two cubed is %f\n", x);
  return 0;
}
However, the program contains an error—the #include statement for
‘math.h’ is missing, so the prototype double pow (double x, double y)
given there will not be seen by the compiler.
Compiling the program without any warning options will produce an
executable file which gives incorrect results:
$ gcc badpow.c -lm
$ ./a.out
Two cubed is 2.851120
(incorrect result, should be 8)
The results are corrupted because the arguments and return value of the
call to pow are passed with incorrect types.(3) This can be detected by
turning on the warning option ‘-Wall’:
$ gcc -Wall badpow.c -lm
badpow.c: In function ‘main’:
badpow.c:6: warning: implicit declaration of
function ‘pow’
This example shows again the importance of using the warning option
‘-Wall’ to detect serious problems that could otherwise easily be overlooked.

(3) The actual output shown above may differ, depending on the specific platform and environment.
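
The same rule applies to every library, including the C standard library itself. As an illustration that is not part of the original example, the sketch below calls strtod, which returns a double and is declared in ‘stdlib.h’. The helper name parse_double is hypothetical. With the header included the compiler knows the correct argument and return types. If the #include line were removed, the call would be implicitly declared, as with pow above.

```c
#include <stdlib.h>   /* declares: double strtod (const char *, char **) */

/* Convert the text in S to a double.  Because the prototype for
   strtod is visible, its return value is handled as a double. */
double
parse_double (const char *s)
{
  return strtod (s, NULL);
}
```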

3 Compilation options
This chapter describes other commonly-used compiler options available
in GCC. These options control features such as the search paths used
for locating libraries and include files, the use of additional warnings and
diagnostics, preprocessor macros and C language dialects.

3.1 Setting search paths
In the last chapter, we saw how to link a program with functions in the
C math library ‘libm.a’, using the short-cut option ‘-lm’ and the header
file ‘math.h’.
A common problem when compiling a program using library header
files is the error:
FILE.h : No such file or directory
This occurs if a header file is not present in the standard include file
directories used by gcc. A similar problem can occur for libraries:
/usr/bin/ld: cannot find library
This happens if a library used for linking is not present in the standard
library directories used by gcc.
By default, gcc searches the following directories for header files:
/usr/local/include/
/usr/include/
and the following directories for libraries:
/usr/local/lib/
/usr/lib/
The list of directories for header files is often referred to as the include
path, and the list of directories for libraries as the library search path or
link path.
The directories on these paths are searched in order, from first to
last in the two lists above.(1) For example, a header file found in
‘/usr/local/include’ takes precedence over a file with the same name
in ‘/usr/include’. Similarly, a library found in ‘/usr/local/lib’ takes
precedence over a library with the same name in ‘/usr/lib’.
(1) The default search paths may also include additional system-dependent
or site-specific directories, and directories in the GCC installation itself.
For example, on 64-bit platforms additional ‘lib64’ directories may also be
searched by default.

When additional libraries are installed in other directories it is necessary to extend the search paths, in order for the libraries to be found.
The compiler options ‘-I’ and ‘-L’ add new directories to the beginning
of the include path and library search path respectively.

3.1.1 Search path example
The following example program uses a library that might be installed
as an additional package on a system—the GNU Database Management
Library (GDBM). The GDBM Library stores key-value pairs in a DBM
file, a type of data file which allows values to be stored and indexed by a
key (an arbitrary sequence of characters). Here is the example program
‘dbmain.c’, which creates a DBM file containing a key ‘testkey’ with the
value ‘testvalue’:
#include <stdio.h>
#include <gdbm.h>
int
main (void)
{
  GDBM_FILE dbf;
  datum key = { "testkey", 7 };     /* key, length */
  datum value = { "testvalue", 9 }; /* value, length */
  printf ("Storing key-value pair... ");
  dbf = gdbm_open ("test", 0, GDBM_NEWDB, 0644, 0);
  gdbm_store (dbf, key, value, GDBM_INSERT);
  gdbm_close (dbf);
  printf ("done.\n");
  return 0;
}
The program uses the header file ‘gdbm.h’ and the library ‘libgdbm.a’. If
the library has been installed in the default location of ‘/usr/local/lib’,
with the header file in ‘/usr/local/include’, then the program can be
compiled with the following simple command:
$ gcc -Wall dbmain.c -lgdbm
Both these directories are part of the default gcc include and link paths.
However, if GDBM has been installed in a different location, trying to
compile the program will give the following error:
$ gcc -Wall dbmain.c -lgdbm
dbmain.c:1: gdbm.h: No such file or directory

For example, if version 1.8.3 of the GDBM package is installed under the
directory ‘/opt/gdbm-1.8.3’ the location of the header file would be,
/opt/gdbm-1.8.3/include/gdbm.h
which is not part of the default gcc include path. Adding the appropriate
directory to the include path with the command-line option ‘-I’ allows
the program to be compiled, but not linked:
$ gcc -Wall -I/opt/gdbm-1.8.3/include dbmain.c -lgdbm
/usr/bin/ld: cannot find -lgdbm
collect2: ld returned 1 exit status
The directory containing the library is still missing from the link path. It
can be added to the link path using the following option:
-L/opt/gdbm-1.8.3/lib/
The following command line allows the program to be compiled and linked:
$ gcc -Wall -I/opt/gdbm-1.8.3/include \
    -L/opt/gdbm-1.8.3/lib dbmain.c -lgdbm
This produces the final executable linked to the GDBM library. Before
seeing how to run this executable we will take a brief look at the environment variables that affect the ‘-I’ and ‘-L’ options.
Note that you should never place the absolute paths of header files in
#include statements in your source code, as this will prevent the program
from compiling on other systems. The ‘-I’ option or the C_INCLUDE_PATH
variable described below should always be used to set the include path for
header files.

3.1.2 Environment variables
The search paths for header files and libraries can also be controlled
through environment variables in the shell. These may be set automatically for each session using the appropriate login file, such as
‘.bash_profile’.
Additional directories can be added to the include path using the environment variable C_INCLUDE_PATH (for C header files) or CPLUS_INCLUDE_PATH
(for C++ header files). For example, the following commands will
add ‘/opt/gdbm-1.8.3/include’ to the include path when compiling C
programs:
$ C_INCLUDE_PATH=/opt/gdbm-1.8.3/include
$ export C_INCLUDE_PATH
This directory will be searched after any directories specified on the command line with the option ‘-I’, and before the standard default directories
‘/usr/local/include’ and ‘/usr/include’. The shell command export
is needed to make the environment variable available to programs outside the shell itself, such as the compiler. It is only needed once for each
variable in each shell session, and can also be set in the appropriate login
file.
Similarly, additional directories can be added to the link path using
the environment variable LIBRARY_PATH. For example, the following commands will add ‘/opt/gdbm-1.8.3/lib’ to the link path:
$ LIBRARY_PATH=/opt/gdbm-1.8.3/lib
$ export LIBRARY_PATH
This directory will be searched after any directories specified on the command line with the option ‘-L’, and before the standard default directories
‘/usr/local/lib’ and ‘/usr/lib’.
With the environment variable settings given above the program
‘dbmain.c’ can be compiled without the ‘-I’ and ‘-L’ options,
$ gcc -Wall dbmain.c -lgdbm
because the default paths now use the directories specified in the environment variables C_INCLUDE_PATH and LIBRARY_PATH.

3.1.3 Extended search paths
Following the standard Unix convention for search paths, several directories can be specified together in an environment variable as a colon
separated list:
DIR1:DIR2:DIR3:...
The directories are then searched in order from left to right. A single dot
‘.’ can be used to specify the current directory.(2)
For example, the following settings create default include and link
paths for packages installed in the current directory ‘.’ and the ‘include’
and ‘lib’ directories under ‘/opt/gdbm-1.8.3’ and ‘/net’ respectively:
$ C_INCLUDE_PATH=.:/opt/gdbm-1.8.3/include:/net/include
$ LIBRARY_PATH=.:/opt/gdbm-1.8.3/lib:/net/lib
To specify multiple search path directories on the command line, the options ‘-I’ and ‘-L’ can be repeated. For example, the following command,
$ gcc -I. -I/opt/gdbm-1.8.3/include -I/net/include \
    -L. -L/opt/gdbm-1.8.3/lib -L/net/lib .....
is equivalent to the environment variable settings given above.
When environment variables and command-line options are used together the compiler searches the directories in the following order:
1. command-line options ‘-I’ and ‘-L’, from left to right
2. directories specified by environment variables, such as C_INCLUDE_PATH
and LIBRARY_PATH
3. default system directories
In day-to-day usage, directories are usually added to the search paths with
the options ‘-I’ and ‘-L’.

(2) The current directory can also be specified using an empty path element.
For example, :DIR1:DIR2 is equivalent to .:DIR1:DIR2.
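
The left-to-right search of a colon separated list can be sketched in C. This is only an illustration of the rule, not the implementation used by GCC. The function name find_in_path is hypothetical, and a POSIX system is assumed for the access function. An empty path element is treated as the current directory.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Search the colon separated directory list LIST from left to right
   for a file called NAME.  On success the full path is written to OUT
   (of size OUTLEN) and 1 is returned; otherwise 0 is returned.  An
   empty element stands for the current directory.  Hypothetical
   helper for illustration only. */
int
find_in_path (const char *list, const char *name, char *out, size_t outlen)
{
  const char *p = list;

  for (;;)
    {
      size_t len = strcspn (p, ":");    /* length of this element */
      int n;

      if (len == 0)
        n = snprintf (out, outlen, "./%s", name);
      else
        n = snprintf (out, outlen, "%.*s/%s", (int) len, p, name);

      if (n > 0 && (size_t) n < outlen && access (out, F_OK) == 0)
        return 1;                       /* first match wins */

      if (p[len] == '\0')
        break;                          /* end of the list */
      p += len + 1;                     /* skip over the colon */
    }
  return 0;
}
```

For example, assuming the variable C_INCLUDE_PATH is set as shown earlier, a call such as find_in_path (getenv ("C_INCLUDE_PATH"), "gdbm.h", buf, sizeof buf) would report the first directory in the list that contains ‘gdbm.h’ (getenv is declared in ‘stdlib.h’).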

3.2 Shared libraries and static libraries
Although the example program above has been successfully compiled and
linked, a final step is needed before being able to load and run the executable file.
If an attempt is made to start the executable directly, the following
error will occur on most systems:
$ ./a.out
./a.out: error while loading shared libraries:
libgdbm.so.3: cannot open shared object file:
No such file or directory
This is because the GDBM package provides a shared library. This type
of library requires special treatment—it must be loaded from disk before
the executable will run.
External libraries are usually provided in two forms: static libraries
and shared libraries. Static libraries are the ‘.a’ files seen earlier. When
a program is linked against a static library, the machine code from the
object files for any external functions used by the program is copied from
the library into the final executable.
Shared libraries are handled with a more advanced form of linking,
which makes the executable file smaller. They use the extension ‘.so’,
which stands for shared object.
An executable file linked against a shared library contains only a small
table of the functions it requires, instead of the complete machine code
from the object files for the external functions. Before the executable file
starts running, the machine code for the external functions is copied into
memory from the shared library file on disk by the operating system—a
process referred to as dynamic linking.
Dynamic linking makes executable files smaller and saves disk space,
because one copy of a library can be shared between multiple programs.
Most operating systems also provide a virtual memory mechanism which
allows one copy of a shared library in physical memory to be used by all
running programs, saving memory as well as disk space.

Furthermore, shared libraries make it possible to update a library without recompiling the programs which use it (provided the interface to the
library does not change).
Because of these advantages gcc compiles programs to use shared
libraries by default on most systems, if they are available. Whenever
a static library ‘libNAME.a’ would be used for linking with the option
‘-lNAME’ the compiler first checks for an alternative shared library with
the same name and a ‘.so’ extension.
In this case, when the compiler searches for the ‘libgdbm’ library
in the link path, it finds the following two files in the directory
‘/opt/gdbm-1.8.3/lib’:
$ cd /opt/gdbm-1.8.3/lib
$ ls libgdbm.*
libgdbm.a libgdbm.so
Consequently, the ‘libgdbm.so’ shared object file is used in preference to
the ‘libgdbm.a’ static library.
However, when the executable file is started its loader function must
find the shared library in order to load it into memory. By default the
loader searches for shared libraries only in a predefined set of system
directories, such as ‘/usr/local/lib’ and ‘/usr/lib’. If the library is
not located in one of these directories it must be added to the load path.(3)
The simplest way to set the load path is through the environment
variable LD_LIBRARY_PATH. For example, the following commands set the
load path to ‘/opt/gdbm-1.8.3/lib’ so that ‘libgdbm.so’ can be found:
$ LD_LIBRARY_PATH=/opt/gdbm-1.8.3/lib
$ export LD_LIBRARY_PATH
$ ./a.out
Storing key-value pair... done.
The executable now runs successfully, prints its message and creates
a DBM file called ‘test’ containing the key-value pair ‘testkey’ and
‘testvalue’.
To save typing, the LD_LIBRARY_PATH environment variable can be set
once for each session in the appropriate login file, such as ‘.bash_profile’
for the GNU Bash shell.

(3) Note that the directory containing the shared library can, in principle,
be stored (“hard-coded”) in the executable itself using the linker option
‘-rpath’, but this is not usually done since it creates problems if the library
is moved or the executable is copied to another system.

Several shared library directories can be placed in the load path, as
a colon separated list DIR1:DIR2:DIR3:...:DIRN. For example, the following
command sets the load path to use the ‘lib’ directories under
‘/opt/gdbm-1.8.3’ and ‘/opt/gtk-1.4’:
$ LD_LIBRARY_PATH=/opt/gdbm-1.8.3/lib:/opt/gtk-1.4/lib
$ export LD_LIBRARY_PATH
If the load path contains existing entries, it can be extended using the syntax LD_LIBRARY_PATH=NEWDIRS:$LD_LIBRARY_PATH. For example, the
following command adds the directory ‘/opt/gsl-1.5/lib’ to the load
path shown above:
$ LD_LIBRARY_PATH=/opt/gsl-1.5/lib:$LD_LIBRARY_PATH
$ echo $LD_LIBRARY_PATH
/opt/gsl-1.5/lib:/opt/gdbm-1.8.3/lib:/opt/gtk-1.4/lib
It is possible for the system administrator to set the LD_LIBRARY_PATH
variable for all users, by adding it to a default login script, such as
‘/etc/profile’. On GNU systems, a system-wide path can also be defined in the loader configuration file ‘/etc/ld.so.conf’.
Alternatively, static linking can be forced with the ‘-static’ option
to gcc to avoid the use of shared libraries:
$ gcc -Wall -static -I/opt/gdbm-1.8.3/include/ \
    -L/opt/gdbm-1.8.3/lib/ dbmain.c -lgdbm
This creates an executable linked with the static library ‘libgdbm.a’
which can be run without setting the environment variable LD_LIBRARY_PATH
or putting shared libraries in the default directories:
$ ./a.out
Storing key-value pair... done.
As noted earlier, it is also possible to link directly with individual library
files by specifying the full path to the library on the command line. For
example, the following command will link directly with the static library
‘libgdbm.a’,
$ gcc -Wall -I/opt/gdbm-1.8.3/include \
    dbmain.c /opt/gdbm-1.8.3/lib/libgdbm.a
and the command below will link with the shared library file ‘libgdbm.so’:
$ gcc -Wall -I/opt/gdbm-1.8.3/include \
    dbmain.c /opt/gdbm-1.8.3/lib/libgdbm.so
In the latter case it is still necessary to set the library load path when
running the executable.

3.3 C language standards
By default, gcc compiles programs using the GNU dialect of the C
language, referred to as GNU C. This dialect incorporates the official
ANSI/ISO standard for the C language with several useful GNU extensions, such as nested functions and variable-size arrays. Most ANSI/ISO
programs will compile under GNU C without changes.
There are several options which control the dialect of C used by gcc.
The most commonly-used options are ‘-ansi’ and ‘-pedantic’. The specific dialects of the C language for each standard can also be selected with
the ‘-std’ option.

3.3.1 ANSI/ISO
Occasionally a valid ANSI/ISO program may be incompatible with the
extensions in GNU C. To deal with this situation, the compiler option
‘-ansi’ disables those GNU extensions which conflict with the ANSI/ISO
standard. On systems using the GNU C Library (glibc) it also disables
extensions to the C standard library. This allows programs written for
ANSI/ISO C to be compiled without any unwanted effects from GNU
extensions.
For example, here is a valid ANSI/ISO C program which uses a variable
called asm:
#include <stdio.h>
int
main (void)
{
  const char asm[] = "6502";
  printf ("the string asm is '%s'\n", asm);
  return 0;
}
The variable name asm is valid under the ANSI/ISO standard, but this
program will not compile in GNU C because asm is a GNU C keyword
extension (it allows native assembly instructions to be used in C functions). Consequently, it cannot be used as a variable name without giving
a compilation error:
$ gcc -Wall ansi.c
ansi.c: In function ‘main’:
ansi.c:6: parse error before ‘asm’
ansi.c:7: parse error before ‘asm’
In contrast, using the ‘-ansi’ option disables the asm keyword extension,
and allows the program above to be compiled correctly:
$ gcc -Wall -ansi ansi.c
$ ./a.out
the string asm is '6502'

For reference, the non-standard keywords and macros defined by the GNU
C extensions are asm, inline, typeof, unix and vax. More details can be
found in the GCC Reference Manual “Using GCC” (see [Further reading],
page 91).
The next example shows the effect of the ‘-ansi’ option on systems
using the GNU C Library, such as GNU/Linux systems. The program below prints the value of pi, π = 3.14159..., from the preprocessor definition
M_PI in the header file ‘math.h’:
#include <math.h>
#include <stdio.h>
int
main (void)
{
  printf ("the value of pi is %f\n", M_PI);
  return 0;
}
The constant M_PI is not part of the ANSI/ISO C standard library (it
comes from the BSD version of Unix). In this case, the program will not
compile with the ‘-ansi’ option:
$ gcc -Wall -ansi pi.c
pi.c: In function ‘main’:
pi.c:7: ‘M_PI’ undeclared (first use in this function)
pi.c:7: (Each undeclared identifier is reported only once
pi.c:7: for each function it appears in.)
The program can be compiled without the ‘-ansi’ option. In this case
both the language and library extensions are enabled by default:
$ gcc -Wall pi.c
$ ./a.out
the value of pi is 3.141593
It is also possible to compile the program using ANSI/ISO C, by enabling
only the extensions in the GNU C Library itself. This can be achieved by
defining special macros, such as _GNU_SOURCE, which enable extensions in
the GNU C Library:(4)
$ gcc -Wall -ansi -D_GNU_SOURCE pi.c
$ ./a.out
the value of pi is 3.141593

(4) The ‘-D’ option for defining macros will be explained in detail in the next
chapter.

The GNU C Library provides a number of these macros (referred to as
feature test macros) which allow control over the support for POSIX extensions
(_POSIX_C_SOURCE), BSD extensions (_BSD_SOURCE), SVID extensions (_SVID_SOURCE),
XOPEN extensions (_XOPEN_SOURCE) and GNU extensions (_GNU_SOURCE).
The _GNU_SOURCE macro enables all the extensions together, with the
POSIX extensions taking precedence over the others in cases where they
conflict. Further information about feature test macros can be found in
the GNU C Library Reference Manual, see [Further reading], page 91.
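
Another approach, which is a common idiom rather than something described in this section, is for the program to supply its own definition when the library does not provide one. The sketch below assumes only that M_PI, when present, is a macro defined in ‘math.h’. The function name circle_area is hypothetical. The code is intended to compile both with and without the ‘-ansi’ option.

```c
#include <math.h>

/* M_PI is not required by the ANSI/ISO standard, so provide a
   fallback definition when the library headers omit it. */
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Area of a circle of radius R. */
double
circle_area (double r)
{
  return M_PI * r * r;
}
```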

3.3.2 Strict ANSI/ISO
The command-line option ‘-pedantic’ in combination with ‘-ansi’ will
cause gcc to reject all GNU C extensions, not just those that are incompatible with the ANSI/ISO standard. This helps you to write portable
programs which follow the ANSI/ISO standard.
Here is a program which uses variable-size arrays, a GNU C extension.
The array x[n] is declared with a length specified by the integer variable
n.
int
main (int argc, char *argv[])
{
  int i, n = argc;
  double x[n];
  for (i = 0; i < n; i++)
    x[i] = i;
  return 0;
}
This program will compile with ‘-ansi’, because support for variable
length arrays does not interfere with the compilation of valid ANSI/ISO
programs—it is a backwards-compatible extension:
$ gcc -Wall -ansi gnuarray.c
However, compiling with ‘-ansi -pedantic’ reports warnings about violations of the ANSI/ISO standard:
$ gcc -Wall -ansi -pedantic gnuarray.c
gnuarray.c: In function ‘main’:
gnuarray.c:5: warning: ISO C90 forbids variable-size
array ‘x’
Note that an absence of warnings from ‘-ansi -pedantic’ does not guarantee that a program strictly conforms to the ANSI/ISO standard. The
standard itself specifies only a limited set of circumstances that should
generate diagnostics, and these are what ‘-ansi -pedantic’ reports.
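
One way to remove the warning is to allocate the array at run time with malloc, which is part of the ANSI/ISO C library. The following sketch is an assumption about how the example might be rewritten, not code from the original program. The function name make_sequence is hypothetical.

```c
#include <stdlib.h>

/* Return a newly allocated array x[0..n-1] with x[i] = i, or NULL if
   N is zero or the allocation fails.  The caller must free() it. */
double *
make_sequence (size_t n)
{
  double *x;
  size_t i;

  if (n == 0)
    return NULL;
  x = malloc (n * sizeof (double));
  if (x == NULL)
    return NULL;
  for (i = 0; i < n; i++)
    x[i] = (double) i;
  return x;
}
```

Since the length is no longer part of the array declaration, this version uses only features of the original ANSI/ISO standard.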

3.3.3 Selecting specific standards
The specific language standard used by GCC can be controlled with the
‘-std’ option. The following C language standards are supported:
‘-std=c89’ or ‘-std=iso9899:1990’
The original ANSI/ISO C language standard (ANSI X3.159-1989,
ISO/IEC 9899:1990). GCC incorporates the corrections in the two
ISO Technical Corrigenda to the original standard.
‘-std=iso9899:199409’
The ISO C language standard with ISO Amendment 1, published
in 1994. This amendment was mainly concerned with internationalization, such as adding support for multibyte characters to the C
library.
‘-std=c99’ or ‘-std=iso9899:1999’
The revised ISO C language standard, published in 1999 (ISO/IEC
9899:1999).
The C language standards with GNU extensions can be selected with the
options ‘-std=gnu89’ and ‘-std=gnu99’.
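
A program can check which standard is in effect through the predefined macro __STDC_VERSION__, which is not defined under the original 1990 standard, has the value 199409L under Amendment 1, and has the value 199901L under the 1999 standard. The sketch below is illustrative only. The function name c_standard_name is hypothetical, and any later standard (not covered by the options listed here) is simply reported as newer.

```c
/* Return a short description of the C standard selected at compile
   time, based on the predefined macro __STDC_VERSION__. */
const char *
c_standard_name (void)
{
#if !defined (__STDC_VERSION__)
  return "C89/C90";
#elif __STDC_VERSION__ < 199901L
  return "C90 with Amendment 1";
#elif __STDC_VERSION__ == 199901L
  return "C99";
#else
  return "later than C99";
#endif
}
```

Compiling with ‘-std=c89’ selects the first branch, and compiling with ‘-std=c99’ selects the third.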

3.4 Warning options in -Wall
As described earlier (see Section 2.1 [Compiling a simple C program],
page 7), the warning option ‘-Wall’ enables warnings for many common
errors, and should always be used. It combines a large number of other,
more specific, warning options which can also be selected individually.
Here is a summary of these options:
‘-Wcomment’ (included in ‘-Wall’)
This option warns about nested comments. Nested comments typically arise when a section of code containing comments is later
commented out:
/* commented out
double x = 1.23 ; /* x-position */
*/
Nested comments can be a source of confusion—the safe way to
“comment out” a section of code containing comments is to surround it with the preprocessor directive #if 0 ... #endif:
/* commented out */
#if 0
double x = 1.23 ; /* x-position */
#endif

‘-Wformat’ (included in ‘-Wall’)
This option warns about the incorrect use of format strings in functions such as printf and scanf, where the format specifier does
not agree with the type of the corresponding function argument.
‘-Wunused’ (included in ‘-Wall’)
This option warns about unused variables. When a variable is declared but not used this can be the result of another variable being
accidentally substituted in its place. If the variable is genuinely not
needed it can be removed from the source code.
‘-Wimplicit’ (included in ‘-Wall’)
This option warns about any functions that are used without being declared. The most common reason for a function to be used
without being declared is forgetting to include a header file.
‘-Wreturn-type’ (included in ‘-Wall’)
This option warns about functions that are defined without a return type but not declared void. It also catches empty return
statements in functions that are not declared void.
For example, the following program does not use an explicit return
value:
#include <stdio.h>
int
main (void)
{
  printf ("hello world\n");
  return;
}
The lack of a return value in the code above could be the result
of an accidental omission by the programmer. The value returned
by the main function is then not well defined; on some platforms it
happens to be the return value of the printf function (the number
of characters printed). To avoid ambiguity,
it is preferable to use an explicit value in the return statement,
either as a variable or a constant, such as return 0.
The complete set of warning options included in ‘-Wall’ can be found
in the GCC Reference Manual “Using GCC” (see [Further reading],
page 91). The options included in ‘-Wall’ have the common characteristic that they report constructions which are always wrong, or can easily
be rewritten in an unambiguously correct way. This is why they are so
useful—any warning produced by ‘-Wall’ can be taken as an indication
of a potentially serious problem.
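
As an illustration of the kind of mistake that ‘-Wformat’ reports, the sketch below formats a value of type long. The function name format_count is hypothetical. The format specifier %ld agrees with the type of the argument; replacing it with %d or %s is the sort of mismatch the option is designed to catch.

```c
#include <stdio.h>

/* Write "count = N" into BUF (of size LEN) and return the number of
   characters that the full text requires. */
int
format_count (char *buf, size_t len, long count)
{
  /* Correct: %ld matches a long argument.  Writing "%d" or "%s" here
     would still compile, but is an error that -Wformat warns about. */
  return snprintf (buf, len, "count = %ld", count);
}
```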

3.5 Additional warning options
GCC provides many other warning options that are not included in
‘-Wall’, but are often useful. Typically these produce warnings for source
code which may be technically valid but is very likely to cause problems. The criteria for these options are based on experience of common
errors—they are not included in ‘-Wall’ because they only indicate possibly problematic or “suspicious” code.
Since these warnings can be issued for valid code it is not necessary
to compile with them all the time. It is more appropriate to use them
periodically and review the results, checking for anything unexpected, or
to enable them for some programs or files.
‘-W’

This is a general option similar to ‘-Wall’ which warns about a
selection of common programming errors, such as functions which
can return without a value (also known as “falling off the end of
the function body”), and comparisons between signed and unsigned
values. For example, the following function tests whether an unsigned integer is negative (which is impossible, of course):
int
foo (unsigned int x)
{
if (x < 0)
return 0; /* cannot occur */
else
return 1;
}
Compiling this function with ‘-Wall’ does not produce a warning,
$ gcc -Wall -c w.c
but does give a warning with ‘-W’:
$ gcc -W -c w.c
w.c: In function ‘foo’:
w.c:4: warning: comparison of unsigned
expression < 0 is always false
In practice, the options ‘-W’ and ‘-Wall’ are normally used together.
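A test for negative values only makes sense while the value is still signed. The following sketch (the function valid_index is hypothetical and does not appear in the original text) checks the sign first and then compares two unsigned values, which is one way to avoid mixing signed and unsigned operands in a comparison:

```c
/* Return 1 if the signed index i is a valid position in an array
   of n elements, and 0 otherwise. */
int
valid_index (int i, unsigned int n)
{
  if (i < 0)
    return 0;                    /* reject negatives while i is signed */
  return (unsigned int) i < n;   /* both operands are unsigned here */
}
```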

‘-Wconversion’
This option warns about implicit type conversions that could cause
unexpected results. For example, the assignment of a negative
value to an unsigned variable, as in the following code,
unsigned int x = -1;
is technically allowed by the ANSI/ISO C standard (with the negative integer being converted to a positive integer by adding one more than the largest value of the unsigned type) but could be a simple programming error.
If you need to perform such a conversion you can use an explicit
cast, such as ((unsigned int) -1), to avoid any warnings from
this option. The result of the cast is the maximum number that
can be represented by an unsigned integer.
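The effect of the explicit cast can be checked against the constant UINT_MAX from ‘limits.h’. This is only a small sketch, and the function name all_ones is chosen for illustration:

```c
#include <limits.h>

/* Return the value obtained by converting -1 to unsigned int.
   The explicit cast documents that the conversion is intended. */
unsigned int
all_ones (void)
{
  unsigned int x = (unsigned int) -1;
  return x;
}

/* Return 1 if the converted value equals the largest unsigned int. */
int
all_ones_is_max (void)
{
  return all_ones () == UINT_MAX;
}
```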

‘-Wshadow’
This option warns about the redeclaration of a variable name in
a scope where it has already been declared. This is referred to as
variable shadowing, and causes confusion about which occurrence
of the variable corresponds to which value.
The following function declares a local variable y that shadows the
declaration in the body of the function:
double
test (double x)
{
double y = 1.0;
{
double y;
y = x;
}
return y;
}
This is valid ANSI/ISO C, where the return value is 1. The shadowing of the variable y might make it seem (incorrectly) that the
return value is x, when looking at the line y = x (especially in a
large and complicated function).
Shadowing can also occur for function names. For example, the
following function defines a local variable sin which shadows
the standard function sin(x).
double
sin_series (double x)
{
/* series expansion for small x */
double sin = x * (1.0 - x * x / 6.0);
return sin;
}
This error will be detected by the ‘-Wshadow’ option.
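One straightforward way to remove this kind of shadowing is to rename the local variable. The sketch below uses the name result, which is an arbitrary choice for this illustration:

```c
/* Series expansion of sin(x) for small x.  The local variable is
   called 'result' so that it does not shadow the standard
   function sin. */
double
sin_series (double x)
{
  double result = x * (1.0 - x * x / 6.0);
  return result;
}
```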
‘-Wcast-qual’
This option warns about pointers that are cast to remove a type
qualifier, such as const. For example, the following function discards
the const qualifier from its input argument, allowing it to be
overwritten:
void
f (const char * str)
{
char * s = (char *)str;
s[0] = '\0';
}
The modification of the original contents of str is a violation of its
const property. This option will warn about the improper cast of
the variable str which allows the string to be modified.
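When a modified version of a const string is really needed, one alternative is to work on a copy so that no cast is required. The following is a sketch under that assumption; the function masked_copy and its parameters are hypothetical:

```c
#include <string.h>

/* Copy src into the writable buffer dst (capacity n bytes) and
   replace the first character of the copy with '*'.  The string
   pointed to by src is never written to. */
void
masked_copy (const char *src, char *dst, size_t n)
{
  if (n == 0)
    return;
  strncpy (dst, src, n - 1);
  dst[n - 1] = '\0';      /* strncpy does not always terminate */
  if (dst[0] != '\0')
    dst[0] = '*';         /* modify the copy, not the original */
}
```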
‘-Wwrite-strings’
This option implicitly gives all string constants defined in the program a const qualifier, causing a compile-time warning if there is
an attempt to overwrite them. The result of modifying a string
constant is not defined by the ANSI/ISO standard, and the use of
writable string constants is deprecated in GCC.
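A string that has to be modified can be stored in a character array instead of being accessed through a pointer to a string constant. The sketch below assumes a hypothetical helper capitalize that changes its argument in place:

```c
#include <ctype.h>

/* Convert the first character of a writable string to upper case. */
void
capitalize (char *s)
{
  if (s[0] != '\0')
    s[0] = (char) toupper ((unsigned char) s[0]);
}
```

It is intended to be called with an array such as char buf[] = "hello";, which holds its own writable copy of the characters, rather than with the string constant itself.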
‘-Wtraditional’
This option warns about parts of the code which would be interpreted differently by an ANSI/ISO compiler and a “traditional”
pre-ANSI compiler.(5) When maintaining legacy software it may
be necessary to investigate whether the traditional or ANSI/ISO
interpretation was intended in the original code for warnings generated by this option.
The options above produce diagnostic warning messages, but allow the
compilation to continue and produce an object file or executable. For
large programs it can be desirable to catch all the warnings by stopping
the compilation whenever a warning is generated. The ‘-Werror’ option
changes the default behavior by converting warnings into errors, stopping
the compilation whenever a warning occurs.

(5) The traditional form of the C language was described in the original C reference manual “The C Programming Language (First Edition)” by Kernighan and Ritchie.

4 Using the preprocessor
This chapter describes the use of the GNU C preprocessor cpp, which is
part of the GCC package. The preprocessor expands macros in source
files before they are compiled. It is automatically called whenever GCC
processes a C or C++ program.(1)

4.1 Defining macros
The following program demonstrates the most common use of the C preprocessor. It uses the preprocessor conditional #ifdef to check whether
a macro is defined.
When the macro is defined, the preprocessor includes the corresponding code up to the closing #endif command. In this example, the macro
which is tested is called TEST, and the conditional part of the source code
is a printf statement which prints the message “Test mode”:
#include <stdio.h>
int
main (void)
{
#ifdef TEST
printf ("Test mode\n");
#endif
printf ("Running...\n");
return 0;
}
The gcc option ‘-DNAME’ defines a preprocessor macro NAME from the
command line. If the program above is compiled with the command-line option ‘-DTEST’, the macro TEST will be defined and the resulting
executable will print both messages:
$ gcc -Wall -DTEST dtest.c
$ ./a.out
Test mode
Running...
(1) In recent versions of GCC the preprocessor is integrated into the compiler, although a separate cpp command is also provided.

If the same program is compiled without the ‘-D’ option then the “Test
mode” message is omitted from the source code after preprocessing, and
the final executable does not include the code for it:
$ gcc -Wall dtest.c
$ ./a.out
Running...
Macros are generally undefined, unless specified on the command line with
the option ‘-D’, or in a source file (or library header file) with #define.
Some macros are automatically defined by the compiler—these typically
use a reserved namespace beginning with a double-underscore prefix ‘__’.
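The same conditional can select between two alternatives by adding an #else branch, which is standard C although it is not used in the example above. In this sketch the function name build_mode is hypothetical, and TEST is assumed to come only from the command line:

```c
/* Report which variant of the code was compiled.  TEST is not
   defined in this file, so it must be supplied with -DTEST. */
const char *
build_mode (void)
{
#ifdef TEST
  return "test";
#else
  return "normal";
#endif
}
```

Compiled without ‘-DTEST’ the function returns "normal"; with ‘-DTEST’ only the first return statement survives preprocessing.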
The complete set of predefined macros can be listed by running the
GNU preprocessor cpp with the option ‘-dM’ on an empty file:
$ cpp -dM /dev/null
#define __i386__ 1
#define __i386 1
#define i386 1
#define __unix 1
#define __unix__ 1
#define __ELF__ 1
#define unix 1
.......
Note that this list includes a small number of system-specific macros defined by gcc which do not use the double-underscore prefix. These nonstandard macros can be disabled with the ‘-ansi’ option of gcc.
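Predefined macros can be tested in the same way as user-defined ones. The following sketch uses two of the double-underscore names from the listing above; the function name and the returned descriptions are illustrative, and the set of macros that is actually defined varies between systems:

```c
/* Describe the target system using predefined macros.  Which
   branch is compiled depends on the platform. */
const char *
target_description (void)
{
#if defined (__i386__)
  return "32-bit x86";
#elif defined (__unix__)
  return "unix-like system, other processor";
#else
  return "unknown";
#endif
}
```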

4.2 Macros with values
In addition to being defined, a macro can also be given a concrete value.
This value is inserted into the source code at each point where the macro
occurs. The following program uses a macro NUM, to represent a number
which will be printed:
#include <stdio.h>
int
main (void)
{
printf("Value of NUM is %d\n", NUM);
return 0;
}
Note that macros are not expanded inside strings—only the occurrence of
NUM outside the string is substituted by the preprocessor.
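The difference between the two occurrences of NUM can be seen by writing the message into a buffer. This is a sketch only: the function format_num is hypothetical, and the #ifndef fallback value of 100 is an assumption added so that the file compiles even when NUM is not defined elsewhere:

```c
#include <stdio.h>

#ifndef NUM
#define NUM 100   /* fallback value, assumed for this example */
#endif

/* Write the message into buf (capacity size bytes).  Only the
   NUM outside the string literal is replaced by the preprocessor. */
int
format_num (char *buf, size_t size)
{
  return snprintf (buf, size, "Value of NUM is %d", NUM);
}
```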

To define a macro with a value, the ‘-D’ command-line option can be
used in the form ‘-DNAME=VALUE’. For example, the following command
line defines NUM to be 100 when compiling the program above:
$ gcc -Wall -DNUM=100 dtestval.c
$ ./a.out
Value of NUM is 100
This example uses a number, but a macro can take values of any form.
Whatever the value of the macro is, it is inserted directly into the source
code at the point where the macro name occurs. For example, the following definition expands the occurrences of NUM to 2+2 during preprocessing:
$ gcc -Wall -DNUM="2+2" dtestval.c
$ ./a.out
Value of NUM is 4
After the preprocessor has made the substitution NUM → 2+2 this is equivalent to compiling the following program:
#include <stdio.h>
int
main (void)
{
printf("Value of NUM is %d\n", 2+2);
return 0;
}
Note that it is a good idea to surround macros by parentheses whenever they are part of an expression. For example, the following program
uses parentheses to ensure the correct precedence for the multiplication
10*NUM:
#include <stdio.h>
int
main (void)
{
printf ("Ten times NUM is %d\n", 10 * (NUM));
return 0;
}
With these parentheses, it produces the expected result when compiled
with the same command line as above:
$ gcc -Wall -DNUM="2+2" dtestmul10.c
$ ./a.out
Ten times NUM is 40

Without parentheses, the program would produce the value 22 from the
literal form of the expression 10*2+2 = 22, instead of the desired value
10*(2+2) = 40.
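The two cases can be compared side by side by defining the macros in the source file. In this sketch the names NUM_BARE and NUM_PAREN are invented for illustration; the second one shows the alternative of placing the parentheses in the macro definition itself:

```c
#define NUM_BARE   2+2     /* expands to the tokens 2+2 */
#define NUM_PAREN  (2+2)   /* parentheses are part of the expansion */

/* Expands to 10 * 2+2, which is evaluated as (10*2)+2. */
int
ten_times_bare (void)
{
  return 10 * NUM_BARE;
}

/* Expands to 10 * (2+2). */
int
ten_times_paren (void)
{
  return 10 * NUM_PAREN;
}
```

The first function returns 22 and the second returns 40.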
When a macro is defined with ‘-D’ alone, gcc uses a default value of 1.
For example, compiling the original test program with the option ‘-DNUM’
generates an executable which produces the following output:
$ gcc -Wall -DNUM dtestval.c
$ ./a.out
Value of NUM is 1
A macro can be defined to an empty value using quotes on the command
line, -DNAME="". Such a macro is still treated as defined by conditionals
such as #ifdef, but expands to nothing.
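The behavior of an empty macro can be sketched with a definition in the source file, which has the same effect as the command-line form. The macro name EMPTY and both functions are hypothetical:

```c
#define EMPTY   /* defined, but expands to nothing */

/* Return 1 if EMPTY counts as defined, 0 otherwise. */
int
empty_is_defined (void)
{
#ifdef EMPTY
  return 1;
#else
  return 0;
#endif
}

/* EMPTY disappears during preprocessing, leaving "return 42;". */
int
answer (void)
{
  return EMPTY 42;
}
```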
A macro containing quotes can be defined using shell-escaped quote
characters. For example, the command-line option -DMESSAGE="\"Hello,
World!\"" defines a macro MESSAGE which expands to the sequence of
characters "Hello, World!". For an explanation of the different types
of quoting and escaping used in the shell see the “GNU Bash Reference
Manual”, [Further reading], page 91.
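A source file that expects a string-valued macro can supply a fallback for the case where the option is omitted. The sketch below uses #ifndef, the negated form of #ifdef; the function name message and the choice of default text are assumptions made for this example:

```c
#ifndef MESSAGE
#define MESSAGE "Hello, World!"   /* used when -DMESSAGE=... is absent */
#endif

/* Return the string that MESSAGE expands to. */
const char *
message (void)
{
  return MESSAGE;
}
```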

4.3 Preprocessing source files
It is possible to see the effect of the preprocessor on source files directly,
using the ‘-E’ option of gcc. For example, the file below defines and uses
a macro TEST:
#define TEST "Hello, World!"
const char str[] = TEST;
If this file is called ‘test.c’ the effect of the preprocessor can be seen with
the following command line:
$ gcc -E test.c
# 1 "test.c"
const char str[] = "Hello, World!" ;
The ‘-E’ option causes gcc to run the preprocessor, display the expanded
output, and then exit without compiling the resulting source code. The
value of the macro TEST is substituted directly into the output, producing
the sequence of characters const char str[] = "Hello, World!" ;.
The preprocessor also inserts lines recording the source file and line
numbers in the form # line-number "source-file", to aid in debugging
and allow the compiler to issue error messages referring to this information. These lines do not affect the program itself.
The ability to see the preprocessed source files can be useful for examining the effect of system header files, and finding declarations of system
functions. The following program includes the header file ‘stdio.h’ to
obtain the declaration of the function printf:
#include <stdio.h>
int
main (void)
{
printf ("Hello, world!\n");
return 0;
}
It is possible to see the declarations from the included header file by
preprocessing the file with gcc -E:
$ gcc -E hello.c
On a GNU system, this produces output similar to the following:
# 1 "hello.c"
# 1 "/usr/include/stdio.h" 1 3
extern FILE *stdin;
extern FILE *stdout;
extern FILE *stderr;
extern int fprintf (FILE * __stream,
const char * __format, ...) ;
extern int printf (const char * __format, ...) ;
[ ... additional declarations ... ]
# 1 "hello.c" 2
int
main (void)
{
printf ("Hello, world!\n");
return 0;
}
The preprocessed system header files usually generate a lot of output.
This can be redirected to a file, or saved more conveniently using the gcc
‘-save-temps’ option:
$ gcc -c -save-temps hello.c
After running this command, the preprocessed output will be available
in the file ‘hello.i’. The ‘-save-temps’ option also saves ‘.s’ assembly
files and ‘.o’ object files in addition to preprocessed ‘.i’ files.

5 Compiling for debugging
Normally, an executable file does not contain any references to the original
program source code, such as variable names or line-numbers—the executable file is simply the sequence of machine code instructions produced
by the compiler. This is insufficient for debugging, since there is no easy
way to find the cause of an error if the program crashes.
GCC provides the ‘-g’ debug option to store additional debugging
information in object files and executables. This debugging information
allows errors to be traced back from a specific machine instruction to the
corresponding line in the original source file. It also allows the execution
of a program to be traced in a debugger, such as the GNU Debugger gdb
(for more information, see “Debugging with GDB: The GNU Source-Level
Debugger”, [Further reading], page 91). Using a debugger also allows the
values of variables to be examined while the program is running.
The debug option works by storing the names of functions and variables (and all the references to them), with their corresponding source
code line-numbers, in a symbol table in object files and executables.

5.1 Examining core files
In addition to allowing a program to be run under the debugger, another
helpful application of the ‘-g’ option is to find the circumstances of a
program crash.
When a program exits abnormally the operating system can write out
a core file, usually named ‘core’, which contains the in-memory state of
the program at the time it crashed. Combined with information from the
symbol table produced by ‘-g’, the core file can be used to find the line
where the program stopped, and the values of its variables at that point.
This is useful both during the development of software, and after
deployment—it allows problems to be investigated when a program has
crashed “in the field”.
Here is a simple program containing an invalid memory access bug,
which we will use to produce a core file:
int a (int *p);

int
main (void)
{
int *p = 0; /* null pointer */
return a (p);
}

int
a (int *p)
{
int y = *p;
return y;
}
The program attempts to dereference a null pointer p, which is an invalid
operation. On most systems, this will cause a crash.(1)
In order to be able to find the cause of the crash later, we need to
compile the program with the ‘-g’ option:
$ gcc -Wall -g null.c
Note that a null pointer will only cause a problem at run-time, so the
option ‘-Wall’ does not produce any warnings.
Running the executable file on an x86 GNU/Linux system will cause
the operating system to terminate the program abnormally:
$ ./a.out
Segmentation fault (core dumped)
Whenever the error message ‘core dumped’ is displayed, the operating system should produce a file called ‘core’ in the current directory.(2) This
core file contains a complete copy of the pages of memory used by the
program at the time it was terminated. Incidentally, the term segmentation fault refers to the fact that the program tried to access a restricted
memory “segment” outside the area of memory which had been allocated
to it.
Some systems are configured not to write core files by default, since the
files can be large and rapidly fill up the available disk space on a system.
In the GNU Bash shell the command ulimit -c controls the maximum
size of core files. If the size limit is zero, no core files are produced. The
current size limit can be shown by typing the following command:
$ ulimit -c
0
(1) Historically, a null pointer has typically corresponded to memory location 0, which is usually restricted to the operating system kernel and not accessible to user programs.
(2) Some systems, such as FreeBSD and Solaris, can also be configured to write core files in specific directories, e.g. ‘/var/coredumps/’, using the sysctl or coreadm commands.

If the result is zero, as shown above, then it can be increased with the
following command to allow core files of any size to be written:(3)
$ ulimit -c unlimited
Note that this setting only applies to the current shell. To set the limit
for future sessions the command should be placed in an appropriate login
file, such as ‘.bash_profile’ for the GNU Bash shell.
Core files can be loaded into the GNU Debugger gdb with the following
command:
$ gdb EXECUTABLE-FILE CORE-FILE
Note that both the original executable file and the core file are required
for debugging—it is not possible to debug a core file without the corresponding executable. In this example, we can load the executable and
core file with the command:
$ gdb a.out core
The debugger immediately begins printing diagnostic information, and
shows a listing of the line where the program crashed (line 13):
$ gdb a.out core
Core was generated by ‘./a.out’.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0 0x080483ed in a (p=0x0) at null.c:13
13        int y = *p;
(gdb)
The final line (gdb) is the GNU Debugger prompt—it indicates that further commands can be entered at this point.
To investigate the cause of the crash, we display the value of the pointer
p using the debugger print command:
(gdb) print p
$1 = (int *) 0x0
This shows that p is a null pointer (0x0) of type ‘int *’, so we know that
dereferencing it with the expression *p in this line has caused the crash.
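Once the cause is known, a crash of this kind can be avoided by testing the pointer before it is used. The following is one possible sketch; the function a_checked and its fallback parameter are hypothetical and are not part of the program being debugged:

```c
#include <stddef.h>

/* Return the value pointed to by p, or fallback if p is a
   null pointer. */
int
a_checked (const int *p, int fallback)
{
  if (p == NULL)
    return fallback;   /* never dereference a null pointer */
  return *p;
}
```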
(3) This example uses the ulimit command in the GNU Bash shell. On other systems the usage of the ulimit command may vary, or the command may have a different name (the tcsh shell uses the limit command instead). The size limit for core files can also be set to a specific value in kilobytes.

5.2 Displaying a backtrace
The debugger can also show the function calls and arguments up to the
current point of execution—this is called a stack backtrace and is displayed with the command backtrace:
(gdb) backtrace
#0 0x080483ed in a (p=0x0) at null.c:13
#1 0x080483d9 in main () at null.c:7
In this case, the backtrace shows that the crash at line 13 occurred when
the function a() was called with an argument of p=0x0, from line 7 in
main(). It is possible to move to different levels in the stack trace, and
examine their variables, using the debugger commands up and down.
A complete description of all the commands available in gdb can be
found in the manual “Debugging with GDB: The GNU Source-Level Debugger” (see [Further reading], page 91).

6 Compiling with optimization
GCC is an optimizing compiler. It provides a wide range of options which
aim to increase the speed, or reduce the size, of the executable files it
generates.
Optimization is a complex process. For each high-level command in
the source code there are usually many possible combinations of machine
instructions that can be used to achieve the appropriate final result. The
compiler must consider these possibilities and choose among them.
In general, different code must be generated for different processors,
as they use incompatible assembly and machine languages. Each type
of processor also has its own characteristics—some CPUs provide a large
number of registers for holding intermediate results of calculations, while
others must store and fetch intermediate results from memory. Appropriate code must be generated in each case.
Furthermore, different amounts of time are needed for different instructions, depending on how they are ordered. GCC takes all these factors
into account and tries to produce the fastest executable for a given system
when compiling with optimization.

6.1 Source-level optimization
The first form of optimization used by GCC occurs at the source-code
level, and does not require any knowledge of the machine instructions.
There are many source-level optimization techniques—this section describes two common types: common subexpression elimination and function inlining.

6.1.1 Common subexpression elimination
One method of source-level optimization which is easy to understand involves computing an expression in the source code with fewer instructions,
by reusing already-computed results. For example, the following assignment:
x = cos(v)*(1+sin(u/2)) + sin(w)*(1-sin(u/2))
can be rewritten with a temporary variable t to eliminate an unnecessary
extra evaluation of the term sin(u/2):
t = sin(u/2)
x = cos(v)*(1+t) + sin(w)*(1-t)

This rewriting is called common subexpression elimination (CSE), and
is performed automatically when optimization is turned on.(1) Common
subexpression elimination is powerful, because it simultaneously increases
the speed and reduces the size of the code.
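The saving can be made visible with a small sketch in which a stand-in function counts how often it is evaluated. All names here are hypothetical. Because the counter is a side effect, this shows the rewrite done by hand rather than what the compiler would do to this particular code:

```c
/* Number of times the stand-in function has been evaluated. */
static int evaluations = 0;

/* Stand-in for an expensive term such as sin(u/2). */
static double
expensive (double u)
{
  evaluations++;
  return u / 2.0;
}

/* Direct form: the common term is evaluated twice. */
double
direct (double a, double b, double u)
{
  return a * (1 + expensive (u)) + b * (1 - expensive (u));
}

/* Rewritten form: the common term is evaluated once and reused. */
double
with_temporary (double a, double b, double u)
{
  double t = expensive (u);
  return a * (1 + t) + b * (1 - t);
}
```

Both functions return the same value, but the second evaluates the common term only once per call.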

6.1.2 Function inlining
Another type of source-level optimization, called function inlining, increases the efficiency of frequently-called functions.
Whenever a function is used, a certain amount of extra time is required
for the CPU to carry out the call: it must store the function arguments
in the appropriate registers and memory locations, jump to the start of
the function (bringing the appropriate virtual memory pages into physical
memory or the CPU cache if necessary), begin executing the code, and
then return to the original point of execution when the function call is
complete. This additional work is referred to as function-call overhead.
Function inlining eliminates this overhead by replacing calls to a function
by the code of the function itself (known as placing the code in-line).
In most cases, function-call overhead is a negligible fraction of the
total run-time of a program. It can become significant only when there
are functions which contain relatively few instructions, and these functions account for a substantial fraction of the run-time—in this case the
overhead then becomes a large proportion of the total run-time.
Inlining is always favorable if there is only one point of invocation of
a function. It is also unconditionally better if the invocation of a function
requires more instructions (memory) than moving the body of the function in-line. This is a common situation for simple accessor functions in
C++, which can benefit greatly from inlining. Moreover, inlining may facilitate further optimizations, such as common subexpression elimination,
by merging several separate functions into a single large function.
The following function sq(x) is a typical example of a function that
would benefit from being inlined. It computes x², the square of its argument x:
double
sq (double x)
{
return x * x;
}
(1) Temporary values introduced by the compiler during common subexpression elimination are only used internally, and do not affect real variables. The name of the temporary variable ‘t’ shown above is only used as an illustration.

This function is small, so the overhead of calling it is comparable to the
time taken to execute the single multiplication carried out by the function
itself. If this function is used inside a loop, such as the one below, then
the function-call overhead would become substantial:
for (i = 0; i < 1000000; i++)
{
sum += sq (i + 0.5);
}
Optimization with inlining replaces the function call inside the loop with
the body of the function, giving the following code:
for (i = 0; i < 1000000; i++)
{
double t = (i + 0.5); /* temporary variable */
sum += t * t;
}
Eliminating the function call and performing the multiplication in-line
allows the loop to run with maximum efficiency.
GCC selects functions for inlining using a number of heuristics, such
as the function being suitably small. As an optimization, inlining is carried out only within each object file. The inline keyword can be used
to request explicitly that a specific function should be inlined wherever
possible, including its use in other files.(2) The GCC Reference Manual
“Using GCC” provides full details of the inline keyword, and its use
with the static and extern qualifiers to control the linkage of explicitly
inlined functions (see [Further reading], page 91).
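As a sketch of an explicit request, the function sq can be declared with static inline so that it is local to the file and marked as a candidate for inlining. The wrapper function sum_of_squares is hypothetical, and the example assumes the compiler accepts the inline keyword, as GCC does in its default mode:

```c
/* Request inlining and keep the function local to this file. */
static inline double
sq (double x)
{
  return x * x;
}

/* Sum of (i + 0.5) squared for i = 0 .. n-1. */
double
sum_of_squares (int n)
{
  double sum = 0.0;
  int i;
  for (i = 0; i < n; i++)
    sum += sq (i + 0.5);
  return sum;
}
```

Whether the call is actually inlined still depends on the optimization level and on the compiler's own heuristics.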

6.2 Speed-space tradeoffs
While some forms of optimization, such as common subexpression elimination, are able to increase the speed and reduce the size of a program
simultaneously, other types of optimization produce faster code at the expense of increasing the size of the executable. This choice between speed
and memory is referred to as a speed-space tradeoff. Optimizations with
a speed-space tradeoff can also be used to make an executable smaller, at
the expense of making it run slower.

(2) In this case, the definition of the inline function must be made available to the other files (in a header file, for example).

6.2.1 Loop unrolling
A prime example of an optimization with a speed-space tradeoff is loop
unrolling. This form of optimization increases the speed of loops by eliminating the “end of loop” condition on each iteration. For example, the
following loop from 0 to 7 tests the condition i < 8 on each iteration:
for (i = 0; i < 8; i++)
{
y[i] = i;
}
At the end of the loop, this test will have been performed 9 times, and a
large fraction of the run time will have been spent checking it.
A more efficient way to write the same code is simply to unroll the
loop and execute the assignments directly:
y[0] = 0;
y[1] = 1;
y[2] = 2;
y[3] = 3;
y[4] = 4;
y[5] = 5;
y[6] = 6;
y[7] = 7;
This form of the code does not require any tests, and executes at maximum
speed. Since each assignment is independent, it also allows the compiler
to use parallelism on processors that support it. Loop unrolling is an
optimization that increases the speed of the resulting executable but also
generally increases its size (unless the loop is very short, with only one or
two iterations, for example).
Loop unrolling is also possible when the upper bound of the loop is
unknown, provided the start and end conditions are handled correctly.
For example, the same loop with an arbitrary upper bound,
for (i = 0; i < n; i++)
{
y[i] = i;
}
can be rewritten by the compiler as follows:
for (i = 0; i < (n % 2); i++)
{
y[i] = i;
}
for ( ; i + 1 < n; i += 2) /* no initializer */
{
y[i] = i;
y[i+1] = i+1;
}

The first loop handles the case i = 0 when n is odd, and the second loop
handles all the remaining iterations. Note that the second loop does
not use an initializer in the first argument of the for statement, since
it continues where the first loop finishes. The assignments in the second
loop can be parallelized, and the overall number of tests is reduced by a
factor of 2 (approximately). Higher factors can be achieved by unrolling
more assignments inside the loop, at the cost of greater code size.
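The equivalence of the two forms can be checked by writing both as functions and comparing the arrays they produce. This is a hand-written sketch of the transformation, with hypothetical function names, not the code that GCC itself generates:

```c
/* Straightforward loop: one test of i < n for every element. */
void
fill_simple (int *y, int n)
{
  int i;
  for (i = 0; i < n; i++)
    y[i] = i;
}

/* The same loop unrolled by a factor of 2. */
void
fill_unrolled (int *y, int n)
{
  int i;
  for (i = 0; i < (n % 2); i++)   /* runs once if n is odd */
    y[i] = i;
  for ( ; i + 1 < n; i += 2)      /* two assignments per test */
    {
      y[i] = i;
      y[i + 1] = i + 1;
    }
}
```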

6.3 Scheduling
The lowest level of optimization is scheduling, in which the compiler determines the best ordering of individual instructions. Most CPUs allow
one or more new instructions to start executing before others have finished. Many CPUs also support pipelining, where multiple instructions
execute in parallel on the same CPU.
When scheduling is enabled, instructions must be arranged so that
their results become available to later instructions at the right time, and to
allow for maximum parallel execution. Scheduling improves the speed of
an executable without increasing its size, but requires additional memory
and time in the compilation process itself (due to its complexity).

6.4 Optimization levels
In order to control compilation-time and compiler memory usage, and
the trade-offs between speed and space for the resulting executable, GCC
provides a range of general optimization levels, numbered from 0–3, as
well as individual options for specific types of optimization.
An optimization level is chosen with the command line option
‘-OLEVEL ’, where LEVEL is a number from 0 to 3. The effects of the
different optimization levels are described below:
‘-O0’ or no ‘-O’ option (default)
At this optimization level GCC does not perform any optimization and compiles the source code in the most straightforward way
possible. Each command in the source code is converted directly
to the corresponding instructions in the executable file, without
rearrangement. This is the best option to use when debugging a
program.
The option ‘-O0’ is equivalent to not specifying a ‘-O’ option.
‘-O1’ or ‘-O’
This level turns on the most common forms of optimization that
do not require any speed-space tradeoffs. With this option the
resulting executables should be smaller and faster than with ‘-O0’.

The more expensive optimizations, such as instruction scheduling,
are not used at this level.
Compiling with the option ‘-O1’ can often take less time than compiling with ‘-O0’, due to the reduced amounts of data that need to
be processed after simple optimizations.

‘-O2’
This option turns on further optimizations, in addition to those
used by ‘-O1’. These additional optimizations include instruction
scheduling. Only optimizations that do not require any speed-space
tradeoffs are used, so the executable should not increase in size. The
compiler will take longer to compile programs and require more
memory than with ‘-O1’. This option is generally the best choice
for deployment of a program, because it provides maximum optimization without increasing the executable size. It is the default
optimization level for releases of GNU packages.
‘-O3’
This option turns on more expensive optimizations, such as function inlining, in addition to all the optimizations of the lower levels
‘-O2’ and ‘-O1’. The ‘-O3’ optimization level may increase the speed
of the resulting executable, but can also increase its size. Under
some circumstances where these optimizations are not favorable,
this option might actually make a program slower.
‘-funroll-loops’
This option turns on loop-unrolling, and is independent of the other
optimization options. It will increase the size of an executable.
Whether or not this option produces a beneficial result has to be
examined on a case-by-case basis.
‘-Os’
This option selects optimizations which reduce the size of an executable. The aim of this option is to produce the smallest possible
executable, for systems constrained by memory or disk space. In
some cases a smaller executable will also run faster, due to better
cache usage.
It is important to remember that the benefit of optimization at the
highest levels must be weighed against the cost. The cost of optimization
includes greater complexity in debugging, and increased time and memory
requirements during compilation. For most purposes it is satisfactory to
use ‘-O0’ for debugging, and ‘-O2’ for development and deployment.
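A program can find out at compile time whether optimization was requested, because GCC predefines the macro __OPTIMIZE__ in optimizing compilations. This macro is GCC-specific and is not discussed in the text above, and the function name below is hypothetical, so the following should be read as a sketch:

```c
/* Return 1 if this file was compiled with optimization enabled,
   and 0 otherwise.  __OPTIMIZE__ is predefined by GCC only. */
int
compiled_with_optimization (void)
{
#ifdef __OPTIMIZE__
  return 1;
#else
  return 0;
#endif
}
```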

6.5 Examples
The following program will be used to demonstrate the effects of different
optimization levels:

#include <stdio.h>
double
powern (double d, unsigned n)
{
double x = 1.0;
unsigned j;
for (j = 1; j <= n; j++)
x *= d;
return x;
}
int
main (void)
{
double sum = 0.0;
unsigned i;
for (i = 1; i <= 100000000; i++)
{
sum += powern (i, i % 5);
}
printf ("sum = %g\n", sum);
return 0;
}
The main program contains a loop calling the powern function. This
function computes the n-th power of a floating point number by repeated
multiplication—it has been chosen because it is suitable for both inlining
and loop-unrolling. The run-time of the program can be measured using
the time command in the GNU Bash shell.
Here are some results for the program above, compiled on a 566 MHz
Intel Celeron with 16 KB L1-cache and 128 KB L2-cache, using GCC 3.3.1
on a GNU/Linux system:
$ gcc -Wall -O0 test.c -lm
$ time ./a.out
real    0m13.388s
user    0m13.370s
sys     0m0.010s
$ gcc -Wall -O1 test.c -lm
$ time ./a.out
real    0m10.030s
user    0m10.030s
sys     0m0.000s
$ gcc -Wall -O2 test.c -lm
$ time ./a.out
real    0m8.388s
user    0m8.380s
sys     0m0.000s
$ gcc -Wall -O3 test.c -lm
$ time ./a.out
real    0m6.742s
user    0m6.730s
sys     0m0.000s
$ gcc -Wall -O3 -funroll-loops test.c -lm
$ time ./a.out
real    0m5.412s
user    0m5.390s
sys     0m0.000s

The relevant entry in the output for comparing the speed of the resulting
executables is the ‘user’ time, which gives the actual CPU time spent
running the process. The other rows, ‘real’ and ‘sys’, record the total
real time for the process to run (including times where other processes
were using the CPU) and the CPU time spent in the operating system
kernel on behalf of the process. Although only one run is shown for each case above, the benchmarks
were executed several times to confirm the results.
In this case the timings show that increasing the optimization level with
‘-O1’, ‘-O2’ and ‘-O3’ produces an increasing speedup relative to the
unoptimized code compiled with ‘-O0’. The additional option
‘-funroll-loops’ produces a further speedup. Overall, the program runs
about 2.5 times faster at the highest level of optimization than when
unoptimized (13.37 versus 5.39 seconds of ‘user’ time).
Note that for a small program such as this there can be considerable
variation between systems and compiler versions. For example, on a Mobile 2.0 GHz Intel Pentium 4M system the trend of the results using the
same version of GCC is similar except that the performance with ‘-O2’
is slightly worse than with ‘-O1’. This illustrates an important point:
optimizations may not necessarily make a program faster in every case.

6.6 Optimization and debugging
With GCC it is possible to use optimization in combination with the
debugging option ‘-g’. Many other compilers do not allow this.
When using debugging and optimization together, the internal rearrangements carried out by the optimizer can make it difficult to see what
is going on when examining an optimized program in the debugger. For
example, temporary variables are often eliminated, and the ordering of
statements may be changed.
However, when a program crashes unexpectedly, any debugging information is better than none—so the use of ‘-g’ is recommended for optimized programs, both for development and deployment. The debugging
option ‘-g’ is enabled by default for releases of GNU packages, together
with the optimization option ‘-O2’.

6.7 Optimization and compiler warnings
When optimization is turned on, GCC can produce additional warnings
that do not appear when compiling without optimization.
As part of the optimization process, the compiler examines the use of
all variables and their initial values—this is referred to as data-flow analysis. It forms the basis for other optimization strategies, such as instruction
scheduling. A side-effect of data-flow analysis is that the compiler can detect the use of uninitialized variables.
The ‘-Wuninitialized’ option (which is included in ‘-Wall’) warns
about variables that are read without being initialized. It only works when
the program is compiled with optimization to enable data-flow analysis.
The following function contains an example of such a variable:
int
sign (int x)
{
  int s;

  if (x > 0)
    s = 1;
  else if (x < 0)
    s = -1;

  return s;
}
The function works correctly for most arguments, but has a bug when x
is zero: in this case s is never assigned, so the value returned is undefined.

Compiling the program with the ‘-Wall’ option alone does not produce any warnings, because data-flow analysis is not carried out without
optimization:
$ gcc -Wall -c uninit.c
To produce a warning, the program must be compiled with ‘-Wall’ and
optimization simultaneously. In practice, the optimization level ‘-O2’ is
needed to give good warnings:
$ gcc -Wall -O2 -c uninit.c
uninit.c: In function ‘sign’:
uninit.c:4: warning: ‘s’ might be used uninitialized
in this function
This correctly detects the possibility of the variable s being used without
being defined.
Note that while GCC will usually find most uninitialized variables,
it does so using heuristics which will occasionally miss some complicated
cases or falsely warn about others. In the latter situation, it is often
possible to rewrite the relevant lines in a simpler way that removes the
warning and improves the readability of the source code.

7 Compiling a C++ program
This chapter describes how to use GCC to compile programs written in
C++, and the command-line options specific to that language.
The GNU C++ compiler provided by GCC is a true C++ compiler—it
compiles C++ source code directly into assembly language. Some other
C++ “compilers” are translators which convert C++ programs into C, and
then compile the resulting C program using an existing C compiler. A
true C++ compiler, such as GCC, is able to provide better support for
error reporting, debugging and optimization.

7.1 Compiling a simple C++ program
The procedure for compiling a C++ program is the same as for a C program, but uses the command g++ instead of gcc. Both compilers are part
of the GNU Compiler Collection.
To demonstrate the use of g++, here is a version of the Hello World
program written in C++:
#include <iostream>

int
main ()
{
  std::cout << "Hello, world!" << std::endl;
  return 0;
}
The program can be compiled with the following command line:
$ g++ -Wall hello.cc -o hello
The C++ frontend of GCC uses many of the same options as the
C compiler gcc. It also supports some additional options for controlling
C++ language features, which will be described in this chapter. Note that
C++ source code should be given one of the valid C++ file extensions ‘.cc’,
‘.cpp’, ‘.cxx’ or ‘.C’ rather than the ‘.c’ extension used for C programs.
The resulting executable can be run in exactly the same way as the C
version, simply by typing its filename:
$ ./hello
Hello, world!
The executable produces the same output as the C version of the program,
using std::cout instead of the C printf function. All the options used in
the gcc commands in previous chapters apply to g++ without change, as
do the procedures for compiling and linking files and libraries (using g++
instead of gcc, of course). One natural difference is that the ‘-ansi’ option
requests compliance with the C++ standard, instead of the C standard,
when used with g++.
Note that programs using C++ object files must always be linked with
g++, in order to supply the appropriate C++ libraries. Attempting to link
a C++ object file with the C compiler gcc will cause “undefined reference”
errors for C++ standard library functions:
$ g++ -Wall -c hello.cc
$ gcc hello.o       (should use g++)
hello.o: In function ‘main’:
hello.o(.text+0x1b): undefined reference to ‘std::cout’
.....
hello.o(.eh_frame+0x11):
undefined reference to ‘__gxx_personality_v0’
Linking the same object file with g++ supplies all the necessary C++ libraries and will produce a working executable:
$ g++ hello.o
$ ./a.out
Hello, world!
A point that sometimes causes confusion is that gcc will actually compile
C++ source code when it detects a C++ file extension, but cannot then
link the resulting object files.
$ gcc -Wall -c hello.cc     (succeeds, even for C++)
$ gcc hello.o
hello.o: In function ‘main’:
hello.o(.text+0x1b): undefined reference to ‘std::cout’
In order to avoid this problem it is best to use g++ consistently for C++
programs, and gcc for C programs.

7.2 Using the C++ standard library
An implementation of the C++ standard library is provided as a part of
GCC. The following program uses the standard library string class to
reimplement the Hello World program:
#include <string>
#include <iostream>

using namespace std;

int
main ()
{
  string s1 = "Hello,";
  string s2 = "World!";
  cout << s1 + " " + s2 << endl;
  return 0;
}
The program can be compiled and run using the same commands as above:
$ g++ -Wall hellostr.cc
$ ./a.out
Hello, World!
Note that in accordance with the C++ standard, the header files for the
C++ library itself do not use a fi