Copyright © 2002, 2003 Mathieu Lacage
This article is copyrighted to Mathieu Lacage. Copying and/or reuse of this document is not allowed. You may link to this page but you may not redistribute copies and/or modifications of this article.
| Revision History | |
|---|---|
| Revision 0.0.4 | 20-09-2003 |
| Update build system. | |
| Revision 0.0.3 | 22-05-2003 |
| Revision 0.0.2 | 20-05-2003 |
Many C++ development frameworks do not use exceptions: many C++ developers shy away from what they feel is an overly complex feature of the language which they don't understand. Many articles available on the internet contribute to this feeling by describing the numerous tricky aspects of C++ exceptions. Most of these articles deal solely with the so-called exception safety property of C++ code.
Such articles usually assume the reader is familiar with the normal aspects of exceptions: their throwing and handling. These articles furthermore seem to hold these aspects as simple enough that they do not deserve much more discussion. However, if you actually try to write exception code, a lot of practical and theoretical questions arise, none of which are answered by any of the introductions to C++ exceptions found online. After a lot of time spent browsing the internet for good introductions to the basic concepts of exception handling, I finally gave up, went to the ANSI website and bought an electronic copy of the C++ standard. Because I do not expect most programmers to do this and because I do not think reading the C++ standard should be needed to learn C++, I decided to write down what I learned about exceptions :)
Exceptions are usually presented as a way to simplify error handling in user code: they let you separate the error-handling code, which is otherwise interleaved with the main code path, from the normal non-error code. The following sample shows code with and without exceptions.
// Without exceptions
#include <new> // std::nothrow
typedef enum {
    error_alloc,
    error,
    ok
} ErrorStatus;
class Automatic {};
class Foo
{
public:
    ErrorStatus DoAction (void) {return ok;}
};
class Bar
{
public:
    ErrorStatus DoAction (void) {return error;}
};
ErrorStatus f (void)
{
    Automatic a;
    // Plain new reports failure by throwing std::bad_alloc;
    // the nothrow form returns a null pointer instead.
    Foo *foo = new (std::nothrow) Foo ();
    if (foo == 0) {
        // a's destructor invoked automatically upon exit.
        return error_alloc;
    }
    ErrorStatus status = foo->DoAction ();
    if (status != ok) {
        delete foo;
        // a's destructor invoked automatically upon exit.
        return status;
    }
    Bar *bar = new (std::nothrow) Bar ();
    if (bar == 0) {
        delete foo;
        // a's destructor invoked automatically upon exit.
        return error_alloc;
    }
    status = bar->DoAction ();
    if (status != ok) {
        delete foo;
        delete bar;
        // a's destructor invoked automatically upon exit.
        return status;
    }
    delete foo;
    delete bar;
    // a's destructor invoked automatically upon exit.
    return ok;
}
int main (int argc, char *argv[])
{
    ErrorStatus status;
    status = f ();
    if (status != ok) {
        // handle the error.
    }
    return 0;
}
and with exceptions:
class Error {};
class Automatic {};
class Foo
{
public:
    void DoAction() throw(Error) { throw Error(); }
};
class Bar
{
public:
    void DoAction() throw(Error) { throw Error(); }
};
void f()
{
    Foo* foo = 0;
    try {
        Automatic a;
        foo = new Foo;
        foo->DoAction();
        Bar* bar = 0;
        try {
            bar = new Bar;
            bar->DoAction();
        } catch (...) {
            delete bar;
            throw;
        }
        delete foo;
        delete bar;
        // a's destructor invoked automatically after leaving this scope
    } catch (...) {
        delete foo;
        throw;
    }
}
int main (int argc, char *argv[])
{
    try {
        f ();
    } catch (Error &e) {
    }
}
Both of these code listings try to invoke multiple methods on multiple objects, all of which can trigger errors. The errors are raised by the DoAction methods, handled locally in f by freeing the allocated memory, and propagated to the caller in case the caller wants to do something about them.
The syntax for the exception version seems rather straightforward (we'll revisit exception throwing syntax and semantics as well as exception catching and propagation syntax and semantics in the following sections): to return an error, rather than execute the statement return error;, you execute the statement throw Error ();. At this point, the C++ runtime is triggered and unwinds the call stack: it walks the stack from the inner-most function to the outer-most function and invokes the first code block identified by a matching catch statement in one of these functions. The unwinding stops when either the C++ runtime has found a matching catch block which does not rethrow the exception or when it reaches the top of the function stack (in which case std::terminate() is called). At each level, the unwinding routine invokes the destructors of any automatic variables left on the stack.
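The unwinding order described above can be observed with a small sketch. The names Tracer, inner, outer and run_unwinding_demo are hypothetical and only serve this illustration: each Tracer records its own destruction, so the returned log shows that the automatic variables of the inner-most function are destroyed first, and that all destructors run before the body of the matching catch block.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper names, used only for this illustration.
static std::vector<std::string> events;

struct Error {};

// Records its own destruction in the event log.
struct Tracer {
    std::string name;
    explicit Tracer(const std::string& n) : name(n) {}
    ~Tracer() { events.push_back("~" + name); }
};

void inner() {
    Tracer t("inner");
    throw Error();   // unwinding starts here
}

void outer() {
    Tracer t("outer");
    inner();         // no handler here: the exception passes through
}

// Returns the log: destructors from the inner-most frame outwards,
// then the entry recorded by the catch block.
std::vector<std::string> run_unwinding_demo() {
    events.clear();
    try {
        outer();
    } catch (const Error&) {
        events.push_back("caught");
    }
    return events;
}
```

Calling run_unwinding_demo() yields the three entries "~inner", "~outer" and "caught", in that order.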
This process is functionally equivalent to the previous exception-free code, which returned from the function as soon as an error was found and propagated the error code to its calling function, hoping that someone higher up in the call hierarchy knows better what to do with this error:
Table 1. Exceptions vs Error codes
| | Error code | Exceptions |
|---|---|---|
| Raise error | return error; | throw Error (); |
| Propagate error | if (status != ok) { /* error code */; return status;} | try {} catch (condition) { /* error code */; throw;} |
| Treat error | if (status != ok) {/* do something about it */;} | try {} catch (condition) { /* error code */;} |
| Declare error | ErrorStatus foo (void); | void foo (void) throw (Error); |
| Automatic variable destructor invocation | Invoked whenever the return statement is reached | Invoked during unwinding, before the matching catch block is executed |
Of course, there are a number of rather subtle differences:
The exception version of the code can grow pretty deep in terms of indentation, which makes it (in my opinion) rather ugly and unreadable. Readers who believe they are smarter than me can try to do what is described in the section called “Smart and working exception code”.
Exception declaration in method/function signatures is optional. That is, if a function raises an exception in a code block, its declaration does not need to specify that it raises the exception. This property has a rather unpleasant consequence: it means you need to be ready to see _any_ exception thrown from any function/method.
The constructors in the exception version can throw exceptions (this is the only way to report an error from a constructor, but it is better not to throw from a constructor unless really needed), while those in the error-code version cannot return any error code since a constructor does not return any value. Some sick readers might imagine they can pass a reference or a pointer to the constructor and have the constructor assign its status to this parameter. This looks horribly ugly and feels like a truly scary idea, and it is not even possible with an object's default constructor which, by definition, does not take any arguments.
The syntax to throw exceptions might seem at first simple but, as the other bits and pieces of exception-related syntax, it has numerous subtle points.
The throw keyword used to throw exceptions actually has 3 different uses:
In the context of a block of code marked by a catch statement, throw; is used to specify that the exception caught by the block is to be rethrown. This means that instead of stopping exception propagation at this catch statement, the C++ runtime will keep on unwinding the stack with the exception thrown originally. If throw; is executed while no exception is being handled, the runtime calls std::terminate().
In the context of a function or method declaration, ret_type function (par_type1, par_type2) throw (Exception1Name, Exception2Name); specifies explicitly the set of exceptions the function is allowed to let escape. The compiler does not check this list at compile time: the function body may still throw any other exception, but if such an exception escapes at run time, the runtime calls std::unexpected() which, by default, terminates the program. A special case of this syntax is when the exception list is empty, ret_type function (par_type1, par_type2) throw ();: this specifies that the function cannot throw _any_ exception. It should be noted that such a promise is extremely difficult to keep and that extremely few functions specify this. You are thus advised to think twice about adding this to your APIs: the compiler is free to use this information to generate code or optimize it... (Later revisions of the language changed this area: C++11 deprecated exception lists in favour of noexcept, and C++17 removed the non-empty form.)
Finally, in the context of any code block, throw exception_object; throws an exception and triggers its propagation up into the stack.
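The difference between rethrowing with throw; and throwing the caught object again can be sketched as follows. BaseError, DerivedError and the two functions are hypothetical names used only for this illustration. The first function rethrows the exception currently being handled, so its dynamic type survives; the second one throws a new exception copied from the static type of e, which slices the derived part off.

```cpp
#include <cassert>
#include <string>

struct BaseError {
    virtual ~BaseError() {}
    virtual std::string name() const { return "BaseError"; }
};

struct DerivedError : BaseError {
    virtual std::string name() const { return "DerivedError"; }
};

// Rethrows the exception currently being handled with `throw;`.
std::string rethrow_same_object() {
    try {
        try {
            throw DerivedError();
        } catch (BaseError&) {
            throw;     // same exception object, dynamic type preserved
        }
    } catch (BaseError& e) {
        return e.name();
    }
    return "no exception";
}

// Throws a new exception copy-initialized from the static type of `e`.
std::string throw_sliced_copy() {
    try {
        try {
            throw DerivedError();
        } catch (BaseError& e) {
            throw e;   // new BaseError object: the derived part is lost
        }
    } catch (BaseError& e) {
        return e.name();
    }
    return "no exception";
}
```

Here rethrow_same_object() returns "DerivedError" while throw_sliced_copy() returns "BaseError", which is why a catch block that only wants to propagate an error should use the bare throw; form.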
Interestingly, the seemingly simple syntax used to throw an exception is a bit more complicated than it looks. The exact definition of this syntax is: throw assignment-expression [1] . A temporary object whose type is the static type of the assignment-expression is then initialized (through the type's copy-constructor) in a magic memory location which must stay valid during the subsequent unwinding. The magic memory location used is, of course, completely implementation-dependent. The temporary object is destroyed (that is, its destructor is first invoked and then its magic memory location is freed) once the last catch block handling the exception exits without rethrowing it.
The C++ specification suggests that the parameter to the throw statement can be viewed as an argument to a function call named throw. The syntax described above can thus be considered equivalent to throw (assignment-expression); which means that it is possible to replace the assignment-expression with anything used in the parameters to a normal function call. Most notably, the following sample code shows a few interesting examples:
void function (void) {
    // void bar (Foo foo);
    Foo foo = Foo ();
    bar (foo);
}
void function (void) {
    // void bar (Foo *foo);
    Foo *foo = new Foo ();
    bar (foo);
}
void function (void) {
    // void bar (Foo foo);
    bar (Foo ());
}
void function (void) {
    // void bar (Foo *foo);
    bar (new Foo ());
}
All these examples can be re-written with throw. They are commented below.
void function (void) {
    // push a copy of the foo object on the stack and invoke throw.
    Foo foo = Foo ();
    throw foo; // 1
}
void function (void) {
    // push a copy of the foo pointer on the stack and invoke throw.
    Foo *foo = new Foo ();
    throw foo; // 2
}
void function (void) {
    // allocate the foo object on the stack, constructs it and invoke throw.
    throw Foo (); // 3
}
void function (void) {
    // call new to allocate memory, construct this memory, push a copy
    // of the pointer returned by new on the stack, invoke throw.
    throw new Foo (); // 4
}
1) The foo object constructed on the stack is used to copy-construct the magic temporary copy, which is destroyed once the exception has been handled. It is interesting to note that the original object on the function's stack is destroyed before its copy, since the stack unwinding code invokes its destructor along with the destructors of all the other local variables of the function.
2) A copy of the pointer to the object allocated by operator new is pushed on the stack. The C++ runtime allocates a magic temporary copy of the pointer and destroys this copy of the pointer once the exception has been handled. Here, no one is responsible for invoking operator delete on the object the pointer points to. As such, the final catch block which receives this exception should delete the pointed-to exception object to avoid a memory leak.
3) An object is allocated on the stack and is constructed there. The C++ runtime then uses this object to copy-construct the magic temporary copy. The original object is destroyed during stack unwinding as in 1).
4) The memory for an object is allocated by operator new, its constructor is invoked, a pointer pointing to this object is pushed on the stack and the C++ runtime creates a magic temporary copy of the pointer. Here, as in 2), someone must decide to invoke operator delete on the object created for the throw. Most likely this will be the code in the final matching catch block.
To summarize, it is important to understand that the exception object seen by the matching catch blocks is a copy of the original object named in the throw statement (and a catch block which declares its parameter by value receives yet another copy). It is thus vital that both its copy constructor and its destructor (to destroy the copies) are available to the program. The GNU C++ compiler will issue a warning if this is not the case. It is also vital to understand that if you use a pointer as the exception object, the problem of freeing the memory it points to must be solved.
Given all these constraints, I suggest always throwing a class type (don't try to throw things such as int or const char *) and never throwing a pointer to such a class type. This will ensure that the memory used by exception objects is automatically managed by the C++ compiler and its runtime, thus reducing potential bugs.
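A minimal sketch of this recommendation follows. ParseError, parse_digit and describe are hypothetical names, and deriving the exception class from std::runtime_error is an assumption of this sketch rather than something the text above requires. The exception is thrown by value and caught by const reference, so no one has to delete anything.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception class for this sketch.
class ParseError : public std::runtime_error {
public:
    explicit ParseError(const std::string& what) : std::runtime_error(what) {}
};

// Converts a decimal digit character to its value, or throws ParseError.
int parse_digit(char c) {
    if (c < '0' || c > '9') {
        throw ParseError(std::string("not a digit: ") + c);  // throw by value
    }
    return c - '0';
}

// Returns "ok" or the message carried by the exception.
std::string describe(char c) {
    try {
        parse_digit(c);
        return "ok";
    } catch (const ParseError& e) {   // catch by const reference: no extra copy
        return e.what();
    }
}
```

With this code, describe('7') returns "ok" and describe('x') returns "not a digit: x"; the runtime releases the exception object once the catch block exits.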
The syntax to declare a block of code as being able to catch an exception being thrown in another block of code is not as straightforward as most texts lead you to believe. The main syntax is, of course, pretty simple:
// catch all exceptions
try {
    // code which throws an exception
} catch (...) {
    // code which catches any exception
}
// catch only a given type of exceptions
try {
    // code which throws an exception
} catch (const Excp &e) {
    // code which catches only exceptions of type const Excp
}
The catch handler catches either all exceptions if the exception declaration is ... or only a given exception type as specified by the exception declaration. A handler matches an exception type if both underlying types are intuitively compatible in a way similar to that of a function call and its arguments (the handler type is the same type as the exception type or one of its accessible base classes; unlike a function call, no arithmetic or user-defined conversions are applied). [2] The code in the catch handler can then access a copy of the original exception object. If this catch handler does not decide to propagate the original exception with an empty throw; statement, it can throw another new exception (the original exception object is often copied into the new exception object to keep information about where the exception comes from) or not throw anything, in which case stack unwinding stops and execution resumes after the catch block.
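The matching rules can be illustrated with a short sketch. DerivedExcp, Unrelated and which_handler are hypothetical names introduced here. Handlers are tried in the order in which they are written and the first match wins, which is why the handler for the derived class is placed before the handler for its base class.

```cpp
#include <cassert>
#include <string>

struct Excp {};
struct DerivedExcp : Excp {};
struct Unrelated {};

// Throws a default-constructed T and reports which handler caught it.
template <typename T>
std::string which_handler() {
    try {
        throw T();
    } catch (const DerivedExcp&) {
        return "DerivedExcp";
    } catch (const Excp&) {        // also matches classes derived from Excp
        return "Excp";
    } catch (...) {                // matches everything else
        return "catch-all";
    }
}
```

For instance, which_handler<DerivedExcp>() returns "DerivedExcp", which_handler<Excp>() returns "Excp", and both which_handler<Unrelated>() and which_handler<int>() fall through to "catch-all". If the two class handlers were swapped, the Excp handler would intercept DerivedExcp objects as well.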
Catching exceptions can become interesting when you look at the more subtle syntax for the so-called function-try-block:
C::C(int ii, double id)
try
    : i(f(ii)), d(id) // initializer list.
{
    // constructor function body
}
catch (...)
{
    // handles exceptions thrown from the ctor-initializer
    // and from the constructor function body
}
This function-try-block is used to catch exceptions thrown either during the initialization of the members of an object or during the execution of the constructor body of this object. In the sample code above, the catch block will be executed if an exception is thrown during the evaluation of the initializer list or during the execution of the function body. In a constructor, the exception is automatically rethrown when control reaches the end of the catch block. This means that it is impossible to actually catch exceptions and stop their propagation in a constructor's function-try-block, which kind of defeats the whole purpose of a catch block! [3]
Robert Schmidt discusses the details of this rather surprising behavior in Part 14 of his C++ exceptions articles (see the section called “References”).
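The automatic rethrow can be checked with the following sketch. Member, the class C shown here and the construct function are hypothetical and only loosely modeled on the sample above. The handler of the constructor's function-try-block does not contain any throw statement, yet the caller still sees the exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

static std::string log_text;

// A member whose construction fails for negative values.
struct Member {
    explicit Member(int v) {
        if (v < 0) throw std::invalid_argument("negative");
    }
};

class C {
public:
    explicit C(int v)
    try
        : m(v)
    {
    }
    catch (const std::invalid_argument&)
    {
        log_text += "handler ran;";
        // no throw here: the exception is rethrown automatically
        // when control reaches the end of this handler
    }
private:
    Member m;
};

// Tries to build a C and returns what happened.
std::string construct(int v) {
    log_text.clear();
    try {
        C c(v);
        log_text += "constructed;";
    } catch (const std::invalid_argument&) {
        log_text += "propagated;";
    }
    return log_text;
}
```

Here construct(-1) returns "handler ran;propagated;" and construct(1) returns "constructed;". The handler is thus mostly useful to log the failure or to translate the exception into another type.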
A lot of work, sweat and saliva was dedicated to discovering and describing programming idioms to ease the work of writing exception-safe code. Most texts (the interested reader will find useful references in the section called “References”) which deal with this problem distinguish between:
Strong exception-safety guarantee: functions ensure that the state of the program is atomically changed with regard to exceptions. That is, if an exception arises at any point, the functions guarantee that the program state is not changed.
Weak (or Basic) exception-safety guarantee: functions ensure that the program does not become corrupted after an exception is thrown. The state of some objects might have changed but no memory is to be leaked and no object should be in a corrupted or inconsistent state.
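The difference between the two guarantees can be sketched with a small example. The functions below are hypothetical, and a negative value merely stands in for any step which can fail halfway through an operation. The strong version works on a copy and commits the result with swap, which is one common way (not the only one) to obtain the strong guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Basic guarantee: target stays valid, but may keep the values
// appended before the failure.
void append_all_basic(std::vector<int>& target, const std::vector<int>& values) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < 0) throw std::invalid_argument("negative value");
        target.push_back(values[i]);
    }
}

// Strong guarantee: do the work on a copy and commit with swap,
// which does not throw.
void append_all_strong(std::vector<int>& target, const std::vector<int>& values) {
    std::vector<int> work(target);
    append_all_basic(work, values);
    target.swap(work);
}

// Starts from {1, 2}, tries to append {3, -1, 4}, returns the final size.
std::size_t size_after_failed_append(bool strong) {
    std::vector<int> target;
    target.push_back(1);
    target.push_back(2);
    std::vector<int> values;
    values.push_back(3);
    values.push_back(-1);
    values.push_back(4);
    try {
        if (strong) append_all_strong(target, values);
        else append_all_basic(target, values);
    } catch (const std::invalid_argument&) {
        // swallowed here only so that the resulting state can be inspected
    }
    return target.size();
}
```

After the failure, the strong version leaves the vector with its two original elements, while the basic version leaves three elements: a valid but modified state.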
The focus is on writing all the code which lies between the throw and the corresponding catch statement. As you have probably noticed, the same problem exists in code which uses exceptions and in code which does not use them. In both cases, the programmer must be careful to free all locally allocated resources when the error code path is taken, which is always a chore and error-prone.
What I have always found fascinating and rather surprising is that all the techniques developed to improve the look of code which must provide exception safety can be applied as-is to code which does not use exceptions to propagate program errors. The main technique used by exception-safe code is to build upon one of the properties of the exception mechanism: all automatic variables locally allocated are destroyed upon stack unwinding. While this property can be seen by most readers as the source of all problems (pointers are destroyed but not the data they point to), it is paradoxically also the source of the solution.
The solution lies in what everyone calls smart pointers: while wide use of smart pointers makes templates mandatory, it is possible to understand the basic mechanism of these small object classes without using any templates.
A smart pointer is a small class which is used to hold a pointer to another object:
class Error {};
class Foo
{
public:
    void Action (void) {throw Error ();}
};
class FooPtr
{
public:
    FooPtr (Foo *ptr) {m_ptr = ptr;}
    ~FooPtr () {delete m_ptr;}
    Foo *GetFoo (void) {return m_ptr;}
private:
    Foo *m_ptr;
};
static void dumb (void)
{
    Foo *ptr = new Foo ();
    try {
        ptr->Action ();
        // normal path: the object must be freed here too.
        delete ptr;
    } catch (Error &e) {
        // error path.
        delete ptr;
    }
}
static void smart (void)
{
    FooPtr ptr (new Foo ());
    ptr.GetFoo ()->Action ();
    // FooPtr's destructor is invoked upon exit of the function code block.
    // it destroys automatically the object it points to.
    // If an exception arises, FooPtr's destructor is also invoked because it
    // is an automatic local variable.
}
int main (int argc, char *argv[])
{
    dumb ();
    try {
        smart ();
    } catch (Error &e) {
        // handle.
    }
}
The key point to notice here is that the FooPtr class is used to associate a destructor to a pointer to the Foo class which is thus automatically invoked by the stack unwinding code during exception propagation.
Of course, the above-described smart pointer class can be rewritten as the following code, which is less easy to understand but has the nice side-effect of allowing much nicer-looking client code:
class Error {};
class Foo
{
public:
    void Action (void) {throw Error ();}
};
class FooPtr
{
public:
    FooPtr (Foo *ptr) {m_ptr = ptr;}
    ~FooPtr () {delete m_ptr;}
    Foo *operator-> () {return m_ptr;}
private:
    Foo *m_ptr;
};
static void smart (void)
{
    FooPtr ptr (new Foo ());
    ptr->Action ();
    // FooPtr's destructor is invoked upon exit of the function code block.
    // it destroys automatically the object it points to.
    // If an exception arises, FooPtr's destructor is also invoked because it
    // is an automatic local variable.
}
int main (int argc, char *argv[])
{
    try {
        smart ();
    } catch (Error &e) {
        // handle.
    }
}
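Here, the call to GetFoo () is replaced with the operator ->. If you really want to design a nice smart pointer class, it will clearly require a bit more code and thinking (for instance, FooPtr as written must never be copied, since both copies would delete the same object; look at the template class auto_ptr in the C++ standard library, which C++11 later superseded with std::unique_ptr) but the basic idea will stay the same.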
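For comparison, here is the same smart () function written with a standard library smart pointer. This sketch assumes a C++11 compiler and uses std::unique_ptr, which postdates this article; the live_foos counter and the live_foos_after_smart function are hypothetical additions which make the automatic destruction observable.

```cpp
#include <cassert>
#include <memory>

struct Error {};

static int live_foos = 0;   // number of Foo objects currently alive

class Foo {
public:
    Foo() { ++live_foos; }
    ~Foo() { --live_foos; }
    void Action() { throw Error(); }
};

// Same shape as smart () above, with std::unique_ptr instead of FooPtr.
void smart() {
    std::unique_ptr<Foo> ptr(new Foo());
    ptr->Action();   // throws: ptr's destructor deletes the Foo during unwinding
}

// Runs smart () and returns how many Foo objects are still alive.
int live_foos_after_smart() {
    try {
        smart();
    } catch (const Error&) {
        // handle.
    }
    return live_foos;
}
```

live_foos_after_smart() returns 0: the Foo object was destroyed during unwinding without any explicit delete. Unlike FooPtr, std::unique_ptr cannot be copied, which rules out the double delete problem at compile time.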
As I mentioned earlier, the idea of smart pointers can be used to rewrite the error-propagation code we wrote in the section called “What is an exception ?” (if you write smart pointer classes which override the == and -> operators):
// no exceptions
#include <new> // std::nothrow
typedef enum {
    ok,
    error,
    error_alloc
} ErrorStatus;
class Foo
{
public:
    ErrorStatus Action (void) {return ok;}
};
class FooPtr
{
public:
    FooPtr (Foo *ptr) {m_ptr = ptr;}
    ~FooPtr () {delete m_ptr;}
    Foo *operator-> () {return m_ptr;}
    bool operator == (void * v) {return m_ptr == v;}
private:
    Foo *m_ptr;
};
class Bar
{
public:
    ErrorStatus Action (void) {return ok;}
};
class BarPtr
{
public:
    BarPtr (Bar *ptr) {m_ptr = ptr;}
    ~BarPtr () {delete m_ptr;}
    Bar *operator-> () {return m_ptr;}
    bool operator == (void * v) {return m_ptr == v;}
private:
    Bar *m_ptr;
};
ErrorStatus no_exceptions (void)
{
    // The nothrow form of new returns a null pointer on failure
    // instead of throwing std::bad_alloc.
    FooPtr foo (new (std::nothrow) Foo ());
    if (foo == 0) {
        return error_alloc;
    }
    ErrorStatus status = foo->Action ();
    if (status != ok) {
        return status;
    }
    BarPtr bar (new (std::nothrow) Bar ());
    if (bar == 0) {
        return error_alloc;
    }
    status = bar->Action ();
    if (status != ok) {
        return status;
    }
    return status;
}
int main (int argc, char *argv[])
{
    ErrorStatus status;
    status = no_exceptions ();
    if (status != ok) {
        // handle the error.
    }
}
And the exception version:
// exceptions
class Error {};
class Foo
{
public:
    void Action (void) {throw Error ();}
};
class FooPtr
{
public:
    FooPtr (Foo *ptr) {m_ptr = ptr;}
    ~FooPtr () {delete m_ptr;}
    Foo *operator-> () {return m_ptr;}
    bool operator == (void * v) {return m_ptr == v;}
private:
    Foo *m_ptr;
};
class Bar
{
public:
    void Action (void) {throw Error ();}
};
class BarPtr
{
public:
    BarPtr (Bar *ptr) {m_ptr = ptr;}
    ~BarPtr () {delete m_ptr;}
    Bar *operator-> () {return m_ptr;}
    bool operator == (void * v) {return m_ptr == v;}
private:
    Bar *m_ptr;
};
void exceptions (void)
{
    FooPtr foo (new Foo ());
    foo->Action ();
    BarPtr bar (new Bar ());
    bar->Action ();
}
int main (int argc, char *argv[])
{
    try {
        exceptions ();
    } catch (Error &e) {
    }
}
Here, admittedly, the exception version of the code looks, erm, a bit simpler than the error-code based version, which is, at best, cluttered with if statements (although even that version is clearly much nicer than the original one, since you no longer have to worry about freeing the objects correctly in each different if case: it is handled automatically by the compiler).
As can be seen in the previous sections, while it might seem at first that code written to deal with program errors by using exceptions and code written using simple status error codes are very similar, this is no longer true once you use a few relatively simple tricks: the exception-based code becomes much cleaner, which is what exceptions were designed to deliver.
It is often pointed out that, while exceptions do provide nicer-looking code in a lot of cases, it is very difficult to write exception-safe code in all cases. Some people claim that programmers then spend most of their time thinking about how to write exception-safe code and not about how to write their application. What these people forget is that if they decide not to use exceptions, they still need to do something about their program errors: how do you propagate errors up to the piece of code which can deal with them?
There are not many solutions:
Use error codes as function/method return values, propagate the error to the caller whenever needed: code is not so nice-looking and easy to read.
Use the good old evil setjmp/longjmp (it is even more evil in the context of C++ code than in the context of C code because the destructors of all the automatic variables on the stack will never be invoked).
Try to use a different exception framework: see the section called “References”, David Turner's paper.
I don't think that any of the above solutions provide a framework which might provide better-looking code than straight exception-based code.
A lot of other people dismiss exception-based code by complaining either about the performance hit or about their compiler's compliance with the C++ standard. There isn't much I can do about standard compliance: as far as I can tell, gcc runs on quite a lot of platforms, which means you should be able to get pretty good compliance on anything. EDG also sells its standard-compliant C++ frontend to a lot of companies, which means you should be able to buy a pretty good compiler for your platform unless it is very bizarre. If you can't, then, what can I say?
Measuring the performance of exception-based code is pretty difficult: the only interesting reference I could find is available in the section called “References”. I will probably try to do some measurements myself in another article.
The nested exception code shown in the first section can be flattened:
// Initial code
class Error {};
class Automatic {};
class Foo
{
public:
    void DoAction() throw(Error) { throw Error(); }
};
class Bar
{
public:
    void DoAction() throw(Error) { throw Error(); }
};
void f()
{
    Foo* foo = 0;
    try {
        Automatic a;
        foo = new Foo;
        foo->DoAction();
        Bar* bar = 0;
        try {
            bar = new Bar;
            bar->DoAction();
        } catch (...) {
            delete bar;
            throw;
        }
        delete foo;
        delete bar;
        // a's destructor invoked automatically after leaving this scope
    } catch (...) {
        delete foo;
        throw;
    }
}
int main (int argc, char *argv[])
{
    try {
        f ();
    } catch (Error &e) {
    }
}
And the flattened version which looks much nicer:
// Flattened version.
class Error {};
class Automatic {};
class Foo
{
public:
    void DoAction() throw(Error) { throw Error(); }
};
class Bar
{
public:
    void DoAction() throw(Error) { throw Error(); }
};
void f (void)
{
    Foo *foo = 0;
    Bar *bar = 0;
    try {
        Automatic a;
        foo = new Foo ();
        foo->DoAction ();
        bar = new Bar ();
        bar->DoAction ();
        delete foo;
        delete bar;
        // a's destructor invoked automatically after this
    } catch (...) {
        delete foo;
        delete bar;
        throw;
    }
}
int main (int argc, char *argv[])
{
    try {
        f ();
    } catch (Error &e) {
    }
}
The code above works but the exact reason why it works is a bit complicated, which is why I deferred this discussion until now. The first thing to know is that the delete operator behaves nicely with regard to null pointers. That is, it is not an error to invoke delete on the null pointer. The second interesting point about this code is that the new Bar () and new Foo () expressions cannot half succeed. They either complete successfully, in which case the value of the expression is assigned to the left-hand side of the assignment, or they fail completely, in which case the left-hand side of the assignment is not assigned.
The left-hand side of the assignment is never assigned to if an exception is thrown because throwing the exception makes it impossible either for operator new to return or for the constructor to return (if the constructor throws, the runtime also releases the memory obtained from operator new). In both cases the code which assigns the value of the expression is never given a chance to run.
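This behavior can be verified with a few lines of code. Fragile and pointer_left_untouched are hypothetical names used for this sketch only. The constructor always fails, and the function checks that the pointer still holds its initial null value afterwards.

```cpp
#include <cassert>

struct Error {};

// A class whose constructor fails on demand.
class Fragile {
public:
    explicit Fragile(bool fail) { if (fail) throw Error(); }
};

// Returns true when p still holds its initial null value
// after a failed new-expression.
bool pointer_left_untouched() {
    Fragile* p = 0;
    try {
        p = new Fragile(true);   // constructor throws: the assignment never happens
    } catch (const Error&) {
        // the storage obtained from operator new has already been released
    }
    bool untouched = (p == 0);
    delete p;                    // deleting a null pointer does nothing
    return untouched;
}
```

pointer_left_untouched() returns true, which is exactly what makes the delete statements in the catch block of the flattened version safe.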
Some readers might find the above discussion obvious, but I actually fell into this trap myself and it took me time to be convinced that this code could work, so I thought I should share my surprise :)
As already mentioned in the introduction, there are numerous sources of information on exceptions online. I have found a lot of them completely useless. Here is my list of what proved to be really enlightening:
http://www.octopull.demon.co.uk/c++/dragons/: Here be Dragons, by Alan Griffiths.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndeepc/html/deep01202000.asp: Handling Exceptions, Part 14, by Robert Schmidt, January 20, 2000.
C++ exception handling for IA64, by Christophe de Dinechin. This paper describes the implementation now used in gcc.
http://www.freetype.org/david/reliable-c.html: Robust Design Techniques for C Programs, by David Turner.
http://www.kuzbass.ru:8086/docs/isocpp/: HTML version of the final draft of the C++ standard.