C++ Gotchas: Avoiding Common Problems in Coding and Design

Gotcha #78: Failure to Grok Virtual Functions and Overriding

Many novice C++ programmers have only a superficial understanding of how overriding is implemented in C++, and a look at the mechanics often helps to clarify things. There are a number of different effective mechanisms for implementing virtual functions and overriding in C++; the treatment below describes one common approach.

Let's look first at a simple implementation for single inheritance.

class B { 
 public:
   virtual int f1();
   virtual void f2( int );
   virtual int f3( int );
};

In this implementation of virtual functions, each virtual function contained within a class is assigned an index by the compiler. For example, B::f1 is assigned index 0, B::f2 is assigned index 1, and so on. These indexes are used to access a table of pointers to functions. The table element at index 0 contains the address of B::f1, the element at index 1 contains the address of B::f2, and so on. Each object of the class contains a pointer, inserted implicitly by the compiler, to the table of function pointers. An object of type B might be laid out as in Figure 7-5.

Figure 7-5. A simple implementation of virtual functions under single inheritance


Colloquially, the table of function pointers is called the "vtbl," pronounced "vee table," and the pointer to the vtbl is called the "vptr," pronounced "vee pointer." The constructors for class B initialize the vptr to refer to the appropriate vtbl (see Gotcha #75). Calling a virtual function involves indirection through the vtbl. The function call

B *bp = new B; 
bp->f3(12);

is translated something like this:

(*(bp->vptr)[2])(bp, 12) 

We get the address of the function to call by indexing the vtbl with that function's index. We then make an indirect call, passing the address of the object as the implicit "this" argument to the function. The virtual function mechanism in C++ is efficient. The indirect function call is generally highly optimized for each hardware architecture, and all objects of the same type typically share a single vtbl. Under single inheritance, each object has a single vptr, no matter how many virtual functions are declared in the class.
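The mechanism just described can be imitated by hand, which sometimes makes it easier to see. The following is only a sketch of the idea, not what any particular compiler emits. All of the names (Obj, Slot, B_vtbl, make_B, call_f3) are invented for illustration, and every slot is given the same signature purely to keep the table simple; a real vtbl holds functions of differing types.

```cpp
#include <cassert>

// Hand-written imitation of the layout described above. Every name here is
// hypothetical; a real compiler generates the equivalent machinery invisibly.
struct Obj;                                // stands in for class B

// Simplifying assumption: all slots share one signature.
using Slot = int (*)(Obj *self, int arg);

struct Obj {
    const Slot *vptr;                      // the implicitly inserted vptr
    int data;                              // an ordinary data member
};

// "Member functions" receive the object address as an explicit first argument.
int B_f1(Obj *self, int)   { return self->data; }
int B_f2(Obj *self, int x) { self->data = x; return 0; }
int B_f3(Obj *self, int x) { return self->data + x; }

// One table per class, shared by every object of that class.
const Slot B_vtbl[] = { B_f1, B_f2, B_f3 };   // indexes 0, 1, 2

// What a constructor does: point the vptr at the class's table.
Obj make_B(int data) { return Obj{ B_vtbl, data }; }

// What bp->f3(arg) is translated to: index the table, call indirectly,
// and pass the object address as the "this" argument.
int call_f3(Obj *bp, int arg) { return (*bp->vptr[2])(bp, arg); }
```

Calling call_f3 on an object built by make_B(30) with the argument 12 reaches B_f3 through slot 2 and yields 42. Note that the object carries one pointer no matter how many slots the table has.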

Let's look at the implementation of a derived class that overrides some of its base class's virtual functions:

class B { 
 public:
   virtual int f1();
   virtual void f2( int );
   virtual int f3( int );
};
class D : public B {
   int f1();
   virtual void f4();
   int f3( int );
};

An object of type D contains a subobject of type B. Typically, but not universally (see Gotcha #70), the base class subobject is located at the start of the derived class object (that is, at offset 0), and any additional derived class data members are appended after the base class part, as in Figure 7-6.

Figure 7-6. A simple implementation of virtual functions under single inheritance for a derived class object. The base class subobject still contains a vptr, but it refers to a table customized for the derived class.


Let's look at the same virtual member function call we saw earlier, but this time we'll use a D object rather than a B object:

B *bp = new D; 
bp->f3(12);

The compiler will generate the same calling sequence, but this time we'll bind at runtime to the function D::f3 rather than B::f3:

(*(bp->vptr)[2])(bp, 12) 

The utility of the virtual function mechanism is more obvious in truly polymorphic code, where the precise type of object being manipulated is unknown:

B *bp = getSomeSortOfB(); 
bp->f3(12);

The virtual calling sequence generated by the compiler is capable of calling, without recompilation, the f3 function of any class derived from B, even of classes that do not yet exist.

Mechanically speaking, overriding is the process of replacing the address of a base class member function with the address of a derived class member function when constructing a virtual function table for a derived class. In our example above, class D has overridden the base class virtual functions f1 and f3, inherited the implementation of f2, and added a new virtual function f4. This is reflected precisely in the structure of the virtual table for class D.
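The effect is easy to observe in ordinary C++. The sketch below reuses the B and D classes from above, but the function bodies and the helper functions call_f1 and call_f3 are invented for illustration, D's members are made public for convenience, and the override specifier assumes C++11 or later.

```cpp
#include <cassert>

class B {
 public:
    virtual int f1() { return 1; }
    virtual void f2(int) {}
    virtual int f3(int x) { return x; }
};

class D : public B {
 public:
    int f1() override { return 10; }           // replaces B::f1 in D's table
    virtual void f4() {}                       // new virtual function
    int f3(int x) override { return x * 2; }   // replaces B::f3 in D's table
};

// Neither helper knows the dynamic type of the object it is handed; the
// same compiled calling sequence serves B, D, and classes not yet written.
int call_f1(B *bp) { return bp->f1(); }
int call_f3(B *bp, int arg) { return bp->f3(arg); }
```

Passing the address of a B object to call_f3 with 12 returns 12, while passing the address of a D object returns 24, even though call_f3 was compiled knowing only about B.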

The mechanics of virtual functions under multiple inheritance are more complex in their details but employ essentially the same approach. The additional complexity is the result of a single object's having more than one base class subobject and therefore more than one valid address. Consider the following hierarchy:

class B1 { /* . . . */ }; 
class B2 { /* . . . */ };
class D : public B1, public B2 { /* . . . */ };

A derived class object can be manipulated through the interface of any of its public base classes; this is the meaning of the is-a relationship. Therefore, an object of type D can be referred to through pointers or references to D, B1, or B2:

D *dp = new D; 
B1 *b1p = dp;
B2 *b2p = dp;

Only one base class subobject can be located at offset 0 in a derived class object, so base class subobjects are typically allocated in the order in which they appear on the base class list in the derived class definition. In the case of D, the storage for B1 will come first, followed by that for B2, as in Figure 7-7 (see Gotcha #38).

Figure 7-7. Likely layout of an object under multiple inheritance
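A short experiment shows that one object really can have more than one valid address. This is a sketch under stated assumptions: the data members a, b, and c and both helper functions are invented, and the exact layout remains up to the implementation. The checks below rely only on two facts: the nonempty B1 and B2 subobjects cannot share an address, and converting to a base pointer and back recovers the original address.

```cpp
#include <cassert>

struct B1 { int a; };
struct B2 { int b; };
struct D : B1, B2 { int c; };

// True if the two base class subobjects of d live at different addresses.
bool bases_have_distinct_addresses(D &d) {
    B1 *b1p = &d;      // implicit derived-to-base conversion
    B2 *b2p = &d;      // the compiler adjusts the address where necessary
    return static_cast<void *>(b1p) != static_cast<void *>(b2p);
}

// True if converting to B2* and back again recovers the address of d.
bool round_trip_through_B2(D &d) {
    B2 *b2p = &d;
    return static_cast<D *>(b2p) == &d;
}
```

Both functions return true for any D object. The conversions look like simple pointer copies in the source, but at least one of them involves adding an offset.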


Let's flesh out this simple multiple-inheritance hierarchy with some virtual functions:

class B1 { 
 public:
   virtual void f1();
   virtual void f2();
};
class B2 {
 public:
   virtual void f2();
   virtual void f3( int );
   virtual void f4();
};

The B1 and B2 classes each have virtual functions, so objects of these types will each contain a vptr to a class-specific vtbl, as in Figure 7-8.

Figure 7-8. Two potential base classes


A D object is-a B1 and is-a B2, so it will have two vptrs and two associated vtbls (see Figure 7-9):

class D : public B1, public B2 { 
 public:
   void f2();
   void f3( int );
   virtual void f5();
};

Figure 7-9. Possible implementation of virtual functions under multiple inheritance. The complete object overrides virtual functions for both of its base class subobjects.


Notice that D::f2 overrides the f2 in both of its base classes. An overriding derived class function will override every base class virtual function with the same name and signature (number and type of formal arguments), whether the base class is a direct base class or a base class of a base class (of a base class …). Note that even though D adds a new virtual function (D::f5), the compiler doesn't insert a vptr into the D-specific part of the object. Typically, new derived class virtual functions will be appended to one of the base class virtual function tables.
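The double override can be checked directly. In the sketch below the hierarchy is the one above, except that the functions return int (an assumption made only so that the result of each call is observable), the return values are arbitrary, and the helpers f2_via_B1 and f2_via_B2 are invented. The override specifier assumes C++11 or later.

```cpp
#include <cassert>

class B1 {
 public:
    virtual int f1() { return 11; }
    virtual int f2() { return 12; }
};
class B2 {
 public:
    virtual int f2() { return 22; }
    virtual int f3(int x) { return x; }
    virtual int f4() { return 24; }
};
class D : public B1, public B2 {
 public:
    int f2() override { return 99; }        // overrides B1::f2 and B2::f2
    int f3(int x) override { return -x; }   // overrides B2::f3
    virtual int f5() { return 5; }          // new virtual function
};

// Each helper sees the object only through one base class interface.
int f2_via_B1(B1 &b) { return b.f2(); }
int f2_via_B2(B2 &b) { return b.f2(); }
```

Given a D object, both helpers return 99, because the single function D::f2 has been entered into the tables used by both base class subobjects. Given a plain B2 object, f2_via_B2 still returns 22.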

We do have a problem, though. Let's look at some possible code:

B2 *b2p = new D; 
b2p->f3(12);

We're going to engage in the common practice of manipulating a derived class object through one of its base class interfaces. However, if we generate the same calling sequence we did under the single-inheritance model we examined earlier, we'll wind up with a bad value for the this pointer:

(*(b2p->vptr)[1])(b2p,12) 

The reason is that the call is dynamically bound to D::f3, which is expecting an implicit this argument that refers to the start of a D object. Unfortunately, b2p refers to the start of a B2 (sub)object, which is offset some number of bytes into the D object in which it's embedded. (Refer to Figure 7-7.) It's necessary to "fix up" the value of this passed in the call by adjusting the value of b2p to refer to the start of the D object.
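That the fix-up really takes place can be seen by having the overriding function report the value of this that it receives. The following is a sketch with invented details: f3 is changed to return a const void * for the purpose, the data members b1_data and b2_data exist only to give the subobjects some size, and this_seen_by_f3 is a hypothetical helper. Default member initializers and override assume C++11 or later.

```cpp
#include <cassert>

class B1 {
 public:
    virtual void f1() {}
    int b1_data = 1;
};
class B2 {
 public:
    // Report the address this function was called with.
    virtual const void *f3(int) { return this; }
    int b2_data = 2;
};
class D : public B1, public B2 {
 public:
    // Here "this" has type D*, so the value reported is the address
    // of the complete D object, not of its B2 subobject.
    const void *f3(int) override { return this; }
};

// Calls f3 through the B2 interface and returns the "this" the callee saw.
const void *this_seen_by_f3(B2 *b2p) { return b2p->f3(12); }
```

For a D object d, this_seen_by_f3(&d) returns the address of d itself. On typical implementations that address differs from the B2 pointer used to make the call, which is exactly the adjustment described above.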

Fortunately, when it's constructing the vtbl for a derived class, the compiler knows precisely what these fix-up values are, since it knows precisely the class for which it's constructing the vtbl and the offsets of the various base class subobjects within the derived class. There are several common ways to apply this fix-up information, from small sections of code (misnamed "thunks") executed before the actual function is reached, to member functions with multiple entry points. Conceptually, the cleanest way to represent the operation is simply to record the required offset value in the vtbl and modify the calling sequence to take the offset into account, as in Figure 7-10.

Figure 7-10. One of many possible implementations of virtual functions under multiple inheritance. This implementation records the fix-up values for the this pointer in the virtual function table itself.


The vtbl entries are now small structures containing the member function address (fptr) and an offset (delta) to add to the this value, and the calling sequence becomes

(*(b2p->vptr)[1].fptr)(b2p+(b2p->vptr)[1].delta,12) 

This code can be heavily optimized, so it's not as expensive as it might look.
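The table-with-deltas scheme can also be imitated by hand. What follows is a sketch only, with every name (Entry, B1Part, B2Part, DObj, and the functions) invented for illustration. It treats delta as a byte count and therefore performs the adjustment on a char pointer, and it leans on the familiar idiom of stepping back from a member's address to the start of its enclosing object, which works on mainstream implementations but is not what a compiler literally emits.

```cpp
#include <cassert>
#include <cstddef>

// One vtbl slot: a function address plus the adjustment to apply to "this".
struct Entry {
    int (*fptr)(void *self, int arg);
    std::ptrdiff_t delta;               // in bytes
};

struct B1Part { int b1_data; };                       // stands in for B1
struct B2Part { const Entry *vptr; int b2_data; };    // stands in for B2

struct DObj {                           // stands in for a D object
    B1Part b1;                          // first base class subobject
    B2Part b2;                          // second one, at a nonzero offset
    int d_data;
};

// The B2 versions expect "this" to address a B2 (sub)object.
int B2_f2(void *self, int)     { return static_cast<B2Part *>(self)->b2_data; }
int B2_f3(void *self, int arg) { return static_cast<B2Part *>(self)->b2_data + arg; }

// D's f3 expects "this" to address the start of the complete D object.
int D_f3(void *self, int arg)  { return static_cast<DObj *>(self)->d_data + arg; }

// Table for a stand-alone B2: no adjustment needed.
const Entry B2_vtbl[] = { { B2_f2, 0 }, { B2_f3, 0 } };

// Table for the B2 part of a D: f3 is overridden, so "this" must be moved
// back from the B2 subobject to the start of the enclosing D object.
const Entry D_as_B2_vtbl[] = {
    { B2_f2, 0 },
    { D_f3, -static_cast<std::ptrdiff_t>(offsetof(DObj, b2)) }
};

B2Part make_B2(int b2_data) { return B2Part{ B2_vtbl, b2_data }; }
DObj make_D(int b2_data, int d_data) {
    return DObj{ { 0 }, { D_as_B2_vtbl, b2_data }, d_data };
}

// The modified calling sequence: fetch slot 1, adjust "this", call.
int call_f3(B2Part *b2p, int arg) {
    const Entry &e = b2p->vptr[1];
    void *adjusted = reinterpret_cast<char *>(b2p) + e.delta;
    return (*e.fptr)(adjusted, arg);
}
```

The caller passes the same kind of pointer in both cases and never needs to know which it has: for the B2 part of make_D(2, 30), call_f3 with 12 reaches D_f3 with an adjusted pointer and yields 42, while for make_B2(5) it reaches B2_f3 with a delta of zero and yields 17.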