More Books
Effective C++ 55 Specific Ways to Improve Your Programs and Designs
Effective C++ Third Edition 55 Specific Ways to Improve Your Programs and Designs
Table of Contents
Copyright
Praise for Effective C++, Third Edition
Addison-Wesley Professional Computing Series
Preface
Acknowledgments
Introduction
Terminology
Chapter 1. Accustoming Yourself to C++
Item 1: View C++ as a federation of languages
Item 2: Prefer consts, enums, and inlines to #defines
Item 3: Use const whenever possible
Item 4: Make sure that objects are initialized before they're used
Chapter 2. Constructors, Destructors, and Assignment Operators
Item 5: Know what functions C++ silently writes and calls
Item 6: Explicitly disallow the use of compiler-generated functions you do not want
Item 7: Declare destructors virtual in polymorphic base classes
Item 8: Prevent exceptions from leaving destructors
Item 9: Never call virtual functions during construction or destruction
Item 10: Have assignment operators return a reference to *this
Item 11: Handle assignment to self in operator=
Item 12: Copy all parts of an object
Chapter 3. Resource Management
Item 13: Use objects to manage resources.
Item 14: Think carefully about copying behavior in resource-managing classes.
Item 15: Provide access to raw resources in resource-managing classes.
Item 16: Use the same form in corresponding uses of new and delete.
Item 17: Store newed objects in smart pointers in standalone statements.
Chapter 4. Designs and Declarations
Item 18: Make interfaces easy to use correctly and hard to use incorrectly
Item 19: Treat class design as type design
Item 20: Prefer pass-by-reference-to-const to pass-by-value
Item 21: Don't try to return a reference when you must return an object
Item 22: Declare data members private
Item 23: Prefer non-member non-friend functions to member functions
Item 24: Declare non-member functions when type conversions should apply to all parameters
Item 25: Consider support for a non-throwing swap
Chapter 5. Implementations
Item 26: Postpone variable definitions as long as possible.
Item 27: Minimize casting.
Item 28: Avoid returning "handles" to object internals.
Item29: Strive for exception-safe code.
Item 30: Understand the ins and outs of inlining.
Item31: Minimize compilation dependencies between files.
Chapter 6. Inheritance and Object-Oriented Design
Item 32: Make sure public inheritance models "is-a."
Item 33: Avoid hiding inherited names
Item 34: Differentiate between inheritance of interface and inheritance of implementation
Item 35: Consider alternatives to virtual functions
Item 36: Never redefine an inherited non-virtual function
Item 37: Never redefine a function's inherited default parameter value
Item 38: Model "has-a" or "is-implemented-in-terms-of" through composition
Item 39: Use private inheritance judiciously
Item 40: Use multiple inheritance judiciously
Chapter 7. Templates and Generic Programming
Item 41: Understand implicit interfaces and compile-time polymorphism
Item 42: Understand the two meanings of typename
Item 43: Know how to access names in templatized base classes
Item 44: Factor parameter-independent code out of templates
Item 45: Use member function templates to accept "all compatible types."
Item 46: Define non-member functions inside templates when type conversions are desired
Item 47: Use traits classes for information about types
Item 48: Be aware of template metaprogramming
Chapter 8. Customizing new and delete
Item 49: Understand the behavior of the new-handler
Item 50: Understand when it makes sense to replace new and delete
Item 51: Adhere to convention when writing new and delete
Item 52: Write placement delete if you write placement new
Chapter 9. Miscellany
Item 53: Pay attention to compiler warnings.
Item 54: Familiarize yourself with the standard library, including TR1
Item.55: Familiarize yourself with Boost.
Appendix A. Beyond Effective C++
Appendix B. Item Mappings Between Second and Third Editions
Index
index_SYMBOL
index_A
index_B
index_C
index_D
index_E
index_F
index_G
index_H
index_I
index_J
index_K
index_L
index_M
index_N
index_O
index_P
index_R
index_S
index_T
index_U
index_V
index_W
index_X
index_Z

Item 44: Factor parameter-independent code out of templates

Templates are a wonderful way to save time and avoid code replication. Instead of typing 20 similar classes, each with 15 member functions, you type one class template, and you let compilers instantiate the 20 specific classes and 300 functions you need. (Member functions of class templates are implicitly instantiated only when used, so you should get the full 300 member functions only if each is actually used.) Function templates are similarly appealing. Instead of writing many functions, you write one function template and let the compilers do the rest. Ain't technology grand?

Yes, well...sometimes. If you're not careful, using templates can lead to code bloat: binaries with replicated (or almost replicated) code, data, or both. The result can be source code that looks fit and trim, yet object code that's fat and flabby. Fat and flabby is rarely fashionable, so you need to know how to avoid such binary bombast.

Your primary tool has the imposing name commonality and variability analysis, but there's nothing imposing about the idea. Even if you've never written a template in your life, you do such analysis all the time.

When you're writing a function and you realize that some part of the function's implementation is essentially the same as another function's implementation, do you just replicate the code? Of course not. You factor the common code out of the two functions, put it into a third function, and have both of the other functions call the new one. That is, you analyze the two functions to find the parts that are common and those that vary, you move the common parts into a new function, and you keep the varying parts in the original functions. Similarly, if you're writing a class and you realize that some parts of the class are the same as parts of another class, you don't replicate the common parts. Instead, you move the common parts to a new class, then you use inheritance or composition (see Items 32, 38, and 39) to give the original classes access to the common features. The parts of the original classes that differ — the varying parts — remain in their original locations.

When writing templates, you do the same analysis, and you avoid replication in the same ways, but there's a twist. In non-template code, replication is explicit: you can see that there's duplication between two functions or two classes. In template code, replication is implicit: there's only one copy of the template source code, so you have to train yourself to sense the replication that may take place when a template is instantiated multiple times.

For example, suppose you'd like to write a template for fixed-size square matrices that, among other things, support matrix inversion.


template<typename T,           // template for n x n matrices of

         std::size_t n>        // objects of type T; see below for info

class SquareMatrix {           // on the size_t parameter

public:

  ...

  void invert();              // invert the matrix in place

};


This template takes a type parameter, T, but it also takes a parameter of type size_t — a non-type parameter. Non-type parameters are less common than type parameters, but they're completely legal, and, as in this example, they can be quite natural.

Now consider this code:


SquareMatrix<double, 5> sm1;

...

sm1.invert();                  // call SquareMatrix<double, 5>::invert



SquareMatrix<double, 10> sm2;

...

sm2.invert();                  // call SquareMatrix<double, 10>::invert


Two copies of invert will be instantiated here. The functions won't be identical, because one will work on 5x5 matrices and one will work on 10 x 10 matrices, but other than the constants 5 and 10, the two functions will be the same. This is a classic way for template-induced code bloat to arise.

What would you do if you saw two functions that were character-for-character identical except for the use of 5 in one version and 10 in the other? Your instinct would be to create a version of the function that took a value as a parameter, then call the parameterized function with 5 or 10 instead of replicating the code. Your instinct serves you well! Here's a first pass at doing that for SquareMatrix:


template<typename T>                   // size-independent base class for

class SquareMatrixBase {               // square matrices

protected:

  ...

  void invert(std::size_t matrixSize); // invert matrix of the given size

  ...

};



template<          typename T, std::size_t n>

class SquareMatrix: private SquareMatrixBase<T> {

private:

  using SquareMatrixBase<T>::invert;   // avoid hiding base version of

                                       // invert; see Item 33

public:

  ...

  void invert() { this->invert(n); }   // make inline call to base class

};                                     // version of invert; see below

                                       // for why "this->" is here


As you can see, the parameterized version of invert is in a base class, SquareMatrixBase. Like SquareMatrix, SquareMatrixBase is a template, but unlike SquareMatrix, it's templatized only on the type of objects in the matrix, not on the size of the matrix. Hence, all matrices holding a given type of object will share a single SquareMatrixBase class. They will thus share a single copy of that class's version of invert.

SquareMatrixBase::invert is intended only to be a way for derived classes to avoid code replication, so it's protected instead of being public. The additional cost of calling it should be zero, because derived classes' inverts call the base class version using inline functions. (The inline is implicit — see Item 30.) These functions use the "this->" notation, because otherwise, as Item 43 explains, function names in templatized base classes (such as SquareMatrixBase<T>) are hidden from derived classes. Notice also that the inheritance between SquareMatrix and SquareMatrixBase is private. This accurately reflects the fact that the reason for the base class is only to facilitate the derived classes' implementations, not to express a conceptual is-a relationship between SquareMatrix and SquareMatrixBase. (For information on private inheritance, see Item 39.)

So far, so good, but there's a sticky issue we haven't addressed yet. How does SquareMatrixBase::invert know what data to operate on? It knows the size of the matrix from its parameter, but how does it know where the data for a particular matrix is? Presumably only the derived class knows that. How does the derived class communicate that to the base class so that the base class can do the inversion?

One possibility would be to add another parameter to SquareMatrixBase::invert, perhaps a pointer to the beginning of a chunk of memory with the matrix's data in it. That would work, but in all likelihood, invert is not the only function in SquareMatrix that can be written in a size-independent manner and moved into SquareMatrixBase. If there are several such functions, all will need a way to find the memory holding the values in the matrix. We could add an extra parameter to all of them, but we'd be telling SquareMatrixBase the same information repeatedly. That seems wrong.

An alternative is to have SquareMatrixBase store a pointer to the memory for the matrix values. And as long as it's storing that, it might as well store the matrix size, too. The resulting design looks like this:


template<typename T>

class SquareMatrixBase {

protected:

  SquareMatrixBase(std::size_t n, T *pMem)     // store matrix size and a

  : size(n), pData(pMem) {}                    // ptr to matrix values



  void setDataPtr(T *ptr) { pData = ptr; }     // reassign pData

  ...



private:

  std::size_t size;                            // size of matrix



  T *pData;                                    // pointer to matrix values

};


This lets derived classes decide how to allocate the memory. Some implementations might decide to store the matrix data right inside the SquareMatrix object:


template<typename T, std::size_t n>

class SquareMatrix: private SquareMatrixBase<T> {

public:

  SquareMatrix()                             // send matrix size and

  : SquareMatrixBase<T>(n, data) {}          // data ptr to base class

  ...



private:

  T data[n*n];

};


Objects of such types have no need for dynamic memory allocation, but the objects themselves could be very large. An alternative would be to put the data for each matrix on the heap:


template<typename T, std::size_t n>

class SquareMatrix: private SquareMatrixBase<T> {

public:

  SquareMatrix()                          // set base class data ptr to null,

  : SquareMatrixBase<T>(n, 0),            // allocate memory for matrix

    pData(new T[n*n])                     // values, save a ptr to the

  { this->setDataPtr(pData.get()); }      // memory, and give a copy of it

  ...                                     // to the base class



private:

  boost::scoped_array<T> pData;           // see Item 13 for info on

};                                        // boost::scoped_array


Regardless of where the data is stored, the key result from a bloat point of view is that now many — maybe all — of SquareMatrix's member functions can be simple inline calls to base class versions that are shared with all other matrices holding the same type of data, regardless of their size. At the same time, SquareMatrix objects of different sizes are distinct types, so even though, e.g., SquareMatrix<double, 5> and SquareMatrix<double, 10> objects use the same member functions in SquareMatrixBase<double>, there's no chance of passing a SquareMatrix<double, 5> object to a function expecting a SquareMatrix<double, 10>. Nice, no?

Nice, yes, but not free. The versions of invert with the matrix sizes hardwired into them are likely to generate better code than the shared version where the size is passed as a function parameter or is stored in the object. For example, in the size-specific versions, the sizes would be compile-time constants, hence eligible for such optimizations as constant propagation, including their being folded into the generated instructions as immediate operands. That can't be done in the size-independent version.

On the other hand, having only one version of invert for multiple matrix sizes decreases the size of the executable, and that could reduce the program's working set size and improve locality of reference in the instruction cache. Those things could make the program run faster, more than compensating for any lost optimizations in size-specific versions of invert. Which effect would dominate? The only way to know is to try it both ways and observe the behavior on your particular platform and on representative data sets.

Another efficiency consideration concerns the sizes of objects. If you're not careful, moving size-independent versions of functions up into a base class can increase the overall size of each object. For example, in the code I just showed, each SquareMatrix object has a pointer to its data in the SquareMatrixBase class, even though each derived class already has a way to get to the data. This increases the size of each SquareMatrix object by at least the size of a pointer. It's possible to modify the design so that these pointers are unnecessary, but, again, there are trade-offs. For example, having the base class store a protected pointer to the matrix data leads to the loss of encapsulation described in Item 22. It can also lead to resource management complications: if the base class stores a pointer to the matrix data, but that data may have been either dynamically allocated or physically stored inside the derived class object (as we saw), how will it be determined whether the pointer should be deleted? Such questions have answers, but the more sophisticated you try to be about them, the more complicated things become. At some point, a little code replication begins to look like a mercy.

This Item has discussed only bloat due to non-type template parameters, but type parameters can lead to bloat, too. For example, on many platforms, int and long have the same binary representation, so the member functions for, say, vector<int> and vector<long> would likely be identical — the very definition of bloat. Some linkers will merge identical function implementations, but some will not, and that means that some templates instantiated on both int and long could cause code bloat in some environments. Similarly, on most platforms, all pointer types have the same binary representation, so templates holding pointer types (e.g., list<int*>, list<const int*>, list<SquareMatrix<long, 3>*>, etc.) should often be able to use a single underlying implementation for each member function. Typically, this means implementing member functions that work with strongly typed pointers (i.e., T* pointers) by having them call functions that work with untyped pointers (i.e., void* pointers). Some implementations of the standard C++ library do this for templates like vector, deque, and list. If you're concerned about code bloat arising in your templates, you'll probably want to develop templates that do the same thing.

Things to Remember

  • Templates generate multiple classes and multiple functions, so any template code not dependent on a template parameter causes bloat.

  • Bloat due to non-type template parameters can often be eliminated by replacing template parameters with function parameters or class data members.

  • Bloat due to type parameters can be reduced by sharing implementations for instantiation types with identical binary representations.