480 Thinking in C++ www.BruceEckel.com
defined types by value during function calls. It’s so important, in
fact, that the compiler will automatically synthesize a copy-
constructor if you don’t provide one yourself, as you will see.
Passing & returning by value
To understand the need for the copy-constructor, consider the way
C handles passing and returning variables by value during function
calls. If you declare a function and make a function call,
int f(int x, char c);
int g = f(a, b);
how does the compiler know how to pass and return those
variables? It just knows! The range of the types it must deal with is
so small (char, int, float, double, and their variations) that this
information is built into the compiler.
If you figure out how to generate assembly code with your
compiler and determine the statements generated by the function
call to f( ), you’ll get the equivalent of:
push b
push a
call f()
The compiler simply pushes copies on the stack; it knows how big
they are and that pushing those arguments makes accurate copies
of them.
The return value of f( ) is placed in a register. Again, the compiler
knows everything there is to know about the return value type
because that type is built into the language, so the compiler can
return it by placing it in a register. With the primitive data types in
C, the simple act of copying the bits of the value is equivalent to
copying the object.
Passing & returning large objects
But now consider user-defined types. If you create a class and you
want to pass an object of that class by value, how is the compiler
supposed to know what to do? This is not a type built into the
compiler; it’s a type you have created.
To investigate this, you can start with a simple structure that is
clearly too large to return in registers:
//: C11:PassingBigStructures.cpp
struct Big {
  char buf[100];
  int i;
  long d;
} B, B2;

Big bigfun(Big b) {
  b.i = 100; // Do something to the argument
  return b;
}

int main() {
  B2 = bigfun(B); // Pass B by value; copy the result into B2
} ///:~
To comprehend what’s going on here, you need to understand the constraints on the
compiler when it’s making a function call.
Function-call stack frame
When the compiler generates code for a function call, it first pushes
all the arguments on the stack, then makes the call. Inside the
function, code is generated to move the stack pointer down even
farther to provide storage for the function’s local variables.
(“Down” is relative here; your machine may increment or
decrement the stack pointer during a push.) But during the
assembly-language CALL, the CPU pushes the address in the
program code where the function call came from, so the
assembly-language RETURN can use that address to return to the calling
point. This address is of course sacred, because without it your
program will get completely lost. Here’s what the stack frame looks
like after the CALL and the allocation of local variable storage in
the function:
Function arguments
Return address
Local variables
11: References & the Copy-Constructor 483
The code generated for the rest of the function expects the memory
to be laid out exactly this way, so that it can carefully pick from the
function arguments and local variables without touching the return
address. I shall call this block of memory, which is everything used
by a function in the process of the function call, the function frame.
become vulnerable at that moment because an interrupt could
come along. The ISR would move the stack pointer down to hold
its return address and its local variables and overwrite your return
value.
To solve this problem, the caller could be responsible for allocating
the extra storage on the stack for the return values before calling
the function. However, C was not designed this way, and C++
must be compatible. As you’ll see shortly, the C++ compiler uses a
more efficient scheme.
Your next idea might be to return the value in some global data
area, but this doesn’t work either. Reentrancy means that any
function can be an interrupt routine for any other function,
including the same function you’re currently inside. Thus, if you put
the return value in a global area, you might return into the same
function, which would overwrite that return value. The same logic
applies to recursion.
The only safe place to return values is in the registers, so you’re
back to the problem of what to do when the registers aren’t large
enough to hold the return value. The answer is to push the address
of the return value’s destination on the stack as one of the function
arguments, and let the function copy the return information
directly into the destination. This not only solves all the problems,
it’s more efficient. It’s also the reason that, in
PassingBigStructures.cpp, the compiler pushes the address of B2
#include <fstream>
#include <string>
using namespace std;
ofstream out("HowMany.out");

class HowMany {
  static int objectCount;
public:
  HowMany() { objectCount++; }
  static void print(const string& msg = "") {
    if(msg.size() != 0) out << msg << ": ";
    out << "objectCount = "
        << objectCount << endl;
  }
  ~HowMany() {
    objectCount--;
    print("~HowMany()");
  }
};

int HowMany::objectCount = 0;

// Pass and return BY VALUE:
HowMany f(HowMany x) {
  x.print("x argument inside f()");
  return x;
}

int main() {
  HowMany h;
  HowMany::print("after construction of h");
  HowMany h2 = f(h);
  HowMany::print("after call to f()");
} ///:~
After h is created, the object count is one, which is fine. But after
the call to f( ) you would expect to have an object count of two,
because h2 is now in scope as well. Instead, the count is zero, which
indicates something has gone horribly wrong. This is confirmed by
the fact that the two destructors at the end make the object count go
negative, something that should never happen.
Look at the point inside f( ), which occurs after the argument is
passed by value. This means the original object h exists outside the
function frame, and there’s an additional object inside the function
frame, which is the copy that has been passed by value. However,
the argument has been passed using C’s primitive notion of
bitcopying, whereas the C++ HowMany class requires true
initialization to maintain its integrity, so the default bitcopy fails to
produce the desired effect.
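The effect can be reproduced in a few lines. The following is only a sketch under stated assumptions: the class Counted and the functions take( ) and countChangeAfterPassByValue( ) are hypothetical names, not part of the book’s listings, and only passing by value is exercised. Because no copy-constructor is written, the copy made for the argument never increments the count, yet its destructor still decrements it.

```cpp
#include <cassert>

// Hypothetical stand-in for HowMany: it counts objects but, like
// HowMany, has no user-written copy-constructor.
class Counted {
  static int count;
public:
  Counted() { ++count; }
  ~Counted() { --count; }
  static int live() { return count; }
};
int Counted::count = 0;

// The parameter is a copy that was made without running Counted().
void take(Counted) {}

// Returns how much the count changed across one pass-by-value call.
int countChangeAfterPassByValue() {
  Counted c;                     // counted by the constructor
  int before = Counted::live();
  take(c);                       // the copy's destructor decrements the count
  int after = Counted::live();
  assert(after == before - 1);   // one object "lost", although c is still alive
  return after - before;
}
```

Calling countChangeAfterPassByValue( ) from main( ) shows the count dropping by one across the call even though no counted object has gone away.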
When the local object goes out of scope at the end of the call to f( )
h2, a previously unconstructed object, is created from the return
value of f( ), so again a new object is created from an existing one.
The compiler’s assumption is that you want to perform this
creation using a bitcopy, and in many cases this may work fine, but
in HowMany it doesn’t fly because the meaning of initialization
goes beyond simply copying. Another common example occurs if
the class contains pointers – what do they point to, and should you
copy them or should they be connected to some new piece of
memory?
Fortunately, you can intervene in this process and prevent the
compiler from doing a bitcopy. You do this by defining your own
function to be used whenever the compiler needs to make a new
object from an existing object. Logically enough, you’re making a
new object, so this function is a constructor, and also logically
enough, the single argument to this constructor has to do with the
object you’re constructing from. But that object can’t be passed into
the constructor by value because you’re trying to define the function
that handles passing by value, and syntactically it doesn’t make
sense to pass a pointer because, after all, you’re creating the new
object from an existing object. Here, references come to the rescue,
so you take the reference of the source object. This function is called
the
  }
  ~HowMany2() {
    --objectCount;
    print("~HowMany2()");
  }
  // The copy-constructor:
  HowMany2(const HowMany2& h) : name(h.name) {
    name += " copy";
    ++objectCount;
    print("HowMany2(const HowMany2&)");
  }
  void print(const string& msg = "") const {
    if(msg.size() != 0)
      out << msg << endl;
    out << '\t' << name << ": "
        << "objectCount = "
        << objectCount << endl;
  }
};

int HowMany2::objectCount = 0;

// Pass and return BY VALUE:
HowMany2 f(HowMany2 x) {
  x.print("x argument inside f()");
  out << "Returning from f()" << endl;
  return x;
}
objectCount as before, and the destructor decrements it.
Next is the copy-constructor, HowMany2(const HowMany2&).
The copy-constructor can create a new object only from an existing
one, so the existing object’s name is copied to name, followed by
the word “copy” so you can see where it came from. If you look
closely, you’ll see that the call name(h.name) in the constructor
initializer list is actually calling the string copy-constructor.
Inside the copy-constructor, the object count is incremented just as
it is inside the normal constructor. This means you’ll now get an
accurate object count when passing and returning by value.
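To see the bookkeeping in isolation, here is a reduced sketch in the spirit of HowMany2. The names Tracked, id( ), live( ), and liveAfterRoundTrip( ) are assumptions made for this illustration, and the file output is left out. It checks only the object count, because the exact number of “ copy” suffixes depends on how many copies your compiler elides.

```cpp
#include <cassert>
#include <string>

// Hypothetical class in the spirit of HowMany2.
class Tracked {
  std::string name;
  static int count;
public:
  explicit Tracked(const std::string& id) : name(id) { ++count; }
  // The copy-constructor: copy the name, mark it, and count the new object.
  Tracked(const Tracked& other) : name(other.name + " copy") { ++count; }
  ~Tracked() { --count; }
  const std::string& id() const { return name; }
  static int live() { return count; }
};
int Tracked::count = 0;

// Pass and return by value; every copy goes through the copy-constructor.
Tracked f(Tracked x) { return x; }

// Returns how many more objects are alive once h and h2 both exist.
int liveAfterRoundTrip() {
  int before = Tracked::live();
  Tracked h("h");
  Tracked h2 = f(h);
  assert(h2.id() != h.id());       // h2's name carries the " copy" marker
  return Tracked::live() - before; // h and h2: the count is accurate
}
```

With the copy-constructor in place, every object that is destroyed was also counted when it was created, so the count after the call reflects exactly the two objects still in scope.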
The print( ) function has been modified to print out a message, the
object identifier, and the object count. It must now access the name
data of a particular object, so it can no longer be a static member
    "Adds line numbers to file");
  ifstream in(argv[1]);
  assure(in, argv[1]);
  string line;
  vector<string> lines;
  while(getline(in, line)) // Read in entire file
    lines.push_back(line);
  if(lines.size() == 0) return 0;
  int num = 0;
  // Number of lines in file determines width:
  const int width = int(log10(double(lines.size()))) + 1;
  for(size_t i = 0; i < lines.size(); i++) {
    cout.setf(ios::right, ios::adjustfield);
    cout.width(width);
    cout << ++num << ") " << lines[i] << endl;
  }
} ///:~
The entire file is read into a vector<string>, using the same code
that you’ve seen earlier in the book. When printing the line
numbers, we’d like all the lines to be aligned with each other, and
this requires adjusting for the number of lines in the file so that the
width allowed for the line numbers is consistent. We can easily
determine the number of lines using vector::size( ), but what we
3) Entering f()
4) HowMany2(const HowMany2&)
5) h copy: objectCount = 2
6) x argument inside f()
7) h copy: objectCount = 2
8) Returning from f()
9) HowMany2(const HowMany2&)
10) h copy copy: objectCount = 3
11) ~HowMany2()
12) h copy: objectCount = 2
13) h2 after call to f()
14) h copy copy: objectCount = 2
15) Call f(), no return value
16) HowMany2(const HowMany2&)
17) h copy: objectCount = 3
18) x argument inside f()
19) h copy: objectCount = 3
20) Returning from f()
21) HowMany2(const HowMany2&)
22) h copy copy: objectCount = 4
23) ~HowMany2()
24) h copy: objectCount = 3
25) ~HowMany2()
26) h copy copy: objectCount = 2
27) After call to f()
28) ~HowMany2()
29) h copy copy: objectCount = 1
30) ~HowMany2()
31) h: objectCount = 0
name becomes “h copy copy” for h2’s identifier because it’s being
copied from the copy that is the local object inside f( ). After the
object is returned, but before the function ends, the object count
becomes temporarily three, but then the local object “h copy” is
destroyed. After the call to f( ) completes in line 13, there are only
two objects, h and h2, and you can see that h2 did indeed end up as
“h copy copy.”
Temporary objects
Line 15 begins the call to f(h), this time ignoring the return value.
You can see in line 16 that the copy-constructor is called just as
before to pass the argument in. And also, as before, line 21 shows
the copy-constructor is called for the return value. But the
copy-constructor must have an address to work on as its destination (a
this
Here’s an example to show the more intelligent approach the
compiler takes. Suppose you create a new class composed of objects
of several existing classes. This is called, appropriately enough,
composition, and it’s one of the ways you can make new classes from
existing classes. Now take the role of a naive user who’s trying to
solve a problem quickly by creating a new class this way. You don’t
know about copy-constructors, so you don’t create one. The
example demonstrates what the compiler does while creating the
default copy-constructor for your new class:
//: C11:DefaultCopyConstructor.cpp
// Automatic creation of the copy-constructor
#include <iostream>
#include <string>
using namespace std;

class WithCC { // With copy-constructor
public:
  // Explicit default constructor required:
  WithCC() {}
  WithCC(const WithCC&) {
    cout << "WithCC(WithCC&)" << endl;
  }
};

class WoCC { // Without copy-constructor
  string id;
public:
  WoCC(const string& ident = "") : id(ident) {}
announces that it has been called, and this brings up an interesting
issue. In the class Composite, an object of WithCC is created using
a default constructor. If there were no constructors at all in
WithCC, the compiler would automatically create a default
constructor, which would do nothing in this case. However, if you
add a copy-constructor, you’ve told the compiler you’re going to
handle constructor creation, so it no longer creates a default
constructor for you and will complain unless you explicitly create a
default constructor as was done for WithCC.
The class WoCC has no copy-constructor, but its constructor will
store a message in an internal string that can be printed out using
print( ). This constructor is explicitly called in Composite’s
constructor initializer list (briefly introduced in Chapter 8 and
covered fully in Chapter 14). The reason for this becomes apparent
later.
and base classes. That is, if the member object also contains another
object, its copy-constructor is also called. So in this case, the
compiler calls the copy-constructor for WithCC. The output shows
this constructor being called. Because WoCC has no copy-constructor,
the compiler creates one for it that copies each member in turn (here,
by calling the string copy-constructor for id), and calls that inside the
Composite copy-constructor. The call to Composite::print( ) in main
shows that this happens because the contents of c2.wocc are identical
to the contents of c.wocc. The process the compiler goes through to
synthesize a copy-constructor is called memberwise initialization.
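Memberwise initialization can be observed directly with a small sketch. The types Loud, Quiet, and Holder below are hypothetical and only mirror the roles of WithCC, WoCC, and Composite; a counter replaces the printed message so the effect can be checked in code.

```cpp
#include <cassert>
#include <string>

// Hypothetical member type with an explicit copy-constructor that
// records how often it runs.
struct Loud {
  static int copies;
  Loud() {}
  Loud(const Loud&) { ++copies; }
};
int Loud::copies = 0;

// Hypothetical member type that leaves copying to the compiler.
struct Quiet {
  std::string id;
  explicit Quiet(const std::string& s = "") : id(s) {}
};

// No copy-constructor here either: the compiler synthesizes one that
// copy-constructs each member in turn.
struct Holder {
  Loud loud;
  Quiet quiet;
  Holder() : quiet("Holder()") {}
};

// True if copying a Holder ran Loud's copy-constructor exactly once
// and carried the string inside Quiet across.
bool memberwiseCopyWorks() {
  int before = Loud::copies;
  Holder a;
  Holder b = a; // synthesized copy-constructor
  assert(b.quiet.id == "Holder()");
  return Loud::copies == before + 1 && b.quiet.id == a.quiet.id;
}
```

Neither Holder nor Quiet declares a copy-constructor, yet the copy of a Holder still invokes the one written for Loud and duplicates the string held by Quiet.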
It’s always best to create your own copy-constructor instead of
letting the compiler do it for you. This guarantees that it will be
under your control.
Alternatives to copy-construction
At this point your head may be swimming, and you might be
wondering how you could have possibly written a working class
class NoCC {
  int i;
  NoCC(const NoCC&); // No definition
public:
  NoCC(int ii = 0) : i(ii) {}
};

void f(NoCC);

int main() {
  NoCC n;
//! f(n); // Error: copy-constructor called
//! NoCC n2 = n; // Error: c-c called
//! NoCC n3(n); // Error: c-c called
} ///:~
Notice the use of the more general form
NoCC(const NoCC&);
using the const.
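If your compiler supports C++11 (an assumption; the listing above does not need it), the effect of a private, undefined copy-constructor can be checked at compile time instead of by commenting out failing lines. The names NoCopy, Plain, and viaReference( ) in this sketch are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

class NoCopy {           // hypothetical name; same idea as NoCC
  int i;
  NoCopy(const NoCopy&); // private and never defined
public:
  NoCopy(int ii = 0) : i(ii) {}
  int value() const { return i; }
};

class Plain {            // ordinary class for comparison
  int i;
public:
  Plain(int ii = 0) : i(ii) {}
  int value() const { return i; }
};

// Outside code cannot copy a NoCopy, but it can copy a Plain.
static_assert(!std::is_copy_constructible<NoCopy>::value,
              "NoCopy must not be copyable from outside the class");
static_assert(std::is_copy_constructible<Plain>::value,
              "Plain keeps the synthesized copy-constructor");

// Passing by reference still works, because no copy is made.
int viaReference(const NoCopy& n) { return n.value(); }
```

A NoCopy object can still be handed to viaReference( ), since binding a reference never calls the copy-constructor.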
Functions that modify outside objects
Reference syntax is nicer to use than pointer syntax, yet it clouds
the meaning for the reader. For example, in the iostreams library
one overloaded version of the get( ) function takes a char& as an
argument, and the whole point of the function is to modify its
pointer-to-member follows this same concept, except that what it
selects is a location inside a class. The dilemma here is that a
pointer needs an address, but there is no “address” inside a class;
selecting a member of a class means offsetting into that class. You
can’t produce an actual address until you combine that offset with
the starting address of a particular object. The syntax of pointers to
members requires that you select an object at the same time you’re
dereferencing the pointer to member.
To understand this syntax, consider a simple structure, with a
pointer sp and an object so for this structure. You can select
members with the syntax shown:
//: C11:SimpleStructure.cpp
struct Simple { int a; };
int main() {
  Simple so, *sp = &so;
  sp->a;
  so.a;
} ///:~
Now suppose you have an ordinary pointer to an integer, ip. To
access what
in the definition. The only difference is that you must say what
class of objects this pointer-to-member is used with. Of course, this
is accomplished with the name of the class and the scope resolution
operator. Thus,
int ObjectClass::*pointerToMember;
defines a pointer-to-member variable called pointerToMember that
points to any int inside ObjectClass. You can also initialize the
pointer-to-member when you define it (or at any other time):
int ObjectClass::*pointerToMember = &ObjectClass::a;
There is actually no “address” of ObjectClass::a because you’re just
referring to the class and not an object of that class. Thus,
&ObjectClass::a can be used only as pointer-to-member syntax.
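As a small sketch of the syntax just described, the following uses a hypothetical struct Data and hypothetical helper functions pick( ) and sumViaPointers( ). The pointer-to-member selects a member of the class, and an address is produced only when it is combined with a particular object through .* or ->*.

```cpp
#include <cassert>

struct Data { int a, b, c; };  // hypothetical struct

// Reads whichever int member the pointer-to-member selects.
int pick(const Data& d, int Data::*pm) { return d.*pm; }

int sumViaPointers() {
  Data d = {1, 2, 3};
  Data* dp = &d;
  int Data::*pm = &Data::a;  // selects a; no object is involved yet
  dp->*pm = 47;              // same as d.a = 47
  pm = &Data::b;
  d.*pm = 48;                // same as d.b = 48
  pm = &Data::c;
  dp->*pm = 49;              // same as d.c = 49
  assert(d.a == 47 && d.b == 48 && d.c == 49);
  return pick(d, &Data::a) + pick(d, &Data::b) + pick(d, &Data::c);
}
```

The same pointer-to-member variable is reused for three different members, which is what makes it possible to choose a member at runtime.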
Here’s an example that shows how to create and use pointers to
data members:
//: C11:PointerToMemberData.cpp
#include <iostream>
using namespace std;
Chapter 3) is defined like this:
int (*fp)(float);
The parentheses around (*fp) are necessary to force the compiler to
evaluate the definition properly. Without them this would appear
to be a function that returns an int*.
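The difference the parentheses make can be seen side by side in this sketch. The names halve( ), gp( ), and callThroughPointer( ) are hypothetical and chosen only for the illustration.

```cpp
#include <cassert>

// Hypothetical function matching the signature int(float).
int halve(float x) { return static_cast<int>(x / 2); }

int (*fp)(float) = &halve;  // fp is a pointer to a function
int* gp(float);             // no parentheses: declares a function returning int*

// Calls halve() indirectly through the function pointer.
int callThroughPointer(float x) { return (*fp)(x); }
```

Only fp is a variable here; gp is merely the declaration of a function, which is why it can be left undefined as long as it is never called.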
Parentheses also play an important role when defining and using
pointers to member functions. If you have a function inside a class,
you define a pointer to that member function by inserting the class
name and scope resolution operator into an ordinary function
pointer definition:
//: C11:PmemFunDefinition.cpp
class Simple2 {
public:
  int f(float) const { return 1; }
};
int (Simple2::*fp)(float) const;
int (Simple2::*fp2)(float) const = &Simple2::f;
int main() {
  fp = &Simple2::f;
} ///:~
In the definition for fp2 you can see that a pointer to member
function can also be initialized when it is created, or at any other
int main() {
  Widget w;
  Widget* wp = &w;
  void (Widget::*pmem)(int) const = &Widget::h;
  (w.*pmem)(1);
  (wp->*pmem)(2);
} ///:~
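For a version that can be compiled on its own, here is a sketch with a hypothetical class Gadget whose member functions return values, so the result of each call through the pointer can be checked. The names Gadget, twice( ), negate( ), and useMemberPointers( ) are assumptions made for this illustration.

```cpp
#include <cassert>

// Hypothetical class with two member functions of the same signature.
class Gadget {
public:
  int twice(int i) const { return 2 * i; }
  int negate(int i) const { return -i; }
};

int useMemberPointers() {
  Gadget g;
  Gadget* gp = &g;
  int (Gadget::*pmem)(int) const = &Gadget::twice;
  int a = (g.*pmem)(21);    // calls g.twice(21)
  pmem = &Gadget::negate;   // reselect at runtime
  int b = (gp->*pmem)(2);   // calls gp->negate(2)
  assert(a == 42 && b == -2);
  return a + b;
}
```

The parentheses around g.*pmem and gp->*pmem are required, because the function-call operator binds more tightly than .* and ->*.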
Of course, it isn’t particularly reasonable to expect the casual user
to create such complicated expressions. If the user must directly
manipulate a pointer-to-member, then a typedef is in order. To
really clean things up, you can use the pointer-to-member as part of
the internal implementation mechanism. Here’s the preceding
example using a pointer-to-member inside the class. All the user
needs to do is pass a number in to select a function. (Thanks to
Owen Mortensen for this example.)
//: C11:PointerToMemberFunction2.cpp
#include <iostream>
using namespace std;
, you can see that the entire
implementation, including the functions, has been hidden away.
The code must even ask for the count( ) of functions. This way, the
class implementer can change the quantity of functions in the
underlying implementation without affecting the code where the
class is used.
The initialization of the pointers-to-members in the constructor
may seem overspecified. Shouldn’t you be able to say
fptr[1] = &g;
because the name g occurs in the member function, which is
automatically in the scope of the class? The problem is this doesn’t
conform to the pointer-to-member syntax, which is required so
everyone, especially the compiler, can figure out what’s going on.
Similarly, when the pointer-to-member is dereferenced, it seems
like
(this->*fptr[i])(j);
is also over-specified; this looks redundant. Again, the syntax
requires that a pointer-to-member always be bound to an object
when it is dereferenced.
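The two rules above can be seen together in a compact sketch. The class Selector and its members are hypothetical; they follow the same idea of an internal table of pointers-to-members but are not the book’s listing.

```cpp
#include <cassert>

// Hypothetical class: the caller selects a member function by index.
class Selector {
  enum { cnt = 3 };
  int (Selector::*fptr[cnt])(int) const;
  int f(int i) const { return i + 1; }
  int g(int i) const { return i + 2; }
  int h(int i) const { return i + 3; }
public:
  Selector() {
    fptr[0] = &Selector::f;  // full pointer-to-member syntax is required,
    fptr[1] = &Selector::g;  // even inside the class
    fptr[2] = &Selector::h;
  }
  // The caller must keep 0 <= i < count().
  int select(int i, int j) const {
    assert(i >= 0 && i < cnt);
    return (this->*fptr[i])(j);  // must be bound to an object, even this
  }
  int count() const { return cnt; }
};
```

Code that uses Selector sees only select( ) and count( ), so the number of functions in the table can change without affecting that code.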
Summary
Pointers in C++ are almost identical to pointers in C, which is good.