Inapoi
Inainte
Cuprins
13: Dynamic Object Creation
Sometimes you know the exact
quantity, type, and lifetime of
the objects in your program. But not
always.
How many planes will an air-traffic
system need to handle? How many shapes will a CAD system use? How many nodes
will there be in a network?
To solve the general programming problem,
it’s essential that you be able to create and destroy objects at runtime.
Of course, C has always provided the dynamic memory allocation functions
malloc( )
and free( ) (along with variants of malloc( )) that
allocate storage from the heap (also called the free store) at
runtime.
However, this simply won’t work in
C++. The constructor doesn’t allow you to hand it
the address of the memory to initialize, and for good reason. If you could do
that, you
might:
- Forget. Then guaranteed
initialization of objects in C++ wouldn’t be
guaranteed.
- Accidentally
do something to the object before you initialize it, expecting the right thing
to happen.
- Hand it
the wrong-sized object.
And of
course, even if you did everything correctly, anyone who modifies your program
is prone to the same errors. Improper initialization is responsible for a large
portion of programming problems, so it’s especially important to guarantee
constructor calls for objects created on the heap.
So how does C++ guarantee proper
initialization
and cleanup, but allow you to create objects dynamically on the
heap?
The answer is by bringing dynamic object
creation into the core of the language. malloc( ) and
free( ) are library functions, and thus outside the control of the
compiler. However, if you have an operator to perform the combined act of
dynamic storage allocation and initialization and another operator to perform
the combined act of cleanup and releasing storage, the compiler can still
guarantee that constructors and destructors will be called for all
objects.
In this chapter, you’ll learn how
C++’s new and delete elegantly solve this problem by safely
creating objects on the
heap.
Object creation
When a C++ object is created, two events
occur:
- Storage is allocated for
the object.
- The
constructor is called to initialize that
storage.
By now you should
believe that step two always happens. C++ enforces it because
uninitialized objects are a major source of program bugs. It doesn’t
matter where or how the object is created – the constructor is always
called.
Step one, however, can occur in several
ways, or at alternate
times:
- Storage can be allocated
before the program begins, in the static storage area. This storage exists for
the life of the
program.
- Storage can
be created on the stack whenever a particular execution point is reached (an
opening brace). That storage is released automatically at the complementary
execution point (the closing brace). These stack-allocation operations are built
into the instruction set of the processor and are very efficient. However, you
have to know exactly how many variables you need when you’re writing the
program so the compiler can generate the right
code.
- Storage can be
allocated from a pool of memory called the heap (also known as the free store).
This is called dynamic memory allocation. To allocate this memory, a function is
called at runtime; this means you can decide at any time that you want some
memory and how much you need. You are also responsible for determining when to
release the memory, which means the lifetime of that memory can be as long as
you choose – it isn’t determined by
scope.
Often these three
regions are placed in a single contiguous piece of physical memory: the static
area, the stack, and the heap (in an order determined by the compiler writer).
However, there are no rules. The stack may be in a special place, and the heap
may be implemented by making calls for chunks of memory from the operating
system. As a programmer, these things are normally shielded from you, so all you
need to think about is that the memory is there when you call for
it.
C’s approach to the heap
To allocate memory dynamically at
runtime, C provides functions in its standard library:
malloc( ) and its variants
calloc( ) and
realloc( ) to produce memory from the
heap, and
free( ) to release the memory back to the
heap. These functions are pragmatic but primitive and require understanding and
care on the part of the programmer. To create an instance of a class on the heap
using C’s dynamic memory functions, you’d have to do something like
this:
//: C13:MallocClass.cpp
// Malloc with class objects
// What you'd have to do if not for "new"
#include "../require.h"
#include <cstdlib> // malloc() & free()
#include <cstring> // memset()
#include <iostream>
using namespace std;
class Obj {
int i, j, k;
enum { sz = 100 };
char buf[sz];
public:
void initialize() { // Can't use constructor
cout << "initializing Obj" << endl;
i = j = k = 0;
memset(buf, 0, sz);
}
void destroy() const { // Can't use destructor
cout << "destroying Obj" << endl;
}
};
int main() {
Obj* obj = (Obj*)malloc(sizeof(Obj));
require(obj != 0);
obj->initialize();
// ... sometime later:
obj->destroy();
free(obj);
} ///:~
You can see the use of
malloc( ) to create storage for the object in the
line:
Obj* obj = (Obj*)malloc(sizeof(Obj));
Here, the user must determine the size of
the object (one place for an error). malloc( ) returns a
void* because it just produces a patch of memory, not an object. C++
doesn’t allow a void* to be assigned to any other pointer, so it
must be cast.
Because malloc( ) may fail to
find any memory (in which case it returns zero), you must check the returned
pointer to make sure it was successful.
But the worst problem is this
line:
Obj->initialize();
If users make it this far correctly, they
must remember to initialize the object before it is used. Notice that a
constructor was not used because the constructor cannot
be called
explicitly[50]
– it’s called for you by the compiler when an object is created. The
problem here is that the user now has the option to forget to perform the
initialization before the object is used, thus reintroducing a major source of
bugs.
It also turns out that many programmers
seem to find C’s dynamic memory functions too confusing and complicated;
it’s not uncommon to find C programmers who use virtual memory
machines allocating huge arrays of variables in the
static storage area to avoid thinking about dynamic memory allocation. Because
C++ is attempting to make library use safe and effortless for the casual
programmer, C’s approach to dynamic memory is
unacceptable.
operator new
The solution in C++ is to combine all the
actions necessary to create an object into a single operator called
new. When you create an
object with new (using a
new-expression), it
allocates enough storage on the heap to hold the object and calls the
constructor for that storage. Thus, if you say
MyType *fp = new MyType(1,2);
at runtime, the equivalent of
malloc(sizeof(MyType)) is called (often, it is literally a call to
malloc( )), and the constructor for
MyType is called with the resulting address as the this
pointer, using (1,2) as the argument list. By the
time the pointer is assigned to fp, it’s a live, initialized object
– you can’t even get your hands on it before then. It’s also
automatically the proper MyType type so no cast
is necessary.
The default new checks to make
sure the memory allocation was successful before passing the address to the
constructor, so you don’t have to explicitly determine if the call was
successful. Later in the chapter you’ll find out what happens if
there’s no memory left.
You can create a new-expression using any
constructor available for the class. If the constructor has no arguments, you
write the new-expression without the constructor argument list:
MyType *fp = new MyType;
Notice how simple the process of creating
objects on the heap becomes – a single expression, with all the sizing,
conversions, and safety checks built in. It’s as easy to create an object
on the heap as it is on the
stack.
operator delete
The complement to the new-expression is
the delete-expression, which first calls the
destructor and then releases the memory (often with a call to
free( )). Just as a new-expression returns a
pointer to the object, a delete-expression requires the address of an
object.
delete fp;
This destructs and then releases the
storage for the dynamically allocated MyType object created
earlier.
delete can
be called only for an object created by new. If you malloc( )
(or calloc( ) or realloc( )) an object and then
delete it, the behavior is undefined. Because most default
implementations of new and delete use malloc( ) and
free( ), you’d probably end up releasing the memory without
calling the destructor.
If the pointer you’re deleting is
zero, nothing will happen. For this reason, people often
recommend setting a pointer to zero immediately after you delete it, to prevent
deleting it twice. Deleting an object more than once is definitely a bad thing
to do, and will cause
problems.
A simple example
This example shows that initialization
takes place:
//: C13:Tree.h
#ifndef TREE_H
#define TREE_H
#include <iostream>
class Tree {
int height;
public:
Tree(int treeHeight) : height(treeHeight) {}
~Tree() { std::cout << "*"; }
friend std::ostream&
operator<<(std::ostream& os, const Tree* t) {
return os << "Tree height is: "
<< t->height << std::endl;
}
};
#endif // TREE_H ///:~
//: C13:NewAndDelete.cpp
// Simple demo of new & delete
#include "Tree.h"
using namespace std;
int main() {
Tree* t = new Tree(40);
cout << t;
delete t;
} ///:~
We can prove that the constructor is
called by printing out the value of the Tree. Here, it’s done by
overloading the operator<< to use with an ostream and a
Tree*.
Note, however, that even though the function is declared as a
friend, it is defined as an inline! This is a
mere convenience – defining a friend function as an inline to a
class doesn’t change the friend status or the fact that it’s
a global function and not a class member function. Also notice that the return
value is the result of the entire output expression, which is an
ostream& (which it must be, to satisfy the return value type of the
function).
Memory manager overhead
When you create automatic objects on the
stack, the size of the objects
and their lifetime is built
right into the generated code, because the compiler knows the exact type,
quantity, and scope. Creating objects on the heap
involves
additional overhead, both in time and in space. Here’s a typical scenario.
(You can replace malloc( ) with
calloc( ) or
realloc( ).)
You call malloc( ), which
requests a block of memory from the pool. (This code may actually be part of
malloc( ).)
The pool is searched for a block of
memory large enough to satisfy the request. This is done by checking a map or
directory of some sort that shows which blocks are currently in use and which
are available. It’s a quick process, but it may take several tries so it
might not be deterministic – that is, you can’t necessarily count on
malloc( ) always taking exactly the same
amount of time.
Before a pointer to that block is
returned, the size and location of the block must be recorded so further calls
to malloc( ) won’t use it, and so that when you call
free( ), the system knows how much memory to
release.
The way all this is implemented can vary
widely. For example, there’s nothing to prevent primitives for memory
allocation being implemented in the processor. If you’re curious, you can
write test programs to try to guess the way your malloc( ) is
implemented. You can also read the library source code, if you have it (the GNU
C sources are always
available).
 |
|