
"Pure Virtual Function Called": An Explanation


The C++ Source

by Paul S. R. Chisholm

February 26, 2007



Summary
"Pure virtual function called" is the dying message of the occasional
crashed C++ program. What does it mean? You can find a couple of
simple, well-documented explanations out there that apply to problems
easy to diagnose during postmortem debugging. There is also another
rather subtle bug that generates the same message. If you have a
mysterious crash associated with that message, it might well mean your
program went indirect on a dangling pointer. This article covers all
these explanations.

Object-Oriented C++: The Programmer's View

(If you know what pure virtual functions and abstract classes are, you can skip this section.)

In C++, virtual functions let instances of related classes have different behavior at run time (also known as runtime polymorphism):

class Shape {

public:

 virtual double area() const;

 double value() const;

 // Meyers 3rd Item 7:

 virtual ~Shape();

protected:

 Shape(double valuePerSquareUnit);

private:

 double valuePerSquareUnit_;

};

class Rectangle : public Shape {

public:

 Rectangle(double width, double height, double valuePerSquareUnit);

 virtual double area() const;

 // Meyers 3rd Item 7:

 virtual ~Rectangle();

// ...

};

class Circle : public Shape {

public:

 Circle(double radius, double valuePerSquareUnit);

 virtual double area() const;

 // Meyers 3rd Item 7:

 virtual ~Circle();

// ...

};

double

Shape::value() const

{

 // Area is computed differently, depending

 // on what kind of shape the object is:

 return valuePerSquareUnit_ * area();

}

(The comments before the destructors refer to Item 7 in the third edition of Scott Meyers's Effective C++:
"Declare destructors virtual in polymorphic base classes." This code
follows a convention used on several projects, where references like
this are put in the code, serving as reminders to maintainers and
reviewers. To some people, the point is obvious and the reminder is
distracting; but one person's distraction is another person's helpful
hint, and programmers in a hurry often forget what should be "obvious.")

In C++, a function's interface is specified by declaring the
function. Member functions are declared in the class definition. A
function's implementation is specified by defining the function.
Derived classes can redefine a function, specifying an implementation
particular to that derived class (and classes derived from it).
When a virtual function is called, the implementation is chosen based
not on the static type of the pointer or reference, but on the type of
the object being pointed to, which can vary at run time:

print(shape->area());  // Might invoke Circle::area() or Rectangle::area().

A pure virtual function is declared, but not necessarily
defined, by a base class. A class with a pure virtual function is
"abstract" (as opposed to "concrete"), in that it's not possible to
create instances of that class. A derived class must define all
inherited pure virtual functions of its base classes to be concrete.

class AbstractShape {

public:

 virtual double area() const = 0;

 double value() const;

 // Meyers 3rd Item 7:

 virtual ~AbstractShape();

protected:

 AbstractShape(double valuePerSquareUnit);

private:

 double valuePerSquareUnit_;

};

// Circle and Rectangle are derived from AbstractShape.

// This will not compile, even if there's a matching public constructor:

// AbstractShape* p = new AbstractShape(value);

// These are okay:

Rectangle* pr = new Rectangle(width, height, value);

Circle* pc = new Circle(radius, value);

// These are okay, too:

AbstractShape* p = pr;

p = pc;

Object Oriented C++: Under the Covers

(You can skip this section if you already know what a "vtbl" is.)

How does all this run time magic happen? The usual implementation
is that every class with any virtual functions has an array of function
pointers, called a "vtbl". Every instance of such a class has a
pointer to its class's vtbl, as depicted below.





Figure 1. A class's vtbl points to the class's instance member functions.



If an abstract class with a pure virtual function doesn't define the function, what goes in the corresponding place in the vtbl?
Traditionally, C++ implementors have provided a special function, which
prints "Pure virtual function called" (or words to that effect), and
then crashes the program.





Figure 2. An abstract class's vtbl can have a pointer to a special function.



Build 'em Up, Tear 'em Down

When you construct an instance of a derived class, what happens, exactly? If the class has a vtbl, the process goes something like the following.

Step 1: Construct the top-level base part:

(a) Make the instance point to the base class's vtbl.

(b) Construct the base class instance member variables.

(c) Execute the body of the base class constructor.



Step 2: Construct the derived part(s) (recursively):

(a) Make the instance point to the derived class's vtbl.

(b) Construct the derived class instance member variables.

(c) Execute the body of the derived class constructor.



Destruction happens in reverse order, something like this:

Step 1: Destruct the derived part:

(a) (The instance already points to the derived class's vtbl.)

(b) Execute the body of the derived class destructor.

(c) Destruct the derived class instance member variables.



Step 2: Destruct the base part(s) (recursively):

(a) Make the instance point to the base class's vtbl.

(b) Execute the body of the base class destructor.

(c) Destruct the base class instance member variables.





Two of the Classic Blunders

What if you try to call a virtual function from a base class constructor?

// From sample program 1:

AbstractShape(double valuePerSquareUnit)

 : valuePerSquareUnit_(valuePerSquareUnit)

{

 // ERROR: Violation of Meyers 3rd Item 9!

 std::cout << "creating shape, area = " << area() << std::endl;

}

(Meyers, 3rd edition, Item 9: "Never call virtual functions during construction or destruction.")

This is obviously an attempt to call a pure virtual function. The
compiler could alert us to this problem, and some compilers do. If a
base class destructor calls a pure virtual function directly (sample
program 2), you have essentially the same situation.

If the situation is a little more complicated, the error will be less obvious (and the compiler is less likely to help us):

// From sample program 3:

AbstractShape::AbstractShape(double valuePerSquareUnit)

 : valuePerSquareUnit_(valuePerSquareUnit)

{

 // ERROR: Indirect violation of Meyers 3rd Item 9!

 std::cout << "creating shape, value = " << value() << std::endl;

}

The body of this base class constructor is in step 1(c) of the
construction process described above, which calls an instance member
function (value()), which in turn calls a pure virtual function (area()).
The object is still an AbstractShape at this point. What happens when
it tries to call the pure virtual function? Your program likely crashes
with a message similar to, "Pure virtual function called."

Similarly, calling a virtual function indirectly from a base class
destructor (sample program 4) results in the same kind of crash. The
same goes for passing a partially-constructed (or partially-destructed)
object to any function that invokes virtual functions.

These are the most commonly described root causes of the "Pure
Virtual Function Called" message. They're straightforward to diagnose
from postmortem debugging; the stack trace will point clearly to the
problem.

Pointing Out Blame

There's at least one other problem that can lead to this message,
which doesn't seem to be explicitly described anywhere in print or on
the net. (There have been some discussions on the ACE mailing list that
touch upon the problem but they don't go into detail.)

Consider the following (buggy) code:

// From sample program 5:

 AbstractShape* p1 = new Rectangle(width, height, valuePerSquareUnit);

 std::cout << "value = " << p1->value() << std::endl;

 AbstractShape* p2 = p1;  // Need another copy of the pointer.

 delete p1;

 std::cout << "now value = " << p2->value() << std::endl;

Let's consider these lines one at a time.

AbstractShape* p1 = new Rectangle(width, height, valuePerSquareUnit);

A new object is created. It's constructed in two stages: Step 1,
where the object acts like a base class instance, and Step 2, where it
acts like a derived class instance.

std::cout << "value = " << p1->value() << std::endl;

Everything's working fine.

AbstractShape* p2 = p1;  // Need another copy of the pointer.

Something odd might happen to p1, so let's make a copy of it.

delete p1;

The object is destructed in two stages: Step 1, where the object
acts like a derived class instance, and Step 2, where it acts like a
base class instance.

Note that the value of p1 might change after the call to delete.
Compilers are allowed to "zero out" (i.e., render unusable) pointers
after destructing their pointed-to data. Lucky (?) for us, we have
another copy of the pointer, p2, which didn't change.

std::cout << "now value = " << p2->value() << std::endl;

Uh oh.

This is another classic blunder: going indirect on a "dangling"
pointer. That's a pointer to an object that's been deleted, or memory
that's been freed, or both. C++ programmers never write such code ...
unless they're clueless (unlikely) or rushed (all too likely).

So now p2 points to an
ex-object. What does that thing look like? According to the C++
standard, it's "undefined". That's a technical term that means, in
theory, anything can happen: the program can crash, or keep running but
generate garbage results, or send Bjarne Stroustrup e-mail saying how
ugly you are and how funny your mother dresses you. You can't depend on
anything; the behavior might vary from compiler to compiler, or machine
to machine, or run to run. In practice, there are several common
possibilities (which may or may not happen consistently):

The memory might be marked as deallocated. Any attempt to
access it would immediately be flagged as the use of a dangling
pointer. That's what some tools (BoundsChecker, Purify, valgrind, and
others) try to do. As we'll see, the Common Language Runtime (CLR) from
Microsoft's .NET Framework, and Sun Studio 11's dbx debugger, work this
way.

The memory might be deliberately scrambled. The
memory management system might write garbage-like values into the
memory after it's freed. (One such value is "dead beef": 0xDEADBEEF,
unsigned decimal 3735928559, signed decimal -559038737.)

The memory might be reused. If other code was executed between the deletion
of the object and the use of the dangling pointer, the memory allocation
system might have created a new object out of some or all of the memory
used by the old object. If you're lucky, this will look enough like
garbage that the program will crash immediately. Otherwise the program
will likely crash sometime later, possibly after curdling other
objects, often long after the root cause problem occurred. This is the
kind of problem that drives C++ programmers crazy (and makes Java
programmers overly smug).

The memory might have been left exactly the way it was.

The last is an interesting case. What was the object "exactly the
way it was"? In this case, it was an instance of the abstract base
class; certainly that's the way the vtbl was left. What happens if we
try to call a pure virtual member function for such an object?

"Pure virtual function called".

(Exercise for the reader: Imagine a function that, unwisely and
unfortunately, returned a pointer or reference to a local variable.
This is a different kind of dangling pointer. How could this also
generate this message?)

Meanwhile, Back in the Real World

Nice theory. What happens in practice?

Consider five test programs, each with its own distinctive defect:

Directly calling a virtual function from a base class constructor.

Directly calling a virtual function from a base class destructor.

Indirectly calling a virtual function from a base class constructor.

Indirectly calling a virtual function from a base class destructor.

Calling a virtual function via a dangling pointer.

These were built and tested with several compilers (running on x86 Windows XP unless stated otherwise):

Visual C++ 8.0

Digital Mars C/C++ compiler version 8.42n

Open Watcom C/C++ version 1.4

SPARC Solaris 10, Sun Studio 11

gcc:


x86 Linux (Red Hat 3.2), gcc 2.96 / 3.0 / 3.2.2

x86 Windows XP (Cygwin), gcc 3.4.4

SPARC Solaris 8, gcc 3.2.2

PowerPC Mac OS X.4 (Tiger), gcc 3.3 / 4.0

Direct Invocation

Some compilers recognized what was happening in the first two examples, with various results.

Visual C++ 8.0, Open Watcom C/C++ 1.4, and gcc 4.x recognize that a
base class's constructor or destructor can't possibly invoke a derived
class's member function. As a result, these compilers optimize away any
runtime polymorphism, and treat the call as an invocation of the base
class member function. If that member function is not defined, the
program doesn't link. If the member function is defined, the program
runs without problems. gcc 4.x produces a warning ("abstract virtual
'virtual double AbstractShape::area() const' called from constructor"
for the first program, and similarly for the destructor for the second
program). Visual C++ 8.0 built the programs without any complaint, even
at the maximum warning level (/Wall); similarly for Open Watcom C/C++
1.4.

gcc 3.x and Digital Mars C/C++ compiler 8.42n rejected these
programs, complaining, respectively, "abstract virtual `virtual double
AbstractShape::area() const' called from constructor" (or "from
destructor") and "Error: 'AbstractShape::area' is a pure virtual
function".

Sun Studio 11 produced a warning, "Warning: Attempt to call a pure
virtual function AbstractShape::area() const will always fail", but
builds the programs. As promised, both crash, with the message, "Pure
virtual function called".

Indirect Invocation

The next two examples built without warning for all compilers.
(That's to be expected; this is not the kind of problem normally caught
by static analysis.) The resulting programs all crashed, with various
error messages:

Visual C++ 8.0: "R6025 - pure virtual function call (__vftpr[0] == __purecall)".

Digital Mars C/C++ compiler 8.42n: did not generate an error message when the
program crashed. (That's fine; this is "undefined" behavior, and the
compiler is free to do whatever it wants.)

Open Watcom C/C++ 1.4: "pure virtual function called!".

Sun Studio 11: "Pure virtual function called" (same as for the first two programs).

gcc: "pure virtual method called".



Invocation via a Dangling Pointer

The fifth example in the previous list always built without warning
and crashed when run. Again, this is to be expected. For all compilers
except Microsoft's, the error message was the same as for the third and
fourth examples. Sun's compiler generated the same message, but Sun's
debugger provided some additional information.

Microsoft Visual C++ 8.0 has a number of runtime libraries. Each handles this error in its own way.

Win32 console application:


When run without the debugger, the program crashes silently.

When run in the debugger, a program built in debug mode generates the
message, "Unhandled exception ... Access violation reading location
0xfeeefeee." This is clearly "dead beef" behavior; when memory was
freed, the runtime overwrote it with garbage.

When built in release mode and run in the debugger, the program produces the
message, "Unhandled exception ... Illegal Instruction".

CLR console application:


When built in debug mode, the message is, "Attempted to read or write
protected memory. This is often an indication that other memory is
corrupt." The debug runtime system has marked the freed memory, and
terminates the program when it tries to use that memory.

When built in release mode, the program crashes with the message, "Object reference not set to an instance of an object."

When compiled with Sun Studio 11, and run in dbx with Run-Time
Checking, the program died with a new error: "Read from unallocated
(rua): Attempting to read 4 bytes at address 0x486a8 which is 48 bytes
before heap block of size 40 bytes at 0x486d8". This is the debugger's
way of saying, "You just used something in a block of memory, but this
isn't a block of memory I think you should be using." Once the object
was destructed and its memory deallocated, the program could no longer
(legally) use that object, or that memory, again.

Owning Up

How can you avoid these kinds of problems?

It's easy for the problems in the first four example programs. Pay
attention to Scott Meyers, and (for the first two examples) pay
attention to any warning messages you get.

What about the "dangling pointer" problem in the fifth example?
Programmers, in any language, need to design in terms of object
ownership. Something (or some collection of things) owns an object.
Ownership might be:

transferred to something else (or some other collection of things), or

"loaned" without transferring ownership, or

shared, by using reference counts or garbage collection.



What kind of "thing" can own an object?

Another object, obviously.

A collection of objects; for example, all the smart pointers that point to the owned object.

A function. When a function is called, it may assume ownership
(transferred) or not (loaned). Functions always own their local
variables, but not necessarily what those local variables point or
refer to.



In our example, there was no clear ownership. Some function created
an object, and pointed two pointers at it. Who owns the object?
Probably the function, in which case, it should be responsible for
avoiding the problem somehow. It could have used one "dumb" pointer
(and explicitly zeroed it out after deletion) instead of two, or used
some sort of smart pointers.

In real life, it's never that simple, except sometimes in
retrospect. Objects can be passed from one module to one very different
module, written by another person or another organization. Object
ownership issues span equally long chasms.

Any time you pass an object around, you always need to know the
answer to the ownership question. It's a simple issue, sometimes with a
simple answer, but never a question that magically answers itself.
There is no substitute for thought.

Thinking for yourself doesn't mean thinking by yourself, however;
there is some good existing work that can help you. Tom Cargill wrote
up a pattern language, "Localized Ownership," that describes strategies
for these alternatives. Scott Meyers also addresses this in Item 13,
"Use objects to manage resources," and Item 14, "Think carefully about
copying behavior in resource-managing classes," in the third edition of
Effective C++. See References for details.

No Smart Pointer Panacea

Reference-counted smart pointers are very helpful in avoiding these
kinds of problems. With smart pointers, ownership belongs to the set of
smart pointers that point to the object. When the last such smart
pointer stops pointing to that object, the object is deleted. That
would certainly solve the problem we've seen here.

But many programmers are just beginning to use smart pointers, and
just beginning to learn how to use them. Even with smart pointers, you
can still run into these kinds of problems ... if you use smart
pointers in dumb ways.

But that's another problem for another day.

References

Tom Cargill, "Localized Ownership: Managing Dynamic Objects in C++"; in Vlissides, Coplien, and Kerth, Pattern Languages of Program Design 2, 1996, Addison-Wesley.

Scott Meyers, Effective C++, Third Edition: 55 Specific Ways to Improve Your Programs and Designs, 2005, Addison-Wesley.


Resources

Scott Meyers' home page: http://www.aristeia.com/

About the Author

Paul S. R. Chisholm has been developing software for 25 years. He started at AT&T Bell
Laboratories, and has since worked at Ascend Communications / Lucent
Technologies, Cisco Systems, and three small startups you've probably
never heard of. He lives and works in New Jersey.