Virtual method table - AbsoluteAstronomy.com

A virtual method table, virtual function table, dispatch table

Dispatch table

In computer science, a dispatch table is a table of pointers to functions or methods. Use of such a table is a common technique when implementing late binding in object-oriented programming.-Perl implementation:...

, or vtable, is a mechanism used in a programming language

Programming language

A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

to support dynamic dispatch

Dynamic dispatch

In computer science, dynamic dispatch is the process of mapping a message to a specific sequence of code at runtime. This is done to support the cases where the appropriate method can't be determined at compile-time...

(or run-time method binding

Name binding

In programming languages, name binding is the association of objects with identifiers. An identifier bound to an object is said to reference that object. Machine languages have no built-in notion of identifiers, but name-object bindings as a service and notation for the programmer is implemented...

).

Suppose a program contains several classes in an inheritance hierarchy: a superclass, Cat, and two subclasses, HouseCat and Lion. Class Cat defines a virtual function

Virtual function

In object-oriented programming, a virtual function or virtual method is a function or method whose behaviour can be overridden within an inheriting class by a function with the same signature...

named speak, so its subclasses may provide an appropriate implementation (e.g. either meow or roar).

When the program calls the speak method on a Cat pointer (which can point to a Cat class, or any subclass of Cat), the calling code must be able to determine which implementation to call, depending on the actual type of object that is pointed to. Because the type of object pointed to by the Cat pointer is not determined at compile-time, the decision as to which branch to take can not be decided at compile-time.

There are a variety of different ways to implement such dynamic dispatch, but the vtable (virtual table) solution is especially common among C++

C++

C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

and related languages (such as D

D (programming language)

The D programming language is an object-oriented, imperative, multi-paradigm, system programming language created by Walter Bright of Digital Mars. It originated as a re-engineering of C++, but even though it is mainly influenced by that language, it is not a variant of C++...

and C#). Languages which separate the programmatic interface of objects from the implementation, like Visual Basic

Visual Basic

Visual Basic is the third-generation event-driven programming language and integrated development environment from Microsoft for its COM programming model...

and Delphi

Object Pascal

Object Pascal refers to a branch of object-oriented derivatives of Pascal, mostly known as the primary programming language of Embarcadero Delphi.-Early history at Apple:...

, also tend to use the vtable approach, because it allows objects to use a different implementation simply by using a different set of method pointers.

Implementation

An object's dispatch table will contain the addresses of the object's dynamically bound methods. Method calls are performed by fetching the method's address from the object's dispatch table. The dispatch table is the same for all objects belonging to the same class, and is therefore typically shared between them. Objects belonging to type-compatible classes (for example siblings in an inheritance hierarchy) will have dispatch tables with the same layout: the address of a given method will appear at the same offset for all type-compatible classes. Thus, fetching the method's address from a given dispatch table offset will get the method corresponding to the object's actual class.

The C++ standards do not mandate exactly how dynamic dispatch must be implemented, but compilers generally use minor variations on the same basic model.

Typically, the compiler creates a separate vtable for each class. When an object is created, a pointer to this vtable, called the virtual table pointer, vpointer or VPTR, is added as a hidden member of this object (becoming its first member unless it's made the last). The compiler also generates "hidden" code in the constructor of each class to initialize the vpointers of its objects to the address of the corresponding vtable. Note that the location of the vpointer in the object instance is not standard among all compilers, and relying on the position may result in unportable code. For example, g++ previously placed the vpointer at the end of the object.

Example

Consider the following class declarations in C++ syntax:

class B1 {
public:
void f0 {}
virtual void f1 {}
int int_in_b1;
};

class B2 {
public:
virtual void f2 {}
int int_in_b2;
};

used to derive the following class:

class D : public B1, public B2 {
public:
void d {}
void f2 {} // override B2::f2
int int_in_d;
};

and the following piece of C++ code:

B2 *b2 = new B2;
D *d = new D;

g++ 3.4.6 from GCC

GNU Compiler Collection

The GNU Compiler Collection is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain...

produces the following 32-bit memory layout for the object b2:G++'s -fdump-class-hierarchy argument can be used to dump virtual method tables for manual inspection. For AIX VisualAge XlC compiler, use -qdump_class_hierarchy to dump class hierarchy and virtual function table layout.



b2:

  +0: pointer to virtual method table of B2

  +4: value of int_in_b2



virtual method table of B2:

  +0: B2::f2

and the following memory layout for the object d:



d:

  +0: pointer to virtual method table of D (for B1)

  +4: value of int_in_b1

  +8: pointer to virtual method table of D (for B2)

 +12: value of int_in_b2

 +16: value of int_in_d



Total size: 20 Bytes.



virtual method table of D (for B1):

  +0: B1::f1  // B1::f1 is not overridden



virtual method table of D (for B2):

  +0: D::f2   // B2::f2 is overridden by D::f2

Note that those functions not carrying the keyword virtual in their declaration (such as f0 and d) do not generally appear in the vtable. There are exceptions for special cases as posed by the default constructor

Default constructor

In computer programming languages the term “default constructor” refers to a constructor that is automatically generated in the absence of explicit constructors ; this automatically provided constructor is usually a nullary constructor...

.

Overriding of the method f2 in class D is implemented by duplicating the virtual method table of B2 and replacing the pointer to B2::f2 with a pointer to D::f2.

Multiple inheritance and thunks

The G++ compiler implements the multiple inheritance

Multiple inheritance

Multiple inheritance is a feature of some object-oriented computer programming languages in which a class can inherit behaviors and features from more than one superclass....

of the classes B1 and B2 in class D using two virtual method tables, one for each base class. (There are other ways to implement multiple inheritance, but this is the most common.) This leads to the necessity for "pointer fixups", also called thunks, when casting.

Consider the following C++ code:

D *d = new D;
B1 *b1 = static_cast(d);
B2 *b2 = static_cast(d);

While d and b1 will point to the same memory location after execution of this code, b2 will point to the location d+8 (eight bytes beyond the memory location of d). Thus, b2 points to the region within d which "looks like" an instance of B2, i.e., has the same memory layout as an instance of B2.

Invocation

A call to d->f1 is handled by dereferencing d's D::B1 vpointer, looking up the f1 entry in the vtable, and then dereferencing that pointer to call the code.

In the case of single inheritance (or in a language with only single inheritance), if the vpointer is always the first element in d (as it is with many compilers), this reduces to the following pseudo-C++:

(*((*d)[0]))(d)

Where *d refers to the virtual method table of D and [0] refers to the first method in the vtable. The parameter d becomes the "this" pointer

This (computer science)

In many object-oriented programming languages, this is a keyword that is used in instance methods to refer to the object on which they are working. C++ and languages which derive in style from it generally use this...

to the object.

In the more general case, calling B1::f1 and D::f2 are more complicated:

(*(*(d[+0]/*pointer to virtual method table of D (for B1)*/)[0]))(d) /* Call d->f1 */
(*(*(d[+8]/*pointer to virtual method table of D (for B2)*/)[0]))(d+8) /* Call d->f2 */

The call to d->f1 passes a B1 pointer as a parameter. The call to d->f2 passes a B2 pointer as a parameter. This second call requires a fixup to produce the correct this pointer. It is impossible to call B2::f2 since it has been overridden in D's implementation. The location of B2::f2 is not in the vtable for D.

By comparison, a call to d->f0 is much simpler:

(*B1::f0)(d)

Efficiency

A virtual call requires at least an extra indexed dereference, and sometimes a "fixup" addition, compared to a non-virtual call, which is simply a jump to a compiled-in pointer. Therefore, calling virtual functions is inherently slower than calling non-virtual functions. An experiment done in 1996 indicates that approximately 6–13% of execution time is spent simply dispatching to the correct function, though the overhead can be as high as 50%. The cost of virtual functions may not be so high on modern architectures due to much larger caches and better branch prediction.

Furthermore, in environments where JIT compilation

Just-in-time compilation

In computing, just-in-time compilation , also known as dynamic translation, is a method to improve the runtime performance of computer programs. Historically, computer programs had two modes of runtime operation, either interpreted or static compilation...

is not in use, virtual function calls usually cannot be inlined

Inline expansion

In computing, inline expansion, or inlining, is a manual or compiler optimization that replaces a function call site with the body of the callee. This optimization may improve time and space usage at runtime, at the possible cost of increasing the final size of the program In computing, inline...

. While a compiler could replace the lookup and indirect call with, for instance, a conditional execution of each inlined body, such optimizations are not common.

To avoid this overhead, compilers usually avoid using vtables whenever the call can be resolved at compile time

Compile time

In computer science, compile time refers to either the operations performed by a compiler , programming language requirements that must be met by source code for it to be successfully compiled , or properties of the program that can be reasoned about at compile time.The operations performed at...

.

Thus, the call to f1 above may not require a vtable lookup because the compiler may be able to tell that d can only hold a D at this point, and D does not override f1. Or the compiler (or optimizer) may be able to detect that there are no subclasses of B1 anywhere in the program that override f1. The call to B1::f1 or B2::f2 will probably not require a vtable lookup because the implementation is specified explicitly (although it does still require the 'this'-pointer fixup).

Comparison with alternatives

The vtable is generally a good performance trade-off to achieve dynamic dispatch, but there are alternatives, such as binary tree dispatch, with higher performance but different costs.

However, vtables only allow for single dispatch on the special "this" parameter, in contrast to multiple dispatch

Multiple dispatch

Multiple dispatch or multimethods or function overloading is the feature of some object-oriented programming languages in which a function or method can be dynamically dispatched based on the run time type of more than one of its arguments...

(as in CLOS or Dylan), where the types of all parameters can be taken into account in dispatching.

Vtables also only work if dispatching is constrained to a known set of methods, so they can be placed in a simple array built at compile time, in contrast to duck typing

Duck typing

In computer programming with object-oriented programming languages, duck typing is a style of dynamic typing in which an object's current set of methods and properties determines the valid semantics, rather than its inheritance from a particular class or implementation of a specific interface...

languages (such as Smalltalk

Smalltalk

Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist...

, Python

Python (programming language)

Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...

or JavaScript

JavaScript

JavaScript is a prototype-based scripting language that is dynamic, weakly typed and has first-class functions. It is a multi-paradigm language, supporting object-oriented, imperative, and functional programming styles....

).

Languages that provide either or both of these features often dispatch by looking up a string in a hash table

Hash table

In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys , to their associated values . Thus, a hash table implements an associative array...

, or some other equivalent method. There are a variety of tricks to speed this up (e.g., interning/tokenizing method names, caching lookups, just-in-time compilation

Just-in-time compilation

), and dispatch time often does not have a significant effect on total processing time, but nonetheless, vtable lookup is obviously faster. Vtables are also simpler to implement and debug, and closer to the "C style" than hash tables of strings.

Note that Objective-C

Objective-C

Objective-C is a reflective, object-oriented programming language that adds Smalltalk-style messaging to the C programming language.Today, it is used primarily on Apple's Mac OS X and iOS: two environments derived from the OpenStep standard, though not compliant with it...

2.1 now supports vtable dispatch under 64-bit Mac OS X

Mac OS X

Mac OS X is a series of Unix-based operating systems and graphical user interfaces developed, marketed, and sold by Apple Inc. Since 2002, has been included with all new Macintosh computer systems...

10.6+.

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.