Overview
This article is an introduction to object-oriented programming.
There are many approaches to object-oriented design, as evidenced by the number of books written about it. The following introduction takes a fairly pragmatic approach and doesn’t spend a lot of time on design, but the design-oriented approaches can be quite useful to newcomers.
What Is an Object?
An object is merely a collection of related information and functionality. An object can be something that has a corresponding real-world manifestation (such as an employee object), something that has some virtual meaning (such as a window on the screen), or just some convenient abstraction within a program (a list of work to be done, for example).
An object is composed of the data that describes the object and the operations that can be performed on the object. Information stored in an employee object, for example, might be various identification information (name, address), work information (job title, salary), and so on. The operations performed might include creating an employee paycheck or promoting an employee.
When creating an object-oriented design, the first step is to determine what the objects are. When dealing with real-life objects, this is often straightforward, but when dealing with the virtual world, the boundaries become less clear. That’s where the art of good design shows up, and it’s why good architects are in such demand.
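The employee object described above can be sketched in C# roughly as follows. This is only an illustration: the class name, the field names, and the assumption that a paycheck is one twelfth of an annual salary are all hypothetical and not taken from any real system.

```csharp
using System;

// Hypothetical employee object: the fields are the data that describes
// the object, and the methods are the operations performed on it.
public class Employee {
    private string name;
    private string jobTitle;
    private decimal salary;   // assumed to be an annual figure

    public Employee(string name, string jobTitle, decimal salary) {
        this.name = name;
        this.jobTitle = jobTitle;
        this.salary = salary;
    }

    public string Name { get { return name; } }
    public string JobTitle { get { return jobTitle; } }

    // Operation: create a paycheck (assumes 12 equal monthly payments).
    public decimal CreatePaycheck() {
        return salary / 12;
    }

    // Operation: promote the employee to a new title with a raise.
    public void Promote(string newTitle, decimal raise) {
        jobTitle = newTitle;
        salary += raise;
    }
}

class EmployeeDemo {
    public static void Main() {
        Employee e = new Employee("Pat", "Developer", 60000m);
        Console.WriteLine(e.Name + ": " + e.JobTitle);
        e.Promote("Senior Developer", 12000m);
        Console.WriteLine(e.Name + ": " + e.JobTitle);
        Console.WriteLine(e.CreatePaycheck());
    }
}
```

The data and the operations live together in one place, which is the point of the definition above.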
Inheritance
Inheritance is a fundamental feature of an object-oriented system, and it is simply the ability to inherit data and functionality from a parent object. Rather than developing new objects from scratch, new code can be based on the work of other programmers, adding only the new features that are needed. The parent object that the new work is based upon is known as a base class, and the child object is known as a derived class. Inheritance gets a lot of attention in explanations of object-oriented design, but the use of inheritance isn’t particularly widespread in most designs. There are several reasons for this.
First, inheritance is an example of what is known in object-oriented design as an "is-a" relationship. If a system has an animal object and a cat object, the cat object could inherit from the animal object, because a cat "is-a" animal. In inheritance, the base class is always more generalized than the derived class. The cat class would inherit the eat function from the animal class, and would have an enhanced sleep function. In real-world design, such relationships aren’t particularly common.
Second, to use inheritance, the base class needs to be designed with inheritance in mind. This is important for several reasons. If the objects don’t have the proper structure, inheritance can’t really work well. More importantly, a design that enables inheritance also makes it clear that the author of the base class is willing to support other classes inheriting from the class. If a new class is inherited from a class where this isn’t the case, the base class might at some point change, breaking the derived class.
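The animal and cat example can be sketched as below. The class and method names are illustrative assumptions. The sketch also shows one way a base class can signal that it was designed for inheritance: in C#, only a function marked virtual can be replaced by a derived class (virtual functions are covered in the section on polymorphism).

```csharp
using System;

// Hypothetical base class: more general than anything derived from it.
public class Animal {
    // Not virtual: derived classes inherit this behavior unchanged.
    public string Eat() {
        return "Animal eats";
    }

    // virtual: the author explicitly allows derived classes to replace this.
    public virtual string Sleep() {
        return "Animal sleeps";
    }
}

// A cat "is-a" animal, so Cat derives from Animal.
public class Cat: Animal {
    // Enhanced sleep: reuse the base behavior and add to it.
    public override string Sleep() {
        return base.Sleep() + " (most of the day)";
    }
}

class InheritanceDemo {
    public static void Main() {
        Cat cat = new Cat();
        Console.WriteLine(cat.Eat());     // prints: Animal eats
        Console.WriteLine(cat.Sleep());   // prints: Animal sleeps (most of the day)
    }
}
```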
Some less-experienced programmers mistakenly believe that inheritance is "supposed to be" used widely in object-oriented programming, and therefore use it far too often. Inheritance should only be used when the advantages that it brings are needed. See the coming section on "Polymorphism and Virtual Functions."
In the .NET Common Language Runtime, all objects derive from the ultimate base class named object, and there is only single inheritance of objects (i.e., an object can only be derived from one base class). This does prevent the use of some common idioms available in multiple-inheritance systems such as C++, but it also removes many abuses of multiple inheritance and provides a fair amount of simplification. In most cases, it’s a good tradeoff. The .NET Runtime does allow multiple inheritance in the form of interfaces, which declare members but traditionally cannot contain implementation (newer versions of C# relax this with default interface members).
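A minimal sketch of these rules, using hypothetical names (IPlayable, IRecordable, Device, and TapeDeck are made up for illustration): a class lists at most one base class, followed by any number of interfaces.

```csharp
using System;

// Hypothetical interfaces: they declare members; the class supplies the code.
public interface IPlayable {
    void Play();
}

public interface IRecordable {
    void Record();
}

// Hypothetical base class. No base is written, so it derives from object.
public class Device {
    public string Describe() {
        return "generic device";
    }
}

// One base class (Device), but more than one interface.
public class TapeDeck: Device, IPlayable, IRecordable {
    public void Play() {
        Console.WriteLine("TapeDeck.Play()");
    }

    public void Record() {
        Console.WriteLine("TapeDeck.Record()");
    }
}

class SingleInheritanceDemo {
    public static void Main() {
        TapeDeck deck = new TapeDeck();
        IPlayable playable = deck;        // usable through either interface
        IRecordable recordable = deck;
        playable.Play();
        recordable.Record();
        Console.WriteLine(deck.Describe());           // inherited from Device
        Console.WriteLine(typeof(Device).BaseType);   // prints: System.Object
    }
}
```

Adding a second class to the base list of TapeDeck would be rejected by the compiler, which is the single-inheritance restriction in practice.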
Containment
So, if inheritance isn’t the right choice, what is? The answer is containment, also known as aggregation. Rather than saying that an object is an example of another object, an instance of that other object will be contained inside the object. So, instead of having a class look like a string, the class will contain a string (or array, or hash table). The default design choice should be containment, and you should switch to inheritance only if needed (i.e., if there really is an "is-a" relationship).
(At this point there should perhaps be an appropriate comment about standing "on the shoulders of giants…" Perhaps there should be a paper called "Multiple inheritance considered harmful." There probably is one, someplace.)
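As a rough illustration of containment, the hypothetical WorkQueue class below holds a list of work to be done. It contains a List<string> rather than deriving from one, so callers see only the operations the class chooses to offer. The class name and its members are assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical work list: it "has-a" list rather than "is-a" list.
public class WorkQueue {
    // The contained object. Callers never see it directly.
    private List<string> items = new List<string>();

    public void Add(string task) {
        items.Add(task);
    }

    // Remove and return the oldest task.
    public string TakeNext() {
        if (items.Count == 0) {
            throw new InvalidOperationException("No work left.");
        }
        string next = items[0];
        items.RemoveAt(0);
        return next;
    }

    public int Count { get { return items.Count; } }
}

class ContainmentDemo {
    public static void Main() {
        WorkQueue queue = new WorkQueue();
        queue.Add("write report");
        queue.Add("file expenses");
        Console.WriteLine(queue.TakeNext());   // prints: write report
        Console.WriteLine(queue.Count);        // prints: 1
    }
}
```

Because the list is contained, it could later be swapped for a different collection without changing any code that uses WorkQueue. That would not be possible if WorkQueue had inherited from List<string>.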
Polymorphism and Virtual Functions
A while back I was writing a music system, and I decided that I wanted to be able to support both WinAmp and Windows Media Player as playback engines, but I didn’t want all of my code to have to know which engine it was using. I therefore defined an abstract class, which is a class that defines the functions a derived class must implement, and that sometimes provides functions that are useful to both classes.
In this case, the abstract class was called MusicServer, and it had functions like Play(), NextSong(), Pause(), etc. Each of these functions was declared as abstract, so that each player class would have to implement those functions itself.
Abstract functions are automatically virtual functions, which allow the programmer to use polymorphism to make their code simpler. When there is a virtual function, the programmer can pass around a reference to the abstract class rather than the derived class, and the compiler will write code to call the appropriate version of the function at runtime.
An example will probably make that clearer. The music system supports both WinAmp and Windows Media Player as playback engines. The following is a basic outline of what the classes look like:
using System;

// The abstract class declares what every playback engine must implement.
public abstract class MusicServer {
    public abstract void Play();
}

public class WinAmpServer: MusicServer {
    public override void Play() {
        Console.WriteLine("WinAmpServer.Play()");
    }
}

public class MediaServer: MusicServer {
    public override void Play() {
        Console.WriteLine("MediaServer.Play()");
    }
}

class Test {
    // Knows only about MusicServer, not about any particular engine.
    public static void CallPlay(MusicServer ms) {
        ms.Play();   // the version to call is chosen at runtime
    }

    public static void Main() {
        MusicServer ms = new WinAmpServer();
        CallPlay(ms);
        ms = new MediaServer();
        CallPlay(ms);
    }
}
This code produces the following output:
WinAmpServer.Play()
MediaServer.Play()
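The description of the abstract class also mentions that it sometimes provides functions that are useful to both derived classes. A sketch of that idea follows. Play() and NextSong() are named in the text, but SkipAndPlay() is a hypothetical helper invented for this illustration, and the functions return strings here (rather than writing to the console) only to keep the sketch short.

```csharp
using System;

public abstract class MusicServer {
    // Each engine must supply these.
    public abstract string Play();
    public abstract string NextSong();

    // Hypothetical shared helper: written once in the abstract class and
    // inherited by every engine. The abstract calls inside it are resolved
    // at runtime to the engine that is actually in use.
    public string SkipAndPlay() {
        return NextSong() + ", then " + Play();
    }
}

public class WinAmpServer: MusicServer {
    public override string Play() { return "WinAmpServer.Play()"; }
    public override string NextSong() { return "WinAmpServer.NextSong()"; }
}

public class MediaServer: MusicServer {
    public override string Play() { return "MediaServer.Play()"; }
    public override string NextSong() { return "MediaServer.NextSong()"; }
}

class SharedFunctionDemo {
    public static void Main() {
        MusicServer ms = new WinAmpServer();
        Console.WriteLine(ms.SkipAndPlay());
        ms = new MediaServer();
        Console.WriteLine(ms.SkipAndPlay());
    }
}
```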
Polymorphism and virtual functions are used in many places in the .NET Runtime system. For example, the base class object has a virtual function called ToString() that is used to convert an object into a string representation of the object. If you call the ToString() function on an object that doesn’t have its own version of ToString(), the version of the ToString() function that’s part of the object class will be called,[3] which simply returns the name of the class. If you override (write your own version of) the ToString() function, that one will be called instead, and you can do something more meaningful, such as writing out the name of the employee contained in the employee object. In the music system, this meant overriding functions for play, pause, next song, etc.
[3] Or, if there is a base class of the current object, and it defines ToString(), that version will be called.
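The ToString() behavior can be sketched with three hypothetical classes (Plain, NamedEmployee, and Manager are made-up names): one with no ToString() of its own, one that supplies its own version, and one that picks up the version defined by its base class.

```csharp
using System;

// No ToString() of its own: the version in object is used.
public class Plain {
}

// Supplies its own ToString(), which replaces the one from object.
public class NamedEmployee {
    private string name;

    public NamedEmployee(string name) {
        this.name = name;
    }

    public override string ToString() {
        return "Employee: " + name;
    }
}

// No ToString() of its own, but its base class defines one.
public class Manager: NamedEmployee {
    public Manager(string name): base(name) {
    }
}

class ToStringDemo {
    public static void Main() {
        // Console.WriteLine(object) calls ToString() on its argument.
        Console.WriteLine(new Plain());                // prints: Plain
        Console.WriteLine(new NamedEmployee("Pat"));   // prints: Employee: Pat
        Console.WriteLine(new Manager("Lee"));         // prints: Employee: Lee
    }
}
```

The classes here are declared outside any namespace, so the default version prints just the class name. For a class inside a namespace, the name returned includes the namespace.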
Encapsulation and Visibility
When designing objects, the programmer gets to decide how much of the object is visible to the user, and how much is private within the object. Details that aren’t visible to the user are said to be encapsulated in the class. In general, the goal when designing an object is to encapsulate as much of the class as possible. The most important reasons for doing this are these:
- The user can’t change private things in the object, which reduces the chance that the user will either change or depend upon such details in their code. If the user does depend on these details, changes made to the object may break the user’s code.
- Changes made in the public parts of an object must remain compatible with the previous version. The more that is visible to the user, the fewer things can be changed without breaking the user’s code.
- Larger interfaces increase the complexity of the entire system. Private fields can only be accessed from within the class; public fields can be accessed through any instance of the class. Having more public fields often makes debugging much tougher.
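The reasons above can be illustrated with a small sketch. The Account class and its members are hypothetical. The balance is private, so the only way to change it is through Deposit(), which gives the class one place to enforce its rules and gives a debugger one place to look.

```csharp
using System;

// Hypothetical class with a deliberately small public surface.
public class Account {
    // Encapsulated: code outside the class cannot read or write this field.
    private decimal balance;

    // Read-only view of the private data.
    public decimal Balance { get { return balance; } }

    // The only way to change the balance, so the check cannot be bypassed.
    public void Deposit(decimal amount) {
        if (amount <= 0) {
            throw new ArgumentOutOfRangeException("amount", "Deposit must be positive.");
        }
        balance += amount;
    }
}

class EncapsulationDemo {
    public static void Main() {
        Account account = new Account();
        account.Deposit(100m);
        Console.WriteLine(account.Balance);

        // account.balance = -50m;   // would not compile: balance is private

        try {
            account.Deposit(-50m);
        } catch (ArgumentOutOfRangeException) {
            Console.WriteLine("Rejected invalid deposit");
        }
    }
}
```

If balance were a public field instead, any code holding an Account could set it to any value, and changing how the balance is stored would risk breaking that code.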