presented by Mathias Ricken
Welcome to our workshop about object-oriented design! In the next three hours, we will use two example projects, a small temperature calculator and a large card game framework, to discuss object-oriented software design.
In this first part of the workshop, we will use the temperature calculator to talk about problem analysis and about general themes in object-oriented programming, such as abstraction, separation of variants and invariants, and decoupling, along with their benefits. We do, however, assume that you are familiar with the basic principles of object-oriented programming like polymorphism, inheritance, and composition.
In the second part of the workshop, we will use the card game framework for a group exercise. At the end of the workshop, volunteers get the chance to describe their designs, and we will compare their solutions and the one we have created.
Let's assume that instead of traveling to beautiful Houston, Texas, to attend this conference, you would take a vacation. Your cousin Pierre has invited you to spend two weeks with his family in Paris, France. Before you leave, however, he warns you: "It's 35 degrees here!" Remembering that France measures temperatures in degrees Celsius, you frown... Does Pierre want you to pack your winter coat or your bathing suit?
Of course, we all know how to solve this problem. Simple algebra is sufficient; we may just have to look up the formula before we can apply it: F = (C / 5) * 9 + 32. In this case, C = 35, so the solution is
F = (35 / 5) * 9 + 32 = 7 * 9 + 32 = 63 + 32 = 95
35 degrees Celsius are equivalent to 95 degrees Fahrenheit. Pack the bathing suit.
By performing this little calculation, we have solved our simple problem: We know how hot it is in Paris and can pack accordingly. Since we have solved the problem for just one value, the 35 degrees Celsius, this solution is called a "constant solution".
The good thing about this program is that it is simple, easy to understand, and most importantly it correctly solves the original problem. However, if we change the problem slightly to converting 30 degrees Celsius to Fahrenheit, the program is no longer applicable. It is correct, but not flexible.
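As a rough sketch, the constant solution could be written as a Java method without parameters. The class and method names below are assumptions made for illustration; they are not part of the workshop code.

```java
// Hypothetical sketch of the "constant solution": the value 35 is hard-coded,
// so this method answers exactly one question and no other.
class ConstantSolution {
    /** Returns the Fahrenheit equivalent of exactly 35 degrees Celsius. */
    static double convert35CtoF() {
        return 35.0 / 5.0 * 9.0 + 32.0;
    }
}
```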
Intuitively, though, the problem has not really changed at all: the "real" problem is about converting a temperature measurement in Celsius to the corresponding value in Fahrenheit. What has changed is the value of the temperature measurement.
The above constant program fails to capture the variant/invariant nature of the mathematical formula, and as a result, cannot be used to solve the same conversion problem with a different temperature measurement.
We are Computer Scientists. We are interested in solving families of problems. What we want to do is to write a program that has an invariant part that corresponds to the formula and a variant part that allows entering arbitrary values for C. We therefore write a Java method that mirrors the mathematical formula:
public double convertCtoF(double degreesCelsius) { return degreesCelsius / 5.0 * 9.0 + 32.0; }
The formula is encoded in the method body and refers to the value of the parameter, degreesCelsius, for the temperature. This solution captures the mathematical formula more faithfully. The solution is still simple and easy to understand, but now we are able to solve all problems that involve converting temperature in Celsius to Fahrenheit!
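A short usage sketch shows the same method answering both Pierre's question and the slightly changed one. The wrapper class name is hypothetical, and the method is made static here only so that the example is self-contained.

```java
// Hypothetical demo class; only convertCtoF's body comes from the text.
class CelsiusToFahrenheitDemo {
    /** Invariant part: the formula. Variant part: the parameter value. */
    static double convertCtoF(double degreesCelsius) {
        return degreesCelsius / 5.0 * 9.0 + 32.0;
    }

    public static void main(String[] args) {
        System.out.println(convertCtoF(35.0)); // prints 95.0
        System.out.println(convertCtoF(30.0)); // prints 86.0
    }
}
```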
We have achieved this by separating variants from invariants. Here is a UML class diagram of our solution:
The applet below demonstrates the new functionality:
Our program still does not do a lot: It only allows converting from Celsius to Fahrenheit, not conversion in the opposite direction. Granted, we have already solved infinitely more problems than we had originally planned, but anyone using our program will quickly tell us that it's inadequate. This becomes especially obvious if the program has a graphical user interface (GUI) like our program above: Why is one text field enabled and the other disabled?
If this program is intended for more than just personal use, we may find that we have to add the capability to convert in both directions. So we enable the second text field and modify our model by adding a second method to it:
public double convertFtoC(double degreesFahrenheit) { return (degreesFahrenheit - 32.0) / 9.0 * 5.0; }
public double convertCtoF(double degreesCelsius) { return degreesCelsius / 5.0 * 9.0 + 32.0; }
Here is the updated UML class diagram:
The applet below shows the added feature:
While we can now convert in both directions, there isn't very much new. We have just added a second method, one that does the same thing as the first one: It applies a mathematical formula to the parameter and returns the value. To achieve the new feature, we had to add code to our program. In fact, if we just look at our model, we had to double the amount of code!
The pattern for extending this to a third temperature scale, perhaps Kelvin, should be clear: We just add more methods that convert from one temperature scale to another. Here's code for our model that supports arbitrary conversion between Celsius, Fahrenheit, and Kelvin:
public double convertFtoC(double degreesFahrenheit) { return (degreesFahrenheit - 32.0) / 9.0 * 5.0; }
public double convertCtoF(double degreesCelsius) { return degreesCelsius / 5.0 * 9.0 + 32.0; }
public double convertCtoK(double degreesCelsius) { return degreesCelsius + 273.15; }
public double convertKtoC(double kelvin) { return kelvin - 273.15; }
public double convertFtoK(double degreesFahrenheit) { return (degreesFahrenheit + 459.67) / 9.0 * 5.0; }
public double convertKtoF(double kelvin) { return kelvin / 5.0 * 9.0 - 459.67; }
While the code still isn't complicated, it is getting a little long, considering we are in principle always doing the same thing: Converting from one temperature scale to another. It's shocking that we had to add four new methods, for a total of six, to support just one additional temperature scale. If we were to add a fourth scale, we would have to have 12 methods already. In general, for n temperature scales, we need n(n-1) methods. The size of our code increases quadratically with the number of temperature scales!
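To make the n(n-1) count concrete, here is a small sketch that enumerates the method names needed for a given set of scales. The class and its helper are hypothetical and only follow the naming pattern of the methods above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists one conversion method per ordered pair of distinct scales.
class PairwiseMethods {
    static List<String> methodNames(String... scales) {
        List<String> names = new ArrayList<String>();
        for (String from : scales) {
            for (String to : scales) {
                if (!from.equals(to)) {          // skip "conversions" of a scale to itself
                    names.add("convert" + from + "to" + to);
                }
            }
        }
        return names;
    }
}
```

Calling methodNames("C", "F", "K") yields six names, and adding a fourth scale yields twelve, matching the counts in the text.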
It could be even worse: Depending on the way we present the model to the user, even more methods may be needed. Consider this applet and its model code:
public double convertFtoF(double degreesFahrenheit) { return degreesFahrenheit; }
public double convertCtoF(double degreesCelsius) { return degreesCelsius / 5.0 * 9.0 + 32.0; }
public double convertFtoC(double degreesFahrenheit) { return (degreesFahrenheit - 32.0) / 9.0 * 5.0; }
public double convertCtoC(double degreesCelsius) { return degreesCelsius; }
While it supports only two temperature scales, Fahrenheit and Celsius, the users are allowed to select the scale used for input and output independently. They may therefore "convert" from Celsius to Celsius or from Fahrenheit to Fahrenheit. While the size of our code still grows quadratically, we now need a full n^2 methods to support this, and when we write code, we certainly care about the difference between n^2 and n(n-1)!
You may say that this notion of converting doesn't make sense, that nobody would ever want such a GUI, but that might not be true if your converter is part of a larger program package, one that does more data processing. Perhaps the user can choose from a library of formulas that are written in terms of different units, but input and output should be independent of the units the formula uses. NASA may tell you that it's a bad idea to use formulas that use more than one unit for the same quantity, but we are told people nonetheless do that from time to time.
We can improve our situation slightly by realizing that (ignoring issues of numerical stability) we don't have to convert from Celsius to Kelvin directly. We can do so by converting from Celsius to Fahrenheit first, something our old model can already do, and then converting from Fahrenheit to Kelvin. We can consider the temperature scales as nodes of a directed graph, and the conversion methods of our model as edges: As long as there is a path from every node to every other node, we can perform arbitrary conversions.
The easiest way to guarantee this is to set up a star-shaped graph: We place one temperature scale in the center and all other scales around it; then we provide conversion methods to the center and back. The choice of the temperature scale in the middle is arbitrary. It doesn't even have to be one of the scales we are using; it can be completely made up as long as it is consistent and there are conversions to and from all the other scales.
The star-shaped topology means that we now only have to add two conversion methods for every new scale, one to the scale in the center and one back from it. The size of our code now only grows linearly.
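The star-shaped topology can be sketched as follows. This is only an illustration under stated assumptions: the enum name, the method names toCenter and fromCenter, and the choice of Celsius as the center are all hypothetical, and the Kelvin offset uses 273.15.

```java
// Hypothetical sketch: each scale supplies exactly two methods, one to the
// center of the star (assumed to be Celsius here) and one back from it.
enum StarScale {
    CELSIUS {
        double toCenter(double v) { return v; }
        double fromCenter(double v) { return v; }
    },
    FAHRENHEIT {
        double toCenter(double v) { return (v - 32.0) / 9.0 * 5.0; }
        double fromCenter(double v) { return v / 5.0 * 9.0 + 32.0; }
    },
    KELVIN {
        double toCenter(double v) { return v - 273.15; }
        double fromCenter(double v) { return v + 273.15; }
    };

    abstract double toCenter(double value);
    abstract double fromCenter(double value);

    /** Any conversion is a path of two edges: source -> center -> target. */
    static double convert(double value, StarScale from, StarScale to) {
        return to.fromCenter(from.toCenter(value));
    }
}
```

Adding a fourth scale would mean adding one more constant with its two methods; the convert method stays untouched.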
There are a few lessons we can take away from this simple problem at this time: First, the specifications are going to change. Regardless of what the users state as the task initially, when they see the product, they will ask for changes. The way you design your GUI may have a large impact on that. Of course, the users will probably have to pay for changing the specifications, but you still want to be able to make the changes as easily as possible. That requires programming for change, i.e. anticipating the need for modification and writing your program in a flexible and extensible way. Ideally, you would want to add features at runtime, like new temperature scales, without any change to existing code. NASA may also be interested in writing robust code, i.e. programming in a way that helps you avoid errors.
Please take a look at the applet below. It displays a list of three temperature scales, Celsius, Fahrenheit, and Kelvin, and allows you to select several of them. When you have selected two or more, you may press the "Convert..." button to open another window that supports the arbitrary conversion between the selected temperature scales.
Furthermore, there is an "Add Unit..." button that opens up a dialog with a text field. You can load a new temperature scale by typing in its fully qualified class name: Try model.temperature.Reamur, for example. If you are using the applet on the website or the jar file, you can only use the temperature scales we have provided, but if you extract the jar and experiment with it, you can write your own classes and load them at runtime.
Loading temperature scales at runtime rules out hard-coding all temperature scales in the model. Let's take a moment to figure out how we can write the program below using the principles we have seen so far. Remember that we were able to solve all Celsius-to-Fahrenheit conversions by separating the variant (the temperature value) from the invariant (the mathematical formula). What are the variants and invariants in this problem?
You can add the following class names using the "Add Unit..." button: model.temperature.Celsius, model.temperature.Fahrenheit, model.temperature.Kelvin, model.temperature.Reamur
To simplify our design, let's assume that we are using a star-shaped topology, as we have discussed before. For simplicity, let's pick a base temperature scale that is identical to the Celsius scale (i.e. the conversion functions from the Celsius scale to the base scale and back are just the identity function).
Now we just need to provide the two functions for a temperature scale, one to and one from the base scale, to define and load a new temperature scale at runtime. Since Java does this on a class basis, we need to encapsulate this into a class.
In the functional programming world, functions have been treated as data for a long time. Such a function, called a lambda, can be passed around, stored, and then later applied. In the object-oriented world, it is often also called a command (there is also a command design pattern). We can easily model a unary lambda interface using generic Java:
public interface ILambda <R, P> {
    /**
     * Apply the lambda.
     * @param param lambda-specific parameter
     * @return lambda-specific return value
     */
    public R apply(P param);
}
A function accepting a Double and returning a Double twice the magnitude can be written as an anonymous inner class like this.
ILambda<Double,Double> dbl = new ILambda<Double,Double>() {
public Double apply(Double param) { return param * 2; }
};
Whenever we want to use the lambda, we apply it to a value:
double d = dbl.apply(10.0); // d now contains 20
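As a side note, on Java 8 or later an interface with a single abstract method can also be implemented with a lambda expression. The sketch below redeclares an interface of the same shape as ILambda so that it is self-contained; the demo class name is hypothetical.

```java
// Same shape as the ILambda interface in the text, redeclared for a standalone example.
interface ILambda<R, P> {
    R apply(P param);
}

// Hypothetical demo class.
class LambdaDemo {
    // A lambda expression replaces the anonymous inner class (requires Java 8+).
    static final ILambda<Double, Double> DBL = param -> param * 2;

    public static void main(String[] args) {
        double d = DBL.apply(10.0);
        System.out.println(d); // prints 20.0
    }
}
```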
The two conversion functions that a temperature scale has to provide happen to be inverses of each other: f(g(x)) = x. Functions that have an inverse are called bijections. We therefore model a bijection using Java, which can be done by extending the ILambda interface:
public interface IBijection <R,P> extends ILambda<R, P> {
    /**
     * Returns the inverse of this lambda, which is also a bijection.
     *
     * @return inverse
     */
    public IBijection<P, R> getInverse();
}
An IBijection is just an ILambda that can provide its own inverse, which is another IBijection. Note that the domain of one function (the parameter type) is the range of its inverse (the return type), and vice versa, but since our conversion functions map real numbers (Double) to real numbers, both conversion functions are IBijection<Double,Double>.
To pick up our example of the function that doubles its input, here is the bijection that does the same:
IBijection<Double,Double> dbl = new IBijection<Double,Double>() {
    IBijection<Double,Double> _this = this;
    public Double apply(Double param) { return param * 2; }
    public IBijection<Double,Double> getInverse() {
        return new IBijection<Double,Double>() {
            public Double apply(Double param) { return param / 2; }
            public IBijection<Double,Double> getInverse() { return _this; }
        };
    }
};
double d = dbl.apply(10.0); // d now contains 20
d = dbl.getInverse().apply(20.0); // d now contains 10
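The same idea carries over to temperature scales. The following sketch shows what a conversion function for Fahrenheit might look like, assuming the base scale is identical to Celsius. The class name FahrenheitToBase is hypothetical, and the two interfaces are redeclared only to keep the example self-contained.

```java
// Interfaces of the same shape as in the text, redeclared for a standalone example.
interface ILambda<R, P> {
    R apply(P param);
}

interface IBijection<R, P> extends ILambda<R, P> {
    IBijection<P, R> getInverse();
}

// Hypothetical class: maps degrees Fahrenheit to the base scale (assumed to be Celsius).
class FahrenheitToBase implements IBijection<Double, Double> {
    // The inverse maps the base scale back to Fahrenheit and points back to this object.
    private final IBijection<Double, Double> inverse = new IBijection<Double, Double>() {
        public Double apply(Double base) { return base / 5.0 * 9.0 + 32.0; }
        public IBijection<Double, Double> getInverse() { return FahrenheitToBase.this; }
    };

    public Double apply(Double degreesFahrenheit) {
        return (degreesFahrenheit - 32.0) / 9.0 * 5.0;
    }

    public IBijection<Double, Double> getInverse() {
        return inverse;
    }
}
```

Applying the function and then its inverse returns the original value, up to floating-point rounding.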
There is another part of a temperature scale that we have ignored so far: the string representation, i.e. "°C", "°F", or "K". Since there are several pieces of data (the conversion functions and the string representation), these should be bundled together into a single class that represents an abstract temperature scale. If they are stored together as part of one class, there is less risk that data that belongs together gets disassociated. This is the purpose of the AUnit class.
For the same reason, we should create a class to represent a quantity in one of the temperature scales, i.e. a number-unit pair. This Measurement class represents both the quantity (a double) and the unit (an AUnit) together and prevents misinterpreting the quantity as being expressed in another unit.
Since a unit is an object, we can build intelligence into it and provide a method to automatically convert a measurement expressed in one unit to a measurement in another unit. This is an example of the template method design pattern and the very heart of our design:
public abstract class AUnit {
...
/**
* Converts a measurement in some unit to a measurement in this unit.
* @param m measurement in some unit
* @return measurement in this unit
*/
public Measurement convertTo(Measurement m) {
double amount = m.getValue();
double amountInBase = m.getUnit().getConversionFunction().apply(amount);
double amountInThis = getConversionFunction().getInverse().apply(amountInBase);
return new Measurement(amountInThis, this);
}
}
This method abstractly encodes the entire process of converting to the base scale (by applying m.getUnit().getConversionFunction() to amount) and back to the desired scale (by applying getConversionFunction().getInverse() to amountInBase). This method represents our invariant: It is always the same. The variants are encoded in the conversion functions that are being used from within this template:
model.temperature.Celsius, model.temperature.Fahrenheit, model.temperature.Kelvin, and model.temperature.Reamur extend the abstract class AUnit and provide the conversion functions and the string representation, thereby placing the variants in the appropriate gaps in the template.
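To show how the template and its gaps might fit together, here is a compact, self-contained sketch. It is not the workshop's actual source: the names getConversionFunction, convertTo, getValue, and getUnit follow the text, while getSymbol, the Shift helper, and the bodies of Celsius and Kelvin are assumptions made for illustration.

```java
interface ILambda<R, P> { R apply(P param); }
interface IBijection<R, P> extends ILambda<R, P> { IBijection<P, R> getInverse(); }

/** Number-unit pair, as described in the text. */
class Measurement {
    private final double value;
    private final AUnit unit;
    Measurement(double value, AUnit unit) { this.value = value; this.unit = unit; }
    double getValue() { return value; }
    AUnit getUnit() { return unit; }
}

abstract class AUnit {
    /** Variant: maps a value in this unit to the base scale. */
    abstract IBijection<Double, Double> getConversionFunction();
    /** Variant: string representation (accessor name is an assumption). */
    abstract String getSymbol();

    /** Invariant template: source unit -> base scale -> this unit. */
    Measurement convertTo(Measurement m) {
        double amountInBase = m.getUnit().getConversionFunction().apply(m.getValue());
        double amountInThis = getConversionFunction().getInverse().apply(amountInBase);
        return new Measurement(amountInThis, this);
    }
}

/** Hypothetical helper: x -> x + offset, with inverse x -> x - offset. */
class Shift implements IBijection<Double, Double> {
    private final double offset;
    Shift(double offset) { this.offset = offset; }
    public Double apply(Double x) { return x + offset; }
    public IBijection<Double, Double> getInverse() { return new Shift(-offset); }
}

/** The base scale is assumed to be identical to Celsius, so the shift is zero. */
class Celsius extends AUnit {
    IBijection<Double, Double> getConversionFunction() { return new Shift(0.0); }
    String getSymbol() { return "\u00B0C"; }
}

class Kelvin extends AUnit {
    IBijection<Double, Double> getConversionFunction() { return new Shift(-273.15); }
    String getSymbol() { return "K"; }
}
```

With these classes, new Kelvin().convertTo(new Measurement(35.0, new Celsius())) yields a measurement of roughly 308.15 in Kelvin, and neither subclass contains any conversion logic beyond its own path to the base scale.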
To allow dynamic loading, the program keeps a list of the AUnit instances it has available and adds each newly loaded unit to that list. It does not care which units they are; in fact, right now it does not even care whether a unit is present several times. Try adding Celsius again! The program is completely oblivious to the variant behavior of the different temperature scales and just deals with them at the highest level of abstraction.
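Loading a unit by its class name could look roughly like the sketch below, which uses the standard Class.forName reflection API. The UnitRegistry and DemoKelvin classes and the getSymbol accessor are hypothetical, and AUnit is reduced to a minimal stand-in; the actual program may do this differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal stand-in for the AUnit class from the text. */
abstract class AUnit {
    abstract String getSymbol(); // assumed accessor, not taken from the source
}

/** Example unit that can be loaded by name; it needs a no-argument constructor. */
class DemoKelvin extends AUnit {
    String getSymbol() { return "K"; }
}

/** Hypothetical registry: keeps the list of available units. */
class UnitRegistry {
    private final List<AUnit> units = new ArrayList<AUnit>();

    /** Loads the named class, instantiates it, and adds it to the list. */
    AUnit addUnit(String className) {
        try {
            Class<? extends AUnit> cls = Class.forName(className).asSubclass(AUnit.class);
            AUnit unit = cls.getDeclaredConstructor().newInstance();
            units.add(unit); // duplicates are not rejected, as in the text
            return unit;
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load unit: " + className, e);
        }
    }

    int size() { return units.size(); }
}
```

The registry only ever sees AUnit references, so it never needs to know which concrete scale it has just loaded.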
For greater detail, please refer to the UML class diagram below and the source code.
Ultimately, we can employ the same techniques of abstraction and separating variants from invariants to convert more than just temperatures. Abstractly, in our program a temperature is just a number with a unit attached to it, and we can convert it to another number with a unit attached as long as the two units measure the same thing (temperature, so far).
The applet below provides a general unit conversion calculator for arbitrary "measurable" quantities. As examples, we have included temperature and distance. For each "measurable", units can be added at runtime, just like we have seen before. However, we can also add new "measurables" at runtime, like weight, or even currency (we might want it to be backed by gold to have a common denominator). You can select a "measurable" from the drop-down box, and then select two or more units of it for the conversion.
You can add the following class names using the "Add Unit..." button: model.temperature.Celsius, model.temperature.Fahrenheit, model.temperature.Kelvin, model.temperature.Reamur, model.length.Meter, model.length.Yard, model.length.Mile, model.volume.Liter, model.volume.Pint, model.volume.Gallon
You can add the following class names using the "Add..." entry in the "measurables" drop-down box: model.temperature.ITemperature, model.length.ILength, model.volume.IVolume
The program still works using the same design as the temperature calculator above, except that we keep multiple lists of units and only allow conversion among units in the same list: temperatures to temperatures, lengths to lengths, etc. This brings up an additional problem: How do we ensure that we cannot convert a temperature to a length? We could check the class name when a unit is added to a list, but that only provides protection from user errors, not from programmer errors.
We decided to use Java generics again and "tag" the AUnit and Measurement classes with an interface describing what it measures:
public class Measurement <M extends IMeasurable> { ... }
public abstract class AUnit <M extends IMeasurable> {
    ...
    public Measurement<M> convertTo(Measurement<M> m) { ... }
}
Celsius now extends AUnit<ITemperature>, whereas Meter extends AUnit<ILength>. Since the convertTo method expects and returns a Measurement with the same tag as the unit that serves as target of the conversion, the compiler will issue a compile-time error if the programmer tries to convert to an incompatible unit.
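The effect of the tag can be sketched in a few lines. To stay short, this example replaces the bijection with two plain methods, toBase and fromBase, which are assumptions and not part of the original design; TaggedDemo is a hypothetical class, and the length base scale is assumed to be the meter.

```java
interface IMeasurable {}
interface ITemperature extends IMeasurable {}
interface ILength extends IMeasurable {}

class Measurement<M extends IMeasurable> {
    private final double value;
    private final AUnit<M> unit;
    Measurement(double value, AUnit<M> unit) { this.value = value; this.unit = unit; }
    double getValue() { return value; }
    AUnit<M> getUnit() { return unit; }
}

abstract class AUnit<M extends IMeasurable> {
    // Simplified stand-ins for the bijection used in the text (assumption).
    abstract double toBase(double amount);
    abstract double fromBase(double amountInBase);

    Measurement<M> convertTo(Measurement<M> m) {
        double amountInBase = m.getUnit().toBase(m.getValue());
        return new Measurement<M>(fromBase(amountInBase), this);
    }
}

class Meter extends AUnit<ILength> {
    double toBase(double amount) { return amount; }
    double fromBase(double amountInBase) { return amountInBase; }
}

class Yard extends AUnit<ILength> {
    double toBase(double amount) { return amount * 0.9144; }   // 1 yard = 0.9144 m
    double fromBase(double amountInBase) { return amountInBase / 0.9144; }
}

class Celsius extends AUnit<ITemperature> {
    double toBase(double amount) { return amount; }
    double fromBase(double amountInBase) { return amountInBase; }
}

class TaggedDemo {
    static double yardsToMeters(double yards) {
        Measurement<ILength> inYards = new Measurement<ILength>(yards, new Yard());
        // new Celsius().convertTo(inYards);  // rejected by the compiler:
        // Measurement<ILength> cannot be passed where Measurement<ITemperature> is expected
        return new Meter().convertTo(inYards).getValue();
    }
}
```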
Unfortunately, because Java implements generics by type erasure, type parameters are not available at runtime. We therefore still have to perform the class name check when the user adds a unit to a list, and for dynamically loaded units we largely have to work without these compile-time safety measures.
This solution is not only correct, it is also flexible and extensible. We can add units, even completely new unit systems, at runtime. It is robust as well, because it uses an intelligent object to represent the number-unit pair that avoids misinterpreting a number of one unit for one in another unit. Since the program is modular (so classes can be loaded at runtime), it is also easy to test: Once the framework has been tested, classes to be added at runtime can be tested in isolation from each other.
We have achieved this again by programming abstractly, by separating variants from invariants, by encoding the template of converting from one unit to another in the framework, and then encapsulating the varying conversion formulas in external objects.
Let's move on to a bigger, more interesting example and apply these ideas!