C++ Machine Objects


Contents

Introduction

The Machine Objects class library (in short Macho) supports a subset of the UML statechart notation for implementing hierarchical state machines in straight C++, similar in spirit to the "State" design pattern. The currently supported features are hierarchical states, entry and exit actions, state histories and state variables.

The "just show me code, who has time to read" link to an example state machine: Microwave

For license information please see the MIT License or the header in file Macho.hpp.

Homepage is at  http://ehiti.sdf-eu.org/macho
Feedback goes to  macho@gmx.org

Copyright 2005 by Eduard Hiti.

You are encouraged to provide any changes, extensions and corrections for this software to the author for inclusion into future versions.

Motivation

I like the "Gang of Four" State design pattern. It enables implementing the important concept of state machines with common programming language features. By utilizing only basic language mechanisms it is easy to apply in real-life software development.

Another important property that stems from this simplicity is orthogonality, meaning that the pattern can be combined with other design elements, patterns and idioms in arbitrary ways.

In contrast stand the tool supported approaches to state machine creation (of which there is no shortage). Based on code generators and graphical editors, they tend to generate incomprehensible code as product and forfeit orthogonality by necessarily being outside the domain of the programming language.

Unfortunately the "State" pattern is limited in scope because it does not allow for hierarchical state machines. This is regrettable because flat state machines tend to become unwieldy when getting bigger, for the sheer number of states they produce.

Hierarchical state machines as defined by the statechart notation alleviate this problem by providing an additional structural element with the grouping of states into hierarchies, thereby facilitating the sharing of common logic between states.

The "State" pattern in its original form is not capable of modeling state hierarchies. The Macho class library extends the concept with this possibility, while keeping the properties of simplicity (where possible) and tool independence from its inspiration.

Installation

The class library as such does not need to be installed. Just include the header file Macho.hpp to make use of it. Prerequisite however is a C++ compiler with sane support for templates (like gcc 2.95+ or MSVC7+). For the obsolete but inexplicably popular compiler MSVC6 a special version of the library is provided.

Included are the example state machines HelloWorld, Example, Microwave and Test. To make the examples run just compile them in the directory they are in, for example:

# GCC
g++ -o microwave Microwave.cpp

# MSVC7
cl /EHsc Microwave.cpp

Design

The following descriptions assume some knowledge of the statechart notation. For more information see Wikipedia.

Starting point is the GoF "State" design pattern. In that pattern states are represented by classes. State transitions are performed by instantiating objects of those classes.

From this aspect the constructors and destructors of state classes can be seen as taking on the role of entry and exit actions. Object attributes then constitute state variables. Events are dispatched by calling methods on state objects which implement the guards, actions and transitions of states.

For states to compose a hierarchy, substates must be able to take over the event handling logic of superstates, selectively redefining it where necessary. There exists a mechanism in C++ allowing such redefinitions on method level: polymorphism through class inheritance.

Modeling the substate-superstate relation with class inheritance is problematic however: the use of constructors and destructors as entry and exit actions is not possible anymore, neither is keeping state variables in objects.

The reason is that base classes are constituent parts of deriving classes, meaning object construction or destruction will trigger all involved class constructors or destructors and initialize or destroy all data members.

This runs counter to the semantics of entry/exit actions and state variables, where a transition between sibling substates should not trigger superstate entry/exit actions nor destroy superstate state variables.

Our solution to this problem is to use explicit methods for state entry and exit, being called in the correct sequence on state transitions. State variables are kept in separate state specific data structures which have a life cycle consistent with the hierarchy of states.

Using Machine Objects

Before diving into implementation details we first define some terminology to be used from here on. By state we mean the class defined to represent a particular state. A state machine is a set of states having a common top state class. A state machine instance maintains a current state to which events can be dispatched. A state machine's event protocol is the set of events that are understood by the machine.

Furthermore we will use extracts from the simple state machine defined in the file Example.cpp for illustration. It is a good idea to have a look at this file now if you are reading this for the first time. It contains a state machine with four states: a top state, a superstate Super, two substates StateA and StateB, and a small test run. Each state has entry/exit actions, the top state and StateA have state variables and the event protocol consist of two events, event1 and event2.

State definitions

The top state of the example is defined by using the macro TOPSTATE:

TOPSTATE(Top) {
    ...
};

Every state machine must have a top state. The top state's interface defines the machine's event protocol: only the public virtual methods of the top state can be event handlers.

All other states of a machine are direct or indirect substates of the top state. The top state has a typedef alias TOP known to all states in the machine.

Substates are defined by the SUBSTATE macro:

SUBSTATE(Super, Top) {
    ...
};

This example defines the class Super as substate of class Top. The macro parameters are the new substate's name and the name of its superstate. The superstate can be the top state or any other substate. A typedef alias SUPER will point to the superstate within the substate class.

Why are we using macros in Macho? The reason being only convenience, you'll most likely agree with their use if you look at the definitions of the macros TOPSTATE and SUBSTATE:

#define TOPSTATE(TOP) \
    struct TOP : public Macho::Link< TOP, Macho::Root< TOP >  >

#define SUBSTATE(STATE, SUPERSTATE) \
    struct STATE : public Macho::Link< STATE, SUPERSTATE >

The code is more readable and actually less error-prone (notice the multiple use of macro parameters in TOPSTATE and SUBSTATE) by using macro expansion.

Another macro is STATE(S) which MUST appear in every state body:

SUBSTATE(Super, Top) {
    STATE(Super)
    ...
};

The macro parameter is the state's name again. With this macro some definitions are provided in the class body (a constructor for instance). For more details look up the macro's definition.

Every state must be instantiable (the top state as well)! This means states may not have pure virtual methods!

Entry and Exit

A state's (optional) entry and exit actions are defined this way:

TOPSTATE(Top) {
    ...
private:
    void entry();
    void exit();
    void init();
};

void Top::entry() { ... }
void Top::exit() { ... }
void Top::init() { ... }

The methods entry and exit of a state are called upon transitioning into or out of it. The sequence of calls for a state transition is as follows:

From this follows that self transitions (where the new state is the same state as the current state) will trigger that state's exit and entry actions.

Entry/exits actions may NOT initiate new state transitions! Doing otherwise will cause an assertion.

The method init defines a special kind of entry action: Upon state transition, entry actions of the new state and its superstates are triggered; init however is called only on the one state the transition actually goes to, after its entry action is performed. Please see the section History for an exception if the state has history enabled.

With init the initial transitions of states are implemented, where a superstate immediately enters a substate on direct entry (like top state in Example). This implies that init actions ARE in fact allowed to initiate state transitions (but only to substates). Please note that transitioning to a state in init will trigger that new state's init method in turn.

It is recommended that you put the methods exit, entry and init into the private section of your classes, at least for the top state, so that they may not be part of the event protocol.

Constructors or destructors can not be used for state classes. If you need to make initializations, do them in your entry actions or box constructors!

State variables

State variables are data members that have the lifetime of their associated state. For superstate variables this means that they exist as long as the machine is in that state or in any of its substates. Substates have full access to state variables of their superstates.

State variables allow states to accumulate information from the events they have received in the past. They can also be used to parametrize a new state to be entered by initializing them accordingly.

Since every state is a substate of the top state, top state variables are accessible to all states of a machine.

Here is an example of a definition of state variables:

TOPSTATE(Top) {
    struct Box {
        Box() : myData(0) {}
        Box(long l) : myData(l) {}
        long myData;
    };
    STATE(Top)
    ...
};

State variables are contained in a data type named Box nested into the state class (the type name Box is mandatory). The box definition must appear BEFORE the use of the STATE macro.

The box type must be default constructable (means: has a default constructor), but may have other constructors, too. Apart from this the box can be any regular C++ type. It could even be just a simple typedef:

TOPSTATE(Top) {
    typedef int Box;
    STATE(Top)
    ...
};

A box is created before its state is entered (before the call to entry) and destroyed after the state is left (after the call to exit).

A state's box is accessed by calling the method box, which returns a reference to the state's box object:

void StateA::event1(int i) {
    ...
    cout << box().myData;
    ...
}

Superstate boxes are accessed by qualifying the box method with the superstate's name:

void StateB::event2(long l) {
    ...
    cout << TOP::box().myData;
    ...
}

Here the top state's box is requested.

Event handlers

As mentioned the event protocol of a state machine is defined by its top state interface:

TOPSTATE(Top) {
    ...
    virtual void event1(int i) {}
    virtual void event2(long l) {}
    ...
};

The Example state machine understands the events event1 and event2: events are named after their event handler methods. Event handlers may have arbitrary parameters.

Event handlers are simple C++ methods:

SUBSTATE(StateA, Super) {
    ...
    void event1(int i);
    ...
};

void StateA::event1(int i) { ... }

In this fragment the event handler for event1 in state StateA is defined. The method implements all the logic to handle the event.

The top state event handlers define the default behaviour for the whole state machine. If there is no meaningful implementation for an event handler at top level, the handler could either

State transitions

State transitions are made by calling the method setState inside an event handler:

void StateA::event1(int i) {
    ...
    setState<StateB>();
}

The template parameter to setState is the new state to be entered.

The state transition takes place AFTER control flow leaves the event handler. This means it is possible to do meaningful work even after calling setState (all involved objects are still alive). It is however not allowed to call setState multiple times with different states in a single event handler run: this will assert on you!

There is an optional parameter to setState:

void Top::init() {
    setState<StateA>(new StateA::Box(44));
}

By creating a state's box prior transition it is possible to parametrize a new state by initializing its box suitably.

We recommend that you do not define event handlers inline to your state class definitions. The reason is that C++ needs a complete definition of a class to be used as template parameter, and that may not be the case for the classes you want to setState to in inline event handlers.

History

It is possible for a superstate to remember a previously entered substate. The next (direct) entry into the superstate will then reenter the remembered substate.

History is activated for a state by use of the macro HISTORY or DEEPHISTORY:

SUBSTATE(Super, Top) {
    STATE(Super)
    HISTORY()
    ...
};

Use of HISTORY selects the shallow history strategy, whereby only direct substates of a superstate are remembered. DEEPHISTORY remembers even the substates of substates.

History is only evaluated for a state if it is the direct target of a state transition. If the transition goes to one of its substates, that substate is entered without checking superstate history.

If a state has history enabled, its init method will not be called if a history state is available. This is because history takes precedence over initial transitions. This means that for a state with history, init is called only the very first time the state is entered.

It is possible to ignore history information for a state transition however by using the method setStateDirect instead of setState:

void StateA::event1(int i) {
    ...
    setStateDirect<Super>();
}

This example will enter Super without checking its history, calling init instead.

Keep in mind that state boxes are not remembered but rather instantiated anew for reentered history states!

Machine creation

A state machine instance is created by instantiating a machine object:

Macho::Machine<Example::Top> m();

The class Machine is a template class in namespace Macho; its template parameter is the top state of the state machine to be run.

The top state is immediately entered and initialized upon machine creation (entry and init are called on it).

Upon destruction of the machine object the current sub- and superstates up to and including the top state are exited correctly and their boxes destroyed.

The Machine constructor takes as optional parameter a box for the top state:

Macho::Machine<Example::Top> m(new Example::Top::Box(11));

This allows parametrization of a state machine on startup.

It is also possible to read the top state box of a running machine by calling the box method of Machine:

cout << m.box().myData;

The method box of class Machine returns a const reference to the top state box. It is therefore not possible to change box data (you should really do this from INSIDE your machine)!

A state's history for a particular machine instance can be cleared by calling the state's static method clearHistory with the machine object as parameter:

Example::Super::clearHistory(m);

This statement resets history information for Super inside machine m, without affecting substate history however (use clearHistoryDeep for this).

Another static method of state classes allows testing for the current state of a machine object:

assert(Example::StateA::isCurrent(m));

The method isCurrent returns true if the given machine object is in the selected state or any of its substates at that moment (use isCurrentDirect to ignore substates).

Event dispatch

The simplest way to dispatch events (synchronously) to a state machine is by calling event handlers through the machine object's arrow operator, a technique commonly found with C++ smart pointers:

m->event1(42);
m->event2(43);

Another possibility is to create explicit event objects that are then used as parameters to a machine's dispatch method:

IEvent<Example::Top> * event = Event(&Example::Top::event1, 42);
m.dispatch(event);

The Event function takes as parameters a pointer-to-member to an event handler and all parameters needed to invoke that event handler.

Currently the number of possible event parameters for Event is limited to four, but this limit can easily be raised. Of course the event parameters must be consistent with the event handler's signature.

The result of an Event call is a pointer to an object with the interface IEvent<T> created on the heap (with T being the top state of the state machine to dispatch to). This pointer can then be queued for later asynchronous dispatching:

typedef vector<IEvent<Example::Top> *> EventQueue;
EventQueue queue;
queue.push_back(Event(&Example::Top::event1, 42));
...
for (EventQueue::iterator it = queue.begin(); it != queue.end(); ++it)
    m.dispatch(*it);

Event objects are deleted after being dispatched to a machine and cannot be reused afterwards. To avoid memory leaks you should delete undispatched event objects explicitly.

To maximize performance the use of a custom memory allocator for event objects optimized for high frequency allocation of small objects is advised. Such an allocator however is not included in the Macho library. We recommend the SmallObject allocator in the free Loki library.

The Event function uses type inference to determine the types of event parameters. Since type inference is somewhat limited in C++ you must sometimes disambiguate event parameters by casting to avoid type errors:

m.dispatch(Event(&Example::Top::event2, (long) 42));

In this example the literal value 42 needs to be cast to long because the compiler will infer type int which is not the type expected by the event handler event2.

For compilers not supporting "Koenig lookup" name resolution you will have to call Event fully qualified:

Macho::Event(&Example::Top::event1, 42);

Internal event dispatch

Event objects can also be dispatched inside an event handler:

void StateA::event1(int i) {
    ...
    dispatch(Event(&Top::event2, (long) 42));
}

The difference to calling the event handler directly is that the event object is dispatched after control flow has left the event handler, and then not until a pending state transition has been performed. Therefore the following code fragment:

void StateA::event1(int i) {
    setState<StateB>();
    dispatch(Event(&Top::event1, i));
}

is equivalent to:

void StateA::event1(int i) {
    dispatch(Event(&Top::event1, i));
    setState<StateB>();
}

The consequence in both examples is that the event object will be dispatched in the context of a new state.

The main intended use of this feature is indeed the "forwarding" of events to a new state better suited to handle a particular event.

There are similarities to the concept of "deferred" events in the UML specification, although these have conceptual problems (to be discussed in a later version of this document) which prohibit their implementation in the Macho library.

Please note that only a single event can be dispatched inside an event handler. This is not a technical limitation, but rather an aesthetical: Event chains are hard to follow (since any event could change state) and make the state machine brittle to changes in structure.

Because it should always be possible to coalesce event chains into a single new event having the same side effects (the state machine is deterministic after all), there seems to be no good reason to allow internal dispatching of more than one event.

For similar reasons it is also not possible to dispatch events in entry/exit actions or init.


For more detailed information about Macho please consult the implementation source code and included examples.


Conclusion

The Macho library makes it possible to create complex state machines in C++ with the well known mechanisms of class inheritance and object polymorphism. But let us look a little harder: there is something else going on, too. With state machines stripped of their predominant graphical notation, a different concept seems to shine through: type migration!

In implementation and behaviour it is conceivable to view Machine Objects as objects changing their type within a single-rooted multi-level inheritance tree at runtime!

This could be used as an approach to tackle a long standing problem in object-oriented programming: the combination of mutable values with static taxonomies. The issue thereby is that values of specific type could change at runtime in such ways that their type's invariant property does not hold anymore.

The classic example of this problem is the case of Rectangle/Square: a type Square can reasonably be viewed as a subtype of Rectangle. By changing a Square object's width independently from its height (an operation quite conceivable for a Rectangle), a square could become a rectangle and vice versa.

Currently there are no convincing solutions in standard object-oriented programming languages for these kind of problems. Machine Objects cope with the Rectangle/Square case easily, and could be used for much more complex situations.

FAQ


Q: I'm getting weird compiler errors...
A: Check if you have
  • used the STATE macro in your state class bodies
  • defined your boxes before using the STATE macro
  • declared your top state with the TOPSTATE macro
  • disambiguated event parameters by casting
Q: How fast is this stuff?
A: Benchmarking the nontrivial state machine Test (without printing to console) gives me around 1.000.000 state transitions per second on my 2.4 GHz P4 machine.

Q: Does it work with the Visual Studio 6 compiler?
A: You'll find a variant version of Macho for this compiler in the "msvc6" directory. Please consult the contained "Readme" file for differences!

Version History