C++ Machine Objects

Table of Contents

1   Introduction
2   Motivation
3   Installation
4   Design
5   Using Machine Objects
6   Conclusion
7   FAQ
8   Version History

1   Introduction

The Machine Objects class library allows the creation of state machines based on the "State" design pattern in plain C++. It extends the pattern with the option to create hierarchical state machines, making it possible to convert the popular UML statechart notation to working code in a straightforward way. Other features are entry and exit actions, state histories and state variables.

The "just show me code" link to an example state machine: Microwave

For license information please see the MIT License or the header in file Macho.hpp.

Homepage is at  http://ehiti.de/machine_objects   Syndicate This Site (Atom)
Feedback goes to  reverse:de.ehiti@macho

Copyright 2005 by Eduard Hiti.

You are encouraged to provide any changes, extensions and corrections for this software to the author for inclusion into future versions.

2   Motivation

In my experience as software developer I have found the State design pattern to be very useful. It enables implementing the important concept of state machines with common programming language features. By using only basic language mechanisms it is easy to apply in real-life software development.

Another important property that stems from this simplicity is orthogonality: the pattern can be combined with other design elements, patterns and idioms in arbitrary ways.

In contrast stand tool supported approaches to state machine creation (of which there is no shortage). Based on code generators and graphical editors, they tend to generate incomprehensible code as product and forfeit orthogonality by necessarily being outside the domain of the programming language.

Unfortunately the "State" pattern is limited in scope because it does not allow for hierarchical state machines. This is regrettable because flat state machines tend to become unwieldy when getting more elaborate, for the sheer number of states they produce.

Hierarchical state machines as defined by the statechart notation alleviate this problem by providing an additional structural element with the grouping of states into hierarchies, thereby facilitating the sharing of common logic between states.

The "State" pattern in its original form is not capable of modeling state hierarchies. The Machine Objects class library (in short Macho) extends the concept with this possibility, while keeping the properties of simplicity (where possible) and tool independence from its inspiration.

3   Installation

The class library as such does not need to be installed. Just include the header file Macho.hpp and add the file Macho.cpp to your make file or project definition.

Prerequisite however is a C++ compiler with sane support for templates (like gcc 2.95+ or MSVC7+). For the obsolete but inexplicably popular compiler MSVC6 a special version of the library is provided.

Included are the example state machines HelloWorld, Example, Microwave and Test. To make the examples run just compile them in the directory they are in, for example:

# GCC
g++ -o microwave Microwave.cpp Macho.cpp

# MSVC7
cl /EHsc Microwave.cpp Macho.cpp

4   Design

The following descriptions assume some knowledge of the statechart notation. For more information see Wikipedia .

Starting point is the "State" design pattern. The essence of the pattern is to represent states by classes. State transitions are performed by instantiating objects of these classes.

From this perspective the constructors and destructors of state classes can be seen as taking on the role of entry and exit actions. Object attributes then constitute state variables. Events are dispatched by calling methods on state objects which implement the guards, actions and transitions of states.

For states to compose a hierarchy, substates must be able to take over the event handling logic of superstates, selectively redefining it where necessary. There exists a mechanism in C++ allowing redefinition of behaviour on the level of methods: polymorphism through class inheritance.

Modeling the substate/superstate relation with class inheritance is problematic however: the use of constructors and destructors as entry and exit actions is not possible anymore, neither is keeping state variables in objects.

The reason is that base classes are constituent parts of deriving classes, meaning object construction or destruction will trigger all involved class constructors or destructors and initialize or destroy all data members.

This runs counter to the semantics of entry/exit actions and state variables, where a transition between sibling substates should not trigger superstate entry/exit actions nor destroy superstate state variables.

Our solution to this problem is to use explicit methods for state entry and exit, being called in the correct sequence on state transitions. State variables are kept in separate state specific data structures which have a life cycle consistent with the hierarchy of states.

5   Using Machine Objects

Before diving into implementation details we first define some terminology to be used from here on.

By state class we mean the class defined to represent a particular state. A state machine is a partially ordered set of states having a common top state class. A state machine instance maintains an element of that set as current state to which events can be dispatched. A state machine's event protocol is the set of events that are understood by the machine.

Furthermore extracts from the simple state machine defined in the file Example.cpp will be used for illustration. It is a good idea to have a look at this file now if you are reading this for the first time. It contains a state machine with four states: a top state, a superstate Super, two substates StateA and StateB, and a small test run. Each state has entry/exit actions, the top state and StateA have state variables and the event protocol consist of two events, event1 and event2.

5.1   State Definitions

A top state is defined by the macro TOPSTATE:

TOPSTATE(Top) {
    ...
};

Every state machine must have a top state. The top state's interface defines the machine's event protocol: only the public virtual methods of the top state can be event handlers.

Note:The default visibility in states is public.

The top state is representative for the whole state machine. All other states of a machine are direct or indirect substates of the top state.

Note: The top state has a typedef alias TOP available to all states of the machine.

Substates are defined by the SUBSTATE macro:

SUBSTATE(Super, Top) {
    ...
};

This statement defines the class Super as substate of class Top. The macro parameters are the substate's name and the name of its superstate. The superstate can be the top state or any other substate.

Note: A typedef alias SUPER will point to the superstate within the substate class.

Why are macros used for Machine Objects? The reason being convenience only, you'll most likely agree with their use if you look at the definitions of the macros TOPSTATE and SUBSTATE:

#define TOPSTATE(TOP) \
    struct TOP : public Macho::Link< TOP, Macho::TopBase< TOP >  >
#define SUBSTATE(STATE, SUPERSTATE) \
    struct STATE : public Macho::Link< STATE, SUPERSTATE >

The code is more readable and actually less error-prone (notice the multiple use of the same macro parameter in TOPSTATE and SUBSTATE) by using macro expansion.

Another macro is STATE(S) which MUST be invoked in every state body:

SUBSTATE(Super, Top) {
    STATE(Super)
    ...
};

The macro parameter is the state's name again. With this macro some definitions are provided in the class body (a constructor for instance). For more details look up the macro's definition.

Note:
 
Every state class must be instantiable (the top state as well). This means states must not have pure virtual methods!

5.2   Entry and Exit

A state's (optional) entry and exit actions are defined this way:

TOPSTATE(Top) {
    ...
private:
    void entry();
    void exit();
    void init();
};

void Top::entry() { ... }
void Top::exit() { ... }
void Top::init() { ... }

The methods entry and exit of a state are called upon transitioning into or out of it. The sequence of calls for a state transition is as follows:

Note:
 
From this follows that self transitions (where the new state is the same state as the current state) will trigger that state's exit and entry actions.

Entries and exits may NOT initiate new state transitions (doing otherwise will cause an assertion).

The method init defines a special kind of entry action: Upon state transition, entry actions of the new state and its superstates are triggered; init however is called only on the one state the transition actually goes to (after its entry action is performed).

With init initial transitions of states can be implemented, where a superstate immediately enters a substate on entry (like the top state in Example). This implies that init actions ARE in fact allowed to initiate state transitions (but only to substates). Transitioning to a state in init will trigger that new state's init method in turn.

Constructors or destructors can not be used for state classes: if you need to make initializations, do them in entry actions or box constructors.

Note:
 
It is recommended that you put the methods exit, entry and init into the private section of your classes, at least for the top state, so that they are not part of the event protocol.

5.3   State Variables

State variables allow states to accumulate information from the events they have received in the past. This information is maintained in the scope of the associated state: the variables of a superstate are accessible to that state or any of its substates. Substates have full access to state variables of their superstates.

Note:
 
Since every state is a substate of the top state, top state variables are accessible to all states of a machine.

This is how state variables are defined:

TOPSTATE(Top) {
    struct Box {
        Box() : data(0) {}
        long data;
    };
    STATE(Top)
    ...
};

State variables are contained in a data type named Box nested into the state class (the type name Box is mandatory).

Note:
 
The box definition must appear BEFORE the use of the STATE macro.

The box type must be default constructable (means: has a default constructor). Apart from this requirement the box type can be any regular C++ type. It could even be a simple typedef:

TOPSTATE(Top) {
    typedef int Box;
    STATE(Top)
    ...
};

A box is created before its state is entered (before the call to entry) and by default destroyed after the state is left (after the call to exit). By marking a state class with the PERSISTENT macro, you can override this default behaviour and have boxes survive state transitions:

SUBSTATE(StateA, Top)
    struct Box {
        ...
    };
    STATE(StateA)
    PERSISTENT()
    ...
};

Persistent boxes are created once at first entry of their state and exist for as long as the state machine instance itself exists.

A state's box is accessed by calling the method box, which returns a reference to the state's box object:

void StateA::event1(int i) {
    ...
    cout << box().data;
    ...
}

Superstate boxes are available by qualifying the box method with the superstate's name:

void StateB::event2(long l) {
    ...
    cout << TOP::box().data;
    ...
}

In this example the top state's box is accessed.

5.4   Event Handlers

As mentioned the event protocol of a state machine is defined by its top state public interface:

TOPSTATE(Top) {
    ...
    virtual void event1(int i) {}
    virtual void event2(long l) {}
    ...
};

The Example state machine understands the events event1 and event2: events are named like their event handler methods.

Event handlers are simple C++ methods and may have arbitrary parameters:

SUBSTATE(StateA, Super) {
    ...
    void event1(int i);
    ...
};

void StateA::event1(int i) { ... }

This example defines the event handler for event1 in StateA. The method should implement all the logic to handle the event.

Note:
 
Event handlers may have return values.

The top state event handlers define the default behaviour for the whole state machine. If there is no meaningful implementation for an event handler at top level, the handler could either

5.5   State Transitions

State transitions are made by calling the method setState inside an event handler:

void StateA::event1(int i) {
    ...
    setState<StateB>();
}

The template parameter to setState is the new state to be entered.

Note:
 
 
We recommend that you do not define event handlers inline to your state class definitions.
The reason is that C++ needs a complete definition of a class to be used as template parameter, and that may not be the case for the classes you want to setState to in inline event handlers.

The state transition takes place AFTER control flow leaves the event handler. It is therefore possible to do meaningful work even after calling setState (all involved objects are still alive). It is however not allowed to call setState multiple times with different states in a single event handler run: this will assert on you!

To allow for parametrization of target states setState can take up to six parameters of arbitrary type:

setState<StateB>("Some text", 42, true);

The arguments provided to setState are used to invoke an init method of the target state with matching signature:

void StateB::init(const char *, int, bool) { ... }

If no matching init method is provided a (template mess) compile time error will result.

Note:
 
The setState method uses type inference to determine the types of its parameters. Since type inference is somewhat limited in C++ you must sometimes disambiguate arguments by casting to avoid type errors.

5.5.1   Parametrized State Transitions

The target of a state transition can also be provided as a runtime value:

setState(State<StateA>());

or alternatively

setState(StateA::alias());

These examples create objects of type Alias to represent the state given as template parameter. Objects of this type can also be stored for later use:

Alias state = State<StateA>();
...
setState(state);

The ability to pass around and store state aliases makes it possible to create reusable code by parametrizing state relationships.

State aliases can also be provided with arguments for initialization of the target state on transition:

Alias s = State<StateA>("Some text", 42, true);

For these arguments the same specifics as for setState above apply.

5.5.2   Reflection

To allow for runtime decisions on state relationships these can be inspected at runtime:

// Read like this: StateA "is child of" Super
assert(StateA::isChild(Super::alias()));

// Read like this: Super "is parent of" StateA
assert(Super::isParent(StateA::alias()));

The same can be done with state aliases:

Alias stateA = StateA::alias();
Alias super = Super::alias();

assert(stateA.isChild(super));
assert(super.isParent(stateA));
Note:A state is both child and parent of itself.

State equality is determined as expected:

assert(Super::alias() == Super::alias());
assert(stateA != super);
Note:Transition parameters are not considered then testing for equality.

5.6   History

It is possible for a superstate to remember a previously entered substate. This history state can be reentered later with a special state transition.

History for a state is enabled by invoking the macro HISTORY or DEEPHISTORY in the state definition:

SUBSTATE(Super, Top) {
    STATE(Super)
    HISTORY()
    ...
};

Use of HISTORY selects the shallow history strategy, whereby only direct substates of a superstate are remembered. DEEPHISTORY remembers even the substates of substates.

Note:
 
Keep in mind that by default state boxes are not saved but rather instantiated anew for reentered history states (to override use the PERSISTENT macro)!

The history of a state can be specified as target of a state transition:

setStateHistory<Super>();

This example will enter the history state of Super. Entry actions of all involved states will be invoked, with a final call to the parameterless init method of the actual history state. If no history information is available (because the state has not been entered yet or history was not enabled), Super itself is the target of the transition.

Note:
 
It is not possible to provide transition parameters to setStateHistory because the exact target of the transition is not determinable at compile time.

History information can also be inspected directly:

Alias s = Super::history(machine);

A call to a state's static method history with a machine instance as argument will return the current history of the specified state at the moment of the call for the specified machine instance. If the state has no history information available an alias of the state itself is returned.

Note:
 
To get the current machine instance in an event handler call the method machine().

Similar but slightly different is this special alias:

Alias h = StateHistory<Super>(machine());

This alias will keep pointing to the history of Super, not just the history state at the moment of alias creation. This means that the alias can represent different states in its life, depending on the changes to the history of Super.

Consequently the statement

setStateHistory<Super>();

and this statement:

Alias state = StateHistory<Super>(machine());
...
setState(state);

are equivalent.

Note:
 
The StateHistory alias becomes invalid when the associated machine instance has shutdown.

5.7   Machine Creation

A state machine instance is created by instantiating a machine object:

Macho::Machine<Example::Top> m();

The class Machine is a template class in namespace Macho; its template parameter is the top state of the state machine to be run.

The top state is immediately entered and initialized upon machine creation (entry and init are called).

The Machine constructor takes as optional parameter a box for the top state:

Macho::Machine<Example::Top> m(new Example::Top::Box(42));

This allows parametrization of a state machine on startup.

An alias can be used to specify a different start state:

Machine<Top> m(State<Example::StateA>());

The machine instance will enter the argument state with all appropriate entry and init actions being executed.

Upon destruction of the machine object the current sub- and superstates up to and including the top state are exited correctly and their boxes destroyed.

Note:
 
You can override this behaviour by implementing the predefined event handler _shutdown as empty method.
Boxes are still destroyed of course.

It is also possible to read the top state box of a running machine by calling the box method of Machine:

cout << m.box().data;
Note:
 
The method box of class Machine returns a const reference to the top state box. It is therefore not possible to change box data (you should really do this from INSIDE your machine)!

An alias of the machine's current state can be obtained by calling the method currentState:

assert(m.currentState() == StateA::alias());

5.7.1   Managing Machines

A state's history for a particular machine instance can be cleared by calling the state's static method clearHistory with the machine object as argument:

Super::clearHistory(m);

This statement resets history information for Super inside machine m, without affecting substate history however (use clearHistoryDeep for this).

Another static method of state classes allows testing for the current state of a machine object:

assert(StateA::isCurrent(m));

The method isCurrent returns true if the given machine object is in the specified state or any of its substates at that moment.

Note:StateA::alias() == m.currentState() checks if the machine is exactly in StateA.

To use these methods in event handlers you have to retrieve the current state machine instance by calling the machine method:

void StateA::event1(int i) { StateA::clearHistory(machine()); }

5.7.2   Machine Snapshots

It's possible to create "snapshots" of machine objects:

Machine<Example::Top> machine;
Snapshot<Example::Top> snapshot(machine);

A snapshot object stores the entire configuration of a machine at the time the object is created. This includes box contents, history information and the current state of the machine instance.

Note: Snapshot functionality is not available in event handlers.

The snapshot object can then be used to backtrack machine objects to the stored configuration.

To create snapshots ALL box types of a state machine must have a copy constructor. Because of this additional requirement snapshot functionality must be explicitly enabled by defining the symbol MACHO_SNAPSHOTS at compile time:

g++ -D MACHO_SNAPSHOTS Example.cpp Macho.cpp

To restore a machine object to a previous configuration simply assign the snapshot object to it:

Machine<Example::Top> machine;
Snapshot<Example::Top> snapshot(machine);
machine = snapshot;

This will first shutdown the current machine instance (see this note though), consequently exiting all states and deleting all boxes, and then assigns the snapshot's configuration to the machine object. The snapshot object itself is unaffected by this operation.

Note:
 
By default no entry actions of restored states are called. To override implement the predefined event handler _restore like this: void _restore(Macho::_StateInfo & current) { setState(current); }

Destroying a snapshot object will not execute any exit or entry actions, but will delete all box objects contained by the snapshot, executing their destructors.

5.8   Event Dispatch

The simplest way to dispatch events (synchronously) to a state machine is by calling event handlers through the machine object's arrow operator, a technique commonly found with C++ smart pointers:

m->event1(42);
m->event2(43);

Another possibility is to create explicit event objects that are then used as arguments to a machine's dispatch method:

IEvent<Example::Top> * event = Event(&Example::Top::event1, 42);
m.dispatch(event);

IEvent<Example::Top> * event = Event(&Example::Top::event2, (long) 43);
m.dispatch(event);
Note:Return values of event handlers are not available when dispatching this way.

The Event function takes as arguments a pointer-to-member to an event handler and all arguments needed to invoke that event handler. Of course the event parameters must be consistent with the event handler's signature.

Note:
 
Currently the number of event parameters for Event is limited to six, but this limit can easily be raised.

The result of an Event call is a pointer to an object with the interface IEvent<T> created on the heap (with T being the top state of the state machine to dispatch to). This pointer can then be queued for later asynchronous dispatching:

typedef vector<IEvent<Example::Top> *> EventQueue;
EventQueue q;
q.push_back(Event(&Example::Top::event1, 42));
...
for (EventQueue::iterator it = q.begin(); it != q.end(); ++it)
    m.dispatch(*it);

Event objects are by default deleted after being dispatched to a machine and cannot be reused afterwards. To avoid memory leaks you should delete undispatched event objects explicitly.

The Event function, too, uses type inference to determine the types of event parameters. The same specifics as for setState above apply.

For compilers not supporting "Koenig lookup" name resolution you will have to call Event fully qualified:

Macho::Event(&Example::Top::event1, 42);

5.8.1   Internal Event Dispatch

Event objects can also be dispatched inside an event handler:

void StateA::event1(int i) {
    ...
    dispatch(Event(&Top::event2, (long) 43));
}

The difference to calling the event handler directly is that the event object is dispatched after control flow has left the event handler, and then also not until a pending state transition has been performed. Therefore the following code fragment:

void StateA::event1(int i) {
    setState<StateB>();
    dispatch(Event(&Top::event1, i));
}

is equivalent to:

void StateA::event1(int i) {
    dispatch(Event(&Top::event1, i));
    setState<StateB>();
}

The consequence in both examples is that the event object will be dispatched in the context of a new state.

The main intended use of this feature is indeed the "forwarding" of events to a new state better suited to handle a particular event.

There are similarities to the concept of "deferred" events in the UML specification, although these have conceptual problems (to be discussed in a later version of this document) which prohibit their implementation in the Machine Objects library.

Please note that only a single event can be dispatched inside an event handler. This is not a technical limitation, but rather an aesthetical: Event chains are hard to follow (since any event could change state) and make the state machine brittle to changes in structure.

Since event chains can be coalesced into a single new event having the same side effects (the state machine is deterministic after all), there seems to be no good reason to allow internal dispatching of more than one event.

For similar reasons it is also not possible to dispatch events in entry, exit or init.

5.9   Advanced Topics

The following sections focus on topics not essential for effective use of the Machine Objects library and can therefore be omitted on first reading.

5.9.1   Template States

On occasion considerable amounts of logic are repeatedly being used in different parts of a state machine design, whenever similar processes are to be followed in different contexts (error handling for example).

To achieve a better factoring of this kind of common behaviour so called template states can be used. Template states allow the implementation of fragments of state machine logic that can be plugged into an actual state machine where appropriate.

A template state is defined similarly to ordinary states:

template<class T>
TSUBSTATE(Template, T) {
    TSTATE(Template)
};

This example defines a template state named Template. The difference to previous state definitions is in the use of the TSUBSTATE and TSTATE macros instead of SUBSTATE and STATE, the parameter to these macros being again the name and the superstate of the template state.

As is apparent by the occurence of the template keyword, a template state really is a C++ template class behind the curtains.

The single template parameter to be provided is called the anchor of the template state. The anchor must always be an ordinary, non-template state. It is the state where the template state will hook into the state machine, with the anchor being a direct or indirect superstate of the template state.

The following statement

setState<Template<StateA> >();

consequentally means that the state machine will enter an instance of the template state Template with the anchor state StateA being the direct superstate of that template instance, which therefore inherits all event handlers of StateA.

Template states can themselves form hierarchies, when template states are substates of other template states:

template<class T>
TSUBSTATE(TemplateSub, Template<T>) {
    TSTATE(TemplateSub)
};

In this example the second parameter to the TSUBSTATE macro is another template state, making it the superstate of TemplateSub. The result is a state machine fragment that can be reused repeatedly in a regular state machine simply by providing an anchor state:

setState<TemplateSub<StateB> >();

This statement will put a state machine into state TemplateSub<StateB>, which as defined is a substate of Template<StateB>, which in turn is a substate of anchor StateB. The template parameter to all the template states is the non-template state that is the anchor to the whole state machine fragment.


For more detailed information about Machine Objects please consult the implementation source code and included examples or contact the author.


6   Conclusion

The Machine Objects library supports the creation of complex state machines in C++ with the well known mechanisms of class inheritance and object polymorphism. But let's look again: there is something else going on. With state machines stripped of their predominant graphical notation, a different concept seems to shine through: type migration.

In implementation and behaviour it is conceivable to view Machine Objects as objects changing their type within a single-rooted multi-level inheritance tree at runtime.

This could be used as an approach to tackle a long standing problem in object-oriented programming: the combination of mutable values with static taxonomies. The issue thereby is that values of specific type could change at runtime in such ways that their type's invariant property does not hold anymore.

The classic example of this problem is the case of Rectangle/Square: a type Square can reasonably be viewed as a subtype of Rectangle. By changing a Square object's width independently from its height (an operation quite conceivable for the supertype Rectangle), the object looses its characteristic property of having a width equal to its height and effectively becomes a Rectangle.

Currently there are no convincing solutions in standard object-oriented programming languages for more difficult problems of this kind. Machine Objects cope with the Rectangle/Square case easily, and can be used for more complex situations.

7   FAQ

Q: I'm getting weird compiler errors...
A: Check if you have
  • used the STATE macro in your state class bodies
  • defined your boxes before using the STATE macro
  • declared your top state with the TOPSTATE macro
  • disambiguated inferred argument types by casting
Q: How fast is this stuff?
A: Benchmarking the nontrivial state machine Test (without printing to console) gives me around 1.000.000 state transitions per second on my 2.4 GHz P4 machine.

Q: Does it work with the Visual Studio 6 compiler?
A: You'll find a variant version of Machine Objects for this compiler in the "msvc6" directory. Please consult the contained "Readme" file for differences!

8   Version History



eXTReMe Tracker