The Machine Objects class library (in short Macho) supports a subset of the UML statechart notation for implementing hierarchical state machines in straight C++, similar in spirit to the "State" design pattern. The currently supported features are hierarchical states, entry and exit actions, state histories and state variables.
The "just show me code, who has time to read" link to an example state machine: Microwave
For license information please see the MIT License or the header in file Macho.hpp.
Homepage is at | http://ehiti.sdf-eu.org/macho |
Feedback goes to | macho@gmx.org |
Copyright 2005 by Eduard Hiti.
You are encouraged to provide any changes, extensions and corrections for this software to the author for inclusion into future versions.
I like the "Gang of Four" State design pattern. It enables implementing the important concept of state machines with common programming language features. By utilizing only basic language mechanisms it is easy to apply in real-life software development.
Another important property that stems from this simplicity is orthogonality, meaning that the pattern can be combined with other design elements, patterns and idioms in arbitrary ways.
In contrast stand the tool supported approaches to state machine creation (of which there is no shortage). Based on code generators and graphical editors, they tend to generate incomprehensible code as product and forfeit orthogonality by necessarily being outside the domain of the programming language.
Unfortunately the "State" pattern is limited in scope because it does not allow for hierarchical state machines. This is regrettable because flat state machines tend to become unwieldy when getting bigger, for the sheer number of states they produce.
Hierarchical state machines as defined by the statechart notation alleviate this problem by providing an additional structural element with the grouping of states into hierarchies, thereby facilitating the sharing of common logic between states.
The "State" pattern in its original form is not capable of modeling state hierarchies. The Macho class library extends the concept with this possibility, while keeping the properties of simplicity (where possible) and tool independence from its inspiration.
The class library as such does not need to be installed. Just include the header file Macho.hpp to make use of it. Prerequisite however is a C++ compiler with sane support for templates (like gcc 2.95+ or MSVC7+). For the obsolete but inexplicably popular compiler MSVC6 a special version of the library is provided.
Included are the example state machines HelloWorld, Example, Microwave and Test. To make the examples run just compile them in the directory they are in, for example:
# GCC g++ -o microwave Microwave.cpp # MSVC7 cl /EHsc Microwave.cpp
The following descriptions assume some knowledge of the statechart notation. For more information see Wikipedia.
Starting point is the GoF "State" design pattern. In that pattern states are represented by classes. State transitions are performed by instantiating objects of those classes.
From this aspect the constructors and destructors of state classes can be seen as taking on the role of entry and exit actions. Object attributes then constitute state variables. Events are dispatched by calling methods on state objects which implement the guards, actions and transitions of states.
For states to compose a hierarchy, substates must be able to take over the event handling logic of superstates, selectively redefining it where necessary. There exists a mechanism in C++ allowing such redefinitions on method level: polymorphism through class inheritance.
Modeling the substate-superstate relation with class inheritance is problematic however: the use of constructors and destructors as entry and exit actions is not possible anymore, neither is keeping state variables in objects.
The reason is that base classes are constituent parts of deriving classes, meaning object construction or destruction will trigger all involved class constructors or destructors and initialize or destroy all data members.
This runs counter to the semantics of entry/exit actions and state variables, where a transition between sibling substates should not trigger superstate entry/exit actions nor destroy superstate state variables.
Our solution to this problem is to use explicit methods for state entry and exit, being called in the correct sequence on state transitions. State variables are kept in separate state specific data structures which have a life cycle consistent with the hierarchy of states.
Before diving into implementation details we first define some terminology to be used from here on. By state we mean the class defined to represent a particular state. A state machine is a set of states having a common top state class. A state machine instance maintains a current state to which events can be dispatched. A state machine's event protocol is the set of events that are understood by the machine.
Furthermore we will use extracts from the simple state machine
defined in the file Example.cpp for
illustration. It is a good idea to have a look at this file now if
you are reading this for the first time. It contains a state
machine with four states: a top state, a superstate
Super
, two substates StateA
and
StateB
, and a small test run. Each state has
entry/exit actions, the top state and StateA
have
state variables and the event protocol consist of two events,
event1
and event2
.
The top state of the example is defined by using the macro
TOPSTATE
:
TOPSTATE(Top) {
...
};
Every state machine must have a top state. The top state's interface defines the machine's event protocol: only the public virtual methods of the top state can be event handlers.
All other states of a machine are direct or indirect substates
of the top state. The top state has a typedef
alias
TOP
known to all states in the machine.
Substates are defined by the SUBSTATE
macro:
SUBSTATE(Super, Top) {
...
};
This example defines the class Super
as substate of
class Top
. The macro parameters are the new substate's
name and the name of its superstate. The superstate can be the top
state or any other substate. A typedef
alias
SUPER
will point to the superstate within the substate
class.
Why are we using macros in Macho? The reason being only
convenience, you'll most likely agree with their use if you look at
the definitions of the macros TOPSTATE
and
SUBSTATE
:
#define TOPSTATE(TOP) \ struct TOP : public Macho::Link< TOP, Macho::Root< TOP > > #define SUBSTATE(STATE, SUPERSTATE) \ struct STATE : public Macho::Link< STATE, SUPERSTATE >
The code is more readable and actually less error-prone (notice
the multiple use of macro parameters in TOPSTATE
and
SUBSTATE
) by using macro expansion.
Another macro is STATE(S)
which MUST appear in
every state body:
SUBSTATE(Super, Top) {
STATE(Super)
...
};
The macro parameter is the state's name again. With this macro some definitions are provided in the class body (a constructor for instance). For more details look up the macro's definition.
Every state must be instantiable (the top state as well)! This means states may not have pure virtual methods!
A state's (optional) entry and exit actions are defined this way:
TOPSTATE(Top) { ... private: void entry(); void exit(); void init(); }; void Top::entry() { ... } void Top::exit() { ... } void Top::init() { ... }
The methods entry
and exit
of a state
are called upon transitioning into or out of it. The sequence of
calls for a state transition is as follows:
First the exit action of the current state is called and then those of its superstates (in bottom up order), up to and excluding the first superstate that is also superstate of the new state.
Then entry actions of superstates of the new state are called, top down from the first superstate that is not also a superstate of the previous state. Finally the entry action of the new state is called.
From this follows that self transitions (where the new state is the same state as the current state) will trigger that state's exit and entry actions.
Entry/exits actions may NOT initiate new state transitions! Doing otherwise will cause an assertion.
The method init
defines a special kind of entry
action: Upon state transition, entry actions of the new state and
its superstates are triggered; init
however is called
only on the one state the transition actually goes to, after its
entry action is performed. Please see the section History for an exception if the state has history
enabled.
With init
the initial transitions of states are
implemented, where a superstate immediately enters a substate on
direct entry (like top state in Example). This implies that
init actions ARE in fact allowed to initiate state transitions (but
only to substates). Please note that transitioning to a state in
init
will trigger that new state's init
method in turn.
It is recommended that you put the methods exit
,
entry
and init
into the private section
of your classes, at least for the top state, so that they may not be
part of the event protocol.
Constructors or destructors can not be used for state classes. If you need to make initializations, do them in your entry actions or box constructors!
State variables are data members that have the lifetime of their associated state. For superstate variables this means that they exist as long as the machine is in that state or in any of its substates. Substates have full access to state variables of their superstates.
State variables allow states to accumulate information from the events they have received in the past. They can also be used to parametrize a new state to be entered by initializing them accordingly.
Since every state is a substate of the top state, top state variables are accessible to all states of a machine.
Here is an example of a definition of state variables:
TOPSTATE(Top) { struct Box { Box() : myData(0) {} Box(long l) : myData(l) {} long myData; }; STATE(Top) ... };
State variables are contained in a data type named
Box
nested into the state class (the type name
Box
is mandatory). The box definition must appear
BEFORE the use of the STATE
macro.
The box type must be default constructable (means: has a default constructor), but may have other constructors, too. Apart from this the box can be any regular C++ type. It could even be just a simple typedef:
TOPSTATE(Top) { typedef int Box; STATE(Top) ... };
A box is created before its state is entered (before the call to
entry
) and destroyed after the state is left (after
the call to exit
).
A state's box is accessed by calling the method
box
, which returns a reference to the state's box
object:
void StateA::event1(int i) { ... cout << box().myData; ... }
Superstate boxes are accessed by qualifying the box
method with the superstate's name:
void StateB::event2(long l) { ... cout << TOP::box().myData; ... }
Here the top state's box is requested.
As mentioned the event protocol of a state machine is defined by its top state interface:
TOPSTATE(Top) { ... virtual void event1(int i) {} virtual void event2(long l) {} ... };
The Example state machine understands the events
event1
and event2
: events are named after
their event handler methods. Event handlers may have arbitrary
parameters.
Event handlers are simple C++ methods:
SUBSTATE(StateA, Super) { ... void event1(int i); ... }; void StateA::event1(int i) { ... }
In this fragment the event handler for event1
in
state StateA
is defined. The method implements all the
logic to handle the event.
The top state event handlers define the default behaviour for the whole state machine. If there is no meaningful implementation for an event handler at top level, the handler could either
assert(false)
): not handling the event will be an
error then.State transitions are made by calling the method
setState
inside an event handler:
void StateA::event1(int i) { ... setState<StateB>(); }
The template parameter to setState
is the new state
to be entered.
The state transition takes place AFTER control flow leaves the
event handler. This means it is possible to do meaningful work even
after calling setState
(all involved objects are still
alive). It is however not allowed to call setState
multiple times with different states in a single event handler run:
this will assert on you!
There is an optional parameter to setState
:
void Top::init() { setState<StateA>(new StateA::Box(44)); }
By creating a state's box prior transition it is possible to parametrize a new state by initializing its box suitably.
We recommend that you do not define event handlers inline to
your state class definitions. The reason is that C++ needs a
complete definition of a class to be used as template parameter,
and that may not be the case for the classes you want to
setState
to in inline event handlers.
It is possible for a superstate to remember a previously entered substate. The next (direct) entry into the superstate will then reenter the remembered substate.
History is activated for a state by use of the macro
HISTORY
or DEEPHISTORY
:
SUBSTATE(Super, Top) { STATE(Super) HISTORY() ... };
Use of HISTORY
selects the shallow history
strategy, whereby only direct substates of a superstate are
remembered. DEEPHISTORY
remembers even the substates
of substates.
History is only evaluated for a state if it is the direct target of a state transition. If the transition goes to one of its substates, that substate is entered without checking superstate history.
If a state has history enabled, its init
method
will not be called if a history state is available. This is because
history takes precedence over initial transitions. This means that
for a state with history, init
is called only the very
first time the state is entered.
It is possible to ignore history information for a state
transition however by using the method setStateDirect
instead of setState
:
void StateA::event1(int i) { ... setStateDirect<Super>(); }
This example will enter Super
without checking its
history, calling init
instead.
Keep in mind that state boxes are not remembered but rather instantiated anew for reentered history states!
A state machine instance is created by instantiating a machine object:
Macho::Machine<Example::Top> m();
The class Machine
is a template class in namespace
Macho
; its template parameter is the top state of the
state machine to be run.
The top state is immediately entered and initialized upon
machine creation (entry
and init
are
called on it).
Upon destruction of the machine object the current sub- and superstates up to and including the top state are exited correctly and their boxes destroyed.
The Machine
constructor takes as optional parameter
a box for the top state:
Macho::Machine<Example::Top> m(new Example::Top::Box(11));
This allows parametrization of a state machine on startup.
It is also possible to read the top state box of a running
machine by calling the box
method of
Machine
:
cout << m.box().myData;
The method box
of class Machine
returns a const
reference to the top state box. It is
therefore not possible to change box data (you should really do
this from INSIDE your machine)!
A state's history for a particular machine instance can be
cleared by calling the state's static method
clearHistory
with the machine object as parameter:
Example::Super::clearHistory(m);
This statement resets history information for Super
inside machine m
, without affecting substate history
however (use clearHistoryDeep
for this).
Another static method of state classes allows testing for the current state of a machine object:
assert(Example::StateA::isCurrent(m));
The method isCurrent
returns true
if
the given machine object is in the selected state or any of its
substates at that moment (use isCurrentDirect
to
ignore substates).
The simplest way to dispatch events (synchronously) to a state machine is by calling event handlers through the machine object's arrow operator, a technique commonly found with C++ smart pointers:
m->event1(42); m->event2(43);
Another possibility is to create explicit event objects that are then
used as parameters to a machine's dispatch
method:
IEvent<Example::Top> * event = Event(&Example::Top::event1, 42); m.dispatch(event);
The Event
function takes as parameters a
pointer-to-member to an event handler and all parameters needed to
invoke that event handler.
Currently the number of possible event parameters for
Event
is limited to four, but this limit can easily be
raised. Of course the event parameters must be consistent with the
event handler's signature.
The result of an Event
call is a pointer to an
object with the interface IEvent<T>
created on
the heap (with T
being the top state of the state machine to dispatch to).
This pointer can then be queued for later
asynchronous dispatching:
typedef vector<IEvent<Example::Top> *> EventQueue; EventQueue queue; queue.push_back(Event(&Example::Top::event1, 42)); ... for (EventQueue::iterator it = queue.begin(); it != queue.end(); ++it) m.dispatch(*it);
Event objects are deleted after being dispatched to a machine and cannot be reused afterwards. To avoid memory leaks you should delete undispatched event objects explicitly.
To maximize performance the use of a custom memory allocator for
event objects optimized for high frequency allocation of small
objects is advised. Such an allocator however is not included in
the Macho library. We
recommend the SmallObject
allocator in the free
Loki library.
The Event
function uses type inference to determine
the types of event parameters. Since type inference is somewhat
limited in C++ you must sometimes disambiguate event parameters by
casting to avoid type errors:
m.dispatch(Event(&Example::Top::event2, (long) 42));
In this example the literal value 42
needs to be
cast to long
because the compiler will infer type
int
which is not the type expected by the event
handler event2.
For compilers not supporting "Koenig lookup"
name resolution you will have to call Event
fully
qualified:
Macho::Event(&Example::Top::event1, 42);
Event objects can also be dispatched inside an event handler:
void StateA::event1(int i) { ... dispatch(Event(&Top::event2, (long) 42)); }
The difference to calling the event handler directly is that the event object is dispatched after control flow has left the event handler, and then not until a pending state transition has been performed. Therefore the following code fragment:
void StateA::event1(int i) { setState<StateB>(); dispatch(Event(&Top::event1, i)); }
is equivalent to:
void StateA::event1(int i) { dispatch(Event(&Top::event1, i)); setState<StateB>(); }
The consequence in both examples is that the event object will be dispatched in the context of a new state.
The main intended use of this feature is indeed the "forwarding" of events to a new state better suited to handle a particular event.
There are similarities to the concept of "deferred" events in the UML specification, although these have conceptual problems (to be discussed in a later version of this document) which prohibit their implementation in the Macho library.
Please note that only a single event can be dispatched inside an event handler. This is not a technical limitation, but rather an aesthetical: Event chains are hard to follow (since any event could change state) and make the state machine brittle to changes in structure.
Because it should always be possible to coalesce event chains into a single new event having the same side effects (the state machine is deterministic after all), there seems to be no good reason to allow internal dispatching of more than one event.
For similar reasons it is also not possible to dispatch events
in entry/exit actions or init
.
For more detailed information about Macho please consult the implementation source code and included examples.
The Macho library makes it possible to create complex state machines in C++ with the well known mechanisms of class inheritance and object polymorphism. But let us look a little harder: there is something else going on, too. With state machines stripped of their predominant graphical notation, a different concept seems to shine through: type migration!
In implementation and behaviour it is conceivable to view Machine Objects as objects changing their type within a single-rooted multi-level inheritance tree at runtime!
This could be used as an approach to tackle a long standing problem in object-oriented programming: the combination of mutable values with static taxonomies. The issue thereby is that values of specific type could change at runtime in such ways that their type's invariant property does not hold anymore.
The classic example of this problem is the case of Rectangle/Square: a type Square can reasonably be viewed as a subtype of Rectangle. By changing a Square object's width independently from its height (an operation quite conceivable for a Rectangle), a square could become a rectangle and vice versa.
Currently there are no convincing solutions in standard object-oriented programming languages for these kind of problems. Machine Objects cope with the Rectangle/Square case easily, and could be used for much more complex situations.
Q: | I'm getting weird compiler errors... |
A: | Check if you have
|
Q: | How fast is this stuff? |
A: | Benchmarking the nontrivial state machine Test (without printing to console) gives me around 1.000.000 state transitions per second on my 2.4 GHz P4 machine. |
Q: | Does it work with the Visual Studio 6 compiler? |
A: | You'll find a variant version of Macho for this compiler in the "msvc6" directory. Please consult the contained "Readme" file for differences! |
clearHistoryDeep
to state
classes.