This package contains helper classes for building a parser based on the well-known RegExp scheme, with a strongly object-oriented approach. The key goal is human readability.
The base superclass of this package is the AbstractRegularExpression class.
Two subclasses, AlternateExpression and SequenceExpression, then help build a grammar tree;
they perform RegExp-like OR and AND operations respectively.
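The OR/AND composition described above can be sketched as follows. This is a minimal, self-contained illustration only: the Expr interface and the literal, sequence and alternate helpers are hypothetical stand-ins written for this example, and they are not the API of AlternateExpression or SequenceExpression, whose constructors and methods are not shown here.

```java
public class GrammarTreeSketch {
    /** Hypothetical stand-in for a node of the grammar tree (not the package's class). */
    public interface Expr {
        /** Returns the position just after the match, or -1 if there is no match. */
        int match(String input, int pos);
    }

    /** Matches a fixed piece of text at the current position. */
    public static Expr literal(String text) {
        return (input, pos) -> input.startsWith(text, pos) ? pos + text.length() : -1;
    }

    /** AND: every part must match, one after the other. */
    public static Expr sequence(Expr... parts) {
        return (input, pos) -> {
            int p = pos;
            for (Expr part : parts) {
                p = part.match(input, p);
                if (p < 0) return -1; // one failing part fails the whole sequence
            }
            return p;
        };
    }

    /** OR: the first choice that matches wins (ordered choice, no backtracking). */
    public static Expr alternate(Expr... choices) {
        return (input, pos) -> {
            for (Expr choice : choices) {
                int p = choice.match(input, pos);
                if (p >= 0) return p;
            }
            return -1;
        };
    }

    /** True if the expression consumes the whole input. */
    public static boolean matchesFully(Expr e, String input) {
        return e.match(input, 0) == input.length();
    }

    public static void main(String[] args) {
        // Grammar tree: ("hello" | "hi") followed by " world"
        Expr greeting = sequence(alternate(literal("hello"), literal("hi")), literal(" world"));
        System.out.println(matchesFully(greeting, "hi world"));  // prints true
        System.out.println(matchesFully(greeting, "hey world")); // prints false
    }
}
```

The sketch omits the Context and Pool collaborators entirely; it only shows how nesting OR and AND nodes yields a tree that mirrors the grammar.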
Regular expressions work hand in hand with two important classes:
an instance of the Context class, which feeds successive pieces of text to the set of regular expressions that make up
the grammar tree, and an instance of the Pool class, which allows regular expressions to share data across the whole grammar tree.
A RootExpression can then be used to build a stand-alone parser, by serving as a communication hub
between the context, the pool and the various regular expressions that make up the tree.
We use the classical callback mechanism to process parsing events. However,
instead of using a single, separate content handler (as most XML/HTML parsers do),
each AbstractRegularExpression has its own content handler, which is part of the same class and is implemented
in the action(ParserEvent e) method. This allows subclasses to
specialize the content-handling behaviour alone, simply by overriding that method
(see e.g. InstanciationExpression for an example).
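The per-expression callback idea can be sketched as follows. Only the method name action(ParserEvent e) comes from the description above; the ParserEvent fields, the LiteralSketch class and its parse method are assumptions made up for this example, and the real classes may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

public class ActionOverrideSketch {
    /** Hypothetical event type; the real ParserEvent may carry different data. */
    public static class ParserEvent {
        private final String text;
        public ParserEvent(String text) { this.text = text; }
        public String getText() { return text; }
    }

    /** Hypothetical expression that owns its content handler. */
    public static class LiteralSketch {
        private final String literal;
        public LiteralSketch(String literal) { this.literal = literal; }

        /** Fires action(...) when the input starts with the literal. */
        public boolean parse(String input) {
            if (!input.startsWith(literal)) return false;
            action(new ParserEvent(literal));
            return true;
        }

        /** Default content handler: does nothing. */
        public void action(ParserEvent e) { }
    }

    /** Subclass that changes the content handling only; matching is inherited. */
    public static class RecordingLiteral extends LiteralSketch {
        public final List<String> seen = new ArrayList<>();
        public RecordingLiteral(String literal) { super(literal); }

        @Override
        public void action(ParserEvent e) { seen.add(e.getText()); }
    }

    public static void main(String[] args) {
        RecordingLiteral keyword = new RecordingLiteral("begin");
        keyword.parse("begin block"); // matches, so action(...) is called
        keyword.parse("end block");   // no match, so action(...) is not called
        System.out.println(keyword.seen); // prints [begin]
    }
}
```

Keeping the handler on the expression itself means that the matching logic and the reaction to a match travel together, at the cost of one subclass per distinct behaviour.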
As a rule of thumb, developers should avoid dealing directly
with the Context instance in their own implementations as far as possible, and rely instead on the existing
helper classes, since these already encapsulate much of the tricky communication scheme.
To set the stage, the classes LiteralExpression, NumericalExpression,
StatementExpression and WordExpression form a minimal set
of helpers from which fairly complicated grammar rules can be built easily.
In addition, the classes OptionalExpression and RepeatExpression make it
easy to handle the RegExp-like *, + and ? operations.
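The three quantifiers can all be expressed as a bounded repetition: ? is 0 to 1 occurrences, * is 0 or more, and + is 1 or more. The sketch below illustrates that idea only. The Expr interface and the repeat, optional, zeroOrMore and oneOrMore helpers are hypothetical names invented for this example; how OptionalExpression and RepeatExpression actually expose their bounds is not described here.

```java
public class RepeatSketch {
    /** Hypothetical stand-in for a grammar node (not the package's class). */
    public interface Expr {
        /** Returns the position just after the match, or -1 if there is no match. */
        int match(String input, int pos);
    }

    /** Matches a fixed piece of text at the current position. */
    public static Expr literal(String text) {
        return (input, pos) -> input.startsWith(text, pos) ? pos + text.length() : -1;
    }

    /** Greedy repetition: matches inner between min and max times. */
    public static Expr repeat(Expr inner, int min, int max) {
        return (input, pos) -> {
            int p = pos;
            int count = 0;
            while (count < max) {
                int next = inner.match(input, p);
                if (next < 0) break;  // inner no longer matches
                count++;
                if (next == p) break; // an empty match counts once, then stop to avoid spinning
                p = next;
            }
            return count >= min ? p : -1;
        };
    }

    /** RegExp-like '?' */
    public static Expr optional(Expr inner) { return repeat(inner, 0, 1); }

    /** RegExp-like '*' */
    public static Expr zeroOrMore(Expr inner) { return repeat(inner, 0, Integer.MAX_VALUE); }

    /** RegExp-like '+' */
    public static Expr oneOrMore(Expr inner) { return repeat(inner, 1, Integer.MAX_VALUE); }

    public static void main(String[] args) {
        Expr ab = literal("ab");
        System.out.println(oneOrMore(ab).match("ababx", 0)); // prints 4
        System.out.println(zeroOrMore(ab).match("x", 0));    // prints 0
        System.out.println(oneOrMore(ab).match("x", 0));     // prints -1
        System.out.println(optional(ab).match("abab", 0));   // prints 2
    }
}
```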