Chapter 6: The Lexer, Compiler, Resolver, and Interpreter Objects
Now that you're familiar with Mason's basic syntax and some of its more
advanced features, it's time to explore the details of how the various pieces
of the Mason architecture work together to process components. By knowing
the framework well, you can use its pieces to your advantage, processing
components in ways that match your intentions.
In this chapter we'll discuss four of the persistent objects in the Mason
framework: the Interpreter, Resolver, Lexer, and Compiler. These objects
are created once (in a mod_perl setting, they're typically created when the
server is starting up) and then serve many Mason requests, each of which
may involve processing many Mason components.
Each of these four objects has a distinct purpose. The Resolver is responsible
for all interaction with the underlying component source storage mechanism,
which is typically a set of directories on a filesystem. The main job of the
Resolver is to accept a component path as input and return various properties
of the component such as its source, time of last modification, unique
identifier, and so on.
The Lexer is responsible for actually processing the component source code
and finding the Mason directives within it. It interacts quite closely with the
Compiler, which takes the Lexer's output and generates a Mason component
object suitable for interpretation at runtime.
The Interpreter ties the other three objects together. It is responsible for
taking a component path and arguments and generating the resultant output.
This involves getting the component from the Resolver, compiling it, and then
caching the compiled version so that the next time the Interpreter encounters
the same component it can skip the resolving and compiling phases.
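To make the division of labor concrete, here is a minimal sketch of the resolve, compile, and cache flow in plain Perl. It is a toy model, not Mason's implementation: the Toy::* packages, their methods (get_info, lex, compile, run), and the simplified <% name %> substitution (which only looks up a named argument) are all invented for illustration.

```perl
use strict;
use warnings;

# Toy stand-ins for the four objects. None of these packages exist in Mason.

package Toy::Resolver;
sub new { my ($class, %sources) = @_; return bless { sources => {%sources} }, $class }

# Accept a component path and return properties of the component.
sub get_info {
    my ($self, $path) = @_;
    my $source = $self->{sources}{$path};
    return unless defined $source;
    return { comp_id => $path, source => $source };
}

package Toy::Lexer;
sub new { return bless {}, shift }

# Split the source into alternating plain-text and substitution tokens.
sub lex {
    my ($self, $source) = @_;
    my @parts = split /<%\s*(.*?)\s*%>/s, $source;
    my @tokens;
    for my $i (0 .. $#parts) {
        push @tokens, [ ($i % 2 ? 'subst' : 'text'), $parts[$i] ];
    }
    return @tokens;
}

package Toy::Compiler;
sub new { return bless { lexer => Toy::Lexer->new, compile_count => 0 }, shift }

# Turn the Lexer's tokens into a code reference that produces output.
sub compile {
    my ($self, $source) = @_;
    my @tokens = $self->{lexer}->lex($source);
    $self->{compile_count}++;
    return sub {
        my (%args) = @_;
        return join '',
            map { $_->[0] eq 'text' ? $_->[1] : ($args{ $_->[1] } // '') } @tokens;
    };
}

package Toy::Interp;
sub new {
    my ($class, %p) = @_;
    return bless {
        resolver => $p{resolver},
        compiler => Toy::Compiler->new,
        cache    => {},
    }, $class;
}

# Take a component path and arguments, and return the resulting output.
sub run {
    my ($self, $path, %args) = @_;
    my $code = $self->{cache}{$path};
    unless ($code) {
        # Cache miss: resolve the path, compile the source, remember the result.
        my $info = $self->{resolver}->get_info($path)
            or die "no such component: $path\n";
        $code = $self->{cache}{$path} = $self->{compiler}->compile($info->{source});
    }
    return $code->(%args);
}

package main;

my $resolver = Toy::Resolver->new('/hello' => 'Hello, <% name %>!');
my $interp   = Toy::Interp->new(resolver => $resolver);

print $interp->run('/hello', name => 'World'), "\n";
print $interp->run('/hello', name => 'Mason'), "\n";
print "compiled $interp->{compiler}{compile_count} time(s)\n";
```

The second call to run() finds the compiled code in the cache, so the toy Compiler is invoked only once. A real Interpreter also has to notice when a component's source has changed, which is one reason the Resolver reports the time of last modification.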
Figure 6-1 illustrates the relationship between these four objects. The
Interpreter has a Compiler and a Resolver, and the Compiler has a Lexer.
own custom Mason classes. Chapter 12 covers this in its discussion of the
Class::Container class, where all the funkiness is located.
The Lexer
Mason's built-in Lexer class is, appropriately enough, HTML::Mason::Lexer.
All it does is parse the text of Mason components and pass off the sections
it finds to the Compiler. As of Version 1.10, the Lexer doesn't actually
accept any parameters that alter its behavior, so there's not much for us to
say in this section.
Future versions of Mason may include other Lexer classes to handle
alternate source formats. Some people -- crazy people, we assure you -- have
expressed a desire to write Mason components in XML, and it would be
fairly simple to plug in a new Lexer class to handle this. If you're one of
these crazy people, you may be interested in Chapter 12 to see how to use
objects of your own design as pieces of the Mason framework.
By the way, you may be wondering why the Lexer isn't called a Parser, since
its main job seems to be to parse the source of a component. The answer is
that previous implementations of Mason had a Parser class with a different
interface and role, and a different name was necessary to maintain forward
(though not backward) compatibility.
The Compiler
By default, Mason will use the HTML::Mason::Compiler::ToObject class to do
its compilation. It is a subclass of the generic HTML::Mason::Compiler class,
so we describe here all parameters that the ToObject variety will accept,
including parameters inherited from its parent:
• allow_globals
You may want to allow access to certain Perl variables across all
{ Handle => $dbh, LockHandle => $dbh };
</%init>
Remember, don't go too crazy with globals: too many of them in the
same process space can get very difficult to manage, and in an
environment like Mason's, especially under mod_perl, the process
space can be very large and long-lasting. But a few well-placed and
well-scoped globals can make life nice.
• default_escape_flags
This parameter allows you to set a global default for the escape flags
in <% $substitution %> tags. For instance, if you set
default_escape_flags to 'h', then all substitution tags in your
components will pass through HTML escaping. If you decide that an
individual substitution tag should not obey the
default_escape_flags parameter, you can use the special escape
flag 'n' to ignore the default setting and add whatever additional flags
you might want to employ for that particular substitution tag.
in compiler settings:
default_escape_flags => 'h',
in a component:
You have <% $amount %> clams in your aquarium.
This is <% $difference |n %> more than your rival has.
<a href="emotion.html?emotion=<% $emotion |nu %>">Visit
your <% $emotion %> place!</a>
acts as if you had written:
To specify a different package, set the in_package compiler
parameter. Under normal circumstances you shouldn't concern
yourself with this package name (almost everything in Mason is done
with lexically scoped my variables), but for historical reasons you're
allowed to change it to whatever package you want.
Related settings are the Compiler's allow_globals
parameter/method and the Interpreter's set_global() method.
These let you declare and assign to variables in the package you
specified with in_package, without actually needing to specify that
package again by name.
You may also want to control the package name in order to import
symbols (subroutines, constants, etc.) for use in components.
Although the importing of subroutines seems to be gradually going
out of style as people adopt more strict object-oriented programming
practices, importing constants is still quite popular, and especially
useful in a web context, where various numerical values are used as
HTTP status codes. The following example, meant for use in an