10.1. typped.pratt_parser¶
A general Pratt parser module that uses dispatching of handler functions and can check types. The API is documented here. See the general Sphinx documentation for Typped for examples and an explanation of how to use the class.
10.1.1. User-accessible parser attributes¶
User-accessible attributes are mostly the same as the initialization keywords to the PrattParser initializer. Most are read-only, but some can be changed between parses.
10.1.2. User-accessible token attributes¶
Token instances straight from a Lexer instance have certain attributes set, as documented in the lexer module. In particular, the token_label, value, and children attributes are commonly used. The pratt_parser module defines its own subclass of the TokenNode class, which additionally assigns some extra attributes when tokens are defined. The parsing process also sets several user-accessible attributes.

Attributes set on TokenNode subclasses (representing kinds of tokens):
token_label
Attributes set on token instances (scanned tokens) during parsing:
parser_instance – the parser instance that parsed the token
original_formal_sig – a TypeSig instance of the resolved original formal signature
expanded_formal_sig – a TypeSig instance of the expanded formal signature
actual_sig – a TypeSig instance of the actual signature
construct_label – the string label of the winning preconditions function
Note that both original_formal_sig and expanded_formal_sig are set to the string "Unresolved" before the token is parsed. The actual signature is found during parsing and type-checking. Out of all possible overloads in the original formal signatures associated with the token (via modify_token) the one which matches the actual arguments is chosen. The expanded formal signature is the same as the original formal signature except that wildcards, etc., are expanded in the attempt to match the actual arguments.
ast_data – any AST data that was set with the construct for the resolved type
eval_fun – any eval_fun that was set with the construct for the resolved type

These two attributes are actually properties which look up the value only when necessary (to avoid unnecessary lookups during parsing). They both only work after parsing, since they use the original_formal_sig to look up the corresponding data or function.
Optional attributes that can be set on a node inside a handler:

not_in_tree – set on a root node returned by the handler to hide it
process_and_check_kwargs – a kwargs dict to pass to the type-checking routine
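For concreteness, here is a minimal sketch that parses a trivial expression and inspects some of these attributes on the returned parse tree. It assumes the default token helpers and the def_literal and def_infix_op methods documented later in this section, with their documented default token labels.

```python
from typped import PrattParser

parser = PrattParser()
parser.def_default_whitespace()            # ignored space and newline tokens
parser.def_default_int_token()             # token with default label "k_int"
parser.def_token("k_plus", r"\+")
parser.def_literal("k_int")                # literals get a head handler
parser.def_infix_op("k_plus", 10, "left")

root = parser.parse("1 + 2")
print(root.token_label)                # "k_plus"
print(len(root.children))              # 2, the literal operands
print(root.parser_instance is parser)  # True
```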
10.1.3. Implementation details¶
This section gives a general overview of the lower-level details of the PrattParser implementation.
10.1.3.1. The basic class structure¶
TODO: Update the diagram and discussion to include the ConstructTable.
There are five basic classes whose instances interact. The main class is the PrattParser class, which users will mostly interact with. The overall relationships are shown in this image, with discussion below.
The next three classes are defined in the lexer module, although one is redefined here. They are the TokenSubclass, TokenTable, and Lexer classes.
A Lexer instance is always initialized with a TokenTable instance, whether it is passed in as an argument or created internally as an empty token table. A PrattParser instance always creates its own token table and then passes that to the lexer, which it also creates.
Every PrattParser instance contains a fixed TokenTable instance, which never changes (except for the tokens in it). So each token table created by a parser can save a pointer back to the parser which “owns” it. Each PrattParser instance also contains a Lexer instance, which contains a pointer to a parser instance (so the lexer can access the parser).
The TokenSubclass class is a subclass of the TokenNode class (which is defined in the lexer module). The subclassing adds many additional methods and attributes which are needed in the parsing application. The TokenSubclass class is actually defined inside a factory function, called token_subclass_factory, which produces a different subclass to represent each kind of token that is defined (tokens are defined via the def_token method of PrattParser). Instances of those subclasses represent the actual tokens (i.e., tokens scanned and returned by the lexer containing individual text-string values).
A TokenTable instance is basically a dict for holding all the defined token-subclasses. But it also has related methods and attributes associated with it. It is where all new tokens are ultimately created and defined, for example (although other classes like the parser class can add extra attributes to the created tokens).
A TokenTable instance contains all the tokens defined for a language, and stays with the PrattParser instance which created it (from which the tokens were necessarily defined). A Lexer instance can use different TokenTable instances, possibly switching on-the-fly. A lexer instance always has a pointer to its current token-table instance, but that can change (such as when separate parsers are swapped in to parse sub-languages in the same text stream). This is used when parser instances call other parser instances.
Tokens defined by a parser also save a pointer to their defining parser, since the token-table has a fixed association to the parser.
Tokens also need to know their current lexer instance because they need to call the next and peek methods, if nothing else. This is equivalent to the token table knowing its current lexer instance. So, whenever a token table is associated with a lexer using the lexer’s set_token_table method it is also given a pointer to that lexer as an attribute.
The final class of the five is the TypeTable class. This is essentially a dict to store all the defined types, but it also provides a nice place to define many methods for acting on types. It is defined in the pratt_types module and imported.
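The sketch below restates these ownership relationships as code. Only the structure is taken from the discussion above; the attribute names used for the pointers (token_table, lex, parser_instance) are assumptions for illustration.

```python
from typped import PrattParser

parser = PrattParser()   # creates its own TokenTable and its own Lexer

# Pointer structure described above (attribute names are assumed):
#   parser.token_table                 -- the fixed table owned by this parser
#   parser.lex                         -- the lexer this parser created
#   parser.token_table.parser_instance -- points back to the owning parser
#   parser.lex.token_table             -- the lexer's current table, which can
#                                         be swapped on-the-fly between parsers
```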
10.1.3.2. Using different parsers inside handler functions¶
It is useful to be able to call different PrattParser instances from inside handler functions in order to parse subexpressions which are defined as sublanguages, having their own parsers. The implementation supports this as follows.
Essentially, a common lexer is passed around and told which token table (and hence parser) to use at any given time. It would be possible to pass around a text stream of unprocessed text, but then the lexers would need to be initialized each time, and saving information like line numbers and columns in the text would need to move to the text stream object.
The parse routine of a PrattParser takes an optional lexer argument, which is used by sub-parsers instead of the default lexer. When parsing a sublanguage with a different parser the TokenTable instance of the lexer is set to be the same as the token table instance of the current parser (using the lexer’s set_token_table method). So you can call the parse method of a different parser instance from within a handler function, passing that other parser’s parse function the current parser’s lexer as an argument. The lexer will use the token table of the new parser but still read from the same text stream as the current parser.
Note that a sublanguage program (or expression or wff) must always be parsed from the beginning, so the parse method is called. When this parser reaches the end, where it would normally stop, the token table of the lexer is restored to the token table of the current parser (again using the lexer’s set_token_table method).
A sublanguage expression can end when the lexer doesn’t recognize a token, or when it would normally return a parsed expression.
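A hedged sketch of what this looks like from inside a handler function. The two-argument head-handler signature follows the handler conventions used in these docs; sublang_parser is assumed to be another configured PrattParser instance, and parse_from_lexer is documented below.

```python
def head_handler(tok, lex):
    # Hand the current lexer to the sub-language parser.  parse_from_lexer
    # temporarily swaps in that parser's token table, parses from the shared
    # text stream, and restores the outer token table when it stops.
    subtree = sublang_parser.parse_from_lexer(lex)
    tok.append_children(subtree)   # append_children is a TokenNode method
    return tok
```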
10.1.4. Code¶
In reading the code, the correspondence between the naming convention used here and Pratt’s original naming conventions is given in this table:
| This code | Pratt’s terminology |
|---|---|
| token precedence | left binding power, lbp |
| subexpression precedence | right binding power, rbp |
| head handler function | null denotation, nud |
| tail handler function | left denotation, led |
- class typped.pratt_parser.TokenSubclassMeta[source]¶
  Bases: type
  A trivial metaclass that will actually create the TokenSubclass objects. Since tokens are represented by classes, rather than instances, this is necessary in order to change their __repr__ (the default one is ugly for tokens) and to overload operators to work for token operands in the EBNF-like grammar.
- typped.pratt_parser.token_subclass_factory()[source]¶
  This function is called from the create_token_subclass method of TokenTable when it needs to create a new subclass to begin with. It should not be called directly.
  Create and return a new token subclass which will be modified and used to represent a particular kind of token. Specifically, each scanned token matching the regex defined for tokens with a given token label is represented as an instance of the subclass created by calling this function (with further attributes, such as the token label, added to it).
  Using a separate subclass for each token label allows for attributes specific to a kind of token (including head and tail handler methods) to later be added to the class itself without conflicts. This function returns a bare-bones subclass without any head or tail functions, etc.
- typped.pratt_parser.lexer_add_parser_instance_attribute(lexer, token)[source]¶
  Passed to the lexer to add a parser_instance attribute to each token it returns. This attribute is added to instances at the lexer, from its current token table, because of the case where parsers call other parsers. (It is not added to general token subclasses in def_token_master because parsers could potentially share token subclasses.)
- typped.pratt_parser.ExtraDataTuple¶
  alias of ExtraHandlerData
- class typped.pratt_parser.PrattParser(max_peek_tokens=None, max_deque_size=None, lexer=None, default_begin_end_tokens=True, type_table=None, skip_type_checking=False, overload_on_arg_types=True, overload_on_ret_types=False, partial_expressions=False, parser_label=None, raise_on_equal_priority_preconds=False)[source]¶
  Bases: object
  A parser object. Each parser object contains a table of defined tokens, a lexer, a table of constructs, and a table of defined types.
- def_token_master(token_label, regex_string=None, on_ties=0, ignore=False, token_kind='regular', ignored_token_label=None, matcher_options=None)[source]¶
  The master method for defining tokens; all the convenience methods actually call it. Allows for factoring out some common code and keeping the attributes of all the different kinds of tokens up-to-date. This routine calls the underlying lexer’s def_token to get tokens and then adds extra attributes needed by the PrattParser class.
  The token_kind argument must be one of the following strings: "regular", "ignored", "begin", "end", "jop", or "null-string". The ignored_token_label is used only when defining a jop.
  Tokens can be shared between parsers if all their properties are the same. Note that for now this includes the precedence value for any tail handlers (since that is made a token attribute). Null-string and jop tokens are the exception, but they are special in that they are never returned by the lexer, only by a particular parser.
- def_token(token_label, regex_string, on_ties=0, ignore=False, matcher_options=None)[source]¶
  Define a token. Use this instead of the Lexer def_token method, since it adds extra attributes to the tokens.
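For example (a hedged sketch; the "k_" prefix on the labels follows the convention of the default token helpers documented below):

```python
parser = PrattParser()
parser.def_token("k_plus", r"\+")
parser.def_token("k_number", r"[0-9]+")
parser.def_token("k_identifier", r"[a-zA-Z_][a-zA-Z0-9_]*")
```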
- def_ignored_token(token_label, regex_string, on_ties=0, matcher_options=None)[source]¶
  A convenience function to define a token with ignore=True.
- def_begin_end_tokens(begin_token_label='k_begin', end_token_label='k_end')[source]¶
  Calls the Lexer method to define begin- and end-tokens. The subclasses are then given initial head and tail functions for use in the Pratt parser. To use the PrattParser this method must be called, not the method of Lexer with the same name (since it also creates head and tail handler functions that raise exceptions for better error messages). The default is to call this method automatically on initialization, with the default token labels for the begin and end tokens. If the flag default_begin_end_tokens is set false on PrattParser initialization then the user must call this function (setting whatever token labels are desired). Returns a tuple containing the new begin and end TokenNode subclasses.
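A hedged usage sketch with custom labels:

```python
# Suppress the automatic definition, then define begin/end tokens manually.
parser = PrattParser(default_begin_end_tokens=False)
begin_tok, end_tok = parser.def_begin_end_tokens("k_my_begin", "k_my_end")
```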
- def_jop_token(jop_token_label, ignored_token_label)[source]¶
  Define a token for the juxtaposition operator. This token has no regex pattern. An instance is inserted in recursive_parse when it is inferred to be present. This method must be explicitly called before a juxtaposition operator can be used (i.e., before def_jop). The parameter jop_token_label is the label for the newly-created token representing the juxtaposition operator. The ignored_token_label parameter is the label of an ignored token which must be present for a jop to be inferred. Some already-defined token is required; usually it will be a token for spaces and tabs. If set to None then no ignored space at all is required (i.e., the operands can be right next to each other).
- def_null_string_token(null_string_token_label='k_null-string')[source]¶
  Define the null-string token. This token has no regex pattern. An instance is inserted in recursive_parse when it is inferred to be present. This method must be called before a null-string token can be used. The parameter null_string_token_label is the label for the newly-created token representing it.
- get_token(token_label)[source]¶
  Return the token with the label token_label. The reverse operation, getting a label from a token instance, can be done by looking at the token_label attribute of the token.
- undef_token(token_label)[source]¶
  A method for undefining any token defined by the PrattParser methods. Since the token_kind was set for all tokens when they were defined it knows how to undefine any kind of token.
- def_assignment_op_dynamic(parser, assignment_op_token_label, prec, assoc, identifier_token_label, symbol_value_dict=None, symbol_type_dict=None, allowed_types=None, precond_fun=None, precond_priority=0, construct_label=None, val_type=None, eval_fun=None, create_eval_fun=False, ast_data=None)¶
  Define an infix assignment operator which is dynamically typed, with types checked at evaluation time (i.e., when the tree is interpreted).
  A precondition checks that the l.h.s. of the assignment operator is a token with label identifier_token_label. If not, an exception is raised.
  No type-checking is done on the r.h.s. by default. To limit the types that can be assigned you can pass in a list or iterable of TypeObject instances as the argument allowed_types. These formal types are stored as the list attribute allowed_dynamic_assignment_types of the parser instance. An exception will be raised by the generated evaluation function if an assigned value does not have an actual type consistent with a formal type on that list. If new types are created later they can be directly appended to that list without having to overload the assignment operator.
  If create_eval_fun is true (and eval_fun is not set) then an evaluation function will be created automatically. The symbol_value_dict is used to store the values; it defaults to the parser attribute of the same name.
  This method may not correctly set the return type when overloading on return types because currently val_type_override is used to set it.
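A hedged sketch (the token labels and the surrounding identifier setup are illustrative assumptions):

```python
symbol_values = {}   # maps identifier strings to their assigned values
parser.def_token("k_equals", r"=")
parser.def_assignment_op_dynamic("k_equals", 5, "right", "k_identifier",
                                 symbol_value_dict=symbol_values,
                                 create_eval_fun=True)
```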
- def_assignment_op_static(parser, assignment_op_token_label, prec, assoc, identifier_token_label, symbol_value_dict=None, symbol_type_dict=None, allowed_types=None, precond_fun=None, precond_priority=0, construct_label=None, val_type=None, eval_fun=None, create_eval_fun=False, ast_data=None)¶
  Define an infix assignment operator which is statically typed, with types checked at parse time. Each identifier (with token label identifier_token_label) must already have a type associated with it in the symbol_type_dict. This dict and the type values in it should be set via whatever kind of a type definition construct the language uses.
  A precondition checks that the l.h.s. of the assignment operator is a token with label identifier_token_label. If not, an exception is raised.
  An evaluation function can optionally be created automatically, but by default is not. See the def_assignment_op_dynamic routine for more details since the mechanism is the same. If eval_fun is set then that evaluation function will always be used.
  This method may not correctly set the return type when overloading on return types because currently val_type_override is used to set it.
- def_assignment_op_untyped(parser, assignment_op_token_label, prec, assoc, identifier_token_label, symbol_value_dict=None, precond_fun=None, precond_priority=0, construct_label=None, eval_fun=None, create_eval_fun=False, ast_data=None)¶
  Define an infix assignment operator which is untyped; no type checking is done on the assignment.
  A precondition checks that the l.h.s. of the assignment operator is a token with label identifier_token_label. If not, an exception is raised.
  An evaluation function can optionally be created automatically, but by default is not. See the def_assignment_op_dynamic routine for more details since the mechanism is the same. If eval_fun is set then that evaluation function will always be used.
- def_bracket_pair(parser, lbrac_token_label, rbrac_token_label, in_tree=True, precond_fun=None, precond_priority=0, construct_label=None, eval_fun=None, ast_data=None)¶
  Define a matching bracket grouping operation. The returned type is set to the type of its single child (i.e., the type of the contents of the brackets). Defines a head handler for the left bracket token, so it effectively gets the highest evaluation precedence. As far as types are concerned, it is treated as a function that takes one argument of wildcard type and returns whatever type the argument has.
- def_construct(head_or_tail, handler_fun, trigger_token_label, prec=0, construct_label=None, precond_fun=None, precond_priority=0, val_type=None, arg_types=None, eval_fun=None, ast_data=None, token_value_key=None, dummy_handler=False)[source]¶
  Define a construct and register it with the token with label trigger_token_label. A token with that label must already be in the token table or an exception will be raised.
  Stores the construct instance in the parser’s construct table and also returns the construct instance.
  The head_or_tail argument should be set to either HEAD or TAIL. If head_or_tail==TAIL then the operator precedence will be set to prec. For a head handler the prec value is ignored and effectively set to zero. For a tail handler a prec value greater than zero is required or else an exception will be raised (unless dummy_handler is set true). Similarly, an exception is raised for a nonzero prec value with a head handler (the default value).
  The construct_label is an optional string value which can result in better error messages.
  The eval_fun and the ast_data arguments are saved in dicts associated with the type signature.
  If token_value_key is set to a string value then that value will be part of the key tuple for saving AST data and evaluation functions. This can be used, for example, when overloading a generic identifier with different evaluation functions for when the identifier value is sin, cos, etc. In looking up the AST data and evaluation function the parsed token’s actual string value (from the program text) is used as the key. If any overload of a particular construct provides a token_value_key string then all the other overloads for that construct must also (for the time being, at least).
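A hedged low-level sketch of registering a head construct for a prefix minus. The two-argument handler signature, the tok.recursive_parse and append_children calls, and the module-level HEAD constant follow the conventions implied in these docs but should be treated as assumptions.

```python
import typped as pp

parser = pp.PrattParser()
parser.def_token("k_minus", r"-")

def head_handler(tok, lex):
    # Consume the argument subexpression at a chosen subexpression precedence
    # (calling recursive_parse on the token is assumed in this sketch).
    tok.append_children(tok.recursive_parse(100))
    return tok

parser.def_construct(pp.HEAD, head_handler, "k_minus")
```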
- def_default_float_token(parser, token_label='k_float', signed=True, require_decimal=False, on_ties=0)¶
  Define a token for floats with default label ‘k_float’. If signed is true (the default) then a leading ‘+’ or ‘-‘ is optionally part of the float. Otherwise the sign is not included. This is sometimes needed when the signs are instead defined as prefix operators.
- def_default_identifier_token(parser, token_label='k_identifier', signed=True, on_ties=0)¶
  Define an identifier token. It is like Python identifiers: a letter or underscore followed by any number of letters, underscores, and digits.
- def_default_int_token(parser, token_label='k_int', signed=True, on_ties=0)¶
  Define a token for ints with default label ‘k_int’. If signed is true (the default) then a leading ‘+’ or ‘-‘ is optionally part of the int. Otherwise the sign is not included.
- def_default_single_char_tokens(parser, chars=None, exclude=None, make_literals=False)¶
  The characters in the string chars are defined as tokens with default labels. Spaces are ignored in the string. If chars is not set then all the default labels will be defined except those for the characters in the string exclude. If make_literals is true then the tokens will also be defined as token literals (via def_literal).
- def_default_whitespace(parser, space_label='k_space', space_regex='[ \\t]+', newline_label='k_newline', newline_regex='[\\n\\f\\r\\v]+', matcher_options=None)¶
  Define the standard whitespace tokens for space and newline, setting them as ignored tokens.
- def_infix_multi_op(parser, operator_token_labels, prec, assoc, repeat=False, not_in_tree=False, precond_fun=None, precond_priority=0, construct_label=None, val_type=None, arg_types=None, eval_fun=None, ast_data=None)¶
  Takes a list of operator token labels and defines a multi-infix operator.
  If repeat=True then any number of repetitions of the list of operators will be accepted. For example, a comma operator could be used to parse a full comma-separated list. When arg_types is also set, use the Varargs object in the list to check the repetitions. For a single operator, repeating just has the effect of putting the arguments in a flat argument/child list instead of as nested binary operations based on left or right association. Any argument-checking is done after any node removal, which may affect the types that should be passed in the arg_types list of parent constructs.
  If not_in_tree is true then the root node will not appear in the final parse tree (unless it is the root).
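For example, a hedged sketch of parsing a flat comma-separated list:

```python
parser.def_token("k_comma", r",")
# One node with all the comma-separated items as its children:
parser.def_infix_multi_op(["k_comma"], 5, "left", repeat=True)
```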
- def_infix_op(parser, operator_token_label, prec, assoc, not_in_tree=False, precond_fun=None, precond_priority=0, construct_label=None, val_type=None, arg_types=None, eval_fun=None, ast_data=None)¶
  This just calls the more general method def_infix_multi_op.
- def_jop(parser, prec, assoc, precond_fun=None, precond_priority=None, construct_label=None, val_type=None, arg_types=None, eval_fun=None, ast_data=None)¶
  The function precond_fun is called to determine whether or not to accept a potentially-inferred juxtaposition operator between the previously-parsed subexpression result and the next token. Note that this function has available extra_data as an attribute of its triggering token, and extra_data contains the lookbehind attribute. Through the lookbehind list the jop_precond function has access to the type information for the potential left operand but not for the potential right operand.
  Note that if the juxtaposition operator always resolves to a single type signature based on its argument types then, even if overloading on return types is in effect, the jop can be effectively inferred based on type signature information.
- def_literal(parser, token_label, val_type=None, precond_fun=None, precond_priority=0, construct_label=None, val_type_override_fun=None, eval_fun=None, ast_data=None)¶
  Defines the token with label token_label to be a literal in the syntax of the language being parsed. This method adds a head handler function to the token. Literal tokens are the leaves of the expression trees; they are things like numbers and variable names in a numerical expression. They always occur as the first (and only) token in a subexpression being evaluated by recursive_parse, so they need a head handler but not a tail handler. (Though note that the token itself might also have a tail handler.)
  A function val_type_override_fun can be passed in, taking a token and a lexer as its two arguments and returning a TypeObject instance. If it is set then it will be called in the handler at parse-time to get the type to set as the val_type of the node. This can be useful for dynamic typing, such as when identifiers in an interpreted language are generic variables which can hold different types. This option currently does not work for overloading on return types.
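A hedged sketch (the one-argument eval_fun convention shown here is an assumption):

```python
parser.def_default_int_token()   # defines the "k_int" token
parser.def_literal("k_int", eval_fun=lambda tok: int(tok.value))
```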
- def_literal_typed_from_dict(parser, token_label, symbol_value_dict=None, symbol_type_dict=None, default_type=None, default_eval_value=None, raise_if_undefined=False, eval_fun=None, create_eval_fun=False, precond_fun=None, precond_priority=1, construct_label=None)¶
  Define a dynamically typed literal, usually a variable-name identifier. The type is looked up in the dict symbol_type_dict, keyed by the string value of the token literal.
  If create_eval_fun is true (and eval_fun is not set) then this method will provide an evaluation function automatically. This function returns the value looked up from symbol_value_dict, keyed by the literal token’s string value. The default value returned by the evaluation if the symbol is not in the dict is set via default_eval_value. (Currently there must be some default rather than raising an exception; the default value of default_eval_value is None.) Setting create_eval_fun false will skip the setting of an evaluation function.
  The def_assignment_op_dynamic routine should be used to handle the corresponding variable assignment operation. That is, the assignment that dynamically sets the type of the literal to the type of the assigned value (storing it in symbol_type_dict by default).
  This method may not correctly set the return type when overloading on return types because currently val_type_override is used to set it.
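A hedged sketch pairing this method with def_assignment_op_dynamic through shared dicts:

```python
sym_types = {}    # identifier string -> its current type
sym_values = {}   # identifier string -> its current value
parser.def_literal_typed_from_dict("k_identifier",
                                   symbol_value_dict=sym_values,
                                   symbol_type_dict=sym_types,
                                   create_eval_fun=True)
```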
- def_multi_ignored_tokens(parser, tuple_list, **kwargs)¶
  A convenience function, to define multiple ignored tokens at once. Each element of the passed-in list should be a tuple containing the arguments to the ordinary def_token method with ignore=True. Calls the equivalent Lexer function.
- def_multi_literals(parser, tuple_list)¶
  An interface to the def_literal method which takes a list of tuples. The def_literal method will be called for each tuple, with the tuple unpacked in order as its arguments. Unspecified optional arguments are assigned their default values.
  Usually it is better to define literal = parser.def_literal and use that as a shorter alias. This method does not allow for keyword arguments and depends on argument ordering.
- def_multi_tokens(parser, tuple_list, **kwargs)¶
  A convenience function, to define multiple tokens at once. Each element of the passed-in list should be a tuple containing the arguments to the ordinary def_token method. Calls the equivalent Lexer function.
- def_postfix_op(parser, operator_token_label, prec, allow_ignored_before=True, precond_fun=None, precond_priority=0, construct_label=None, val_type=None, arg_types=None, eval_fun=None, ast_data=None)¶
  Define a postfix operator. If allow_ignored_before is false then no ignored token (usually whitespace) can appear immediately before the operator.
- def_prefix_op(parser, operator_token_label, prec, precond_fun=None, precond_priority=0, construct_label=None, val_type=None, arg_types=None, eval_fun=None, ast_data=None)¶
  Define a prefix operator. Note that head handlers do not have precedences, only tail handlers. (With respect to the looping in recursive_parse it wouldn’t make a difference.) But, within the head handler, the call to recursive_parse can be made with a nonzero precedence. This allows setting a precedence to determine the argument expressions that the prefix operator grabs up (or doesn’t).
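A hedged sketch of a prefix minus:

```python
parser.def_token("k_minus", r"-")
# Per the note above, the prec here presumably feeds the recursive_parse call
# in the generated head handler, controlling how much the operator grabs:
parser.def_prefix_op("k_minus", 100)
```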
- def_stdfun(parser, fname_token_label, lpar_token_label, rpar_token_label, comma_token_label, precond_fun=None, precond_priority=1, construct_label=None, val_type=None, arg_types=None, eval_fun=None, ast_data=None, num_args=None, token_value_key=None)¶
  This definition of stdfun uses lookahead to the opening paren or bracket token.
  Note that all tokens must be defined as literal tokens except fname_token_label (which ends up as the root of the function evaluation subtree). If the latter is also a literal token then precond_priority may need to be increased to give this use priority.
  The num_args parameter is optional for specifying the number of arguments when typing is not being used. If it is set to a nonnegative number then it will automatically set arg_types to the corresponding list of None values; if arg_types is set then it is ignored. If type-checking is disabled for the parser instance then the number of arguments is instead checked by the handler function.
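A hedged sketch, following the note above that the non-fname tokens are defined as literals (the labels and num_args value are illustrative):

```python
parser.def_default_identifier_token()   # "k_identifier" for function names
parser.def_token("k_lpar", r"\(")
parser.def_token("k_rpar", r"\)")
parser.def_token("k_comma", r",")
parser.def_literal("k_lpar")
parser.def_literal("k_rpar")
parser.def_literal("k_comma")
parser.def_stdfun("k_identifier", "k_lpar", "k_rpar", "k_comma", num_args=2)
```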
- def_stdfun_lpar_tail(parser, fname_token_label, lpar_token_label, rpar_token_label, comma_token_label, prec_of_lpar, precond_fun=None, precond_priority=0, construct_label=None, val_type=None, arg_types=None, eval_fun=None, ast_data=None, num_args=None, token_value_key=None)¶
  This is an alternate version of stdfun that defines lpar as an infix operator (i.e., with a tail handler). This function works in the usual cases but the current version without preconditions may have problems distinguishing “b (” from “b(” when a multiplication jop is set. The lookahead version def_stdfun is usually preferred.
  This method assumes type checking is turned on if num_args is set.
  A peek backwards to a token with label fname_token_label is included in the preconditions function. Definitions for different leading tokens will give mutually exclusive preconditions.
- undef_construct(construct, type_sig=None, token_value_key=None)[source]¶
  Undefine a construct. If type_sig is passed a TypeSig instance then only that overload is deleted. If token_value_key is also defined then only that token key is unregistered. Otherwise the full construct is removed from the parser’s construct table.
- parse_from_lexer(lexer_to_use, pstate=None)[source]¶
  The same as the parse method, but the lexer_to_use is assumed to already be initialized. This is ONLY used when one parser instance calls another parser instance (implicitly, via the handler functions of its tokens). The outer parser calls this routine of the inner, subexpression parser. Such a call to another parser would look something like:
  alternate_parser.parse_from_lexer(lexer_to_use)
  where lexer_to_use is the lexer of the outer parser. This routine temporarily swaps the token table for the passed-in lexer_to_use to be the token table for this parser (remember that this parser is the inner parser when this routine is called).
- parse(program, pstate=None, partial_expressions=None, skip_lex_setup=False)[source]¶
  The main routine for parsing a full program or expression. Users of the class should call this method to perform the parsing operations (after defining a grammar, of course).
  Unless there was a parsing failure or partial_expressions is true, the lexer is left with the end-token as the current token.
  If the pstate variable is set then the value will be pushed as the initial state on the production rule stack pstate_stack. The stack is then cleared after a successful call. (Set the parser attribute directly for more control.)
  The parser’s partial_expressions attribute will be used unless it is overridden by the parameter partial_expressions here. When it is true no check is made for the end-token after recursive_parse returns a value. The lexer will be left at the last token consumed, so a check for the end-token will tell when all the text was consumed. Users are responsible for making sure their grammars are suitable for this kind of parsing if the option is set.
  If the skip_lex_setup parameter is true then the text program is ignored and lexer setup is skipped. This is generally ONLY used when multiple parsers are parsing from a common text stream, and parse is called from the method parse_from_lexer.
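A hedged usage sketch, assuming a grammar was defined with the methods above:

```python
tree = parser.parse("f(1, 2) + 3")
print(tree.token_label)   # label of the root node of the parse tree
print(tree.children)      # its child subtrees
```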
- exception typped.pratt_parser.IncompleteParseException[source]¶
  Bases: typped.shared_settings_and_exceptions.ParserException
  Only raised at the end of the PrattParser method parse if tokens remain in the lexer after the parser finishes its parsing.