REBOL 3 Functions: parse

parse input rules /all /case

Parses a string or block series according to grammar rules.

Arguments:

input [series!] - Input series to parse

rules [block! string! char! none!] - Rules to parse by (none = ",;")

Refinements:

/all - For simple rules (not blocks) parse all chars including whitespace

/case - Uses case-sensitive comparison

Description

The parse function is used to match patterns of values and perform specific actions upon such matches. A full summary can be found in Parsing: Summary of Parse Operations.

Both string! and block! datatypes can be parsed. Parsing of strings matches specific characters or substrings. Parsing of blocks matches specific values, or specific datatypes, or sub-blocks.

Most languages provide some method of parsing strings. The ability to parse blocks of values as well is an important feature of the REBOL language.

The parse function takes two main arguments: an input to be parsed and the rules that are used to parse it. The rules are specified as a block of grammar productions that are to be matched.
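
As a minimal sketch of the two input types (the sample inputs here are made up for illustration), the first call matches characters in a string and the second matches datatypes in a block:

```rebol
; String input: one or more "a" characters followed by a single "b".
print parse "aaab" [some "a" "b"]

; Block input: an integer, then a string, then a decimal value.
print parse [1 "two" 3.0] [integer! string! decimal!]
```

Both calls should print true, because each rule block matches its input through to the end.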

General parse rules

Rules consist of these main elements:

Item	Description
keyword	a special word of the dialect, listed in the table below
word	get or set a variable (see below) - cannot be a keyword
path	get or set a variable via a path (see below)
value	match the input to a value (accepted datatypes depend on input datatype)
"|"	backtrack and match to next alternate rule (or)
[block]	a block of sub-rules
(paren)	evaluate an expression (a production)
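
The elements in the table above can be combined in a single rule block. The following is only a sketch: the words vowel and count are hypothetical names chosen for this example, not part of the dialect.

```rebol
vowel: charset "aeiou"    ; word used as a rule (hypothetical name)
count: 0                  ; counter updated by the paren below

; For each character: if it is a vowel, evaluate the paren;
; otherwise (after "|") skip any single character.
parse "parse dialect" [
    any [
        vowel (count: count + 1)
        | skip
    ]
]
print count               ; 5 vowels in the sample string
```

The word vowel is looked up as a variable holding a bitset, the inner block groups two alternate sub-rules, and the paren runs ordinary REBOL code each time the vowel rule matches.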

List of keywords

Within the parse dialect, these words are treated as keywords and cannot be used as variables.

Keyword	Description
and rule	match the rule, but do not advance the input (allows matching multiple rules to the same input)
any rule	match the rule zero or more times; stop on failure or if input does not change.
break	break out of a match loop (such as any, some, while), always indicating success.
change rule only value	match the rule, and if true, change the input to the new value (can be different lengths)
copy word	set the word to a copy of the input for matched rules
do rule	evaluate the input as code, then attempt to match to the rule
end	match end of input
fail	force current rule to fail, backtrack
if (expr)	evaluate the expression (in a paren) and if false or none, fail and backtrack
insert only value	insert a value at the current input position (with optional ONLY for blocks by reference); input position is adjusted just past the insert
into rule	match a series, then parse it with given rule; new series can be the same or different datatype.
opt rule	match to the rule once or not at all (zero or one times)
not rule	invert the result of the next rule
quote arg	accept next argument exactly as is (exception: paren)
reject	similar to break: break out of a match loop (such as any, some, while), but indicate failure.
remove rule	match the rule, and if true, remove the matched input
return rule	match the rule, and if true, immediately return the matched input as result of the PARSE function
set word	set the word to the value of the input for matched rules
skip	skip one input value (or a count or range of values, if given before it)
some rule	match to the rule one or more times; stop on failure or if input does not change.
then	regardless of failure or success of what follows, skip the next alternate rule (branch)
thru rule	scan forward in input for matching rules, advance input to tail of the match
to rule	scan forward in input for matching rules, advance input to head of the match
while rule	like any, match to the rule zero or more times; stop on failure; does not care if input changes or not.
??	debugging output: prints the next parse rule value and shows the current input position (e.g. where you are in the string)
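
A few of the keywords above are shown together in the short sketch below. The sample inputs and the words key, val, num, x and y are assumptions made for illustration only.

```rebol
; copy + to + skip + end: split "name=value" at the "=" sign.
parse "name=value" [copy key to "=" skip copy val to end]
print [key val]

; thru: advance past a marker, then copy the rest of the input.
parse "id:42" [thru ":" copy num to end]
print num

; set + into: match a word, then descend into the nested block
; and capture its two integers.
parse [point [10 20]] ['point into [set x integer! set y integer!]]
print [x y]
```

Note that to stops at the head of the match, so the "=" must still be consumed with skip, whereas thru advances past the ":" on its own.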

In addition, none is a special value that can be used as a default match rule. It is often used at the end of alternate rules to catch all no-match cases.

Simple Parse

There is also a simple parse mode that does not require rules, but takes a string of characters to use for splitting up the input string.

Parse also works in conjunction with bitsets (charset) to specify groups of special characters.

The result returned from a simple parse is a block of values. A rule-based parse returns TRUE only if the rules matched through to the end of the input.

print parse "divide on spaces" none
divide on spaces

print parse "Harry Haiku, 264 River Rd., Ukiah, 95482" ","
Harry Haiku 264 River Rd. Ukiah 95482

page: read http://hq.rebol.net
parse page [thru <title> copy title to </title>]
print title
Now is REBOL

digits: charset "0123456789"
area-code: ["(" 3 digits ")"]
phone-num: [3 digits "-" 4 digits]
print parse "(707)467-8000" [[area-code | none] phone-num]
true
