REBOL 3 Docs | Updated: 30-Nov-2013
Parses a string or block series according to grammar rules.
Arguments:
input [series!] - Input series to parse
rules [block! string! char! none!] - Rules to parse by (none = ",;")
Refinements:
/all - For simple rules (not blocks) parse all chars including whitespace
/case - Uses case-sensitive comparison
The parse function matches patterns of values and performs specific actions when those patterns match. A full summary can be found in "Parsing: Summary of Parse Operations".
Both string! and block! datatypes can be parsed. Parsing of strings matches specific characters or substrings. Parsing of blocks matches specific values, or specific datatypes, or sub-blocks.
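As a minimal sketch of the difference (the inputs and the variable names string-ok, block-ok, and nested-ok are made up for illustration), the first call matches characters in a string, the second matches values by datatype in a block, and the third descends into a sub-block:

```rebol
; String input: rules match characters and substrings.
string-ok: parse "aaab" [some "a" "b"]

; Block input: rules match values, here by datatype.
block-ok: parse [1 2 "three"] [some integer! string!]

; Sub-block: INTO parses the nested block with its own rule.
nested-ok: parse [a [1 2]] [word! into [some integer!]]

print [string-ok block-ok nested-ok]
```

Each call is expected to return true because its rules consume the whole input.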
Most languages provide some way to parse strings; the ability to parse blocks of values is a distinctive and important feature of the REBOL language.
The parse function takes two main arguments: an input to be parsed and the rules that are used to parse it. The rules are specified as a block of grammar productions that are to be matched.
Rules consist of these main elements:
Item | Description |
---|---|
keyword | a special word of the dialect, listed in the table below |
word | get or set a variable (see below) - cannot be a keyword |
path | get or set a variable via a path (see below) |
value | match the input to a value (accepted datatypes depend on input datatype) |
"\|" | backtrack and match to the next alternate rule (or) |
[block] | a block of sub-rules |
(paren) | evaluate an expression (a production) |
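The following sketch combines several of the elements in the table above: a literal value, the alternate marker, a block of sub-rules reached through a word, and a paren that runs code on a match. The input string and the words consonant, vowels, and ok are hypothetical names chosen for this illustration.

```rebol
consonant: ["b" | "n"]              ; a word that refers to a block of sub-rules
vowels: 0

ok: parse "banana" [
    some [
        "a" (vowels: vowels + 1)    ; match a value, then evaluate the paren
        | consonant                 ; otherwise try the alternate rule
    ]
]

print [ok vowels]                   ; prints: true 3
```

The paren is evaluated only when the "a" before it has matched, so it counts each occurrence of "a" in the input.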
Within the parse dialect, these words are treated as keywords and cannot be used as variables.
Keyword | Description |
---|---|
and rule | match the rule, but do not advance the input (allows matching multiple rules to the same input) |
any rule | match the rule zero or more times; stop on failure or if input does not change. |
break | break out of a match loop (such as any, some, while), always indicating success. |
change rule only value | match the rule, and if true, change the input to the new value (can be different lengths) |
copy word | set the word to a copy of the input for matched rules |
do rule | evaluate the input as code, then attempt to match to the rule |
end | match end of input |
fail | force current rule to fail, backtrack |
if (expr) | evaluate the expression (in a paren) and if false or none, fail and backtrack |
insert only value | insert a value at the current input position (with optional ONLY for blocks by reference); input position is adjusted just past the insert |
into rule | match a series, then parse it with given rule; new series can be the same or different datatype. |
opt rule | match to the rule once or not at all (zero or one times) |
not rule | invert the result of the next rule |
quote arg | accept next argument exactly as is (exception: paren) |
reject | similar to break: break out of a match loop (such as any, some, while), but indicate failure. |
remove rule | match the rule, and if true, remove the matched input |
return rule | match the rule, and if true, immediately return the matched input as result of the PARSE function |
set word | set the word to the value of the input for matched rules |
skip | skip input (for the count range, if provided before it) |
some rule | match to the rule one or more times; stop on failure or if input does not change. |
then | regardless of failure or success of what follows, skip the next alternate rule (branch) |
thru rule | scan forward in input for matching rules, advance input to tail of the match |
to rule | scan forward in input for matching rules, advance input to head of the match |
while rule | like any, match to the rule zero or more times; stop on failure; does not care if input changes or not. |
?? | debugging output: prints the next parse rule value and shows the current input position (e.g., where you are in the string) |
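A few of the keywords above can be sketched in use as follows. The inputs and the words key, val, name, num, and label are assumptions made for this illustration, not part of the original reference.

```rebol
; copy + to + skip: split "key=value" around the "=" character.
ok1: parse "key=value" [copy key to "=" skip copy val to end]
print [ok1 key val]

; set: capture single values from a block input.
ok2: parse [width 640] [set name word! set num integer!]
print [ok2 name num]

; opt + thru: accept an optional prefix, then scan past a marker.
ok3: parse "v1.2-beta" [opt "v" thru "-" copy label to end]
print [ok3 label]
```

Note that copy captures a portion of the input series, while set captures a single value, which makes set the more natural choice when parsing blocks.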
In addition, none is a special value that can be used as a default match rule. It is often used at the end of alternate rules to catch all no-match cases.
There is also a simple parse mode that does not require rules, but takes a string of characters to use for splitting up the input string.
Parse also works in conjunction with bitsets (charset) to specify groups of special characters.
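As a small sketch of bitsets used as rules, the following defines two character sets and combines them into a rule for a simple identifier. The words letter, digit, and ident are hypothetical names chosen for this illustration.

```rebol
letter: charset [#"a" - #"z" #"A" - #"Z"]   ; ranges of characters
digit: charset "0123456789"                 ; characters listed in a string

; An identifier here is a letter followed by any letters or digits.
ident: [letter any [letter | digit]]

print parse "x25" ident       ; true
print parse "9lives" ident    ; false: the first character is not a letter
```

A bitset used as a rule matches exactly one character from the set, so repetition keywords such as any and some are needed to match runs of characters.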
The result returned from a simple parse is a block of values. A rule-based parse returns TRUE if the rules matched through to the end of the input, and FALSE otherwise.
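The following sketch illustrates the return values described above, assuming the simple parse mode behaves as this page describes. The inputs and the words partial, full, and parts are made up for this illustration.

```rebol
; Rule-based: the rules match, but "c" is left unconsumed.
partial: parse "abc" ["ab"]
print partial                   ; false

; Rule-based: the rules reach the end of the input.
full: parse "abc" ["ab" to end]
print full                      ; true

; Simple mode: a string of delimiter characters instead of a rule block.
parts: parse "a,b,c" ","
probe parts                     ; a block of strings
```

A rule block that should succeed on a prefix of the input can finish with to end, as in the second call.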

    print parse "divide on spaces" none
    divide on spaces

    print parse "Harry Haiku, 264 River Rd., Ukiah, 95482" ","
    Harry Haiku 264 River Rd. Ukiah 95482

    page: read http://hq.rebol.net
    parse page [thru <title> copy title to </title>]
    print title
    Now is REBOL

    digits: charset "0123456789"
    area-code: ["(" 3 digits ")"]
    phone-num: [3 digits "-" 4 digits]
    print parse "(707)467-8000" [[area-code | none] phone-num]
    true