Updated: 6-Feb-2009
When parsing strings, these datatypes and words can be used to match characters in the input string:
Match Type | Description |
---|---|
"abc" | match the entire string |
#"c" | match a single character |
tag | match a tag string |
end | match the end of the input |
(bitset) | match any single character in the set |
To use all of these words (except bitset, which is explained below) in a single rule, use:
[<B> ["excellent" | "incredible"] #"!" </B> end]
This rule parses either of the input strings:
<B>excellent!</B>
<B>incredible!</B>
The end word specifies that nothing follows in the input stream, so the entire input has been parsed. It is optional, depending on whether the return value of the parse function is checked. Refer to the evaluation section (concepts/evaluation.txt) for more information.
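As a minimal sketch of how the rule above behaves, the following applies it to each input string and to one string with trailing text. The word rule is a name chosen here for illustration, and the results noted in the comments assume REBOL 3 parse semantics, where parse returns true only when the whole input is matched.

```rebol
; Hypothetical name for the rule shown above
rule: [<B> ["excellent" | "incredible"] #"!" </B> end]

print parse "<B>excellent!</B>" rule         ; true
print parse "<B>incredible!</B>" rule        ; true
print parse "<B>excellent!</B> extra" rule   ; false: text follows </B>
```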
The bitset! datatype deserves more explanation. Bitsets are used to specify collections of characters in an efficient manner. The charset function enables you to specify individual characters or ranges of characters. For example, the line:
digit: charset "0123456789"
defines a character set that contains digits. This allows rules like:
[3 digit "-" 3 digit "-" 4 digit]
This rule matches telephone numbers of the form:
707-467-8000
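A short sketch of the digit rule in use follows. The word phone is an illustrative name, not part of the original text.

```rebol
digit: charset "0123456789"

; Hypothetical name for the rule shown above
phone: [3 digit "-" 3 digit "-" 4 digit]

print parse "707-467-8000" phone   ; true
print parse "707-467-800" phone    ; false: the last group has only three digits
print parse "707-46A-8000" phone   ; false: "A" is not in the digit set
```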
To accept any number of digits, it is common to write the rule:
digits: [some digit]
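For illustration, the digits rule can be tried on a few inputs. Because some requires at least one match, an empty string is rejected.

```rebol
digit: charset "0123456789"
digits: [some digit]

print parse "2009" digits   ; true
print parse "20x9" digits   ; false: "x" is not a digit
print parse "" digits       ; false: some needs one or more matches
```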
A character set can also specify ranges of characters. For instance, the digit character set could have been written as:
digit: charset [#"0" - #"9"]
Alternatively, you can combine specific characters and ranges of characters:
the-set: charset ["+-." #"0" - #"9"]
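A brief sketch of what such a combined set accepts is shown here. Note that a character set only tests membership, so it does not check that the characters form a well-formed number.

```rebol
the-set: charset ["+-." #"0" - #"9"]

print parse "-12.5" [some the-set]   ; true
print parse "12,5" [some the-set]    ; false: "," is not in the set
print parse "+-.." [some the-set]    ; true: every character is in the set
```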
To expand on this, here is the alphanumeric set of characters:
alphanum: charset [#"0" - #"9" #"A" - #"Z" #"a" - #"z"]
Character sets can also be modified with the insert and remove functions, or combinations of sets can be created with the union and intersect functions. This line copies the digit character set and adds a dot to it:
digit-dot: insert copy digit "."
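The sketch below builds the same set in two ways and compares the results when parsing. It is written in two steps so that it does not depend on the return value of insert, and the word digit-dot-2 is an illustrative name.

```rebol
digit: charset [#"0" - #"9"]

; Copy first, so that the original digit set is left unchanged
digit-dot: copy digit
insert digit-dot "."

; An equivalent set built with union (hypothetical name)
digit-dot-2: union digit charset "."

print parse "3.14" [some digit]         ; false: "." is not in digit
print parse "3.14" [some digit-dot]     ; true
print parse "3.14" [some digit-dot-2]   ; true
```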
The following lines define useful character sets for parsing:
digit: charset [#"0" - #"9"]
alpha: charset [#"A" - #"Z" #"a" - #"z"]
alphanum: union alpha digit
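As one possible use of these sets, the following sketch matches a simple identifier: a letter followed by any number of letters or digits. The word identifier is an assumption made for this example.

```rebol
digit: charset [#"0" - #"9"]
alpha: charset [#"A" - #"Z" #"a" - #"z"]
alphanum: union alpha digit

; Hypothetical rule: one letter, then zero or more letters or digits
identifier: [alpha any alphanum]

print parse "rebol3" identifier    ; true
print parse "3rebol" identifier    ; false: starts with a digit
print parse "rebol-3" identifier   ; false: "-" is not alphanumeric
```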