REBOL 3 Concepts: Parsing: Parsing Blocks and Dialects

Pending Revision

This document was written for R2 and has yet to be revised for R3.

Blocks are parsed similarly to strings. A set of rules specifies the order of expected values. However, unlike string parsing, block parsing is not concerned with characters or delimiters. Blocks are parsed at the value level, which makes the grammar rules easier to specify and the operation many times faster.

Block parsing is the easiest way to create REBOL dialects. Dialects are sub-languages of REBOL that use the same lexical form for all datatypes but allow a different ordering of the values within a block. The values do not need to conform to the normal order required by REBOL function arguments. Dialects are able to provide greater expressive power for specific domains of use. For instance, the parser rules themselves are specified as a dialect.

Matching Words

Matching Datatypes

Characters Not Allowed

Dialect Examples

Parsing Sub-blocks

Matching Words

When parsing a block, to match against a word specify the word as a literal:

'name
'when
'empty
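
As a minimal sketch (the input blocks here are made up for illustration), a literal word in a rule matches only that exact word at the current position in the input:

```rebol
; Each literal word in the rule must match the word at the same position.
print parse [name when empty] ['name 'when 'empty]   ; true

; 'empty does not match the input word WHEN, so the parse fails.
print parse [name when] ['name 'empty]               ; false
```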

Matching Datatypes

You can match a value of any datatype by specifying the datatype word. The following table lists a few examples.

Datatype Word	Description
string!	matches any quoted string
time!	matches any time
date!	matches any date
tuple!	matches any tuple

NOTE: Don't forget the "!" that is part of the name or an error will be generated.
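
The following sketch, using a hypothetical data block, shows one datatype word matching one value of that type regardless of the value itself:

```rebol
; Each datatype word matches one value of that type, whatever the value is.
data: ["hello" 10:30 1-Jan-2000 1.2.3]
print parse data [string! time! date! tuple!]   ; true

; A datatype does not match a value of another type.
print parse [42] [string!]                      ; false
```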

Characters Not Allowed

The parse operations not allowed for blocks are those that deal with specific characters. For instance, a match cannot be specified against the first letter of a word or string, nor against spacing or newline characters.
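
To illustrate, a string inside a block is matched as one whole value. If its characters need to be examined, one option is to run a separate string parse on the value. This is only a sketch, and the data value is hypothetical:

```rebol
data: ["Downtown Center"]

; Block parsing sees the string as a single value of type string!.
print parse data [string!]                 ; true

; To look at characters, run a string parse on the value itself.
print parse first data ["Down" to end]     ; true
```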

Dialect Examples

A few concise examples help illustrate the parsing of blocks:

block: [when 10:30]
print parse block ['when 10:30]
print parse block ['when time!]
parse block ['when set time time! (print time)]

Notice that a specific word can be matched by using its literal word in the rule (as in the case of 'when). A datatype can be specified rather than a value, as in the lines above containing time!. In addition, a variable can be set to a value with the set operation.

As with strings, alternate rules can be specified when parsing blocks:

rule: [some [
    'when set time time! |
    'where set place string! |
    'who set persons [word! | block!]
]]

These rules allow information to be entered in any order:

parse [
    who Fred
    where "Downtown Center"
    when 9:30
] rule
print [time place persons]

This example could have used variable assignment, but it illustrates how to provide alternate input ordering.
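
As a further sketch, the same rule accepts the entries in a different order. The input values below are invented for illustration, and who is given a block instead of a single word:

```rebol
rule: [some [
    'when set time time! |
    'where set place string! |
    'who set persons [word! | block!]
]]

; Same rule, different input order; WHO is followed by a block this time.
print parse [when 14:00 who [Fred Anna] where "Main Hall"] rule
print [time place]
probe persons
```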

Here's another example that evaluates the results of the parse:

rule: [
    set count integer!
    set str string!
    (loop count [print str])
]
parse [3 "great job"] rule
parse [3 "hut" 1 "hike"] [some rule]
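
Because the parenthesized code runs as soon as the values before it have matched, it can be worth checking the result of parse as well. A minimal sketch, with a deliberately malformed input block chosen for illustration:

```rebol
rule: [
    set count integer!
    set str string!
    (loop count [print str])
]

; The first pair matches and prints; "extra" then fails to match integer!,
; so SOME stops before the end of the input and PARSE returns false.
either parse [2 "ok" "extra"] [some rule] [
    print "valid"
][
    print "invalid input"
]
```

Here "ok" is printed twice before the parse is reported as invalid, so actions with lasting effects may need to be deferred until the whole input has been checked.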

Finally, here is a more advanced example:

rule: [
    set action ['buy | 'sell]
    set number integer!
    'shares 'at
    set price money!
    (either action = 'sell [
            print ["income" price * number]
            total: total + (price * number)
        ][
            print ["cost" price * number]
            total: total - (price * number)
        ]
    )
]

total: 0
parse [sell 100 shares at $123.45] rule
print ["total:" total]

total: 0
parse [
    sell 300 shares at $89.08
    buy  100 shares at $120.45
    sell 400 shares at $270.89
] [some rule]
print ["total:" total]

Note that this is one way to evaluate expressions that use the dialect concept first described in Chapter 4.

Parsing Sub-blocks

When parsing a block, if a sub-block is found, it is treated as a single value that is of the block! datatype. However, to parse a sub-block, you must invoke the parser recursively on the sub-block. The into word provides this capability. It expects that the next value in the input block is a sub-block to be parsed. This is as if a block! datatype had been provided. If the next value is not a block! datatype, the match fails and into looks for alternates or exits the rule. If the next value is a block, the parser rule that follows the into word is used to begin parsing the sub-block. It is processed in the same way as a sub-rule.

rule: [date! into [string! time!]]
data: [10-Jan-2000 ["Ukiah" 10:30]]
print parse data rule
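
The fallback behavior can be sketched by giving into an alternate. The integer! alternate and the input values are assumptions made for this example:

```rebol
; INTO is tried first; if the next value is not a block, the alternate is used.
rule: [date! [into [string! time!] | integer!]]

print parse [10-Jan-2000 ["Ukiah" 10:30]] rule   ; true
print parse [10-Jan-2000 42] rule                ; true
```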

All of the normal parser operations can be applied to into.

rule: [
    set date date!
    set info into [string! time!]
]
data: [10-Jan-2000 ["Ukiah" 10:30]]
print parse data rule

print info

rule: [date! copy items 2 into [string! time!]]
data: [10-Jan-2000 ["Ukiah" 10:30] ["Rome" 2:45]]
print parse data rule

probe items
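
Since a rule can refer to itself by name, into can also descend through blocks nested to any depth. The following is a sketch only; the names nested and count are hypothetical and the input is made up:

```rebol
; Count the integers in a block, descending into sub-blocks of any depth.
count: 0
nested: [any [integer! (count: count + 1) | into nested]]

print parse [1 [2 3 [4]] 5] nested   ; true
print count                          ; 5
```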
