Chapter 6 - Series
REBOL/Core Users Guide Main Table of Contents Send Us Feedback
Contents:
1. Basic Concepts
1.1 Traversing a Series
1.2 Skipping Around
1.3 Extracting Values
1.4 Extracting a Sub-series
1.5 Inserting and Appending
1.6 Removing Values
1.7 Changing Values
2. Series Functions
2.1 Creation Functions
2.2 Navigation Functions
2.3 Information Functions
2.4 Extraction Functions
2.5 Modification Functions
2.6 Search Functions
2.7 Ordering Functions
2.8 Data Set (Group) Functions
3. Series Datatypes
3.1 Block Types
3.2 String Types
3.3 Pseudo-types
3.4 Type Test Functions
4. Series Information
4.1 Length?
4.2 Head?
4.3 Tail?
4.4 Index?
4.5 Offset?
5. Making and Copying Series
5.1 Partial Copies
5.2 Deep Copies
5.3 Initial Copies
6. Series Iteration
6.1 Foreach Loop
6.2 While Loop
6.3 Forall Loop
6.4 Forskip Loop
6.5 The Break Function
7. Searching Series
7.1 Simple Find
7.2 Refinement Summary
7.3 Partial Searches
7.4 Tail Positions
7.5 Backward Searches
7.6 Repeated Searches
7.7 Matching
7.8 Wildcard Searches
7.9 Select
7.10 Search and Replace
8. Sorting Series
8.1 Simple Sorting
8.2 Group Sorting
8.3 Comparison Functions
9. Series as Data Sets
9.1 Unique
9.2 Intersect
9.3 Union
9.4 Exclude
9.5 Difference
9.6 Exclude
10. Multiple Series Variables
11. Modification Refinements
11.1 Part
11.2 Only
11.3 Dup
1. Basic Concepts
The concept of a series is simple, and it is the fundamental concept
that is used everywhere and for nearly everything in REBOL. In order
to understand REBOL, you must understand how to create and manipulate
series.
A series is a set of values arranged in a
specific order.
It's that simple. For example, these are all series:
1 2 3 4
A B C D
"ABCD"
10:30 4:20 7:11
There are many types of series in REBOL. A block, a string, a list, a
URL, a path, an email, a file, a tag, a binary, a bitset, a port, a
hash, an issue, and an image are all series and can be accessed and
processed in the same way with the same small set of series functions.
1.1 Traversing a Series
Since a series is an ordered set of values, you can traverse it from
one position to another. As an example, take a series of three colors
defined by the following block:
colors: [red green blue]
There is nothing special about this block. It is a series containing three
words. It has a set of values: red, green, and blue. The values are organized
into a specific order: red is first, green is second, and blue is third.
The first position of the block is called its head. This is
the position occupied by the word red. The last position of the block
is called its tail. This is the position immediately after the last word in the
block. If you were to draw a diagram of the block, it would look like this:
Notice that the tail is just past the end of the block. The importance of
this will become more clear shortly.
The variable colors is used to refer to the block. It is currently
set to the head of the block:
print head? colors
true
The colors variable is at the first index position of the block.
print index? colors
1
The block has a length of three:
print length? colors
3
The first item in the block is:
print first colors
red
The second item in the block is:
print second colors
green
You can reposition the colors variable in the block using various
functions. To move the colors variable to the next position in the
colors block, use the next function:
colors: next colors
The next function moves forward one value in the block and
returns that position as a result. The colors variable is now set to
that new position:
The position of the colors variable has changed. Now the variable is
no longer at the head of the block:
print head? colors
false
It is at the second position in the block:
print index? colors
2
However, if you obtain the first item of colors, you get:
print first colors
green
The position of the value that is returned by the first
function is relative to the position that colors has in the block. The
returned value is not the first color in the block, but the first color
immediately following the current position of the block.
Similarly, if you ask for the length or the second color, you find that these
are relative as well:
print length? Colors
2
print second colors
blue
You could move to the next position, and get a similar set of results:
colors: next colors
print index? colors
3
print first colors
blue
print length? colors
1
The block diagram now looks like this:
The colors variable is now at the last color in the block, but it is
not yet to the tail position.
print tail? colors
false
To reach the tail, it has to be moved to the next position:
colors: next colors
Now the colors variable is resting at the tail of the block. It is
no longer positioned at a valid color. It is past the end of the block.
If you try your code, you will get:
print tail? colors
true
print index? colors
4
print length? Colors
0
print first colors
** Script Error: Out of range or past end.
** Where: print first colors
You receive an error in this last case because there is no valid first item
when you are past the end of the block.
It is also possible to move backwards in the block. If you write:
colors: back colors
you will move the colors variable back one position in the
series:
All of the same code will work as before:
print index? colors
3
print first colors
blue
1.2 Skipping Around
The previous examples move through the series one item at a time. However,
there are times when you want to skip past multiple items using the
skip function. Assume that the colors variable is
positioned at the head of a series:
You can skip forward two items using:
colors: skip colors 2
The skip function is similar to next in
that skip returns the series at the new position.
The following code confirms the new position:
print index? colors
3
print first colors
blue
To move backward, use skip with negative values:
colors: skip colors -1
This is similar to back. In the above example, a skip of -1
moves back one item.
print first colors
green
Note that you cannot skip past the tail or the head of a series. If you
attempt to do so, skip only goes as far as it can. It will not
generate an error.
If you skip too far forward, skip returns the tail of the
series:
colors: skip colors 20
print tail? colors
true
If you skip too far back, skip returns the head of the
series:
colors: skip colors -100
print head? colors
true
To skip directly to the head of the series, use the head
function:
colors: head colors
print head? colors
true
print first colors
red
You can return to the tail with the tail function:
colors: tail colors
print tail? colors
true
1.3 Extracting Values
Some of the previous examples made use of the first and
second ordinal functions to extract specific
values from a series. The full set of ordinal functions is:
first
second
third
fourth
fifth
last
Ordinal functions are provided as a convenience, and are used for picking
values from the most common position in a series. Here are some examples:
colors: [red green blue gold indigo teal]
print first colors
red
print third colors
blue
print fifth colors
indigo
print last colors
teal
To extract from a numeric position, use the pick function:
print pick colors 3
blue
print pick colors 5
indigo
A shorthand notation for pick is to use a path:
print colors/3
blue
print colors/5
indigo
Remember, as shown earlier, extraction is performed relative
to the series variable that you provide. If the colors variable were at
another position in the series, the results would be different.
Extracting a value past the end of its series generates an error in the case
of the ordinal functions and returns none in the case of the
pick function or a pick path:
print pick colors 10
none
print colors/10
none
1.4 Extracting a Sub-series
You can extract multiple values from a series with the copy
function. To do so, use copy with the /part
refinement, which specifies the number of values that you want to extract:
colors: [red green blue]
sub-colors: copy/part colors 2
probe sub-colors
[red green]
Graphically, this would look like:
To copy a sub-series from any position within the series, first traverse to
the starting position. The following example moves forward to the second
position in the series using next before performing the
copy:
sub-colors: copy/part next colors 2
probe sub-colors
[green blue]
This would be diagrammed as:
The length of the series to copy can be specified as an ending position, as
well as a copy count. Note that the position indicates where the copy
should stop, not the ending position.
probe copy/part colors next colors
[red]
probe copy/part colors back tail colors
[red green]
probe copy/part next colors back tail colors
[green]
This can be useful when the ending position is found as the result of the
find function:
file: %image.jpg
print copy/part file find file "."
image
1.5 Inserting and Appending
You can insert one or more new values into any part of a series using the
insert function. When you insert a value at a position in a
series, space is made by shifting its prior values toward the tail of the
series.
For instance, the block:
colors: [red green]
would be shown as:
To insert a new value at the head of the block where the colors
variable is now positioned:
insert colors 'blue
The red and green words are shifted over and the
blue word (which is prefixed with a tick because it is a word and
should not be evaluated) is inserted at the head of the list.
Note that the colors variable remains positioned at the head of the
list.
probe colors
[blue red green]
Also note that the return from the insert function was not
used because it was not set to a variable or passed along to another function.
If the return had been used to set the value of the colors variable
with the line:
colors: insert colors 'blue
the effect on the block would have been the same, but the position of the
colors variable would have changed as a result of setting the return
value. The position returned from insert is immediately
following the insertion point.
An insertion can be made anywhere in the series. The position of the insert
can be specified, and it can include the tail. Inserting at the tail has the
effect of appending to the series.
colors: tail colors
insert colors 'gold
probe colors
[blue red green gold]
Before the insertion:
After the insertion:
The word gold has been inserted at the tail of the series.
Another way to insert at the tail of a series is with the
append function. The append function works in
the same way as insert, but always inserts at the tail. The
previous example would become:
append colors 'gold
The result is the same as the previous example.
The insert and append function also accept
a block of arguments to insert. As an example:
colors: [red green]
insert colors [blue yellow orange]
probe colors
[blue yellow orange red green]
If you want to insert the new values between the red and
green words:
colors: [red green]
insert next colors [blue yellow orange]
probe colors
[red blue yellow orange green]
The insert and append functions have other
capabilities that are covered in more detail in a later section.
1.6 Removing Values
You can remove one or more values from any part of a series by using the
remove function.
For instance, starting with the block:
colors: [red green blue gold]
As shown here:
You can remove the first value from the block with the line:
remove colors
The block becomes:
It can be printed with:
probe colors
[green blue gold]
The remove function removes values relative to the position
of the colors variable. You can remove values from anywhere in the
series by setting the position.
remove next colors
The block is now:
Multiple values can be removed by supplying the /part
refinement.
remove/part colors 2
This removes the remaining values, leaving an empty block:
Similar to insert/part, the argument to
remove/part can also be a position within the block.
Removing all of the remaining values is a common operation. The
clear function is provided to make this more direct.
Clear removes all values from the current position to the tail.
For example:
Colors: [blue red green gold]
As shown here:
Everything after blue can be removed with:
clear next colors
The block becomes:
You can easily clear the entire block with:
clear colors
1.7 Changing Values
One additional set of functions is provided for changing values in a series.
The change function replaces one or more values with new
values. Although this can be accomplished by removing and inserting values, it
is more efficient to use change.
Defining the block:
colors: [blue red green gold]
Its second value could be changed with the line:
change next colors 'yellow
And it would become:
The block would now become:
probe colors
[blue yellow green gold]
The poke function allows you to specify that the change
occur at a particular position relative to the colors variable. The
poke function is similar to the pick function
described earlier.
poke colors 3 'red
The block is now:
As proven by:
probe colors
[blue yellow red gold]
The change function has additional refinements that are
described later in this chapter.
2. Series Functions
Here is a summary of the functions that operate on series. Most of these were
described in detail in the previous section. Others will be covered in more
detail in this section.
2.1 Creation Functions
Function
|
Description
|
---|
make
|
Makes a new series of the given type.
|
copy
|
Copies a series.
|
2.2 Navigation Functions
Function
|
Description
|
---|
next
|
Returns the next position in a series.
|
back
|
Returns the previous position in a series.
|
head
|
Returns the head position of a series.
|
tail
|
Returns the tail position of a series.
|
skip
|
Returns the position plus or minus an integer.
|
at
|
Returns the position plus or minus an integer, but uses the same
indexing as pick.
|
2.3 Information Functions
Function
|
Description
|
---|
head?
|
Returns true if at the head of the series.
|
tail?
|
Returns true if at the tail of the series.
|
index?
|
Returns the offset from the head of the series.
|
length?
|
Returns the length of a series from the current position.
|
offset?
|
Returns the distance between two series positions.
|
empty?
|
Returns true if the series is empty from this position.
|
2.4 Extraction Functions
Function
|
Description
|
---|
pick
|
Extracts a single value from a position in a series.
|
copy/part
|
Extracts a sub-series from a series.
|
first
|
Extracts the first value from a series.
|
second
|
Extracts the second value from a series.
|
third
|
Extracts the third value from a series.
|
fourth
|
Extracts the fourth value from a series.
|
fifth
|
Extracts the fifth value from a series.
|
last
|
Extracts the last value from a series.
|
2.5 Modification Functions
Function
|
Description
|
---|
insert
|
Inserts values into a series.
|
append
|
Appends values to the tail of a series.
|
remove
|
Removes values from a series.
|
clear
|
Clears values to the tail of a series.
|
change
|
Changes values in a series.
|
poke
|
Changes values at a position in a
series.
|
2.6 Search Functions
Function
|
Description
|
---|
find
|
Finds a value in a series.
|
select
|
Finds an associated value in a series.
|
replace
|
Searches and replaces values in a series.
|
parse
|
Parses values in a series.
|
2.7 Ordering Functions
Function
|
Description
|
---|
sort
|
Sorts the values in a series into an order.
|
reverse
|
Reverse the order of values in a
series
|
2.8 Data Set (Group) Functions
Function
|
Description
|
---|
unique
|
Returns a unique set of values, removing duplicates.
|
intersect
|
Returns only the values found in both series.
|
union
|
Returns the combined values from two series.
|
exclude
|
Returns one series less another.
|
difference
|
Returns the values not found in either
series.
|
3. Series Datatypes
All series datatypes can be divided into two broad classes. Each includes a
datatype value and a type test function.
3.1 Block Types
Block Type
|
Description
|
---|
Block!
|
Blocks of values
|
Paren!
|
Blocks of values enclosed in parentheses
|
Path!
|
Paths of values
|
List!
|
Linked lists
|
Hash!
|
Associative arrays
|
3.2 String Types
String Type
|
Description
|
---|
String!
|
Character strings
|
Binary!
|
Byte strings
|
Tag!
|
HTML and XML tags
|
File!
|
File names
|
URL!
|
Internet uniform resource locators
|
Email!
|
Email names
|
Image!
|
Image data
|
Issue!
|
Sequence codes
|
3.3 Pseudo-types
Series datatypes are grouped into a few pseudo-types that make function
argument and type testing easier:
Pseudo-type
|
Description
|
---|
Series!
|
A series datatype
|
Any-block!
|
Any of the block datatypes
|
Any-string!
|
Any of the string datatypes
|
3.4 Type Test Functions
Block type tests:
Block? Paren? Path? List? Hash?
String type tests:
String? Binary? Tag? File? URL?
Email? Image? Issue?
Other series type tests:
Series? Any-block? Any-string?
4. Series Information
4.1 Length?
The length of a series is the number of items (values for a block or
characters for a string) from the current position to the
tail. If the current position is the head of the series, then
the length is the number of items in the entire series.
The length? function returns the number of items to the
tail.
colors: [blue red green]
print length? colors
3
All three values are part of the length:
If the position of the color variable is advanced to the next
value:
color: next color
print length? color
2
the length becomes two:
Other examples of length?:
print length? "Ukiah"
5
print length? []
0
print length? ""
0
data: [1 2 3 4 5 6 7 8]
print length? data
8
data: next data
print length? data
7
data: skip data 5
print length? data
2
4.2 Head?
The head of a series is the position of its first value. If a series is at
its head, the head? function returns true:
data: [1 2 3 4 5]
print head? data
true
data: next data
print head? data
false
4.3 Tail?
The tail of a series is the position immediately following
the last value. If a series variable is at the tail, the
tail? function returns true:
data: [1 2 3 4 5]
print tail? data
false
data: tail data
print tail? data
true
The empty? function is equivalent to the
tail? function.
print empty? data
true
If empty? returns true, it means there are no
values between the current position and the tail; however,
there still may be values in the series. Values can still be present
before the current position. If you need to determine if the series is empty
from head to tail, use:
print empty? head data
false
4.4 Index?
The index is the position in a series relative to the head of
the series. To determine the index position for a series variable, use the
index? function:
data: [1 2 3 4 5]
print index? data
1
data: next data
print index? data
2
data: tail data
print index? data
6
4.5 Offset?
The distance between two positions in a series can be determined with the
offset? function.
data: [1 2 3 4]
data1: next data
data2: back tail data
print offset? data1 data2
4
In this example, the offset is the difference between position 2 and position
4:
5. Making and Copying Series
New series are created with the make and
copy functions.
Use the make function to create a new series from a series
datatype and an initial size. The size is an estimate of the size needed for
the series. If the initial size is too small, the series will automatically
expand, but at a slight performance cost.
block: make block! 50
string: make string! 10000
list: make list! 128
file: make file! 64
The copy function creates a new series by copying an
existing series:
string: copy "Message in a bottle"
new-string: copy string
block: copy [1 2 3 4 5]
new-block: copy block
Copying is also important for use with functions that modify the contents of
a series. For instance, if you want to change the case of a string without
modifying the original, use the copy:
string: uppercase copy "Message in a bottle"
5.1 Partial Copies
The copy function /part refinement takes a
single argument, which is either an integer specifying the number of items to
copy or a position within the series indicating the last position to copy.
str: "Message in a bottle"
print str
Message in a bottle
print copy/part str find str " "
Message
new-str: copy/part (find str "in") (find str "bottle")
print new-str
in a
blk: [ages [10 12 32] sizes [100 20 30]]
new-blk: copy/part blk 2
probe new-blk
[ages [10 12 32]]
5.2 Deep Copies
Many blocks contain other blocks and strings. When such a block is copied,
its sub-series are not copied. The sub-series are referred to directly and are
the same series data as the original block. If you modify any of these
sub-series, you modify them in the original block as well.
The copy/deep refinement forces a copy of all series values
within a block:
blk-one: ["abc" [1 2 3]]
probe blk-one
["abc" [1 2 3]]
The next example assigns a normal copy of blk-one to
blk-two:
blk-two: copy blk-one
probe blk-one
["abc" [1 2 3]]
probe blk-two
["abc" [1 2 3]]
If either the string or block contained in blk-two is modified, the
series values in blk-one are also modified.
append blk-two/1 "DEF"
append blk-two/2 [4 5 6]
probe blk-one
["abcDEF" [1 2 3 4 5 6]]
probe blk-two
["abcDEF" [1 2 3 4 5 6]]
Using copy/deep makes a copy of all series values found in
the block:
blk-two: copy/deep blk-one
append blk-two/1 "ghi"
append blk-two/2 [7 8 9]
probe blk-one
["abcDEF" [1 2 3 4 5 6]]
probe blk-two
["abcDEFghi" [1 2 3 4 5 6 7 8 9]]
5.3 Initial Copies
When initializing a string or block series, use copy on the
value to make is a unique series:
str: copy ""
blk: copy []
Using copy assures that a new series is created for the word
every time the word is initialized. Here is an example of why this is
important.
print-it: func [/local str] [
str: ""
insert str "ha"
print str
]
print-it
ha
print-it
haha
print-it
hahaha
In this example, because copy wasn't used, the empty string
series is modified with every call of print-it. The string series
ha is inserted into str each time print-it is
called.
Examining the source of the function as it now exists exposes the root of the
problem:
source print-it
print-it: func [/local str] [
str: "hahaha"
insert str "ha"
print str
]
Although str is a local variable, its string value is global. To
avoid this problem, the function should copy the empty string
or use make on the string.
print-it: func [/local str] [
str: copy ""
insert str "ha"
print str
]
print-it
ha
print-it
ha
print-it
ha
6. Series Iteration
You can use a loop to traverse a series. There are a few loop functions that
can help automate the iteration process.
6.1 Foreach Loop
The foreach loop moves through a series setting a word or
multiple words in to the values in the series.
The foreach loop takes three arguments: a word or a block of
words that holds the values for each iteration, a series, and a
block to evaluate for each iteration.
colors: [red green blue yellow orange gold]
foreach color colors [print color]
red
green
blue
yellow
orange
gold
foreach [c1 c2] colors [print [c1 c2]]
red green
blue yellow
orange gold
foreach [c1 c2 c3] colors [print [c1 c2 c3]]
red green blue
yellow orange gold
This is very useful with blocks that contain related values:
people: [
"Bob" bob@example.com 12
"Tom" tom@example.net 40
"Sam" sam@example.org 22
]
foreach [name email age] people [
print [name email age]
]
Bob bob@example.com 12
Tom tom@example.net 40
Sam sam@example.org 22
Note that the foreach loop does not advance the current index
through the series, so there is no need to reset its series
variable.
6.2 While Loop
The most flexible approach is to use a while loop, which
allows you to do just about anything to the series without problems.
colors: [red green blue yellow orange]
while [not tail? colors] [
print first colors
colors: next colors
]
red
green
blue
yellow
orange
The method shown below allows you to insert values without
hitting a value twice:
colors: head colors
while [not tail? colors] [
if colors/1 = 'yellow [
colors: insert colors 'blue
]
colors: next colors
]
This example illustrates that the insert returns the
position immediately following the insertion.
To remove a value without accidentally skipping a value, use
the following code:
colors: head colors
while [not tail? colors] [
either colors/1 = 'blue [
remove colors
][
colors: next colors
]
]
Notice that if a removal is done, the next function is not
performed.
6.3 Forall Loop
The forall loop is similar to the while
loop, but eliminates some of the effort required. The forall
loop starts from the current index and advances through a series to its tail
evaluating a block for every value.
The forall loop takes two arguments: a series variable and a
block to evaluate for each iteration.
colors: [red green blue yellow orange]
forall colors [print first colors]
red
green
blue
yellow
orange
The forall advances the variable position through the
series, so when it returns the variable is left at its tail:
print tail? colors
true
Therefore, the variable must be reset before it is used again:
colors: head colors
Also, if the block modifies the series, be careful to avoid missing or
repeating a value. The forall loop works in some cases;
however, if you are uncertain, use the while loop instead.
forall colors [
if colors/1 = 'blue [remove colors]
print first colors
]
red
green
yellow
orange
6.4 Forskip Loop
Similar to forall, the forskip loop
advances through a series starting at the current position, but skips the
specified number of values each time.
The forskip loop takes three arguments: a series variable,
the skip between each iteration, and a block to evaluate for each iteration.
colors: [red green blue yellow orange]
forskip colors 2 [print first colors]
red
blue
orange
The forskip loop leaves the series at its tail, requiring you to reset
it.
print tail? colors
true
colors: head colors
6.5 The Break Function
Any of the loops can be stopped at any time by evaluating the
break function from within the evaluation block. See the Expressions Chapter for more
information about the break function.
7. Searching Series
The find function searches through
block or string series for a value or pattern. This function has many
refinements that permit a wide range of variations in search parameters.
7.1 Simple Find
The simplest and most common use of find is to search a
block or string for a value. In this case, find requires only
two arguments: the series to search and the value to find.
An example of using find on a block is:
colors: [red green blue yellow orange]
where: find colors 'blue
probe where
[blue yellow orange]
print first where
blue
The find function can also search for values by datatype.
This can be quite useful.
items: [10:30 20-Feb-2000 Cindy "United"]
where: find items date!
print first where
20-Feb-2000
where: find items string!
print first where
United
An example of using find on a string is:
colors: "red green blue yellow orange"
where: find colors "blue"
print where
blue yellow orange
When a search fails, none is returned.
colors: [red green blue yellow orange]
probe find colors 'indigo
none
7.2 Refinement Summary
Find has many refinements that support a wide variety of search
parameters:
Refinement
|
Description
|
---|
/part
|
Limits a search on a series to a given length or ending
position.
|
/only
|
Treats a series value as a single value.
|
/case
|
Uses case-sensitive string comparison.
|
/any
|
Allows the use of pattern wildcards that allow matches to be made with
any character. An asterisk (*) in the pattern matches any string, and a
question mark (?) in the pattern matches any character.
|
/with
|
Allows pattern wildcards with different characters other than asterisk
(*) and (?). This allows a pattern to contain asterisks and question
marks.
|
/match
|
Matches a pattern beginning at the current series position, rather than
finding the first occurrence of a value or string. Returns the tail
position if the match is found.
|
/tail
|
Return the tail position of a match on a successful search, rather than
returning the point at which the match was found.
|
/last
|
Searches backwards for the match, starting at the tail of the
series.
|
/reverse
|
Searches backwards for the match, starting at the current
position.
|
7.3 Partial Searches
The /part refinement allows a search to be confined to a
specific portion of a series. For instance, you may want to restrict a search to
a given line or section of text.
Similar to insert/part and remove/part,
find/part takes either a count or an ending position. The
following example uses a count and restricts the search to the first three
items:
colors: [red green blue yellow blue orange gold]
probe find/part colors 'blue
[blue yellow blue orange gold]
The next search is restricted to the first 15 characters:
text: "Keep things as simple as you can."
print find/part text "as" 15
as simple as you can.
The next example uses an ending position. The search is restricted to a
single line of text:
text: {
This is line one.
This is line two.
}
start: find text "this"
end: find start newline
item: find/part start "line" end
print item
line one.
7.4 Tail Positions
The find function returns the position in the series where
an item was found. The /tail refinement returns the position
immediately following the item that was found. Here's an example:
filename: %script.txt
print find filename "."
.txt
print find/tail filename "."
txt
clear change find/tail filename "." "r"
print filename
script.r
In this example, clear is necessary to remove xt,
which follows t.
7.5 Backward Searches
The last example in the previous section would fail if the filename had more
than one period. For instance:
filename: %new.script.txt
print find filename "."
.script.txt
In this example we want the last occurrence of the period in the string,
which can be found using the /last refinement. The
/last refinement searches backward through a series.
print find/last filename "."
.txt
The /last refinement can be combined with/tail to produce:
print find/last filename "."
txt
If you want to continue to search backward through the string, you need the
/reverse refinement. This refinement performs a search from the
current position backward toward the head, rather than forward toward the
tail.
where: find/last filename "."
print where
.txt
print find/reverse where "."
.script.txt
Notice that /reverse continues the search just before the
position of the last match. This prevents it from finding the same period
again.
7.6 Repeated Searches
You can easily repeat the find function to search for
multiple occurrences of a value or string. Here is an example that would print
all the strings found in a block:
blk: load %script.r
while [blk: find blk string!] [
print first blk
blk: next blk
]
The next example counts the number of new lines in a script. It uses the
/tail refinement to prevent an infinite loop and returns the
position immediately following the match.
text: read %script.r
count: 0
while [text: find/tail text newline] [count: count + 1]
To perform a repeated search in reverse, use the /reverse
refinement. The following example prints all of the index positions in reverse
order for the text of a script.
while [text: find/reverse tail text newline] [
print index? text
]
7.7 Matching
The /match refinement modifies the behavior of find to
perform pattern matching on the current position of a series.
This refinement allows parsing operations to be performed by
matching the next part of a series with expected patterns. See
the chapter on Parsing for
another way to match series.
A simple example of matching is as follows:
blk: [1342 "Franklin Pike Circle"]
probe find/match blk integer!
["Franklin Pike Circle"]
probe find/match blk 1432
["Franklin Pike Circle"]
probe find/match blk "test"
none
str: "Keep things simple."
probe find/match str "keep"
" things simple."
print find/match str "things"
none
Notice in the example that a search is not performed. The beginning of the
series either matches or it does not. If it does match, the series is advanced
the position immediately following the match point, allowing you to match the
next sequence.
Here is a simple parser written with find/match:
grammar: [
["keep" "make" "trust"]
["things" "life" "ideas"]
["simple" "smart" "happy"]
]
parse-it: func [str /local new] [
foreach words grammar [
foreach word words [
if new: find/match str word [break]
]
if none? new [return false]
str: next new ;skip space
]
true
]
print parse-it "Keep things simple"
true
print parse-it "Make things smart"
true
print parse-it "Trust life well"
false
Matching can be made case-sensitive with the /case
refinement.
The capability of /match can be greatly extended with the
addition of the /any refinement as discussed below.
7.8 Wildcard Searches
The /any refinement enables wildcard pattern matching. The
question mark (?) and asterisk (*) characters act as wildcards
for matching any single character or any number of characters
respectively. The /any refinement can be used in conjunction
with find with or without the /match refinement.
Examples:
str: "abcdefg"
print find/any str "c*f"
cdefg
print find/any str "??d"
bcdefg
email-list: [
mack@rebol.dom
judy@somesite.dom
jack@rebol.dom
biff@rebol.dom
jenn@somesite.dom
]
foreach email email-list [
if find/any email *@rebol.dom [print email]
]
mack@rebol.dom jack@rebol.dombiff@rebol.dom
The next example uses the /match refinement to attempt to
match the pattern to the next part of the series:
file-list: [
%rebol.exe
%notes.html
%setup.html
%feedback.r
%nntp.r
%rebdoc.r
%rebol.r
%user.r
]
foreach file file-list [
if find/match/any file %reb*.r [print file]
]
rebdoc.rrebol.r
If either of the wildcard characters are part of what is to be matched,
substitute wildcard characters can be provided using the /with
refinement.
7.9 Select
A useful variation of the find function is the
select function, which returns the value following the one
found. The select function is often used to lookup a value in
tagged blocks of data. The select function takes the same
arguments as find: the series to search and the value find.
However, unlike find, which returns a series position, the
select function returns the value that follows the match.
colors: [red green blue yellow orange]
print select colors 'green
blue
Given a simple database, the select function can be used to
access its values:
email-book: [
"George" harrison@guru.org
"Paul" lefty@bass.edu
"Ringo" richard@starkey.dom
"Robert" service@yukon.dom
]
The following code locates a specific email address:
print select email-book "Paul"
lefty@bass.edu
Use the select function to find a block of expressions to
evaluate. For example, given the following data:
cases: [
10 [print "ten"]
20 [print "twenty"]
30 [print "thirty"]
]
a block can be evaluated based on a selector:
do select cases 10
ten
do select cases 30
thirty
7.10 Search and Replace
To replace values throughout a series, you can use the
replace function. This function searches for a specific value
in a series, then replaces it with a new value.
The replace function takes three arguments: the series to
search, value to replace, and the new value.
str: "hello world hello"
probe replace str "hello" "aloha"
"aloha world hello"
data: [1 2 8 4 5]
probe replace data 8 3
[1 2 3 4 5]
probe replace data 4 `four
[1 2 3 four 5]
probe replace data integer! 0
[0 2 3 four 5]
Use the /all refinement to replace all occurrences of the
value from the current position to the tail.
probe replace/all data integer! 0
[0 0 0 four 0]
code: [print "hello" print "world"]
replace/all code 'print 'probe
probe code
[probe "hello" probe "world"]
do code
helloworld
str: "hello world hello"
probe replace/all str "hello" "aloha"
"aloha world aloha"
8. Sorting Series
The sort function offers a simple, quick method of sorting
series. It is most useful for blocks of data, but can also be used on strings of
characters.
8.1 Simple Sorting
The simplest examples of sort are:
names: [Eve Luke Zaphod Adam Matt Betty]
probe sort names
[Adam Betty Eve Luke Matt Zaphod]
print sort [321.3 78 321 42 321.8 12 98]
12 42 78 98 321 321.3 321.8
print sort "plosabelm"
abellmops
Notice that sort is destructive to its argument series. It
reorders the original data. To prevent this, use copy, as in
the following example:
probe sort copy names
By default, sorting is case insensitive:
print sort ["Fred" "fred" "FRED"]
Fred fred FRED
print sort "G4C28f9I15Ed3bA076h"
0123456789AbCdEfGhI
Providing the /case refinement makes sorting case
sensitive:
print sort/case "gCcAHfiEGeBIdbFaDh"
ABCDEFGHIabcdefghi
print sort/case ["Fred" "fred" "FRED"]
FRED Fred fred
print sort/case "g4Dc2BI8fCF9i15eAd3bGaE07H6h"
0123456789ABCDEFGHIabcdefghi
Many other datatypes can be sorted:
print sort [1.3.3.4 1.2.3.5 2.2.3.4 1.2.3.4]
1.2.3.4 1.2.3.5 1.3.3.4 2.2.3.4
print sort [$4.23 $23.45 $62.03 $23.23 $4.22]
$4.22 $4.23 $23.23 $23.45 $62.03
print sort [11:11:43 4:12:53 4:14:53 11:11:42]
4:12:53 4:14:53 11:11:42 11:11:43
print sort [11-11-1999 10-11-9999 11-4-1999 11-11-1998]
11-Nov-1998 11-Apr-1999 11-Nov-1999 10-Nov-9999
print sort [john@doe.dom jane@doe.dom jack@jill.dom]
jack@jill.dom jane@doe.dom john@doe.dom
print sort [%user.r %rebol.r %history.r %notes.html]
history.r notes.html rebol.r user.r
8.2 Group Sorting
Often it is necessary to sort a data set that has more than one value per
record. The /skip refinement supports this for sorting records
that have a fixed length. The refinement takes one additional argument: an
integer specifying length of each record.
Here is an example that sorts a block that contains first name, last name,
ages, and emails. The block is sorted by its first column, first-name.
names: [
"Evie" "Jordan" 43 eve@jordan.dom
"Matt" "Harrison" 87 matt@harrison.dom
"Luke" "Skywader" 32 luke@skywader.dom
"Beth" "Landwalker" 104 beth@landwalker.dom
"Adam" "Beachcomber" 29 adam@bc.dom
]
sort/skip names 4
foreach [first-name last-name age email] names [
print [first-name last-name age email]
]
Adam Beachcomber 29 adam@bc.dom
Beth Landwalker 104 beth@landwalker.dom
Evie Jordan 43 eve@jordan.dom
Luke Skywader 32 luke@skywader.dom
Matt Harrison 87 matt@harrison.dom
8.3 Comparison Functions
The /compare refinement allows you to perform custom
comparisons on the data being sorted. This refinement takes an additional
argument, which is the comparison function to use for ordering the data.
A comparison function is written as a regular function that takes two
arguments. These arguments are the values to be compared. A comparison function
returns true if the first value should be placed before the second
value and false if the first value should be placed after the second
value.
A normal comparison places data in ascending order:
ascend: func [a b] [a < b]
If the first value is less than the second, then true is returned
from the function and the first value is placed before the second value.
data: [100 101 -20 37 42 -4]
probe sort/compare data :ascend
[-20 -4 37 42 100 101]
Similarly:
descend: func [a b] [a > b]
If the first value is greater than the second value, then true is
returned and the data is sorted with greater values first. The sort will descend
from greater values.
probe sort/compare data :descend
[101 100 42 37 -4 -20]
Notice that in both cases the comparison function is passed by providing its
name preceded with a colon. The name preceded with a colon causes the function
to be passed to sort without first being evaluated. The
comparison function could also be provided directly with:
probe sort/compare data func [a b] [a > b]
[101 100 42 37 -4 -20]
9. Series as Data Sets
There are a few functions that operate on series as data sets. These
functions allow you to perform operations such as finding the union or
intersection between two series.
9.1 Unique
The unique function returns a unique set that contains no
duplicate values.
Examples:
data: [Bill Betty Bob Benny Bart Bob Bill Bob]
probe unique data
[Bill Betty Bob Benny Bart]
print unique "abracadabra"
abrcd
9.2 Intersect
The intersect function takes two series and returns a series
that contains the values that are present in both series.
Examples:
probe intersect [Bill Bob Bart] [Bob Ted Fred]
[Bob]
lunch: [ham cheese bread carrot]
dinner: [ham salad carrot rice]
probe intersect lunch dinner
[ham carrot]
print intersect [1 3 2 4] [3 5 4 6]
3 4
string1: "CBAD" ; A B C D scrambled
string2: "EDCF" ; C D E F scrambled
print sort intersect string1 string2
CD
The intersection can be found between bitsets:
all-chars: "ABCDEFGHI"
charset1: charset "ABCDEF"
charset2: charset "DEFGHI"
charset3: intersect charset1 charset2
print find charset3 "E"
true
print find charset3 "B"
false
The /case refinement allows case-sensitive intersection:
probe intersect/case [Bill bill Bob bob] [Bart bill Bob]
[bill Bob]
9.3 Union
The union function takes two series and returns a series
that contains all the values from both series, but no duplicates.
Examples:
probe union [Bill Bob Bart] [Bob Ted Fred]
[Bill Bob Bart Ted Fred]
lunch: [ham cheese bread carrot]
dinner: [ham salad carrot rice]
probe union lunch dinner
[ham cheese bread carrot salad rice]
print union [1 3 2 4] [3 5 4 6]
1 3 2 4 5 6
string1: "CBDA" ; A B C D scrambled
string2: "EDCF" ; C D E F scrambled
print sort union string1 string2
ABCDEF
The union function can also be used on bitsets:
charset1: charset "ABCDEF"
charset2: charset "DEFGHI"
charset3: union charset1 charset2
print find charset3 "C"
true
print find charset3 "G"
true
The /case refinement allows case-sensitive unions:
probe union/case [Bill bill Bob bob] [bill Bob]
[Bill bill Bob bob]
9.4 Exclude
The exclude function takes two series and returns a series
that contains all the values of the first series, less the values of the
second.
probe exclude [1 2 3 4] [1 2 3 5]
[4]
probe exclude [Bill Bob Bart] [Bob Ted Fred]
[Bill Bart]
lunch: [ham cheese bread carrot]
dinner: [ham salad carrot rice]
probe exclude lunch dinner
[cheese bread]
string1: "CBAD" ; A B C D scrambled
string2: "EDCF" ; C D E F scrambled
print sort difference string1 string2
AB
The /case refinement allows case-sensitive exclusion:
probe exclude/case [Bill bill Bob bob] [Bart bart bill Bob]
[Bill bob]
9.5 Difference
The difference function takes two series and returns a
series that contains all of the values not in common with both series.
Examples:
probe difference [1 2 3 4] [1 2 3 5]
[4 5]
probe difference [Bill Bob Bart] [Bob Ted Fred]
[Bill Bart Ted Fred]
lunch: [ham cheese bread carrot]
dinner: [ham salad carrot rice]
probe difference lunch dinner
[cheese bread salad rice]
string1: "CBAD" ; A B C D scrambled
string2: "EDCF" ; C D E F scrambled
print sort difference string1 string2
ABEF
The /case refinement allows case-sensitive differences.
probe difference/case [Bill bill Bob bob] [Bart bart bill Bob]
[Bill bob Bart bart]
9.6 Exclude
A variation of the difference function is the exclude function.
It returns the values that are in the first series but not found in
the second series. For example:
probe exclude [1 2 3 4] [1 2 3 5]
[4]
Notice that the above result does not contain 5, as was the case
with difference in the prior section.
probe exclude [Bill Bob Bart] [Bob Ted Fred]
[Bill Bart]
probe exclude "abcde" "ace"
"bd"
10. Multiple Series Variables
Multiple variables can refer to the same series. For instance:
data: [1 2 3 4 5]
start: find data 3
end: find start 4
print first start
2
print first end
4
Both the start and end variables refer to the series. They
have different positions, but the series they reference is the same.
If an insert or remove function is
performed on a series, the values in the series will shift and the
start and end variables may no longer refer to the same
values. For instance, if a value is removed from the series at the start
position:
remove start
print first start
3
print first end
5
The series has shifted to the left and the variables now refer to different
values.
Notice that the index positions of the variables have not changed, but the
values in the series have changed. The same situation would occur when using
insert.
Sometimes this side effect will work to your advantage. Sometimes it will
not, and you will need to correct for it in your code.
11. Modification Refinements
The change, insert, and remove functions can take
additional refinements to modify their operation.
11.1 Part
The /part refinement accepts a count or a position in the
series and uses it to limit the effect of the function.
For example, using the following series:
str: "abcdef"
blk: [1 2 3 4 5 6]
you can change part of str and blk using
change/part:
change/part str [1 2 3 4] 3
probe str
1234def
change/part blk "abcd" 3
probe blk
["abcd" 4 5 6]
You can insert part of a series into the tail of str and
blk using insert/part.
insert/part tail str "-ghijkl" 4
probe str
1234def-ghi
insert/part tail blk ["--" 7 8 9 10 11 12] 4
probe blk
["abcd" 4 5 6 "--" 7 8 9]
To remove part of the str and blk series, use
remove/part. Note how find is used to obtain
the series position:
remove/part (find str "d") (find str "-")
probe str
1234-ghi
remove/part (find blk 4) (find blk "--")
probe blk
["abcd" "--" 7 8 9]
11.2 Only
The /only refinement changes or inserts a block as a block,
rather than its individual values.
Examples:
blk: [1 2 3 4 5 6]
You can replace the 2 in blk with the block [a
b c] and insert the block [$1 $2
$3] at the position of the 5.
change/only (find blk 2) [a b c]
probe blk
[1 [a b c] 3 4 5 6]
insert/only (find blk 5) [$1 $2 $3]
probe blk
[1 [a b c] 3 4 [$1.00 $2.00 $3.00] 5 6]
11.3 Dup
The /dup refinement changes or inserts a value a specified
number of times.
Examples:
str: "abcdefghi"
blk: [1 2 3 4 5 6]
You can change the first four values in a string or block series to an
asterisk(*) with:
change/dup str "*" 4
probe str
****efghi
change/dup blk "*" 4
probe blk
["*" "*" "*" "*" 5 6]
To insert a dash (-) four times before the last value in a string or
block:
insert/dup (back tail str) #"-" 4
probe str
****efgh----i
insert/dup (back tail blk) #"-" 4
probe blk
["*" "*" "*" "*" 5 #"-" #"-" #"-" #"-" 6]
|