Updated: 3-Aug-2010

REBOL 3 Functions: split

split  series  dlm  /into

Split a series into pieces; fixed or variable size, fixed number, or at delimiters

Arguments:

series [series!] - The series to split

dlm [block! integer! char! bitset! any-string!] - Split size, delimiter(s), or rule(s).

Refinements:

/into - If dlm is an integer, split into n pieces, rather than pieces of length n.

See also:

extract   parse  

Description

The split function is used to divide a series into subcomponents. It provides several ways to specify how you want the split done.

Split into equal segments:

Given an integer as the dlm parameter, split will break the series up into pieces of that size.

probe split "1234567812345678" 4
["1234" "5678" "1234" "5678"]

If the series can't be evenly split, the last value will be shorter.

probe split "1234567812345678" 3
["123" "456" "781" "234" "567" "8"]
probe split "1234567812345678" 5
["12345" "67812" "34567" "8"]

Split into N segments:

Given an integer as dlm, and using the /into refinement, it breaks the series into n pieces, rather than pieces of length n.

probe split/into [1 2 3 4 5 6] 2
[[1 2 3] [4 5 6]]
probe split/into "1234567812345678" 2
["12345678" "12345678"]

If the series can't be evenly split, the last value will be longer.

probe split/into "1234567812345678" 3
["12345" "67812" "345678"]
probe split/into "1234567812345678" 5
["123" "456" "781" "234" "5678"]
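
The piece sizes in these examples are consistent with dividing the series length by n and rounding down, with the last piece taking the remainder. The following is only a sketch of that arithmetic, inferred from the examples above rather than taken from the implementation; the words data, n, piece-size, and last-size are illustrative names.

```rebol
data: "1234567812345678"
n: 3
; Assumption: each of the first n - 1 pieces gets length / n, rounded down
piece-size: to integer! divide length? data n    ; 16 / 3 gives 5
; The last piece takes everything that remains
last-size: (length? data) - (piece-size * (n - 1))    ; 16 - 10 gives 6
print [piece-size last-size]
```

With n set to 5 the same arithmetic gives a piece size of 3 and a last piece of 4, matching the second example.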

Split into uneven segments:

If dlm is a block containing only integer values, those values determine the size of each piece returned. That is, each piece can be a different size.

probe split [1 2 3 4 5 6] [2 1 3]
[[1 2] [3] [4 5 6]]
probe split "1234567812345678" [4 4 2 2 1 1 1 1]
["1234" "5678" "12" "34" "5" "6" "7" "8"]
probe split first [(1 2 3 4 5 6 7 8 9)] 3
[(1 2 3) (4 5 6) (7 8 9)]
probe split #{0102030405060708090A} [4 3 1 2]
[#{01020304} #{050607} #{08} #{090A}]

If the total of the dlm sizes is less than the length of the series, the extra data will be ignored.

probe split [1 2 3 4 5 6] [2 1]
[[1 2] [3]]

If you have extra dlm sizes after the series data is exhausted, you will get empty values.

probe split [1 2 3 4 5 6] [2 1 3 5]
[[1 2] [3] [4 5 6] []]

If the last dlm size would return more data than the series contains, it returns all the remaining series data, and no more.

probe split [1 2 3 4 5 6] [2 1 6]
[[1 2] [3] [4 5 6]]

Negative values can be used to skip in the series without returning that part:

probe split [1 2 3 4 5 6] [2 -2 2]
[[1 2] [5 6]]
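
Negative sizes are handy for dropping fixed-width separators. As a small sketch, assuming the skip rule above applies to strings the same way it applies to blocks, a date string (hypothetical sample data) can be cut into its fields like this:

```rebol
; Take 4 characters, skip 1, take 2, skip 1, take 2
probe split "2010-08-03" [4 -1 2 -1 2]
; By the rule described above, the expected result is ["2010" "08" "03"]
```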

Simple delimiter splitting:

Char or any-string values can be used for simple splitting, much as you would with parse/all, but with different behavior for strings that have embedded quotes.

probe split "abc,de,fghi,jk" #","
["abc" "de" "fghi" "jk"]
probe split "abc<br>de<br>fghi<br>jk" <br>
["abc" "de" "fghi" "jk"]
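
Delimiter splits can also be nested. The sketch below uses hypothetical sample data and splits it into rows at the newline character, then splits each row at commas; it relies only on the char delimiter behavior shown above.

```rebol
; Hypothetical sample data: two records separated by a newline
data: "abc,de^/fghi,jk"
; Split into rows first, then split each row into fields
foreach row split data newline [
    probe split row #","
]
; Expected, following the examples above: ["abc" "de"] and then ["fghi" "jk"]
```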

If you want to split at more than one character value, you can use a bitset, such as one created with charset.

probe split "abc|de/fghi:jk" charset "|/:"
["abc" "de" "fghi" "jk"]

Note that for greater control, you can use simple parse rules:

probe split "abc     de fghi  jk" [some #" "]
["abc" "de" "fghi" "jk"]

