Parse-xml - Function Summary
Summary:
Parses XML code and returns a tree of blocks.
Usage:
parse-xml code
Arguments:
code - XML code to parse (must be: string)
Description:
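A limited XML parser is included with every REBOL/Core and
REBOL/View. It converts simple XML expressions into nested
REBOL blocks that are easier to process within REBOL. Each
element becomes a three-item block holding the tag name, the
attributes (or none), and the content (or none).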
xml: {
<PERSON>
<NAME>Fred</NAME>
<AGE>24</AGE>
<ADDRESS>
<STREET>123 Main Street</STREET>
<CITY>Ukiah</CITY>
<STATE>CA</STATE>
</ADDRESS>
</PERSON>
}
data: parse-xml xml
probe data
[document none [["PERSON" none ["^/ " ["NAME" none ["Fred"]] "^/ "
["AGE" none ["24"]] "^/ " ["ADDRESS" none ["^/ "
["STREET" none ["123 Main Street"]] "^/ " ["CITY" none ["Ukiah"]]
"^/ " ["STATE" none ["CA"]] "^/ "]] "^/ "]]]]
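As a minimal sketch of how attributes show up in the result, the example below parses a single element with one attribute. The LINK element and its HREF value are made up for illustration. The expected result is traced by hand from the ADD-ATTR function in the parser source, which stores attribute names and values as pairs in the second item of the element block:

```rebol
; Hypothetical input: one element with one attribute and text content
probe parse-xml {<LINK HREF="index.html">Home</LINK>}
; [document none [["LINK" ["HREF" "index.html"] ["Home"]]]]
```

An element without attributes has none in the second position, as in the PERSON example above.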
The XML above is semantically equivalent to writing the REBOL
block:
[
    PERSON [
        NAME "Fred"
        AGE 24
        ADDRESS [
            STREET "123 Main Street"
            CITY "Ukiah"
            STATE "CA"
        ]
    ]
]
Here is a small REBOL function that converts the parsed XML tree
into a similar REBOL block:
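to-rebol-data: func [block /local out] [
    out: copy []
    ; Each node is a [tag attributes content] triple; attributes are ignored.
    foreach [tag attr body] block [
        append out to-word tag
        ; body is none for an element with no content
        if body [
            foreach item body [
                either block? item [
                    append/only out to-rebol-data item
                ][
                    ; keep text, but drop whitespace-only strings
                    if not empty? trim item [append out item]
                ]
            ]
        ]
    ]
    out
]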
probe to-rebol-data data
[document [PERSON [NAME "Fred"] [AGE "24"] [ADDRESS
[STREET "123 Main Street"] [CITY "Ukiah"] [STATE "CA"]]]]
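Note that the function discards the whitespace-only strings found
between tags (using the TRIM function). TRIM modifies its argument,
so leading and trailing whitespace is also removed from the text
that is kept. Unlike the hand-written block above, each child
element is returned in its own block and every value remains a
string (AGE is "24", not 24).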
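The tree returned by parse-xml can also be searched directly, without converting it first. The following is a minimal sketch, not part of REBOL: FIND-CHILD is a hypothetical helper that assumes each node has the three-item layout (tag, attributes, content) seen in the output above, and it reuses the DATA value from the earlier example.

```rebol
; Hypothetical helper: return the first child element of NODE
; whose tag name equals TAG, or none if there is no such child.
find-child: func [node [block!] tag [string!]] [
    if block? node/3 [
        foreach item node/3 [
            ; skip text strings, compare the tag name of element blocks
            if all [block? item  item/1 = tag] [return item]
        ]
    ]
    none
]

person:  find-child data "PERSON"
address: find-child person "ADDRESS"
city:    find-child address "CITY"
print city/3/1    ; first content item of <CITY>, here "Ukiah"
```

The helper returns none when no child matches, so a chain of lookups like the one above should check each step if the input is not known in advance.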
If you wish to modify or expand the XML parser for your own
purposes, you can obtain its source code with these lines:
source parse-xml
parse-xml: func [
"Parses XML code and returns a tree of blocks."
code [string!] "XML code to parse"][
xml-language/parse-xml code]
probe xml-language
make object! [
verbose: false
joinset: func [cset chars][insert copy cset chars]
diffset: func [cset chars][remove/part copy cset chars]
error: func [msg arg][print [msg arg] halt]
space: make bitset! #{
0026000001000000000000000000000000000000000000000000000000000000
}
char: make bitset! #{
00260000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
letter: make bitset! #{
0100000000000000FEFFFF07FEFFFF070000000000000000FFFF7FFFFFFF7F01
}
digit: make bitset! #{
000000000000FF03000000000000000000000000000000000000000000000000
}
alpha-num: make bitset! #{
010000000000FF03FEFFFF07FEFFFF070000000000000000FFFF7FFFFFFF7F01
}
name-first: make bitset! #{
0100000000000004FEFFFF87FEFFFF070000000000000000FFFF7FFFFFFF7F01
}
name-chars: make bitset! #{
010000000060FF07FEFFFF87FEFFFF070000000000000000FFFF7FFFFFFF7F01
}
data-chars: make bitset! #{
00260000FFFFFFEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
qt1: "'"
qt2: {"}
data-chars-qt1: make bitset! #{
002600007FFFFFEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
data-chars-qt2: make bitset! #{
00260000FBFFFFEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
name: [name-first any name-chars]
sp: [some space]
sp?: [any space]
parents: []
new-node: func [name][
if verbose [print ["New tag:" name]]
insert/only tail parents parent
parent: add-kid copy reduce [name none none]]
end-node: func [name][
while [name <> first parent] [
if verbose [print ["unterminated tag:" first parent]]
if empty? parents [error "End tag error:" name]
pop-parent]
pop-parent]
pop-parent: func [][
parent: last parents
remove back tail parents]
add-kid: func [kid][
if none? third parent [parent/3: make block! 1]
insert/only tail third parent kid
kid]
add-attr: func [name value][
if none? second parent [parent/2: make block! 2]
insert insert tail second parent name value]
check-version: func [version][print ["XML Version:" version]]
document: [prolog sp? content to end]
prolog: [sp? xml-decl? any [sp? doc-type-decls]]
xml-decl?: ["<?xml" version-info thru "?>" | none]
version-info: [sp "version" eq [qt1 version-num qt1 | qt2 version-num qt2]]
version-num: [copy temp some name-chars (check-version temp)]
doc-type-decls: [cmt | "<!" thru ">" | "<?" thru "?>"]
element: [cmt | s-tag ["/>" (pop-parent) | #">" any content e-tag]]
s-tag: [#"<" tag (node: new-node tag-name) any [sp attribute] sp?]
e-tag: ["</" tag (end-node tag-name) sp? #">"]
tag: [copy tag-name name]
content: [element | copy data some data-chars (add-kid data)]
attribute: [copy attr-name name eq attr-value (add-attr attr-name attr-data)]
eq: [sp? #"=" sp?]
attr-value: [
[qt1 copy attr-data any data-chars-qt1 qt1] |
[qt2 copy attr-data any data-chars-qt2 qt2]
]
cmt: ["<!--" thru "-->"]
parse-xml: func [str][
paroot: parent: copy reduce ['document none none]
parse/case/all str document
paroot]
]
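The VERBOSE field in the object above controls the tracing messages printed by NEW-NODE and END-NODE. As a small sketch of how it might be used when experimenting with the parser (the sample XML string is made up for illustration):

```rebol
xml-language/verbose: true     ; turn on tracing in the parser object
parse-xml {<a><b>x</b></a>}    ; prints "New tag: a" then "New tag: b"
xml-language/verbose: false    ; restore the default
```

Because xml-language is a single shared object, the setting stays in effect for later calls to parse-xml until it is changed back.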
Related:
build-tag - Generates a tag from a composed block.
parse - Parses a series according to rules.