Updated: 9-Nov-2010
The previous sections defined extensions as a package that can contain both REBOL and C code and data, and showed you how to use them. This section explains how to make your own extensions.
An extension is a type of module.
Extensions are easy to create. They can contain both C and REBOL code, and a useful extension can be written in less than one page of C.
Before you can use an extension, you must import it. Importing loads the module's code and data, both REBOL and C.
There are four main concepts that you need to know to write your own extensions:
DLL functions | three standard functions for initializing the extension, dispatching functions, and cleanup. |
init block | a text string that defines the extension and its options, variables, exports, and initialization. |
commands | the native functions provided by your extension. |
reb-lib | a function API for accessing REBOL datatypes, structures, and services. |
Each of these will be explained in detail below.
To give you a general idea of what an extension looks like, here is an example. It is written in C, but a similar technique can be used in any compiled language.

    #include <stdlib.h>
    #include "reb-c.h"
    #include "reb-ext.h"

    const char *init_block =
        "REBOL [\n"
            "Title: {Example Extension Module}\n"
            "Name: example\n"
            "Type: module\n"
            "Exports: [add-mul]\n"
        "]\n"
        "add-mul: command [{Add and multiply integers.} a b c]\n"
    ;

    RXIEXT const char *RX_Init(int opts, RL_LIB *lib) {
        RXI = lib;                            // save the library pointer
        if (!CHECK_STRUCT_ALIGN) exit(100);   // abort on a structure layout mismatch
        return init_block;                    // hand the init string back to REBOL
    }

    RXIEXT int RX_Call(int cmd, RXIFRM *frm, REBCEC *ctx) {
        // (a + b) * c, stored in argument slot 1 and returned as the result
        RXA_INT64(frm, 1) =
            (RXA_INT64(frm, 1) + RXA_INT64(frm, 2)) * RXA_INT64(frm, 3);
        return RXR_VALUE;
    }
After compiling that code into a DLL, you can use it in your REBOL code:
import %example.dll
print add-mul 1 2 3
9
The speed of function evaluation is about the same as other REBOL native functions. (Normally within 5%.)
As shown above, an extension is a dynamically loaded library (DLL). When REBOL loads the extension, it expects to find one or more predefined functions.
RX_Init | called when the extension has been loaded. The purpose is to provide any special option flags as well as a pointer to the extension library (RL_LIB). |
RX_Quit | called when the extension is no longer needed. This is optional. |
RX_Call | dispatches the native command functions defined by the extension. This function is passed the command number and an array that holds the command's arguments (called the command frame.) |
After the DLL has been loaded, its RX_Init function will be called. If the RX_Init function cannot be found in the DLL, REBOL will throw an error that the extension is not valid.
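The RX_Init function is expected to perform these actions: save the library pointer (RXI in the example), check structure alignment, and return the init string.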
The init string is REBOL source similar to that used to define modules. It can define functions (both internal and exported), variables, strings, or other data used by your extension.
In the code example above, the init_block holds this source:
    REBOL [
        Title: {Example Extension Module}
        Name: example
        Type: module
        Exports: [add-mul]
    ]
    add-mul: command [{Add and multiply integers.} a b c]
Although we use the quote mechanism of C to embed it, you can use any technique you want, as long as what is returned is a valid ASCII or UTF-8 string.
The native functions defined within an extension are called commands. They are similar to the native functions found in REBOL, and evaluate at the full speed of the CPU.
Each command has two parts:
spec | the interface specification (in REBOL format) that provides a help string (title) and lists the arguments for the function. |
body | the C code that makes the command do its job. |
In the example above, the spec for the add-mul command was defined by this line:
add-mul: command [{Add and multiply integers.} a b c]
Note that this has the same form as the function definitions used throughout REBOL. The command word is itself a specially defined function, similar to func and function, which are used to define other functions. How command works is described in more detail below.
The body of the add-mul function is found in this code:
    RXIEXT int RX_Call(int cmd, RXIFRM *frm, REBCEC *ctx) {
        RXA_INT64(frm, 1) =
            (RXA_INT64(frm, 1) + RXA_INT64(frm, 2)) * RXA_INT64(frm, 3);
        return RXR_VALUE;
    }
The details of the RXIFRM structure will be explained below. This example is simplistic because the extension handles only a single command (add-mul), so RX_Call can ignore the cmd argument. More examples will be shown below.
In the code above, the add-mul command arguments have no datatype qualifier; however, for most code you will want to provide a list of one or more valid datatypes. This makes it possible for the datatype to be verified prior to calling your native code. It also makes error messages easier to understand.
For example, here is a better definition for the add-mul command:
    add-mul: command [
        {Add and multiply integers.}
        a [integer!]
        b [integer!]
        c [integer!]
    ]
If an attempt is made to pass a datatype other than integer, the normal error message will be thrown.
You can also accept multiple datatypes for the arguments of your function. For example, if you want to accept integer and decimal:
    add-mul: command [
        {Add and multiply integers.}
        a [integer! decimal!]
        b [integer! decimal!]
        c [integer! decimal!]
    ]
Of course, now the C code body of your function will need to check which datatype is being passed.
The datatypes allowed for commands are listed in the Datatypes section below.
Within the DLL, the RX_Call function dispatches command functions. For extensions with only a few commands, all of the related code can be put into the same RX_Call function. For extensions with many commands, you may want to build a function table and redirect to sub-functions.
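The function-table approach mentioned above might look like the following minimal sketch. It is self-contained and does not use the real extension headers: `DemoFrame`, `DemoCommand`, `demo_call`, `DEMO_VALUE`, and `DEMO_NO_COMMAND` are hypothetical stand-ins for `RXIFRM`, the sub-function type, `RX_Call`, `RXR_VALUE`, and `RXR_NO_COMMAND`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; a real extension would use RXIFRM and RXR_* codes. */
typedef struct { int64_t args[8]; } DemoFrame;   /* slot 0 is unused here */
enum { DEMO_VALUE = 1, DEMO_NO_COMMAND = 2 };

typedef int (*DemoCommand)(DemoFrame *frm);

/* Command 0: (a + b) * c, result left in slot 1. */
static int cmd_add_mul(DemoFrame *frm) {
    frm->args[1] = (frm->args[1] + frm->args[2]) * frm->args[3];
    return DEMO_VALUE;
}

/* Command 1: a - b, result left in slot 1. */
static int cmd_sub(DemoFrame *frm) {
    frm->args[1] -= frm->args[2];
    return DEMO_VALUE;
}

/* Table position matches the command index passed by the caller. */
static const DemoCommand demo_table[] = { cmd_add_mul, cmd_sub };

/* Dispatcher in the role of RX_Call: look up the sub-function by index. */
static int demo_call(int cmd, DemoFrame *frm) {
    size_t count = sizeof demo_table / sizeof demo_table[0];
    assert(frm != NULL);
    if (cmd < 0 || (size_t)cmd >= count) return DEMO_NO_COMMAND;
    return demo_table[cmd](frm);
}
```

With a table, adding a command means writing one sub-function and appending one table entry; the dispatcher itself does not change.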
In the arguments to RX_Call, the cmd arg provides the index number for the command, and you can use if or switch statements to select the correct command. For just a few commands, an if chain is probably faster; as the number of commands grows, a switch will generally be faster.
    RXIEXT int RX_Call(int cmd, RXIFRM *frm, REBCEC *ctx) {
        if (cmd == 0) {
        }
        else if (cmd == 1) {
        }
        ...
    }

    RXIEXT int RX_Call(int cmd, RXIFRM *frm, REBCEC *ctx) {
        switch (cmd) {
        case 0:
            <command code>
            break;
        case 1:
            <command code>
            break;
        case 2:
            ...
        }
    }
If you have a larger number of commands, you will want to create an enum to help relate command numbers to their function names.
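As a rough illustration of the enum idea, the sketch below names each command index and switches on those names. It assumes that command indexes start at 0 and follow the order of the command definitions in the init block. The names `demo_command`, `CMD_ADD_MUL`, `CMD_NEGATE`, `demo_call`, `DEMO_VALUE`, and `DEMO_NO_COMMAND` are hypothetical, and a plain array stands in for the command frame.

```c
#include <assert.h>

/* Hypothetical return codes standing in for RXR_VALUE and RXR_NO_COMMAND. */
enum { DEMO_VALUE = 1, DEMO_NO_COMMAND = 2 };

/* Order must match the order of the command definitions in the init block
   (assumed to be numbered from 0). */
enum demo_command {
    CMD_ADD_MUL,    /* 0 */
    CMD_NEGATE,     /* 1 */
    CMD_COUNT       /* number of commands, not a command itself */
};

/* args[1..3] mirror the 1-based argument slots of a command frame. */
static int demo_call(int cmd, long long *args) {
    assert(args != 0);
    switch (cmd) {
    case CMD_ADD_MUL:
        args[1] = (args[1] + args[2]) * args[3];
        return DEMO_VALUE;
    case CMD_NEGATE:
        args[1] = -args[1];
        return DEMO_VALUE;
    default:
        return DEMO_NO_COMMAND;
    }
}
```

Keeping the enum next to the init block makes it easier to notice when the two drift out of order.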
Command arguments are passed to RX_Call in an argument frame (a structure) accessed via the frm pointer which is of the RXIFRM type.
A frame consists of two parts:
types | a byte array of datatypes. The zeroth byte provides the number of arguments. The size of this array is the number of arguments rounded up to a multiple of eight. Normally, it only occupies 64 bits (enough to support seven function arguments.) |
values | 64 bit values. The format of each value depends on the argument's datatype. For example, if the datatype is an integer, its value is a 64 bit integer. If the datatype is a decimal, the value is a 64 bit IEEE float (double). The RXIARG typedef provides a union to properly access each type of value. |
Graphically, a frame looks like this:
Command frame |
---|
type array (64 bits) |
argument 1 (64 bits) |
argument 2 (64 bits) |
argument 3 (64 bits) |
... |
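The layout above can be modeled in plain C. This is only a sketch of the idea, not the actual definitions from the extension headers: `DemoArg`, `DemoFrame`, `demo_count`, and `demo_type` are hypothetical names standing in for `RXIARG`, `RXIFRM`, and the count and type accessors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit frame slot (the real union is RXIARG). */
typedef union {
    int64_t int64;      /* integer! */
    double  dec64;      /* decimal! */
    uint8_t bytes[8];   /* slot 0 only: count byte plus seven type bytes */
} DemoArg;

/* Slot 0 is the type array; slots 1..7 hold the argument values. */
typedef struct { DemoArg slots[8]; } DemoFrame;

/* Number of arguments: the zeroth byte of the type array. */
static int demo_count(const DemoFrame *f) { return f->slots[0].bytes[0]; }

/* Datatype code of the n-th argument (1-based). */
static int demo_type(const DemoFrame *f, int n) {
    assert(n >= 1 && n <= 7);   /* byte 0 is the count, not a type */
    return f->slots[0].bytes[n];
}
```

Because the type array occupies slot 0, argument n lives in slot n, which is why argument numbering starts at 1.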
To make it easier to access argument related information, macros are provided:
    RXA_COUNT(frm)     returns the arg count
    RXA_TYPE(frm,n)    returns the datatype for the n-th arg
To access a specific argument, such as an integer, you write:
RXA_INT64(frm, n)
where n normally begins at 1 (because slot 0 holds the type array).
Here is a list of these datatype specific macros:
    RXA_INT64(f,n)     integer!
    RXA_DEC64(f,n)     decimal! and percent!
    RXA_LOGIC(f,n)     logic!
    RXA_CHAR(f,n)      char! (32 bits)
    RXA_TIME(f,n)      time!
    RXA_DATE(f,n)      date! (encoded)
    RXA_WORD(f,n)      word! (all)
    RXA_PAIR_X(f,n)    pair!
    RXA_PAIR_Y(f,n)    pair!
    RXA_TUPLE(f,n)     tuple!
    RXA_SERIES(f,n)    series! (reference)
    RXA_INDEX(f,n)     series! (index)
    RXA_HANDLE(f,n)    any pointer (32 bit address)
In addition, this macro is provided:
RXA_REF(f,n) refinement flag
Refinements are discussed below.
Similar to other functions, commands can accept multiple datatypes for a single argument. Within your C code you will need to be able to detect which datatype has been passed, and access its value properly.
Here is an example command that allows either an integer or a decimal for its argument:
cmd: command [n [integer! decimal!]]
The body code would be something like:
    RXIEXT int RX_Call(int cmd, RXIFRM *frm, REBCEC *ctx) {
        i64 i;
        d64 d;

        if (cmd == 1) {
            // The command has a single argument, so both branches read slot 1.
            if (RXA_TYPE(frm, 1) == RXT_INTEGER) {
                i = RXA_INT64(frm, 1);
            } else {
                d = RXA_DEC64(frm, 1);
            }
            ...
        }
    }
Note that i64 and d64 are general typedefs used to abstract compiler differences (e.g. the use of __int64 for 64 bit integers on older MSVC).
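One common way to handle an argument that may be integer or decimal is to normalize it to a double before doing any arithmetic. The sketch below shows that pattern with a small tagged value in place of the real frame; `DemoValue`, `DEMO_INTEGER`, `DEMO_DECIMAL`, and `demo_as_double` are hypothetical names, with the tag playing the role of `RXA_TYPE(frm, n)`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type codes standing in for RXT_INTEGER and RXT_DECIMAL. */
enum { DEMO_INTEGER = 1, DEMO_DECIMAL = 2 };

/* A tagged value: the tag plays the role of RXA_TYPE(frm, n). */
typedef struct {
    int type;
    union { int64_t int64; double dec64; } v;
} DemoValue;

/* Read the argument as a double, whichever of the two types was passed. */
static double demo_as_double(const DemoValue *arg) {
    assert(arg->type == DEMO_INTEGER || arg->type == DEMO_DECIMAL);
    if (arg->type == DEMO_INTEGER)
        return (double)arg->v.int64;   /* convert integer to decimal */
    return arg->v.dec64;
}
```

Reading the wrong union member for the indicated type would reinterpret the 64 bits, so the type check has to come before any access.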
Many examples are provided in the extensions: example extensions section.
As with other functions, commands are allowed to accept refinements as arguments. Such refinements are passed as normal arguments with a value of none or true. A simple test will determine if the refinement has been specified.
For example, if you write a special trigonometric function, you may want to provide a refinement to specify radians rather than degrees:
hyper-sine: command [d [decimal!] /radians]
This code will handle the refinement flag:
    d = RXA_DEC64(frm, 1);
    if (RXA_REF(frm, 2)) rads = TRUE;
    ...
Note that you do not need to check the datatype of the argument. The REBOL extension caller ensures that none has a zero 32 bit value, and that true has a non-zero 32 bit value.
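To show how the refinement flag might drive the computation, here is a small self-contained sketch. The helper `demo_angle_in_radians` and the constant `DEMO_PI` are hypothetical; the `radians_flag` parameter stands in for the value read with `RXA_REF(frm, 2)`.

```c
#include <assert.h>

#define DEMO_PI 3.14159265358979323846

/* Convert the angle to radians unless the caller already supplied radians.
   `radians_flag` plays the role of RXA_REF(frm, 2): zero when the
   refinement is absent, non-zero when it is present. */
static double demo_angle_in_radians(double d, int radians_flag) {
    assert(d == d);   /* reject NaN input */
    if (radians_flag) return d;
    return d * DEMO_PI / 180.0;
}
```

Normalizing the argument once at the top keeps the rest of the command body free of refinement checks.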
The integer return code from RX_Call determines what the command returns. Like other functions, a command can return none, one, or multiple results.
An enum of results is defined. The constants are:
RXR_UNSET | Do not return a value. |
RXR_NONE | A shortcut for returning NONE. |
RXR_TRUE | A shortcut for returning TRUE. |
RXR_FALSE | A shortcut for returning FALSE. |
RXR_VALUE | Return a single value (that found in the arg[1] position). |
RXR_BLOCK | A shortcut method to return multiple values. See below. |
RXR_ERROR | Return an error (special case.) |
RXR_BAD_ARGS | Throws the error: Bad command arguments. This is a generic result you can return for errors in simple functions. |
RXR_NO_COMMAND | Throws the error: The command at that index is not implemented. |
The first few are shortcuts to make your code simpler and smaller for such cases.
RXR_VALUE indicates that you want to return the first argument of the frame as the result using its indicated datatype.
For example, this code adds the first and second arguments, stores the sum in the first, and returns it:

    RXA_INT64(frm, 1) += RXA_INT64(frm, 2);
    return RXR_VALUE;
As a variation, in this code the arguments are integers, but it returns a decimal result:
    RXA_DEC64(frm, 1) = (d64)(RXA_INT64(frm, 1) + RXA_INT64(frm, 2));
    RXA_TYPE(frm, 1) = RXT_DECIMAL;
    return RXR_VALUE;
When multiple results are needed, the command must return a block. Often, your command will need to return only a few values, so a shortcut technique is provided.
If you store your results within the argument slots of the frame, and also set their datatypes within the type array, they will be considered a block if you return the RXR_BLOCK return code. You must also indicate how many values are within the block.
Here's an example that returns three values, an integer, decimal, and a time:
    RXA_COUNT(frm) = 3;

    RXA_INT64(frm, 1) = 1;            // integer in slot 1
    RXA_TYPE(frm, 1) = RXT_INTEGER;

    RXA_DEC64(frm, 2) = 2.2;          // decimal in slot 2
    RXA_TYPE(frm, 2) = RXT_DECIMAL;

    RXA_TIME(frm, 3) = 1200000000;    // time in slot 3, in nanoseconds
    RXA_TYPE(frm, 3) = RXT_TIME;

    return RXR_BLOCK;
You can only return up to seven values in this way. Beyond that, you must use the RL_Make_Block function and append each value into the new block.
The examples shown above are valid for the most common command frame: one with seven or fewer arguments. It is very rare for a function to require more than seven arguments, and as a matter of general programming practice, if you find that necessary, it may be better to pass your arguments encapsulated within a block.
Although the initial implementation of commands does not support extended frames, we may add it in the future if it seems important for some reason.
For frames larger than seven arguments, the type array is expanded in increments of 8 bytes. This means that argument references would be shifted by the appropriate amount. To better abstract such offsets, new macros would be provided to account for those offsets.
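The offset arithmetic described above can be sketched as follows. Since extended frames are not implemented, this is purely illustrative. It assumes the type array holds one count byte plus one type byte per argument and grows in 8-byte steps, which matches the statement that seven arguments fit in 64 bits. The helpers `demo_type_array_bytes` and `demo_arg_slot` are hypothetical.

```c
#include <assert.h>

/* Bytes needed for the type array: one count byte plus one byte per
   argument, rounded up to a multiple of eight (assumed layout). */
static int demo_type_array_bytes(int arg_count) {
    assert(arg_count >= 0);
    return ((arg_count + 1 + 7) / 8) * 8;
}

/* Index of the 64-bit slot that would hold argument n (1-based). */
static int demo_arg_slot(int arg_count, int n) {
    assert(n >= 1 && n <= arg_count);
    return demo_type_array_bytes(arg_count) / 8 + (n - 1);
}
```

Under these assumptions a frame with up to seven arguments keeps argument 1 in slot 1, while a frame with eight arguments would need a 16-byte type array and place argument 1 in slot 2.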
As described above, the command word is a special function that creates new command functions within an extension module.
Basically, command calls make on the command! datatype, in the general form:
make command! reduce [args module index]
where:
args | is the argument spec for the new function. |
module | is the extension module context and is used to reference back to the extension dispatcher. |
index | is the dispatch index for a specific command. |
You can directly create commands using this make method; however, in addition to the argument spec, you will need to provide the module and correct dispatch index each time.
To make extension modules easier to read, the command function method was created. This function is defined within the context of the extension module allowing the module argument to be implied (with self.) In addition, the command dispatch index can be a module local variable that is auto-incremented for each new command.
This mechanism simplifies command definitions and requires very little code to do so.
Here's the code that the module system automatically inserts into each module:
    cmd-index: 0
    command: func [
        "Define a new command for an extension."
        args [block!]
    ][
        make command! reduce [args self ++ cmd-index]
    ]
To work properly, this code must be bound to the context of the module. That is why it resides within the module itself.
Other fields are also inserted into the extension module. The system/standard/extension object defines those fields and is used by the system's load-extension native function.
These datatypes are currently supported for commands.
Name | Description |
---|---|
logic | An integer representing TRUE and FALSE. |
integer | A 64-bit integer. |
decimal | 64-bit IEEE floating point (double). |
percent | 64-bit IEEE floating point (double). |
char | A character as a 32 bit code point. |
pair | Two 32 bit signed integers for x and y. |
tuple | A length byte followed by seven bytes. (Note truncation.) |
time | A 64 bit time in nanoseconds. |
date | A 32 bit encoded date and time zone. |
word | A 32 bit identifier for a word. |
set-word | A 32 bit identifier for a word. |
get-word | A 32 bit identifier for a word. |
lit-word | A 32 bit identifier for a word. |
refinement | A 32 bit identifier for a word. |
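As a small worked example of the time encoding listed in the table, the following sketch builds a 64-bit nanosecond count from hours, minutes, and seconds. The helper `demo_time_ns` and the constant `DEMO_NS_PER_SEC` are hypothetical and are not part of the extension library.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NS_PER_SEC INT64_C(1000000000)

/* Build a 64-bit nanosecond count from hours, minutes, and seconds. */
static int64_t demo_time_ns(int hours, int minutes, int seconds) {
    assert(minutes >= 0 && minutes < 60 && seconds >= 0 && seconds < 60);
    return ((int64_t)hours * 3600 + minutes * 60 + seconds) * DEMO_NS_PER_SEC;
}
```

By the same arithmetic, the value 1200000000 stored in the time slot of the block example earlier corresponds to 1.2 seconds.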
The series datatypes are indirect datatypes and can be divided into these general groups:
Group | Description |
---|---|
strings | Including: string, file, email, url, tag, and issue. |
blocks | Including: block, paren, path, set-path, get-path, and lit-path. |
special | Including: binary, bitset, image, and vector. |
A few special datatypes are also allowed:
Name | Description |
---|---|
unset | Means that a variable is not initialized or a function returned no result. |
none | No value. (For example, a find found no match.) |
handle | A way to store code and data pointers. |
Within extensions it can be quite useful to access words as symbols. For example, if you are writing an extension that has its own special control dialect, you will want to easily handle the words that are part of it. (If you were familiar with AREXX in AmigaOS, then you know what can be done with just a little programming effort.)
There are generally two ways to use a word! type:
symbols | words that represent themselves (the word itself is the meaning) |
variables | words used to represent storage |
In the R3 1.0 extension interface, words are supported as symbols only.
When you specify your extension, within its module initialization, define a block of words. Later within your code, the word will be indicated by its index within that block.
For example, if within your init block you define:
    words: [jpeg mpeg gif tiff]
    resize-image: command [img [image!] 'action [word!]]
then you can use this C code to determine which word was passed:
    switch (RXA_WORD(frm, 2)) {
    case 1: // jpeg
        ...
    case 2: // mpeg
        ...
    case 3: // gif
        ...
    case 4: // tiff
        ...
    }
This same technique can be used for words found in blocks. (See block value access below.)
Now, writing:
resize-image data 'gif
will enter the case 3 code above. (Of course, this can also be done using refinements; see the earlier notes.)
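Giving the word indexes names makes the switch easier to read and keep in step with the words block. The sketch below assumes, as in the example above, that indexes follow each word's position in `words: [jpeg mpeg gif tiff]`, counting from 1. The enum `demo_word` and the helper `demo_format_name` are hypothetical, and the `word_index` parameter stands in for the value read with `RXA_WORD(frm, 2)`.

```c
#include <assert.h>
#include <string.h>

/* Indexes follow the position of each word in the init block's
   words: [jpeg mpeg gif tiff], counting from 1. */
enum demo_word { W_JPEG = 1, W_MPEG, W_GIF, W_TIFF };

/* `word_index` plays the role of RXA_WORD(frm, 2). Returns a label for
   recognized words, or NULL for anything outside the words block. */
static const char *demo_format_name(int word_index) {
    switch (word_index) {
    case W_JPEG: return "jpeg";
    case W_MPEG: return "mpeg";
    case W_GIF:  return "gif";
    case W_TIFF: return "tiff";
    default:     return NULL;
    }
}
```

If the words block is reordered, only the enum needs to change, not every case label.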
The extension library provides functions for accessing and creating strings and blocks. These functions are accessed via macros that use the library pointer passed in RX_Init.
RL_MAKE_BLOCK | make a new block of given length |
RL_MAKE_STRING | make a new string of given length and width |
RL_MAP_WORDS | map a block of words to their canonical symbol identifiers |
RL_FIND_WORD | find word in an array of symbol identifiers |
RL_SERIES_INFO | get series info: length, size, etc. |
RL_GET_CHAR | get a char from a string |
RL_SET_CHAR | set a char in a string |
RL_GET_VALUE | get a value from a block |
RL_SET_VALUE | set a value in a block |
RL_GET_STRING | get string as an array |
It is likely that more functions will be added as needed.
Note: allocation GC concerns
If you write an extension that accesses external APIs, including standard OS libraries, you will need to be careful. R3 in general uses an asynchronous model for I/O. If you call APIs that perform I/O that may block, then your REBOL process will also block during that I/O. This can cause your GUI to freeze, or cause other pending I/O operations to overflow or fail.
If the external API does not block, then it's probably fine to call it. However, for blocking functions, a better solution is to write them as an asynchronous R3 device. This is a special type of extension. (As of this 1.0 draft release, this is not yet available, but we want to make you aware of it.)
RX_ | the main functions of the DLL itself (not the API) |
RL_ | functions in the reb-lib (REBOL library) |
RXT_ | type (datatype) identifiers for command arguments |
RXA_ | command argument access macros |
RXR_ | command return codes |
Editor note: pending
Editor note: mention make-host-ext tool for building the body of the module
dealing with handles
Pending features: codecs, devices
Output a DLL.
equivalent to:
make command! [specs extension-handle func-num]
More information about this will be available in the advanced section.