R3 Exception/Error Mechanism
Temporary page: thoughts about the exception/error mechanism.
Updated 11-Apr-2024.
Types of Error Exceptions
There are two flavors of exceptions related to the error! datatype in R3. By combining both of these into a single mechanism we minimize the time required for internal checks that need to be performed within the evaluation pipeline.
Unwinds | These are exceptions, not really errors, where the interpreter unwinds the environment back to a marker that was set up earlier during evaluation. For example, exit and return unwind back to just before their function entry points. Unwinding in this way gives each layer the opportunity to exit and clean up properly. These exceptions are represented by error codes < 100. |
Throws | These are harsh exceptions where the interpreter throws itself back to a previously marked environment (code block and index, code and data stacks). They do not exit through the evaluation layers, so any pending state from those layers must be handled by the GC (and we try hard to minimize that). For example, a zero-divide will throw back to a catch marker. The marker is an internal state object created by try (and a few other natives). These are represented by error codes >= 100. |
Unwinds are intended to be hidden from non-expert users behind common functions like exit, return, break, and continue. Their error! values are minimal, non-allocating structures because little information needs to be passed back. In addition, they are passed back as the return value of every function they pass through on the way back.
Throws, on the other hand, are intended for users to capture. Their error! values point to dynamically allocated objects because user code may want to examine the details of the exception.
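As a rough illustration of this split (written in Python rather than REBOL, and not the interpreter's actual data layout), the code threshold can be modelled as below. The names UNWIND_LIMIT, ErrorValue, is_unwind, and is_throw are hypothetical, and the specific codes 1 and 400 are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

UNWIND_LIMIT = 100  # codes below this are unwinds, codes at or above it are throws


@dataclass(frozen=True)
class ErrorValue:
    code: int
    value: object = None            # small payload an unwind carries, e.g. a return value
    details: Optional[dict] = None  # stands in for the allocated object a throw points to


def is_unwind(err: ErrorValue) -> bool:
    return err.code < UNWIND_LIMIT


def is_throw(err: ErrorValue) -> bool:
    return err.code >= UNWIND_LIMIT


# Hypothetical codes: 1 for a return unwind, 400 for a zero-divide throw.
ret = ErrorValue(code=1, value=10)
zdiv = ErrorValue(code=400, details={"id": "zero-divide"})
```

A single integer comparison is enough to tell the two flavors apart, which is the kind of cheap internal check the combined mechanism is meant to allow.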
Should we CATCH Unwinds?
Currently, unwinds propagate backwards as function return values. When they hit a try or other error catcher, they simply pass through it like any other non-error value.
For example, in this line:
val: try [if true [return 10]]
the return function builds an error! value and returns it; that value is then returned from if, and in turn from try. No throw has occurred and the try error handling mechanism does nothing.
This is the correct behavior. The try should not interfere with unwind exceptions.
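The pass-through behavior can be sketched in Python under some simplifying assumptions: unwinds are modelled as plain return values, throws as Python exceptions, and the names Unwind, Throw, try_, func, and divide are all hypothetical stand-ins rather than anything in the R3 sources.

```python
class Unwind:
    """Minimal unwind marker, passed back as an ordinary return value."""

    def __init__(self, kind, value=None):
        self.kind = kind
        self.value = value


class Throw(Exception):
    """Hard error that jumps straight back to the nearest catch marker."""


def try_(block):
    # Only throws are intercepted. Whatever the block returns, including
    # an Unwind marker, is handed back unchanged.
    try:
        return block()
    except Throw as err:
        return err


def func(body):
    # Models a function entry point: it consumes a "return" unwind.
    def call():
        result = body()
        if isinstance(result, Unwind) and result.kind == "return":
            return result.value
        return result
    return call


def divide(a, b):
    if b == 0:
        raise Throw("zero-divide")
    return a / b


# Rough analogue of: val: try [if true [return 10]] inside a function body.
f = func(lambda: try_(lambda: Unwind("return", 10)))
```

Calling f() yields 10 because the marker passes through try_ untouched and is only unwrapped at the function boundary, while try_(lambda: divide(1, 0)) captures the throw.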
Exception Marker Scoping
For unwinds, REBOL uses a form of dynamic scoping. This can easily be seen here:
f1: does [do b1]
b1: [return 10]
f1
10
The return function knows nothing about contexts; it only understands function frames. All it does is unwind to the closest function frame.
This behavior is not really consistent with the definitional scoping ("runtime lexical scoping") rules of REBOL, because function exiting becomes non-lexical: it is determined at run time.
This example code (simplistic) does not behave as expected:
c1: func [arg] [do arg]
f1: does [c1 [return 10] return 20]
f1
20
Normal lexical scoping rules make us think that the return 10 should apply to the f1 function, but it actually applies to the c1 function; thus, the result of 20 is returned. So, in this one line, within the same lexical context, the two returns apply to different functions.
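The same outcome can be reproduced with a small Python model. This is only a sketch: it uses a Python exception as a stand-in for the unwind value, and ReturnUnwind, dynamic_return, make_function, and do are hypothetical names.

```python
class ReturnUnwind(Exception):
    def __init__(self, value):
        super().__init__(value)
        self.value = value


def dynamic_return(value):
    # Knows nothing about where it was written; it just starts unwinding.
    raise ReturnUnwind(value)


def make_function(body):
    def call(*args):
        try:
            return body(*args)
        except ReturnUnwind as unwind:
            return unwind.value  # the closest function frame always wins
    return call


def do(block):
    return block()


# c1: func [arg] [do arg]
c1 = make_function(lambda arg: do(arg))


# f1: does [c1 [return 10] return 20]
def f1_body():
    c1(lambda: dynamic_return(10))  # consumed by c1, not by f1
    dynamic_return(20)


f1 = make_function(f1_body)
```

In this model f1() evaluates to 20, because the first unwind is absorbed by the call to c1 before it can reach f1.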
One way to solve this problem is to bind the return function into the current function context. This requires that the function's context object maintain a field for the current return function. The cost of this method is one extra value stored in the function's context (memory cost) and the extra binding time for the function (processing cost).
I believe that these are minimal costs compared to the benefits of a function-scoped return mechanism.
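A minimal Python sketch of a function-scoped return follows. It assumes each invocation gets its own return function tied to that invocation's frame, passed to the body as a first argument named ret (a hypothetical stand-in for the extra field in the function's context). It illustrates the idea only and is not the planned implementation.

```python
class ReturnUnwind(Exception):
    def __init__(self, frame, value):
        super().__init__(value)
        self.frame = frame
        self.value = value


def make_function(body):
    def call(*args):
        frame = object()  # unique identity for this invocation

        def ret(value):
            # This return is bound to the frame it was created for.
            raise ReturnUnwind(frame, value)

        try:
            return body(ret, *args)
        except ReturnUnwind as unwind:
            if unwind.frame is frame:
                return unwind.value
            raise  # belongs to an outer function: keep unwinding

    return call


# c1 simply evaluates the block it is given.
c1 = make_function(lambda ret, arg: arg())


def f1_body(ret):
    c1(lambda: ret(10))  # ret is f1's return, so c1 lets it pass through
    ret(20)


f1 = make_function(f1_body)
```

Here f1() evaluates to 10, matching the lexical expectation. The extra ret value per call corresponds to the memory cost mentioned above, and creating it on entry corresponds to the binding cost.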
Possible Return Method
Since before the start of the R3 project, I've been considering how best to implement a function-scoped return. My plan was (and is) to implement a function attributes mechanism that adds more than just this one feature. In summary, it would support:
- Function-scoped return and exit
- Optional specification of datatype returned
- Common method for other attributes.
The format would be simple: allow set-words within function specifications. For example:
f1: func [arg [integer!] return: [integer!]]
The advantages of this approach are:
- Clear that it means something different (from other arguments)
- Obvious what it means (related to return action)
- A nice way to indicate datatype of the return.
- Easy to bind, the word is in the spec block.
- Allows return as a dynamic function to still work (when return is not in the function spec).
The disadvantages are:
- Needs to bind an extra word on function creation.
- Extra value on function frame.
- User must remember to do it.
- Efficient evaluation implementation will be tricky.
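To make the proposed spec format concrete, here is a small Python sketch of how set-words could be separated from ordinary arguments when a spec is processed. The spec is modelled as a Python list of strings and nested lists, and parse_spec is a hypothetical helper, not an R3 function.

```python
def parse_spec(spec):
    """Split a spec into (arguments, attributes).

    Arguments are (word, types) pairs. Attributes map a set-word name,
    such as "return", to the type block that follows it (or None).
    """
    args = []
    attributes = {}
    i = 0
    while i < len(spec):
        item = spec[i]
        types = None
        # A type block, if present, directly follows its word.
        if i + 1 < len(spec) and isinstance(spec[i + 1], list):
            types = spec[i + 1]
            i += 1
        if isinstance(item, str) and item.endswith(":"):
            attributes[item[:-1]] = types  # set-word: a function attribute
        else:
            args.append((item, types))     # ordinary argument
        i += 1
    return args, attributes


# Models: func [arg [integer!] return: [integer!]]
args, attributes = parse_spec(["arg", ["integer!"], "return:", ["integer!"]])
```

Because the set-word sits in the spec block itself, finding it takes a single pass, and a spec without return: simply produces no "return" attribute, leaving the dynamic return in effect.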
True Error Capture
There is a minor disadvantage to the try method of error handling. It's a data-return mechanism, not an evaluation entry mechanism. In other words, you get back a value that tells you there was an error, but you don't get sent to a function to handle it. So you don't in fact know that an error happened; you only know that an error value was returned.
The solution is to evaluate a function, but only if an error occurred. The try function supports this with a refinement:
try/except [block] [error handler]
The error handler will run only if an error occurs.
In addition, you can use a function and pass it the error value:
try/except [block] func [error] [error handler]
Note that for an error, the value returned from try will be that returned from the error handler.
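The control flow of try/except can be sketched in Python as follows. This is an illustrative model only: ErrorValue, try_except, and divide are hypothetical names, and a handler with no parameters stands in for the plain block form while a one-parameter handler stands in for the function form.

```python
import inspect


class ErrorValue(Exception):
    """Stands in for an error! value raised by a throw."""

    def __init__(self, id):
        super().__init__(id)
        self.id = id


def try_except(block, handler):
    try:
        return block()  # no error: the handler never runs
    except ErrorValue as error:
        # Block-style handler takes no arguments; function-style takes the error.
        if len(inspect.signature(handler).parameters) == 0:
            return handler()
        return handler(error)


def divide(a, b):
    if b == 0:
        raise ErrorValue("zero-divide")
    return a // b
```

With this model, try_except(lambda: divide(1, 0), lambda error: error.id) evaluates to "zero-divide": the result is whatever the handler returns, and the handler is entered only because an error actually occurred.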