REBOL Technologies

Seek mode added for random access to large files

Carl Sassenrath, CTO
REBOL Technologies
31-Aug-2005 16:55 GMT

Article #0199

A new seek mode has been added to the open function, allowing you to rapidly access large files. This feature has been needed for quite some time, because /direct does too much buffering, which slows down access. Now you can read and write huge files such as photos, music, video, and logs that have been a problem in the past.

Here's a brief summary of the new seek mode. We will be providing more detailed documentation soon.

Opening a File for Seek

To open a file in random-access mode (with no buffering other than that done at the OS level):

port: open/seek/binary %bigfile.bin

Although you can drop the /binary refinement and open the file in string mode, we suggest that you use binary mode because we might still make a few changes to the string mode (for line termination processing).

Reading From the File

To read 1000 bytes from file position 1234:

data: copy/part at port 1234 1000

Remember that index positions are one-based in REBOL. That is, the first byte is at position one, not position zero.
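
If your offsets come from a file format description or another tool that counts from zero, add one before passing them to at. Here is a minimal sketch, assuming port was opened with open/seek/binary as shown above. The name read-at is a hypothetical helper for illustration, not a built-in function:

```rebol
; read-at is a hypothetical helper, not part of REBOL.
; offset is a zero-based byte offset; REBOL positions are one-based.
read-at: func [port offset count] [
    copy/part at port offset + 1 count
]

; First 16 bytes of the file, same as: copy/part at port 1 16
header: read-at port 0 16
```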

To read the entire file in 10K chunks:

size: 10240

while [not tail? port] [
    data: copy/part port size
    port: skip port size
]

Of course, if you're paying attention, you will know that forskip can be used instead of the while loop above, just as with any series:

forskip port size [data: copy/part port size]
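
Unless the file length is an exact multiple of size, the final chunk will be shorter than size, so rely on the length of the data actually returned. The sketch below counts the bytes read. It assumes that copy/part on a seek port stops at the end of the file, as it does for other series; total is just an illustrative name:

```rebol
total: 0
forskip port size [
    data: copy/part port size
    total: total + length? data   ; the last chunk may be shorter than size
]
print ["Bytes read:" total]
```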

Writing to the File

To overwrite part of the file:

change at port 1234 "This data is written into the file."

As with all series, the change function returns the next position, allowing you to write code like this:

port: at port 1234
port: change port "This string is first. "
port: change port "This string is second. "

To append, you can use change as shown above (at the tail of the port), or you can use insert:

insert tail port "This is the end of the file."

Insert also returns the tail position, so you can use that in loops, etc.
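
For example, a loop that appends several strings can keep reassigning the port to the position that insert returns. This is only a sketch: the strings are made up for illustration, and it assumes the port was opened with open/seek/binary as above:

```rebol
port: tail port                   ; start at the end of the file
foreach line ["first^/" "second^/" "third^/"] [
    port: insert port line        ; insert returns the new tail position
]
```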

The append function can also be used:

append port "End of file."

Closing the File

Be sure to close your file when you are done. The normal close function does the job:

close port

A File Join Function

Here is an example that joins two files -- even very large files. The append function is used because it calls insert on the tail of the port.

join-files: func [
    file1 "Target file"
    file2 "Source file (to copy to end of target file)"
    /size num "Optional transfer size"
    /local port1 port2
][
    port1: open/seek/binary file1
    port2: open/seek/binary file2
    if not num [num: 10000]    ; default transfer size in bytes
    forskip port2 num [
        ; copy the next chunk of the source to the end of the target
        append port1 copy/part port2 num
    ]
    close port1
    close port2
]

Performance Warning

When using /seek, the file is opened without buffering at the REBOL layer. Make wise choices regarding the sizes of your reads and writes. For example, it's not a good idea to read a file one byte at a time. Instead, read 10000 bytes into your own buffer, then access that buffer a byte at a time if necessary.
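
As a sketch of that approach, the following reads one block into memory and then walks through it a byte at a time. It assumes a port opened with open/seek/binary as shown earlier; buffer and zeros are illustrative names. In REBOL 2, the elements of a binary series should come back as integers from 0 to 255:

```rebol
buffer: copy/part port 10000      ; one read from the file
zeros: 0
foreach byte buffer [             ; byte-by-byte work happens in memory
    if zero? byte [zeros: zeros + 1]
]
print ["Zero bytes in this block:" zeros]
```

The file is touched only once per block, so the per-byte work costs no extra I/O.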

Ok... so finally... if you think you want to give this new file mode a try, a REBOL/Core beta test version of it has been uploaded to the REBOL Interim Builds area of this web site. If you find any problems, please let us know right away. Thanks.

Updated 18-Nov-2024   -   Copyright Carl Sassenrath   -   WWW.REBOL.COM