
transio (transparent file I/O access class)

Motivation

Neuroimaging datasets can easily become very large. For example, a study with 2 groups of 24 subjects each and a 3x2x2 within-subject design requires storing at least 48 * 13 = 624 maps. If each of these maps fully covers the brain at a 3mm resolution, it contains up to 200,000 voxels and requires 800,000 bytes of disk space (4 bytes per voxel), for a total of about 500MByte of required storage capacity.
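The estimate above can be retraced in a few lines. This is only a worked version of the arithmetic; the variable names are illustrative, and the 4 bytes per voxel are an assumption derived from the figures given (800,000 bytes for 200,000 voxels).

```matlab
% worked version of the storage estimate (all figures taken from the text)
nsubjects   = 2 * 24;      % 2 groups of 24 subjects each
mapspersubj = 13;          % maps per subject, as stated in the text
nvoxels     = 200000;      % upper bound for whole-brain coverage at 3mm
bytespervox = 4;           % assumption: 4-byte values (e.g. single)

nmaps      = nsubjects * mapspersubj;   % 624 maps
bytespmap  = nvoxels * bytespervox;     % 800000 bytes per map
totalbytes = nmaps * bytespmap;         % 499200000 bytes, about 500MByte

fprintf('%d maps, %d bytes each, %.1f MByte in total\n', ...
    nmaps, bytespmap, totalbytes / 1e6);
```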

Working with such large datasets becomes difficult if several of them must be “kept in memory” at once (e.g. the time course data used to compute a regression, on top of the regression result itself). Transparent I/O access to files addresses this by allowing MATLAB to keep in memory only the data required for the task at hand.

To make this concept easy to use, the class is tightly integrated into the xff class, which allows binary data files to be read with “transio access” enabled. The following syntax can be used (globally) to switch transio access on or off:

xff_transio.m
% enable transio access for all arrays larger than 500k
xff(0, 'transiosize', 5e5);
 
% disable transio access
xff(0, 'transiosize', Inf);
 
% enable transio access for VTCData elements only
xff(0, 'transiosize', 'vtc', 5e5);
 
% get current xff/transio configuration settings
xff_tio_config = xff(0, 'transiosize');
 
% restore configuration
xff(0, 'transiosize', xff_tio_config);

Requirements

Currently, the transio access is limited to the following conditions:

  • data is stored at a known and fixed “position” in the file (which can be determined at run-time, but must remain the same while the object is used)
  • size of accessed array does not change (unlike regular arrays, transio access references do not allow changes in size at run-time)
  • if complex indexing is performed, each index is used only once
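The last requirement can be checked before indexing. The following is a minimal sketch, reading that requirement as "an index vector must not contain repeated entries" (an interpretation, not something the class documentation spells out). The variable names are hypothetical, and a plain in-memory array stands in for the transio object.

```matlab
% hypothetical index vector containing a repeated entry
idx = [12, 7, 7, 31];

% stand-in for a transio object (assumption: a plain array, for illustration)
standin = (1:40) * 10;

% true only if no index value occurs more than once
isuniqueidx = numel(unique(idx)) == numel(idx);

if isuniqueidx
    data = standin(idx);
else
    % possible workaround: access each index once, then expand in memory
    [uidx, ~, expandi] = unique(idx);
    data = standin(uidx);      % one access per unique index
    data = data(expandi);      % restore requested order and repeats
end
```

With a transio object, the first access inside the else branch would be the only one that touches the file; the expansion then happens on the regular array that was returned.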

Class reference ('help transio')

  transio (Object Class)
 
  FORMAT:       tio_obj = transio(file, endian, class, offset, size);
 
  Input fields:
 
        file        filename where data is stored in
        endian      endian type (e.g. 'le', or 'ieee-be')
        class       numerical class (e.g. 'uint16', 'single')
        offset      offset within file (in bytes)
        size        size of array in file
 
  Output fields:
 
        tio_obj     transio object that supports the methods
                    - subsref  : tio_obj(I) or tio_obj(IX, IY, IZ)
                    - subsasgn : same as subsref
                    - size     : retrieving the array size
                    - end      : for building relative indices
                    - display  : showing information
 
  Note 1: enlarging of existing files (if the array is the last element
          in the file) can be done by adding a (class-independent) sixth
          parameter to the call.
 
  Note 2: both subsref and subsasgn will only work within the existing
          limits; growing of the array as with normal MATLAB variables
          is *NOT* supported--so tio_obj(:,:,ZI) = []; will *NOT* work!

Syntax overview

Creating a transio object

Creating a transio object is done by a call to the constructor (@transio/transio) of the class:

transio_createobj.m
% create a transio object for access to data in a NII file
niidata = transio('largedata.nii', 'le', 'single', 352, [81, 75, 75, 5780]);

If the file does not exist or is not “large enough” to accommodate the data (filesize < offset + typefactor * product-of-sizes, where typefactor is the number of bytes per element), a sixth argument can be given to the constructor:

transio_extentobj.m
% create a transio object, allowing the underlying file to be grown
niidata = transio('largecopy.nii', 'le', 'single', 352, [81, 75, 75, 5780], true);
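To see whether the sixth argument is needed, the size condition can be evaluated by hand. The sketch below uses only built-in functions and the values from the example above. The class-to-bytes lookup table and the variable names are illustrative, not part of the transio class.

```matlab
% values from the constructor example
offset  = 352;                   % byte offset of the data within the file
arrsize = [81, 75, 75, 5780];    % array size
cls     = 'single';              % numerical class

% bytes per element (the "typefactor") for a few common classes
classes    = {'uint8', 'int16', 'uint16', 'int32', 'single', 'double'};
factors    = [1, 2, 2, 4, 4, 8];
typefactor = factors(strcmp(classes, cls));

% minimum file size (in bytes) needed to hold the array
required = offset + typefactor * prod(arrsize);

% compare with the actual size on disk (0 if the file does not exist)
finfo = dir('largecopy.nii');
if isempty(finfo)
    actual = 0;
else
    actual = finfo(1).bytes;
end
needsgrowing = actual < required;
```

For this example, the required size is 10,534,050,352 bytes, so a file of roughly 10.5 GByte would be created or extended.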

Accessing a transio object

In principle, data access works seamlessly, just as with a regular MATLAB variable…

transio_access.m
% retrieving one volume of data
niivol = niidata(:, :, :, 1581);
 
% setting one volume of data
niidata(:, :, :, 2711) = newvol;

As this class is integrated into the xff class, it can be directly used, for instance to read only time courses of a VTC that fall within a mask:

transio_read_masked_vtc.m
% enable transio for VTC data
xff(0, 'transiosize', 'vtc', 5e5);
 
% load MSK and VTC
msk = xff('*.msk', 'Please select a mask file...');
vtc = xff('*.vtc', 'Please select a VTC file...');
 
% check objects
if isxff(msk, 'msk') && isxff(vtc, 'vtc')
 
    % get mask indices
    maski = find(msk.Mask(:));
 
    % read only the data we need
    maskedvtcdata = vtc.VTCData(:, maski);
end
 
% clear objects
clearxffobjects({msk, vtc});

The reference vtc.VTCData(:, maski) issues an overloaded call to @transio/subsref, which resolves the indices into file positions and reads only the requested data, keeping the overhead as low as possible.
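The principle of resolving an index into a file position can be illustrated with built-in file I/O alone. This is a sketch of the general idea, not the actual code of @transio/subsref. The test file, its 16-byte header and all variable names are made up for the example.

```matlab
% write a small test file: 16-byte header followed by a 4x3x2 single array
fname   = [tempname '.bin'];
offset  = 16;
arrsize = [4, 3, 2];
data    = single(reshape(1:prod(arrsize), arrsize));
fid = fopen(fname, 'w', 'ieee-le');
fwrite(fid, zeros(1, offset, 'uint8'), 'uint8');
fwrite(fid, data, 'single');
fclose(fid);

% element to read, given as subscripts
sub = [2, 3, 2];

% column-major linear index (1-based), then byte position in the file
lin = sub2ind(arrsize, sub(1), sub(2), sub(3));
pos = offset + 4 * (lin - 1);    % 4 bytes per single value

% seek to that position and read exactly one value
fid = fopen(fname, 'r', 'ieee-le');
fseek(fid, pos, 'bof');
val = fread(fid, 1, 'single=>single');
fclose(fid);
delete(fname);
```

Only the four bytes of the requested element are read. The same calculation, applied to ranges of indices, is what makes partial access to large files cheap.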

The same syntax can also be used for write access into a transio object (reference).

Additional notes

Some of the more typical functions applied to numerical data (plus, minus, times, mtimes, etc.) have been overloaded so that transio objects can be used directly in expressions. However, as it may be more prudent to use a double-precision copy of the data for complex computations, the two conversion functions double and single are implemented as well.

transio.txt · Last modified: 2011/04/04 19:12 by jochen