plp.MKDA

Performs a multi-level kernel density analysis.

Motivation

Single studies can give strong evidence for a certain hypothesis (such as that a specific area of the brain is involved in processing a subset of stimuli from a greater class), but usually it is required to replicate findings for the scientific community to agree that the findings are not spurious and conclusive. Unfortunately, it is very rare that study protocols are applied in different settings (different population sample and experimenter/investigator) without additional modifications to the methodology (slightly different stimuli or timing, different focus of interest, refined hypothesis, etc.). And additionally, mostly driven by a relatively protectionist attitude towards ones data and methods, it is not overly common to share raw data sets among competing labs, which makes it difficult to perform “mega-analyses” (pooling actual data from different experiments to strengthen common conclusions).

Instead, several scientists have suggested and subsequently demonstrated the power of performing meta analyses, which work on the results (mostly in the form of tables containing so-called peak coordinates) of studies found in the literature to test and reject the null-hypothesis that those peak coordinates are distributed randomly across the search space (usually a set of coordinates defined by all gray matter locations throughout a standard brain space).

This is done by computing a summary statistic for each voxel that indicates how many (and possibly also how close) peak coordinates of respective contrasts have been reported in close proximity to any given location. To threshold the resulting 3D dataset (image), a Monte-Carlo simulation with many (commonly at least 5,000) iterations is performed that scrambles the locations of coordinates and then re-computes the statistic to estimate a null-distribution.

Requirements

To use this method, a dataset containing at least the (X/Y/Z) coordinates as well as a unique study identifier must be converted into a PLP object of NeuroElf using the importplpfrommkdadb function, which supports reading both tabular text files (one line per peak coordinate) or standardized MAT files which stem from the MKDA Matlab toolbox written by Tor Wager.

Method reference (plp.Help('MKDA'))

 PLP::MKDA  - perform a multi-level kernel density analysis
 
 FORMAT:       mvmp = plp.MKDA([opts])
 
 Input fields:
 
       opts        1x1 struct with options
        .applymask apply mask to output results (default: false)
        .asimiter  number of (false-positive) simulation iterations (5000)
        .asimkeep  boolean flag, keep (average of) simulated maps (false)
        .asimmask  mask object (HDR, MSK or VMR, default: colin brain)
        .asimsmpl  random coordinates sampling, either {'full'} or 'near'
        .asimthr   thresholding for group maps, one of
                   'fprXX'   - reject XX% of false positive voxels under 0
                   'fweXX'   - reject XX% of maps with any false positives
                   'rescale' - rescale (each voxel!) to fit 0 distribution
        .bbox      bounding box (passed into newnatresvmp, default: [])
        .cond      conditional statement, e.g.
                   '$Study >= 1 & $Study <= 3 & $Type == 2'
                   whereas names are replaced by their column data
        .contcomp  contrast computation, 'diff', 'excl', or {'wexcl'}
        .contexclw exclusion weight (default 0.5)
        .contnames cell array with names for contrast terms
        .contnull  null-distribution of contrasts either of
                    'full'      - shuffle assignments overall and within
                    'labels'    - shuffle labels within studies only
                   {'spatial'}  - construct null distribution by random
                                  spatial resampling within .asimmask
        .contrasts contrast definition cell array, default: {[1]}
        .grpmeth   group statistics method, one of 'ost', 'sum', {'wsum'}
        .indivmaps keep individual (study-specific) maps (default: true)
        .jbmeth    join-blobs method, either of 'max', {'rsum'}
        .pbar      progress bar object (either xfigure or xprogress, [])
        .res       VMP resolution (passed into newnatresvmp, default: 3)
        .scale     scaling flag, either of 'indic', {'toone'}
        .smkern    single or multiple smoothing kernels in mm (default: 8)
        .smkinterp smoothing kernel interpolation (default: linear)
        .smkmdist  distance from peak where the value is max (default: 0)
        .smkres    smoothing kernel resolution (default: 1)
        .studysel  selection of studies (default: all)
        .stwf      per-study weight formula, e.g.
                   'sqrt($GroupSize) .* (0.75 + 0.25 .* ($RFX==1))'
                   whereas names are replaced by their column data
        .stwp      relative points-per-study weight for testing, one of
                    'confidence'  - weigh 0.5 + gammapdf((1:maxp)./meanp)
                    'logpoints'   - weigh by 1 + log(nr of points)
                   {'none'}       - assume all studies contribue equally
                    'points'      - weigh by number of points per study
                    'sqrtpoints'  - weigh by sqrt(nr of points)
        .unique    only use unique points (within study, default: true)
        .usecons   flag, if given must be either the name or number of
                   the column containing the contrast identifier
        .usesize   flag, if given must be either the name or number of
                   the column containing the (original) cluster size
        .usevalue  flag, if given must be either the name of number of
                   the column containing the (original) peak value
 
 Output fields:
 
       mvmp        VMP container with one (set of) map(s) for each contrast
 
 Notes: all points must be in the same coordinate space (so any conversion
        should occur prior to storing the coordinates in the PLP object)
 
        the .asimsmpl parameter either samples the coordinates from all
        voxels in the .asimmask or, alternatively, it adds a displacement
        of between 0.5 and 1.5 times smpkern to each coordinate, checking they
        remain within the mask
 
        the .contcomp parameter defined whether contrast (within study)
        are a straight difference between the blob maps ('diff') or
        whether if both values of the term are ~= 0, a zero values is used
 
        the non-spatial .contnull settings ('full' and 'labels') make
        most sense when used with a rescaling approach (setting .asimthr
        to 'rescale'), as otherwise the FWE clearly will be too stringent
 
        the .contrasts cell array should contain cells that give the
        identifiers for the positive and negative terms in the contrast,
        e.g. in a meta analysis of three contrasts, this list could be
        {3, [3, -1], [3, -2]}, which would then compute three contrasts:
        3 (on its own, i.e. vs. baseline), 3 > 1, and 3 > 2
 
        the .scale parameter either creates a smooth gaussian kernel
        around the peak, fixing the ceiling value to 1 ('toone'), or sets
        voxels above .5 (after toone scaling) to 1 (indicator);
        if .scale is set to 'indic' (pure indicator), the kernel size is
        interpreted as a radius!
 
        the .smkmdist parameter can be used to have the initial blobs be
        either complete gaussian blobs (0, default) or flattened in a way
        such that within a specific range the value is actually the max
        value, after which it drops off using the gaussian smoothing;
        this allows to create "smooth" but still "solid" blobs, e.g. by
        setting the smkmdist to the same value(s) as smkern
 
        the .smkres parameter can be set to a smaller resolution in case
        the coordinates (plp.Points) contain non-rounded entries (e.g.
        after MNI->TAL or vice versa transformation)
 
        the .stwf (per-study weighting formula) field will default to '1'
        unless a valid 'GroupSize' or 'N' column is found, in which case
        the .stwf field will be set to 'sqrt($GroupSize)' or 'sqrt($N)'
 
        the .stwp parameter sets an additional weight on studies based
        on how many points are available for the study (at any given
        contrast iteration); the 'confidence' setting tries to determine
        a "sweet spot", the number of points that a "good" (average)
        study should report, and, at most, reduces the weight by 0.5 for
        studies that have either no points or the maximum number of points

Options

.applymask

This is a boolean flag (true or false) with default false. If set to true, the resulting summary statistical map will be masked with the dataset used to determine the randomization space. Please be aware that this sets all voxels outside the mask to 0 which makes interpolated images have a “smaller” fringe given the fact that interpolated values will be lower.

.asimiter

The number of Monte-Carlo simulation iterations, default value 5000.

.asimkeep

This is a boolean flag (true or false) with default false. If set to true, the resulting VMP object not only contains the (thresholded) map of the actual (true) coordinates but also the average map of the Monte-Carlo simulated maps (which can be useful to understand why certain locations might be missing in the output).

.asimmask

This specifies the mask from which the randomization space is estimated. By default, the _files/colin/colin_brain_ICBMnorm.vmr dataset is used (which also contains cerebellum and brain stem; in case you wish for a more stringent mask, you can use the drawing capabilities of NeuroElf to further restrict the search space).

.asimsmpl

This flag can be set to either 'full' or 'near', with 'full' being the default. When 'full' is selected, the Monte-Carlo simulation draws the null-distribution coordinates from the entire search space, regardless of where the actual (true, reported) coordinates are located. This is clearly the default, as without further information the assumption can only be that, in absence of additional restrictions, false-positive peak coordinates would be distributed throughout the entire search space (usually the brain/gray matter voxels). However, given that subsequent publications are often required to already reconcile their findings with prior reported data, a considerable (but hard to estimate!) bias might already be in the data (such that coordinates which don't fit into the already existing view of a phenomenon might be under represented, leading to a non-uniform null-distribution for false-positives). This can be addressed using the 'near' setting, for which random coordinates are drawn from the mask but, for each reported coordinate, come from a sub-set of this mask which is defined as the sphere of twice the kernel radius, excluding all coordinates that are up to half the kernel radius away. In short, setting this flag to 'near' uses a more stringent null distribution.

.asimthr

This flag selects the thresholding method and is one of

'fweXX' - which sets a family-wise error criterion, such that the highest summary statistic value is found in at most 5% of the simulated maps
'rescale' - this approach stores all simulated maps and, at the end, for each voxel converts the summary statistic into a Z-score which represents the proportion (stochastic probability) of maps for which this voxel had a higher value under the null distribution

The 'rescale' approach subsequently allows to use arbitrary thresholding techniques.

.bbox

This optional setting is passed into the newnatresvmp function and defines the maximal space for which the computation is performed. The default value is the empty array.

.cond

A conditional statement can be given, which restricts the selection of coordinates (e.g. to specific types of contrasts, populations, fMRI methods, timing types, etc.). The statement must be made up of logical expressions that, for each row of data, evaluate to either true (row is included in the selection) or false (row is not included). The syntax for each such expression is by using the Dollar sign ($) followed by the name of the header field (e.g. Year or ContrastType) and a numerical expression (at the moment string constants cannot be used). Parentheses can be used to group expressions accordingly.

.contcomp

This flag sets the computation type for cases where a differential contrast is investigated (e.g. “negative > neutral images”). The possible values are

'diff' - a straight difference between the maps is computed
'excl' - after computing the difference an exclusion is performed (all voxels that are implicated, even if only by a fraction, in both contrasts are set to 0)
'wexcl' - a weighted exclusion, such that voxels for which both contrasts yield values greater than a threshold are set to 0); this is the default

This computation is performed within study.

.contcompw

Threshold for the 'wexcl' setting of the .contcomp flag. The default value is 0.5.

.contnames

A cell array describing the names used in the output VMP object. Must match the number of defined contrasts.

.contnull

This flag sets the type of null distribution for contrasts (where two sets of coordinates are compared). Possible values are

'full' - this shuffles the assignments of contrasts to studies and labels of coordinates to contrasts
'labels' - this shuffles only the labels of coordinates to contrasts within studies
'spatial' - this constructs a null distribution by randomized spatial sampling

The difference between spatial and label randomization is that the null distribution is of a different kind. Whereas the spatial null distribution tests whether the given coordinates form a non-randomly observable pattern, the label null distribution tests whether the assignment of coordinates to a specific kind of stimulus or process is random. Given that some areas in the brain are repeatedly implicated in various tasks (within a sub-class of tasks, naturally), the spatial null distribution might not be the most appropriate type of test, particularly if the number of contrasts/studies reporting the results from certain tasks is imbalanced.

.contrasts

A cell array defining the contrasts for which coordinates will be selected (for positive and negative terms). Each cell must be a 1xC double array containing at least one contrast identifier. Positive numbers refer to contrasts which will be used for as positive terms, negative numbers refer to contrasts which will be used as negative terms. E.g. the array [1, -2] would specify a contrast where the first and positive term is the contrast identified by 1 and the second and negative term is the contrast identified by 2.

.grpmeth

This option sets the method used to compute the summary statistic. Possibly values are

'ost' - performs a one-sample t-test (highly experimental and not thoroughly tested!)
'sum' - computes the plain sum of the study-based maps
'wsum' - computes a weighted sum of the study-based maps; this is the default

.indivmaps

This is a boolean flag (true or false) which defaults to true. If set to true, the output VMP will contain one map per contrast and study representing the actual coordinates (which allows to inspect those maps for errors and further to visualize several maps at once for additional visualization options).

.jbmeth

This option defines how the “blobs” (kernel-filled spheres) are combined across coordinates, within study. If set to 'max', the maximum value will be used; if set to 'rsum' the sum is computed (but restricted to a maximum value of 1). This setting is experimental and the 'max' value should only be used in special cases and for testing.

.pbar

An optional progress bar object, which then visualizes the progress across the Monte-Carlo simulations.

.res

Resolution of the output VMP, default value is 3. Please note that coordinates are sub-voxel exact computed and that using higher resolutions (smaller values for this option) are only required if the data needs to be available in higher resolutions, e.g. for ROI generation, etc.

.scale

This flag determines whether the kernel is interpreted as an indicator sphere ('indic' setting) or as a gaussian kernel (which will be scaled to a maximum of 1, 'toone' setting, which is the default).

.smkern

Smoothing kernel(s). If multiple values are given, the MKDA analysis is performed for several kernels. The default is to use a single value of 8.

.smkinterp

Smoothing kernel interpolation setting. Any value available for flexinterpn_method can be used. This determines what interpolation kernel will be used to determine the sub-voxel data. Given that the spatial frequency of the smoothing kernel is usually larger than twice the resolution, 'linear' is sufficient to prevent additional smoothing.

NeuroElf wiki

Table of Contents