plp.mkda

Performs a multi-level kernel density analysis.

Single studies can give strong evidence for a certain hypothesis (such as that a specific area of the brain is involved in processing a subset of stimuli from a greater class), but usually it is required to replicate findings for the scientific community to agree that the findings are not spurious and conclusive. Unfortunately, it is very rare that study protocols are applied in different settings (different population sample and experimenter/investigator) without additional modifications to the methodology (slightly different stimuli or timing, different focus of interest, refined hypothesis, etc.). And additionally, mostly driven by a relatively protectionist attitude towards ones data and methods, it is not overly common to share raw data sets among competing labs, which makes it difficult to perform “mega-analyses” (pooling actual data from different experiments to strengthen common conclusions).

Instead, several scientists have suggested and subsequently demonstrated the power of performing meta analyses, which work on the results (mostly in the form of tables containing so-called peak coordinates) of studies found in the literature to test and reject the null-hypothesis that those peak coordinates are distributed randomly across the search space (usually a set of coordinates defined by all gray matter locations throughout a standard brain space).

This is done by computing a summary statistic for each voxel that indicates how many (and possibly also how close) peak coordinates of respective contrasts have been reported in close proximity to any given location. To threshold the resulting 3D dataset (image), a Monte-Carlo simulation with many (commonly at least 5,000) iterations is performed that scrambles the locations of coordinates and then re-computes the statistic to estimate a null-distribution.

To use this method, a dataset containing at least the (X/Y/Z) coordinates as well as a unique study identifier must be converted into a PLP object of NeuroElf using the importplpfrommkdadb function, which supports reading both tabular text files (one line per peak coordinate) or standardized MAT files which stem from the MKDA Matlab toolbox written by Tor Wager.

PLP::MKDA - perform a multi-level kernel density analysis FORMAT: mvmp = plp.MKDA([opts]) Input fields: opts 1x1 struct with options .applymask apply mask to output results (default: false) .asimiter number of (false-positive) simulation iterations (5000) .asimkeep boolean flag, keep (average of) simulated maps (false) .asimmask mask object (HDR, MSK or VMR, default: colin brain) .asimsmpl random coordinates sampling, either {'full'} or 'near' .asimthr thresholding for group maps, one of 'fprXX' - reject XX% of false positive voxels under 0 'fweXX' - reject XX% of maps with any false positives 'rescale' - rescale (each voxel!) to fit 0 distribution .bbox bounding box (passed into newnatresvmp, default: []) .cond conditional statement, e.g. '$Study >= 1 & $Study <= 3 & $Type == 2' whereas names are replaced by their column data .contcomp contrast computation, 'diff', 'excl', or {'wexcl'} .contexclw exclusion weight (default 0.5) .contnames cell array with names for contrast terms .contnull null-distribution of contrasts either of 'full' - shuffle assignments overall and within 'labels' - shuffle labels within studies only {'spatial'} - construct null distribution by random spatial resampling within .asimmask .contrasts contrast definition cell array, default: {[1]} .grpmeth group statistics method, one of 'ost', 'sum', {'wsum'} .indivmaps keep individual (study-specific) maps (default: true) .jbmeth join-blobs method, either of 'max', {'rsum'} .pbar progress bar object (either xfigure or xprogress, []) .res VMP resolution (passed into newnatresvmp, default: 3) .scale scaling flag, either of 'indic', {'toone'} .smkern single or multiple smoothing kernels in mm (default: 8) .smkinterp smoothing kernel interpolation (default: linear) .smkmdist distance from peak where the value is max (default: 0) .smkres smoothing kernel resolution (default: 1) .studysel selection of studies (default: all) .stwf per-study weight formula, e.g. 'sqrt($GroupSize) .* (0.75 + 0.25 .* ($RFX==1))' whereas names are replaced by their column data .stwp relative points-per-study weight for testing, one of 'confidence' - weigh 0.5 + gammapdf((1:maxp)./meanp) 'logpoints' - weigh by 1 + log(nr of points) {'none'} - assume all studies contribue equally 'points' - weigh by number of points per study 'sqrtpoints' - weigh by sqrt(nr of points) .unique only use unique points (within study, default: true) .usecons flag, if given must be either the name or number of the column containing the contrast identifier .usesize flag, if given must be either the name or number of the column containing the (original) cluster size .usevalue flag, if given must be either the name of number of the column containing the (original) peak value Output fields: mvmp VMP container with one (set of) map(s) for each contrast Notes: all points must be in the same coordinate space (so any conversion should occur prior to storing the coordinates in the PLP object) the .asimsmpl parameter either samples the coordinates from all voxels in the .asimmask or, alternatively, it adds a displacement of between 0.5 and 1.5 times smpkern to each coordinate, checking they remain within the mask the .contcomp parameter defined whether contrast (within study) are a straight difference between the blob maps ('diff') or whether if both values of the term are ~= 0, a zero values is used the non-spatial .contnull settings ('full' and 'labels') make most sense when used with a rescaling approach (setting .asimthr to 'rescale'), as otherwise the FWE clearly will be too stringent the .contrasts cell array should contain cells that give the identifiers for the positive and negative terms in the contrast, e.g. in a meta analysis of three contrasts, this list could be {3, [3, -1], [3, -2]}, which would then compute three contrasts: 3 (on its own, i.e. vs. baseline), 3 > 1, and 3 > 2 the .scale parameter either creates a smooth gaussian kernel around the peak, fixing the ceiling value to 1 ('toone'), or sets voxels above .5 (after toone scaling) to 1 (indicator); if .scale is set to 'indic' (pure indicator), the kernel size is interpreted as a radius! the .smkmdist parameter can be used to have the initial blobs be either complete gaussian blobs (0, default) or flattened in a way such that within a specific range the value is actually the max value, after which it drops off using the gaussian smoothing; this allows to create "smooth" but still "solid" blobs, e.g. by setting the smkmdist to the same value(s) as smkern the .smkres parameter can be set to a smaller resolution in case the coordinates (plp.Points) contain non-rounded entries (e.g. after MNI->TAL or vice versa transformation) the .stwf (per-study weighting formula) field will default to '1' unless a valid 'GroupSize' or 'N' column is found, in which case the .stwf field will be set to 'sqrt($GroupSize)' or 'sqrt($N)' the .stwp parameter sets an additional weight on studies based on how many points are available for the study (at any given contrast iteration); the 'confidence' setting tries to determine a "sweet spot", the number of points that a "good" (average) study should report, and, at most, reduces the weight by 0.5 for studies that have either no points or the maximum number of points

This is a boolean flag (`true`

or `false`

) with default `false`

. If set to true, the resulting summary statistical map will be masked with the dataset used to determine the randomization space. Please be aware that this sets all voxels outside the mask to `0`

which makes interpolated images have a “smaller” fringe given the fact that interpolated values will be lower.

The number of Monte-Carlo simulation iterations, default value `5000`

.

This is a boolean flag (`true`

or `false`

) with default `false`

. If set to true, the resulting VMP object not only contains the (thresholded) map of the actual (true) coordinates but also the average map of the Monte-Carlo simulated maps (which can be useful to understand why certain locations might be missing in the output).

This specifies the mask from which the randomization space is estimated. By default, the `_files/colin/colin_brain_ICBMnorm.vmr`

dataset is used (which also contains cerebellum and brain stem; in case you wish for a more stringent mask, you can use the drawing capabilities of NeuroElf to further restrict the search space).

This flag can be set to either `'full`

' or `'near`

', with `'full`

' being the default. When `'full`

' is selected, the Monte-Carlo simulation draws the null-distribution coordinates from the entire search space, regardless of where the actual (true, reported) coordinates are located. This is clearly the default, as without further information the assumption can only be that, in absence of additional restrictions, false-positive peak coordinates would be distributed throughout the entire search space (usually the brain/gray matter voxels). However, given that subsequent publications are often required to already reconcile their findings with prior reported data, a considerable (but hard to estimate!) bias might already be in the data (such that coordinates which don't fit into the already existing view of a phenomenon might be under represented, leading to a non-uniform null-distribution for false-positives). This can be addressed using the `'near`

' setting, for which random coordinates are drawn from the mask but, for each reported coordinate, come from a sub-set of this mask which is defined as the sphere of twice the kernel radius, excluding all coordinates that are up to half the kernel radius away. **In short, setting this flag to 'near' uses a more stringent null distribution.**

This flag selects the thresholding method and is one of

`'fweXX`

' - which sets a family-wise error criterion, such that the highest summary statistic value is found in at most 5% of the simulated maps`'rescale`

' - this approach stores all simulated maps and, at the end, for each voxel converts the summary statistic into a Z-score which represents the proportion (stochastic probability) of maps for which this voxel had a higher value under the null distribution

The `'rescale`

' approach subsequently allows to use arbitrary thresholding techniques.

This optional setting is passed into the newnatresvmp function and defines the maximal space for which the computation is performed. The default value is the empty array.

A conditional statement can be given, which restricts the selection of coordinates (e.g. to specific types of contrasts, populations, fMRI methods, timing types, etc.). The statement must be made up of logical expressions that, for each row of data, evaluate to either `true`

(row is included in the selection) or `false`

(row is not included). The syntax for each such expression is by using the Dollar sign ($) followed by the name of the header field (e.g. `Year`

or `ContrastType`

) and a numerical expression (at the moment string constants cannot be used). Parentheses can be used to group expressions accordingly.

This flag sets the computation type for cases where a differential contrast is investigated (e.g. “negative > neutral images”). The possible values are

`'diff`

' - a straight difference between the maps is computed`'excl`

' - after computing the difference an exclusion is performed (all voxels that are implicated, even if only by a fraction, in both contrasts are set to`0`

)`'wexcl`

' - a weighted exclusion, such that voxels for which both contrasts yield values greater than a threshold are set to`0`

); this is the default

This computation is performed **within study**.

Threshold for the `'wexcl`

' setting of the `.contcomp`

flag. The default value is 0.5.

A cell array describing the names used in the output VMP object. Must match the number of defined contrasts.

This flag sets the type of null distribution for contrasts (where two sets of coordinates are compared). Possible values are

`'full`

' - this shuffles the assignments of contrasts to studies and labels of coordinates to contrasts`'labels`

' - this shuffles only the labels of coordinates to contrasts within studies`'spatial`

' - this constructs a null distribution by randomized spatial sampling

The difference between spatial and label randomization is that the null distribution is of a different kind. Whereas the spatial null distribution tests whether the given coordinates form a non-randomly observable pattern, the label null distribution tests whether the assignment of coordinates to a specific kind of stimulus or process is random. Given that some areas in the brain are repeatedly implicated in various tasks (within a sub-class of tasks, naturally), the spatial null distribution might not be the most appropriate type of test, particularly if the number of contrasts/studies reporting the results from certain tasks is imbalanced.

A cell array defining the contrasts for which coordinates will be selected (for positive and negative terms). Each cell must be a `1xC`

double array containing at least one contrast identifier. Positive numbers refer to contrasts which will be used for as positive terms, negative numbers refer to contrasts which will be used as negative terms. E.g. the array `[1, -2]`

would specify a contrast where the first and positive term is the contrast identified by `1`

and the second and negative term is the contrast identified by `2`

.

This option sets the method used to compute the summary statistic. Possibly values are

`'ost`

' - performs a one-sample t-test (highly experimental and not thoroughly tested!)`'sum`

' - computes the plain sum of the study-based maps`'wsum`

' - computes a weighted sum of the study-based maps; this is the default

This is a boolean flag (`true`

or `false`

) which defaults to `true`

. If set to true, the output VMP will contain one map per contrast and study representing the actual coordinates (which allows to inspect those maps for errors and further to visualize several maps at once for additional visualization options).

This option defines how the “blobs” (kernel-filled spheres) are combined across coordinates, within study. If set to `'max`

', the maximum value will be used; if set to `'rsum`

' the sum is computed (but restricted to a maximum value of `1`

). This setting is experimental and the `'max`

' value should only be used in special cases and for testing.

An optional progress bar object, which then visualizes the progress across the Monte-Carlo simulations.

Resolution of the output VMP, default value is `3`

. Please note that **coordinates are sub-voxel exact computed** and that using higher resolutions (smaller values for this option) are only required if the data needs to be available in higher resolutions, e.g. for ROI generation, etc.

This flag determines whether the kernel is interpreted as an indicator sphere (`'indic`

' setting) or as a gaussian kernel (which will be scaled to a maximum of `1`

, `'toone`

' setting, which is the default).

Smoothing kernel(s). If multiple values are given, the MKDA analysis is performed for several kernels. The default is to use a single value of `8`

.

Smoothing kernel interpolation setting. Any value available for flexinterpn_method can be used. This determines what interpolation kernel will be used to determine the sub-voxel data. Given that the spatial frequency of the smoothing kernel is usually larger than twice the resolution, `'linear`

' is sufficient to prevent additional smoothing.

plp.mkda.txt · Last modified: 2011/08/24 19:47 by jochen