# Multi-level Kernel Density Analysis (MKDA)

The term Multi-level Kernel Density Analysis (MKDA) was coined by Tor Wager, and his own implementation is available at his lab's website for download.

In short, the general idea is to perform the following steps:

- literature review and selection of articles that report spatial locations (coordinate tables) of a common neuropsychological function (or set of functions)
- creation of a compound table containing all found coordinates (possibly matching further selection criteria)
- performing the MKDA (or a similar algorithm, such as found in the GingerALE package), which tests the coordinates in the table against a Monte-Carlo random sampling (empirical null)
- interpreting the results of the MKDA (making statistical inferences based on areas where the null hypothesis of interest can be safely rejected, followed by bringing those inferences into context of the literature and existing models)

## Motivation

For various reasons, many neuroimaging experiments (where data is collected and spatial maps are created, allowing functional representations to be located across the brain) and their results as reported in journals (in this context that means tables linking specific spatial locations, i.e. coordinates, to certain functions/phenomena) are, by themselves, not well suited to generate “factual knowledge”:

- without a clear model that underlies and fits the observed spatial representation (networks subserving the experimentally manipulated function), the results do not represent “accepted knowledge” (strong inference) but rather new “hypotheses” (potential explanations for observed patterns, oftentimes based on reverse inference)
- the choice of subjects, stimuli, experimentation design, etc. could have biased the results to make them less informative for the more general population case (false-positive identification and false-negative masking)
- noise components in the data (on all levels) could have masked important aspects (locations, usually as false-negatives)

One possible way to overcome these problems (to some extent at least) is to aggregate coordinates from several studies (or rather, contrasts from those studies) and then test whether certain spatial locations are implicated in the examined brain function more often than warranted by chance, using a Monte-Carlo null distribution obtained by simulating data drawn from, say, a gray matter mask. As a rule of thumb, at least 10 to 15 studies are needed, and most published meta-analyses draw on at least 40.

However, there are some additional problems that are only partially addressable with meta analyses of any kind, such as:

- the file-drawer problem (null findings are not reported, which for specific locations means that if an assumed cluster is not part of a spatial map, the analysis seems “unpublishable”)
- the researcher-degrees-of-freedom problem (during the collection, processing, and analyses stages, many choices can be/have to be made by the researcher, sometimes heavily influencing the results without apparent reason to reject either possible outcome as false)
- the fact that certain fields (e.g. negative emotion in the context of autobiographic memories) might be dominated by few research labs (potentially leading to an imposed “worldview” of those labs' researchers' beliefs on the results)

It must also be noted that even meta-analyses cannot, per se, create “knowledge” (strong inferences) in the absence of a model that explains and fits the observed patterns. Still, by summarizing several independent datasets into a single spatial map (e.g. via MKDA), the likelihood of making certain types of mistakes is greatly reduced.

## Practical outline

The following steps, in detail, have to be performed to run an MKDA in NeuroElf:

- creation of a database (tabular format, one row per coordinate, with identifying columns/fields for study, contrast, x/y/z coordinate, as well as any other fields)
- possibly saving the database in a text-based format (e.g. when using Microsoft Excel for the first step, you should save the database as a CSV file)
- importing the database into NeuroElf (either using the command line or the MKDA UI)
- deciding on settings for the MKDA (e.g. smoothness of underlying indicator maps representing each statistical unit)
- if necessary, configuring one or several contrasts of interest
- running the analysis/analyses
- thresholding the resulting maps
- drawing inferences

## Requirements

### Creation of database

Following the introduction, the first step is to look through the literature and select articles you wish to include in the MKDA. Next, you need to create a tabular representation of all coordinates found in the tables (or text) of those articles, such as the following example demonstrates:

- MKDA_sample.txt

      Study;x;y;z;CoordSys;N;Contrast;
      Ochsner_et_al_2008;15;24;21;MNI;21;LookNeg>LookNeu;
      Ochsner_et_al_2008;-15;-15;-18;MNI;21;LookNeg>LookNeu;
      Ochsner_et_al_2008;15;-18;-15;MNI;21;LookNeg>LookNeu;
      Ochsner_et_al_2008;15;33;48;MNI;21;LookNeg>LookNeu;
      Lieberman_et_al_2010;36;21;15;T88;16;Negative>Neutral;

If you wish to use this table in Tor Wager's MKDA tool as well, the file should begin with an additional line that contains the number of fields in its first column:

- MKDA_sample_with_fields.txt

      7;;;;;;;
      Study;x;y;z;CoordSys;N;Contrast;
      Ochsner_et_al_2008;15;24;21;MNI;21;LookNeg>LookNeu;
      Ochsner_et_al_2008;-15;-15;-18;MNI;21;LookNeg>LookNeu;
      Ochsner_et_al_2008;15;-18;-15;MNI;21;LookNeg>LookNeu;
      Ochsner_et_al_2008;15;33;48;MNI;21;LookNeg>LookNeu;
      Lieberman_et_al_2010;36;21;15;T88;16;Negative>Neutral;
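
As a rough illustration, the extra first line could be generated from an existing table as sketched below. The helper name `add_field_count_line` is hypothetical, and the layout of the count line (the number of fields followed by one separator per field) is inferred solely from the sample above, not from the documentation of Tor Wager's tool.

```python
def add_field_count_line(text, sep=";"):
    """Prepend a line holding the number of named fields, padded with separators.

    Assumption: the count line looks like the sample, e.g. "7;;;;;;;".
    """
    lines = text.splitlines()
    header = lines[0]
    # Count only non-empty names; the sample header ends with a trailing separator.
    n_fields = sum(1 for name in header.split(sep) if name.strip())
    count_line = str(n_fields) + sep * n_fields
    return "\n".join([count_line] + lines) + "\n"


sample = (
    "Study;x;y;z;CoordSys;N;Contrast;\n"
    "Ochsner_et_al_2008;15;24;21;MNI;21;LookNeg>LookNeu;\n"
)
converted = add_field_count_line(sample)
print(converted.splitlines()[0])
```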

This first step can be performed in a variety of programs, with Microsoft Excel being very suitable for the task. Usually it is most practical to first set up the columns (field names), then copy and paste the coordinates into the table, and finally set all remaining columns to their appropriate values. Eventually, the table must be available as a text-based (ASCII) file with a row of field names at the top followed by the actual data, one coordinate per row.
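
Before importing, it can help to check that the text file really has the described layout (one header row, then one coordinate per row). The following is a minimal sketch in plain Python, independent of NeuroElf; the function name `read_mkda_table` is hypothetical, and the field names and the `;` separator are taken from the sample table.

```python
import csv
import io

SAMPLE = """Study;x;y;z;CoordSys;N;Contrast;
Ochsner_et_al_2008;15;24;21;MNI;21;LookNeg>LookNeu;
Ochsner_et_al_2008;-15;-15;-18;MNI;21;LookNeg>LookNeu;
Ochsner_et_al_2008;15;-18;-15;MNI;21;LookNeg>LookNeu;
Ochsner_et_al_2008;15;33;48;MNI;21;LookNeg>LookNeu;
Lieberman_et_al_2010;36;21;15;T88;16;Negative>Neutral;
"""


def read_mkda_table(handle, sep=";"):
    """Read a header row plus one coordinate per row from a delimited text file."""
    reader = csv.reader(handle, delimiter=sep)
    header = [name.strip() for name in next(reader)]
    # Drop empty names caused by the trailing separator at the end of each line.
    while header and not header[-1]:
        header.pop()
    rows = []
    for record in reader:
        values = [value.strip() for value in record[:len(header)]]
        if not any(values):
            continue  # skip blank lines
        row = dict(zip(header, values))
        # Coordinates and sample size are numeric; all other fields stay text.
        for key in ("x", "y", "z"):
            row[key] = float(row[key])
        row["N"] = int(row["N"])
        rows.append(row)
    return header, rows


header, rows = read_mkda_table(io.StringIO(SAMPLE))
print(header)
print(len(rows), rows[-1]["Study"], rows[-1]["CoordSys"])
```

For a file on disk, `open('MKDA_sample.txt', newline='')` can be passed in place of the `io.StringIO` object. A row with a non-numeric coordinate raises a `ValueError`, which points directly at the offending line.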

### Importing the database into NeuroElf

If you wish to perform this step on the command line (which can be particularly helpful for pinpointing the problem if an error occurs), you can use the following syntax:

- importplp.m

      plp = importplpfrommkdadb('MKDA_sample.txt');

This will create a PLP object containing the coordinates as well as all other columns in a numeric representation. **Each non-numeric string will be converted to a unique number** such that, for instance, each unique study label will be stored by its numeric index into the `Labels` property of the PLP object.
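
The idea of replacing each string with an index into a list of unique labels can be sketched as follows. This is a conceptual illustration in Python, not NeuroElf's implementation: the function name `encode_column` is hypothetical, and both the 1-based indexing (mirroring MATLAB) and the order of first appearance are assumptions.

```python
def encode_column(values):
    """Replace each string by a 1-based index into a list of unique labels."""
    labels = []    # unique strings, in order of first appearance (assumption)
    index_of = {}  # string -> 1-based position in labels
    codes = []
    for value in values:
        if value not in index_of:
            labels.append(value)
            index_of[value] = len(labels)
        codes.append(index_of[value])
    return codes, labels


studies = ["Ochsner_et_al_2008"] * 4 + ["Lieberman_et_al_2010"]
codes, labels = encode_column(studies)
print(codes)
print(labels)
# Decoding: subtract 1 because Python lists are 0-based.
print(labels[codes[-1] - 1])
```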

To then save the PLP object, please use the following syntax:

Alternatively, you can use the MKDA dialog to import the database.

## Running the analysis

For the actual procedure of running the MKDA, please refer to the MKDA UI article.

## Algorithm description

The general algorithm works as follows:

- potentially, a sub-selection of peak coordinates is made (based on conditional statements, e.g. to remove points that do not adhere to specific criteria)
- for each study or contrast (whichever serves as the statistical unit) included in a given analysis, a weight is computed; normally this weight is equal across peaks within a statistical unit and is based on an expression that may contain, for instance, the number of subjects in the study for which the peaks are reported
- next, a voxel based volume is initialized (filled with zeros) for each of the statistical units, and for each point in any given study a blob (with configurable size and value distribution, e.g. a 0-or-1 indicator sphere or a gaussian kernel) is added to the corresponding volume at/around the peak coordinate
- eventually, these volumes are combined using the weights for each of the statistical units (weighted sum along the dimension of the statistical unit, resulting in a 3-dimensional spatial map)
- to draw inferences, an empirical null distribution is derived in one of two ways:
  - by spatially scrambling coordinates within a reasonable mask, such as a grey matter volume in the same space; the null hypothesis is then that the reported peaks carry no information about the actual spatial location/distribution of a neuropsychological state/mode, and the test determines whether the observed summary statistic (the weighted sum of blobs) at any given location is higher than warranted by chance for that location
  - by scrambling the labels of peaks across units, e.g. in a differential contrast where two sets of peaks reported for different activation states/modes are to be compared; the test then determines whether, for any given spatial location, the observed summary statistic lies significantly outside the empirical null distribution under the assumption that the labels of reported points carry no information about the activation state/mode

  **Note: in case a differential contrast, e.g. task A > task B, is to be examined, it is recommended to also establish that any locations showing a significant difference are spatially selective for the tasks, i.e. have a value outside of the spatial-scrambling null distribution for task A + task B!**
- to allow fMRI-typical inference (uncorrected thresholding, FWE thresholding, and cluster-size based thresholding), each summary statistical map computed under the null hypothesis is added to an overall null distribution (uncorrected thresholding), its most extreme value is recorded (FWE thresholding), and it is clustered at various uncorrected thresholds to determine the cluster size threshold required for an FWE-corrected threshold that combines height and cluster size
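
The core of the steps above (weights, indicator blobs, weighted sum, spatial-scrambling null, FWE threshold from the maximum statistic) can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not NeuroElf's code: all function names are hypothetical, the weight expression (square root of the number of subjects, normalized to sum to 1) is just one possible choice, the mask is a small cube of voxel indices rather than a grey matter volume, and cluster-size thresholding and label scrambling are omitted.

```python
import math
import random
from itertools import product


def sphere_offsets(radius):
    """Voxel offsets inside a sphere of the given radius (in voxels)."""
    r = int(math.floor(radius))
    span = range(-r, r + 1)
    return [(dx, dy, dz) for dx, dy, dz in product(span, span, span)
            if dx * dx + dy * dy + dz * dz <= radius * radius]


def indicator_map(peaks, offsets, mask):
    """Mask voxels covered by the sphere around at least one peak (0-or-1 map)."""
    covered = set()
    for x, y, z in peaks:
        for dx, dy, dz in offsets:
            voxel = (x + dx, y + dy, z + dz)
            if voxel in mask:
                covered.add(voxel)
    return covered


def unit_weights(units):
    """Assumed weighting: sqrt(number of subjects), normalized to sum to 1."""
    raw = [math.sqrt(unit["n"]) for unit in units]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_map(units, weights, offsets, mask):
    """Weighted sum of the per-unit indicator maps, stored as voxel -> value."""
    density = {}
    for unit, weight in zip(units, weights):
        for voxel in indicator_map(unit["peaks"], offsets, mask):
            density[voxel] = density.get(voxel, 0.0) + weight
    return density


def max_null_distribution(units, weights, offsets, mask, n_iter, rng):
    """Scramble peak locations within the mask; record each null map's maximum."""
    voxels = sorted(mask)
    maxima = []
    for _ in range(n_iter):
        scrambled = [{"n": u["n"], "peaks": [rng.choice(voxels) for _ in u["peaks"]]}
                     for u in units]
        maxima.append(max(weighted_map(scrambled, weights, offsets, mask).values()))
    return sorted(maxima)


# Toy data: a 10x10x10 cube as mask and four units with peaks near voxel (5, 5, 5).
mask = set(product(range(10), repeat=3))
units = [
    {"n": 21, "peaks": [(5, 5, 5), (1, 1, 1)]},
    {"n": 16, "peaks": [(5, 6, 5)]},
    {"n": 12, "peaks": [(4, 5, 5), (8, 8, 2)]},
    {"n": 30, "peaks": [(5, 5, 6)]},
]
offsets = sphere_offsets(2)
weights = unit_weights(units)
observed = weighted_map(units, weights, offsets, mask)
maxima = max_null_distribution(units, weights, offsets, mask, 200, random.Random(0))
# FWE threshold: 95th percentile of the maxima across null iterations.
threshold = maxima[math.ceil(0.95 * len(maxima)) - 1]
significant = [voxel for voxel, value in observed.items() if value > threshold]
print(threshold, len(significant))
```

Because each null iteration contributes only its maximum, a voxel exceeding `threshold` is significant at a family-wise error rate of roughly 5% across the whole mask. A real analysis would use far more iterations, a proper grey matter mask, and coordinates converted from MNI or Talairach space to voxel indices.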