splitclustercoords

Motivation

When clusters are reported in tables, distinct areas are frequently grouped into one cluster, for example because the selected threshold makes them appear connected. This happens even when there is clear a priori evidence (or even knowledge, as when left- and right-hemispheric activations in visual cortex are connected and appear as one cluster) that the cluster comprises separate areas (activation foci). In such cases, one would want to report several activation sites.

Instead of requiring a manual change of the threshold (which forces the peak locations into separate clusters), this function searches within any given cluster for the local maxima/minima that such a manual approach would reveal.

Function reference ('help splitclustercoords')

  splitclustercoords  - split coords of one cluster to subclusters
 
  FORMAT:       [cs, cv, cl, cc] = splitclustercoords(c, v [, k [, d]])
 
  Input fields:
 
        c           coordinates of values
        v           values
        k           size threshold for sub-clusters (default: 3)
        d           optional 1x3 minimum distance [default: [2, 2, 2]]
 
  Output fields:
 
        cs          list of cluster sizes
        cv          Cx1 cell array with lists of values
        cl          Vx4 list of cluster voxels
        cc          Cx1 cell array with lists of coordinates

Arguments

c

This is a list of coordinates (Vx3 double array, one row per voxel), which must be given in the underlying voxel resolution (e.g. the value of c(C).coords in the first output of vmp.ClusterCoords).

v

The values associated with the coordinates, in the same order. They must be given as absolute values, because voxels are sorted from highest to lowest.

k

If given, this overrides the default of 3: identified sub-clusters must have at least this number of associated voxels. This parameter was introduced to avoid finding peaks at the edge of a mask, where values might increase again towards the edge.

d

If given, this overrides the default of [2, 2, 2], under which second-degree neighbors (neighbors of neighbors) are also considered connected; this prevents residual noise in the map from producing spurious (practically meaningless) sub-clusters.
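To illustrate what the default distance implies, here is a minimal Python sketch of a linkage test. It is not taken from the NeuroElf source: the helper name is_linked is hypothetical, and the rule it applies (two voxels count as linked when their coordinates differ by at most d along every axis) is an assumption about how d might be interpreted.

```python
def is_linked(a, b, d=(2, 2, 2)):
    """Return True if voxels a and b differ by at most d[i] along every axis.

    Assumed interpretation of the d argument, for illustration only.
    """
    return all(abs(p - q) <= m for p, q, m in zip(a, b, d))


peak = (30, 40, 20)  # hypothetical voxel coordinate

print(is_linked(peak, (31, 40, 20)))               # direct neighbor
print(is_linked(peak, (32, 41, 20)))               # neighbor of a neighbor
print(is_linked(peak, (33, 40, 20)))               # three voxels away along x
print(is_linked(peak, (32, 40, 20), d=(1, 1, 1)))  # stricter setting
```

Under this assumption, the first two calls report a link and the last two do not, so a one-voxel gap caused by noise would not split a sub-cluster with the default setting.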

Outputs

The outputs mirror those of clustercoordsc, to simplify merging outputs into one common format.

cs

This is simply a list of cluster sizes, where the first entry always represents the sub-cluster containing the highest value (main peak).

cv

This is a Cx1 cell array with the values of each sub-cluster. Note: because the input is a list of coordinates and values rather than a volume, the values are returned as lists as well; returning a potentially large volume on every call would cost too much performance.

cl

This is a combined list of cluster coordinates and sub-cluster index values: the c input is extended with a fourth column, which is set to the appropriate sub-cluster index (beginning at 1).
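The relationship between the combined list and the per-sub-cluster outputs can be sketched in Python as follows. The data and the helper name group_by_subcluster are made up for illustration; only the row layout (three coordinates followed by a 1-based sub-cluster index) follows the description above.

```python
from collections import defaultdict


def group_by_subcluster(cl):
    """Split rows of (x, y, z, index) into {index: [(x, y, z), ...]}."""
    groups = defaultdict(list)
    for x, y, z, idx in cl:
        groups[int(idx)].append((x, y, z))
    return dict(groups)


# Hypothetical combined list: five voxels in two sub-clusters.
cl = [
    (30, 40, 20, 1),
    (31, 40, 20, 1),
    (30, 41, 20, 1),
    (36, 40, 20, 2),
    (37, 40, 20, 2),
]

cc = group_by_subcluster(cl)           # analogous to the cc output
cs = [len(cc[i]) for i in sorted(cc)]  # analogous to the cs output
print(cs)
print(cc[2])
```

Keeping a single four-column list makes it easy to filter or merge voxels by index, while the grouped form is more convenient when each sub-cluster is reported on its own table row.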

cc

As an alternative, a Cx1 cell array with separate lists of coordinates can be requested (this output is not created if only three outputs are requested).

Algorithm

The algorithm works as follows:

1. All voxel values within the given cluster coordinates are sorted from highest to lowest.
2. The coordinate of the highest voxel is marked as “subcluster 1”.
3. Beginning with the next highest value, each voxel in turn is tested for whether it is linked to one of the subclusters (by determining the minimal distance to voxels that are already marked). All voxels connected without interruption are thus marked as belonging to the same subcluster, and a voxel that is linked to none of the marked voxels starts a new subcluster. A voxel that is connected to several subclusters (where subclusters meet) is assigned to the subcluster of the direct neighbor with the highest value.
4. Step 3 is repeated until all voxels are marked.
5. Finally, the pseudo-size of all subclusters is determined.

The last step serves to remove “miniature subclusters” (which sometimes occur at the very edge of a cluster when a mask has been applied).
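The steps above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the NeuroElf code: the function name split_cluster is hypothetical, linkage is assumed to mean a per-axis coordinate difference of at most d, and a voxel touching several subclusters joins the one of its highest-valued linked voxel. How subclusters below the size threshold k are finally handled is not specified here, so the sketch only flags them.

```python
from collections import Counter


def split_cluster(coords, values, d=(2, 2, 2)):
    """Label voxels with 1-based subcluster indices, highest values first."""
    # Visit voxels from highest to lowest value.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    labels = {}  # voxel position in the input -> subcluster index
    n_sub = 0
    for i in order:
        # Already-marked voxels within reach of voxel i (assumed linkage rule).
        linked = [
            j for j in labels
            if all(abs(a - b) <= m for a, b, m in zip(coords[i], coords[j], d))
        ]
        if linked:
            # Join the subcluster of the linked voxel with the highest value.
            best = max(linked, key=lambda j: values[j])
            labels[i] = labels[best]
        else:
            # Not connected to anything marked so far: a new local maximum.
            n_sub += 1
            labels[i] = n_sub
    return [labels[i] for i in range(len(values))]


# Hypothetical one-dimensional profile with two humps along x.
values = [2.0, 4.0, 9.0, 4.5, 2.5, 1.0, 1.5, 3.0, 7.0, 3.5, 2.2]
coords = [(x, 0, 0) for x in range(len(values))]

labels = split_cluster(coords, values)
sizes = Counter(labels)
k = 3  # size threshold for subclusters
too_small = sorted(s for s, n in sizes.items() if n < k)

print(labels)
print(dict(sizes), too_small)
```

In this example the voxels at x = 0 to 5 end up in subcluster 1 (around the main peak at x = 2) and those at x = 6 to 10 in subcluster 2 (around the second peak at x = 8); neither falls below the size threshold.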

While the size of a subcluster indicates how large a “hump” (additional cluster) it represents within the larger cluster, size has no practical relevance as to whether or not a sub-cluster/peak should be reported! This is particularly true if the sub-cluster peak is of a priori interest!

A suggested wording for a manuscript would be:

All clusters reported survived a combined height-and-size threshold
(statistic value above threshold for all considered voxels, minimum
cluster size in voxels). Local maxima (activation or correlation peaks)
are given whenever values within a cluster were found to be not
connected to the already considered (central) mass in a higher-values-first
watershed searching algorithm.

Note: This last term, “higher-values-first watershed searching algorithm”, is NOT a term you are likely to find in the literature yet. It is a brief compound term for the idea of the watershed method as applied to gradient values, whereby voxels reached along a negative gradient are marked as belonging to the same area; see http://en.wikipedia.org/wiki/Watershed_%28image_processing%29 for instance.
