Config Variables

Config Settings

This page documents the variables used in the options struct, ops. Each variable here is assigned as ops.<variableName> (see StandardConfig_MOVEME.m).

Overview of how KiloSort uses the settings

Data is processed in batches of NT samples. Each batch is pre-processed by filtering, median subtraction (common average referencing) and whitening across channels, which removes correlated noise (e.g. from far-away neurons). The whitened data is then scaled down by dividing by scaleproc.
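The pre-processing steps above can be sketched on one synthetic batch. This is a minimal illustration, not KiloSort's own code: the filtering step is omitted, the ZCA form of whitening is an assumption made for the sketch, and the values of NT, Nchan and scaleproc are hypothetical.

```matlab
% Sketch only: synthetic data, filtering step omitted.
rng(0);                                   % reproducible random numbers
NT        = 1000;                         % samples per batch (hypothetical ops.NT)
Nchan     = 4;                            % number of channels (hypothetical)
scaleproc = 200;                          % hypothetical value for ops.scaleproc

shared = randn(NT, 1);                    % noise common to every channel
batch  = randn(NT, Nchan) + 2 * repmat(shared, 1, Nchan);   % samples x channels

% Common average referencing: subtract the across-channel median of each sample.
car = batch - repmat(median(batch, 2), 1, Nchan);

% Whitening across channels (ZCA form, an assumption of this sketch).
C      = (car' * car) / NT;               % channel covariance
[E, D] = eig((C + C') / 2);               % symmetrise for numerical safety
Wrot   = E * diag(1 ./ sqrt(diag(D))) * E';
white  = car * Wrot;                      % channels are now decorrelated

% Scale the whitened data down by scaleproc.
scaled = white / scaleproc;
```

After whitening, (white' * white) / NT is close to the identity matrix, which is what "removing correlated noise" means in practice.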

KiloSort thresholds the scaled, pre-processed data with spkTh to identify initial spikes, taking a window of nt0 samples around the minimum of each spike. For each threshold crossing, +-loc_range(1) samples and +-loc_range(2) channels are checked to find the minimum. These spikes are clustered in the 7-dimensional PC space, wPCA, to identify potential templates. Nfilt templates are found initially; these templates are then run through your data in batches (think convolution). For each potential match, the degree of similarity between the candidate waveform and the template is compared with a threshold, Th. The candidate is compared to the mean of the template's waveforms: for lower lam values the candidate is allowed to be scaled more to match the template; in other words, large lam values force waveforms to be closer to the mean of the current template's waveforms. A certain amount of noise/uncertainty is allowed: larger values of momentum(1) allow for more noise/variability in the waveforms for a given template.
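The initial spike detection described above can be sketched as follows. This is a simplified illustration rather than KiloSort's implementation; the values of spkTh and loc_range and the toy data are hypothetical.

```matlab
% Sketch only: keep a threshold crossing if it is the minimum of its
% +-loc_range(1) samples by +-loc_range(2) channels neighbourhood.
spkTh     = -4;           % hypothetical value for ops.spkTh
loc_range = [3 1];        % hypothetical [samples channels] for ops.loc_range

data = zeros(50, 3);      % samples x channels, already pre-processed
data(20, 2) = -6;         % spike minimum on channel 2
data(21, 2) = -5;         % neighbouring sample of the same spike
data(20, 1) = -4.5;       % same spike, seen more weakly on channel 1

[nSamples, nChan] = size(data);
spikes = zeros(0, 2);     % rows of [sample channel]
[t, c] = find(data < spkTh);              % every threshold crossing
for k = 1:numel(t)
    tWin = max(1, t(k) - loc_range(1)) : min(nSamples, t(k) + loc_range(1));
    cWin = max(1, c(k) - loc_range(2)) : min(nChan,    c(k) + loc_range(2));
    neighbourhood = data(tWin, cWin);
    if data(t(k), c(k)) == min(neighbourhood(:))
        spikes(end + 1, :) = [t(k), c(k)]; %#ok<AGROW>
    end
end
disp(spikes)              % one spike: sample 20, channel 2
```

All three crossings belong to the same event, so only the local minimum at sample 20 on channel 2 is kept.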

After a set number of batches (400), templates are re-evaluated. If the distance between two clusters is less than mergeT, the clusters, and hence their templates, are averaged together. If the score of the split between clusters is greater than splitT, the cluster is marked for splitting. Splitting is performed after merging, and contains a hidden test for number of spikes to allow overwriting small clusters [?].

Parallel Matching Pursuit occurs during the final pass of the data. This approach looks for the best-matching template and subtracts it from the waveform; the residual waveform is then compared with the other templates in a similar fashion until the amount of explained variance falls below a threshold.
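The subtract-and-repeat idea can be sketched with a plain, sequential matching pursuit on a single waveform. This does not model the parallel (GPU) aspect of KiloSort's version; the templates, the waveform and the stopping threshold minExplained are all hypothetical.

```matlab
% Sketch only: greedy matching pursuit with two unit-norm templates.
templates = [0 -1 -4 -1  0  0  0 0;       % template 1 (one template per row)
             0  0  0  0 -1 -3 -1 0];      % template 2
for k = 1:size(templates, 1)
    templates(k, :) = templates(k, :) / norm(templates(k, :));
end

waveform     = 5 * templates(1, :) + 2 * templates(2, :);  % two overlapping spikes
residual     = waveform;
totalEnergy  = sum(waveform .^ 2);
minExplained = 0.05;      % hypothetical: stop when a match explains < 5% of the energy
fitted       = zeros(0, 2);               % rows of [templateIndex amplitude]

for iter = 1:10
    proj = templates * residual';         % projection onto every template
    [best, idx] = max(proj .^ 2);         % best-matching template
    if best / totalEnergy < minExplained
        break;                            % nothing left worth explaining
    end
    residual = residual - proj(idx) * templates(idx, :);   % subtract the match
    fitted(end + 1, :) = [idx, proj(idx)]; %#ok<AGROW>
end
disp(fitted)
```

Template 1 is matched first with amplitude 5, then template 2 with amplitude 2, after which the residual is essentially zero and the loop stops.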

Th - Threshold for Comparing Spike to Template

KiloSort projects a candidate spike waveform onto each template to assess how much of the variance of that spike can be explained by the template. Th sets how much of the variance needs to be explained for the waveform to be considered part of the template. In other words, the threshold controls how much variance is allowed around the template: a small value allows a large amount of variance, so the template's cluster accumulates more waveforms that vary from the template. Th has 3 elements. The first 2 elements are used to create a linspace() between the first annealing pass and the final annealing pass (nannealpasses*NBatch), e.g. 1 and 5 for 10 anneals: linspace(1,5,10). This effectively creates an increasingly harder threshold to cross with each annealing pass. The final element is used during the final template-matching pass, i.e. the pass that goes through each batch sequentially and performs parallel matching pursuit.

Relevant References: #122, #146 (isolated_peaks implementation notes)
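The annealing schedule for the three-element threshold can be written out directly from the linspace(1,5,10) example above. The values of Th and the number of annealing steps are hypothetical; in a real run the number of steps is nannealpasses*NBatch.

```matlab
% Sketch only: build the annealing schedule from a 3-element threshold.
Th      = [1 5 5];        % hypothetical [first anneal, final anneal, final pass]
nAnneal = 10;             % number of annealing steps in this example

annealTh = linspace(Th(1), Th(2), nAnneal);   % threshold used at each annealing step
finalTh  = Th(3);                             % threshold used in the final matching pass

disp(annealTh)
```

The threshold rises in equal steps from Th(1) to Th(2), so matches that were accepted early in the optimisation may be rejected later on.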

lam - Penalty for Amplitudes Different to the Template

A large value of lam means that there is a large penalty if the template needs to be scaled to match the candidate waveform. The penalty is applied to the similarity value between the waveform and the template, so a large penalty reduces the similarity value. The threshold for similarity is set by Th.
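The effect of lam can be illustrated with a small worked example. It assumes a quadratic penalty, lam*(x - mu)^2, on the difference between the fitted amplitude x and the template's mean amplitude mu; this form is an assumption made for illustration, and KiloSort's actual cost function may differ. All values are hypothetical.

```matlab
% Sketch only: penalised amplitude fit, minimising
%   ||d - x*w||^2 + lam*(x - mu)^2   for a unit-norm template w.
w  = [0 -1 -3 -1 0];
w  = w / norm(w);         % unit-norm template shape
mu = 10;                  % mean amplitude of the template's waveforms
d  = 16 * w;              % candidate spike: same shape, amplitude 16

proj = d * w';            % unpenalised best-fit amplitude (16)
lams = [0.1 1 10 100];    % small to large penalty
amp  = (proj + lams * mu) ./ (1 + lams);   % closed-form penalised amplitude
resid = (proj - amp) .^ 2;                 % energy left after subtracting amp*w

disp([lams; amp; resid])
```

As lam grows, the fitted amplitude is pulled from 16 towards the mean of 10 and the unexplained energy increases, so the candidate matches the template less well.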

Nfilt - Starting Number of Clusters

Nfilt sets the initial number of clusters to find. This means the output (before any auto-merging) will usually have this many clusters, but if shuffle_clusters = 1, you may find the output deviates from this value.

Typically you want this variable to be 2-4 times the number of recording sites (i.e. channels, Nchan) you have. However, the lower the input impedance of your recording sites, the lower you can set this value. A low input impedance means that you will still receive large-amplitude signals from relatively far away from the recording site; hence, if all your recording sites were low impedance, you might find that they essentially record the same signal. KiloSort would then be unable to cluster signals based on a waveform signature that spans multiple channels.
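As a rough illustration of the 2-4 times guideline, Nfilt can be derived from the channel count. The channel count and multiplier below are hypothetical, and rounding up to a multiple of 32 is an assumption based on the comment in the standard config file; check it against your own copy.

```matlab
% Sketch only: choose Nfilt from the number of channels.
Nchan  = 32;              % hypothetical number of recording sites
factor = 3;               % somewhere in the suggested 2-4 range

ops.Nfilt = factor * Nchan;
ops.Nfilt = 32 * ceil(ops.Nfilt / 32);    % assumption: keep Nfilt a multiple of 32

disp(ops.Nfilt)
```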

nt0 - Waveform Window (Samples)

Sets the number of samples to use for templates, and hence for the extracted waveform. The peak of the template / extracted waveform is located at sample nt0min. nt0 should always be an odd number. It also cannot exceed 80 (#30), as there is a hardcoded maximum in the GPU code.

Excellent examples here: #177, #171

nt0min - Peak Location used for PC's -1

Tells the algorithm where the peak is located in your PCs. The -1 is required because MATLAB uses 1-indexing, e.g. for a waveform centred at sample 21, the peak is at 1+20 rather than 0+21, so nt0min = 20.

Key References: #177, #169
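The relationship between nt0 and nt0min can be sketched by cutting a window out of a toy trace. The values are hypothetical, and the sketch assumes the -1 convention above, i.e. the peak lands on 1-indexed sample nt0min + 1 of the window.

```matlab
% Sketch only: extract an nt0-sample window with the peak at nt0min + 1.
nt0    = 61;              % hypothetical window length (odd, below the hardcoded limit)
nt0min = 20;              % peak at 1-indexed sample 21 of the window

trace      = zeros(1, 200);
trace(100) = -8;          % spike minimum at sample 100

[~, peakIdx] = min(trace);
win      = (peakIdx - nt0min) : (peakIdx - nt0min + nt0 - 1);
waveform = trace(win);    % nt0 samples around the spike

[~, peakInWindow] = min(waveform);
disp(peakInWindow)        % 21
```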

wPCA - The Principal Component Matrix

wPCA should contain the first 7 principal components from some sample data.

% waveFormArray is:
% rows    x    columns
% spikes  x    samples of the spike waveform
Wi = pca(waveFormArray); % each column of Wi is one principal component
imagesc(Wi) % visualise the output
wPCA = Wi(:,1:7); % KiloSort only uses the first 7

To compute the xth PC value for a waveform, multiply a spike waveform (e.g. a row of waveFormArray) by the xth column of wPCA. For multi-channel data, only the waveforms from the channels with the largest amplitude can be used.

Key References: #169, #32(multi-channel)
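The projection described above can be sketched on synthetic waveforms. To stay independent of the toolbox function pca(), this sketch obtains the components from svd() of the mean-centred data, which gives the same components up to sign; the synthetic waveforms and array sizes are hypothetical.

```matlab
% Sketch only: build a 7-column PC matrix and project waveforms onto it.
rng(1);                                   % reproducible random numbers
nSpikes  = 200;
nSamples = 61;
shape    = exp(-((1:nSamples) - 21) .^ 2 / 18);    % bump centred on sample 21
waveFormArray = -(5 + randn(nSpikes, 1)) * shape ...
                + 0.1 * randn(nSpikes, nSamples);  % spikes x samples

centred   = waveFormArray - repmat(mean(waveFormArray, 1), nSpikes, 1);
[~, ~, V] = svd(centred, 'econ');         % columns of V are the principal components
wPCA      = V(:, 1:7);                    % samples x 7

x        = 1;
pcValue  = waveFormArray(1, :) * wPCA(:, x);   % xth PC value of the first spike
features = waveFormArray * wPCA;               % all spikes at once: spikes x 7
```

Each row of features is one spike's coordinates in the 7-dimensional PC space used for clustering.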

