What Is Spike Sorting?
One Electrode, Many Voices
Imagine you're standing in a dark room with a single microphone. There are four people talking. They're all standing at slightly different distances from the mic. Their voices have different pitches, different timbres, different rhythms. The microphone captures one continuous audio signal, all four voices mixed together. Your job is to figure out, for every syllable in the recording, which person said it.
That's spike sorting.
Except the "room" is the living tissue of the brain. The "microphone" is a hair-thin metal electrode implanted into cortical gray matter. The "voices" are neurons, and instead of words, they're producing action potentials: brief, stereotyped voltage spikes that last about one millisecond. Each neuron's spike has a slightly different shape because of its size, distance from the electrode, and the geometry of its axon. And the electrode picks them all up at once, superimposed into a single voltage trace.
Spike sorting is the computational process of untangling that mess. Of taking the raw recording and saying: this spike came from neuron A, that spike came from neuron B, and those three overlapping spikes are neurons A, C, and D firing nearly simultaneously.
It's one of the most fundamental problems in invasive neuroscience. And if you're a BCI developer who's wondered how systems like BrainGate or the Utah array actually decode neural signals, spike sorting is where it all starts.
Why Would Anyone Need to Do This?
Here's the thing about neurons. They speak in spikes.
A neuron either fires or it doesn't. There's no "volume knob." When the electrical potential across a neuron's membrane hits about -55 millivolts (the threshold), voltage-gated sodium channels slam open, positive ions rush in, and the neuron produces a sharp electrical pulse that peaks at roughly +40 millivolts, then crashes back down within about a millisecond. This all-or-nothing event is the action potential. The spike.
The information isn't in the spike itself. It's in the pattern of spikes over time. A neuron might fire 5 times per second when nothing interesting is happening, then ramp up to 200 spikes per second when the thing it cares about (a specific direction of arm movement, a particular visual edge, a certain sound frequency) is present. The "firing rate" is the neural code.
But here's the catch. When you stick an electrode into the cortex, you don't hear just one neuron. The tip of a typical microelectrode can detect spikes from anywhere within about 50 to 150 micrometers. In cortical gray matter, that sphere contains roughly 5 to 20 neurons. Their spikes all show up on the same recording channel, all mixed together in one messy voltage trace.
If you want to know what any individual neuron is saying, you have to separate them first.
The Shape of a Spike Is Its Fingerprint
The reason spike sorting is even possible comes down to physics. Each neuron's spike looks slightly different on the recording, and that difference is predictable.
Three things determine the shape of a spike as it appears on the electrode:
Distance. A neuron sitting 20 micrometers from the electrode tip produces a large spike. One sitting 100 micrometers away produces a smaller one. Electrical signals attenuate with distance through brain tissue, roughly following an inverse-square relationship. So amplitude is the first distinguishing feature.
Cell morphology. Neurons come in different shapes and sizes. A large pyramidal neuron with a long apical dendrite produces a different extracellular voltage field than a small inhibitory interneuron. These differences show up in the spike's waveform: its width, its symmetry, the shape of its afterhyperpolarization (the dip that follows the main peak).
Electrode geometry. The precise angle and position of the electrode relative to the neuron's axon and dendrites affects how the spike appears. Two identical neurons at the same distance but different orientations will produce distinct waveforms.
The result: each neuron near the electrode leaves a characteristic "signature" in the voltage trace. A template. If neuron A consistently produces tall, narrow spikes with a short afterhyperpolarization, and neuron B produces shorter, wider spikes with a longer afterhyperpolarization, you can learn to tell them apart.
This is the core assumption of spike sorting. Same neuron, same shape. Different neuron, different shape.
It's an imperfect assumption (we'll get to why), but it works well enough to have powered decades of neuroscience research and the most advanced BCIs on the planet.
Step One: Finding the Spikes
Before you can sort spikes, you need to find them. This is called spike detection, and it's simpler than you might expect.
The raw voltage trace from an extracellular electrode is a noisy signal. Background activity from distant neurons, synaptic potentials, and thermal noise all contribute to a "floor" of fluctuation. Spikes are sharp, brief events that rise well above this floor.
The most common detection method is threshold crossing. You estimate the noise level of the recording (usually from the median absolute deviation of the signal, which is robust to the spikes themselves), multiply it by a factor (typically 4 to 5), and flag any time the voltage crosses that threshold. Each crossing marks a candidate spike.
Once detected, you cut out a short window around each spike, typically 1 to 2 milliseconds, centered on the peak. These windows are your "spike snippets." For a recording with 10 neurons firing at average rates, you might collect hundreds of thousands of snippets per hour.
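The detection-and-snippet step can be sketched in a few lines of NumPy. This is an illustrative sketch, not any particular toolbox's implementation: the function name, the 4.5x multiplier, and the 1.6 ms window are assumed defaults, and it detects only negative-going spikes (the usual polarity for extracellular recordings).

```python
import numpy as np

def detect_spikes(trace, fs=30000, k=4.5, window_ms=1.6):
    """Detect negative-going spikes in a 1-D voltage trace and cut snippets."""
    # Robust noise estimate: MAD scaled to match the std of Gaussian noise.
    sigma = np.median(np.abs(trace)) / 0.6745
    threshold = k * sigma
    half = int(window_ms / 2 * fs / 1000)   # samples on each side of the peak
    below = trace < -threshold
    # Indices where the trace first drops below -threshold.
    crossings = np.flatnonzero(below[1:] & ~below[:-1]) + 1
    peaks, snippets = [], []
    for c in crossings:
        peak = c + int(np.argmin(trace[c:c + half]))  # align on negative peak
        if peaks and peak - peaks[-1] < half:
            continue  # skip duplicate crossings within the same spike
        if peak - half >= 0 and peak + half <= trace.size:
            peaks.append(peak)
            snippets.append(trace[peak - half:peak + half])
    return np.array(peaks), np.array(snippets)
```

Aligning each snippet on its negative peak (rather than on the threshold crossing itself) matters: misaligned snippets smear the waveform differences that the later clustering steps depend on.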
Now comes the hard part. You have a pile of spike snippets. They need to be grouped by neuron.
Step Two: Extracting Features
Each spike snippet is a short voltage waveform, maybe 40 to 60 sample points long at typical sampling rates of 20,000 to 40,000 Hz. You could try to cluster the raw waveforms directly, but 40-dimensional space is unwieldy. You'd need an enormous amount of data, and the noise in any single sample point would muddy the distinctions between neurons.
The solution is dimensionality reduction: find a compact set of features that captures the meaningful differences between spike shapes while discarding noise.
PCA: The Workhorse
Principal component analysis (PCA) is the most widely used feature extraction method in spike sorting. The idea is beautifully straightforward.
You stack all your spike snippets into a matrix (each row is one spike, each column is one time point). PCA finds the directions of maximum variance in this matrix. The first principal component (PC1) is the direction along which the data varies the most. PC2 captures the most variance that's orthogonal to PC1. And so on.
For spike sorting, the first 2 to 4 principal components typically capture over 90% of the meaningful variation between spike shapes. So you project each spike onto these components, reducing your 40-dimensional waveform to a 2D or 3D point. Plot those points, and spikes from different neurons often form distinct clusters.
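A minimal sketch of that projection, using scikit-learn's PCA on synthetic snippets (the two Gaussian-shaped "units," their amplitudes, and the noise level are invented for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
t = np.linspace(0, 1, 48)

def make_spikes(amp, width, center, n):
    # A Gaussian-shaped negative deflection plus white noise.
    shape = -amp * np.exp(-((t - center) ** 2) / width)
    return shape + rng.normal(0, 0.05, (n, t.size))

# Two fake "units": a tall narrow spike and a shorter, wider, later one.
snippets = np.vstack([make_spikes(1.0, 0.002, 0.30, 300),
                      make_spikes(0.6, 0.010, 0.45, 300)])

pca = PCA(n_components=3)
features = pca.fit_transform(snippets)   # (600, 3): one 3-D point per spike
```

With real data you would fit PCA once on all snippets from a channel and keep the fitted components, so that newly detected spikes can be projected into the same feature space.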
| Feature Method | How It Works | Strengths | Weaknesses |
|---|---|---|---|
| PCA | Finds axes of maximum variance in the spike waveform matrix | Fast, well-understood, effective for well-separated units | Not optimized for discrimination between clusters |
| Wavelet coefficients | Decomposes each spike into time-frequency components using wavelets | Better at capturing localized waveform differences | More computationally expensive than PCA |
| Template matching | Compares each spike to a library of known waveform templates | Simple, fast, works well when templates are stable | Cannot discover new units or handle template drift |
| Autoencoder (deep learning) | Neural network learns a compressed representation of spike shapes | Can capture nonlinear features | Requires training data, less interpretable |
Wavelets: When PCA Isn't Enough
Sometimes PCA misses differences that matter. Two neurons might have similar overall variance structure but differ in a localized feature, like the width of the initial negative phase. Wavelet decomposition captures these localized time-frequency features better than PCA. A wavelet transform (Haar wavelets are a common choice) decomposes each spike into coefficients at different scales, and a statistical test (like the Lilliefors test for normality) selects the coefficients that best separate the spike populations.
The WaveClus algorithm by Rodrigo Quian Quiroga and colleagues, published in 2004, popularized this approach. It remains one of the most cited spike sorting methods in the field.
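Here's a sketch of that idea: a hand-rolled Haar decomposition plus normality-based coefficient selection. It is a simplification, not WaveClus itself, and scipy's D'Agostino normality test stands in for the Lilliefors test; function names and the `n_keep=10` default are assumptions.

```python
import numpy as np
from scipy import stats

def haar_coeffs(x):
    # Repeated pairwise average/difference = a Haar wavelet decomposition.
    coeffs = []
    approx = np.asarray(x, dtype=float)
    while approx.shape[1] > 1 and approx.shape[1] % 2 == 0:
        even, odd = approx[:, ::2], approx[:, 1::2]
        coeffs.append((even - odd) / np.sqrt(2))   # detail coefficients
        approx = (even + odd) / np.sqrt(2)         # coarser approximation
    coeffs.append(approx)
    return np.hstack(coeffs)

def select_features(snippets, n_keep=10):
    C = haar_coeffs(snippets)
    # Coefficients whose distribution deviates most from a Gaussian are the
    # ones most likely to be multimodal, i.e. to separate distinct neurons.
    scores = stats.normaltest(C, axis=0).statistic
    return C[:, np.argsort(scores)[::-1][:n_keep]]
```

The logic behind the normality test: if all spikes came from one neuron, each coefficient would be a single template value plus Gaussian noise. A strongly non-Gaussian (multimodal) coefficient is evidence of multiple underlying templates.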
Step Three: Clustering
You now have each spike represented as a point in 2D or 3D feature space. Spikes from the same neuron should form a tight cluster. Spikes from different neurons should form separate clusters. The task: find the clusters.
K-Means: The Simple Approach
K-means is the most basic clustering algorithm. You specify the number of clusters (K), initialize K centroids randomly, assign each spike to the nearest centroid, recalculate centroids, and repeat until convergence. It's fast and intuitive.
The problem: you have to know K in advance. How many neurons are you recording from? You don't know. That's what you're trying to figure out. You can try different values of K and use metrics like the Bayesian Information Criterion (BIC) to choose, but this adds complexity and doesn't always give a clear answer.
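A sketch of the "try different K" workflow, using silhouette score as a simpler stand-in for BIC (the blob positions and spreads are invented to mimic well-separated PCA clusters):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Three well-separated blobs standing in for PCA projections of three units.
features = np.vstack([rng.normal(center, 0.3, (200, 2))
                      for center in ([0, 0], [3, 0], [0, 3])])

scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    scores[k] = silhouette_score(features, labels)

best_k = max(scores, key=scores.get)  # highest silhouette wins
```

On clean synthetic blobs this reliably recovers the true count; on real, drifting, partially overlapping spike clusters, the score curve is often flat and ambiguous, which is exactly the complaint against k-means.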
Gaussian Mixture Models: A Probabilistic Upgrade
Gaussian mixture models (GMMs) assume the data is generated by a mixture of K Gaussian distributions, each representing one neuron. The algorithm estimates the parameters (mean, covariance, and mixing weight) of each Gaussian using expectation-maximization (EM).
GMMs have a significant advantage over k-means: they give you a probability of cluster membership for each spike, not just a hard assignment. If a spike sits between two clusters, GMM tells you "70% chance neuron A, 30% chance neuron B." This uncertainty estimate is valuable for downstream analysis.
The spike sorting literature contains dozens of clustering methods beyond k-means and GMMs. Density-based methods like DBSCAN find clusters of arbitrary shape. Hierarchical methods build a tree of nested clusters. Graph-based methods like spectral clustering use the eigenvalues of similarity matrices. More recently, variational inference approaches treat the entire sorting problem as Bayesian inference, estimating the posterior probability of every possible assignment of spikes to neurons.
The diversity of methods reflects the difficulty of the problem. No single algorithm works best in all situations.
The Modern Era: Automated Spike Sorters
Manual spike sorting, where a human researcher stares at cluster plots and draws boundaries by hand, was standard practice for decades. It works, but it's agonizingly slow, subjective, and doesn't scale. A single Utah array has 96 channels. Modern Neuropixels probes have 384 channels. Nobody is sorting 384 channels by hand.
The field has moved toward automated pipelines. The most influential ones:
Kilosort (developed by Marius Pachitariu and colleagues at the Howard Hughes Medical Institute) changed the game when it was released in 2016. Instead of the traditional detect-extract-cluster pipeline, Kilosort uses template matching on the raw data itself. It fits a generative model that describes the entire recording as a sum of spike templates convolved with spike times, optimized using GPUs. Kilosort can sort a Neuropixels recording (384 channels, millions of spikes) in less time than it took to record the data. Its third version, Kilosort3, is widely considered the standard for high-density probe recordings.
MountainSort takes a different approach, using ISO-SPLIT, a nonparametric density-based clustering algorithm that finds clusters in feature space without requiring the user to specify the number of neurons. It's known for producing very clean, conservative sorts where cluster quality is prioritized over yield.
SpyKING CIRCUS combines density-based clustering with greedy template matching: it builds a library of waveform templates, then fits and subtracts them from the raw data. This lets it resolve overlapping spikes from nearby neurons and scale to recordings with hundreds of channels.
Here's something that still surprises researchers who are new to the field. You can never truly verify that your spike sorting is correct. The ground truth, knowing exactly which neuron produced each spike, would require simultaneous intracellular recording from every neuron near the electrode. That's technically impossible for more than one or two cells at a time. So the field validates spike sorting algorithms using simulated data (where ground truth is known by construction) and by checking internal consistency metrics like isolation distance, L-ratio, and interspike interval violations. It's like trying to judge a translation between two languages when you only speak one of them. You can check for internal coherence, but you can't look up the answer in the back of the book.
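One of those consistency metrics, the interspike interval (ISI) violation rate, is simple enough to sketch. The physics is the hook: a single neuron cannot fire twice within its absolute refractory period (~1-2 ms), so any "unit" whose spike train contains such intervals must be contaminated by another neuron. The function name and the 1.5 ms default are illustrative choices.

```python
import numpy as np

def isi_violation_rate(spike_times_s, refractory_s=0.0015):
    # Fraction of inter-spike intervals shorter than the refractory period.
    isis = np.diff(np.sort(spike_times_s))
    return float(np.mean(isis < refractory_s)) if isis.size else 0.0
```

A well-isolated unit should have a violation rate near zero; a cluster that accidentally merged two neurons will show a rate far above what refractoriness allows.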
Why Spike Sorting Matters for BCI
So why should a BCI developer care about any of this?
The short answer: if you're building invasive BCIs, spike sorting is the foundation of everything. The long answer is more nuanced and actually quite interesting.
The systems that have made headlines, BrainGate allowing paralyzed patients to control robotic arms, the Stanford lab demonstrating thought-to-text at 90 characters per minute, Neuralink's monkey playing Pong, all depend on decoding the activity of individual neurons (or small populations of neurons) from implanted electrode arrays. The decoder, typically a Kalman filter or recurrent neural network, takes as input the firing rates of sorted neurons and outputs a prediction of intended movement.
Better spike sorting means better isolation of individual neurons. Better isolation means cleaner firing rate estimates. Cleaner firing rates mean more accurate decoding. A 2019 study in the Journal of Neural Engineering showed that decoders trained on well-sorted single-unit activity outperformed those trained on unsorted multi-unit activity by 15 to 25% in cursor control accuracy.
But here's the twist. Not everyone is convinced that spike sorting is worth the trouble.

The Case Against Sorting: When Brute Force Wins
A growing school of thought in the BCI community argues that spike sorting is unnecessary for many practical applications. The argument goes like this:
Modern electrode arrays (Neuropixels, Utah arrays with thousands of channels) record from so many neurons simultaneously that the sheer volume of data compensates for the imprecision of unsorted signals. Instead of carefully isolating each neuron, you can simply count how many times the voltage crosses a threshold on each channel (threshold crossings) and feed those counts into a powerful decoder.
Chethan Pandarinath and colleagues showed in 2017 that latent factor models trained on threshold crossings achieved decoding performance remarkably close to models trained on carefully sorted single units. The advantage: threshold crossing is computationally trivial. No PCA. No clustering. No quality control. Just count crossings and decode.
This matters for practical BCIs because spike sorting is computationally expensive and, more importantly, the results degrade over time. Electrodes shift. Scar tissue forms. The waveform templates that worked on day 1 may not work on day 30. Threshold crossings, by contrast, are robust to these changes.
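The entire "no sorting" front end fits in a few lines: count threshold crossings per channel in short bins and hand the count matrix to the decoder. This is an illustrative sketch (the function name, 20 ms bins, and 4.5x threshold are assumed defaults), reusing the same MAD-based noise estimate as the detection step.

```python
import numpy as np

def crossing_counts(traces, fs, bin_s=0.02, k=4.5):
    """traces: (n_channels, n_samples) -> (n_channels, n_bins) spike counts."""
    edges = np.arange(0, traces.shape[1] + 1, int(bin_s * fs))
    counts = []
    for ch in traces:
        sigma = np.median(np.abs(ch)) / 0.6745       # robust noise estimate
        below = ch < -k * sigma
        crossings = np.flatnonzero(below[1:] & ~below[:-1]) + 1
        counts.append(np.histogram(crossings, bins=edges)[0])
    return np.array(counts)
```

Compare this to the full pipeline above: no feature extraction, no clustering, no quality control. Each bin of each channel becomes one decoder feature, and the decoder is left to sort out which channels carry useful signal.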
The debate isn't settled. For applications requiring precise single-neuron resolution (like understanding how specific cell types contribute to a motor plan), spike sorting remains essential. For applications that just need reliable control signals (like moving a cursor), threshold crossings may be good enough.
| Approach | Input Signal | Computational Cost | Robustness Over Time | Best For |
|---|---|---|---|---|
| Full spike sorting | Isolated single-unit activity | High | Low (templates drift) | Neuroscience research, precise decoding |
| Threshold crossings | Multi-unit spike counts | Very low | High | Clinical BCIs, real-time control |
| Unsorted multi-unit activity | Filtered, rectified signal envelope | Low | Moderate | Quick prototyping, population-level analysis |
| Non-invasive EEG (frequency bands) | Aggregate oscillatory power | Low to moderate | High | Consumer BCI, neurofeedback, accessibility |
The Abstraction Ladder: From Spikes to Frequency Bands
This is where the picture gets really interesting if you're a BCI developer trying to figure out where you fit.
Neural signals exist on an abstraction ladder. At the bottom rung, you have the action potentials of individual neurons. Single spikes. This is where spike sorting lives. You need electrodes physically inside the brain to work at this level. The signals are rich, information-dense, and incredibly specific. You can decode the direction of an intended arm movement from 100 sorted neurons with remarkable accuracy.
One rung up, you have local field potentials (LFPs). These are slower oscillations recorded from the same implanted electrodes, reflecting the summed synaptic activity of thousands of neurons. LFPs don't require spike sorting. They carry information about the general state of a local neural population, things like movement preparation, attention, and cognitive effort.
Another rung up, you have electrocorticography (ECoG). Electrodes on the surface of the brain (under the skull, but not penetrating the tissue) pick up high-gamma activity and broadband signals that can decode speech, movement, and even imagined actions. No spike sorting here either.
And at the top of the ladder: scalp EEG. Electrodes on the outside of your skull, measuring the aggregate electrical activity of millions of neurons firing in synchrony. The signal is heavily filtered by the skull and scalp. No individual spikes. No single-neuron resolution. What you get instead are frequency bands: the delta, theta, alpha, beta, and gamma oscillations that reflect large-scale brain states.
This abstraction ladder isn't just academic. It determines what kind of BCI you can build and who can use it.
Invasive microelectrodes (BrainGate, Neuralink) operate at the bottom of the ladder. Maximum information. Maximum surgical risk. Currently limited to research participants with severe paralysis.
ECoG grids (used in some epilepsy patients) sit in the middle. Good signal quality, lower risk than penetrating electrodes, but still requires craniotomy.
Consumer EEG devices like the Neurosity Crown operate at the top. You're not decoding individual neurons. You're reading the collective rhythm of cortical populations. And that's not a limitation. It's a design choice. Because you can put the Crown on your head in 30 seconds, use it every day, and build applications with JavaScript that respond to your brain state in real time. No surgery. No hospital. No spike sorting.
What Crown Developers Work With Instead
If you're developing with the Neurosity Crown, you're working at the frequency band level. And you should understand what that means in the context of everything we've just discussed.
The Crown's 8 EEG channels sample at 256 Hz. That sampling rate is designed for oscillatory analysis, not spike detection (which typically requires 20,000 to 40,000 Hz). The Crown's N3 chipset performs fast Fourier transform (FFT) analysis on-device, decomposing the raw signal into power across frequency bands.
Your building blocks as a Crown developer are:
- Power spectral density across delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz), and gamma (30-100+ Hz) bands
- Focus and calm scores, computed by machine learning models trained on these frequency features
- Kinesis, which uses imagined movement patterns (motor imagery) detected across channels C3 and C4 over the motor cortex
- Raw EEG at 256 Hz for custom analysis pipelines
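If you pull raw EEG off the device for a custom pipeline, band power is the natural first computation. Here's a minimal sketch using Welch's method from scipy; the band edges match the list above, but the function name and channel layout are illustrative, not the Neurosity SDK's API.

```python
import numpy as np
from scipy.signal import welch

BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 100)}

def band_powers(eeg, fs=256):
    # eeg: (n_channels, n_samples) raw voltage.
    # nperseg = 2 s of data gives 0.5 Hz frequency resolution.
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)
    return {name: psd[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
            for name, (lo, hi) in BANDS.items()}
```

Feeding this a few seconds of eyes-closed recording should show the classic posterior alpha bump; relative band powers (each band divided by total power) are usually more stable across sessions than absolute values.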
This is a fundamentally different abstraction level from spike sorting. You're not asking "which neuron fired?" You're asking "what state is this brain region in?" And for the applications most people actually want to build, things like adaptive focus tools, meditation feedback, cognitive load monitoring, brain-responsive audio (via SDK), and AI integrations through MCP, the frequency band level is exactly the right level.
Think of it this way. If spike sorting is like transcribing individual voices in a crowd, frequency band analysis is like measuring the crowd's mood. You can't tell who said what, but you can tell whether the crowd is calm, excited, agitated, or asleep. And for a surprising number of applications, the mood of the crowd is exactly the information you need.
The Future of Spike Sorting (and Why It Might Disappear)
The spike sorting field is at an inflection point. Two forces are pushing in opposite directions.
On one side, the algorithms are getting dramatically better. Deep learning approaches are beginning to outperform traditional methods. A 2021 paper from the Flatiron Institute demonstrated a spike sorter based on neural networks that could handle electrode drift, overlapping spikes, and non-stationary noise better than any template-based method. As electrode density increases (Neuropixels 2.0 has over 5,000 recording sites), the ability to triangulate a neuron's position in 3D space from its signal across multiple nearby channels is making sorting more accurate than ever.
On the other side, there's a growing argument that we should skip spike sorting entirely. As decoders get more powerful and electrode counts climb into the thousands, the raw data may contain enough redundant information that careful sorting becomes unnecessary overhead. Why spend hours sorting when a deep learning decoder can learn to extract the relevant signals directly from the raw voltage traces?
The most likely outcome: spike sorting will become a preprocessing step that runs invisibly in the background, fully automated, rather than the manual, labor-intensive process it's been for most of its history. Kilosort is already moving in this direction. For BCI applications, real-time sorting on dedicated hardware will run continuously, keeping up with the data as it streams in.
For those of us working with non-invasive devices, the evolution of spike sorting is fascinating to watch but not something we need to wait for. The path from "understanding the brain" to "building useful BCI applications" doesn't have to go through surgical implants and spike sorting algorithms. It can go through your scalp, through eight well-placed EEG sensors, and into a JavaScript SDK that lets you build brain-aware software today.
Where to Go From Here
If spike sorting grabbed your attention and you want to go deeper, the field's key resources are freely available:
- Rodrigo Quian Quiroga's 2004 paper on WaveClus is the classic introduction to wavelet-based spike sorting
- Kilosort's GitHub repository contains code, documentation, and benchmark datasets
- The SpikeInterface project provides a unified Python framework for running and comparing different spike sorters
- Allen Brain Observatory releases massive datasets with ground-truth-validated spike sorting
If you're more interested in building BCIs that people can actually use today, without implants or spike sorting, the Neurosity developer program and Crown are where that story starts. The SDK documentation walks you through accessing raw EEG, frequency bands, focus scores, and kinesis commands. You can build your first brain-controlled application in an afternoon.
The gap between single-neuron spikes and scalp-level EEG is real. But the applications that emerge from each level of the abstraction ladder are equally real. Spike sorting enables paralyzed patients to move robotic arms. Frequency band analysis enables millions of healthy people to understand and optimize their own brains. Both are brain-computer interfaces. Both are worth building.
The difference is that one requires a neurosurgeon. The other requires a USB-C cable and npm install.

