What Is Common Average Reference in EEG?
Every EEG Signal Is a Lie (Sort Of)
Here's something that might rearrange how you think about brain data: every single voltage value in an EEG recording is made up. Not fabricated. Not fictional. But not what most people assume it is, either.
When you look at an EEG trace and see a line wiggling up and down at, say, electrode Cz, you're not looking at the absolute electrical potential at that spot on the scalp. You're looking at the difference in electrical potential between Cz and some other electrode that someone decided to call "the reference."
Change the reference, and the signal changes. Sometimes a little. Sometimes a lot. The same brain, the same moment, the same neural activity can produce completely different-looking waveforms depending on which electrode you subtracted from which.
This isn't a flaw in EEG. It's a fundamental property of how voltage measurement works. A voltmeter always measures the difference between two points. There's no such thing as measuring voltage "at" a single point. Your multimeter has two probes for a reason.
But this creates a real problem for anyone working with EEG data. If the reference electrode happens to sit over an active brain region, its own neural activity contaminates every single channel in the recording. If the reference electrode has a bad connection, every channel gets noisy. The reference is the silent partner in every EEG measurement, and it's been causing arguments in neuroscience labs for decades.
Which brings us to one of the most popular solutions: common average reference.
What Does Common Average Reference Actually Do?
The idea behind common average reference, usually abbreviated CAR, is beautifully simple. Instead of referencing every channel against one physical electrode (like the left earlobe or the tip of the nose), you reference each channel against the mathematical average of all channels.
Here's the recipe. At every single time point in your recording:
- Take the voltage values from all your EEG channels.
- Compute the arithmetic mean (add them all up, divide by the number of channels).
- Subtract that mean from each individual channel.
That's it. That's the whole algorithm. If you have 64 channels, you calculate the average of all 64 values at time point t, then subtract that average from every channel at time point t. Repeat for every time point in the recording.
In Python, it's literally one line of NumPy (plus the import):

import numpy as np

data_car = data - np.mean(data, axis=0)

Here data is a matrix of shape (channels, time points). The mean over axis 0 is the across-channel average at each time point, and NumPy broadcasting subtracts it from every channel. Nothing more sophisticated than that.
But what does this actually accomplish? Why does subtracting the mean do anything useful?
The Logic: Shared Noise Gets Canceled
Think of it this way. Imagine you're in a room with eight microphones, each one pointed at a different musician in an orchestra. But there's also an air conditioning unit humming in the background. Every microphone picks up its target musician plus the hum.
If you average all eight microphone signals together, the individual musicians partially cancel each other out (each one is different), but the hum adds up (it's the same on every mic). The average signal is dominated by the hum.
Now subtract that average from each microphone. The hum gets removed from every channel, and you're left with a cleaner recording of each individual musician.
That's the logic of common average reference. Any electrical activity that appears equally on all channels, like power line interference at 50 or 60 Hz, distant muscle artifacts, or far-field potentials from deep brain sources, gets captured by the mean. Subtracting the mean removes it.
What survives the subtraction is the activity that's different across channels: the local, spatially specific brain signals that make each electrode's recording unique. And those local signals are usually what you care about.
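The microphone logic is easy to check numerically. The sketch below builds synthetic 8-channel data with a shared 50 Hz hum; the sampling rate, amplitudes, and channel count are arbitrary choices for illustration, not properties of any real recording.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 250                               # samples per second (arbitrary for this sketch)
t = np.arange(fs) / fs                 # one second of data
n_channels = 8

# Each channel: its own local signal plus a hum shared by all channels.
local = rng.normal(size=(n_channels, fs))
hum = 10 * np.sin(2 * np.pi * 50 * t)  # identical on every channel
data = local + hum

# Common average reference: subtract the across-channel mean per time point.
data_car = data - data.mean(axis=0)

# The shared hum cancels exactly; what remains is each local signal minus
# the mean of the local signals (the leakage discussed later).
print(np.allclose(data_car, local - local.mean(axis=0)))
```

Because the hum is identical on every channel, it lives entirely in the mean and vanishes completely after the subtraction.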
If channel i at time t has raw voltage V_i(t), and you have N total channels, then the common average reference version of channel i is:
V_i_CAR(t) = V_i(t) - (1/N) * SUM of V_j(t) for all j from 1 to N
This is equivalent to re-referencing to a virtual electrode whose potential is the average of all physical electrodes. If the electrodes cover the entire head evenly and there are enough of them, this virtual electrode approximates a point with zero potential, an "ideal" reference.
Why Neuroscientists Love It (Three Good Reasons)
Common average reference became one of the most widely used re-referencing methods in EEG research for solid, practical reasons. Not because it's trendy. Because it solves real problems.
Reason 1: It eliminates the reference bias problem
With a physical reference electrode (say, linked earlobes or the mastoid), any neural activity at the reference site bleeds into every channel. If your reference is on the left mastoid and the left temporal lobe happens to be doing something interesting, that temporal activity shows up as a ghost signal across your entire montage. CAR spreads the reference contribution across all channels instead of loading it onto one.
Reason 2: It's easy to apply after the fact
You can re-reference to common average at any point during analysis, long after the data was recorded. The original physical reference doesn't matter because re-referencing is a linear operation. This makes CAR popular in data sharing. Two labs using different physical references can re-reference to CAR and compare their results.
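The linearity claim is worth seeing directly: the CAR result does not depend on which physical reference the data was recorded against. A quick sketch, with random data standing in for a real recording:

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(8, 1000))      # (channels, time points), original reference

def car(x):
    """Re-reference to the common average."""
    return x - x.mean(axis=0)

# Model recording against a different physical reference electrode:
# every channel has that electrode's time series subtracted.
other_ref = rng.normal(size=1000)
data_other_ref = data - other_ref

# CAR removes the reference term entirely, so both versions agree.
print(np.allclose(car(data), car(data_other_ref)))
```

Algebraically: subtracting a reference r from every channel also shifts the mean by r, so the two shifts cancel and car(data - r) equals car(data).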
Reason 3: It improves topographic maps
Topographic maps (those colorful scalp plots you see in EEG papers) are extremely sensitive to reference choice. A single-electrode reference can create artificial asymmetries. CAR produces topographies that more accurately reflect the actual spatial distribution of brain activity, at least when the conditions are right.
The Catch: When Common Average Reference Fails
Here's where it gets interesting. And if you're a developer working with consumer-grade EEG, this is the section that matters most.
Common average reference rests on a critical assumption: the average of all electrodes is approximately zero. This assumption holds when you have many electrodes covering the entire head uniformly. With a dense 64-channel or 128-channel montage, the positive and negative contributions from different brain sources tend to cancel out in the average, leaving something close to zero.
But what happens when that assumption breaks down?
The few-channels problem
With fewer channels, each electrode makes up a larger share of the average. If you have 8 channels, every channel contributes 12.5% of the average. Subtract that average, and you're removing a significant piece of each channel's own signal.
Here's a concrete example. Say electrode C3 picks up a strong motor cortex signal. With 8 channels, C3 contributes 1/8 of the average. When you subtract the average from C3, you've just removed 12.5% of C3's own signal. Worse, you've also injected an inverted, attenuated copy of C3's signal into every other channel. That motor cortex activity now appears, faintly and flipped, at electrodes that had nothing to do with motor cortex.
This is the signal leakage problem, and it gets more severe as channel count drops.
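The C3 example above can be reproduced in a few lines. One channel carries a strong burst, the other seven are silent, and the numbers fall straight out of the 1/8 arithmetic:

```python
import numpy as np

n_channels, n_times = 8, 500
data = np.zeros((n_channels, n_times))

# A strong "motor cortex" burst on channel 0 (standing in for C3).
burst = np.sin(2 * np.pi * np.arange(n_times) / 50)
data[0] = burst

data_car = data - data.mean(axis=0)

# Channel 0 keeps 7/8 of its own signal,
# and every silent channel gains an inverted 1/8 copy.
print(np.allclose(data_car[0], burst * 7 / 8))
print(np.allclose(data_car[1], -burst / 8))
```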
Most EEG textbooks recommend using common average reference only with 32 or more channels. Below that threshold, the distortion from signal leakage becomes harder to ignore. With 8 channels, you should be aware that CAR will attenuate genuine signals and spread small copies of strong signals to other channels. This doesn't mean you can't use CAR with few channels. It means you need to understand what it's doing and interpret your results accordingly.
The uneven coverage problem
CAR also assumes electrodes are distributed evenly across the scalp. If your montage clusters electrodes over one region (say, frontal and central areas, but nothing over occipital), the average is biased toward whatever brain activity those clustered electrodes pick up. Subtracting this biased average can suppress real signals in the covered region and create phantom signals in uncovered regions.
The bad channel problem
One noisy or disconnected channel can wreck a common average reference. If channel F4 has a massive artifact, that artifact gets folded into the average and then subtracted from every other channel. Instead of containing the damage to one channel, you've spread it across the entire recording. This is why artifact rejection and bad channel interpolation should always happen before you apply CAR.
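One defensive pattern is to drop bad channels from the average while still re-referencing every channel. The helper below is a sketch (car_excluding_bad is a made-up name, not a library function), with a large DC offset standing in for an artifact:

```python
import numpy as np

def car_excluding_bad(data, bad_idx):
    """Common average reference computed over good channels only.

    The average is taken over the good channels but subtracted from
    all channels, so the channel count of the output is unchanged.
    """
    good = np.setdiff1d(np.arange(data.shape[0]), bad_idx)
    avg = data[good].mean(axis=0)
    return data - avg

rng = np.random.default_rng(2)
data = rng.normal(size=(8, 1000))
data[3] += 500.0                       # channel 3 carries a huge offset artifact

naive = data - data.mean(axis=0)       # artifact leaks into every channel
safe = car_excluding_bad(data, [3])    # artifact stays confined to channel 3

# Channel 0's naive CAR is shifted by roughly -500/8 by the bad channel;
# the safe version is not.
print(abs(naive[0].mean()), abs(safe[0].mean()))
```

In practice you would detect bad channels from impedance or signal-quality metrics rather than hard-coding an index, and interpolate them afterwards if your analysis needs the full montage.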
| Problem | What Happens | How to Mitigate |
|---|---|---|
| Few channels (under 32) | Each channel's signal leaks into all others. Strong signals get attenuated. Weak ghost signals appear where they shouldn't. | Consider REST or linked-ear reference. Or use CAR with awareness of the distortion. |
| Uneven electrode coverage | Average is biased toward over-represented regions. Topographic maps become distorted. | Use weighted average based on spatial distribution, or choose a different reference. |
| Bad channels in the montage | Artifact on one channel propagates to all channels through the average. | Reject or interpolate bad channels before applying CAR. |
| Strong focal source | Focal activity gets partially subtracted from its own channel and injected (inverted) into all others. | Use surface Laplacian if you need to isolate focal sources. |
What the Alternatives Look Like
Common average reference isn't the only game in town. If your recording setup or research question doesn't fit CAR's assumptions, you've got options.
Linked earlobes or mastoids
The classic approach. Place a reference electrode on each earlobe (or on the mastoid bone behind each ear), average them, and use that as the reference. The logic is that earlobes are relatively far from cortical generators, so they should be "quiet." In practice, they're not perfectly quiet, especially for temporal lobe activity. But for many applications, linked mastoids work fine.
REST (Reference Electrode Standardization Technique)
REST is the theoretically rigorous alternative. Developed by Dezhong Yao in 2001, it uses a mathematical model of the head (called a lead field matrix) to estimate what your EEG would look like if referenced to a point at infinity, a truly neutral point with zero potential.
The math is more involved than CAR. You need a forward model that accounts for electrode positions and head geometry. But the result is a reference that doesn't depend on how many electrodes you have or where they're placed. REST works well with both dense and sparse montages, which makes it particularly relevant for consumer EEG with limited channels.
Surface Laplacian (Current Source Density)
Surface Laplacian doesn't re-reference your data in the traditional sense. Instead, it estimates the radial current flowing in and out of the scalp at each electrode by computing the second spatial derivative of the voltage distribution. The result is reference-free: it doesn't depend on the original reference at all.
The Laplacian acts as a spatial high-pass filter. It removes broad, diffuse activity and sharpens local sources. This is fantastic if you want to isolate focal cortical generators. It's less useful if you care about widespread, synchronized activity (like large-scale network dynamics), because the Laplacian deliberately strips those out.
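To make the "spatial high-pass" point concrete, here is a toy nearest-neighbor (Hjorth-style) Laplacian. The neighbor map is invented for an 8-channel example, not any standard montage, and real surface Laplacian implementations (e.g. spherical splines in MNE-Python) are considerably more careful:

```python
import numpy as np

# Invented neighbor map for a toy 8-channel montage (an assumption,
# not a real electrode layout).
neighbors = {
    0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2],
    4: [5, 6], 5: [4, 7], 6: [4, 7], 7: [5, 6],
}

def hjorth_laplacian(data, neighbors):
    """Each channel minus the mean of its neighbors."""
    out = np.empty_like(data)
    for ch, nbrs in neighbors.items():
        out[ch] = data[ch] - data[nbrs].mean(axis=0)
    return out

diffuse = np.ones((8, 100))    # broad, identical activity on every channel
focal = np.zeros((8, 100))
focal[0] = 1.0                 # activity on a single channel only

# Diffuse activity cancels entirely; focal activity survives.
print(np.allclose(hjorth_laplacian(diffuse, neighbors), 0))
print(np.allclose(hjorth_laplacian(focal, neighbors)[0], 1.0))
```

This is exactly the trade-off described above: the diffuse signal is stripped out everywhere, while the focal source keeps its full amplitude at its own electrode.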

How They Compare at a Glance
| Reference Method | Best For | Channel Requirement | Key Limitation |
|---|---|---|---|
| Common Average (CAR) | Dense montages, ERP studies, topographic mapping | 32+ channels ideal | Signal leakage with few channels |
| Linked Mastoids | Clinical EEG, low-density setups, sleep staging | Any (just need 2 reference electrodes) | Mastoid picks up temporal activity |
| REST | Any montage density, cross-study comparison | Any (needs forward model) | Requires head model computation |
| Surface Laplacian | Isolating focal cortical sources, spatial sharpening | 19+ channels (needs spatial resolution) | Removes distributed/network activity |
The "I Had No Idea" Moment: Your Reference Choice Changes Your Results
Here's something that genuinely surprised me the first time I saw it demonstrated, and it's the reason this topic matters more than most developers think.
In 2017, a group of researchers took the same EEG dataset, the exact same raw recording from the exact same person doing the exact same task, and applied four different reference schemes: linked mastoids, common average, REST, and surface Laplacian. Then they analyzed the data for alpha power asymmetry, a metric used in hundreds of depression and emotion studies.
The results didn't just differ slightly. They produced opposite conclusions about which hemisphere was dominant. The same dataset, the same analysis pipeline, but a different reference, and the lateralization index flipped sign.
This isn't a weird edge case. A 2019 review in NeuroImage by Dong et al. found that reference choice significantly affected results in the majority of EEG studies they examined. Connectivity analyses were especially vulnerable: coherence and phase-locking values between two channels can look completely different depending on the reference.
The implication is straightforward: when you read an EEG paper or build a BCI pipeline, the reference scheme isn't just a preprocessing detail. It's a parameter that shapes the scientific conclusions. If you don't report it, your results aren't fully reproducible. If you don't understand it, you might be optimizing your model on distorted signals.
Common Average Reference With 8 Channels: A Practical Take
If you're building with the Neurosity Crown or any other consumer-grade headset, you're working with 8 channels. And you might be wondering: should I use CAR at all?
The honest answer is: it depends on what you're trying to do.
If you're computing power spectral density or band power: CAR can still help by removing shared line noise and common artifacts. The signal leakage issue is real but may not critically affect frequency-domain analyses where you're looking at power changes over time within each channel.
If you're doing connectivity analysis: Be careful. CAR with 8 channels creates artificial correlations between channels (because each channel now contains a piece of every other channel's signal). This inflates coherence estimates and can produce spurious connectivity patterns. Consider using REST or working with the data in its original reference.
If you're building a classifier (for BCI commands, mental state detection, etc.): Your machine learning model may actually learn to work with whatever reference you give it, as long as you're consistent between training and inference. Many successful BCI systems use CAR even with few channels because the classifier adapts to the transformed feature space. Just don't mix reference schemes between training and test data.
If you're comparing your results to published research: Match the reference scheme used in the papers you're referencing. If they used CAR with 64 channels and you're using CAR with 8, be aware that the two are not equivalent. The same re-referencing label doesn't guarantee the same signal characteristics.
Start here: What are you building?
- Band power / spectral analysis with 8 channels: CAR is reasonable. Remove bad channels first.
- ERP analysis with 8 channels: CAR is common but consider linked-ear if available. REST is better if you can implement the forward model.
- Connectivity / coherence: Avoid CAR with 8 channels. Use REST or analyze in original reference.
- Machine learning classifier: CAR is fine as long as you're consistent. Test whether CAR improves your model's accuracy compared to raw reference.
- Cross-study comparison: Match the reference used in the studies you're comparing against.
How to Apply CAR to Crown Data
If you're working with the Neurosity Crown's raw EEG data, applying common average reference is straightforward. The Crown streams data through its SDKs and through integrations like BrainFlow and Lab Streaming Layer (LSL).
Here's the basic workflow. Pull raw data from the Crown's 8 channels (at positions CP3, C3, F5, PO3, PO4, F6, C4, CP4). Check signal quality on all channels and exclude any with poor contact. Compute the mean voltage across the remaining good channels at each time point. Subtract that mean from every channel.
In a Python pipeline using BrainFlow or MNE-Python, this takes a few lines of code. MNE-Python even has a built-in function, set_eeg_reference('average'), that handles it automatically.
The key point: the Crown gives you the raw data. What you do with it, which reference you apply, which analysis you run, is entirely up to you. That flexibility is the difference between a consumer gadget that gives you a "focus score" and a real development platform that gives you the signal.
The Bigger Picture: References and the Future of Consumer EEG
Here's what's quietly exciting about all of this. The reference problem has been a live debate in academic EEG for decades. But until recently, it was entirely a concern for researchers with expensive, multi-channel systems. Consumer EEG users got preprocessed data with no choice about reference scheme.
That's changing. As consumer devices get more capable and developer communities grow, the people building EEG applications are increasingly people who want (and need) control over their signal processing pipeline. Understanding reference schemes isn't just academic anymore. It's a practical skill for anyone building with brain data.
And the tools are catching up. REST, which once required specialized MATLAB toolboxes and expert knowledge of forward modeling, now has open-source Python implementations. Surface Laplacian algorithms are available in MNE-Python. Common average reference is trivial to implement in any language. The barriers to doing this correctly are lower than they've ever been.
The question is no longer "can consumer EEG developers apply sophisticated reference schemes?" It's "do they know they should?"
Now you do.
What to Remember
The reference electrode problem in EEG is one of those topics that sounds like a minor technical detail until you realize it can flip your results upside down. Common average reference is popular for good reasons: it's simple, it removes shared noise, and it works well with dense montages. But it's not a universal solution. With few channels, it introduces distortion. With uneven coverage, it introduces bias. With bad channels, it spreads artifacts everywhere.
The right reference depends on your channel count, your electrode layout, your analysis goals, and what you're comparing your results against. There is no single best reference for all situations. But there is a worst approach: choosing one without understanding what it does to your data.
Whether you're building a BCI classifier, analyzing frequency band power, or running connectivity analyses on Crown data, the reference scheme is a first-class decision in your preprocessing pipeline. Treat it like one.

