What Is OpenNeuro and Why Should You Care?
The Largest Library of Brain Data on Earth Is Free. Most People Don't Know It Exists.
Right now, at this exact moment, there are over 1,000 brain imaging datasets sitting on a server, free for anyone to download. EEG recordings from hundreds of subjects. fMRI scans capturing neural activity during everything from watching movies to doing mental arithmetic. MEG data with millisecond precision. All of it organized, documented, and waiting for someone to do something interesting with it.
The repository is called OpenNeuro. And if you work with brain data in any capacity, whether you're a neuroscience researcher, a BCI developer, or someone training machine learning models on neural signals, it's one of the most valuable resources you've probably never used.
Here's what makes this genuinely remarkable. In most scientific fields, data hoarding is the norm. A lab spends two years collecting brain recordings from 50 subjects, publishes a paper, and then the data sits on a hard drive in someone's office until the drive fails. Other researchers who want to verify the findings, extend the work, or train models on similar data? They have to start from scratch. Recruit new subjects. Apply for new IRB approval. Spend months collecting what already exists somewhere else.
OpenNeuro was built to break that cycle. And it's working.
The Problem That Made OpenNeuro Necessary
Neuroscience has a dirty secret, and it's not the one you're thinking of. Yes, there's the replication crisis. Yes, sample sizes in brain imaging studies are often embarrassingly small. But the deeper problem is structural: the field has been sitting on mountains of valuable data that nobody else can access.
Think about it this way. Every year, thousands of labs around the world record brain activity from human subjects. fMRI studies, EEG experiments, MEG recordings. Each study follows its own conventions for how the data is organized. Different file formats. Different naming schemes. Different metadata structures. Even if a researcher wanted to share their data, there was no standardized way to do it. Uploading a folder of cryptically named .edf files to a university server doesn't count as sharing. It's just littering.
This created two cascading problems.
Problem one: wasted effort. A lab in Tokyo and a lab in Berlin might independently collect nearly identical EEG datasets because neither knows the other's data exists. Multiply this by thousands of labs worldwide, and the amount of duplicated work is staggering.
Problem two: small data in a big-data world. Machine learning models are hungry. A neural network trying to decode emotional states from EEG needs hundreds or thousands of recordings to generalize well. But most individual studies collect data from 20 to 40 subjects. That's nowhere near enough. The data exists across labs, in aggregate. It just isn't accessible in aggregate.
OpenNeuro's founders, led by Russell Poldrack at Stanford, looked at this landscape and asked a straightforward question: what if there were a single place where any researcher could upload their brain data, in a standardized format, and anyone else could download it for free?
The answer is a repository that now hosts over 1,000 datasets covering more than 50,000 participants across every major brain imaging modality.
What OpenNeuro Actually Is (And Isn't)
Let's get specific. OpenNeuro is a free, web-based platform for sharing brain imaging data. It lives at openneuro.org. You can browse and download datasets without an account. You need a free account to upload.
Here's what it is:
- A storage and distribution platform for brain imaging datasets
- Strictly BIDS-formatted (we'll get to why that matters in a minute)
- Modality-agnostic: it accepts EEG, fMRI, MEG, iEEG, PET, and structural MRI
- Funded primarily by the National Institute of Mental Health (NIMH)
- Built on top of DataLad and git-annex for version-controlled data management
- Free. Completely, permanently free
Here's what it is not:
- An analysis platform (you download data and analyze it with your own tools)
- A data marketplace (there are no premium tiers, no paid datasets)
- A preprint server (it hosts data, not papers, though datasets link to associated publications)
- A replacement for domain-specific repositories like PhysioNet or the Human Connectome Project (those serve different purposes and different communities)
| At a Glance | |
|---|---|
| URL | openneuro.org |
| Datasets | Over 1,000 publicly available |
| Participants | More than 50,000 across all datasets |
| Modalities | EEG, fMRI, MEG, iEEG, PET, structural MRI |
| Data format | BIDS (Brain Imaging Data Structure) required |
| Cost | Free for uploads and downloads |
| License | Most datasets use CC0 (public domain) or CC-BY |
| Funding | NIH/NIMH grants |
The BIDS Requirement: Why It's the Best Thing About OpenNeuro
If OpenNeuro were just a file server where people dump brain data in whatever format they feel like, it would be useless. What makes it actually work is a single, ruthless requirement: every dataset must conform to the Brain Imaging Data Structure, or BIDS.
BIDS is a community standard that specifies exactly how brain imaging data should be organized. Folder hierarchy. File naming conventions. Metadata format. Everything. And "exactly" is doing serious work in that sentence.
Here's what a BIDS-formatted EEG dataset looks like:
A typical BIDS-EEG dataset follows this structure: the top level contains a dataset_description.json, a participants.tsv file, and a folder for each subject (sub-01, sub-02, etc.). Inside each subject folder, there's an eeg/ directory containing the raw EEG recording, a channels.tsv describing each channel, an events.tsv listing experimental events with timestamps, and a sidecar JSON with recording parameters like sampling rate and reference scheme.
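Sketched as a directory tree (the dataset name, task label, and file format are placeholders; any supported EEG format follows the same layout):

```text
my_dataset/
├── dataset_description.json
├── participants.tsv
├── sub-01/
│   └── eeg/
│       ├── sub-01_task-oddball_eeg.edf
│       ├── sub-01_task-oddball_eeg.json
│       ├── sub-01_task-oddball_channels.tsv
│       └── sub-01_task-oddball_events.tsv
└── sub-02/
    └── eeg/
        └── ...
```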
This might look like bureaucratic overhead if you've never tried to work with someone else's brain data. But if you have, you know it's the difference between "I can start analyzing this in ten minutes" and "I spent three days figuring out which column is which channel and what the event codes mean."
BIDS solves the metadata problem that has plagued neuroscience data sharing for decades. When you download a BIDS dataset from OpenNeuro, you don't need to email the original researcher to ask what their channel labels mean, or what sampling rate they used, or how the experimental trials were structured. It's all there, in standardized files, in standardized locations, using standardized nomenclature.
And here's the part that really matters for developers: because BIDS is machine-readable, you can write code that works on any BIDS dataset without modification. Load one OpenNeuro dataset, and your pipeline works on all of them. This is what makes OpenNeuro genuinely useful for ML training, rather than being a place where you download 50 datasets and spend 50 hours writing 50 different parsers.
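To make that concrete, here is a minimal sketch of a discovery routine, using only the Python standard library, that works unchanged on any BIDS dataset, because subject folders and EEG file names follow the same pattern everywhere:

```python
from pathlib import Path

def index_bids_eeg(root):
    """Map each subject ID to its EEG recordings in a BIDS dataset.

    Works on any BIDS-formatted dataset without modification, because
    the layout (sub-*/eeg/sub-*_task-*_eeg.*) is part of the standard.
    """
    index = {}
    for sub_dir in sorted(Path(root).glob("sub-*")):
        # EEG recordings live in <subject>/eeg/ and embed the task name
        recordings = sorted((sub_dir / "eeg").glob("sub-*_eeg.*"))
        if recordings:
            index[sub_dir.name] = [r.name for r in recordings]
    return index
```

Point it at any downloaded OpenNeuro EEG dataset and it returns the same subject-to-recordings mapping, which is exactly the property that makes multi-dataset pipelines cheap to build.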
What's Actually in There? A Tour of the Data
OpenNeuro's collection spans every major brain imaging modality. But the distribution isn't even, and knowing what's available (and how much of it exists) will save you time.
| Modality | Approximate Datasets | Common Tasks | Typical File Formats |
|---|---|---|---|
| fMRI | 600+ | Resting state, visual perception, language, emotion, decision-making | NIfTI (.nii, .nii.gz) |
| EEG | 250+ | P300 paradigms, motor imagery, attention, sleep staging, emotion | BrainVision (.vhdr), EDF (.edf), EEGLAB (.set) |
| MEG | 50+ | Sensory processing, auditory, language, motor | CTF (.ds), Elekta (.fif) |
| iEEG | 30+ | Epilepsy monitoring, speech decoding, memory | EDF, BrainVision |
| PET | 10+ | Receptor mapping, metabolism | NIfTI |
| MRI (structural) | Included in most fMRI datasets | Anatomy, parcellation | NIfTI |
A few observations worth highlighting.
fMRI dominates. This isn't surprising. fMRI has been the workhorse of cognitive neuroscience for three decades, and the push for open data started in the fMRI community before spreading to other modalities. If you're looking for fMRI data, OpenNeuro is one of the richest sources on the planet.
EEG is growing fast. The EEG collection has expanded significantly since the BIDS-EEG extension was finalized. You'll find everything from classic P300 oddball paradigms to motor imagery datasets (useful for BCI training), to resting-state recordings, to full sleep studies. The quality varies. Some datasets have 256 channels recorded on research-grade systems. Others have 4 to 8 channels from consumer devices. Both are valuable for different purposes.
MEG is underrepresented. MEG systems cost millions of dollars and live in magnetically shielded rooms, so there are simply fewer labs generating MEG data. The datasets that are on OpenNeuro tend to be high quality and well documented.
Here's the "I had no idea" moment for most people who explore OpenNeuro for the first time: the sheer variety of experimental paradigms represented. There's a dataset where subjects listened to audiobooks while being scanned. One where they played a gambling task. One where they looked at faces displaying different emotions. One where they imagined moving their hands. One where they were just resting with their eyes closed. Each of these represents months of data collection, ethics approvals, and careful experimental design. And it's all free to download and use.

How to Actually Use OpenNeuro: A Practical Walkthrough
Browsing OpenNeuro is simple. Using it effectively takes a bit more knowledge. Here's how to go from "I want brain data" to "I'm training a model on brain data" without losing a weekend to format wrangling.
Finding Datasets
The OpenNeuro web interface lets you search and filter by modality, number of subjects, and keywords. But here's a practical tip: the search is basic. If you know the type of data you need, you're better off browsing the OpenNeuro datasets page and filtering by modality first, then scanning descriptions manually.
Each dataset page shows:
- A description of the experiment
- Number of subjects and sessions
- Modality and task information
- Dataset size
- Associated publications (if any)
- A file browser showing the full BIDS tree
- Download options
Downloading Data
You have three options for getting data off OpenNeuro:
Option 1: Web download. Click the download button. This works for small datasets but becomes impractical beyond a few gigabytes, which most fMRI datasets exceed easily.
Option 2: DataLad. OpenNeuro is built on DataLad, a version control system for data. If you install DataLad, you can clone any dataset like a git repository and selectively download only the files you need. This is the recommended approach for large datasets.
Option 3: AWS S3. All OpenNeuro datasets are mirrored to Amazon S3. If you're working in the cloud or need programmatic access, you can pull data directly from S3 using the AWS CLI or any S3-compatible tool.
Install DataLad (pip install datalad), then run datalad clone with the dataset URL from OpenNeuro. This creates a lightweight clone with metadata but no actual data files. Then use datalad get to download specific files or directories. For example, to grab only the EEG data for subject 01, you would point datalad get at the sub-01/eeg/ directory. This selective downloading means you never have to pull an entire 50GB dataset just to look at one participant.
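The steps above can be scripted. The helper below only builds the DataLad command lines (the accession number is a placeholder); actually running them, for example via subprocess.run, requires DataLad to be installed and uses the OpenNeuroDatasets GitHub mirrors that DataLad clones from:

```python
def datalad_fetch_plan(dataset_id, paths=()):
    """Build DataLad commands for a selective OpenNeuro download.

    dataset_id: an OpenNeuro accession number, e.g. "ds003061" (placeholder).
    paths: files or directories to actually fetch, e.g. ("sub-01/eeg/",).
    """
    url = f"https://github.com/OpenNeuroDatasets/{dataset_id}.git"
    # clone is lightweight: metadata and file tree only, no data content
    commands = [["datalad", "clone", url, dataset_id]]
    for p in paths:
        # get downloads just the requested files or directories
        commands.append(["datalad", "get", "-d", dataset_id, p])
    return commands

# Example: clone the dataset, then fetch only subject 01's EEG directory
plan = datalad_fetch_plan("ds003061", paths=("sub-01/eeg/",))
```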
Loading Data Into Your Analysis Pipeline
Once you've downloaded a BIDS dataset, loading it into your tools is straightforward because BIDS was designed for this.
MNE-Python has a dedicated BIDS reader (mne-bids). Call read_raw_bids() with the BIDS path, and it automatically reads the EEG data, applies the correct channel names and types from the BIDS metadata, and loads event information from the events.tsv file. Three lines of code to go from OpenNeuro to analysis-ready data.
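As a sketch, assuming mne-bids is installed (the dataset root, subject, and task name below are placeholders for whatever dataset you downloaded):

```python
def load_openneuro_eeg(bids_root, subject, task):
    """Load one subject's EEG from a BIDS dataset via MNE-BIDS.

    Channel names, channel types, and events are read from the BIDS
    sidecar files automatically; no manual metadata wrangling needed.
    """
    from mne_bids import BIDSPath, read_raw_bids  # pip install mne-bids

    bids_path = BIDSPath(root=bids_root, subject=subject,
                         task=task, datatype="eeg")
    raw = read_raw_bids(bids_path)  # returns an mne.io.Raw object
    return raw

# e.g. raw = load_openneuro_eeg("ds003061", subject="01", task="oddball")
```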
EEGLAB supports BIDS through the bids-matlab-tools plugin. It can import an entire BIDS dataset and create an EEGLAB STUDY structure for group-level analysis across all subjects.
BrainFlow doesn't read BIDS directly; it's an acquisition library, not an analysis tool. Once data is recorded and saved in BIDS format (using MNE-BIDS or bids-matlab-tools), it enters the analysis pipeline through MNE-Python or EEGLAB.
PyTorch and TensorFlow. For ML applications, you'll typically use MNE-Python to load and preprocess the BIDS data, then convert it to NumPy arrays or tensors for your training pipeline. Libraries like braindecode (built on MNE-Python and PyTorch) streamline this for neural decoding tasks.
Training Machine Learning Models on OpenNeuro Data
This is where OpenNeuro goes from "useful academic resource" to "genuinely exciting for builders."
The fundamental bottleneck in brain-computer interface development has never been algorithms. It's been data. You can build the most elegant neural decoder in the world, but if you train it on 20 subjects from a single lab with a single hardware setup, it won't generalize. It'll learn the quirks of that specific lab's recording environment instead of learning actual neural signatures.
OpenNeuro changes this equation.
By pooling EEG datasets from dozens of independent studies, you can train models on data collected across different hardware, different labs, different countries, and different experimental setups. This forces your model to find signal patterns that are genuinely neural rather than artifactual. It's the brain-data equivalent of training an image classifier on photos taken with many different cameras under many different lighting conditions, instead of one camera in one room.
Here's a concrete example. Say you're building a motor imagery classifier for a BCI application. On OpenNeuro, you can find at least a dozen motor imagery EEG datasets. Some use 64 channels. Some use 32. Some use different amplifiers, different electrode placements, different task instructions. If your model can learn to classify "imagined left hand movement" across all of these datasets, you've built something that has a real shot at working on new subjects with new hardware.
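The aggregation step itself can be sketched in a few lines, assuming each dataset has already been epoched into arrays of shape (trials, channels, samples) with matching epoch lengths and known channel names; heterogeneous montages are reconciled by restricting every dataset to a shared channel subset:

```python
import numpy as np

def pool_datasets(datasets, shared_channels):
    """Combine epoched EEG from heterogeneous datasets into one training set.

    datasets: list of (X, y, channel_names) tuples, X shaped
              (n_trials, n_channels, n_samples).
    shared_channels: channel names present in every dataset, e.g. ["C3", "C4"].
    """
    xs, ys = [], []
    for X, y, names in datasets:
        # Keep only the channels common to all datasets, in a fixed order
        idx = [names.index(ch) for ch in shared_channels]
        xs.append(X[:, idx, :])
        ys.append(np.asarray(y))
    return np.concatenate(xs), np.concatenate(ys)
```

In practice you would also resample to a common rate and normalize per dataset before pooling, but channel alignment is the part that trips people up first.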
| ML Application | Useful OpenNeuro Data | Why It Matters |
|---|---|---|
| Motor imagery BCI | Multiple motor imagery EEG datasets with left/right/feet paradigms | Cross-dataset training produces models that generalize across hardware and subjects |
| Emotion detection | EEG datasets with emotional stimuli (images, video, music) | Larger training sets reduce overfitting to individual affective responses |
| Sleep staging | Full-night polysomnography and EEG sleep recordings | Clinical-grade sleep data without recruiting subjects or running sleep labs |
| Attention monitoring | Sustained attention task EEG datasets | Train focus/distraction classifiers on diverse attentional data |
| Neural decoding | fMRI and EEG datasets with rich stimulus sets | Build decoders that reconstruct perceived or imagined content |
The BIDS format is what makes this aggregation practical. Because every dataset uses the same structure, you can write one data loader that ingests any OpenNeuro EEG dataset. You don't need custom parsing code for each study. This is a subtle point, but it's the difference between "I could theoretically combine 12 datasets" and "I actually did combine 12 datasets last weekend."
How to Contribute Your Own Data
Here's where open science stops being something you benefit from and becomes something you participate in. Contributing data to OpenNeuro is straightforward, but the BIDS formatting step is where most people get stuck.
Step 1: Format Your Data as BIDS
This is the non-negotiable part. OpenNeuro runs a BIDS validator on every upload and rejects anything that doesn't pass. For EEG data, the essential components are:
- Raw EEG files in a supported format (BrainVision, EDF, or EEGLAB .set)
- A dataset_description.json with your dataset name, BIDS version, and license
- A participants.tsv listing subject demographics
- Channel descriptions in *_channels.tsv files
- Event markers in *_events.tsv files
- Recording parameters in sidecar JSON files
The fastest path from raw EEG to valid BIDS is MNE-BIDS, a Python library specifically designed for this conversion. If you're recording with BrainFlow (which supports the Neurosity Crown and 30+ other devices), you can pipe your data through MNE-Python into MNE-BIDS and get a valid BIDS directory in a handful of lines.
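A hedged sketch of that conversion, assuming brainflow, mne, and mne-bids are installed (the channel names, paths, and task label are placeholders you'd fill in for your device and experiment):

```python
def brainflow_to_bids(board_data, eeg_rows, sfreq, ch_names,
                      bids_root, subject, task):
    """Convert a BrainFlow data buffer into a BIDS-valid EEG dataset.

    board_data: the 2-D array returned by BoardShim.get_board_data().
    eeg_rows:   row indices of the EEG channels in that array
                (from BoardShim.get_eeg_channels(board_id)).
    """
    import mne
    from mne_bids import BIDSPath, write_raw_bids

    eeg = board_data[eeg_rows] / 1e6  # BrainFlow yields microvolts; MNE expects volts
    info = mne.create_info(ch_names, sfreq, ch_types="eeg")
    raw = mne.io.RawArray(eeg, info)
    bids_path = BIDSPath(root=bids_root, subject=subject,
                         task=task, datatype="eeg")
    # Preloaded in-memory data must be written out in an explicit format
    write_raw_bids(raw, bids_path, format="BrainVision",
                   allow_preload=True, overwrite=True)
    return bids_path
```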
Step 2: Validate Locally
Before uploading, run the BIDS Validator on your formatted dataset. It's available as a web app, a command-line tool (npm install -g bids-validator), and a Docker container. Fix every error. Warnings are worth reading too, since they often point to missing metadata that would make your dataset more useful to others.
Step 3: Upload to OpenNeuro
Create a free account on openneuro.org. Click "Upload Dataset." Drag your BIDS directory into the browser. OpenNeuro runs the validator again on their side. If it passes, your dataset gets a unique accession number (like ds004893) and becomes browsable immediately. You can keep it private initially and publish it when you're ready.
Step 4: Associate with a Publication
If your dataset is linked to a paper, add the DOI. This creates a bidirectional link between the data and the publication. Reviewers increasingly expect this, and some journals (like Scientific Data) require data deposits as a condition of publication.
If you're recording EEG with the Crown and want to share data on OpenNeuro, the workflow is: record using the Neurosity SDK or BrainFlow, export to a standard format, convert to BIDS using MNE-BIDS (setting the channel positions to CP3, C3, F5, PO3, PO4, F6, C4, CP4), validate with the BIDS Validator, and upload. The Crown's 256 Hz sampling rate and standard 10-20 electrode positions make BIDS conversion straightforward.
Where OpenNeuro Fits in the Larger Data Ecosystem
OpenNeuro is not the only place to find brain data. It's worth understanding how it relates to other repositories so you look in the right place.
| Repository | Focus | Data Types | Access Model |
|---|---|---|---|
| OpenNeuro | General brain imaging, BIDS-only | EEG, fMRI, MEG, iEEG, PET | Free, open access |
| PhysioNet | Physiological signals, clinical data | EEG, ECG, EMG, clinical records | Free, some datasets require credentialing |
| Human Connectome Project | Brain connectivity atlases | fMRI, diffusion MRI, MEG | Free with registration |
| UK Biobank | Large-scale population health | fMRI, structural MRI, genetics, health data | Application required, fees for access |
| BNCI Horizon 2020 | BCI-specific datasets | EEG (motor imagery, P300, SSVEP) | Free, direct download |
| MOABB | BCI benchmarking | EEG from multiple BCI paradigms | Free, Python API for direct loading |
For BCI developers specifically, here's a useful rule of thumb: OpenNeuro for diverse, well-documented datasets. MOABB for standardized BCI benchmarking. PhysioNet for clinical EEG (like seizure detection or sleep scoring). BNCI Horizon for classic BCI paradigm data.
The overlap between these repositories is growing, and that's a good thing. Some datasets exist in multiple locations. But OpenNeuro's strict BIDS requirement means its datasets tend to be better documented and more immediately usable than data found elsewhere.
The Open Science Connection That Most People Miss
There's a philosophical thread running through OpenNeuro that connects directly to how the best tools and platforms in neurotechnology are being built right now.
The BIDS specification is open. OpenNeuro is open. The analysis tools that work with BIDS data (MNE-Python, EEGLAB) are open-source. The acquisition tools that feed data into these pipelines (BrainFlow, Lab Streaming Layer) are open-source. The Neurosity SDK, which connects the Crown to this entire ecosystem, is open-source.
This isn't coincidence. It's architecture. Open science works because open tools create an interconnected ecosystem where data flows without friction. When your EEG device speaks BrainFlow, and BrainFlow speaks MNE-BIDS, and MNE-BIDS speaks OpenNeuro, and OpenNeuro speaks back to every analysis tool on the planet, you've got a pipeline with zero proprietary choke points.
That matters for reproducibility. If someone downloads your OpenNeuro dataset and processes it with the same open-source tools you used, they should get the same results. No "well, we used our proprietary algorithm" black boxes. No "you need a license for this specific software version" gatekeeping.
And it matters for progress. Every dataset uploaded to OpenNeuro becomes a building block for future work. The motor imagery data you contribute today might train the BCI decoder that helps a locked-in patient communicate next year. The resting-state EEG you share might become part of a dataset that reveals a new biomarker for early-stage neurodegeneration. You can't predict what your data will be used for. That's the point.
Getting Started Today
If you've read this far and want to actually do something with OpenNeuro this week, here are three concrete starting points based on what you're building.
If you're training ML models: Search OpenNeuro for EEG datasets matching your target paradigm. Download two or three using DataLad. Load them with MNE-BIDS into a shared preprocessing pipeline. Combine the processed data into a single training set. The BIDS consistency across datasets will save you more time than you expect.
If you're doing research: Before collecting new data, check whether a suitable dataset already exists on OpenNeuro. Even if it's not a perfect match, it might serve as a pilot dataset to validate your analysis pipeline before you invest months in data collection. And when your study is done, upload your data. It's increasingly expected by journals and funding agencies.
If you're building with the Neurosity Crown: Record EEG data using the Neurosity SDK or BrainFlow. Convert it to BIDS using MNE-BIDS. Upload it to OpenNeuro. You'll be contributing consumer-grade EEG data to an ecosystem that has historically been dominated by lab-grade equipment. That diversity makes the whole repository more valuable, because it pushes analysis tools and ML models to handle real-world recording conditions, not just idealized lab settings.
The brain is the most complex object we've ever tried to understand. No single lab, no single company, no single dataset is going to crack it. But a thousand datasets, organized consistently, accessible freely, analyzed openly? That's how you build a map of something this complicated. One shared recording at a time.

