EEG Seizure Detection
Multi-Architecture Benchmark on Pediatric EEG

Problem
Most published seizure-detection models report numbers for a single architecture or a single subject, leaving a critical question unanswered: which model class actually generalizes across patients under realistic conditions?
Approach
Built a unified preprocessing pipeline on the CHB-MIT corpus (24 patients, 916 hours of recording) using MNE for filtering, artifact handling, and windowing, so every architecture trains on byte-identical inputs.
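A minimal sketch of what that stage looks like with standard MNE calls. The band edges, 60 Hz notch, and 4 s window length here are placeholder values for illustration, not the project's actual settings:

```python
import mne

def preprocess(edf_path, l_freq=0.5, h_freq=40.0, win_sec=4.0):
    """Load one CHB-MIT recording, band-pass filter, and cut fixed-length windows."""
    raw = mne.io.read_raw_edf(edf_path, preload=True, verbose="error")
    raw.filter(l_freq=l_freq, h_freq=h_freq)  # band-pass to the EEG band of interest
    raw.notch_filter(60.0)                    # mains interference (CHB-MIT is US data)
    epochs = mne.make_fixed_length_epochs(
        raw, duration=win_sec, overlap=0.0, preload=True
    )
    return epochs.get_data()  # shape: (n_windows, n_channels, n_samples)
```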
Benchmarked 15+ architectures across four families: LSTM/GRU recurrent, Transformer, Mamba/state-space, and Mixture-of-Experts, using patient-disjoint splits to measure cross-subject generalization rather than memorization.
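Patient-disjoint splitting reduces to grouped cross-validation with the patient ID as the group key. A hedged sketch using scikit-learn's GroupKFold; `windows`, `labels`, and `patient_ids` are assumed per-window arrays from the preprocessing step:

```python
import numpy as np
from sklearn.model_selection import GroupKFold

def patient_disjoint_folds(windows, labels, patient_ids, n_splits=5):
    """Yield train/test index pairs in which no patient appears on both sides."""
    patient_ids = np.asarray(patient_ids)
    gkf = GroupKFold(n_splits=n_splits)
    for train_idx, test_idx in gkf.split(windows, labels, groups=patient_ids):
        # Sanity check: the train and test patient sets must be disjoint.
        assert not set(patient_ids[train_idx]) & set(patient_ids[test_idx])
        yield train_idx, test_idx
```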
Containerized training with version-pinned environments and tracked every run. Swapping an architecture is a single config change, and a full re-evaluation is reproducible end-to-end.
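One way the "one config change" idea can be wired up is a registry that maps architecture names to constructors, so a run is fully specified by its config. The registry contents, class names, and config keys below are hypothetical stand-ins for the real training code:

```python
MODEL_REGISTRY = {}

def register(name):
    """Decorator that adds a model class to the registry under `name`."""
    def deco(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return deco

@register("lstm")
class LSTMDetector:
    def __init__(self, **kwargs):
        self.kwargs = kwargs  # placeholder; real model construction goes here

@register("transformer")
class TransformerDetector:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

def build_model(config):
    """Instantiate whichever architecture the config names."""
    return MODEL_REGISTRY[config["arch"]](**config.get("arch_kwargs", {}))

# Swapping architectures is then a one-line config edit:
# config = {"arch": "transformer", "arch_kwargs": {"n_layers": 4}}
```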
Results
AUROC 0.740, best of 15+ architectures
- AUROC 0.740 (best family, patient-disjoint evaluation)
- 15+ architectures benchmarked under identical preprocessing
- 916 hours of CHB-MIT EEG processed across 24 pediatric subjects
- Reproducible: any run can be re-executed from the locked environment
Stack
Python · MNE · containerized, version-pinned environments · per-run experiment tracking
What I learned
Architecture choice matters less than preprocessing rigor and patient-disjoint evaluation. Several models that look state-of-the-art on shuffled splits collapse on held-out patients. The infrastructure to run a fair comparison is the actual scientific contribution; the model rankings are downstream of that.
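A hedged sketch of the protocol gap described above: the same training routine scored under a shuffled window split (which leaks every patient into both sides) versus a patient-disjoint split. `fit_and_auroc` and the data arrays are assumed stand-ins for the real training loop:

```python
import numpy as np
from sklearn.model_selection import train_test_split, GroupShuffleSplit

def shuffled_auroc(fit_and_auroc, X, y, seed=0):
    # Window-level shuffle: the same patient contributes to train and test,
    # so a model can score well by memorizing per-patient signal statistics.
    tr, te = train_test_split(np.arange(len(y)), test_size=0.2, random_state=seed)
    return fit_and_auroc(X[tr], y[tr], X[te], y[te])

def disjoint_auroc(fit_and_auroc, X, y, patient_ids, seed=0):
    # Patient-disjoint: test patients are entirely unseen during training,
    # which is what deployment on a new patient actually looks like.
    gss = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=seed)
    tr, te = next(gss.split(X, y, groups=patient_ids))
    return fit_and_auroc(X[tr], y[tr], X[te], y[te])
```

The gap between these two numbers for the same model is exactly the "collapse" described above.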