
DOC: Add epoch quality example #13710

Open
aman-coder03 wants to merge 8 commits into mne-tools:main from aman-coder03:enh-epoch-score-quality

Conversation

@aman-coder03
Contributor

Reference issue (if any)

Closes #13676

What does this implement/fix?

Adds a score_quality() method to Epochs that scores each epoch on a 0 to 1 scale based on how much of an outlier it is relative to the rest of the recording. It combines peak-to-peak amplitude, variance, and kurtosis, each z-scored robustly using the median absolute deviation, and introduces no new dependencies.
The idea is to give users a quick, data-driven starting point before calling drop_bad(), instead of guessing thresholds from scratch. It's not trying to replace autoreject, just to fill the gap for users who want something lightweight and built-in.
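A minimal sketch of the scoring idea described above, using a random array in place of real epochs data. The exact feature weighting and the squashing to [0, 1] here are my assumptions, not necessarily what the PR implements:

```python
import numpy as np
from scipy.stats import kurtosis


def mad_zscore(x):
    # Robust z-score: center by the median, scale by the MAD
    # (1.4826 makes the MAD consistent with the SD for Gaussian data).
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return (x - med) / (1.4826 * mad + 1e-12)


def score_epoch_quality(data):
    """Outlier score in [0, 1] per epoch; higher = more outlying.

    data : ndarray, shape (n_epochs, n_channels, n_times)
    """
    ptp = np.ptp(data, axis=-1).mean(axis=-1)     # peak-to-peak per epoch
    var = data.var(axis=-1).mean(axis=-1)         # variance per epoch
    kurt = kurtosis(data, axis=-1).mean(axis=-1)  # kurtosis per epoch
    feats = np.column_stack([ptp, var, kurt])
    # Equal-weight average of absolute robust z-scores (an assumption),
    # then normalize so the worst epoch scores 1.0.
    z = np.abs(np.apply_along_axis(mad_zscore, 0, feats)).mean(axis=1)
    return z / (z.max() + 1e-12)


# Synthetic demo: 20 epochs, one made artifactual by scaling its amplitude.
rng = np.random.default_rng(0)
data = rng.standard_normal((20, 4, 100))
data[3] *= 10  # epoch 3 is a strong outlier
scores = score_epoch_quality(data)
```

With this setup, epoch 3 gets the highest score, illustrating the intended "which epoch stands out" behavior.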

Additional information

Happy to adjust the API or scoring logic based on feedback. The main open question from the issue is whether suggest_reject=True is worth adding; I've left it out for now to keep the initial PR focused.

@CarinaFo
Contributor

CarinaFo commented Mar 2, 2026

Hi,
I fully agree that this is a neat feature, but I am not sure about the use case.

I intuitively thought about the reject parameter in the epochs class. Here epochs are being rejected based on maximum peak-to-peak signal amplitude (PTP).

From my experience, most users play around with this threshold to get a feeling for the number of epochs being rejected. The function you implemented can inform the user about epoch quality based on PTP etc., but I don't think it will be useful for informing a threshold for the epochs reject parameter or for autoreject.

It seems to me that it adds a layer of abstraction on rejection of noisy epochs, but maybe I misunderstood the use case you had in mind?

@aman-coder03
Contributor Author

thanks for the feedback @CarinaFo
You are right that the use case isn't clear enough. The score isn't meant to directly inform the reject= threshold (since those are in physical units like µV and the score is just a unitless 0–1 ranking). It's more of an exploratory tool, a quick way to see which epochs stand out before deciding what to do with them, without having to scroll through everything manually or set up autoreject.

Think of it as answering "which epochs should I look at first?" rather than "what threshold should I use?". Especially handy for large datasets where manual inspection isn't realistic.
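The "which epochs should I look at first?" triage could look something like this. A sketch with hypothetical scores standing in for whatever metric was computed:

```python
import numpy as np

# Hypothetical quality scores for 8 epochs (0-1, higher = more outlying);
# in practice these would come from whatever metric you computed.
scores = np.array([0.1, 0.9, 0.2, 0.05, 0.7, 0.15, 0.3, 0.6])

# Rank epochs from most to least suspicious, then inspect the worst few
# first (e.g. with epochs[top3].plot() in MNE; not run here).
worst = np.argsort(scores)[::-1]
top3 = worst[:3]
print(top3)  # → [1 4 7]
```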

Happy to make this clearer in the docstring if that helps. And if the general feeling is that this doesn't add enough on top of what's already there, I'm open to that too

@tsbinns
Contributor

tsbinns commented Mar 2, 2026

I agree with @CarinaFo's comment about the abstraction. Aggregating PTP, variance, and kurtosis like this gives a score for which I wouldn't have an intuition about what counts as good or bad.

Interpretation is also a bit tricky since these are z-scored values: if you have a lot of bad epochs in your data, they might not show up as strong outliers in the scoring. It also means that a fixed 'bad score' cutoff defining which epochs are worth inspecting could flag epochs of very different quality for each recording.

Is this scoring method coming from a paper? Are there some examples of how common artefacts would impact these scores? E.g., in the autoreject paper, they have some examples of what this catches and benchmarking for comparisons to other cleaning methods.

@larsoner
Member

larsoner commented Mar 2, 2026

Moreover I think there are a lot of potential statistics/ways of deciding what is good and bad. autoreject has one published, packaged way of doing it. Given the potential ways of thinking about this problem, I'm not sure we want to add another that isn't "established best practice" at least in some subset of the (published) community.

One option would be to show some of these stats for the sample or some other dataset and put them in an example instead. That way people can adapt the metrics to their own needs, even potentially easily adding their own once they see how it's done.

@aman-coder03
Contributor Author

On the paper question: the features aren't arbitrary. Kurtosis and variance z-scored across epochs come from Delorme et al. (2007) and FASTER (Nolan et al., 2010), both fairly established in the EEG artifact detection literature. You're both right that the weighting, and what counts as a "bad" score, is still dataset dependent, which is a real limitation.
@larsoner's suggestion of an example makes a lot more sense given that. It shows the approach without implying there's one universal way to do it, and lets users adapt the metrics to their own data.
@larsoner should I convert this PR into an example instead, or would you prefer I close this and open a fresh one?
Also, would "Exploring epoch quality before rejection" fit in the preprocessing examples section, or is there a better place for it?

@larsoner
Member

larsoner commented Mar 2, 2026

would "Exploring epoch quality before rejection" fit in the preprocessing examples section

Yeah I think a new example in

https://github.com/mne-tools/mne-python/tree/main/examples/preprocessing

could make sense. And adding refs to FASTER etc. could be good, too

@larsoner
Member

larsoner commented Mar 2, 2026

(you can push commits to this PR to do this and edit the title, no need for a separate PR, but can do it that way instead if you want)

@aman-coder03 changed the title from "ENH: Add Epochs.score_quality() for data-driven epoch quality scoring" to "ENH: Add example for exploring epoch quality before rejection" on Mar 3, 2026
@aman-coder03
Contributor Author

@larsoner I have pushed the example to this PR along with refs to FASTER (Nolan et al., 2010) and Delorme et al. (2007). Let me know if anything needs adjusting!

@aman-coder03 changed the title from "ENH: Add example for exploring epoch quality before rejection" to "ENH: Add Epochs.score_quality() for epoch outlier scoring with preprocessing example" on Mar 3, 2026
@tsbinns
Contributor

tsbinns commented Mar 3, 2026

On the paper question: the features aren't arbitrary. Kurtosis and variance z-scored across epochs come from Delorme et al. (2007) and FASTER (Nolan et al., 2010)

Ah fair enough, didn't realise that's what they were using. Sorry for the confusion!

@larsoner
Member

larsoner commented Mar 4, 2026

I think instead of a score_quality function, we can/should just show how to compute some of these things in the example. So no new Epochs method, "just" an example that does epochs.get_data() and then computes some measures of interest
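That example-only approach might look roughly like this. A sketch with a random array standing in for the output of epochs.get_data(); the measure names and aggregation are illustrative:

```python
import numpy as np
from scipy.stats import kurtosis

# Stand-in for `epochs.get_data()`, shape (n_epochs, n_channels, n_times);
# in the real example this array would come from an mne.Epochs instance.
rng = np.random.default_rng(42)
data = rng.standard_normal((5, 3, 200))

# Per-epoch summary measures, averaged across channels -- no new Epochs
# method needed, just plain NumPy/SciPy on the data array.
measures = {
    "ptp": np.ptp(data, axis=-1).mean(axis=-1),
    "variance": data.var(axis=-1).mean(axis=-1),
    "kurtosis": kurtosis(data, axis=-1).mean(axis=-1),
}
for name, vals in measures.items():
    print(f"{name}: {np.round(vals, 2)}")
```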

@aman-coder03
Contributor Author

Updated the PR @larsoner!

Member

@larsoner larsoner left a comment


@tsbinns want to review and merge when you're happy?

@larsoner changed the title from "ENH: Add Epochs.score_quality() for epoch outlier scoring with preprocessing example" to "ENH: Add epoch quality example" on Mar 6, 2026
@larsoner changed the title from "ENH: Add epoch quality example" to "DOC: Add epoch quality example" on Mar 6, 2026
@CarinaFo
Contributor

CarinaFo commented Mar 8, 2026

Very nice example. I think this would be a great starting point for a how-to guide. It is mainly a bit of restructuring of the way the example is introduced to the user.

@aman-coder03
Contributor Author

Thanks @CarinaFo, that's a helpful distinction. The current example reads more like a tutorial, walking through everything from scratch. A how-to guide should assume the user already knows MNE and just wants to solve a specific problem: "I have epochs, how do I quickly identify the bad ones before rejection?"

I can restructure it to be more task oriented: drop the hand-holding, lead with the goal, and let the code speak for itself. Should I keep it in examples/preprocessing or move it to a how-to section if one exists?

@CarinaFo assigned and then unassigned CarinaFo on Mar 8, 2026
Comment on lines +17 to +24
References
----------
.. [1] Nolan, H., Whelan, R., & Reilly, R. B. (2010). FASTER: Fully Automated
Statistical Thresholding for EEG artifact Rejection.
Journal of Neuroscience Methods, 192(1), 152-162.
.. [2] Delorme, A., Sejnowski, T., & Makeig, S. (2007). Enhanced detection of
artifacts in EEG data using higher-order statistics and independent
component analysis. NeuroImage, 34(4), 1443-1449.
Contributor


Small thing, but please add these to the bibliography (doc/references.bib) and cite them in the text with :footcite:. Then at the end of the example, generate the references with:

# References
# ----------
# .. footbibliography::

This example demonstrates the use: https://github.com/mne-tools/mne-python/blob/main/examples/preprocessing/eeg_bridging.py



Development

Successfully merging this pull request may close these issues.

ENH: Add epochs.score_quality() native data-driven epoch quality scoring
