Using the coffeine filterbank for decoding motor imagery#

This notebook is based on the MNE motor-imagery decoding example and illustrates how to construct filter-bank models with coffeine.

First we load the data as in the original example.

[1]:
import numpy as np
import pandas as pd

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score

import mne
from mne import Epochs, pick_types, events_from_annotations
from mne.io import concatenate_raws, read_raw_edf
from mne.datasets import eegbci

from coffeine import compute_coffeine, make_filter_bank_classifier
[2]:
mne.set_log_level('critical')  # silence MNE logging
pd.set_option("large_repr", "info")  # show concise info() summaries for large DataFrames
[3]:
tmin, tmax = -1.0, 4.0
event_id = dict(hands=2, feet=3)
subject = 1
runs = [6, 10, 14]  # motor imagery: hands vs feet
raw_fnames = eegbci.load_data(subject, runs)
raw = concatenate_raws([read_raw_edf(f, preload=True) for f in raw_fnames])
eegbci.standardize(raw)  # set channel names
[4]:
# Apply band-pass filter
raw.filter(4.0, 35.0, fir_design="firwin", skip_by_annotation="edge")

events, _ = events_from_annotations(raw, event_id=dict(T1=2, T2=3))
picks = pick_types(raw.info, meg=False, eeg=True, stim=False, eog=False, exclude="bads")

# Read epochs around each event (full window from tmin to tmax);
# the covariances below are computed over this window
epochs = Epochs(
    raw,
    events,
    event_id,
    tmin,
    tmax,
    proj=True,
    picks=picks,
    baseline=None,
    preload=True,
)

labels = epochs.events[:, -1] - 2  # hands -> 0, feet -> 1
conditions = ['hands', 'feet']
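As a quick sanity check (not part of the original example), we can inspect the class balance before modeling:

[ ]:
# count epochs per class: index 0 = hands, index 1 = feet
print(dict(zip(conditions, np.bincount(labels).tolist())))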

Building a coffeine data frame of covariances per frequency#

In the following, we compute covariances in pre-defined frequency bands and show how to build a coffeine data frame from them. This used to be cumbersome; now coffeine provides a dedicated API for it.

As this is event-related data and not subject-level data as in Sabbagh et al. 2020, we need one covariance per epoch; luckily, coffeine handles this loop for us. The result is a pandas data frame with one column per frequency band, where each column is an object array holding one covariance matrix per epoch.

[5]:
X_df, feature_info = compute_coffeine(epochs, frequencies=('ipeg', ['alpha1', 'alpha2']))
X_df.head()
[5]:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   alpha1  5 non-null      object
 1   alpha2  5 non-null      object
dtypes: object(2)
memory usage: 208.0+ bytes
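To see what a single cell contains, we can peek at one entry. With this 64-channel montage, each entry should be a 64 × 64 covariance matrix (a sketch, assuming default settings):

[ ]:
# each cell of the data frame holds one covariance matrix for one epoch
cov = X_df['alpha1'].iloc[0]
print(type(cov), cov.shape)  # expected: (64, 64) for 64 EEG channels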

Now we can call the model constructor and select our preferred covariance vectorizer, the Riemannian tangent-space embedding. As this method assumes full-rank data, it is worth inspecting the rank of the data first. As we will see, a rank of 64 seems to be a safe assumption, although it could also be ~60 if the rank is lower on some epochs. If the rank differs across covariances, it makes sense to project to the smallest common rank.

[6]:
mne.compute_covariance(epochs).plot(epochs.info)
[Output figures: covariance matrix and singular-value spectrum indicating the rank of the data]
[6]:
(<Figure size 380x370 with 2 Axes>, <Figure size 380x370 with 1 Axes>)
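As a complementary, programmatic check (a sketch, not part of the original example), we can compute the numerical rank of every per-epoch covariance and take the minimum across epochs and frequency bands:

[ ]:
# numerical rank of each epoch-level covariance; the smallest common rank
# is a conservative choice for the n_compo projection parameter below
ranks = np.array([
    [np.linalg.matrix_rank(cov) for cov in X_df[band]]
    for band in X_df.columns
])
print('smallest common rank:', ranks.min())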
[7]:
fb_model = make_filter_bank_classifier(
    names=list(X_df.columns),
    method='riemann',
    projection_params=dict(scale=1, n_compo=60, reg=0),  # project to the smallest common rank (~60)
    estimator=LogisticRegression(solver='liblinear', C=1e7)
)
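The vectorizer is a modeling choice. As a baseline that does not rely on the full-rank assumption, one could swap in the 'log_diag' vectorizer (log of the covariance diagonal, i.e. log band power); this is a sketch, assuming that method name is available in your coffeine version:

[ ]:
# hypothetical baseline: log-power features, no rank projection needed
baseline_model = make_filter_bank_classifier(
    names=list(X_df.columns),
    method='log_diag',
    estimator=LogisticRegression(solver='liblinear', C=1e7)
)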
[8]:
cv = ShuffleSplit(5, test_size=0.2, random_state=42)
[9]:
scores = cross_val_score(estimator=fb_model, X=X_df, y=labels, cv=cv, n_jobs=2)
[10]:
print(f'Mean classification accuracy: {np.mean(scores):0.2f}')
Mean classification accuracy: 0.84
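To gauge the variability across folds, it can be useful to also report the per-fold scores and their spread:

[ ]:
# per-fold accuracies and their standard deviation over the 5 shuffle splits
print(f'Accuracy per fold: {np.round(scores, 2)}')
print(f'Mean ± SD: {np.mean(scores):0.2f} ± {np.std(scores):0.2f}')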