The common spatial patterns (CSP) algorithm [1, 2] is a popular supervised decomposition method for EEG signal analysis that is used to distinguish between two classes (conditions). It finds spatial filters that maximize the signal variance for one class while simultaneously minimizing it for the other class. Here, we review the CSP algorithm and its two main implementation approaches.

We assume that the EEG is already band-pass filtered and centered. Let X_i \in \mathbb{R}^{C \times T} be the EEG signal of trial i where C is the number of channels and T is the number of samples per trial. We compute the spatial covariance R_1 \in \mathbb{R}^{C \times C} by averaging over trials of class 1:

(1)   \begin{equation*}R_1 = \frac{1}{|\mathcal{I}_1|} \sum_{i \in \mathcal{I}_1} \frac{X_i X_i^T}{\trace(X_i X_i^T)}\end{equation*}

where \mathcal{I}_1 is the set of indices of the trials belonging to class 1, |\mathcal{I}_1| denotes the size of this set, and \trace denotes the trace of a matrix. The spatial covariance R_2 for class 2 is computed analogously. In the following derivations of CSP, we assume that R_1 and R_2 have full rank (i.e., \rank(R_1) = \rank(R_2) = C).
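As a minimal sketch of Eq. (1), the following NumPy snippet averages the trace-normalized covariances of the trials of one class. The function name `class_covariance`, the array layout (trials, channels, samples), and the random stand-in data are assumptions made for illustration only.

```python
import numpy as np

def class_covariance(trials):
    """Average trace-normalized spatial covariance of one class, as in Eq. (1).

    trials: array of shape (n_trials, C, T), assumed band-pass filtered
    and centered beforehand.
    """
    covs = []
    for X in trials:
        XXt = X @ X.T                       # C x C scatter matrix of one trial
        covs.append(XXt / np.trace(XXt))    # normalize by the trace
    return np.mean(covs, axis=0)            # average over the trials

# Synthetic stand-in for class 1 data: 20 trials, 4 channels, 256 samples
rng = np.random.default_rng(0)
trials_1 = rng.standard_normal((20, 4, 256))
R_1 = class_covariance(trials_1)
```

Because every term in the sum has unit trace, the averaged matrix R_1 has unit trace as well.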

The goal of CSP is to find a decomposition matrix W \in \mathbb{R}^{C \times C} that projects the signal x(t) \in \mathbb{R}^C in the original space to x_{CSP}(t) \in \mathbb{R}^C as follows:

(2)   \begin{equation*}x_{CSP}(t) = W^T x(t)\end{equation*}

with the following properties:

(3)   \begin{equation*}W^T R_1 W = D_1\end{equation*}

(4)   \begin{equation*}W^T R_2 W = D_2\end{equation*}

and is scaled such that

(5)   \begin{equation*}D_1 + D_2 = I_C\end{equation*}

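where D_1 and D_2 are diagonal matrices and I_C \in \mathbb{R}^{C \times C} is the identity matrix. In other words, W diagonalizes R_1 and R_2 simultaneously: the columns of W are common generalized eigenvectors of both matrices (with respect to R_1 + R_2), and the corresponding eigenvalues of the two classes always sum to 1. Consequently, the eigenvector with the largest eigenvalue for class 1 has the smallest eigenvalue for class 2 and vice versa.
The columns of W are the spatial filters. The columns of the matrix A = (W^T)^{-1} are the corresponding spatial patterns.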

Geometric approach

We determine a whitening transformation matrix U for the composite spatial covariance R_1 + R_2 such that

(6)   \begin{equation*}U (R_1 + R_2) U^T = I_C\end{equation*}

To this end, we factorize the composite spatial covariance as

(7)   \begin{equation*}R_1 + R_2 = E F E^T \end{equation*}

where E is the orthogonal matrix of eigenvectors (in columns) and F is the diagonal matrix of the corresponding eigenvalues of R_1 + R_2. Since R_1 + R_2 has full rank, all eigenvalues are positive and we can define the whitening transformation U as

(8)   \begin{equation*}U = F^{-1/2} E^T\end{equation*}

and transform the matrices R_1 and R_2:

(9)   \begin{equation*}\begin{aligned}S_1 = U R_1 U^T \\S_2 = U R_2 U^T\end{aligned}\end{equation*}

We then factorize the matrix S_1

(10)   \begin{equation*}S_1 = P D_1 P^T\end{equation*}

where P is the orthogonal matrix of eigenvectors (in columns) and D_1 is the diagonal matrix of the corresponding eigenvalues of S_1. We define the decomposition matrix W^T as

(11)   \begin{equation*}W^T = P^T U\end{equation*}

This W satisfies (3):

(12)   \begin{equation*}W^T R_1 W = P^T S_1 P = D_1\end{equation*}

and also (4) and (5) with D_2 = I_C - D_1, because (6) implies U R_2 U^T = I_C - U R_1 U^T:

(13)   \begin{equation*}W^T R_2 W = P^T U R_2 U^T P = P^T (I_C - U R_1 U^T) P = I_C - D_1\end{equation*}
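The geometric approach can be sketched in a few lines of NumPy that follow Eqs. (7) to (11) step by step. The function name `csp_geometric`, the helper `random_covariance`, and the synthetic covariances are assumptions made for illustration; this is one possible implementation, not a reference one.

```python
import numpy as np

def csp_geometric(R_1, R_2):
    """Return W (spatial filters in columns) and the diagonal of D_1."""
    # Eq. (7): R_1 + R_2 = E F E^T (eigh returns the eigenvalues as a vector)
    f, E = np.linalg.eigh(R_1 + R_2)
    # Eq. (8): U = F^{-1/2} E^T; needs full rank so that all f > 0
    U = np.diag(1.0 / np.sqrt(f)) @ E.T
    # Eq. (9): whitened class 1 covariance
    S_1 = U @ R_1 @ U.T
    # Eq. (10): S_1 = P D_1 P^T
    d_1, P = np.linalg.eigh(S_1)
    # Eq. (11): W^T = P^T U, hence W = U^T P
    W = U.T @ P
    return W, d_1

def random_covariance(rng, C, T):
    """Hypothetical helper: trace-normalized covariance of random data."""
    X = rng.standard_normal((C, T))
    R = X @ X.T
    return R / np.trace(R)

rng = np.random.default_rng(0)
R_1 = random_covariance(rng, 4, 200)
R_2 = random_covariance(rng, 4, 200)
W, d_1 = csp_geometric(R_1, R_2)

# W^T R_1 W should be diag(d_1) and W^T R_2 W should be I - diag(d_1)
D_1 = W.T @ R_1 @ W
D_2 = W.T @ R_2 @ W
```

Note that `np.linalg.eigh` returns eigenvalues in ascending order, so in this sketch the last column of W is the filter with the largest class 1 variance.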

Generalized eigenvalue problem approach

We can also solve for W directly [3]. Adding (3) and (4) and using (5) gives

(14)   \begin{equation*}D_1 + D_2 = I_C = W^T (R_1 + R_2) W\end{equation*}

so that

(15)   \begin{equation*}W^T = W^{-1} (R_1 + R_2)^{-1}\end{equation*}

and by inserting this into (3)

(16)   \begin{equation*}W^{-1} (R_1 + R_2)^{-1} R_1 W = D_1\end{equation*}

we get

(17)   \begin{equation*}R_1 W = (R_1 + R_2) W D_1\end{equation*}

which is a generalized eigenvalue problem. Equivalently, by inserting W^T into (4) we get the generalized eigenvalue problem

(18)   \begin{equation*}R_2 W = (R_1 + R_2) W D_2\end{equation*}
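A minimal sketch of this approach, assuming SciPy is available: `scipy.linalg.eigh(a, b)` solves the symmetric-definite generalized problem a v = λ b v and normalizes the eigenvectors so that v^T b v = I, which is the same scaling as (5) when b = R_1 + R_2. The function name `csp_gevd`, the helper `random_covariance`, and the synthetic covariances are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import eigh

def csp_gevd(R_1, R_2):
    """Solve R_1 W = (R_1 + R_2) W D_1 and return W and the diagonal of D_1."""
    # eigh scales the eigenvectors so that W^T (R_1 + R_2) W = I
    d_1, W = eigh(R_1, R_1 + R_2)
    return W, d_1

def random_covariance(rng, C, T):
    """Hypothetical helper: trace-normalized covariance of random data."""
    X = rng.standard_normal((C, T))
    R = X @ X.T
    return R / np.trace(R)

rng = np.random.default_rng(1)
R_1 = random_covariance(rng, 4, 200)
R_2 = random_covariance(rng, 4, 200)
W, d_1 = csp_gevd(R_1, R_2)

# Spatial patterns A = (W^T)^{-1}; since W^T (R_1 + R_2) W = I,
# this equals (R_1 + R_2) W
A = np.linalg.inv(W.T)
```

The result should agree with the geometric approach up to the sign of each column of W, provided the eigenvalues are distinct.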

Another solution

We also mention another solution W_g (with different diagonal matrices D_1 and D_2), often found in the literature, that satisfies only (3) and (4) but not (5). From (3) we get W_g^T as

(19)   \begin{equation*}W_g^T = D_1 W_g^{-1} R_1^{-1}\end{equation*}

and by inserting this into (4)

(20)   \begin{equation*}D_1 W_g^{-1} R_1^{-1} R_2 W_g = D_2\end{equation*}

we get

(21)   \begin{equation*}R_2 W_g = R_1 W_g (D_1^{-1} D_2)\end{equation*}

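which is again a generalized eigenvalue problem, but with different eigenvalues, given by D_1^{-1} D_2. W_g differs from W only by a scaling of its columns, described by a diagonal matrix G:

(22)   \begin{equation*}W_g = W G^{1/2}\end{equation*}

where G = D_1 + D_2 = W_g^T (R_1 + R_2) W_g is built from the diagonal matrices belonging to W_g, so that W = W_g G^{-1/2} satisfies (5).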
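The relation between the two solutions can be checked numerically. In the sketch below, W_g is obtained with `scipy.linalg.eigh(R_2, R_1)`, which is one particular choice that scales the filters so that W_g^T R_1 W_g = I; other scalings of W_g are equally valid. Dividing each column of W_g by the square root of the corresponding diagonal entry of G then yields filters that satisfy (5). The helper `random_covariance` and the synthetic covariances are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import eigh

def random_covariance(rng, C, T):
    """Hypothetical helper: trace-normalized covariance of random data."""
    X = rng.standard_normal((C, T))
    R = X @ X.T
    return R / np.trace(R)

rng = np.random.default_rng(2)
R_1 = random_covariance(rng, 4, 200)
R_2 = random_covariance(rng, 4, 200)

# Solve R_2 W_g = R_1 W_g diag(lam); eigh scales W_g so that W_g^T R_1 W_g = I
lam, W_g = eigh(R_2, R_1)

D_1g = W_g.T @ R_1 @ W_g          # identity for this choice of scaling
D_2g = W_g.T @ R_2 @ W_g          # diag(lam)
g = np.diag(D_1g + D_2g)          # diagonal of G; G is not the identity

# Rescale every column j of W_g by 1 / sqrt(g[j]) so that (5) holds
W = W_g / np.sqrt(g)
```

After rescaling, the class 1 eigenvalues are 1 / (1 + lam), so the filter with the smallest ratio lam has the largest class 1 variance.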

[1] J. Müller-Gerking, G. Pfurtscheller, and H. Flyvbjerg, “Designing optimal spatial filters for single-trial EEG classification in a movement task,” Clinical Neurophysiology, vol. 110, iss. 5, pp. 787–798, 1999.
[2] B. Blankertz, R. Tomioka, S. Lemm, M. Kawanabe, and K.-R. Müller, “Optimizing Spatial filters for Robust EEG Single-Trial Analysis,” IEEE Signal Processing Magazine, vol. 25, iss. 1, pp. 41–56, 2008.
[3] L. C. Parra, C. D. Spence, A. D. Gerson, and P. Sajda, “Recipes for the linear analysis of EEG,” NeuroImage, vol. 28, iss. 2, pp. 326–341, 2005.
