Motor Imagery BCI Classification with Frequency and Time-Frequency Features by Using Different Dimensions of the Feature Space Using Autoencoders
Esra KAYA
Selcuk University, Faculty of Technology, Electrical and Electronics Engineering, Konya, Turkiye
Ismail SARITAS
Selcuk University, Faculty of Technology, Electrical and Electronics Engineering, Konya, Turkiye
Abstract
Brain-Computer Interfaces (BCIs) enable users to communicate directly with machines through brain signals, without moving any body part. They have therefore become useful for prostheses, electric wheelchairs, and virtual keyboards, as well as for studies such as survey applications and emotion classification. In this study, EEG signal processing was performed on the BCI Competition III-3a dataset, which contains motor imagery (MI) signals with four classes. Features of the non-stationary EEG signals belonging to three subjects were extracted using Power Spectral Density (PSD) with the Welch method, Wavelet Decomposition (WD), Empirical Mode Decomposition (EMD), and Hilbert-Huang Transform (HHT). Starting from the 900 extracted features, feature space dimension reduction was performed using an autoencoder, an unsupervised learning algorithm. The best average accuracy obtained with an Artificial Neural Network (ANN) over all binary classifications is 74.5%, which is a reasonable result given the non-stationary nature of EEG signals. This best classification performance was obtained with 801 features, produced by an autoencoder with 400 hidden layer neurons.
Keywords: ANN, Autoencoder, BCI, EEG, EMD, Hilbert-Huang, Wavelet
1. Introduction
A BCI is a direct communication path, with or without a cable, that allows bi-directional information flow between the brain and an external device, with brain signals acting as the information carriers. BCIs are often used to research, map, support, augment, or repair human cognitive or sensory-motor functions. They appear in many applications, such as prostheses, electric wheelchairs, virtual keyboards, survey applications, and emotion classification. Electroencephalography (EEG), which represents brain activity as electrical potentials, is the physiological signal most used in BCI applications. Recording EEG is harmless to human health and is more advantageous and easier to use than other techniques for obtaining and analyzing brain signals [1-4]. Motor Imagery (MI) signals are the EEG signals that occur when the user of a BCI system imagines moving a body part in order to operate a purpose-specific machine [5]. In the literature, Royer et al. conducted a study on the control of a virtual helicopter in three-dimensional space using MI signals with four classes: right-hand movement to go right, left-hand movement to go left, both hands up to rise, and both hands down to descend or rest. In that study, 67% of the flight time was closer to the intended path while using the BCI [6]. In another study, Bhattacharyya et al. designed an Interval Type-2 Fuzzy classifier to classify EEG signals obtained by imagining a total of five wrist and finger movements presented with audio and visual stimulation. They extracted Extreme Energy Ratio (EER) features and obtained accuracies of 86.45% and 78.44% for offline and online classification, respectively [7]. Ang and Guan identified a strategy for detecting MI signals for control and rehabilitation purposes. Of 34 chronic stroke patients, 29 were suitable for BCI use. 
Within the calibration sessions used for training and the subsequent test sessions, subjects were assigned two or more MI tasks, such as left- and right-hand movements. The accuracy rates obtained from the signals analyzed with the common spatial pattern filter bank were 79.8% for offline and 69.5% for online classification [8]. Mahajan and Bansal developed a neuro-rehabilitation control application that processes EEG signals through an Arduino. EEG signals were obtained from each of the three male and seven female subjects over a total of 5 sessions consisting of 20 s periods. Peak amplitude values of the signals were used as features, and the signals were classified as comfortable condition or blink. If the maximum peak amplitude exceeds the threshold value, the Arduino Uno turns on an LED lamp, corresponding to a blink [9]. Mistry et al. conducted a study on wheelchair control using visual triggering for individuals with motor cortex disabilities. The signal intensity was determined by calculating the target frequencies, mean values, and SNR values using the Fourier transform of EEG signals recorded from the parietal and occipital lobes of 4 individuals, with visual triggers consisting of 4 different flicker frequencies (7 Hz, 9 Hz, 11 Hz, and 13 Hz). Of the four LEDs (7 Hz for left, 9 Hz for forward, 11 Hz for right, and 13 Hz for backward), only two were active at a time: first for the right/left decision, then for the forward/backward decision. The average accuracy was 79.4%, and a simple path-following task using all the classes took 5 minutes and 9 seconds [10]. Athif et al. introduced a new method called WaveCSP, which combines wavelet decomposition and Common Spatial Pattern (CSP) concepts to extract features for more robust classification of EEG signals. The classification of right-hand and left-hand MI signals gave an average accuracy of 63.5% with the k-Nearest Neighbor (kNN) classifier [11]. 
A new network structure called QNet was proposed by Fan et al., which learns the attention weights of EEG channels, time points, and feature maps. The method provided an accuracy of 82.9% for the classification of right-hand and left-hand MI signals [12]. Xiao et al. proposed a channel selection algorithm based on the coefficient of variation for right-hand and left-hand MI EEG signal classification, dividing channels into different categories according to their contributions to the feature extraction process. They achieved an average accuracy of 74.30% [13]. As the literature shows, many studies use different features and methods for various BCI control applications. However, the accuracy results are not yet at the desired levels. BCI technology therefore needs more effective feature extraction, feature selection, and classification algorithms before it can be used accurately in daily life. This study used the BCI Competition III-3a dataset, which consists of four-class MI EEG signals recorded in the BCI laboratory of Graz University of Technology, Austria. We extracted features using Power Spectral Density (PSD) with the Welch method, Wavelet Decomposition, Empirical Mode Decomposition (EMD), and Hilbert-Huang Transform (HHT). The dimension of the feature space grows with the number of electrode channels in an EEG cap and with the number of feature extraction methods used to represent a signal. Thus, in this study, the feature space was reduced using an autoencoder neural network, an unsupervised learning algorithm [14]. The classification was performed with a 5-fold cross-validated ANN. The autoencoder hidden layer size was varied to observe the effect of feature spaces of different sizes on the binary classification of EEG signals. The results indicate that the autoencoder could reduce the size of the existing feature space and represent it more effectively.
2. Material and Methods
2.1. BCI Competition III-3a Dataset
The BCI Competition III-3a dataset was recorded in the BCI laboratory of Graz University of Technology, Austria. The dataset contains MI signals with four classes: right-hand movement to go right, left-hand movement to go left, tongue movement to go up, and feet movement to go down [15, 16]. The events were presented to the 3 subjects in the sequence defined by the paradigm shown in Figure 1.
Figure 1. The paradigm of BCI Competition IIIa Dataset [15, 16]
The EEG signals of the three subjects were recorded under this paradigm with a 60-electrode EEG cap [15, 16]. The electrode positions corresponding to various regions of the brain are shown in Figure 2. In this study, the EEG signals of 9 electrode channels were used, namely electrode positions 6, 8, 20, 22, 28, 31, 34, 48, and 50. These electrodes correspond to the brain's frontal lobe, motor cortex, and parietal lobe, where activity is related to control applications.
Figure 2. Electrode Positions of a 60 Electrode Cap [15, 16].
The first, second, and third subjects have 360, 240, and 240 EEG segments, respectively, across the four classes. Half of these segments are labeled, and the other half are unlabeled [15, 16]. In this study, the labeled segments, each consisting of 2560 samples, were used. For segments with fewer than 2560 samples, the mean of the available samples was used to fill in the missing data. In total, 409 labeled segments were used in this study: 175, 118, and 116 for the first, second, and third subjects, respectively.
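The mean-filling step described above can be illustrated with a minimal sketch. It assumes that the missing samples are appended at the end of the segment and that longer inputs are simply truncated; the text does not specify either detail, and the function name `pad_with_mean` and the toy values are hypothetical.

```python
from statistics import fmean

SEGMENT_LENGTH = 2560  # samples per labeled segment, as stated in the text


def pad_with_mean(segment, target_length=SEGMENT_LENGTH):
    """Return a copy of `segment` extended to `target_length` with its own mean.

    Assumptions (not stated in the paper): the fill values are appended at
    the end, and segments longer than `target_length` are truncated.
    """
    if not segment:
        raise ValueError("segment must contain at least one sample")
    if len(segment) >= target_length:
        return list(segment[:target_length])
    fill_value = fmean(segment)
    return list(segment) + [fill_value] * (target_length - len(segment))


# Toy data, not taken from the dataset.
short_segment = [1.0, 2.0, 3.0, 6.0]
padded = pad_with_mean(short_segment, target_length=8)
print(padded)
```

Filling with the segment mean leaves the mean of the segment unchanged, which is one plausible reason for choosing it over zero-padding.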
2.2. EEG Signal Processing and Feature Extraction
Before using the signals from the nine selected electrodes, we applied the Common Average Reference (CAR) method, which subtracts the average of all electrodes from the signal of each electrode. In addition, baseline correction was applied to each electrode separately by subtracting the mean of that electrode's signal from the signal itself. Then, to preserve the significant parts of the signals and reveal their outline, a 9-level db4 wavelet decomposition was applied, and the detail coefficients of the four lowest levels were subtracted from the main signal, which filtered the signals. After filtering, we separated the signals into their bands (delta, theta, alpha, beta, gamma) using elliptic filters in order to extract the features that define them. An example of the EEG bands for the first class (right hand) from the first electrode of the first subject is shown in Figure 3.
Figure 3. EEG bands belonging to Class-1 of the first electrode of the first subject
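The first two preprocessing steps of this section, CAR followed by per-electrode baseline correction, can be sketched as follows. This is a minimal illustration on toy data using only the Python standard library; the function names are hypothetical, and the wavelet filtering and elliptic band filters are not included.

```python
from statistics import fmean


def common_average_reference(channels):
    """Subtract the across-electrode mean from every electrode at each time point.

    `channels` is a list of equal-length lists, one list per electrode.
    """
    n_samples = len(channels[0])
    if any(len(ch) != n_samples for ch in channels):
        raise ValueError("all channels must have the same number of samples")
    # Mean over electrodes, computed separately for every time index.
    reference = [fmean(ch[t] for ch in channels) for t in range(n_samples)]
    return [[ch[t] - reference[t] for t in range(n_samples)] for ch in channels]


def baseline_correct(channel):
    """Remove the temporal mean of a single electrode signal."""
    offset = fmean(channel)
    return [x - offset for x in channel]


def preprocess(channels):
    """Apply CAR first, then baseline correction per electrode."""
    return [baseline_correct(ch) for ch in common_average_reference(channels)]


# Toy recording: 3 electrodes, 3 time points (not real EEG).
raw = [[1.0, 2.0, 3.0], [3.0, 2.0, 7.0], [5.0, 5.0, 5.0]]
car = common_average_reference(raw)
clean = preprocess(raw)
```

After CAR, the values across electrodes sum to zero at every time point; after baseline correction, each electrode signal also has zero mean over time.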
The Welch method, an averaged and modified version of the periodogram [17], was applied to the signals of each electrode channel to obtain Power Spectral Density (PSD) values. These values were calculated for each EEG band separately, and the mean, standard deviation, skewness, kurtosis, and logarithmic energy entropy of the PSDs were used as frequency-domain features. Then, 9-level Daubechies db4 wavelet coefficients were obtained, because the wavelet transform gives information about both the changes in a signal or an image and the time location of those changes, unlike the Fourier transform, which loses the time information [18]. The mean, standard deviation, skewness, kurtosis, and logarithmic energy entropy of the wavelet coefficients were calculated separately for each EEG band and used as time-frequency features. The Empirical Mode Decomposition (EMD) method considers the oscillations in a signal at a very local level [19]. The decomposition is based on the understanding that a signal can contain many simple oscillations at significantly different frequencies superimposed on each other [20]. The EMD method produces components called Intrinsic Mode Functions (IMFs), which describe the oscillations in a signal. An IMF must satisfy two conditions: the number of extrema and the number of zero-crossings must either be equal or differ by at most 1, and the mean of the envelopes defined by the local maxima and the local minima must be zero at every data point [20]. After applying EMD, with the piecewise cubic Hermite interpolating polynomial method for envelope construction, to all EEG bands, the mean, standard deviation, skewness, kurtosis, and Shannon entropy were calculated for the resulting IMFs (at most 5) and used as time-frequency features. The Hilbert-Huang Transform (HHT) reveals the Hilbert spectrum of a signal sampled at a specific frequency and described by the IMFs resulting from EMD. 
HHT is useful for analyzing signals that contain a mixture of signals whose spectral components change over time. The time-frequency representation of a signal with HHT does not contain spurious oscillations. Thus, the signal is in a more natural and physically meaningful form [20]. After applying HHT to all EEG bands, the mean, standard deviation, skewness, kurtosis, and Shannon entropy were calculated and used as time-frequency features. As a result of the feature extraction process, a total of 900 features were obtained for each EEG segment of the 3 subjects from the 5 EEG bands of the 9 electrode channels. Five statistical calculations from 5 EEG bands of 9 channels give 5x5x9 = 225 features for one feature extraction method; with four feature extraction methods, this gives 225x4 = 900 features.
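The statistical descriptors and the feature count described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses population (biased) moments, non-excess kurtosis, and a log energy entropy defined as the sum of log(v^2), with a small `eps` added to avoid log(0). The Shannon entropy used for the EMD and HHT features is not included, and all function names are hypothetical.

```python
import math
from statistics import fmean


def central_moment(values, order):
    """Population central moment of the given order."""
    mu = fmean(values)
    return fmean((v - mu) ** order for v in values)


def statistical_features(values, eps=1e-12):
    """Five descriptors for one vector (e.g. the PSD values of one band).

    Assumptions: biased moments, non-excess kurtosis, and log energy
    entropy computed as sum(log(v**2 + eps)).
    """
    variance = central_moment(values, 2)
    std = math.sqrt(variance)
    skewness = central_moment(values, 3) / std ** 3 if std > 0 else 0.0
    kurtosis = central_moment(values, 4) / variance ** 2 if std > 0 else 0.0
    log_energy = sum(math.log(v * v + eps) for v in values)
    return {
        "mean": fmean(values),
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "log_energy_entropy": log_energy,
    }


# Feature count arithmetic from the text.
N_STATS, N_BANDS, N_CHANNELS, N_METHODS = 5, 5, 9, 4
features_per_method = N_STATS * N_BANDS * N_CHANNELS
total_features = features_per_method * N_METHODS

demo = statistical_features([1.0, 2.0, 3.0, 4.0, 5.0])  # toy vector
```

Other toolboxes may default to unbiased moments or excess kurtosis, so the exact values depend on the convention chosen.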
2.3. Feature Space Dimension Reduction and Classification
After obtaining the 900 features, we decided to reduce the feature space dimension, because processing 900 features requires considerable computation time and the process should be fast and effective. We therefore used autoencoder neural networks for feature space dimension reduction. Autoencoders are unsupervised learning algorithms that learn features automatically from unlabeled data by applying backpropagation with the target values set equal to the inputs. Autoencoders are also useful for discovering interesting structure in the data when constraints are placed on the network, such as limiting the number of hidden units [14]. If the data is not reconstructed at the end of the procedure, the encoder and decoder weights and biases, whose sizes depend on the hidden layer size, can be used as a reduced representation of the feature space. We selected hidden layer sizes of 50, 100, 150, 200, 250, 300, 350, 400, and 450 neurons for the autoencoder. The autoencoders used the Scaled Conjugate Gradient algorithm for backpropagation and Mean Squared Error with L2 and sparsity regularizers as the performance function. The encoder weights, decoder weights, and decoder biases were used as the feature space, which consists of 101, 201, 301, 401, 501, 601, 701, 801, and 901 features for these hidden layer sizes, respectively. After reducing the feature space dimension, we classified the EEG samples of the 3 subjects, which are divided into four categories: right-hand, left-hand, tongue, and feet. The classifications were performed as binary classifications. Of the 409 labeled samples in total, 101 belong to the first category (right-hand), 100 to the second category (left-hand), 104 to the third category (tongue), and 104 to the fourth category (feet). The classifier is a pattern recognition ANN with 200 hidden neurons trained with the resilient backpropagation algorithm. The ANN was applied with 5-fold cross-validation. 
The 409 samples were divided randomly into 70% for training, 15% for testing, and 15% for validation. The transfer functions of the network were a logarithmic sigmoid function for the input layer and a tangent sigmoid function for the output layer. The performance function was the Mean Squared Error function. The flow chart of the study is shown in Figure 4.
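The relation between the hidden layer sizes and the feature space sizes listed in this section (50 neurons giving 101 features, up to 450 neurons giving 901) is consistent with 2h + 1 values per row when the transposed encoder weights, the decoder weights, and the decoder biases are placed side by side. The sketch below only illustrates this shape arithmetic under that assumption. It uses untrained random weights and a toy input dimension `n_inputs`; the stacking order and all names are hypothetical, since the text does not give these details.

```python
import random


def random_matrix(rows, cols, rng):
    """Stand-in for trained weights: a rows x cols matrix of random values."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def stack_parameters(encoder_w, decoder_w, decoder_b):
    """Join encoder_w^T (n x h), decoder_w (n x h) and decoder_b (n) row by row.

    Each of the n rows then holds h + h + 1 = 2h + 1 values.
    """
    encoder_t = transpose(encoder_w)
    return [
        enc_row + dec_row + [bias]
        for enc_row, dec_row, bias in zip(encoder_t, decoder_w, decoder_b)
    ]


rng = random.Random(0)
n_inputs = 6  # toy input dimension, chosen only to keep the example small
hidden_sizes = [50, 100, 150, 200, 250, 300, 350, 400, 450]  # from the text

feature_counts = []
for h in hidden_sizes:
    encoder_w = random_matrix(h, n_inputs, rng)  # h x n
    decoder_w = random_matrix(n_inputs, h, rng)  # n x h
    decoder_b = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]  # n
    stacked = stack_parameters(encoder_w, decoder_w, decoder_b)
    feature_counts.append(len(stacked[0]))

print(feature_counts)
```

Under this reading, the number of rows equals the autoencoder's input dimension, so each row would play the role of one sample in the new feature space.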
Figure 4. The flow chart of the study.
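The random 70/15/15 division can be sketched as below. The sketch assumes that the same proportions are applied to each binary class pair and that the test and validation counts are rounded to the nearest integer; neither detail is stated in the text, and the function name `split_indices` is hypothetical.

```python
import random
from itertools import combinations

# Number of labeled samples per class, as given in the text.
CLASS_COUNTS = {"RH": 101, "LH": 100, "T": 104, "F": 104}


def split_indices(n_samples, test_fraction=0.15, val_fraction=0.15, seed=0):
    """Shuffle the indices 0..n_samples-1 and cut them into three disjoint parts.

    Assumption: test and validation sizes are rounded to the nearest
    integer, and the remaining samples are used for training.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    n_val = round(n_samples * val_fraction)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


pair_test_sizes = {}
for a, b in combinations(CLASS_COUNTS, 2):
    n_pair = CLASS_COUNTS[a] + CLASS_COUNTS[b]
    _, _, test_part = split_indices(n_pair)
    pair_test_sizes[f"{a}-{b}"] = len(test_part)

print(pair_test_sizes)
```

Under these assumptions each pair has a test set of 30 or 31 samples, so a single test sample shifts the accuracy by roughly 3.2 to 3.3 percentage points. Values such as 70.0% (21 of 30) and 61.3% (19 of 31) would fit this granularity, although this is an inference rather than something the text states.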
3. Results and Discussion
After all the binary classifications were performed, the testing accuracy rates for the different feature spaces obtained using autoencoders are given in Table 1. The same results are plotted in Figure 5, which shows how the accuracy changes with feature space size. According to Table 1, the original feature space and the 901-element feature space obtained with an autoencoder have similar average accuracy rates (61.0% and 60.0%). The maximum average accuracy rate of 74.5% was obtained with 801 features. However, the feature spaces with 401 and 501 features, obtained from autoencoders with 200 and 250 neurons, come close to this maximum, with average accuracy rates of 71.5% and 70.8%, respectively. Right-hand versus left-hand is the most frequently studied classification pair in the literature, and the maximum accuracy rate obtained for this pair is 86.7%, with 401 features. The maximum accuracies of 80.6% and 90.3% for the right-hand and tongue pair and the right-hand and feet pair, respectively, were obtained with 801 features. For the left-hand and tongue pair, the maximum accuracy of 83.9% was obtained with 701 features from the autoencoder with 350 neurons. For the left-hand and feet pair, the maximum accuracy of 80.6% was obtained with 501 features from the autoencoder with 250 neurons (and equally with 101 features). For the tongue and feet pair, 501 features also gave 80.6%, although Table 1 lists a higher value of 87.1% for this pair with 401 features.
Table 1. Testing accuracy rates (%) of binary classifications based on feature spaces obtained using autoencoders with different hidden layer sizes. (RH: Right-Hand, LH: Left-Hand, T: Tongue, F: Feet)
Columns 101 to 901 give the feature space size created with the autoencoder; the last column is the original feature space (900 features).
| Classification Categories | 101 | 201 | 301 | 401 | 501 | 601 | 701 | 801 | 901 | Original (900) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RH-LH | 70.0 | 70.0 | 76.7 | 86.7 | 73.3 | 66.7 | 53.3 | 56.7 | 66.7 | 50.0 |
| RH-T | 61.3 | 58.1 | 67.7 | 71.0 | 64.5 | 71.0 | 54.8 | 80.6 | 67.7 | 51.6 |
| RH-F | 54.8 | 83.9 | 77.4 | 58.1 | 71.0 | 67.7 | 83.9 | 90.3 | 58.1 | 67.7 |
| LH-T | 58.1 | 58.1 | 61.3 | 51.6 | 54.8 | 61.3 | 83.9 | 67.7 | 45.2 | 64.5 |
| LH-F | 80.6 | 58.1 | 54.8 | 74.2 | 80.6 | 71.0 | 48.4 | 77.4 | 67.7 | 54.8 |
| T-F | 58.1 | 51.6 | 54.8 | 87.1 | 80.6 | 80.6 | 54.8 | 74.2 | 54.8 | 77.4 |
| Average Acc. (%) | 63.8 | 63.3 | 65.5 | 71.5 | 70.8 | 69.7 | 63.2 | 74.5 | 60.0 | 61.0 |
| Std. Dev. | 8.9 | 10.7 | 9.3 | 13.3 | 9.1 | 5.9 | 14.8 | 10.5 | 8.3 | 9.8 |
Figure 5. Binary classification results for the different feature spaces obtained using an autoencoder with different numbers of neurons in the hidden layer.
The lowest standard deviation across the binary classifications was 5.9, obtained for 601 features, which shows that the accuracies vary less between the classification pairs for this feature space than for any other feature space size. The average accuracy for 601 features was 69.7%, obtained using an autoencoder with 300 hidden layer neurons. The results show that the autoencoder changes not only the size of the feature space but also the weights of the features, because the average accuracy changes irregularly rather than steadily with the feature space size.
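The summary rows of Table 1 can be recomputed from the per-pair accuracies with a short script. The sketch below copies the values from Table 1 and uses the population standard deviation (`statistics.pstdev`), which appears to reproduce the reported Std. Dev. row to one decimal place for the columns checked here; the variable names are illustrative.

```python
from statistics import fmean, pstdev

# Column labels: autoencoder feature space sizes, then the original space.
FEATURE_SIZES = ["101", "201", "301", "401", "501",
                 "601", "701", "801", "901", "900 (original)"]

# Testing accuracy rates (%) copied from Table 1, one row per class pair.
ACCURACY = {
    "RH-LH": [70.0, 70.0, 76.7, 86.7, 73.3, 66.7, 53.3, 56.7, 66.7, 50.0],
    "RH-T":  [61.3, 58.1, 67.7, 71.0, 64.5, 71.0, 54.8, 80.6, 67.7, 51.6],
    "RH-F":  [54.8, 83.9, 77.4, 58.1, 71.0, 67.7, 83.9, 90.3, 58.1, 67.7],
    "LH-T":  [58.1, 58.1, 61.3, 51.6, 54.8, 61.3, 83.9, 67.7, 45.2, 64.5],
    "LH-F":  [80.6, 58.1, 54.8, 74.2, 80.6, 71.0, 48.4, 77.4, 67.7, 54.8],
    "T-F":   [58.1, 51.6, 54.8, 87.1, 80.6, 80.6, 54.8, 74.2, 54.8, 77.4],
}

# Transpose the rows so that each tuple holds one feature-space column.
columns = list(zip(*ACCURACY.values()))
averages = [fmean(col) for col in columns]
spreads = [pstdev(col) for col in columns]  # population standard deviation

best = max(range(len(averages)), key=averages.__getitem__)
steadiest = min(range(len(spreads)), key=spreads.__getitem__)

for size, avg, sd in zip(FEATURE_SIZES, averages, spreads):
    print(f"{size:>15}: mean = {avg:.1f}, std = {sd:.1f}")
```

Here `best` points to the column with the highest average accuracy and `steadiest` to the column with the smallest spread across class pairs.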
4. Conclusion
The maximum average accuracy over all binary classifications was 74.5%, with 801 features obtained using an autoencoder with 400 hidden layer neurons. This result is acceptable because EEG signals are non-stationary and therefore hard to characterize. However, the original feature space has 900 features, so this autoencoder barely reduces the feature space. The feature space with 401 features gives the next-best average accuracy, 71.5%, with fewer than half of the original features. Also, the right-hand and left-hand pair, the most used classification pair in the literature, reaches an accuracy rate of 86.7% with 401 features. This study shows that the average accuracy of the binary classifications changes irregularly across the features obtained using autoencoders with hidden layers of different sizes. Thus, it can be said that an autoencoder changes not only the feature space size but also the weights of the features while representing the original feature space in another way. In a future study, the most useful features could first be identified in the original feature space, and an autoencoder could then be used to represent them as a more compact and effective feature space.
Author's Note
Part of this work was presented at the 9th International Conference on Advanced Technologies ICAT'2020 Istanbul, Turkiye.
References
- M. Hämäläinen, R. Hari, R. J. Ilmoniemi, J. Knuutila, and O. V. Lounasmaa, "Magnetoencephalography—theory, instrumentation, and applications to noninvasive studies of the working human brain," Reviews of modern Physics, vol. 65, no. 2, p. 413, 1993.
- R. Srinivasan, "Methods to improve the spatial resolution of EEG," International journal of bioelectromagnetism, vol. 1, no. 1, pp. 102-111, 1999.
- P. M. Vespa, V. Nenov, and M. R. Nuwer, "Continuous EEG monitoring in the intensive care unit: early findings and clinical efficacy," Journal of Clinical Neurophysiology, vol. 16, no. 1, pp. 1-13, 1999.
- F. Yasuno et al., "The PET radioligand [11 C] MePPEP binds reversibly and with high specific signal to cannabinoid CB 1 receptors in nonhuman primate brain," Neuropsychopharmacology, vol. 33, no. 2, pp. 259-269, 2008.
- J. Decety and D. H. Ingvar, "Brain structures participating in mental simulation of motor behavior: A neuropsychological interpretation," Acta psychologica, vol. 73, no. 1, pp. 13-34, 1990.
- A. S. Royer, A. J. Doud, M. L. Rose, and B. He, "EEG control of a virtual helicopter in 3-dimensional space using intelligent control strategies," IEEE Transactions on neural systems and rehabilitation engineering, vol. 18, no. 6, pp. 581-589, 2010.
- S. Bhattacharyya, M. Pal, A. Konar, and D. Tibarewala, "An interval type-2 fuzzy approach for real-time EEG-based control of wrist and finger movement," Biomedical Signal Processing and Control, vol. 21, pp. 90-98, 2015.
- K. K. Ang and C. Guan, "EEG-based strategies to detect motor imagery for control and rehabilitation," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 25, no. 4, pp. 392-401, 2016.
- R. Mahajan and D. Bansal, "Real time EEG based cognitive brain computer interface for control applications via Arduino interfacing," Procedia computer science, vol. 115, pp. 812-820, 2017.
- K. S. Mistry, P. Pelayo, D. G. Anil, and K. George, "An SSVEP based brain computer interface system to control electric wheelchairs," in 2018 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), 2018: IEEE, pp. 1-6.
- M. Athif and H. Ren, "WaveCSP: a robust motor imagery classifier for consumer EEG devices," Australasian physical & engineering sciences in medicine, vol. 42, no. 1, pp. 159-168, 2019, doi: https://doi.org/10.1007/s13246-019-00721-0.
- C.-C. Fan, H. Yang, Z.-G. Hou, Z.-L. Ni, S. Chen, and Z. Fang, "Bilinear neural network with 3-D attention for brain decoding of motor imagery movements from the human EEG," Cognitive Neurodynamics, vol. 15, no. 1, pp. 181-189, 2021, doi: https://doi.org/10.1007/s11571-020-09649-8.
- R. Xiao, Y. Huang, R. Xu, B. Wang, X. Wang, and J. Jin, "Coefficient-of-variation-based channel selection with a new testing framework for MI-based BCI," Cognitive Neurodynamics, vol. 16, no. 4, pp. 791-803, 2022, doi: https://doi.org/10.1007/s11571-021-09752-4.
- A. Ng, "Sparse autoencoder," CS294A Lecture notes, vol. 72, no. 2011, pp. 1-19, 2011.
- A. Schlögl, G. Müller, R. Scherer, and G. Pfurtscheller, "BIOSIG-an Open Source Software Package for biomedical Signal Processing," in 2nd OpenECG Workshop, 2004, pp. 77-78.
- B. Blankertz et al., "BCI Competition III," Fraunhofer FIRST.IDA, http://ida.first.fraunhofer.de/projects/bci/competition_iii, 2005.
- S. Villwock and M. Pacas, "Application of the Welch-method for the identification of two-and three-mass-systems," IEEE Transactions on Industrial Electronics, vol. 55, no. 1, pp. 457-466, 2008.
- R. Choudhary, S. Mahesh, J. Paliwal, and D. Jayas, "Identification of wheat classes using wavelet features from near infrared hyperspectral images of bulk samples," Biosystems Engineering, vol. 102, no. 2, pp. 115-127, 2009.
- G. Rilling, P. Flandrin, and P. Goncalves, "On empirical mode decomposition and its algorithms," in IEEE-EURASIP workshop on nonlinear signal and image processing, 2003, vol. 3, no. 3: Citeseer, pp. 8-11.
- N. E. Huang and Z. Wu, "A review on Hilbert-Huang transform: Method and its applications to geophysical studies," Reviews of geophysics, vol. 46, no. 2, 2008.