11
A Combinational Deep Learning Approach to Visually Evoked EEG-Based Image Classification

Nandini Kumari*, Shamama Anwar and Vandana Bhattacharjee

Department of Computer Science and Engineering, Birla Institute of Technology, Mesra, Ranchi, India

Abstract

Visual stimulus evoked potentials are neural oscillations acquired from the brain’s electrical activity while a subject views an image or video as a stimulus. With the advancement of deep learning techniques, decoding visually evoked EEG (electroencephalogram) signals has become a versatile study for neuroscience and computer vision alike. Deep learning techniques can learn problem-specific features automatically, which eliminates the traditional feature extraction procedure. In this work, combinational deep learning–based classification models are used to classify visually evoked EEG signals recorded while viewing images from two classes (i.e., animal and object) in the MindBig and Perceive datasets, without the need for an additional feature extraction step. To achieve this objective, two deep learning–based architectures have been proposed and their classification accuracies compared. The first proposed model is a modified convolutional neural network (CNN), and the second is a hybrid CNN+LSTM model which can better classify and predict the EEG features for visual classification. The raw EEG signals are converted to spectrogram images, since CNN-based networks learn more discriminating features from images. The proposed CNN+LSTM architecture uses depthwise separable convolutions, i.e., the Xception network (Extreme Inception), with a parallel LSTM layer to extract temporal, frequential, and spatial features from EEG signals, and classifies more precisely than the proposed CNN network. The overall average accuracies achieved are 69.84% and 71.45% with the CNN model and 74.57% and 76.05% with the combinational CNN+LSTM model on the MindBig and Perceive datasets, respectively, which shows that the hybrid model outperforms the CNN model.

Keywords: Electroencephalogram, visual stimuli, convolutional neural network, long short-term memory, Xception, depthwise separable convolution, spectrogram

11.1 Introduction

Deep learning is a sub-field of artificial intelligence which endeavors to learn significant, layerwise abstracted patterns and information by using deep network models. It is an emerging approach and has been broadly applied in conventional decision reasoning and machine learning areas, for example, semantic parsing [9], natural language processing [39], transfer learning [13, 46], computer vision [33], and more. The three most significant reasons behind the boom of deep learning today are the significantly increased processing capabilities (e.g., GPU units), the low cost of computing hardware, and the extensive advancement of machine learning algorithms [17]. Recently, diverse deep learning strategies have been broadly implemented and examined in many application areas such as self-driving cars [25], news aggregation and fake news detection [57], natural language processing [31], virtual assistants and entertainment [30], visual recognition [50], fraud detection [47], healthcare [21], and many more. Deep learning is widely accepted for computer vision tasks as it has shown near-human or even better capabilities on numerous tasks, such as object detection and sequence learning. As opposed to deep learning, traditional machine learning approaches for classification, for example, linear discriminant analysis (LDA) or the support vector machine (SVM), require hand-crafted discriminating features which essentially contribute to the design of a system. Hand-crafted feature extraction requires detailed knowledge of the problem domain to discover highly expressive features in complex tasks, for example, computer vision and natural language processing. Additionally, the large dimension of the extracted features leads to the curse of dimensionality. Deep learning is well suited to these complex issues. The different layers in a deep learning model reveal complex features by progressively learning abstract representations in the first few layers and more specific qualities in the following layers [22]. Deep networks have been shown to be effective for computer vision tasks such as visual recognition, object recognition [18], motion tracking [19], action recognition [36], human pose estimation [53], and semantic segmentation [37], since they can extract suitable features while simultaneously performing classification or detection tasks. Through the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) competitions [1], deep learning techniques have been widely embraced by various researchers and have achieved the highest accuracy scores. Recently, deep learning strategies have found applications in more innovative and promising areas of research dealing with human perception. As vision is one of the main components of the human perception system, it has great potential for applications based on deep learning techniques. When the eyes receive visual stimulation, neural spikes are conveyed to the brain. The analysis of these neural spikes is turning into a fascinating research subject in the field of computer vision. Even though deep learning mechanisms [35], for example, CNNs, have brought great improvements to image classification tasks [26, 43], computer vision still falls short of the extent of actual human vision.
Rather than extracting discriminating features that depend on the training dataset, visually evoked recognition in the human mind involves cognitive as well as perceptual processes [7], i.e., how visual stimuli such as images and video stimulate the brain with respect to color and shape, and which parts of the visual cortex respond to them. Cognitive neuroscience research has found that, when the brain’s electrical activity is recorded while a person is viewing an image, explicit patterns related to the viewed image categories are produced in the brain [45]. The human cerebrum has the capacity to organize visual stimuli according to common and frequent patterns. This grouping and categorization of frequent patterns is fast and appears on millisecond time scales [10]. Because of the excellent temporal resolution of electroencephalogram (EEG) signals, images of visual objects can be successfully recognized using the explicit patterns generated in EEG signals captured while the images are shown on a screen in front of the subjects [56]. EEG signals have also been used to investigate and analyze signal complexity for object detection and classification tasks using machine learning methods [6].

Recently, the accessibility of huge EEG datasets and advancements in machine learning techniques have both motivated the adoption of deep learning tools, particularly for analyzing EEG signals and understanding the brain’s functionality. Deep learning has likewise attracted a lot of attention in the field of EEG classification research, and several researchers have applied deep learning strategies to EEG in order to achieve good accuracy [15]. However, deep learning strategies are still challenging when applied to EEG-based BCI frameworks because of several complicating factors, for example, the relationship among channels, artifacts and noise introduced during EEG recording, and the high dimensionality of EEG data. Given all these challenges, the effective implementation of deep learning strategies for the characterization of EEG signals is a serious accomplishment.


Figure 11.1 Flowchart of proposed architecture.

This paper aims to provide a CNN- and LSTM-based framework for classification of visually evoked stimuli. Figure 11.1 describes the overall workflow of the proposed architecture. As an initial step, we explored the use of CNNs for classification of EEG signals recorded while a subject views images from the animal (i.e., cat, dog, panda, capuchin, and elephant) and object (i.e., mug, bike, airliner, broom, and phone) categories. In the second step, the acquired EEG signals are pre-processed using a basic filtering process to remove artifacts, followed by a Short-Time Fourier Transform (STFT) so as to obtain spectrogram images of the EEG signals. Finally, these images are used as input to the proposed CNN model for a binary classification task over the two categories, i.e., animal and object. The paper also presents the accuracy of the implemented CNN network designed to classify stimuli-evoked EEG recordings, along with a comparative study of the CNN and combinational CNN+LSTM-based classifiers. The results show that the proposed CNN+LSTM-based model provides better accuracy compared to some of the existing work. The following sections of the paper present an extensive literature review, followed by the methodology section including the dataset description, the proposed architectures, and the implementation of the models. Finally, the results are documented, followed by the conclusion.

11.2 Literature Review

There are various innovative studies based on feature extraction and machine learning techniques [14]. Previously, time-frequency analysis [23], nonlinear features, complexity, synchronization, and accumulated energy measures [28] were utilized as feature extraction strategies, while the machine learning classifiers included naive Bayes, conventional neural networks, SVM, and random forest classifiers [24]. Indeed, feature extraction strategies have been utilized effectively in EEG classification tasks. In past studies, different feature extraction techniques for retrieving complex EEG information have been proposed [5]. Existing strategies for the discovery of complex patterns use hand-crafted procedures for extraction of prominent features from EEG signals; the chosen features must then be able to characterize EEG signals using a wide range of classifiers, for example, SVM, KNN, principal component analysis, independent component analysis, and LDA [51], or using FFT-based dual-tree complex wavelet transform features [11].

Most of the literature on traditional approaches extracts features from EEG for different application domains. This results in a 1D vector representation of the extracted features, which may lose some of the discriminating and prominent features. Therefore, a few researchers have used methods to represent an EEG signal in 2D before applying a deep learning framework. In [16], STFT was used to transform the time-frequency features of EEG data into 2D images. In some previous research, feature extraction was not performed and the deep learning models were instead trained on raw EEG signals [4, 29].

In [16], these time-frequency images were then used as inputs for two-class motor imagery classification by combining CNN and VAE (Variational Auto-Encoder) frameworks. Multi-class motor imagery classification has also been implemented using two variants of CNN models, namely, a monolithic (single CNN) and a modular (four CNN) architecture, tuned with a Bayesian optimization technique. Both procedures use features extracted by a variation of the Discriminative Filter Bank Common Spatial Pattern (DFBCSP) [44].

Deep learning techniques such as DBNs (Deep Belief Networks), RNNs (Recurrent Neural Networks), CNNs (Convolutional Neural Networks), LSTMs (Long Short-Term Memory networks), SAEs (Stacked Autoencoders), and hybrid deep networks have made noteworthy leaps in seizure and emotion detection, sleep scoring, motor imagery, event-related potential detection, and visual stimuli–related image classification [15], as well as pattern recognition tasks [20, 32]. Recently, EEG information used in conjunction with facial expressions has been applied to continuous emotion recognition [49].

In [55], a new 3D CNN architecture to predict different stages of EEG data for epilepsy classification from multichannel EEG signals was discussed. Each EEG channel is converted into a 2D image, and these are then combined to form 3D inputs which serve as input to a 3D CNN structure for classification. According to [59], frequency-based signals have greater potential than temporal signals for CNN applications; therefore, both frequency- and time-domain features for EEG classification were tested using the same CNN classification architecture. The EEG signal is also utilized to monitor the brain’s electrical activity to analyze diverse sleep disorders. Deep learning has also been applied for sleep disorder detection using single-channel EEG [40] and for speech recognition [42]. Decoding visual stimuli from EEG-captured brain activity is an interdisciplinary investigation of neuroscience and computer vision, and novel EEG-driven automatic visual classification and regeneration techniques have likewise been proposed. Ran Manor and Amir B. Geva presented a CNN model for single-trial EEG classification in visually evoked potential tasks [38]. Similar tasks were performed for object classification using EEG signals evoked by stimulus-related potentials but with a different CNN architecture [48]. Nicholas Waytowich proposed a compact CNN to extract features from raw EEG signals automatically, which can be used to decode signals from a 12-class visual evoked potential dataset [54]. The combination of CNN and LSTM has also been implemented for visually evoked classification: an automated model for visual classification in which an LSTM extracts prominent feature representations from the EEG signal, which are then fed to a ResNet, leading to improved classification performance [58]. A deep neural network (DNN) combining RNN and CNN has been trained for the EEG classification task using video as the visual stimulus [52]. In [8], the authors proposed a hybrid deep learning model based on CNN and RNN to classify visually evoked EEG signals and achieved better accuracy than a CNN alone, as the LSTM works well with temporal data and helps to find prominent features which the CNN architecture then uses for improved classification. Despite many examples of impressive progress of deep networks for visual stimuli evoked EEG signal classification, there is still room for considerable improvement in performance accuracy with hybrid deep structures.

11.3 Methodology

11.3.1 Dataset Acquisition

In this experiment, two publicly accessible open datasets, the MindBig dataset [2] and the Perceive lab dataset [3], are considered. Table 11.1 presents the overall description of both datasets.

In this paper, for experimentation, the EEG signals are taken from two categories of visual stimuli, that is, objects and animals. The object class consists of EEG signals recorded while viewing images of objects such as a mug, bike, airliner, broom, and phone. Similarly, the animal class consists of EEG signals acquired while a subject views images of animals such as a cat, dog, panda, capuchin, and elephant.

Table 11.1 Dataset description.

Dataset | MindBig dataset | Perceive dataset
No. of subjects | 1 subject | 6 subjects
No. of images used as stimulus | 569 image classes from the ImageNet ILSVRC2013 train dataset (14,012 images) | 40 image classes from the ImageNet ILSVRC2013 train dataset (2,000 images)
Available data | Raw EEG data available | Raw EEG data available
Sampling rate | 128 Hz | 128 Hz
EEG headset | 5-channel EEG headset (Emotiv Insight); channel locations: AF3, AF4, T7, T8, Pz | 128-channel EEG headset (actiCAP 128Ch)

11.3.2 Pre-Processing and Spectrogram Generation

The EEG signals acquired from the datasets can be decomposed into different frequency band powers, the range of which depends on the purpose of the research. A 50-Hz notch filter and a band-pass Butterworth filter between 14 and 71 Hz were set up; hence, the recorded EEG signals covered the beta (15-31 Hz) and gamma (32-70 Hz) bands, as these carry information about the psychological processes engaged in visual recognition [41]. The time and spatial resolution of EEG data can be retained by converting the EEG data to an image-like representation, and various techniques have been used to create a 2D image representation from 1D raw signals to handle classification of time series data using CNNs [27, 34]. Hence, to maintain both the time and spatial resolution of the acquired EEG signals and to convert the 1D EEG signals to 2D images, the STFT algorithm is applied to the filtered EEG signals. It produces a single 2D time-frequency domain spectrogram image for every EEG electrode position E; for instance, the 128 EEG channels of the Perceive dataset yield 128 2D spectrogram images.
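
The sketch below illustrates how this pre-processing and spectrogram generation could be realized with SciPy and Matplotlib; it is not the authors' exact code. The 128-Hz sampling rate, 50-Hz notch, and 14-71 Hz band follow the text, while the filter order, STFT window length, two-second epoch, and the clipping of the upper band edge below the Nyquist frequency are illustrative assumptions.

```python
# Illustrative sketch (not the authors' exact code): 50-Hz notch, 14-71 Hz
# Butterworth band-pass, and an STFT spectrogram for one EEG channel.
# `eeg_channel` is assumed to be a 1-D NumPy array sampled at 128 Hz.
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

FS = 128.0  # sampling rate reported for both datasets (Hz)

def preprocess_channel(eeg_channel, fs=FS, low=14.0, high=71.0):
    # 50-Hz notch filter to suppress power-line interference
    b_n, a_n = signal.iirnotch(w0=50.0, Q=30.0, fs=fs)
    x = signal.filtfilt(b_n, a_n, eeg_channel)
    # 4th-order Butterworth band-pass covering the beta and gamma bands;
    # the upper edge is clipped just below the Nyquist frequency (64 Hz at fs = 128 Hz)
    high = min(high, fs / 2.0 - 1.0)
    b_bp, a_bp = signal.butter(4, [low, high], btype="bandpass", fs=fs)
    return signal.filtfilt(b_bp, a_bp, x)

def channel_spectrogram(eeg_channel, fs=FS):
    # STFT magnitude; window length and overlap are illustrative choices
    f, t, Sxx = signal.spectrogram(eeg_channel, fs=fs, nperseg=64, noverlap=48)
    return f, t, 10.0 * np.log10(Sxx + 1e-12)  # log-power spectrogram

# Example: one synthetic two-second channel -> one 64 x 64 spectrogram image
x = preprocess_channel(np.random.randn(2 * int(FS)))
f, t, S = channel_spectrogram(x)
fig = plt.figure(figsize=(1, 1), dpi=64)   # 64 x 64 pixel output
ax = fig.add_axes([0, 0, 1, 1])            # fill the whole canvas
ax.pcolormesh(t, f, S, shading="auto")
ax.axis("off")
fig.savefig("channel_00.png")
```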

The EEG signals in the MindBig dataset were acquired from five electrode points; therefore, five 2D spectrogram images were generated for each image sample. Here, 1,250 spectrogram images of size 64 × 64 were generated from 25 trials for each of the 10 images (5 from the object group and 5 from the animal group), i.e., 25 × 10 × 5 = 1,250 images. Similarly, the EEG signals from the Perceive dataset were acquired from 6 subjects with 128 electrode points; therefore, 128 2D spectrogram images are generated per trial, and across the trials for each of the 10 images this gives 10 × 10 × 128 = 12,800 images from one subject. A total of 76,800 spectrogram images was generated from the 6 subjects using the STFT algorithm. The generated data was split into an 80:20 train-test ratio.

We have proposed two deep learning–based models to classify visually evoked EEG signals. The first architecture is a CNN model, and the second is a hybrid CNN and LSTM architecture proposed to achieve better performance. These models are also compared with respect to architecture, accuracy, and overall performance.

11.3.3 Classification of EEG Spectrogram Images With Proposed CNN Model

Since the objective is to classify spectrogram images generated from stimuli-evoked EEG signals, a modified version of a CNN model has been used to classify the EEG spectrograms. The STFT-generated 2D spectrogram images serve as the input to the CNN architecture. In this paper, the CNN architecture is composed of four Conv2D layers and two fully connected (FC) layers. All four convolutional layers use the non-linear ReLU (Rectified Linear Unit) activation function.

The first layer receives the spectrogram images of dimension 64 × 64 × 3 (3 represents RGB) as input and outputs a 64 × 64 × 64 feature map by applying 64 kernels of size 3 × 3 with stride 1. The second convolutional layer applies 128 kernels of size 3 × 3 and yields a 62 × 62 × 128 feature map, followed by a 2D max-pooling layer with stride 2 which reduces the output to 31 × 31 × 128. The third and fourth convolutional layers each consist of 256 kernels of size 3 × 3, yielding 31 × 31 × 256 and 29 × 29 × 256 feature maps, respectively. Another 2D max-pooling layer then retains the most relevant features at size 14 × 14 × 256. Finally, the resulting feature map is flattened and passed through two dense layers of 512 and 2 output neurons, with 50% dropout, and the Softmax function derives the probabilities for the 2 classes, i.e., object and animal. A brief summary of the layers, input and output sizes, operations, and filter sizes implemented in the CNN model is shown in Table 11.2. This architecture has been tested using both datasets.

Table 11.2 Architecture of proposed convolutional neural network.

Layer | Input | Operation | Filter size | Strides | Output
1 | 64 × 64 × 3 | Conv2D + ReLU | 64 × 3 × 3 | 1 | 64 × 64 × 64
3 | 64 × 64 × 64 | Conv2D + ReLU | 128 × 3 × 3 | 1 | 62 × 62 × 128
4 | 62 × 62 × 128 | Maxpool | 2 × 2 | 2 | 31 × 31 × 128
5 | 31 × 31 × 128 | Conv2D + ReLU | 256 × 3 × 3 | 1 | 31 × 31 × 256
6 | 31 × 31 × 256 | Conv2D + ReLU | 256 × 3 × 3 | 1 | 29 × 29 × 256
7 | 29 × 29 × 256 | Maxpool | 2 × 2 | 2 | 14 × 14 × 256
8 | 14 × 14 × 256 | Flatten | – | – | 50,176
9 | 50,176 | Dense + ReLU | 512 | – | 512 (25,690,624 parameters)
10 | 512 | Dropout (0.5) | – | – | 512
11 | 512 | Dense + Softmax | 2 | – | 2 (2 × 512 weights)

Additionally, this network was trained with batch size 32, and RMSprop (Root Mean Square Propagation), a widely used gradient descent optimizer, was used for 100 epochs. For the RMSprop optimizer, a learning rate of 0.0001 and a decay of 1e-6 were used. This architecture has been tested using the datasets described above, and the acquired results and further analysis are provided in the Result and Discussion section.
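
A minimal Keras sketch of this network is given below, assuming TensorFlow 2.x; the layer sizes follow Table 11.2, the per-layer padding is inferred from the reported output shapes, and the RMSprop decay of 1e-6 is noted but omitted for portability across Keras versions.

```python
# Minimal Keras sketch of the CNN in Table 11.2 (TensorFlow 2.x assumed).
# Per-layer padding is inferred from the reported output shapes; the paper also
# uses an RMSprop decay of 1e-6, omitted here for portability across Keras versions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(input_shape=(64, 64, 3), n_classes=2):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(64, 3, padding="same", activation="relu"),    # 64 x 64 x 64
        layers.Conv2D(128, 3, padding="valid", activation="relu"),  # 62 x 62 x 128
        layers.MaxPooling2D(2),                                     # 31 x 31 x 128
        layers.Conv2D(256, 3, padding="same", activation="relu"),   # 31 x 31 x 256
        layers.Conv2D(256, 3, padding="valid", activation="relu"),  # 29 x 29 x 256
        layers.MaxPooling2D(2),                                     # 14 x 14 x 256
        layers.Flatten(),                                           # 50,176
        layers.Dense(512, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-4),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Hypothetical usage with one-hot labels and the 80:20 split described earlier:
# model = build_cnn()
# model.fit(x_train, y_train, batch_size=32, epochs=100,
#           validation_data=(x_test, y_test))
```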

11.3.4 Classification of EEG Spectrogram Images With Proposed Combinational CNN+LSTM Model

The proposed network consists of two different modules: a CNN module which uses the Xception model, pretrained on the ImageNet dataset as implemented in Keras, and an independent LSTM module. These two modules run in parallel. The first module is a CNN-based architecture, Xception, which stands for “Extreme Inception”. This is an extension of the Inception architecture which replaces the standard Inception modules with depthwise separable convolutions. The Xception architecture has 36 convolutional layers comprising the feature extraction base of the network. The two spatial dimensions of width and height and the channel dimension allow a convolution layer to learn filters in a 3D space; thus, the task of simultaneously mapping cross-channel features and spatial features rests with a single convolution kernel [12].
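
The parameter saving brought by this factorization can be seen directly in Keras, where SeparableConv2D implements a depthwise separable convolution; the small comparison below is illustrative only, and the 64 × 64 × 128 input shape is an arbitrary assumption.

```python
# Illustrative comparison (not from the chapter): a standard convolution versus a
# depthwise separable convolution over the same (arbitrary) 128-channel feature map.
import tensorflow as tf
from tensorflow.keras import layers

inp = tf.keras.Input(shape=(64, 64, 128))
standard = layers.Conv2D(256, 3)(inp)            # 3*3*128*256 + 256 = 295,168 parameters
separable = layers.SeparableConv2D(256, 3)(inp)  # 3*3*128 + 128*256 + 256 = 34,176 parameters

print(tf.keras.Model(inp, standard).count_params())
print(tf.keras.Model(inp, separable).count_params())
```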

EEG signals contain temporal, frequential, and spatial features, and these should be extracted as they carry important information about the signal. A conventional CNN maps spatial and cross-channel correlations jointly with a single kernel and does not perform per-channel convolutions, so it is unable to extract depthwise features; a depthwise separable convolution factorizes these two steps, requiring fewer connections and producing a lighter model. Therefore, we implemented the Xception network, which can extract spatial features channel by channel, as it uses depthwise separable convolutions. Initially, the network takes an RGB spectrogram image of shape 299 × 299 × 3, which is passed through the pretrained Xception model up to the final convolution block; this yields the bottleneck features, a 2,048-dimensional vector per image, i.e., an output of shape (batch size, 2,048). The second module, the LSTM layer, is used to extract the temporal relationships within the input spectrogram images generated from the EEG signals.

In the LSTM module, the 299 × 299 × 3 image is converted to a grayscale image of size 299 × 299 × 1 so that it can be properly split into chunks to be fed into the LSTM. This 299 × 299 image is then reshaped into (23, 3887), where 23 is the number of time-steps and 3887 is the dimension of each time-step; these values were chosen because 23 × 3887 = 299 × 299. The reshaped image is passed through two LSTM layers, each producing an output of shape (batch size, 2,048). The Xception network yields the depthwise features through its multiple feature extraction layers, while the independent LSTM module processes the same spectrogram to yield temporal features; combined, they help to recognize the overall pattern of the EEG spectrogram and improve the classification results. With an output of shape (batch size, 2,048) from both the CNN and LSTM modules, the two outputs are merged using elementwise multiplication. The result of this multiplication is provided to the classification layer, which consists of two nodes (two classes, i.e., object and animal) with softmax activation. Figure 11.2 shows the architecture of the proposed hybrid CNN+LSTM network, and Figure 11.3 depicts the overall architecture of the original Xception network [12].
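
A sketch of this two-branch design using the Keras functional API is given below (TensorFlow 2.x assumed). The input size, the reshape to (23, 3887), the elementwise multiplication, and the softmax classifier follow the description above; freezing the pretrained Xception weights and the Lambda-based grayscale conversion and input scaling are assumptions.

```python
# Sketch of the two-branch CNN+LSTM network (TensorFlow 2.x assumed), not the
# authors' exact implementation.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_lstm(n_classes=2):
    rgb_in = layers.Input(shape=(299, 299, 3), name="spectrogram_rgb")

    # CNN branch: ImageNet-pretrained Xception truncated at its bottleneck and
    # global-average-pooled to a 2,048-dimensional feature vector
    scaled = layers.Lambda(lambda x: x / 127.5 - 1.0)(rgb_in)  # Xception expects [-1, 1]
    xception = tf.keras.applications.Xception(weights="imagenet",
                                              include_top=False, pooling="avg")
    xception.trainable = False           # assumption: used as a fixed feature extractor
    cnn_feat = xception(scaled)          # (batch, 2048)

    # LSTM branch: grayscale image reshaped into 23 time-steps of dimension 3,887
    gray = layers.Lambda(lambda x: tf.image.rgb_to_grayscale(x))(rgb_in)  # (batch, 299, 299, 1)
    seq = layers.Reshape((23, 3887))(gray)        # 23 * 3887 = 299 * 299
    lstm_feat = layers.LSTM(2048, return_sequences=True)(seq)
    lstm_feat = layers.LSTM(2048)(lstm_feat)      # (batch, 2048)

    # Merge the two 2,048-dimensional outputs element-wise and classify
    merged = layers.Multiply()([cnn_feat, lstm_feat])
    out = layers.Dense(n_classes, activation="softmax")(merged)
    return models.Model(rgb_in, out)
```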


Figure 11.2 Architecture of proposed combinational CNN+LSTM model.


Figure 11.3 Overall XceptionNet architecture.

11.4 Result and Discussion

As discussed in the previous section, two different models, i.e., a modified CNN model and a hybrid CNN+LSTM model, were implemented to classify the EEG-based spectrogram images. As described in the dataset section, two datasets, MindBig and Perceive, are used to train and test the proposed models. As explained above, the raw EEG signals are converted into spectrogram images using STFT, which then serve as input data for the proposed CNN- and CNN+LSTM-based techniques. Additionally, the combinational model was trained in mini-batches of size 32 with the Adam optimizer for 50 epochs on both datasets. The overall accuracies of both models on both datasets are provided in Table 11.3. The proposed CNN model achieved 69.84% and 71.45% classification accuracy, while the combinational CNN+LSTM model performed better, with classification accuracies of 74.57% and 76.05% on the MindBig and Perceive datasets, respectively. In this work, a CNN model was implemented first, but as the resulting accuracy was not very good, a depthwise CNN architecture (the Xception network) was combined with an LSTM model to extract multilevel spatial and channelwise features, as well as temporal and frequential features from the LSTM layer, which helped to improve classification accuracy. Figures 11.4a and 11.4b show the training and testing accuracy curves of the CNN model, and Figures 11.5a and 11.5b present the training and testing accuracy of the proposed combinational CNN+LSTM model.

Table 11.3 Classification accuracy (%) with two proposed models on two different datasets.

Model | MindBig dataset | Perceive dataset
CNN | 69.84 | 71.45
CNN+LSTM | 74.57 | 76.05

Figure 11.4 Proposed CNN model’s accuracy graph on (a) MindBig dataset and (b) Perceive dataset.


Figure 11.5 Proposed CNN+LSTM model’s accuracy graph on (a) MindBig dataset and (b) Perceive dataset.
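
A rough sketch of the training and evaluation procedure behind Table 11.3 and the accuracy curves in Figures 11.4 and 11.5 is given below; it assumes a Keras model such as those sketched earlier and hypothetical arrays x_train, y_train, x_test, y_test holding the spectrogram images and one-hot labels from the 80:20 split.

```python
# Rough sketch (assumed, not the authors' code) of the training and evaluation
# behind Table 11.3 and the accuracy curves in Figures 11.4 and 11.5.
# x_train, y_train, x_test, y_test are hypothetical arrays holding the spectrogram
# images and one-hot labels from the 80:20 split described earlier.
import matplotlib.pyplot as plt

def train_and_plot(model, x_train, y_train, x_test, y_test, epochs=50, batch_size=32):
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x_train, y_train,
                        batch_size=batch_size, epochs=epochs,
                        validation_data=(x_test, y_test))
    # Accuracy curves analogous to Figures 11.4 and 11.5
    plt.plot(history.history["accuracy"], label="train accuracy")
    plt.plot(history.history["val_accuracy"], label="test accuracy")
    plt.xlabel("epoch")
    plt.ylabel("accuracy")
    plt.legend()
    plt.show()
    return history

# Hypothetical usage: train_and_plot(build_cnn_lstm(), x_train, y_train, x_test, y_test)
```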

11.5 Conclusion

In this chapter, a visual classification system is proposed utilizing EEG signals evoked by visual stimuli while viewing an image. A CNN-based model and a combinational CNN+LSTM model have been proposed to classify the images based on the recorded EEG signals. The contribution of this work is twofold. The first is the use of the STFT strategy to transform the visually evoked 1D EEG signals into 2D spectrogram images for obtaining better results. Second, this paper provides deep learning–based network models to accomplish the task: a CNN-based model and a combinational CNN+LSTM model, which are used to classify the visually evoked EEG signals and recognize the image an individual is seeing at the time of the EEG recording. The proposed CNN model was able to classify the spectrogram images extracted from the MindBigData to identify the two different classes (animal and object) the subject was viewing while the EEG was captured. The efficiency of the CNN+LSTM-based model is evident from its accuracy in comparison to the CNN model. A future extension of this work may include testing the model on other datasets acquired through different visual stimuli. The architecture can also be tested for multi-class classification using more image classes. Application-wise, this technique can find varied implementations in the field of brain fingerprinting and also early abnormality detection in children by asking them to visualize an object that is shown to them on screen.

References

1. Imagenet dataset visual challenge ilsvrc, 2014, http://www.image-net.org/challenges/LSVRC/2014/results.

2. Mindbig dataset, http://opendatacommons.org/licenses/odbl/1.0/. Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/

3. Spampinato, C., Palazzo, S., Kavasidis, I., Giordano, D., Souly, N., Shah, M., Deep learning human mind for automated visual classification, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 6809–6817, 2017.

4. Acharya, U.R., Oh, S.L., Hagiwara, Y., Tan, J.H., Adeli, H., Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput. Biol. Med., 100, 270–278, 2018.

5. Acharya, U.R., Sree, S.V., Ang, P.C.A., Yanti, R., Suri, J.S., Application of non-linear and wavelet based features for the automated identification of epileptic EEG signals. Int. J. Neural Syst., 22, 02, 1250002, 2012.

6. Ahmadi-Pajouh, M.A., Ala, T.S., Zamanian, F., Namazi, H., Jafari, S., Fractal-based classification of human brain response to living and non-living visual stimuli. Fractals, 26, 05, 1850069, 2018.

7. Alpert, G.F., Manor, R., Spanier, A.B., Deouell, L.Y., Geva, A.B., Spatiotemporal representations of rapid visual target detection: A single-trial EEG classification algorithm. IEEE Trans. Biomed. Eng., 61, 8, 2290–2303, 2013.

8. Attia, M., Hettiarachchi, I., Hossny, M., Nahavandi, S., A time domain classification of steady-state visual evoked potentials using deep recurrent-convolutional neural networks, in: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), IEEE, pp. 766–769, 2018.

9. Bordes, A., Glorot, X., Weston, J., Bengio, Y., Joint learning of words and meaning representations for open-text semantic parsing. In: Artificial Intelligence and Statistics, La Palma, Canary Islands, PMLR, 127–135, 2012.

10. Cecotti, H., Eckstein, M.P., Giesbrecht, B., Single-trial classification of event-related potentials in rapid serial visual presentation tasks using supervised spatial filtering. IEEE Trans. Neural Networks Learn. Syst., 25, 11, 2030–2042, 2014.

11. Chen, G., Xie, W., Bui, T.D., Krzyżak, A., Automatic epileptic seizure detection in EEG using nonsubsampled wavelet–fourier features. J. Med. Biol. Eng., 37, 1, 123–131, 2017.

12. Chollet, F., Xception: Deep learning with depthwise separable convolutions, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1251–1258, 2017.

13. Cireşan, D.C., Meier, U., Schmidhuber, J., Transfer learning for Latin and Chinese characters with deep neural networks, in: The 2012 international joint conference on neural networks (IJCNN), IEEE, pp. 1–6, 2012.

14. Cogan, D.L.C., Multi extracerebral biosignal analysis for epileptic seizure monitoring. ProQuest Dissertations Publishing, The University of Texas at Dallas, 2016.

15. Craik, A., He, Y., Contreras-Vidal, J.L., Deep learning for electroencephalogram (EEG) classification tasks: a review. J. Neural Eng., 16, 3, 031001, 2019.

16. Dai, M., Zheng, D., Na, R., Wang, S., Zhang, S., EEG classification of motor imagery using a novel deep learning framework. Sensors, 19, 3, 551, 2019.

17. Deng, L., A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Trans. Signal Inf. Process., 3, 2014.

18. Diba, A., Sharma, V., Pazandeh, A., Pirsiavash, H., Van Gool, L., Weakly supervised cascaded convolutional networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 914–922, 2017.

19. Doulamis, N., Adaptable deep learning structures for object labeling/tracking under dynamic visual environments. Multimed. Tools Appl., 77, 8, 9651–9689, 2018.

20. Erfani, S.M., Rajasegarar, S., Karunasekera, S., Leckie, C., High-dimensional and large-scale anomaly detection using a linear one-class SVM with deep learning. Pattern Recognit., 58, 121–134, 2016.

21. Faust, O., Hagiwara, Y., Hong, T.J., Lih, O.S., Acharya, U.R., Deep learning for healthcare applications based on physiological signals: A review. Comput. Methods Programs Biomed., 161, 1–13, 2018.

22. Fedjaev, J., Decoding EEG brain signals using recurrent neural networks, 2017.

23. Fiscon, G., Weitschek, E., Cialini, A., Felici, G., Bertolazzi, P., De Salvo, S., Bramanti, A., Bramanti, P., De Cola, M.C., Combining EEG signal processing with supervised methods for Alzheimer’s patients classification. BMC Med. Inf. Decis. Making, 18, 1, 35, 2018.

24. Fraiwan, L., Lweesy, K., Khasawneh, N., Wenz, H., Dickhaus, H., Automated sleep stage identification system based on time-frequency analysis of a single EEG channel and random forest classifier. Comput. Methods Programs Biomed., 108, 1, 10–19, 2012.

25. Grigorescu, S., Trasnea, B., Cocias, T., Macesanu, G., A survey of deep learning techniques for autonomous driving. J. Field Rob., 37, 3, 362–386, 2020.

26. Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X., Wang, G., Cai, J. et al., Recent advances in convolutional neural networks. Pattern Recognit., 77, 354–377, 2018.

27. Ha, K.W. and Jeong, J.W., Motor imagery EEG classification using Capsule Networks. Sensors, 19, 13, 2854, 2019.

28. Hsu, Y.L., Yang, Y.T., Wang, J.S., Hsu, C.Y., Automatic sleep stage recurrent neural classifier using energy features of EEG signals. Neurocomputing, 104, 105–114, 2013.

29. Hussein, R., Palangi, H., Ward, R., Wang, Z.J., Epileptic seizure detection: A deep learning approach. Signal Process., 1–12, 2018.

30. Iannizzotto, G., Bello, L.L., Nucita, A., Grasso, G.M., A vision and speech enabled, customizable, virtual assistant for smart environments, in: 2018 11th International Conference on Human System Interaction (HSI), IEEE, pp. 50–56, 2018.

31. Kamath, U., Liu, J., Whitaker, J., Deep learning for NLP and speech recognition. vol. 84, Springer, Switzerland, 2019.

32. Krizhevsky, A., Sutskever, I., Hinton, G.E., Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, vol. 25, 1097–1105, 2012.

33. Krizhevsky, A., Sutskever, I., Hinton, G.E., Imagenet classification with deep convolutional neural networks. Commun. ACM, 60, 6, 84–90, 2017.

34. Kwon, Y.H., Shin, S.B., Kim, S.D., Electroencephalography based fusion two-dimensional (2D)-convolution neural networks (CNN) model for emotion recognition system. Sensors, 18, 5, 1383, 2018.

35. LeCun, Y., Bengio, Y., Hinton, G., Deep learning. Nature, 521, 7553, 436–444, 2015.

36. Lin, L., Wang, K., Zuo, W., Wang, M., Luo, J., Zhang, L., A deep structured model with radius–margin bound for 3D human activity recognition. Int. J. Comput. Vision, 118, 2, 256–273, 2016.

37. Long, J., Shelhamer, E., Darrell, T., Fully convolutional networks for semantic segmentation, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3431–3440, 2015.

38. Manor, R. and Geva, A.B., Convolutional neural network for multi-category rapid serial visual presentation BCI. Front. Comput. Neurosci., 9, 146, 2015.

39. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J., Distributed representations of words and phrases and their compositionality, in: Advances in neural information processing systems, pp. 3111–3119, 2013.

40. Mousavi, S., Afghah, F., Acharya, U.R., Sleep EEGN: Automated sleep stage scoring with sequence to sequence deep learning approach. PLoS One, 14, 5, e0216456, 2019.

41. Niedermeyer, E. and Lopes da Silva, F.H., Electroencephalography: basic principles, clinical applications, and related fields. Lippincott Williams & Wilkins, 2005.

42. Noda, K., Yamaguchi, Y., Nakadai, K., Okuno, H.G., Ogata, T., Audio-visual speech recognition using deep learning. Appl. Intell., 42, 4, 722–737, 2015.

43. Ogawa, T., Sasaka, Y., Maeda, K., Haseyama, M., Favorite video classification based on multimodal bidirectional LSTM. IEEE Access, 6, 61401–61409, 2018.

44. Olivas-Padilla, B.E. and Chacon-Murguia, M.I., Classification of multiple motor imagery using deep convolutional neural networks and spatial filters. Appl. Soft Comput., 75, 461–472, 2019.

45. Op de Beeck, H.P., Vermaercke, B., Woolley, D., Wenderoth, N., Combinatorial brain decoding of people’s whereabouts during visuospatial navigation. Front. Neurosci., 7, 78, 2013.

46. Ren, J., Xu, L., On vectorization of deep convolutional neural networks for vision tasks. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 29, 2015.

47. Roy, A., Sun, J., Mahoney, R., Alonzi, L., Adams, S., Beling, P., Deep learning detecting fraud in credit card transactions. In: 2018 Systems and Information Engineering Design Symposium (SIEDS), Charlottesville, VA, USA, IEEE, 129–134, 2018.

48. Shamwell, J., Lee, H., Kwon, H., Marathe, A.R., Lawhern, V., Nothwang, W., Single-trial EEG RSVP classification using convolutional neural networks. In: Micro-and Nanotechnology Sensors, Systems, and Applications VIII. vol. 9836, Baltimore, Maryland, United States, International Society for Optics and Photonics, 2016, 983622.

49. Soleymani, M., Asghari-Esfeden, S., Fu, Y., Pantic, M., Analysis of EEG signals and facial expressions for continuous emotion detection. IEEE Trans. Affect. Comput., 7, 1, 17–28, 2015.

50. Spampinato, C., Palazzo, S., Kavasidis, I., Giordano, D., Souly, N., Shah, M., Deep learning human mind for automated visual classification, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 6809–6817, 2017.

51. Subasi, A. and Gursoy, M.I., EEG signal classification using PCA, ICA, LDA and support vector machines. Expert Syst. Appl., 37, 12, 8659–8666, 2010.

52. Tan, C., Sun, F., Zhang, W., Chen, J., Liu, C., Multimodal classification with deep convolutional-recurrent neural networks for electroencephalography, in: International Conference on Neural Information Processing, Springer, pp. 767–776, 2017.

53. Toshev, A. and Szegedy, C., Deeppose: Human pose estimation via deep neural networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1653–1660, 2014.

54. Waytowich, N., Lawhern, V.J., Garcia, J.O., Cummings, J., Faller, J., Sajda, P., Vettel, J.M., Compact convolutional neural networks for classification of asynchronous steady-state visual evoked potentials. J. Neural Eng., 15, 6, 066031, 2018.

55. Wei, X., Zhou, L., Chen, Z., Zhang, L., Zhou, Y., Automatic seizure detection using three-dimensional CNN based on multi-channel EEG. BMC Med. Inf. Decis. Making, 18, 5, 111, 2018.

56. Yu, R., Qiao, L., Chen, M., Lee, S.W., Fei, X., Shen, D., Weighted graph regularized sparse brain network construction for MCI identification. Pattern Recognit., 90, 220–231, 2019.

57. Zhang, J., Cui, L., Fu, Y., Gouza, F.B., Fake news detection with deep diffusive network model. arXiv preprint arXiv:1805.08751, 2018.

58. Zheng, X., Chen, W., You, Y., Jiang, Y., Li, M., Zhang, T., Ensemble deep learning for automated visual classification using EEG signals. Pattern Recognit., 102, 107147, 2020.

59. Zhou, M., Tian, C., Cao, R., Wang, B., Niu, Y., Hu, T., Guo, H., Xiang, J., Epileptic seizure detection based on EEG signals and CNN. Front. Neuroinf., 12, 95, 2018.

  1. *Corresponding author: [email protected]