Classification of patient by analyzing EEG signal using DWT and least square support vector machine

ARTICLE INFO
Article history: Received: 25 May, 2017; Accepted: 15 July, 2017; Online: 01 August, 2017

ABSTRACT
Epilepsy is the most widespread neurological disorder in human beings after stroke. Approximately 70% of epilepsy cases can be cured if diagnosed and medicated properly. Electroencephalogram (EEG) signals are recordings of the brain's electrical activity that provide insight into the mechanisms inside the brain. Since epileptic seizures occur erratically, it is essential to develop a model for automatically detecting seizures from EEG recordings. In this paper, a scheme is presented to detect epileptic seizures by applying the discrete wavelet transform (DWT) to the EEG signal. The DWT decomposes the signal into approximation and detail coefficients, and the approximate entropy (ApEn) values of these coefficients, computed with pattern lengths m = 2 and 3, serve as input features for the least squares support vector machine (LS-SVM). Classification is performed with the LS-SVM, and the results obtained with the RBF and linear kernels are compared. The proposed model uses EEG data consisting of five data sets, and the approximation and detail coefficients are evaluated both combined and individually. The classification accuracy of the LS-SVM with the RBF and linear kernels is compared across the different cases, and the best accuracy, 100%, is obtained with the RBF kernel.


Introduction
Epilepsy in humans is an intrinsic brain pathology whose major manifestation is epileptic seizures. Epileptic seizures may affect part of the brain (partial) or the whole cerebral mass (generalized). Seizures are recurrent, with interictal periods ranging from several minutes to several days. The brain's electrical activity is measured through the electroencephalogram (EEG) signal, which is an effective tool for studying the functioning of the brain and diagnosing epilepsy [1]. EEG recording is a non-invasive testing method that provides valuable details of distinct physiological states of the brain. The recordings are usually of long duration, and experts have to inspect this large volume of EEG data to detect epilepsy.
Advances in signal processing have made it possible to process and store EEG signals digitally. The processed EEG data is then fed as input to an automatic detection system to detect traces of epilepsy. Such a system reduces the workload of neurologists by reducing the effort and time required to detect traces of epilepsy in the recorded EEG signal. Automatic prediction and detection of epilepsy from EEG signals has been developed using different signal processing techniques such as frequency-domain analysis, wavelet analysis, spike detection, and non-linear methods. Gotman [2] presented an automatic detection system for epilepsy that decomposes the EEG signal into elementary waves. Srinivasan et al. [3] used time-domain and frequency-domain analysis to detect epilepsy, selecting five features, of which two were from the frequency domain and three from the time domain. They used a recurrent neural network, which was trained and tested on epileptic EEG signals with an accuracy of 99.6%. Keshri et al. [4] used the slope of the lines between each pair of consecutive data points (x1, y1) and (x2, y2), fed it into a deterministic finite automaton, and obtained an accuracy as high as 95.68%. Geva et al. [5] used wavelet analysis, since the wavelet transform (WT) provides both a time-domain and a frequency-domain view of the signal.

* Mohd Zuhair, RMSoEE, IIT Kharagpur, 9038273634 & md.zuhair.cs@gmail.com
Various methods are available to detect and predict epileptic seizures. Artificial neural networks (ANNs) have been widely used for detecting epileptic spikes [1,6,7,8,9]. Feature extraction is an important aspect of the performance of ANN models, as the model is trained and tested on the extracted features. ApEn was proposed by Pincus [10] as a statistical parameter to measure the regularity of time series data. ApEn is predominantly used in the analysis of electrocardiogram and other heart rate data [11,12,13] as well as in the analysis of endocrine hormone release [14]. It is a measure of regularity: a smaller value of ApEn indicates high regularity and a higher value indicates low regularity in the time series [15]. Diambra et al. [16] showed that ApEn gives valuable temporal localization of a variety of epileptic activity. Elman, PNN and SVM networks have been used for the detection of epilepsy with ApEn-based features, with a reported overall accuracy of 100%. Hence, ApEn is an acceptable feature for automated detection of epilepsy.
The support vector machine (SVM) is a classifier that performs classification by constructing hyperplanes in a multidimensional space, seeking the combination of samples that builds a plane maximizing the margin between two classes. SVMs are widely used in epilepsy detection and prediction. In [17], permutation entropy was used as a feature, as it drops during the seizure interval. Burg AR coefficients have also been used as input to an SVM, with a reported accuracy of 99.56% [18].
In this paper, automated detection of epileptic seizures using LS-SVM is discussed, comparing two kernel functions, RBF and linear. The EEG signals are decomposed into six sub-bands, namely D1-D5 and A5, using DWT. The complexity of the sub-bands is analyzed with ApEn, which acts as the input feature for the SVM. Experiments are carried out on different cases, each with a different combination of EEG data sets.

Proposed Method
The proposed method comprises four main tasks: the clinical data (Section 2.1), preprocessing of the EEG data through DWT to decompose it into several sub-bands (Section 2.2), feature extraction with ApEn (Section 2.3), and classification of the EEG data using the ApEn features (Section 2.4). Figure 1 shows the proposed approach used in this paper.

Clinical data
The data set used in this paper is in the public domain and is available online from the University of Bonn, Germany. It consists of five different sets of EEG data [19]. The data is artifact-free EEG time series data and is widely used in ongoing research on epilepsy [1,3,7,15,17,20]. The complete data set contains five sets (A-E), each with 100 single-channel EEG segments of 23.6 s duration. The segments were selected and cut out, after visual inspection for artifacts, from continuous multichannel EEG recordings with a sampling frequency of 173.61 Hz and band-pass settings of 0.53-40 Hz. The data contains three different classes: normal, epileptic background (pre-ictal), and epileptic seizure (ictal). The normal EEG data (A and B) was collected from five healthy volunteers. The pre-ictal EEG data (C and D) was recorded from five epileptic patients during periods with no traces of seizure. The ictal EEG data (E) was recorded during epileptic seizures from the same five patients. The EEG signals were recorded with a 128-channel amplifier using an average common reference and digitized at a 173.61 Hz sampling rate with 12-bit A/D resolution. Figure 2 and Figure 3 show specimens of an epileptic and a normal signal.

Discrete Wavelet Transform
The wavelet transform (WT) uses a variable window size and has well-known data compression and time-frequency filtering capabilities. It provides a good local representation of the signal in both the time and the frequency domain, which makes it an effective tool for analyzing the signal and extracting features. The wavelet transform looks for the spatial distribution of singularities, whereas the Fourier transform provides a description of the overall regularity of signals [5]. WT captures transient features, localizes them in both time and frequency, and is widely used in epileptic seizure detection [21,22]. WT uses long time windows to obtain finer low-frequency resolution and short time windows to obtain high-frequency information. Thus, WT provides precise frequency information at low frequencies and precise time information at high frequencies [23].
The wavelet transform of a signal x(t) is defined as

$$ W(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left( \frac{t - b}{a} \right) dt \qquad (1) $$

where ψ is the wavelet function and a and b are the scaling and shifting parameters, respectively. Quadrature mirror filters are used to realize the scheme by passing the signal through a series of low-pass (LP) and high-pass (HP) filters [24]. The output of the low-pass filter is termed the approximation (A) coefficients and the output of the high-pass filter is termed the detail (D) coefficients. The figure illustrates the fifth-level wavelet decomposition of a signal, showing the coefficients A1, D1, A2, D2, A3, D3, A4, D4, A5, and D5.
In this model, fifth-level wavelet decomposition was performed on the non-seizure data (A, B, C and D) as well as on the seizure data (E) using the Daubechies order 4 wavelet (db4). Researchers have found the db4 wavelet to be the most appropriate for the analysis of epileptic EEG data [25]. The structure of the wavelet decomposition at each level, with its approximation and detail coefficients, is shown in Figure 4.
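The five-level decomposition into D1-D5 and A5 can be illustrated with a minimal pure-Python sketch. It is not the authors' MATLAB code: the db4 filter taps are the commonly tabulated values entered by hand, the signal is extended periodically at the boundaries (a toolbox may default to a different extension mode, which changes the coefficient counts), and the names `dwt_step`, `wavedec` and the toy signal are hypothetical.

```python
import math

# Daubechies order-4 (db4) scaling filter, 8 taps (commonly tabulated values;
# the taps sum to sqrt(2) and have unit energy).
DB4_LO = [
    0.2303778133088965, 0.7148465705529157, 0.6308807679298589,
    -0.027983769416859854, -0.18703481171909309, 0.030841381835560764,
    0.0328830116668852, -0.010597401785069032,
]
# Quadrature mirror high-pass filter: g[k] = (-1)^k * h[L-1-k].
DB4_HI = [(-1) ** k * DB4_LO[len(DB4_LO) - 1 - k] for k in range(len(DB4_LO))]


def dwt_step(signal):
    """One analysis level: filter circularly, then keep every second sample."""
    n = len(signal)
    if n % 2:
        raise ValueError("signal length must be even")
    approx, detail = [], []
    for i in range(n // 2):
        window = [signal[(2 * i + k) % n] for k in range(len(DB4_LO))]
        approx.append(sum(h * x for h, x in zip(DB4_LO, window)))
        detail.append(sum(g * x for g, x in zip(DB4_HI, window)))
    return approx, detail


def wavedec(signal, levels=5):
    """Return a dict with detail bands D1..D<levels> and approximation A<levels>."""
    bands = {}
    approx = list(signal)
    for level in range(1, levels + 1):
        approx, detail = dwt_step(approx)
        bands[f"D{level}"] = detail
    bands[f"A{levels}"] = approx
    return bands


# Toy input (hypothetical): 256 samples of two mixed sinusoids.
toy = [math.sin(0.05 * t) + 0.5 * math.sin(1.3 * t) for t in range(256)]
bands = wavedec(toy, levels=5)
print({name: len(coeffs) for name, coeffs in bands.items()})
```

Each level halves the number of samples, so only the last approximation (A5) is kept next to the five detail bands; the intermediate approximations A1-A4 are consumed by the next level.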

Approximate entropy (ApEn)
Approximate entropy is a statistical feature for quantifying regularity and complexity in time-series data [10]. ApEn is a non-negative number and has been successfully applied in the field of pattern recognition. ApEn has potential applications throughout medicine, prominently in ECG and EEG analysis [14]. The following steps determine the value of ApEn for a time series x(1), x(2), ..., x(N) [1,8,26,27,28].

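1. Form the vectors A(1), ..., A(N−m+1), defined as

$$ A(i) = [x(i), x(i+1), \ldots, x(i+m-1)], \quad i = 1, \ldots, N-m+1 \qquad (2) $$

2. Define the distance between A(i) and A(j) as the maximum absolute difference between their corresponding elements:

$$ d[A(i), A(j)] = \max_{k = 0, \ldots, m-1} \left| x(i+k) - x(j+k) \right| \qquad (3) $$

3. For a given A(i), the number of j (j = 1, ..., N−m+1) such that d[A(i), A(j)] ≤ r is counted as N^m(i). Then, for i = 1, ..., N−m+1,

$$ C^{m}_{r}(i) = \frac{N^{m}(i)}{N-m+1} \qquad (4) $$

4. Compute the natural logarithm of each C^m_r(i) and average it over i:

$$ \Phi^{m}(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C^{m}_{r}(i) \qquad (5) $$

5. Repeat step (1) to step (4) with the dimension increased from m to m+1, and find C^{m+1}_r(i) and Φ^{m+1}(r).

6. Finally, ApEn is computed as

$$ \mathrm{ApEn}(m, r, N) = \Phi^{m}(r) - \Phi^{m+1}(r) \qquad (6) $$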
To compute the ApEn value of a signal of length N, two parameters must be specified: the length of the compared run m and the tolerance window r. We have taken m = 2 and 3; r is usually chosen between 0.1 and 0.25 times the standard deviation of the data. In this model, the ApEn values of the approximation (A1-A5) and detail (D1-D5) coefficients are computed with m = 2 and 3 and r = 0.2 times the standard deviation of the data.
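The ApEn computation with the parameters m and r described in this section can be sketched in a few lines of Python. This is a direct, quadratic-time illustration rather than the authors' implementation; it counts self-matches (as in the usual ApEn definition), uses the population standard deviation to scale r (an assumption, since the text does not say which estimator was used), and the two test series are hypothetical.

```python
import math
import random


def approximate_entropy(x, m, r):
    """ApEn(m, r, N) = Phi^m(r) - Phi^(m+1)(r), with self-matches counted."""
    n = len(x)

    def phi(dim):
        count = n - dim + 1
        # Embedding vectors A(i) = [x(i), ..., x(i + dim - 1)].
        vectors = [x[i:i + dim] for i in range(count)]
        total = 0.0
        for a in vectors:
            # Number of vectors within maximum-norm distance r of A(i).
            matches = sum(
                1 for b in vectors
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)


def std(x):
    """Population standard deviation, used here to scale the tolerance r."""
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))


random.seed(0)
regular = [0.0, 1.0] * 100                             # strictly periodic
noisy = [random.gauss(0.0, 1.0) for _ in range(200)]   # irregular

for name, series in (("regular", regular), ("noisy", noisy)):
    r = 0.2 * std(series)
    print(name, round(approximate_entropy(series, 2, r), 4))
```

The periodic series yields an ApEn close to zero while the noisy series yields a clearly larger value, which is the property the classifier relies on when ordered seizure activity is compared with more irregular background EEG.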

Support vector machine (SVM)
The support vector machine (SVM) was first introduced in 1995 and has been used in several EEG signal classification problems [17,20]. SVMs belong to the family of kernel-based classifiers and are extremely powerful classifiers. Linear as well as non-linear classification can be performed with an SVM by using different kernel functions [29]. The approach of SVMs is to implicitly map the classification data into a higher-dimensional feature space where a hyperplane separating the classes may exist. The implicit mapping is achieved through different kernel functions. In the case of a linear SVM classifying linearly separable data, let the training data be {x_i, y_i} for i = 1, ..., M with y_i ∈ {−1, 1}. The decision function is then given by [18]:

$$ D(x) = w^{T} g(x) + b \qquad (7) $$

where g(x) is a mapping function that maps x into the l-dimensional space, b is a scalar and w is an l-dimensional vector. To separate the data linearly, the decision function must satisfy the following condition:

$$ y_i \left( w^{T} g(x_i) + b \right) \ge 1, \quad i = 1, \ldots, M \qquad (8) $$

There is an infinite number of decision functions that satisfy Eq. (8) for a linearly separable feature space. Among them, the one with the largest margin between the two classes is selected. The margin is given by |D(x)| / ‖w‖. Let the margin be ρ; then the following condition must be satisfied:

$$ \frac{y_i D(x_i)}{\lVert w \rVert} \ge \rho, \quad i = 1, \ldots, M \qquad (9) $$

The product of ρ and ‖w‖ is fixed: ρ‖w‖ = 1. To obtain the optimal separating hyperplane with the maximum margin, the w with the minimum ‖w‖ that satisfies Eq. (9) should be found. This gives the following optimization problem: minimize

$$ \frac{1}{2} w^{T} w \qquad (10) $$

subject to the constraints:

$$ y_i \left( w^{T} g(x_i) + b \right) \ge 1, \quad i = 1, \ldots, M \qquad (11) $$

To maximize the margin and minimize the training error at the same time, the optimal separating hyperplane is determined by minimizing

$$ \frac{1}{2} w^{T} w + C \sum_{i=1}^{M} \xi_i \qquad (12) $$

subject to the constraints:

$$ y_i \left( w^{T} g(x_i) + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, M \qquad (13) $$

where ξ_i are the slack variables. The tradeoff between the maximum margin and the minimum classification error is determined by the parameter C.

Least squares support vector machines (LS-SVMs)
The LS-SVM is trained by minimizing

$$ \frac{1}{2} w^{T} w + \frac{C}{2} \sum_{i=1}^{M} \xi_i^{2} \qquad (14) $$

subject to the equality constraints:

$$ y_i \left( w^{T} g(x_i) + b \right) = 1 - \xi_i, \quad i = 1, \ldots, M \qquad (15) $$

The conventional SVM uses inequality constraints, whereas the LS-SVM uses equality constraints.
The equality constraints reduce the complexity of the problem: the optimal solution is obtained by solving a set of linear equations rather than a quadratic programming problem. To derive the dual problem of Eqs. (14) and (15), Lagrange multipliers are introduced:

$$ Q(w, b, \alpha, \xi) = \frac{1}{2} w^{T} w + \frac{C}{2} \sum_{i=1}^{M} \xi_i^{2} - \sum_{i=1}^{M} \alpha_i \left\{ y_i \left( w^{T} g(x_i) + b \right) - 1 + \xi_i \right\} $$

where α = (α_1, ..., α_M)^T are the Lagrange multipliers. Differentiating the above equation with respect to w, ξ_i, b and α_i and equating the resulting expressions to zero gives the conditions for optimality [30].
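How the optimality conditions reduce training to one linear solve can be sketched as follows. The bordered system used here is the standard LS-SVM classifier form from the literature, stated as an assumption because the text does not print it; the parameter names `gam` and `sig2` mirror the toolbox parameters mentioned later in the paper, the RBF form exp(−‖u−v‖²/sig2) is assumed, and the solver and toy data are hypothetical stand-ins for the MATLAB toolbox and the EEG features.

```python
import math


def rbf_kernel(u, v, sig2=1.0):
    """Gaussian RBF kernel; sig2 is the squared bandwidth (assumed form)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / sig2)


def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    z = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * z[k] for k in range(row + 1, n))
        z[row] = (aug[row][n] - tail) / aug[row][row]
    return z


def train_lssvm(xs, ys, gam=10.0, kernel=rbf_kernel):
    """Return (alpha, b) by solving the LS-SVM optimality conditions."""
    m = len(xs)
    # Bordered system: [[0, y^T], [y, Omega + I/gam]] [b; alpha] = [0; 1],
    # with Omega[i][j] = y_i * y_j * k(x_i, x_j).
    system = [[0.0] + [float(y) for y in ys]]
    for i in range(m):
        row = [float(ys[i])]
        for j in range(m):
            omega = ys[i] * ys[j] * kernel(xs[i], xs[j])
            row.append(omega + (1.0 / gam if i == j else 0.0))
        system.append(row)
    solution = solve(system, [0.0] + [1.0] * m)
    return solution[1:], solution[0]


def predict(x, xs, ys, alpha, b, kernel=rbf_kernel):
    """Sign of D(x) = sum_i alpha_i * y_i * k(x, x_i) + b."""
    score = sum(a * y * kernel(x, xi) for a, y, xi in zip(alpha, ys, xs)) + b
    return 1 if score >= 0 else -1


# Toy two-class data (hypothetical feature vectors, not EEG features).
xs = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0)]
ys = [-1, -1, 1, 1]
alpha, b = train_lssvm(xs, ys)
print([predict(x, xs, ys, alpha, b) for x in [(0.2, 0.5), (3.1, 3.4)]])
```

Because every training point receives a multiplier, the system has one unknown per sample plus the bias; this is the price paid for replacing the quadratic program with a single linear solve.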

Kernel Function
Classically, SVMs were designed to classify data in a linear space, and they do not perform well when the classes are not linearly separable. Kernel approaches were developed to overcome this limitation. The following kernels are among the most commonly used [31]:

Linear kernel: $k(x, x') = x^{T} x'$
RBF kernel: $k(x, x') = \exp\left( -\lVert x - x' \rVert^{2} / \sigma^{2} \right)$

To train an SVM classifier, the user has to determine a suitable kernel function, optimum hyperparameters, and a proper regularization parameter. In this paper we have used two kernels: 1) the linear kernel and 2) the RBF kernel. The optimum hyperparameters and the regularization parameter are selected with the cross-validation technique.
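As a small illustration of the two kernels used in this paper, the sketch below evaluates each of them on a few points and assembles the kernel (Gram) matrix that a kernel classifier works with. The RBF form exp(−‖u−v‖²/sig2) and the parameter name `sig2` are assumptions, and the points are hypothetical.

```python
import math


def linear_kernel(u, v):
    """k(u, v) = u . v"""
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(u, v, sig2=1.0):
    """k(u, v) = exp(-||u - v||^2 / sig2); sig2 is an assumed bandwidth."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / sig2)


def gram_matrix(points, kernel):
    """Kernel (Gram) matrix K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


points = [(0.0, 1.0), (1.0, 1.0), (2.0, 0.0)]  # hypothetical feature vectors
print(gram_matrix(points, linear_kernel))
print(gram_matrix(points, rbf_kernel))
```

The linear kernel grows with the magnitude of the inputs, whereas the RBF kernel depends only on the distance between two points and always equals 1 on the diagonal, which is why its bandwidth has to be tuned in addition to the regularization parameter.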

Cross-Validation
Cross-validation is a validation technique used to determine the quality of a classification model. It partitions the sample data into different subsets such that the analysis is initially performed on one subset, while the remaining subset(s) are kept for validating the result of the initial analysis. The subset used for the initial analysis is called the training set, while the other subsets are called testing or validation sets [32]. In K-fold cross-validation, the data is partitioned into K sets of roughly equal size, and each set is used once as the test set while the remaining sets are used as training sets. For each k = 1, 2, ..., K, the model is fitted on the other K−1 parts and evaluated on the k-th part. The procedure is thus repeated K times, using each of the K sets exactly once as validation data. The average of the K results obtained from the folds produces a single estimate. In this paper, we have used the 10-fold scheme.
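The K-fold partitioning described above can be sketched as an index generator. This is a generic illustration, not the toolbox routine used by the authors; the function names, the shuffling seed and the placeholder scoring function are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold f takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        test_idx = folds[f]
        train_idx = [i for g in range(k) if g != f for i in folds[g]]
        yield train_idx, test_idx


def cross_validate(n_samples, score_fold, k=10):
    """Average a user-supplied score over the k folds."""
    scores = [score_fold(tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


# Usage with a placeholder scorer (hypothetical): the held-out fraction per fold.
print(cross_validate(200, lambda train, test: len(test) / 200))
```

In practice `score_fold` would train the classifier on `train` and return its accuracy on `test`; the averaged score is then compared across candidate parameter values.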

Performance evaluation parameters
The performance of the LS-SVM is estimated using three parameters, namely sensitivity (SE), specificity (SP) and overall accuracy (OA), defined as:

$$ \mathrm{SE}\,(\%) = \frac{TN_{CP}}{TN_{AP}} \times 100 \qquad (16) $$

where TN_CP denotes the count of correctly detected positive patterns and TN_AP denotes the count of actual positive patterns. A positive pattern represents a detected seizure.

$$ \mathrm{SP}\,(\%) = \frac{TN_{CN}}{TN_{AN}} \times 100 \qquad (17) $$

where TN_CN denotes the count of correctly detected negative patterns and TN_AN denotes the count of actual negative patterns. A negative pattern represents a detected non-seizure.

$$ \mathrm{OA}\,(\%) = \frac{TN_{CDP}}{TN_{APP}} \times 100 \qquad (18) $$

where TN_CDP denotes the count of correctly detected patterns and TN_APP denotes the count of applied patterns [1].
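The three performance parameters can be computed from true and predicted labels as in the following sketch. The label convention (+1 for seizure, −1 for non-seizure) and the example labels are assumptions made for illustration only.

```python
def evaluate(true_labels, predicted_labels, positive=1):
    """Return (SE, SP, OA) in percent; `positive` marks the seizure class."""
    pairs = list(zip(true_labels, predicted_labels))
    actual_pos = sum(1 for t, _ in pairs if t == positive)
    actual_neg = len(pairs) - actual_pos
    correct_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    correct_neg = sum(1 for t, p in pairs if t != positive and p != positive)
    se = 100.0 * correct_pos / actual_pos                  # sensitivity
    sp = 100.0 * correct_neg / actual_neg                  # specificity
    oa = 100.0 * (correct_pos + correct_neg) / len(pairs)  # overall accuracy
    return se, sp, oa


# Hypothetical labels: 4 seizure (+1) and 6 non-seizure (-1) patterns.
truth = [1, 1, 1, 1, -1, -1, -1, -1, -1, -1]
guess = [1, 1, 1, -1, -1, -1, -1, -1, -1, 1]
se, sp, oa = evaluate(truth, guess)
print(f"SE={se:.2f}%  SP={sp:.2f}%  OA={oa:.2f}%")  # SE=75.00%  SP=83.33%  OA=80.00%
```

Here three of the four seizure patterns and five of the six non-seizure patterns are detected correctly, so eight of the ten applied patterns are classified correctly.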

Design of Experiment
The data is processed and the ApEn values of the detail and approximation coefficients are computed. The experiments are then organized into cases, each using a different combination of the processed data sets, and the results of the different cases are compared.

Result and Discussion
The EEG data set is decomposed into different sub-bands by applying DWT with the db4 wavelet and five levels of decomposition. The ApEn values of the approximation and detail coefficients are computed for the entire data set A-E, with each set consisting of 100 epochs, using the parameters m = 2 and 3 and r = 0.2 times the standard deviation of the data.

Figure 7 shows the ApEn values, for embedding dimension m = 2, of the detail coefficient D1 from data sets A and B, which were recorded from the scalp surface of normal subjects in a relaxed, awake state with eyes open (set A) and eyes closed (set B), against those of data set E, which was recorded from epileptic subjects through intracranial electrodes. The ApEn values for data sets A and B are higher than those of data set E, which means that data set E is more ordered and periodic than sets A and B. Similarly, Figure 8 shows the ApEn values of the detail coefficient D1 from data sets C and D against data set E, recorded from intracranial electrodes, for embedding dimension m = 2. The ApEn values of data sets C and D are also higher than those of data set E, which indicates that data sets A, B, C and D are more complex than data set E. Figure 7 and Figure 8 show that the complexity of data set A, recorded from normal subjects, and of data set C, recorded from the hemisphere opposite to the epileptogenic zone, is almost the same. The ApEn values for embedding dimension m = 3 are shown in Figure 9 and Figure 10 and depict similar attributes. The ApEn values are lower in this case than for m = 2, and the variation of the curves is also reduced. Data sets D and E overlap in both cases, as the data is acquired within the epileptogenic zone.

In this paper, the SVM is implemented using MATLAB R2012b and the LS-SVM toolbox [33]. The input feature vector of ApEn values is divided into two parts: a training data set (60%) and a testing data set (40%). The training data set is used to train the SVM, while the testing data set is used to verify the accuracy of the trained SVM. The holdout method of cross-validation was used to divide the data into the training and testing parts. The SVM is initialized with the initlssvm function and trained with the tunelssvm and trainlssvm functions. The regularization and kernel parameters (gam, sig2) of the LS-SVM were tuned with the tunelssvm function [34]. The SVM algorithm is used with linear and Gaussian radial basis function (RBF) kernels. The linear kernel requires the gamma parameter for training the SVM. The RBF kernel requires the gamma as well as the sigma parameter, which have to be selected based on the training data. In this paper we have set gamma ∈ [0, 1] for the linear kernel and gamma ∈ [0, 10] for the RBF kernel. The sigma parameter for the RBF kernel is tuned in [0.7, 9] using the tunelssvm function. Each row of the input data matrix is one observation and each column is one feature.
The feature matrix of each data set has 100 rows (epochs) and 6 columns (D1-D5 and A5). Cases 1-4 consist of 200 observations each, case 5 consists of 400 observations, and case 6 consists of 500 observations. The observations are normalized by scaling between 0 and 1, and then 60% of the data is used for training and 40% for testing the SVM.
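The normalization and the 60/40 holdout split can be sketched as follows. The sketch assumes column-wise (per-feature) scaling and a single random shuffle before the split; the text does not state either detail, and the random feature matrix merely stands in for the 200 x 6 ApEn matrix of one case.

```python
import random


def min_max_scale(rows):
    """Scale every column (feature) of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]  # guard constant columns
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]


def holdout_split(rows, labels, train_fraction=0.6, seed=0):
    """Shuffle once, then split into (train_x, train_y) and (test_x, test_y)."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = round(train_fraction * len(rows))
    train_idx, test_idx = order[:cut], order[cut:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


# Hypothetical stand-in for one case: 200 epochs x 6 ApEn features.
rng = random.Random(1)
features = [[rng.uniform(0.0, 2.0) for _ in range(6)] for _ in range(200)]
labels = [1] * 100 + [-1] * 100
(train_x, train_y), (test_x, test_y) = holdout_split(min_max_scale(features), labels)
print(len(train_x), len(test_x))  # 120 80
```

With 200 observations this yields 120 training and 80 testing observations; a stratified split that keeps the class proportions equal in both parts would be a natural refinement.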
• The case in which D1-D5 and A5 are used individually as input to the LS-SVM: For embedding dimension m = 2, the highest accuracy, 98.75%, is found in cases 1 and 3 for the detail coefficient D1 with the linear kernel; the RBF kernel gives the same 98.75% in cases 1 and 3. The lowest accuracy is 55% with the linear kernel and 59% with the RBF kernel, both in case 1 for the detail coefficient D5. The results suggest that there are only marginal differences between the RBF and linear kernels, and the overall average efficiency remains the same for both in these cases, as shown in Table 1 and Table 2. For embedding dimension m = 3, the highest accuracy with the linear kernel is 96.25%, found in cases 1, 2 and 3 for the detail coefficient D1, while with the RBF kernel it is 98.75% in cases 1 and 3 for the detail coefficient D1. Here the RBF kernel shows better results than the linear kernel; the overall result is summarized in Table 3 and Table 4.
• The case in which D1-D5 and A5 are combined as input to the LS-SVM: For embedding dimension m = 2 with the linear kernel, the precision of the proposed system reaches a maximum of 100% for cases 1, 3, 4 and 5 and a minimum of 98.12% for case 6. Similarly, for m = 2 with the RBF kernel, the precision reaches a maximum of 100% for cases 1-5 and a minimum of 99.50% for case 6. For embedding dimension m = 3 with the linear kernel, the precision reaches a maximum of 100% for cases 1, 3 and 4 and a minimum of 97.0% for case 6. Similarly, for m = 3 with the RBF kernel, the precision reaches a maximum of 100% for cases 1 and 3-5 and a minimum of 99.5% for case 6. The results summarized in Table 6 and Table 7 clearly show that the RBF kernel based automatic epileptic seizure detection system gives better precision than the linear kernel based system. Table 5 presents the comparison of results between existing methods and the proposed method using the same data set.

Conclusion and Future Scope
The least squares version of support vector machine classifiers is discussed in this paper. With the use of equality constraints instead of inequality constraints, the quadratic programming problem is reduced to solving a set of linear equations. In this paper we have modeled the LS-SVM on the EEG data set with different combinations of input features as well as with different parameters of the ApEn. The results suggest that the combination of detail and approximation coefficients, i.e. D1-D5 and A5, as input to the classifier produces better results than using the single feature D1 as the classification input. The kernel function also plays a significant role in classifying the EEG data set with the LS-SVM, and the RBF kernel produces better results than the linear kernel in this model. The embedding dimension m used for calculating the ApEn is also significant, and this model suggests that the classification system performs better at m = 2 than at m = 3. The proposed approach can be deployed as a quantitative measure for monitoring EEG signals associated with epilepsy. As an extension of the proposed method, it would be challenging to scrutinize its efficacy for other neurological disorders that are analyzed through brain signals, such as Parkinson's disease [35]. Furthermore, it would be interesting to analyze the learning effectiveness of this model on other databases. In this study, the