CNN-LSTM Based Model for ECG Arrhythmias and Myocardial Infarction Classification

ECG analysis is commonly used by medical practitioners and cardiologists to monitor cardiac health. Building a high-performance automatic ECG classification system is challenging because the various waveforms in the signal are difficult to detect and group, especially when electrocardiogram (ECG) signals are analyzed manually. In this paper, an accurate ECG classification and monitoring system is proposed that combines 1D Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM). Features learned by the CNN model are fed to the LSTM model, so no handcrafted features are required for ECG classification. The CNN-LSTM model demonstrates better performance than several state-of-the-art methods cited in the results section. The proposed models are evaluated on the MIT-BIH Arrhythmia and PTB Diagnostic datasets. Based on the obtained results, the CNN-LSTM method achieves accuracies of 98.1% and 98.66% on Myocardial Infarction (MI) and arrhythmia classification, respectively.


Introduction
The heart is one of the most crucial organs in the human body, and monitoring it has become a necessary part of assessing human health. The electrical activity of the heart is recorded as the electrocardiogram (ECG), and abnormalities in the ECG may indicate heart conditions such as persistent ventricular tachycardia, low blood pressure, and rapid atrial fibrillation. These conditions are life-threatening and require urgent treatment [1]. The most significant information about the state of the heart can be obtained from ECG analysis, which can be performed manually or automatically [2]. Because the ECG signal exhibits various morphologies, manual diagnosis is difficult. Therefore, automatic ECG diagnosis systems have attracted considerable interest. Feature extraction and classification are both critical to the success of any automatic ECG classification system. ECG analysis and classification are applied to many conditions, such as ischemic heart disease, arrhythmia, and myocardial infarction [3].
Feature extraction is a vital area in ECG classification, and many approaches have been developed for this purpose because features have a significant impact on any machine learning system. Numerous approaches exist for extracting features from ECG signals, such as the Discrete Wavelet Transform (DWT), the Wavelet Transform (WT), and Mel Frequency Cepstral Coefficients (MFCC). The DWT has been widely adopted as an effective feature in ECG classification. For instance, Desai et al. [4] proposed a system for computer-assisted detection of five classes of ECG arrhythmia beats, adopting the DWT as a feature to train a Support Vector Machine (SVM). The authors in [5] developed a comprehensive model based on random forest techniques and the discrete wavelet transform for arrhythmia classification. The authors in [6] used the DWT to capture features from the ECG signal based on ST-segment elevation and inverted T wave logic.
MFCC has been used as a handcrafted feature extracted directly from the ECG signal and then fed to an artificial neural network (ANN) [7]. The authors in [8] developed a hybrid feature that merges MFCC and DWT, which was then classified by k-Nearest Neighbor (kNN). In addition, the WT was used to extract features from the ECG signal for classifying four beat types: healthy (N), paced beats (P), right bundle branch block (RBBB), and left bundle branch block (LBBB). ANN and SVM were then used to classify the extracted features [9].
ASTESJ ISSN: 2415-6698
Handcrafted features are not always the best candidates for ECG applications, as designing them is time-consuming and tedious [10]. Therefore, featureless models such as CNNs have attracted interest for ECG classification. CNNs have been applied in one-dimensional and two-dimensional forms. For instance, Zubair et al. [11] fed raw data to a 1D-CNN-based model for classifying ECG signals into the classes suggested by the Association for the Advancement of Medical Instrumentation (AAMI). Wang [12] developed a model based on a 1D CNN and a modified Elman neural network (MENN), consisting of an 11-layer neural network, which was also fed the untransformed ECG signal. Other researchers based their CNN models for ECG analysis on two-dimensional forms. For instance, Zheng et al. [13] converted the one-dimensional ECG signal into two-dimensional images and then fed them to the CNN model. Yildirim et al. [14] developed a model for detecting diabetic subjects by adopting a pre-trained 2D-CNN model with frequency spectrum images obtained from heartbeat signals.
In this paper, a hybrid model composed of a CNN and an LSTM is developed. The model requires no handcrafted features, as the CNN part is responsible for obtaining the features. The learned features are given to the LSTM model, which learns deeper patterns in them and classifies ECG signals as healthy or abnormal, where the healthy ECG signal is chosen as the absolute normal ECG pattern template. According to our experimental results, the LSTM improves the classification rate, as shown in the results section. The rest of the paper is organized as follows: Section 2 gives a concise background of the proposed model, and Section 3 describes the methodology. Results and discussion are presented in Section 4. Finally, the conclusion is given in Section 5.

Background
The ECG signal is a representation of the electrical activity of the heart. Automatic monitoring of the ECG signal has been an area of focus because it relates to human health. Feature extraction and classification are the primary processes in any automatic monitoring system. In this paper, the suggested model is a fusion model consisting of a CNN and an LSTM, as shown in Figure 1. A brief background of CNN and LSTM is given in the sections below, as the CNN is used to extract the learned features and the LSTM is used to classify them.

Convolutional Neural Networks
CNNs, also known as feature learners, have excellent potential to extract useful features automatically from raw input data. A CNN normally comprises two parts, each with a different function. The first part is the feature extractor, which extracts features automatically. The second part is the classifier, a fully connected multi-layer perceptron (MLP) that performs classification based on the features learned by the first part. The feature extraction part includes convolution layers and pooling layers. Each convolutional layer extracts feature maps from the previous layer. A convolutional layer contains several convolution kernels (filters); each filter is convolved with its input, a bias is added, and the result is passed through an activation function to generate a feature map for the next layer [15]. A common sub-layer in the feature extraction part is the pooling layer, of which there are several types, namely max, min, and average pooling. A pooling layer reduces the resolution of the feature maps. The max-pooling operation, which takes the maximum value in a set of neighboring inputs, is used in the proposed model [11]. The two parts differ in complexity: the feature extraction part, which forms the primary layers, performs more computation, including feature extraction and feature selection, than the fully connected layers [16].
The CNN structure shares many similarities with the ANN, having input, hidden, and output layers. However, the CNN is an enhanced version of the ANN that is both translation and shift invariant. Typically, CNNs comprise various types of layers, such as input, convolution, max pooling, average pooling, dropout, and softmax layers, each of which plays a specific role in the model [17].
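The convolution, activation, and max-pooling steps described above can be illustrated with a minimal pure-Python sketch. The signal values, the three-tap kernel, the ReLU activation, and the function names `conv1d` and `max_pool1d` are assumptions chosen for illustration; they are not taken from the proposed model.

```python
from typing import List


def conv1d(signal: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    """Slide the kernel over the signal (no padding), add the bias, apply ReLU.

    As in most deep learning libraries, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    k = len(kernel)
    feature_map = []
    for start in range(len(signal) - k + 1):
        window = signal[start:start + k]
        activation = sum(w * x for w, x in zip(kernel, window)) + bias
        feature_map.append(max(0.0, activation))  # ReLU is an assumed activation
    return feature_map


def max_pool1d(feature_map: List[float], pool_size: int = 2) -> List[float]:
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(feature_map[i:i + pool_size])
        for i in range(0, len(feature_map) - pool_size + 1, pool_size)
    ]


# Toy signal with one sharp peak (illustrative values only).
signal = [0.0, 0.1, 0.9, 0.2, 0.0, -0.1, 0.0, 0.1]
kernel = [-1.0, 2.0, -1.0]  # hypothetical filter that responds to local peaks

feature_map = conv1d(signal, kernel)           # 8 samples -> 6 values
pooled = max_pool1d(feature_map, pool_size=2)  # 6 values -> 3 values
print(feature_map)
print(pooled)
```

The pooled output keeps the strong response around the peak while halving the resolution of the feature map, which is the role the pooling layer plays in the feature extraction part.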

Long Short-Term Memory
The recurrent neural network (RNN) is a network architecture specifically designed to deal with sequential problems, and it is commonly used in sequence classification. A well-known improvement of the RNN, called Long Short-Term Memory (LSTM), was introduced by Hochreiter and Schmidhuber in 1997 [18]. Many applications of LSTM have been published recently [19]-[23]. The LSTM unit comprises a cell, a forget gate, an output gate, and an input gate [19]. The cell is in charge of remembering values over time intervals. The gates manage the flow of information into and out of the unit. In the memory block structure, a simple one-layer neural network controls the forget gate. The functionality of this gate is formulated in Eq. (1) [24].
where x_t is the input sequence; c_{t-1} is the previous LSTM block memory; h_{t-1} is the previous block output; b is the bias vector; δ is the logistic sigmoid function; and W denotes separate weight vectors for every input. The input gate is the unit in which new memory is formed from the previous memory block and a simple neural network with a tanh activation function. These operations are calculated by Eqs. (2) and (3) [25].
The output gate produces the output of the current LSTM block and can be formulated using Eqs. (4) and (5) [19].
where b_i and b_i are the outputs of the previous memory block. These units are linked to each other, as illustrated in Figure 2, which permits information to cycle between adjacent time steps and constructs an inner feedback state that allows the network to capture the temporal features in the given data [26].
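To make the gate mechanics concrete, the following is a minimal pure-Python sketch of a single LSTM time step. It assumes a peephole-style formulation in which every gate sees the concatenation of the current input, the previous block output, and the block memory, which is consistent with the quantities named in the text but is not necessarily the authors' exact set of equations. The function names and the parameter keys (`W_forget`, `b_forget`, and so on) are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights: Matrix, inputs: Vector, bias: Vector) -> Vector:
    """Compute weights @ inputs + bias using plain Python lists."""
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One LSTM time step (assumed peephole-style form); returns (h_t, c_t)."""
    z = x_t + h_prev + c_prev  # list concatenation [x_t, h_{t-1}, c_{t-1}]
    # Forget gate: how much of the previous memory is kept.
    f_t = [sigmoid(v) for v in affine(params["W_forget"], z, params["b_forget"])]
    # Input gate and tanh candidate: how much new memory is written.
    i_t = [sigmoid(v) for v in affine(params["W_input"], z, params["b_input"])]
    g_t = [math.tanh(v) for v in affine(params["W_cell"], z, params["b_cell"])]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Output gate: sees the updated memory c_t.
    z_out = x_t + h_prev + c_t
    o_t = [sigmoid(v) for v in affine(params["W_output"], z_out, params["b_output"])]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Tiny example: 1 input feature, 2 hidden units, all-zero parameters,
# so every gate evaluates to sigmoid(0) = 0.5 and the candidate to tanh(0) = 0.
n_in, hidden = 1, 2
zero_w = [[0.0] * (n_in + 2 * hidden) for _ in range(hidden)]
zero_b = [0.0] * hidden
params = {
    "W_forget": zero_w, "b_forget": zero_b,
    "W_input": zero_w, "b_input": zero_b,
    "W_cell": zero_w, "b_cell": zero_b,
    "W_output": zero_w, "b_output": zero_b,
}
h_t, c_t = lstm_step([1.0], [0.0, 0.0], [0.4, -0.2], params)
print(h_t, c_t)
```

With all-zero parameters the forget gate halves the previous memory, which makes the data flow through the cell easy to check by hand.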

Methodology
This section is divided into data description, preprocessing, and experimental design.

MIT-BIH Arrhythmia Database
One of the databases examined in this study is the MIT-BIH Arrhythmia Database [27], a collection of annotated ECG recordings compiled by the Arrhythmia Laboratory of Boston's Beth Israel Hospital. It comprises 48 half-hour excerpts of two-channel ambulatory ECG recordings obtained from 47 subjects and recorded in the BIH Arrhythmia Laboratory between 1975 and 1979. The recordings are sampled at 360 Hz, which means each record consists of approximately 30 × 60 × 360 = 648,000 samples per channel. Each record contains two leads: one is MLII, and the other is one of V1, V2, V3, V4, V5, or V6. Two or more cardiologists independently annotated each record, for a total of approximately 110,000 annotations included with the database. Each annotation is placed at the R peak of a single beat, so the beat detection problem is already solved for this dataset. The MIT-BIH Arrhythmia Database is split into two classes: healthy and abnormal.
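Because each annotation marks an R peak, individual beats can be cut out as fixed windows around the annotated positions. The following is a minimal sketch on a synthetic signal, assuming the 130-sample window used in this study (65 samples before the R peak, the R-peak sample, and 64 samples after it). The helper `segment_beats`, the constants, and the handling of beats near the record edges are illustrative assumptions, not the authors' code.

```python
from typing import List

SAMPLES_BEFORE = 65  # samples kept before the R peak
SAMPLES_AFTER = 64   # samples kept after the R peak


def segment_beats(signal: List[float], r_peaks: List[int]) -> List[List[float]]:
    """Cut one fixed-length window around every annotated R-peak index."""
    beats = []
    for r in r_peaks:
        start = r - SAMPLES_BEFORE
        stop = r + SAMPLES_AFTER + 1  # +1 so the window includes the R peak itself
        if start < 0 or stop > len(signal):
            continue  # assumption: beats too close to the record edges are skipped
        beats.append(signal[start:stop])
    return beats


# Synthetic two-second record at 360 Hz with unit spikes standing in for R peaks.
sampling_rate_hz = 360
signal = [0.0] * (2 * sampling_rate_hz)
r_peaks = [30, 200, 500, 700]
for r in r_peaks:
    signal[r] = 1.0

beats = segment_beats(signal, r_peaks)
print(len(beats), len(beats[0]))
```

In this toy record the first and last spikes lie too close to the edges, so only the two central beats are returned, each with the R peak at index 65 of its window.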

The Physionet PTB Diagnostic ECG database
The second database used in this paper was recorded by the Department of Cardiology at Benjamin Franklin University in Berlin, Germany. The dataset includes 549 records from 290 subjects (aged 17 to 87, mean 57.2; 81 women, mean age 61.6, and 209 men, mean age 55.5). Each subject is represented by one to five records. The dataset was split into two sections, namely healthy signals and signals from various cardiac diseases such as myocardial infarction, bundle branch block, arrhythmia, cardiomyopathy/heart failure, myocardial hypertrophy, etc. Each record includes 15 simultaneously measured signals: the conventional 12 leads (i, ii, iii, avr, avl, avf, V1, V2, V3, V4, V5, V6) together with the 3 Frank lead ECGs (vx, vy, vz) [8].

Preprocessing
In the MIT-BIH database, the beats are labeled based on the R-peak position. The authors segment each record into separate beats by taking samples from both sides of the R wave. Each heartbeat consists of 130 sample points (65 samples before the R peak, the R-peak sample itself, and 64 samples after it), which is the shortest length used for ECG classification so far. As shown in Figure 3, the balanced dataset was created; it contains 86,456 healthy beats and 11,230 abnormal beats.

Experimental design
In the current study, a cross-learning method based on deep learning is proposed for automatically classifying ECG signals as healthy or abnormal. The proposed model is developed by integrating a CNN and an LSTM and consists of 19 layers, as illustrated in Figure 4. The CNN section is responsible for capturing local features from the raw signal and eliminating noise from the ECG signals. The LSTM section handles the time-series characteristics of the ECG and is able to predict future QRS complexes from previous ones. The 1D CNN structure contains 19 layers, including four convolution layers, two dropout layers with a drop rate of 0.5, and two fully connected layers.
Some parameters of the model, such as the number of convolution layers, filters, and epochs, are tuned by cross-validation. The CNN layers are described in Table 1. The spatial and local feature map is extracted from the thirteenth layer of the CNN model, which is a convolutional layer with ten filters. The filters act to identify different features present in an ECG signal, such as edges and vertical and horizontal lines. The features are then fed to the LSTM model. For the MIT-BIH Arrhythmia Dataset, the length of the learned features is ten times that of the original signal, so the feature length equals 1300. The learned feature is then divided into ten separate beat-length segments, which are merged into a cell that serves as input to the LSTM model. For the PTB Diagnostic ECG Dataset, both the ECG signal and the learned features have a length of 187.
The implemented LSTM structure contains two LSTM layers, dropout layers, and two fully connected layers, as shown in Table 2. The output of the proposed model is one of two classes: healthy and abnormal (arrhythmia) for the MIT-BIH Arrhythmia Database, and healthy and myocardial infarction for the PTB Diagnostic ECG Database.
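The division of the 1300-value learned feature vector into ten segments before it enters the LSTM can be sketched as follows. The sketch assumes that the vector is laid out as ten contiguous blocks of 130 values, one per filter; the actual memory layout is not stated in the text, and `to_lstm_sequence` is a hypothetical helper name.

```python
from typing import List


def to_lstm_sequence(features: List[float], n_steps: int = 10) -> List[List[float]]:
    """Split a flat feature vector into n_steps equal, contiguous segments."""
    if len(features) % n_steps != 0:
        raise ValueError("feature length must be divisible by n_steps")
    step_len = len(features) // n_steps
    return [features[i * step_len:(i + 1) * step_len] for i in range(n_steps)]


# Placeholder features: 10 filters x 130 samples = 1300 values.
learned_features = [float(i) for i in range(1300)]
sequence = to_lstm_sequence(learned_features, n_steps=10)
print(len(sequence), len(sequence[0]))
```

Each of the ten segments then acts as one step of the sequence presented to the LSTM, with 130 values per step.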

MIT-BIH arrhythmia database
Machine learning has been adopted for many applications, including arrhythmia classification. One of the well-known datasets is the MIT-BIH Arrhythmia dataset, which has been adopted by a growing number of researchers in ECG research. Table 3 shows the results of the proposed methods alongside state-of-the-art studies that have developed models for detecting ECG arrhythmias. Following these studies, ten-fold cross-validation is used to evaluate the proposed model. Although the proposed CNN-LSTM model is fed the shortest ECG segments used so far, it achieves higher classification accuracy than the other related studies. However, the result of the 1D CNN alone does not exceed the state of the art. Moreover, the performance of our proposed model exceeds that of the CNN-LSTM model reported in [28], although their model (11 layers) is simpler than ours (19 layers).
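Ten-fold cross-validation partitions the data into ten folds and uses each fold once as the test set while the remaining nine are used for training. A minimal sketch of the index bookkeeping follows. `k_fold_indices` is a hypothetical helper, the folds are contiguous for simplicity (in practice the beats would typically be shuffled first), and the sample count simply adds the healthy and abnormal beat counts reported for the MIT-BIH data.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs covering 0..n_samples-1."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        folds.append((train, test))
        start = stop
    return folds


n_beats = 86456 + 11230  # healthy + abnormal beats
folds = k_fold_indices(n_beats, k=10)
for fold_id, (train, test) in enumerate(folds):
    print(fold_id, len(train), len(test))
```

The reported score for a model would then be the average of the per-fold test metrics.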

MI Classification
Another well-known dataset, the PTB Diagnostic dataset, is also used to evaluate the proposed methods. MI classification is one application of the PTB Diagnostic dataset, and it can be treated as a two-class problem with infarcted and non-infarcted classes. The dataset contains 14,552 samples, divided into 4,046 infarcted and 10,506 non-infarcted samples. Following related research in the literature, ten-fold cross-validation is adopted to evaluate the proposed method. The experimental results in Table 4 show a significant improvement in classification accuracy. Moreover, precision and recall are improved compared with existing approaches in the literature.
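Accuracy, precision, and recall for a two-class problem follow directly from the confusion-matrix counts. The sketch below treats the infarcted class as the positive class (label 1), which is an assumption, and uses made-up labels rather than results from Table 4; `binary_metrics` is an illustrative name.

```python
from typing import Dict, List


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels: 1 = infarcted, 0 = non-infarcted.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because the two classes are of unequal size, precision and recall complement accuracy by showing how the errors are distributed between missed and falsely flagged infarctions.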

Conclusion
Detecting heart abnormalities using ECG signals is a very challenging area. An accurate model for detecting abnormal ECG signals helps to offer correct treatment to patients at an early stage of disease. In this paper, an effective arrhythmia and myocardial infarction classification method that combines a 1D CNN and an LSTM model is proposed. The proposed CNN-LSTM model is applied to the computerized recognition of abnormal ECG signals. According to our experimental results, the features learned by the CNN are useful for a time-series approach such as LSTM. Moreover, the LSTM finds better patterns in the learned features than the fully connected layers alone. Consequently, the classification results obtained with the LSTM outperform the state of the art. The authors recommend using the proposed model for multiclass classification of arrhythmias and myocardial infarction instead of binary classification.