A Support Vector Machine Based Technique for Fault Detection in a Power Distribution Integrated System with Renewable Energy Distributed Generation

Article history: Received: 18 May 2020, Accepted: 17 July 2020, Online: 19 August 2020

Abstract
The integration of renewable energy distributed generation (REDG) into the energized distribution power grid has become more popular in recent years. This trend has been accelerated by global energy shortages. REDG has proven to be effective for energy sustainability and reliability. However, technical challenges arise from integrating REDG into the energized power grid. These challenges include the effectiveness of power grid protection against faults. In this paper, a fault diagnostic algorithm is proposed to detect faults in a power system integrated with REDGs. The algorithm utilizes the wavelet packet transform (WPT) for signal filtering and a support vector machine (SVM) for fault classification and detection. The proposed algorithm is validated using the Eskom 90 bus electrical system, and the results obtained show that faults can be detected with a high accuracy of 99%. The Eskom 90 bus system is modelled on the DIgSILENT platform and the algorithm is tested in the WEKA software.


Introduction
A reliable and sustainable source of energy plays a critical role in the potential growth of a state [1]. The global energy sector has faced many challenges over the years, such as energy shortages, high levels of air pollution from burning fossil fuels to generate electricity, and the high cost of coal. These problems have led to the search for alternative sources of electricity supply to meet the required demand. Renewable energy distributed generation (REDG) sources, for instance photovoltaic (PV), wind turbine (WT), hydropower, and biomass, have been effective for energy supply with minimum environmental impact. The flexibility with which REDGs can be introduced makes them a necessary part of a power mix framework for energy sustainability.
The integration of REDGs into the existing distribution power grid has numerous practical benefits such as voltage improvement, increased dependability, improved network performance, and power loss minimization [2,3]. However, integrating REDGs into the power distribution grid changes the traditional direction of the energy supply. These changes in the power flow may influence the power balance of the entire power grid [4]. Furthermore, the meteorological and geological dependency of REDGs affects the expansion planning of the power grid, which results in escalated operational costs [5]. Coupled with these problems are the technical effects that arise from integrating REDGs into the power grid, such as frequency variation [6], voltage fluctuations [7], and challenges to reliable and secure power flow [8,9]. Generally, the performance of the power grid with or without REDGs depends heavily on the reliability of the protection system. A high penetration level of REDGs in the grid affects the traditional topology of the protection system. This may result in a catastrophic incident if a fault is not cleared promptly and effectively. Integrating REDGs into the existing power grid also brings operational challenges that may affect the technical performance of the system. These challenges include voltage variations, power supply forecasting, frequency fluctuation, and load demand management.
ASTESJ ISSN: 2415-6698
A power system grid is prone to faults, whether internal or external. Electrical protection schemes are essential to promptly clear a fault in the power system [10,11]. In recent years, numerous methods for power system fault diagnosis have been proposed. In [12], a method based on voltage imbalance and total harmonic distortion (THD) was used to determine the faulty section of an integrated power system network. Passive methods, including frequency variation [13,14], the degree of change of power frequency [15], power and voltage variation [16], energy mismatch and harmonic content recognition [17], and the degree of variation of reactive power [18], have been applied for fault diagnosis in integrated power grid networks. Moreover, undetected faults have an undesirable effect on the overall performance of the power grid. The active algorithms used for fault diagnosis include frequency signal injection [19], singular and dual harmonic current injection [20,21], output power variation [22], reactive power control [23,24], and the impedance measurement method [25,26]. These methods were used to detect faults in power grid networks integrated with REDGs.
To improve fault detection algorithms, artificial intelligence (AI) and machine learning (ML) have been adopted. AI and ML algorithms have the advantage of discriminating and classifying different types of faults. Fault classification is an important element for reliability improvement and network security. The authors in [27] proposed a fault classification technique based on the decision tree algorithm. The technique was implemented in a PV plant connected to the distribution network. An artificial neural network (ANN) based method was implemented for fault diagnosis in a system interconnected with a WT energy system [28]. Another ANN-based method was proposed for islanding and fault detection in microgrids integrated with the power system [29]. Other AI and ML algorithms used for fault diagnosis in the power system include fuzzy logic [30,31], the adaptive neuro-fuzzy inference system (ANFIS) [32], and the Bayesian classifier [33]. Signal processing plays a major role in improving fault detection algorithms. Signal decomposition techniques, including wavelet transforms (WT) [34], the Hilbert-Huang transform (HHT), multi-resolution analysis (MRA), and the Intrinsic Mode Function (IMF) [35], were used for signal tracking to improve the fault detection algorithm.
Statistical data analysis is a critical aspect of the decision-making process. The support vector machine (SVM) has been successfully used in power engineering applications for fault diagnosis. The authors in [36] applied SVM for fault detection in high voltage systems. The quadratic function was implemented as the optimal kernel function for effective fault diagnosis. A technique based on SVM for fault diagnosis in an active distribution network integrated with a PV plant was proposed in [37]. The technique employed SVM for fault classification and isolation.
A technique based on a modified multi-class support vector machine (MMC-SVM) was proposed to detect and classify different faults in a power system [38]. A maximal overlap discrete wavelet transform (MODWT) based technique was proposed for fault detection in a power system [39]. In [40], a method based on the multi-agent system (MAS) technique was proposed for fault diagnosis and location in a power system network integrated with DGs. A probabilistic Boolean network (PBN) technique was proposed for fault diagnosis in a smart power grid for performance and technical improvement [41]. A novel technique based on an unknown input observer (UIO) method was proposed for fault detection and isolation in microgrids [42].

Manuscript Organization
In this work, a hybrid fault diagnostic method based on the wavelet packet transform (WPT) and the support vector machine (SVM) is proposed. The WPT is used for signal decomposition and feature extraction, which reduces the computational burden and improves processing time. The particle swarm optimization (PSO) algorithm is implemented to determine the optimal parameters of the SVM for maximum classification and detection accuracy. The contributions of the paper are summarized as follows:
• The SVM method is used for a dual purpose (fault detection and classification).
• A feature extraction and selection technique using WPT is used to minimize the data size and thereby reduce the computational burden and time.
The remaining sections are organized as follows. Signal processing, feature extraction, and feature selection are discussed in Section 2. The SVM method is discussed in Section 3. Section 4 discusses the implementation of the proposed fault diagnostic method. The results are discussed in Section 5, and a conclusion is drawn in Section 6.

Signal Processing
Signal tracking and spectrum analysis are essential elements of signal decomposition and have been utilized in numerous signal analysis applications. Signal tracking is mostly used to distinguish the signal of interest from a range of signals that are present. Most power utilities use spectral analysis to analyze the signals recorded from a power system during fault conditions to determine the nature of the fault. In the present work, WPT is used to track the signals of interest to enhance the fault detection algorithm.

Wavelet Packet Transform
The WPT technique is a mathematical tool used for tracking and analyzing signals. The WPT may be regarded as a generalization of the discrete wavelet transform (DWT) that produces more detailed signal analysis results [43]. The WPT can be utilized for numerous expansions of a signal at different levels. The DWT has been widely used in power system applications and has proven to be effective. However, the DWT does not produce good results for small values, which has led to the implementation of WPT in the present work. When using the WPT, the signal $x(t)$ is passed through a bank of low-pass and high-pass filters. The low-frequency content is represented by the approximation coefficient ($A$) and the high-frequency content is represented by the detail coefficient ($D$). Both the $A$ and $D$ coefficients are further decomposed recursively up to a particular level $j$. The main advantage of WPT over DWT is that more features are obtained and the frequency resolution of WPT is greater than that of DWT.
In the present work, a sampling frequency of 30 kHz is used and a level-4 decomposition is considered. Shannon's entropy criterion is selected and is used to calculate the entropy at each decomposition level [44]. The best decomposition results are obtained when the parent entropy is greater than the total entropy of the decomposed level. The WPT decomposition tree of the present work is presented in Figure 1.
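The level-4 packet decomposition described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the Haar wavelet is swapped in for simplicity so that the example needs only the standard library, the 50 Hz test signal and the 512-sample window are assumptions, and the function names are hypothetical.

```python
import math

def haar_split(samples):
    """Split one node into (approximation, detail) with orthonormal Haar filters."""
    scale = 1.0 / math.sqrt(2.0)
    pairs = [(samples[i], samples[i + 1]) for i in range(0, len(samples) - 1, 2)]
    approx = [(a + b) * scale for a, b in pairs]   # low-pass branch
    detail = [(a - b) * scale for a, b in pairs]   # high-pass branch
    return approx, detail

def wavelet_packet_nodes(samples, level):
    """Full packet tree: every node is split again, not only the approximations.

    The returned nodes follow the filter path order, which is not strictly
    increasing frequency order.
    """
    nodes = [list(samples)]
    for _ in range(level):
        children = []
        for node in nodes:
            approx, detail = haar_split(node)
            children.extend([approx, detail])
        nodes = children
    return nodes

SAMPLING_HZ = 30_000  # sampling frequency stated in the text
# Assumed test input: 512 samples of a 50 Hz sinusoid (illustrative only).
signal = [math.sin(2.0 * math.pi * 50.0 * n / SAMPLING_HZ) for n in range(512)]
nodes = wavelet_packet_nodes(signal, level=4)
print(len(nodes), len(nodes[0]))  # 16 terminal nodes, 32 coefficients each
```

A level-4 tree yields 2^4 = 16 terminal nodes, which is where the 16 coefficient sets used for feature extraction come from.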

Extraction of features
The extraction of features from a signal is a process of reducing a large data matrix by selecting significant features that represent a specific pattern, which improves computational processing time. The data matrix is thereby converted to a feature matrix with fewer data points. The entropy and the energy features are extracted from the fault current signal. Energy is mathematically defined as [45,46]:
$$E = \int_{t_1}^{t_2} |x(t)|^2 \, dt \tag{1}$$
where $E$ is the signal energy over the time range $(t_1, t_2)$ and $x(t)$ is the signal. The energy of a fault signal is greater than that of a normal signal. The entropy feature is used to measure the information content of the signal [45]. It is a cost function of the signal $x(t)$, denoted by the entropy $S$, such that $S(0) = 0$. The entropy is mathematically defined in eq. (2).
$$S(x) = -\sum_{i} c_i^2 \log\left(c_i^2\right) \tag{2}$$
where $c_i$ are the decomposed coefficients of the signal $x(t)$. The value of the entropy $S$ is large for transient signals and small for normal signals.
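As a hedged illustration of the two features, the sketch below computes a discrete energy (sum of squared coefficients, omitting the time step) and a non-normalized Shannon entropy for each set of coefficients. The exact discretization and entropy form used in this work are assumptions here, and the helper names are hypothetical.

```python
import math

def energy(coeffs):
    """Discrete signal energy: sum of squared coefficients (time step omitted)."""
    return sum(c * c for c in coeffs)

def shannon_entropy(coeffs):
    """Non-normalized Shannon entropy, with the convention 0 * log(0) = 0."""
    total = 0.0
    for c in coeffs:
        p = c * c
        if p > 0.0:
            total -= p * math.log(p)
    return total

def feature_vector(nodes):
    """Energy and entropy of every node, concatenated into one feature row."""
    return [energy(n) for n in nodes] + [shannon_entropy(n) for n in nodes]

# Assumed input: 16 placeholder nodes standing in for a level-4 decomposition.
dummy_nodes = [[0.1 * (k + 1), -0.05 * (k + 1)] for k in range(16)]
features = feature_vector(dummy_nodes)
print(len(features))  # 32 = 16 nodes x 2 statistical features
```

With 16 nodes and two statistics per node, each analyzed signal contributes a 32-element feature row.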

Feature Selection
Feature selection is a process of selecting the best features from the developed data matrix. The features are ranked based on their correlation with the target output, and the redundant features are then rejected. The feature selection process is imperative because large amounts of redundant data increase computational processing time and give erroneous results, which affects the efficiency of the classification algorithm [47]. The general concept of feature selection is to select the features that best represent the target. The forward feature selection technique is employed for selecting the best features. At each iteration, the technique determines the best feature to add and a reduced input feature matrix is developed, thus removing the redundant features [47]. The mean square error (MSE) function is used to evaluate the ranking of features determined by the K-nearest neighbor (KNN) technique [48]. The features are ranked from high to low depending on the margin of error.
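The forward selection loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this work: the leave-one-out evaluation, the value of k, the toy data, and all function names are hypothetical.

```python
def knn_predict(train_x, train_y, query, k):
    """Average target of the k nearest training rows (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), y)
        for row, y in zip(train_x, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)

def loo_mse(x, y, cols, k):
    """Leave-one-out MSE of a KNN predictor restricted to the given columns."""
    err = 0.0
    for i in range(len(x)):
        train_x = [[row[c] for c in cols] for j, row in enumerate(x) if j != i]
        train_y = [t for j, t in enumerate(y) if j != i]
        pred = knn_predict(train_x, train_y, [x[i][c] for c in cols], k)
        err += (pred - y[i]) ** 2
    return err / len(x)

def forward_select(x, y, n_select, k=1):
    """Greedily add the feature that gives the lowest MSE at each step."""
    selected, remaining, history = [], list(range(len(x[0]))), []
    while remaining and len(selected) < n_select:
        best_mse, best_col = min(
            (loo_mse(x, y, selected + [c], k), c) for c in remaining
        )
        selected.append(best_col)
        remaining.remove(best_col)
        history.append((best_col, best_mse))
    return history

# Toy data (illustrative only): column 0 separates the classes, column 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 2.0], [1.1, 8.0], [1.2, 4.0]]
Y = [0, 0, 0, 1, 1, 1]
ranking = forward_select(X, Y, n_select=2, k=1)
print(ranking[0][0])  # index of the first selected feature
```

Each entry of the returned history pairs a selected feature index with the error reached after adding it, so a ranking by margin of error follows directly.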

Pattern Recognition and Classification
Pattern recognition and classification have been a subject of interest for many researchers in past decades. The interest has arisen because of the many applications, ranging from speech recognition and image identification to power system fault analysis and optical character recognition. It is therefore important to build intelligent machines that can reliably and accurately solve classification problems. Generally, classification is defined as a process of categorization, in which data, objects, and ideas are recognized and understood to produce an accurate response.

Support Vector Machine
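The support vector machine (SVM) was originally developed to solve statistical problems in empirical data modelling. When using SVM, the input data is mapped into a high-dimensional space to determine the separating margin between two classes of data. The hyperplane is the separating boundary between the two classes of data [49]. The hyperplane is optimal when the margin, i.e. the distance between the hyperplane and the nearest data points of each class, is maximized. The hyperplane can be calculated using the quadratic programming method defined mathematically as:
$$\min_{w,\,b,\,\xi}\; \frac{1}{2}\|w\|^2 + C\sum_{i}\xi_i \quad \text{subject to} \quad y_i\left(w^{T}\phi(x_i) + b\right) \ge 1 - \xi_i,\;\; \xi_i \ge 0 \;\; \forall i \tag{3}$$
where $x_i$ is the ith example and the class label, which is either +1 or -1, is represented by $y_i$. Here $w$ and $b$ are the weight vector and bias of the hyperplane, $\phi$ is the mapping into the high-dimensional space, $\xi_i$ are slack variables, and $C$ is the penalty parameter. The problem is solved using its dual form
$$\max_{\alpha}\; \sum_{i}\alpha_i - \frac{1}{2}\sum_{i}\sum_{j}\alpha_i\alpha_j y_i y_j K(x_i, x_j) \tag{4}$$
subject to $0 \le \alpha_i \le C \;\forall i$, $\sum_{i}\alpha_i y_i = 0$, where $\alpha_i$ are the Lagrange multipliers and $K(x_i, x_j) = \phi(x_i)^{T}\phi(x_j)$ is the kernel function.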
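Once the dual problem has been solved, classification reduces to a kernel-weighted sum over the support vectors. The sketch below shows that decision rule only; it does not train an SVM. The two-point example, its hand-derived multipliers, and the function names are assumptions for illustration.

```python
import math

def linear_kernel(u, v):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def rbf_kernel(u, v, gamma=1.0):
    """Radial basis function kernel; gamma is a free parameter (assumed value)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def decision_value(x, support_vectors, labels, alphas, bias, kernel):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(
        a * y * kernel(sv, x)
        for sv, y, a in zip(support_vectors, labels, alphas)
    ) + bias

def classify(x, support_vectors, labels, alphas, bias, kernel):
    """Return +1 or -1 according to the sign of the decision value."""
    value = decision_value(x, support_vectors, labels, alphas, bias, kernel)
    return 1 if value >= 0.0 else -1

# Hand-solved toy problem: one point per class, maximum-margin solution
# w = (1, 0), b = -1, which corresponds to alpha = 0.5 for both points.
svs = [(0.0, 0.0), (2.0, 0.0)]
ys = [-1, 1]
alphas = [0.5, 0.5]
bias = -1.0
print(classify((3.0, 0.0), svs, ys, alphas, bias, linear_kernel))  # 1
print(classify((0.5, 0.0), svs, ys, alphas, bias, linear_kernel))  # -1
```

Swapping `linear_kernel` for `rbf_kernel` changes only the similarity measure; the decision rule stays the same.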

Figure 2: SVM parameter selection process using PSO (flowchart steps: evaluate the fitness of each data set; check whether the best parameters are achieved; if yes, output the optimal SVM parameters; if no, repeat)
A kernel function can be used to solve the problem of nonlinearity in statistical analysis. The linear, quadratic, radial basis function (RBF), and sigmoid kernels are the most commonly used kernel functions. The advantage of SVM is that it gives a global solution, it is less prone to overfitting, and it does not get trapped in local minima. The selection of SVM parameters is a significant and critical task for accurate fault diagnosis in a power distribution system [50]. To improve the performance of SVM, a particle swarm optimization (PSO) scheme is used to select the best parameters of the SVM.
The PSO technique was initially developed by Eberhart and Kennedy in 1995 to solve optimization problems [51]. The PSO technique is founded on the social behaviour of birds in flight. The PSO relies on updating the position and velocity of each particle at every iteration until an optimal solution is determined. The process begins with generating random particles given by $x_i$. The best solution is determined by calculating the fitness value of each particle. Finally, the velocity of each particle ($v_i$) is updated by:
$$v_i^{k+1} = \omega\,v_i^{k} + c_1 r_1\left(p_i - x_i^{k}\right) + c_2 r_2\left(g - x_i^{k}\right) \tag{5}$$
where $g$ is the best global solution, $p_i$ is the best solution found so far by particle $i$, $x_i^{k}$ is the solution at the current position, $c_1$ and $c_2$ are non-negative constants representing the best local and global position weights respectively, $r_1$ and $r_2$ are random numbers drawn uniformly from [0, 1], and $\omega$ is the inertia coefficient. The position of the particle is updated using the following expression:
$$x_i^{k+1} = x_i^{k} + v_i^{k+1} \tag{6}$$
To estimate the fitness value of the SVM by utilizing the PSO technique, the fitness function is represented as:
$$F = \frac{1}{N}\sum_{n=1}^{N}\left(y(n) - \hat{y}(n)\right)^2 \tag{7}$$
where $N$ denotes the number of discrete samples, $y(n)$ is the discrete signal, and $\hat{y}(n)$ is the SVM output. The process of obtaining optimal parameters for the SVM is presented as a flow chart in Figure 2.
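The particle update loop can be sketched as below. This is a generic PSO under stated assumptions rather than the configuration used in this work: the inertia and acceleration values, the clamping of positions to the search bounds, and the surrogate fitness function (a stand-in for an SVM validation error over two hypothetical parameters) are all illustrative.

```python
import random

def pso_minimize(fitness, bounds, n_particles=20, n_iter=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over the box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Position update, clamped to the search box (an added assumption).
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def surrogate_error(params):
    """Hypothetical stand-in for SVM validation error, minimal at (1.0, -2.0)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

best, best_val = pso_minimize(surrogate_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
print(best, best_val)
```

In an actual parameter search, `surrogate_error` would be replaced by a function that trains the SVM with the candidate parameters and returns its error on held-out data.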

Proposed Fault Detection Technique
This section discusses the proposed fault diagnostic technique, which is applied in a power grid network. The fault detection scheme proposed in this work is presented in Figure 3. One cycle of the fault current signal is analyzed to detect the type of fault that occurred in the power grid network. The fault current signal is decomposed into frequency sub-bands using WPT. From the decomposed fault current signal, the statistical signal features (energy and entropy) are extracted. The total number of features used to build the feature matrix is 32 (16 coefficients × 2 statistical features). The fault current signal data is then generated under the considered simulation conditions. The generated data is then divided into training and testing data.
Some of the features extracted using WPT do not predict the desired results accurately and thus reduce the efficiency of the scheme. A feature selection technique is employed to eliminate them. The selection technique uses a ranking algorithm to eliminate features that are redundant and compromise the accuracy of the proposed technique. The best selected features are then fed into a classifier for fault identification. The PSO method is utilized to determine the optimal parameters of the SVM classifier. The fault classification scheme using the SVM technique is presented in Figure 4. In the presented scheme, each phase of the power system has its own classifier to identify faults occurring in that phase, and another SVM is placed to detect ground faults. To classify different fault conditions, the SVM output is either '+1' or '0', where '+1' shows that there is a fault and '0' shows that there is no fault in the power system. In practice, a line-to-line fault is often misclassified as a line-to-line-to-ground fault, and this may affect the restoration time. To solve this problem, a separate SVM scheme is positioned between the phase and ground, where a zero-phase-sequence current index is utilized as presented in Figure 4. The index threshold value is determined using a trial-and-error method; in the present work, the value is set at 0.03. A ground fault is detected if the index value is greater than the set threshold. The current index can be defined mathematically as:

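$$I_0 = \frac{1}{3}\left|i_a + i_b + i_c\right| \tag{8}$$
where $i_a$, $i_b$, and $i_c$ are the instantaneous current values. The classification accuracy ($\eta$) is determined by:
$$\eta = \frac{N_c}{N_t} \times 100 \tag{9}$$
where $N_c$ represents the number of accurate fault classifications and $N_t$ is the total number of classified cases.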
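The ground-fault check and the accuracy calculation can be illustrated with a short sketch. The index form used here (one third of the magnitude of the summed phase currents, with currents assumed to be in per unit) is an assumption, as are the direct threshold comparison in place of a trained ground classifier, the label format, and the function names. Only the 0.03 threshold comes from the text.

```python
GROUND_THRESHOLD = 0.03  # threshold value reported in the text

def zero_sequence_index(ia, ib, ic):
    """Assumed index: magnitude of the zero-sequence current of three phase currents."""
    return abs(ia + ib + ic) / 3.0

def fault_label(phase_outputs, ia, ib, ic, threshold=GROUND_THRESHOLD):
    """Combine per-phase classifier outputs (1 = fault, 0 = no fault) with the ground check.

    phase_outputs is a dict such as {'A': 1, 'B': 1, 'C': 0}.
    """
    phases = "".join(p for p in "ABC" if phase_outputs.get(p, 0) == 1)
    if not phases:
        return "no fault"
    grounded = zero_sequence_index(ia, ib, ic) > threshold
    return phases + ("G" if grounded else "")

def classification_accuracy(n_correct, n_total):
    """Percentage of correctly classified cases."""
    return 100.0 * n_correct / n_total

# Illustrative per-unit currents: the first case sums to zero, the second does not.
print(fault_label({"A": 1, "B": 1, "C": 0}, 2.0, -2.0, 0.0))  # AB
print(fault_label({"A": 1, "B": 1, "C": 0}, 2.0, -1.5, 0.1))  # ABG
print(classification_accuracy(99, 100))                       # 99.0
```

The first case shows why the index helps separate line-to-line faults from line-to-line-to-ground faults: without a ground path the phase currents sum to approximately zero.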

Power System Case-Study
In this work, an Eskom 90 bus 22 kV system is considered. The power system network is modelled using the DIgSILENT PowerFactory platform. The PV and WT sources are connected to the network at locations optimally determined to satisfy technical considerations (power loss, voltage stability, fault levels, power quality, etc.). The proposed Eskom network is presented in Figure 5. The base voltage of the network is 22 kV and the base apparent power is 100 MVA. The total load connected to the system is 115.5 MVA. The maximum fault level at the substation is 15 kA. The PV connected to the system is rated at 50 kW and the WT is rated at 120 kVA.

Results and Discussion
An integrated power mix energy distribution system is considered in this work. The simulated fault current signals are presented in Figure 6(a). The sampling frequency considered in the present work is 30 kHz. The WPT scheme is used to decompose the signal at level 4, and the entropy and energy features are extracted. The choice of mother wavelet is vital for analyzing the signals. In the present work, Daubechies 4 is selected as the best for transient signal analysis. Some of the WPT decomposed features are presented in Figure 6(d). Furthermore, the best features are obtained from the extracted features. The best WPT features for accurately predicting the target are presented in Table 1. After the best features are acquired, they are subdivided into training and testing data sets. The training and testing parameters under various conditions are presented in Table 2.
Different fault current cases are simulated and the data is subsequently fed into the SVM for classification. The fault training matrix used with the SVM is detailed in Table 3. In the present work, the SVM output is either '+1' or '0', where '+1' shows that there is a fault and '0' is the output for a non-faulty section in the corresponding phase.
The fault classification accuracy obtained with different mother wavelets is evaluated to select the optimal one, and the results are shown in Table 4. In the present work, we further investigated other signal tracking analysis techniques such as the wavelet transform (WT) and the Fourier transform (FT). Based on the analysis done, the WPT signal analysis technique performed better than both the FT and WT. This is because, when using the WPT, both the low and the high frequencies are analyzed, which improves the data analysis of a signal. From the results, db4 has the highest accuracy level and is thus selected for the scheme. The PSO technique is used to obtain the optimal parameters of the SVM. In Table 5, the best SVM parameters are presented. The PSO parameters are presented in Table 6. In Table 7, the classification results of different fault cases are presented. From the presented results it may be seen that the accuracy of classification is 1321. In the present work, we also tested the Naïve Bayes (NB), Neural Network (NN), Decision Tree (DT), and K-Nearest Neighbor (KNN) classifiers. These techniques are implemented in the free machine learning platform Waikato Environment for Knowledge Analysis (WEKA). The classification process is carried out by developing a classifier based on the training and testing of different class labels. Subsequently, the test data set is applied to the classifier to assess the accuracy of the classification.
The different classifiers are described below:
• Naïve Bayes (NB): The NB algorithm is recognized as a fast learning technique. It is a simplified version of the Bayesian classifier and functions under certain assumptions: (i) attributes are conditionally independent given the class label, and (ii) the prediction process is not affected by latent attributes [52].
• Decision Tree (DT): The DT is a well-known, efficient data mining algorithm that solves difficult problems by building tree-structured graphical representations of decisions. The DT algorithm has been used proficiently to solve real-world problems [53].
• Neural Network (NN): The NN technique is inspired by the biological structure of the human brain. This algorithm performs well in big data analysis and has proven to be efficient for classification and prediction purposes [54].
The performance of the different classifiers is evaluated using the confusion matrix. The predicted instance counts of the NB, DT, and NN classifiers are presented in Table 8, Table 9, and Table 10 respectively. The performance comparison of the SVM, NB, DT, and NN techniques applied for classification is presented in Tables 11 to 14 respectively. To reduce the computational time, the fault current data set is subdivided and a quarter of the data samples is used for fault identification. From the presented results, the average precision of SVM, NB, DT, and NN is 99.9%, 83%, 99.7%, and 94% respectively. For this application, the SVM classifier performed better than the other classifiers tested in the present work.
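As an illustration of how precision figures are derived from a confusion matrix, the sketch below computes per-class precision, average precision, and overall accuracy. The matrix values are made up for the example and are not results from this work; the function names are hypothetical.

```python
def per_class_precision(matrix):
    """matrix[i][j] = number of instances of true class i predicted as class j."""
    n = len(matrix)
    precisions = []
    for j in range(n):
        predicted_j = sum(matrix[i][j] for i in range(n))
        precisions.append(matrix[j][j] / predicted_j if predicted_j else 0.0)
    return precisions

def average_precision(matrix):
    """Unweighted mean of the per-class precision values."""
    values = per_class_precision(matrix)
    return sum(values) / len(values)

def overall_accuracy(matrix):
    """Fraction of instances on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Made-up two-class confusion matrix (fault / no fault), for illustration only.
example = [[8, 2],
           [1, 9]]
print(per_class_precision(example))
print(average_precision(example))
print(overall_accuracy(example))  # 0.85
```

Precision looks down each column of the matrix (what fraction of the predictions for a class were right), whereas accuracy pools all classes together.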

Conclusion
An increase in electricity demand has enlarged the technical variations of the load-demand phenomenon. To solve this problem, external electricity sources have been considered. The environmental attributes of REDG improve air quality and thus contribute positively to people's health. However, integrating REDGs has technical challenges that must be addressed. In the present work, we focus on the fault diagnostic mechanism when REDGs are integrated into the distribution system. An Eskom power system is modelled, and various fault studies are carried out. A fault diagnostic technique is proposed that consists of a signal processing scheme, a feature extraction section, a feature selection section, and a fault diagnostic section. The WPT is used to decompose the signal into frequency sub-bands, and the entropy and energy features are subsequently extracted from the decomposed signal. A feature selection scheme is then used to select the best of the extracted features to improve the computational time and reduce the computational burden. The selected features are fed into the SVM classifier to determine the fault occurrence in the network. The PSO algorithm is used to determine the best parameters of the SVM. We further investigated the effectiveness of other classification algorithms. From the results obtained, the SVM classifier performed best, with an accuracy of 99%. Future work will entail a fault location scheme for an integrated system.