Review on Smart Electronic Nose Coupled with Artificial Intelligence for Air Quality Monitoring

ARTICLE INFO
Article history: Received: 13 February, 2020; Accepted: 10 April, 2020; Online: 20 April, 2020

ABSTRACT
With the advent of Internet of Things (IoT) technologies, smart homes, and smart city applications, the electronic nose (E-nose) was developed. Most of the gas sensors that make up an electronic nose system suffer from cross-sensitivity and a lack of selectivity. Coupling smart gas sensors with artificial intelligence algorithms can thus empower conventional gas sensing technologies and increase accuracy in gas detection. This study describes the main types of smart gas sensors used in air quality control, as well as signal pre-processing and feature extraction. It also presents the pattern recognition methods used in E-nose applications, including linear methods such as Linear Discriminant Analysis (LDA) and K-Nearest Neighbors (KNN), and non-linear algorithms such as Support Vector Machine (SVM) and Artificial Neural Network (ANN), and their impact on improving the accuracy of gas detection. Finally, this paper concludes with directions on how to leverage the benefits of combining these classifiers, an approach known as data fusion and ensemble classification.


Introduction
Nowadays, with the advent of the Internet of Things (IoT), society is interconnected with smart cities and smart industry thanks to the growth of smart sensors. Among these sensors, smart gas sensors have seen growing use in many applications: food quality testing [1], environmental safety monitoring [2], disease diagnosis [3], and detecting the freezing time of fresh vegetables [4]. Smart gas sensors are also incorporated into smart home and building applications such as gas leakage detection [5].
Air pollution is one of the most serious hazards to our health. Many works in this field have created new instruments for monitoring air quality [6], [7]. Owing to its low cost, small size and robustness, the electronic nose approach is considered the most promising gas detection method in several fields.
An electronic nose (E-nose) mimics the human olfactory system [8]. The E-nose system consists of two important elements: a hardware component, which is an array of gas sensors, coupled with a software component, which consists of pattern recognition methods. E-noses are among the most relevant artificial intelligence systems. The gas sensor array is made from different materials including thin metal oxide films, nanowires, and nanotubes. Each sensor gives a different response to each gas. The collection of signals emitted by the sensors exposed to a sample reflects the "cross selectivity" of the sensors and forms a "fingerprint", also called a signature. All of the signatures constitute a database. These signatures then require data processing methods to group gases into different classes. The gas identification process can be divided into four successive stages: signal preprocessing, dimensionality reduction, classification, and validation. The preprocessing stage is often used to extract a certain number of parameters that describe the sensor array response [9], though preprocessing depends on the underlying sensor technology. It involves three general steps: baseline manipulation, compression and normalization [10]. Principal Component Analysis (PCA) is used for dimensionality reduction. This method is widely used in gas identification systems and machine olfaction [11].
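As a rough illustration of the three pre-processing steps mentioned above, the following Python sketch processes one hypothetical sensor-array recording. The array shapes, the fractional form of the first (baseline) manipulation step, the use of the largest response as the compressed feature and the unit-norm scaling are all assumptions made for illustration; they are not taken from [9] or [10].

```python
import numpy as np

def preprocess(response, baseline):
    """Turn one raw recording into a normalized fingerprint.

    response: array of shape (n_time_steps, n_sensors), raw sensor readings.
    baseline: array of shape (n_sensors,), readings in clean reference air.
    Both inputs are hypothetical; real sensors may need other conventions.
    """
    response = np.asarray(response, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    # 1) Manipulation (assumed fractional form): (R(t) - R0) / R0
    fractional = (response - baseline) / baseline
    # 2) Compression: keep one descriptor per sensor, the largest absolute change
    idx = np.abs(fractional).argmax(axis=0)
    features = fractional[idx, np.arange(fractional.shape[1])]
    # 3) Normalization: scale the feature vector to unit Euclidean norm
    norm = np.linalg.norm(features)
    return features / norm if norm > 0 else features

baseline = np.array([10.0, 20.0, 5.0])
transient = np.array([[10.0, 20.0, 5.0],
                      [12.0, 18.0, 5.5],
                      [15.0, 16.0, 6.0]])
fingerprint = preprocess(transient, baseline)
```

The resulting vector depends only on the relative pattern across sensors, which is one simple way to reduce the influence of overall concentration on the signature.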
On the other hand, the performance of a smart electronic nose depends heavily on the pattern recognition scheme in use. Artificial intelligence and machine learning algorithms are used to achieve successful discrimination and detection of pollutant gases.

ASTESJ ISSN: 2415-6698
Several machine learning models have recently been explored in E-nose applications. They can be divided into two categories. The first comprises linear models such as k-nearest neighbor (KNN) [12] and linear discriminant analysis (LDA) [13]. The second comprises nonlinear classification models such as the multilayer perceptron (MLP) [14], the radial basis function neural network (RBFNN) [15] and the decision tree (DT) [16]. The support vector machine (SVM) is a promising classifier that has been widely applied in gas identification thanks to its kernel functions, which handle non-linear problems [17].
In addition to individual classifiers, classifier ensembles have attracted considerable attention for E-nose pattern recognition [18]. Combining classifiers takes advantage of the strengths of the individual classifiers to improve gas identification accuracy.
This review summarizes recent classification algorithms used in gas detection with the E-nose approach and the pertinent research published in recent years in this field. The remainder of this paper is organized as follows: Section 2 presents a literature review on smart technologies for air quality monitoring and artificial intelligence algorithms. Section 3 presents related works on gas sensors and data preprocessing. Section 4 presents findings and discussion. Section 5 concludes by summarizing the investigation of smart electronic noses.

Literature review
Smart city applications are an important field of interest. Such technologies can be grouped into the following categories: Smart Energy, Smart Lighting, Smart Transportation, Smart Parking, Smart Driving, Smart Buildings and Towns, Smart Grid, Smart Health and Smart Environment.
The notion of dedicated and non-dedicated sensors in smart cities was introduced by the authors of [19]. The first type comprises sensors dedicated to a particular mission, including ambient sensors. Non-dedicated sensors are sensors embedded in smartphones, such as the accelerometer, gyroscope, GPS, microphone, and camera; this latter type can be used for various applications. Derek Doran et al. described how people can serve as human sensors that provide specific sources of information for smart cities. The authors developed a methodology showing how human sensors, supported by smartphones and other mobile devices, can provide data streams [20].
For example, the authors of [21] proposed an approach to energy sensing for smart buildings and houses. The approach used voltage noise signatures or voltage signatures to categorize the operation of various devices and to note the similarity or difference between the spectral envelope of the harmonics and existing templates.
In this paper we concentrate on Smart Environment sensing technologies, more precisely pollution sensing technologies.
Given the harmful effects of air pollution, environmental monitoring remains a significant challenge. This monitoring involves expensive and fairly complex techniques, hence the idea of designing an electronic nose, also called an artificial nose. Among smart sensing technologies, the electronic nose (E-nose) can greatly enhance human quality of life. An E-nose is an instrument capable of detecting odors and distinguishing between them. It consists of odor-sensitive sensors, a signal conditioning system and data processing software [22].
As shown in Figure 1, Gardner et al. [23] described an electronic nose as having four main components: first, a diffusion device that transfers the air to be analyzed from outside into the instrument; then a chamber that contains the sensors; a signal processing device; and simple software for identification and classification.

Figure 1. Smart electronic nose system structure diagram

As seen in Figure 1, the sensor array is the key component of the electronic nose. The choice of materials is very important. The most widely used type is the metal oxide (MOX) gas sensor [24]-[28], for several reasons. In particular, MOX sensors have a very high sensitivity to various gases [29].

Feature extraction and dimensionality reduction
The most widely adopted data analysis method for dimensionality reduction is principal component analysis (PCA), which is a linear and unsupervised method [30]. PCA is a statistical tool allowing simple visualization of all related information. Starting from a data matrix of n observations on p correlated variables X1, ..., Xp, the PCA procedure comprises the following phases [31]:
• Compute the mean of each variable.
• Compute the standard deviation of each variable.
• Compute the covariance matrix $\Sigma$.
• Compute the centered and reduced (standardized) data matrix.
• Compute the eigenvectors and eigenvalues by solving $\Sigma V = V \Lambda$, where $\Lambda$ is the diagonal matrix of the eigenvalues of the covariance matrix and the columns of $V$ are the corresponding eigenvectors.
• Order the eigenvalues in descending order, $\lambda_1 > \dots > \lambda_p$.
We denote by PC1, PC2, ..., PCk, ..., PCq the principal components, each of which is a linear combination of the original variables X1, ..., Xp:
$$PC_k = u_{1k} X_1 + u_{2k} X_2 + \dots + u_{pk} X_p \quad (6)$$
The coefficients $u_{jk}$ are calculated in such a way that the PCk are pairwise uncorrelated, have maximal variance and are of decreasing importance.
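For reference, the first steps of this procedure are commonly written as follows. This is a standard textbook formulation, assuming a data matrix with entries $x_{ij}$ (observation $i$, variable $j$) and the $1/n$ normalization; the notation and normalization used in [31] may differ.

```latex
\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}, \qquad
\sigma_j = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(x_{ij}-\bar{x}_j\right)^2}, \qquad
z_{ij} = \frac{x_{ij}-\bar{x}_j}{\sigma_j}, \qquad
\Sigma = \frac{1}{n}\, Z^{\top} Z
```

Here $Z$ is the centered and reduced matrix with entries $z_{ij}$. Under these assumptions $\Sigma$ is the covariance matrix of the standardized variables, which coincides with the correlation matrix of the original ones.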
The principal components have the following characteristics. The first component extracted in a principal component analysis typically accounts for a large amount of the total variance in the observed variables. The second component accounts for a certain amount of variance in the data set that was not taken into account by the first component.
It is also uncorrelated with the first component: components 1 and 2 are constructed to have zero correlation. The two axes retained after running PCA, called principal components, correspond to the largest eigenvalues and contain most of the information. The principal components capture the greatest variance of the data and are orthogonal.
Therefore, the percentage of the data variance explained by each principal component is determined from the corresponding eigenvalue.
The new data set obtained with the linear orthogonal transformation is plotted in two or three dimensions, retaining the most important information. Thus, PCA removes redundancy in the data and reduces the high dimensionality to fewer dimensions. However, this method cannot discriminate between gas classes. We remedy this problem by using different AI algorithms.
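The PCA procedure described in this section can be sketched in a few lines of NumPy. This is a minimal sketch rather than the implementation of [30] or [31]: it assumes the $1/n$ normalization, standardizes the variables before computing the covariance matrix, and uses synthetic data in place of real sensor responses.

```python
import numpy as np

def pca(X, n_components=2):
    """Project X (n observations x p variables) onto its first principal components."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)                      # mean of each variable
    std = X.std(axis=0)                        # standard deviation (1/n convention)
    std[std == 0] = 1.0                        # guard against constant variables
    Z = (X - mean) / std                       # centered and reduced matrix
    cov = (Z.T @ Z) / Z.shape[0]               # covariance of standardized data
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs[:, :n_components]     # coordinates on PC1, PC2, ...
    explained = eigvals / eigvals.sum()        # share of variance per component
    return scores, explained[:n_components]

# Synthetic stand-in for a 4-sensor array with two strongly correlated sensors
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * rng.normal(size=30)
scores, explained = pca(X, n_components=2)
```

The `explained` vector gives the fraction of variance carried by each retained component, which is the quantity usually reported on the axes of a PCA score plot.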

AI algorithms and pattern recognition
After the most significant parameters have been extracted from the sensor signal, the data generated by a sensor array is grouped into a set of independent variables, which are the outputs of the sensor array, and a set of dependent variables. These data are used by pattern recognition methods to identify the detected gases [32]. Electronic nose algorithms comprise three stages: drift calibration algorithms, signal preprocessing and feature extraction, and data analysis and classification with artificial intelligence algorithms [33].
Artificial intelligence algorithms can be divided into linear and nonlinear algorithms, given that the classification methods are supervised. Unsupervised methods are used for feature extraction and selection.
Supervised multivariate analysis requires the observations to be labeled; the expected results are all known a priori. It is composed of two separate processes [34]:
• Training stage: This process is often carried out in the lab. It consists of building the prediction model from known X data (the independent variables, i.e. the sensor responses) and the corresponding measured Y data.
• Evaluation (test) stage: In this step only the X variables are measured. The Y variables are calculated using the model that was constructed during the training process. This stage tests the accuracy and efficiency of the model.
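The two stages can be illustrated with a deliberately simple model. In the sketch below, a model is built from labelled measurements and then applied to held-out measurements, whose predicted labels are compared with the known ones. The nearest-centroid classifier, the tiny data set and the fixed train/test split are assumptions chosen for brevity, not the procedure of [34].

```python
import numpy as np

def train(X_train, y_train):
    """Training stage: build a model from known X and Y (one centroid per class)."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(model, X):
    """Evaluation stage: compute Y for new X using the trained model."""
    classes, centroids = model
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dist.argmin(axis=1)]

# Hypothetical labelled data: two gases (0 and 1), two features per measurement
X = np.array([[0.10, 0.20], [0.20, 0.10], [0.15, 0.15],
              [0.90, 1.00], [1.00, 0.90], [0.95, 0.95]])
y = np.array([0, 0, 0, 1, 1, 1])
train_idx = np.array([0, 1, 3, 4])   # measurements used to build the model
test_idx = np.array([2, 5])          # measurements held out for evaluation

model = train(X[train_idx], y[train_idx])
y_pred = predict(model, X[test_idx])
accuracy = float((y_pred == y[test_idx]).mean())
```

Keeping the test measurements out of the training stage is what makes the resulting accuracy an estimate of performance on unseen samples.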

Linear classification methods
Linear classification methods, which include linear discriminant analysis (LDA), k-nearest neighbors (KNN) and support vector machine (SVM), can train on and identify high-dimensional samples in gas recognition.
Like Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) is a factorial dimension reduction tool for exploring explanatory variables. It is used to display individual classes in an optimized graphical form. It is based on Fisher's work of 1936.
Unlike PCA, LDA is a supervised dimension reduction process, which relies on each individual's membership in a class specified a priori. The variables that describe individuals are quantitative variables, while a qualitative variable defines the classes. LDA in fact offers two approaches: the first is descriptive; the second is predictive and consists of deciding the class to which n new individuals, described by the same explanatory variables, should be allocated.
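The basic steps involved in the LDA algorithm are:
• Calculate the within-class scatter matrix
$$S_W = \sum_{i=1}^{c} \sum_{x \in C_i} (x - \mu_i)(x - \mu_i)^T$$
where $\mu_i$ is the mean of each class $C_i$ and $c$ is the number of classes.
• Calculate the between-class scatter matrix
$$S_B = \sum_{i=1}^{c} n_i (\mu_i - \mu)(\mu_i - \mu)^T$$
where $n_i$ is the number of observations in each class, $\mu_i$ is the mean of each class and $\mu$ is the mean of all the classes.
• Solve the eigenvalue problem
$$S_W^{-1} S_B V = V \Lambda$$
The purpose of LDA is to maximize the following objective:
$$J(V) = \frac{|V^T S_B V|}{|V^T S_W V|}$$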
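The LDA steps listed above can be sketched as follows. The code assumes the standard definitions of the within-class and between-class scatter matrices, takes the overall mean of the observations as the mean of all the classes, and uses a pseudo-inverse in case the within-class matrix is singular; the data set is a hypothetical two-class example.

```python
import numpy as np

def lda(X, y, n_components=1):
    """Project X onto the leading discriminant directions."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mu = X.mean(axis=0)                           # overall mean
    p = X.shape[1]
    S_W = np.zeros((p, p))                        # within-class scatter
    S_B = np.zeros((p, p))                        # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        S_W += (Xc - mu_c).T @ (Xc - mu_c)
        diff = (mu_c - mu).reshape(-1, 1)
        S_B += Xc.shape[0] * (diff @ diff.T)
    # Eigenvalue problem for S_W^{-1} S_B (pseudo-inverse used for robustness)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1]
    W = eigvecs[:, order[:n_components]].real
    return X @ W, W

X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],
              [6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])
y = np.array([0, 0, 0, 1, 1, 1])
projected, W = lda(X, y, n_components=1)
z0, z1 = projected[y == 0, 0], projected[y == 1, 0]
```

On this example the two classes occupy disjoint intervals along the single discriminant axis, which is the separation the objective is designed to produce. The sign of the axis is arbitrary.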
Implementing a K-Nearest Neighbors (KNN) classification model is simple, since it does not involve any learning phase. The only parameters involved are the number K of nearest elements and the distance between the elements that we wish to classify [35]. The distance must be chosen appropriately for the method to work properly, although the simplest distances give satisfactory results. The Euclidean distance or the Manhattan distance is widely used to measure the similarity between the training samples X = {X1, X2, X3, ..., Xn} and the data to be predicted x = {x1, x2, x3, ..., xn}.
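A minimal sketch of KNN classification with the two distances mentioned above is given below. The function name, the gas labels and the feature values are hypothetical and serve only to illustrate the voting rule.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3, metric="euclidean"):
    """Return the majority label among the k training samples closest to x."""
    diff = np.asarray(X_train, dtype=float) - np.asarray(x, dtype=float)
    if metric == "euclidean":
        dist = np.sqrt((diff ** 2).sum(axis=1))
    elif metric == "manhattan":
        dist = np.abs(diff).sum(axis=1)
    else:
        raise ValueError("metric must be 'euclidean' or 'manhattan'")
    nearest = np.argsort(dist)[:k]                 # indices of the k closest samples
    votes = Counter(y_train[i] for i in nearest)   # count labels among neighbours
    return votes.most_common(1)[0][0]

# Hypothetical two-feature signatures for two gases
X_train = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0],
           [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]]
y_train = ["CO", "CO", "CO", "NO2", "NO2", "NO2"]

label_a = knn_predict(X_train, y_train, [1.1, 1.0], k=3)
label_b = knn_predict(X_train, y_train, [3.0, 3.0], k=3, metric="manhattan")
```

Because all the work happens at prediction time, the cost of each call grows with the size of the training set.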
The support vector machine (SVM) is intended primarily for binary classification, i.e. two-class discrimination. Multiclass models were therefore built to handle more complex cases involving more than two classes. SVM is based on "kernel methods", which allow optimum separation of the data. SVM algorithms are very powerful for the identification of classes. SVM determines the optimal separating hyperplane between the groups so as to maximize the margin. Two cases arise: linearly separable and nonlinear. The basic steps in the SVM algorithm are as follows. For the linearly separable case, the hyperplane has Equation (1):
$$w^T x + b = 0 \quad (1)$$
The distance from a point $x$ to the hyperplane is:
$$d(x) = \frac{|w^T x + b|}{\|w\|}$$
Maximizing this distance amounts to minimizing $\|w\|$.
To minimize $\|w\|$, the Lagrange dual problem is solved for the coefficients $\alpha_i$ as follows:
$$\max_{\alpha} \sum_{i} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j x_i^T x_j \quad \text{subject to } \alpha_i \ge 0,\ \sum_i \alpha_i y_i = 0$$
where $y_i \in \{-1, +1\}$ is the class label of training sample $x_i$.
Kernel functions are introduced for the case that is not linearly separable. Examples of such kernels are the polynomial kernel (in MATLAB, the polynomial order d is 3 by default) and the RBF kernel (δ = 1 by default in MATLAB).
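To make the two kernels concrete, the sketch below computes their Gram matrices with NumPy. The expressions used, $(1 + x^T x')^d$ for the polynomial kernel and $\exp(-\|x - x'\|^2 / (2\sigma^2))$ for the RBF kernel, are common textbook forms and are an assumption here: the exact parameterization used by MATLAB or by the cited works may differ. The defaults `d=3` and `sigma=1.0` simply mirror the default values quoted in the text, with `sigma` playing the role of the RBF width parameter.

```python
import numpy as np

def polynomial_kernel(X, Y, d=3):
    """Assumed form: K(x, x') = (1 + x . x')^d."""
    return (1.0 + X @ Y.T) ** d

def rbf_kernel(X, Y, sigma=1.0):
    """Assumed form: K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    sq_dist = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    sq_dist = np.maximum(sq_dist, 0.0)      # clip tiny negatives from round-off
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

X = np.array([[1.0, 0.0],
              [0.0, 2.0]])
K_poly = polynomial_kernel(X, X)
K_rbf = rbf_kernel(X, X)
```

In the dual problem, using a kernel amounts to replacing each inner product between two training samples by the corresponding entry of such a Gram matrix.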

Non-linear classification methods
Many researchers have adopted the Artificial Neural Network (ANN) because of its benefits in handling Big Data. It imitates the human brain and solves several practical problems in different fields [36]. An ANN performs its tasks as follows. As shown in Figure 2, each neuron receives the signals from the previous layer through weighted connections and sums them, compares the weighted sum to a threshold value and generates an output through an activation function such as the sigmoid or radial basis function [37]. There are two basic architectures: single-layer and multilayer. In the first, all essential layers are interconnected. The second, the multilayer structure, is divided into an input layer, hidden layers, and an output layer. The number of hidden layers affects the precision of classification and prediction [38].
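The neuron computation described above (weighted sum, threshold, activation) can be sketched as a forward pass through a small multilayer network. The layer sizes, the random weights and the input vector are hypothetical, and no training is shown; the bias terms play the role of (negative) thresholds.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W_hidden, b_hidden, W_out, b_out):
    """Forward pass: input layer -> one hidden layer -> output layer."""
    hidden = sigmoid(W_hidden @ x + b_hidden)   # weighted sum plus bias, then activation
    return sigmoid(W_out @ hidden + b_out)

rng = np.random.default_rng(0)
x = np.array([0.2, 0.5, 0.1, 0.9])              # hypothetical 4-sensor feature vector
W_hidden, b_hidden = rng.normal(size=(5, 4)), np.zeros(5)   # 5 hidden neurons
W_out, b_out = rng.normal(size=(3, 5)), np.zeros(3)         # 3 output classes
output = mlp_forward(x, W_hidden, b_hidden, W_out, b_out)
```

In practice the weights would be learned from labelled data, typically by backpropagation, and the largest output would be taken as the predicted class.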
For quantitative discrimination, the MLP has been used in many studies. V. Krivetskiy et al. [39] demonstrated the identification of individual gases by a single sensor combined with pattern recognition methods. Supervised learning algorithms were used to solve the classification problem: random forest, support vector machines (SVM), and multilayer perceptron (MLP) with one or two hidden layers. Only the MLP was used to solve the quantification problem. A disadvantage of ANNs is that they require a long training period. To solve this problem, an algorithm named the Extreme Learning Machine (ELM) has been proposed [40]. This is a fast learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). Instead of being tuned iteratively, the hidden node parameters are generated randomly and the output weights are then determined directly. ELM's run time is therefore short, and it has been shown to outperform other classifiers [41]. The classification efficiency is clearly influenced by the parameters of the algorithm. Moreover, ELM's randomly generated input weights and hidden layer biases can make the algorithm unstable. To solve this problem, Chao Peng et al. [42] proposed an ELM-based Kernel Extreme Learning Machine (KELM), which combines ELM with kernel functions, for VOC detection. It was compared against SVM, KNN, and LDA. The results show that KELM achieves the highest precision, about 95%.
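The idea behind ELM can be sketched in a few lines: the hidden layer is drawn at random and left untouched, and only the output weights are computed, in a single least-squares step. The sketch below is a generic illustration under assumed choices (tanh activation, 20 hidden nodes, a Moore-Penrose pseudo-inverse, a toy XOR-like data set); it is not the algorithm configuration of [40] or [42].

```python
import numpy as np

def elm_train(X, T, n_hidden=20, seed=0):
    """Train a single-hidden-layer network the ELM way."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, never tuned
    b = rng.normal(size=n_hidden)                 # random hidden biases, never tuned
    H = np.tanh(X @ W + b)                        # hidden layer outputs
    beta = np.linalg.pinv(H) @ T                  # output weights by least squares
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Toy problem that is not linearly separable, with one-hot targets
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
model = elm_train(X, T)
pred = elm_predict(model, X).argmax(axis=1)
```

Changing `seed` changes the random hidden layer and can change the result on harder data, which illustrates the instability mentioned above.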
Supervised methods such as Linear Discriminant Analysis (LDA), Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) are dedicated to quantitative classification. It should be noted that artificial neural networks (ANN) are the most complex methods in terms of execution time and ease of implementation. ANNs have been especially effective in predicting and classifying complex gas mixtures. In [43], researchers used an ANN to model fuel-gas methanol output. A comparison of the main algorithms used for gas detection is given in Table 1.
Very poor (--); poor (-); good (+); very good (++)

Related works
The main component of an electronic nose is a gas sensor array. A MOX gas sensor is composed of several layers. First, a dielectric membrane is deposited on a silicon substrate. This membrane consists of a silicon oxide layer [44] or a silicon oxide and silicon nitride bilayer [45,46]. Its function is to provide thermal insulation between the substrate and the sensitive layer [47]. Next, the micro hotplate brings the sensor to the high temperatures required for detecting gaseous substances [48]. A silicon oxide passivation layer is deposited to separate the micro hotplate from the sensitive layer. The function of the sensitive layer is to react with the gaseous species. Tin oxide (SnO2), tungsten oxide (WO3) and zinc oxide (ZnO) are the most common and widely used metal oxides [49,50].
Tungsten oxide (WO3) is very suitable for gas monitoring applications [51,52]. This material is especially sensitive to ozone [53]. A large number of researchers have developed and tested WO3 gas sensors to detect various gases, such as NO2, for which the optimum working temperature was shown to be approximately 225 °C [54], and H2, for which the best annealing temperature was about 500 °C [55].
In [56], Joy Dutta et al. proposed an Air Quality Monitoring System (AQMD) for tracking urban air pollution. They used two air quality sensors, the MQ135 and the MQ7, connected to an Arduino board that communicated through the HC-05 Bluetooth module. Data were then transferred to smartphones. The information collected from various smartphone users was then used to build the city's air quality index map for outdoor and indoor environments.
The authors of [57] studied the relationship between human activity and the environment over a 20-day period. Data from environmental monitoring stations and mobile networks were compiled to assess air quality and weather. The authors demonstrated that stronger environmental analysis was achieved by incorporating information from various sources into sensing.
Persaud et al. [58] used a system consisting of polymer gas sensors and quartz microbalance sensors to monitor air quality in real time. This smart gas sensing system has been shown over the years to detect complex mixtures of volatile substances.
The number of electronic nose network applications in smart cities keeps growing. In 1995, Hodgins [59] carried out the first electronic nose research with the purpose of separating ethanol, dimethyl sulfide and diacetyl from water mixtures using CP sensors.
The authors of [24] studied the response of an electronic nose based on SnO2 (tin dioxide) sensors to identify the presence of various gases such as CO, CH4, ISBU and EtOH at concentrations ranging from 100 ppm to 1000 ppm. Results showed that the system can detect and recognize gases with an error of less than 15%.
The ability of the electronic nose to identify and distinguish concentrations as low as 20 ppm for NO2 and 5 ppm for CO was demonstrated in [25]. The main objective was indoor air quality (IAQ) control. The authors proposed an electronic nose based on an integrated gas sensor array and highly efficient pattern recognition techniques for the quantification of carbon monoxide and nitrogen dioxide in mixtures with relative humidity and volatile organic compounds. The unit was composed of a metal oxide resistive sensor array, chosen for its low power consumption at high temperature and its high sensitivity. The authors used a fuzzy logic system to extract the relevant information from the response of the sensor array. The system was able to distinguish and classify the presence of each pollutant in the test area, as well as their mixture.
Feature selection is the key step in the classification scheme; it selects a subset of the original features. In this context, principal component analysis has been widely applied in the area of pollution monitoring.
For example, the overall objective of [60] was to apply principal component analysis (PCA) to air pollution data in order to provide a comprehensive definition of components that can be interpreted in terms of various air pollution sources, and to investigate whether cause-specific daily mortality can be attributed to specific air pollution sources.
In [61], Wu and Kuo used air quality data collected from eight automated air quality monitoring stations in central Taiwan and examined the relationships among air quality variables through statistical analysis, in an attempt to accurately represent the differences in air quality observed by each monitoring station and to create a suitable air quality classification system for the whole of Taiwan. The authors of [62] suggested a framework for identifying and apportioning sources of air pollution at an urban site in France; the source profiles were identified using principal component analysis. In [63], PCA was used to compare the air pollution profiles of cities. PCA was also applied in [64].
However, this method cannot discriminate between gas classes. We remedy this problem by using different AI algorithms.
For the pattern recognition stage, Hui et al. [65] used LDA to detect and identify four different industrial gases, achieving a high classification rate and satisfactory predictive ability.
The researchers in [66] used Fisher LDA to classify mutton adulterated with duck. This approach was used for both qualitative and quantitative analysis. Mahdi Ghasemi et al. [67] suggested LDA as one of the discriminant methods for distinguishing and classifying various varieties of both cultivated and wild black caraway and cumin. The results show that LDA achieves the highest precision, close to 100%.
In [68], wavelet energy was extracted from the responses of an e-nose and an e-tongue for the classification of different grades of Indian black tea. Results confirm that the classification rate using KNN improved, reaching 99.75%.
The researchers in [69] built a rapid prototype of a KNN-based gas identification system on the Zynq platform. The best results were obtained for k = 1 and k = 2, with classification accuracies of 97.91% and 98.95% respectively. A good k can be chosen via various heuristic techniques, such as cross-validation [70]: the value of k should be the one that minimizes the classification error.
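One way to choose k by cross-validation is sketched below: each candidate k is scored by its leave-one-out classification error, and the candidate with the smallest error is kept. Leave-one-out is only one of several cross-validation variants and is chosen here for brevity; the helper names, the candidate list and the data are hypothetical and unrelated to the experiments of [69] or [70].

```python
import numpy as np

def knn_label(X_train, y_train, x, k):
    """Majority label among the k nearest training samples (Euclidean distance)."""
    dist = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(dist)[:k]]
    return np.bincount(nearest).argmax()

def loo_error(X, y, k):
    """Leave-one-out error: classify each sample using all the others."""
    errors = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        errors += int(knn_label(X[mask], y[mask], X[i], k) != y[i])
    return errors / len(X)

def best_k(X, y, candidates):
    """Return the candidate k with the smallest leave-one-out error."""
    errs = {k: loo_error(X, y, k) for k in candidates}
    return min(errs, key=errs.get), errs

# Two well-separated hypothetical gas classes, labelled 0 and 1
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5],
              [5.0, 5.0], [5.0, 6.0], [6.0, 5.0], [6.0, 6.0], [5.5, 5.5]])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
k_star, errs = best_k(X, y, candidates=[1, 3, 5])
```

When several candidates reach the same error, as happens on easily separable data, this sketch keeps the first one in the list; other tie-breaking rules are equally defensible.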
KNN's biggest drawback is that it is considered a lazy learner: it has no real training stage but requires an extensive test stage. To address this problem, Mehmet Aci et al. [71] proposed a hybrid classification system combining k-nearest neighbor, Bayesian methods and a genetic algorithm, which removes the data that make learning difficult.
The researchers in [72] implemented hybrid KNN (HBKNN) classification approaches, which perform classification in high-dimensional space on datasets with noisy attributes.
According to some researchers, SVMs outperform many of the common methods of gas classification [73,74]. Kyu-Won Jang et al. [75] implemented a pair plot scheme to classify the gas type more effectively with SVM in order to detect CH4 and CO. Apart from the classification task, a further difficulty for machine learning methods is to estimate the gas concentration, including that of an unknown gas. In this context, the researchers in [76] applied Least-Squares Support Vector Machine (LS-SVM) nonlinear regression to determine the concentration of each constituent gas in a mixture.
SVM has received considerable attention for real-time gas classification [77,78]. R. Faleh et al. [52] showed that a support vector machine successfully discriminated between three kinds of oxidizing and reducing gases.
Helli et al. [26] used an electronic nose consisting of six Taguchi metal oxide gas sensors (TGS-800, -813, -822, -825, -832, -2105) to track two types of gas: a reducing gas, H2S, and an oxidizing gas, NO2. By applying Discriminant Factorial Analysis (DFA), they studied the capability of the sensor array to detect the presence of these two gases, alone or in a mixture. This approach was applied to a complex reference atmosphere containing CO2 and several humidity levels. DFA proved to be a strong discriminator between different groups (H2S, NO2 and H2S/NO2 mixtures) and even between different concentrations within a single group.
The main research goal in [27] was to build an electronic nose containing five gas sensors (TGS 822, TGS 2442, TGS 813, TGS 4160, TGS 2600), a temperature sensor, a humidity sensor, and wind speed measurements to monitor carbon monoxide (CO) and carbon dioxide (CO2) in three different areas of Alexandria, Egypt: the Corniche and two other streets. The sensor array was attached to a 16F628A microcontroller. Data clustering was then implemented to investigate the relation between measurements from different sensors. Analysis of variance (ANOVA) was used to identify the critical variables to be included in the regression equation. Results showed that, according to the electronic nose, Corniche Street was the least polluted place while Abukir Street was the most polluted.
Loutfi et al. [79] combined gas distribution mapping with odor identification. The system was fitted with two electronic noses mounted on a handheld robot. The classification performance for ethanol and acetone in indoor experiments was very high, ranging from 88% to 100%.
In [80], the authors analyzed the performance of three electronic nose systems for outdoor air quality: one based solely on semiconducting gas sensors (TGS880, TGS2600, and TGS1331), one based solely on amperometric gas sensors (H2S-B4, CO-B4, 7SH CiTiceL), and one combining both types of sensors. They then compared the e-noses' ability to predict the concentrations of four environmental gases, namely nitrogen dioxide, sulphur dioxide, ozone and carbon monoxide, against reference data from a stationary station of the air monitoring system. They concluded that the best prediction results are obtained by integrating both types of sensors.

Findings and discussion
In the above subsections, we have provided a detailed overview of individual classifiers such as SVM, KNN, ANN and ELM. Each of these algorithms has its own problems. However, by combining certain methods, we can obtain their full benefits.
Data fusion, or the combination of data, is applied in a wide variety of fields. To take an example from the human body: the human brain permanently merges information about an object in order to determine its properties.
In general, data fusion methods can be classified into three levels: low abstraction level, intermediate level and high level.
To assess agricultural and industrial olive oil samples, Natale et al. [81] introduced two electronic nose systems based on different sensor technologies (quartz resonators and metal oxide chemoreceptors). The researchers reported that data fusion at two abstraction levels, high (fusion of the principal components) and low (fusion of sensor data), revealed cooperation between the two sensor arrays in the detection results for commercial samples.
Pardo et al. [82] used MOX sensors to measure more than one sensor property simultaneously (i.e., conductivity and surface potential), together with other types of gas sensors. This involved the use of various types of sensors (a so-called hybrid array) to reduce the linearity of the response and therefore achieve better discrimination.
A hybrid electronic tongue (6 voltammetric and 15 potentiometric sensors) was used by Gutierrez et al. [83] to differentiate beer types using data fusion and pattern recognition methods (PCA and LDA). They reported that hybrid E-tongue systems, which fuse data from different sensor types, are more accurate than a simple E-tongue.
An ensemble classifier is considered a form of high-level fusion: an overall decision is made by merging the decisions of the individual classifiers. This combination of classifiers can improve classification. Fusion requires adapting a learning ensemble to promote better classification accuracy. In other words, it is desirable to design and build a classification system that combines various classifiers to boost the overall classification accuracy.
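Decision-level fusion of this kind can be sketched with two simple combination rules: plain majority voting and weighted voting. The classifier outputs, the gas labels and the weights below are hypothetical; in practice the weights might reflect each classifier's validation accuracy, which is an assumption rather than a method taken from the cited works.

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: one list of labels per classifier, all of equal length."""
    fused = []
    for labels in zip(*predictions):              # labels given to one sample
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused

def weighted_vote(predictions, weights):
    """Like majority_vote, but each classifier's vote counts with its weight."""
    fused = []
    for labels in zip(*predictions):
        score = {}
        for label, weight in zip(labels, weights):
            score[label] = score.get(label, 0.0) + weight
        fused.append(max(score, key=score.get))
    return fused

# Hypothetical decisions of three classifiers on four samples
svm_out = ["CO", "NO2", "CO", "CH4"]
knn_out = ["CO", "CO", "CO", "CH4"]
lda_out = ["NO2", "NO2", "CO", "CO"]

fused = majority_vote([svm_out, knn_out, lda_out])
weighted = weighted_vote([svm_out, knn_out, lda_out], weights=[0.2, 0.6, 0.2])
```

On the second sample the two rules disagree: the unweighted vote follows the two classifiers that say NO2, while the weighted vote follows the single classifier that carries most of the weight.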
Lijun Dang et al. [84] proposed a new fusion approach based on an efficient weighting of base classifiers, called an improved support vector machine ensemble (ISVMEN). Results show that the average classification accuracy increased from less than 86% for the base classifiers to 92.58%.
A novel hybrid approach to solving the problem of cross-selectivity of gases in the electronic nose (E-nose), which combines support vector machine (SVM) and k-nearest neighbor (KNN) classifiers, was proposed by R. Faleh et al. [85]. The new coordinates calculated by PCA were used as inputs for classification by the SVM method. KNN classification was then carried out using only the support vectors (SVs), not all the data. The proposed fusion approach was shown to yield the highest classification rate relative to the individual classifiers: KNN, SVM-linear, SVM-RBF, and SVM-polynomial.
The combination of SVM, PNN and LDA showed a high accuracy rate compared to the classification results obtained with individual classifiers [86]. Experimental findings show that the optimized AdaBoost.M2 model, which solves a multi-class identification problem for Chinese herbal medicine and is combined with SVM, PNN and LDA, achieves the best accuracy of 91.75%, compared to 87.62% for the best single classifier, SVM.
The above discussion shows that using multiple classifiers together yields the greatest benefit in various applications.

Conclusion
Because exposure to air pollution affects both health and the environment, air quality parameters must be monitored and controlled. This review has therefore summarized various sensing technologies for air quality monitoring. The paper was devoted primarily to electronic nose technology. The metal oxide gas sensor was presented, together with the different layers that make it up. Signal processing for feature extraction based on PCA was then introduced. In addition, gas pattern recognition techniques based on machine learning were detailed and compared. Finally, the notion of the ensemble classifier was discussed as a solution for improving classification accuracy.