Optimized Component based Selection using LSTM Model by Integrating Hybrid MVO-PSO Soft Computing Technique

ARTICLE INFO
Article history: Received: 09 April, 2021; Accepted: 17 June, 2021; Online: 10 July, 2021

ABSTRACT
This research focuses on training and testing a dataset after optimizing software components with a deep neural network. Optimized components are selected for training and testing to improve accuracy at the time of software selection. Selected components must be adaptable to the requirements. The soft computing techniques PSO and MVO are used for optimization. The deep neural network is trained and tested to obtain a confusion matrix of true positives/negatives and false positives/negatives, from which accuracy, precision, recall, and F-score are computed to validate the proposed work. The proposed mechanism uses an LSTM layer for more accurate output. The research also examines the shortcomings of existing research and the extent to which earlier mechanisms have been combined with soft computing in CBSE.


Introduction
This research considers the dataset of the CBSE model [1]. A network is trained on the dataset describing software component selection, and testing the trained network produces a confusion matrix from which accuracy, F-score, recall, and precision are computed. Separately, the data for each grade is passed to a hybrid MVO-PSO optimizer, which finds an optimized rating for that grade; this rating is then used to filter the dataset. Accuracy, F-score, recall, and precision are computed again for the filtered data. Finally, the accuracy, F-score, recall, and precision of the non-optimized dataset are compared with those of the optimized dataset.

CBSE
Software engineering has studied methods for building software projects of many kinds, and computer engineering applications have often been used to this end. Predicting the dependability of a software system, however, remains difficult. CBSE can be regarded as a software reliability mechanism that addresses this issue. Component-based software engineering is a broad approach that supports building software from components, drawing on current software research. Building new software is not a simple task for beginners, but CBSE [2] lets developers reduce the effort involved. Several factors are important in CBSE, such as reusability, reliability, component dependence, and interactions between components. These characteristics ease the development of new software and reduce system complexity. Many soft computing methods have been used to try to predict software dependability. Several observations have been made; in particular, component selection has been identified as a step in software development. Component-dependent design patterns are used for retrieving and assembling components.

Reinforcement Learning
Reinforcement Learning (RL) is a category of Machine Learning, which is itself a branch of AI. It allows hardware and software agents to automatically determine the ideal behavior within a particular context in order to maximize their performance. Reinforcement learning is used in various graphical games, where a multi-agent mechanism controls how the environment is explored. It is also combined with abstraction methods at various levels to build powerful AI-based games. The main elements of an RL system are as follows.
Agent: The agent is the performer in the learning system. It carries out all actions in the environment and receives a reward for each action.
State: The state represents the current status of the environment, which plays a significant role in assigning rewards to the agent.
Environment: All of the agent's actions are performed in the environment, and rewards are provided to the agent according to the state of the environment.
Action: An action is the means by which the agent interacts with its environment and moves between states. Whenever the agent executes an action, it receives an output in the form of a reward that depends on the environment.
Reward: A reward is part of the feedback from the environment. When an agent interacts with the environment, it can observe the changes in state and the reward signal that result from its actions.
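The interaction between agent, environment, state, action, and reward can be sketched as a simple loop. The example below is a minimal illustration only: the `CorridorEnv` environment, its reward values, and the fixed `always_right` policy are hypothetical and do not come from the paper, and no learning is performed.

```python
class CorridorEnv:
    """Hypothetical environment: positions 0..goal on a line; reaching `goal` ends the episode."""

    def __init__(self, goal=4):
        self.goal = goal
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        """Apply an action (-1 = left, +1 = right) and return (state, reward, done)."""
        self.state = min(max(self.state + action, 0), self.goal)
        done = self.state == self.goal
        reward = 1.0 if done else -0.1   # small penalty per step, bonus at the goal
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Let the agent act until the episode ends; return the final state and total reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                   # the agent chooses an action from the state
        state, reward, done = env.step(action)   # the environment answers with state and reward
        total_reward += reward
        if done:
            break
    return state, total_reward


def always_right(state):
    """Hypothetical fixed policy: always move right."""
    return +1


final_state, total = run_episode(CorridorEnv(goal=4), always_right)
```

A learning agent would replace `always_right` with a policy that is updated from the observed rewards.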

Recurrent Neural Network
A recurrent neural network (RNN) is a class of neural networks in which the connections between nodes form a directed graph along a temporal sequence, which allows it to exhibit temporal dynamic behavior. Derived from feed-forward neural networks, RNNs can use their internal memory to process input sequences of varying length. This makes them applicable to tasks such as unsegmented, connected handwriting recognition. The term RNN is used for two broad classes of networks with a similar general structure: one has a finite impulse response and the other an infinite impulse response. Both classes exhibit temporal dynamic behavior.

Long Short-Term Memory
LSTM is a well-known artificial RNN architecture that is widely used in deep learning. Unlike a standard feed-forward neural network, an LSTM has feedback connections. It can process not only single data points, such as images, but also entire sequences of data, such as audio or video. LSTM networks are well suited to classifying, processing, and making predictions based on time-series data, because there may be lags of unknown duration between important events in a time series.

RNN and LSTM
A recurrent neural network (RNN) is a class of ANN in which the connections between nodes form a directed graph along a temporal sequence. Standard recurrent neural networks suffer from vanishing and exploding gradients. LSTM networks are a type of RNN that uses special units in addition to conventional units.
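To make the "special units" concrete, the following is a minimal sketch of one LSTM time step using the standard gate equations. It is written in plain Python for illustration; the weight layout, the sizes `n_in` and `n_hid`, and the helper names are assumptions and are not taken from the paper's Matlab implementation.

```python
import math
import random

GATES = ("input", "forget", "output", "candidate")


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, U, h, b):
    """Compute W x + U h + b with plain lists (W and U are lists of rows)."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + b_k
        for w_row, u_row, b_k in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps each gate name to (W, U, b)."""
    pre = {gate: affine(W, x, U, h_prev, b) for gate, (W, U, b) in params.items()}
    i = [sigmoid(v) for v in pre["input"]]        # how much new information to write
    f = [sigmoid(v) for v in pre["forget"]]       # how much old memory to keep
    o = [sigmoid(v) for v in pre["output"]]       # how much memory to expose
    g = [math.tanh(v) for v in pre["candidate"]]  # proposed new memory content
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def random_params(n_in, n_hid, rng):
    """Small random weights and zero biases for every gate (illustrative only)."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {gate: (matrix(n_hid, n_in), matrix(n_hid, n_hid), [0.0] * n_hid)
            for gate in GATES}


rng = random.Random(0)
n_in, n_hid = 3, 4   # hypothetical input and hidden sizes
params = random_params(n_in, n_hid, rng)
sequence = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(5)]

h, c = [0.0] * n_hid, [0.0] * n_hid
for x in sequence:
    h, c = lstm_step(x, h, c, params)  # the cell state c carries memory across steps
```

Because the cell state is updated additively (`f * c_prev + i * g`) rather than being squashed at every step, gradients can pass through many time steps more easily than in a standard RNN.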

Resolving Overfitting Problem by Dropout Layer
The dropout layer plays a significant role in resolving over-fitting, which arises during the training of a neural network model. If training continues for too long, the model adopts idiosyncrasies of the training data and becomes less suitable for data that is new to it, such as different samples from the same population. The model is considered over-fit when it is too well adapted to the training data and, as a result, performs worse on validation data.
Over-fitting can be detected by plotting and checking the validation loss: the model is over-fitting when the validation loss rises while the training loss is constant or decreasing. Techniques known as regularizers are used to reduce over-fitting, and dropout is one of them.
Dropout works by removing, or "dropping out", inputs to a layer. These may be input variables in the data sample or activations output by a previous layer. In other words, the Dropout layer is placed between existing layers of the model and applies to the outputs of the preceding layer that are fed to the next layer. This has the effect of simulating a large number of networks with different structures. The dropout rate is specified to the layer as the probability of setting each input to the layer to zero.
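The dropout operation described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name, the rate of 0.5, and the rescaling of surviving inputs by 1/(1 - rate) (so-called inverted dropout, a common convention) are assumptions.

```python
import random


def dropout(inputs, rate, rng, training=True):
    """Inverted dropout: zero each input with probability `rate`, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(inputs)          # dropout is disabled at test time
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in inputs]


rng = random.Random(0)
activations = [1.0] * 1000           # hypothetical outputs of the previous layer
dropped = dropout(activations, rate=0.5, rng=rng)
kept = sum(1 for v in dropped if v != 0.0)
print(f"kept {kept} of {len(dropped)} inputs")
```

Rescaling the surviving inputs keeps the expected sum of the layer's inputs unchanged, so the layer can simply be switched off at test time.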

Literature Review
Research on component-based software systems, optimization mechanisms, and neural networks is extensive. Researchers have used the SVM as a classification method for predicting software defects from code metrics. Performance modeling of interaction protocols for CBSD using OOPS-based simulation has also been studied. Later work addressed the optimization of software component selection for CBSE development, and particle swarm optimization has been applied to predicting the performance of software components. Models for optimized CBSE have been built for the development of several applications. S. Di Martino proposed a genetic algorithm to configure an SVM for predicting fault-prone components. M. Palviainen studied reliability estimation, prediction, and measurement of component-based software.
In [3], the authors applied PSO to software performance prediction.
In [4], the authors proposed an optimization model for software component selection, which has been used in the development of several applications.
In [5], the authors used an SVM-based classifier for the reusability of software components.

In [6], the authors performed multi-objective optimization of software architectures using Ant Colony Optimization.
In [7], the authors estimated software reusability for component-based systems using several soft computing techniques.
In [8], the authors proposed an adaptive neuro-fuzzy model to predict the reliability of component-based software systems.
In [9], the authors studied test case prioritization for regression testing using ant colony optimization.
In [10], the authors proposed a neuro-fuzzy model to assess and optimize quality and performance in CBSE.
In [11], the authors presented a dynamic mechanism for obtaining software components with the support of a genetic algorithm.
In [12], the authors proposed LSTM-based deep learning models for non-factoid answer selection.
In [13], the authors studied the Multi-Verse Optimizer, a nature-inspired mechanism for global optimization.
In [14], the authors proposed a method to detect inconsistency in software components using ACO and a neural network.
In [15], the authors presented a deep learning mechanism for short-term traffic forecasting based on an LSTM network.
In [16], the authors proposed multiple-target deep learning for LSTM.
In [17], the authors proposed preference-based component identification using PSO.
In [18], the authors proposed a deep learning approach to solar power forecasting that combines an AutoEncoder with LSTM neural networks.
In [19], the authors presented quality assurance for component-based software using a soft computing mechanism.
In [20], the authors performed quality prediction using an ANN based on Teaching-Learning Optimization.
In [21], the authors proposed a multi-objective model for optimization.
In [22], the authors performed component selection considering attributes.
In [23], the authors performed software reliability prediction using a bio-inspired approach.
In [24], the authors proposed an approach to the identification and selection of software components.
In [25], the authors proposed a model for predicting CBS reliability using a soft computing mechanism.
In [26], the authors proposed an enhanced Ant Lion Optimizer combined with an Artificial Neural Network for predicting Chinese influenza.
In [27], Imoize considered software reuse and metrics in software engineering.
In [28], the authors focused on improving the reusability of component-based software, using data mining to exploit the advantages of software components.
In [29], the authors introduced a hybrid neuro-fuzzy model together with a feature-reduction model for classification.

Problem Statement
Many studies have addressed the selection of software components, but an optimization mechanism is required for better performance. Previous studies used PSO to optimize results; however, MVO has been found to offer better performance. In addition, an intelligent model based on deep learning, using an LSTM-based RNN, should be introduced. Such a model takes considerable time for training and testing, after which accuracy and F-score are computed from the confusion matrix. Optimization is therefore added to the neural network model for better performance. By incorporating the hybrid LSTM MVO-PSO method, the proposed study aims to address the accuracy and performance problems faced in previous research.

Proposed Work
In the proposed work, the dataset of the CBSE model is considered. A neural network is trained on this dataset, and testing the trained network produces a confusion matrix. From this confusion matrix the accuracy, F-score, recall, and precision values are calculated. In parallel, the dataset is divided by grade in order to obtain an optimized value for each grade. The data for each grade is passed to a hybrid MVO-PSO optimizer, which returns the optimized rating for that grade. The data of each grade is then filtered on the basis of these optimized values; in other words, only the data above the optimized value for its grade is kept. The accuracy, F-score, recall, and precision values are then calculated for this filtered dataset. Finally, the accuracy, F-score, recall, and precision values of the non-optimized dataset are compared with those of the optimized dataset.
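The grade-wise filtering step can be sketched as follows. The threshold values are the optimized ratings reported later in the Simulation section; the record layout, the field names, and the four sample components are hypothetical and serve only to illustrate the rule "keep the data above the optimized value of its grade".

```python
# Optimized rating per grade, as reported in the Simulation section.
OPTIMIZED_RATING = {"A": 4.6807, "B": 4.2356, "C": 3.6263, "D": 2.4891}


def filter_by_grade(components, thresholds):
    """Keep components whose average rating exceeds the optimized value of their grade."""
    return [c for c in components if c["rating"] > thresholds[c["grade"]]]


# Hypothetical records; the real dataset has 629 packages with more fields.
components = [
    {"name": "pkg-1", "grade": "A", "rating": 4.90},
    {"name": "pkg-2", "grade": "A", "rating": 4.50},
    {"name": "pkg-3", "grade": "C", "rating": 3.70},
    {"name": "pkg-4", "grade": "D", "rating": 2.10},
]

selected = filter_by_grade(components, OPTIMIZED_RATING)
print([c["name"] for c in selected])  # ['pkg-1', 'pkg-3']
```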
As the figure illustrates, the proposed system must train a component-based selection model capable of predicting with maximum accuracy. The trained model is built with LSTM and simulated in a Matlab environment. Existing research on component-based selection has achieved limited accuracy, precision, F-score, and recall. Implementing such a model is challenging, but it opens the door to innovation. Previous research in this field has used fuzzy logic, genetic algorithms, machine learning mechanisms, KNN classification, and LSTM models, but still suffers from accuracy issues.
Long Short-Term Memory networks are a category of recurrent neural networks that can learn order dependence in sequence prediction problems. This behavior is needed in complex problem domains such as machine translation. LSTM is a complex area of deep learning that is difficult to understand, and little work has been done in this area. LSTM units contain a 'memory cell' that can retain information for a long time. Users are moving from RNN to LSTM because it introduces more control knobs (gates), which manage the flow and mixing of inputs according to the trained weights. This gives flexibility in controlling the outputs, and so LSTM offers both better control and good results.

Performance Parameters
This section defines the performance parameters. The confusion matrix is built from true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
• TP: True positives are correctly predicted positive values: the actual category is true and the predicted category is also true.
• TN: True negatives are correctly predicted negative values: the actual category is false and the predicted category is also false.
• FP: False positives occur when the actual category is false but the predicted category is true.
• FN: False negatives occur when the actual category is true but the predicted category is false.
The parameters used to validate the results are accuracy, precision, recall, and F-score, explained as follows:
1. Accuracy is the most intuitive performance measure. It is the ratio of correctly predicted observations to total observations.
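In terms of TP, TN, FP, and FN, the four measures can be written as below. Only accuracy is spelled out in the text; the expressions for precision, recall, and F-score are the standard definitions and are assumed here to be the ones used.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
\text{F-score}   &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```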

Simulation
In this section, a dataset of 629 packages is used for training. For each package the average rating is available, along with the following factors: number of reviews, total sentences, feature requests, feature requests in percentage, problem discoveries, problem discoveries in percentage, GUI Contents, Feature and Functionality, Improvement, Pricing, Resources, and Security. A network model has been trained on this dataset. Testing the trained network gives the following confusion matrix:

         A     B     C     D
A      110     1     1     0
B        1   225     2     0
C        2     2   237     1
D        1     1     1    44
Total  114   229   241    45

From this confusion matrix, the overall accuracy is obtained.
Results: TP: 616, Overall Accuracy: 97.93%
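The reported accuracy can be reproduced directly from the confusion matrix above, as the following sketch shows. The overall accuracy does not depend on the orientation of the matrix; the per-class precision, recall, and F-score do, and the code assumes that rows are actual grades and columns are predicted grades, which the text does not state.

```python
LABELS = ["A", "B", "C", "D"]
CONFUSION = [
    [110, 1, 1, 0],
    [1, 225, 2, 0],
    [2, 2, 237, 1],
    [1, 1, 1, 44],
]


def overall_accuracy(matrix):
    """Return (correct, total, accuracy), where correct is the sum of the diagonal."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct, total, correct / total


def per_class_metrics(matrix, labels):
    """Precision, recall, F-score per class (assumes rows = actual, columns = predicted)."""
    metrics = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                 # rest of the row
        fp = sum(row[k] for row in matrix) - tp  # rest of the column
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f_score = 2 * precision * recall / (precision + recall)
        metrics[label] = (precision, recall, f_score)
    return metrics


correct, total, accuracy = overall_accuracy(CONFUSION)
print(f"TP: {correct}, total: {total}, accuracy: {accuracy:.2%}")
# TP: 616, total: 629, accuracy: 97.93%

for label, (p, r, f) in per_class_metrics(CONFUSION, LABELS).items():
    print(f"{label}: precision={p:.4f} recall={r:.4f} F-score={f:.4f}")
```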

Optimization of Grade A
Hybrid MVO-PSO is applied to obtain the optimized dataset for training.
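The paper does not specify how MVO and PSO are combined or which objective function is minimized, so the following is only a simplified sketch under stated assumptions. It searches a single rating value on the assumed range [1, 5]: with probability WEP a universe makes an MVO wormhole move around the best solution, otherwise it makes a PSO velocity move. The white/black-hole exchange of the full MVO algorithm is omitted, and `toy_objective`, the parameter values, and all names are hypothetical.

```python
import random


def hybrid_mvo_pso(objective, lower, upper, n_universes=20, n_iter=500, seed=0):
    """Minimize a 1-D objective on [lower, upper]; returns (best_x, best_fitness, history)."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5          # PSO inertia and acceleration weights (assumed)
    wep_min, wep_max, p = 0.2, 1.0, 6  # MVO wormhole settings (commonly used defaults)

    pos = [rng.uniform(lower, upper) for _ in range(n_universes)]
    vel = [0.0] * n_universes
    pbest = list(pos)
    pbest_fit = [objective(x) for x in pos]
    k_best = min(range(n_universes), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[k_best], pbest_fit[k_best]
    history = [gbest_fit]

    for t in range(1, n_iter + 1):
        wep = wep_min + t * (wep_max - wep_min) / n_iter  # wormhole existence probability
        tdr = 1.0 - (t / n_iter) ** (1.0 / p)             # travelling distance rate
        for k in range(n_universes):
            if rng.random() < wep:
                # MVO wormhole: move to a point around the best universe found so far.
                step = tdr * ((upper - lower) * rng.random() + lower)
                candidate = gbest + step if rng.random() < 0.5 else gbest - step
            else:
                # PSO move: inertia plus pulls toward personal and global bests.
                vel[k] = (w * vel[k]
                          + c1 * rng.random() * (pbest[k] - pos[k])
                          + c2 * rng.random() * (gbest - pos[k]))
                candidate = pos[k] + vel[k]
            pos[k] = min(max(candidate, lower), upper)    # keep inside the bounds
            fit = objective(pos[k])
            if fit < pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k], fit
            if fit < gbest_fit:
                gbest, gbest_fit = pos[k], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history


def toy_objective(x):
    """Hypothetical fitness; the paper's objective function is not given."""
    return (x - 3.0) ** 2 + 0.3


best_x, best_fit, history = hybrid_mvo_pso(toy_objective, lower=1.0, upper=5.0, n_iter=100)
for t in (50, 100):
    print(f"At iteration {t} the best fitness is {history[t]:.5f}")
```

The best fitness is tracked across iterations and never increases, which mirrors the flat or slowly decreasing fitness values in the optimizer output reported for each grade.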

Hybrid MVO-PSO optimization of Grade A gives the following output:
At iteration 50 the best universes fitness is 0.30119
At iteration 100 the best universes fitness is 0.30119
At iteration 150 the best universes fitness is 0.30119
At iteration 200 the best universes fitness is 0.30119
At iteration 250 the best universes fitness is 0.30119
At iteration 300 the best universes fitness is 0.30119
At iteration 350 the best universes fitness is 0.30119
At iteration 400 the best universes fitness is 0.30119
At iteration 450 the best universes fitness is 0.30118
At iteration 500 the best universes fitness is 0.30118
The best solution for class A obtained by MVO is: 4.6807.
The best optimal value for class A of the objective function found by MVO is: 0.30118
Elapsed time is 6.498883 seconds.

Optimization of Grade B
At iteration 50 the best universes fitness is 0.30119
At iteration 100 the best universes fitness is 0.30119
At iteration 150 the best universes fitness is 0.30119
At iteration 200 the best universes fitness is 0.30119
At iteration 250 the best universes fitness is 0.30119
At iteration 300 the best universes fitness is 0.30119
At iteration 350 the best universes fitness is 0.30119
At iteration 400 the best universes fitness is 0.30119
At iteration 450 the best universes fitness is 0.30119
At iteration 500 the best universes fitness is 0.30118
The best solution for class B obtained by Hybrid MVO-PSO is: 4.2356.
The best optimal value for class B of the objective function found by Hybrid MVO-PSO is: 0.30118.
Elapsed time is 3.650676 seconds.

Optimization of Grade C
At iteration 50 the best universes fitness is 0.3017
At iteration 100 the best universes fitness is 0.3017
At iteration 150 the best universes fitness is 0.3017
At iteration 200 the best universes fitness is 0.3017
At iteration 250 the best universes fitness is 0.3017
At iteration 300 the best universes fitness is 0.30149
At iteration 350 the best universes fitness is 0.3014
At iteration 400 the best universes fitness is 0.30134
At iteration 450 the best universes fitness is 0.30132
At iteration 500 the best universes fitness is 0.30131
The best solution for class A obtained by MVO is: 3.6263.
The best optimal value for class A of the objective function found by MVO is: 0.30131.
Elapsed time is 3.911880 seconds.
After optimization, 139 components with an average rating above 3.6263 have been selected.

Optimization of Grade D
At iteration 50 the best universes fitness is 0.30156
At iteration 100 the best universes fitness is 0.30156
At iteration 150 the best universes fitness is 0.30156
At iteration 200 the best universes fitness is 0.30156
At iteration 250 the best universes fitness is 0.30156
At iteration 300 the best universes fitness is 0.30156
At iteration 350 the best universes fitness is 0.30153
At iteration 400 the best universes fitness is 0.30151
At iteration 450 the best universes fitness is 0.3015
At iteration 500 the best universes fitness is 0.30149
The best solution for class A obtained by MVO is: 2.4891.
The best optimal value for class A of the objective function found by MVO is: 0.30149.
Elapsed time is 2.961058 seconds.
After optimization, 31 components with an average rating above 2.4891 have been selected.
After grade-wise optimization, the following components are selected according to grade:

         A     B     C     D
A       58     0     0     0
B        0   119     1     0
C        0     1   137     0
D        0     0     1    31
Total   58   120   139    31

From this confusion matrix, the overall accuracy is obtained.
Results: TP: 345, Overall Accuracy: 99.14%

Comparative Analysis
This section compares the accuracy, precision, recall, and F1 score before and after optimization. The comparison of accuracy for the previous and optimized datasets is shown in Table 9. Overall accuracy rises from 97.93% for the non-optimized dataset to 99.14% for the optimized dataset.

Conclusion
The simulation shows that the optimized dataset produces more accurate results than the non-optimized one. The research trained and tested the dataset after optimizing software components, using a deep neural network to improve accuracy at the time of software selection. Training was performed with an LSTM neural network, and testing produced a confusion matrix of true positives/negatives and false positives/negatives from which accuracy, F-score, recall, and precision were obtained. The simulation results show that accuracy, F-score, recall, and precision are better with the optimized mechanism than with the non-optimized one. Table 10 shows that the proposed work provides higher reliability and feasibility than previous research models.

Scope of Research
Such research could play a significant role in software development, AI, big data processing, and many other fields where prediction is the major objective. The mechanism offers an efficient and accurate approach to forecasting and decision-making in different areas. Moreover, further research could use this work as a base to obtain more fruitful results.