On the Application of Combine Soft Set with Near Set in Predicting the Lung Cancer Mortality Risk

ARTICLE INFO
Article history: Received: 30 November, 2020; Accepted: 11 February, 2021; Online: 17 March, 2021

ABSTRACT
Artificial intelligence is advancing quickly and can be rapidly deployed in many areas, such as medical diagnosis. Lung cancer is the deadliest form of cancer for both men and women. The best clinical approach to treating resectable non-small cell lung cancer is surgery. Patients who undergo thoracic surgery for lung cancer do so in the hope that their lives will be prolonged for a reasonable period afterward. In this paper, we propose an expert system for calculating the mortality risk one year after thoracic surgery for lung cancer. Based on clinical and functional data from patients who underwent lung cancer resection, we develop a hybrid model combining near sets with soft sets, namely soft near sets, as a system that not only predicts whether a patient will survive but also determines the degree of risk. Some fundamental concepts of the proposed model are introduced. Basic properties are deduced and supported with proven propositions. Survival is classified correctly with 90.0% accuracy. Our soft near set-based criteria for determining the survival rate provide an effective and reliable diagnostic process. Identifying the risk of lung cancer surgery will help doctors and patients make a more informed decision about the choice of treatment.


Introduction
In recent years, scientists have been interested in modeling ambiguity so that they can describe and extract useful information hidden in uncertain data [1]. In 1999, the author of [2] introduced a new mathematical tool for dealing with uncertainty. It is referred to as a soft set and has rich application potential in many directions. The author of [3] presented an application of soft set theory to a decision-making problem, and classical soft sets were extended to fuzzy soft sets in [4]. Interval-valued fuzzy soft sets were established in [5]. The author of [6] also presented a new concept of parameterization reduction for soft sets. Several researchers have recently investigated the application of hybrid soft set models in different systems, such as [7][8][9][10].
In [11], the author introduced rough set theory. A set is considered rough if the boundary between its lower and upper approximations is nonempty. Near set theory, proposed in [12], differs from traditional rough set theory in the boundary principle of the set approximation. It was studied further by the authors of [13][14][15].
Near set theory provides a description-based approach to observing, comparing, and classifying perceptual granules; it concentrates on the exploration of affinities between granules. The similarity of object descriptions is the fundamental concept in the near set approach to object recognition. Objects that have the same appearance (objects with matching descriptions) are perceptually considered near one another. Many papers have appeared that generalize traditional near set models, such as [16,17].
The key to understanding near sets is the notion of description. Each perceived object is represented by a vector of feature values, and each feature is represented by a probe function that maps an object to a real value. Probe functions in near set theory provide a link between soft sets and near sets, since every parameter in soft set theory is a particular form of probe function. Accordingly, we illustrate some important relations between soft sets and near set models.

ASTESJ ISSN: 2415-6698
Alongside scientific advances and investigations, recent years have seen a rise in the number of real-life applications that deal with unclear, imprecise, contradictory, and confounding information in different fields [18]. Soft computing is a sophisticated form of computation that corresponds to the human intellect's extraordinary connection to intention and perception in an environment of imprecision and ruggedness, and it improves and strengthens the inclusiveness of the real-life decision-making process [19][20][21].
In this paper, we present the concept of soft sets and then redefine some essential concepts of traditional near sets, based on soft sets, as soft near concepts. Basic properties of soft near approximations are deduced and proved. We develop a methodology, based on the proposed soft near set concepts, to predict patient survival after lung cancer resection.

Lung cancer resections survival study
Lung cancer causes roughly 170,000 cancer deaths in the U.S., accounting for nearly 25 percent of all cancer deaths. More people die of lung cancer each year than of colon, breast, and prostate cancer combined. The 5-year lung cancer mortality rate is 55 percent for diagnosed cases in which the cancer is still localized. The rate of diagnosis of lung carcinoma at an early stage is very low (about 16%); moreover, more than 50% of lung cancer patients die within the first year after diagnosis [22]. While early diagnosis and timely therapy benefit cancer patients' survival rates, growing health-related issues have a negative impact [23]. The combination of surgery, chemotherapy, and radiation treatment has improved lung cancer treatment. Pneumonectomy is important in the management of non-small cell lung cancer (NSCLC), but it is associated with high mortality rates [24]. Thus, candidates for pneumonectomy must be selected cautiously in order to avoid surgical risks [25].
Numerous factors influence complications and mortality after pneumonectomy for malignant disease [26]. The author of [27] analyzed the health history of 406 consecutive patients who underwent pneumonectomy to identify postoperative complications and risk factors affecting long-term survival. The author of [28] discussed postoperative mortality in patients with vascular disease who underwent pneumonectomy and found it to be significantly elevated. Complications occur or worsen after surgery in all patients who have vascular disease or insulin-related diabetes. The author of [29] showed that a high controlling nutritional status (CONUT) score, which is based on peripheral blood values and can easily be calculated from blood examination data, is a significant indicator of the one-year mortality rate in patients undergoing lung resection. According to [30], the adverse survival factors after pneumonectomy include older age, prolonged resections, advanced stage, postoperative nonlethal complications, adenocarcinoma, and others. Several articles have examined gender and other demographic factors in lung cancer; age, gender, tumor size, FEV1, histology, and tumor classification significantly influence the survival rate in this type of cancer [31,32].
Surgical therapy has proved to be an effective tool against lung cancer. If surgical resection is not possible, the death rate increases regardless of the other treatments given.

Contribution and Organization of the Paper
Our approach was influenced by previous applications of soft computing techniques to medical diagnosis. The author of [33] used fuzzy sets with soft sets to predict the 5-year survival rate of lung cancer patients undergoing pulmonary resection. The author of [34] conducted a comparative analysis of medical prostate cancer diagnosis using a multi-criteria decision-making approach in a fuzzy environment. The author of [35] presented solutions to improve the soft rough set approach and introduced a medical application concerning the coronavirus "COVID-19" to illustrate these solutions in decision making. The author of [36] reviewed rough set and near set methods for multiple medical imaging challenges.
In this work, we use a new method based on combining soft sets with near sets, not only to classify whether a lung cancer patient is at risk of death within one year, but also to determine the degree of that risk. We also analyze the results and summarize them in a diagram as a statistical representation. Finally, we support this application with an algorithm and some decision rules.

Preliminaries
In this section, basic definitions of information system, soft set, and near set approximations are introduced.

Definition 2.1 [37] IS = (U, E, V, f)
is called an information system, where U, called the universe, is a nonempty finite set of objects; E is a nonempty finite set of attributes; $V = \bigcup \{V_e : e \in E\}$, where $V_e$ is the value set of attribute e; and $f: U \times E \to V$ is called an information (knowledge) function or knowledge representation system. If $V_e = \{0, 1\}$ for every e ∈ E, then IS is called a Boolean-valued information system.

Definition 2.2 [2] Let U be an initial universe set, E a set of parameters, A ⊆ E, and let P(U) denote the power set of U. Then a pair S = (F, A) is called a soft set over U, where F is a mapping given by F: A → P(U). In other words, a soft set over U is a parameterized family of subsets of U. For e ∈ A, F(e) may be viewed as the set of e-approximate elements of S.

Definition 2.3 [11] The equivalence class of an element x ∈ U determined by the equivalence relation E is $[x]_E = \{y \in U : (x, y) \in E\}$.

The following definitions and concepts of near sets were introduced by Peters in [12][13][14].

Definition 2.4
Let F represent a set of features of objects in a set X. For any feature a ∈ F, we associate a function $f_a \in B$ that maps X to some set (the range of $f_a$). The value $f_a(x)$ is a measurement associated with the feature a of an object x ∈ X. The function $f_a$ is called a probe function.

The study of near set theory is concerned with classifying samples using probe functions that are associated with objects. For digital images, for example, probe functions can be defined for color, shape, contour, spatial orientation, and the length of line segments within a bounded region.

Definition 2.5 GAS = (U, F, Nr, VB) is a generalized approximation space, where U is a universe of objects, F is a set of functions representing object features, and Nr is a family of neighborhoods [12][13][14].

Following the previous definitions, some interesting relations between soft sets and near sets can be indicated. Near set theory depends on information (data, knowledge) about every object of interest. For example, if patients who have a certain illness are considered as objects, then the symptoms of this illness are features, and for every symptom there exists a probe function measuring its values. Hence, we get an information system, which can be regarded as a tabular representation of a soft set whose universe is the set of objects (patients) and whose parameters are the symptoms of the disease. Thus, any soft set induces an information system, on which near set approximations can be redefined. For further illustration, Remark 2.1 is given. Also, near set approximations may be redefined as soft near set approximations, based on the concept of a soft set.
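The correspondence between a soft set and a Boolean-valued information system can be illustrated with a short sketch. The patient names, the symptom names, and the helper `soft_set_to_boolean_table` are hypothetical and chosen only for illustration; they are not data from this paper.

```python
def soft_set_to_boolean_table(universe, soft_set):
    """Return {object: {parameter: 0 or 1}} for a soft set F: A -> P(U).

    An entry is 1 exactly when the object belongs to F(parameter).
    """
    return {
        obj: {param: int(obj in members) for param, members in soft_set.items()}
        for obj in universe
    }


# Hypothetical soft set: each symptom (parameter) maps to a subset of patients.
universe = ["p1", "p2", "p3"]
soft_set = {"fever": {"p1", "p3"}, "headache": {"p2"}}

table = soft_set_to_boolean_table(universe, soft_set)
for obj in universe:
    print(obj, table[obj])
```

Each row of the resulting table is the description of one object, so every parameter plays the role of a 0/1-valued probe function.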

Soft near set approximations (SN-set approximations)
In this section, lower SN-approximations and upper SN-approximations are defined. Also, their properties are deduced and proved. For illustration, we consider the following example.
Example 3.1 Let us consider the following soft set S = (F, A), which describes the conditions of patients suspected of having influenza and which a hospital considers in order to make a decision. Suppose that the universe $U = \{p_1, p_2, p_3, p_4, p_5\}$ consists of five patients and $A = \{a_1, a_2, a_3, a_4\}$ is a set of decision parameters. The $a_i$ (i = 1, 2, 3, 4) stand for fever, nasal discharge, headache, and sore throat, respectively. The soft set S = (F, A) over U is given by a collection of approximations {(fever, ...; it can be viewed as a Boolean-valued information system corresponding to S, given by Table 1.

Table 1: Boolean tabular representation of the soft set in Example 3.1.

... is named the SN-positive region of the considered set X, with respect to all parameters taken r parameters at a time. The meaning of $Pos_r X$ is the set of all elements that surely belong to X, having r parameters.

Definition 3.5 Let (F, A) be a soft set over a nonempty set of patients U, let A be a set of parameters measuring some symptoms of a certain disease, let $\xi_r$ be the family of all elementary sets of U, and let every parameter in A have the same importance in this disease. Then we can measure the incidence of this disease in any subset X ⊆ U by the following concept. It is easy to see that this concept aims to discover the incidence of a certain disease in a specific area (surrounded region), so that a suitable decision can be taken.
The value of r is determined by the type of disease (here, r is the number of symptoms that a person must have in order to be considered a patient).
This means that the incidence of this disease in the set X is 37.5%. It follows that this set (specific area) needs some preventive measures against this disease.

Soft near set concepts (SN-set concepts)
In this section, certain definitions and properties of near-set principles are redefined.

Definition 4.2 Let (U, A) be an information system based on a soft set
... cannot be canceled, and then we can define the core of parameters A as follows.

Let w(a) = 1. Then |{Ri ∈ R : a ∈ Ri}| = |R|, and so a ∈ Ri for all reducts Ri. Hence U/A ≠ U/[A − {a}], and it follows that the parameter a ∈ A cannot be canceled. Therefore, a ∈ cor(A).
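The criterion used in this argument, that a parameter a cannot be canceled when U/A ≠ U/[A − {a}], can be checked mechanically on a Boolean table. The following is a minimal sketch under that reading; the table, the parameter names, and the helpers `partition` and `core` are hypothetical and are not taken from the paper.

```python
def partition(table, params):
    """Return the quotient U/params: objects grouped by equal values on params."""
    classes = {}
    for obj, row in table.items():
        key = tuple(row[p] for p in params)
        classes.setdefault(key, set()).add(obj)
    return {frozenset(c) for c in classes.values()}


def core(table, params):
    """Return the parameters a for which U/A differs from U/(A - {a})."""
    full = partition(table, params)
    return [
        a for a in params
        if partition(table, [p for p in params if p != a]) != full
    ]


# Hypothetical Boolean-valued information system with three objects.
table = {
    "p1": {"a": 1, "b": 0, "c": 0},
    "p2": {"a": 0, "b": 0, "c": 0},
    "p3": {"a": 1, "b": 1, "c": 1},
}
core_params = core(table, ["a", "b", "c"])
print(core_params)
```

In this toy table only the parameter `a` separates p1 from p2, so dropping it merges their classes, while dropping `b` or `c` leaves the partition unchanged.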

Remark 4.2
In this paper, soft nearness means nearness only with respect to a positive view of the parameters. For illustration, consider a soft set whose universe contains some patients and whose parameters are the symptoms of a certain disease. Here, two patients are considered soft near each other if there exists at least one symptom (parameter) from which both patients suffer (they are near in their illness).
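The positive view of nearness described in this remark can be sketched as a simple predicate on two rows of a Boolean table. This covers only the yes/no relation stated in the remark, not the soft nearness degree; the function name `soft_near` and the sample rows are hypothetical.

```python
def soft_near(row_x, row_y):
    """Return True if at least one parameter has value 1 for both objects."""
    return any(row_x[p] == 1 and row_y[p] == 1 for p in row_x)


# Hypothetical symptom rows for three patients.
x = {"fever": 1, "cough": 0, "headache": 1}
y = {"fever": 0, "cough": 0, "headache": 1}
z = {"fever": 0, "cough": 0, "headache": 0}

print(soft_near(x, y))  # both patients have a headache
print(soft_near(y, z))  # z has no symptom, so no shared symptom exists
```

Note that two patients who share only absent symptoms (zeros) are not soft near under this reading, which is what distinguishes it from ordinary indiscernibility.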

Dataset
This research is based on data collected from lung cancer patients referred to health care centers; the data are also available in the UCI datasets [38]. The dataset was compiled retrospectively between 2007 and 2011 and was registered in the Polish National Cancer Registry. The dataset consists of 17 variables, as listed and defined in Table 2. It includes 470 samples, with 16 discrete inputs and one discrete output. Whether each feature is qualitative or quantitative is indicated in Table 1. The output has two classes, 0 or 1, indicating death or life, respectively.
As a preprocessing step, we selected the most significant attributes (performance, dyspnoea, cough, tumor size, and diabetes mellitus) for patients who died within a year and for surviving patients in the dataset, following the selection performed by the author of [39], as shown in Table 3. The data include two categorical variables, which are shown in Table 4; for example, Tumor_Size ranges from 1 (smallest) to 4 (largest). The encoding scheme [40] converts categorical attributes into a format that classification and regression algorithms can handle effectively. One-hot encoding produces new (binary) columns indicating the presence of each possible value in the original data; each value is represented by a vector in which all elements are 0 except for one, which has a value of 1, as seen in Table 5.
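As an illustration of the one-hot step, the sketch below encodes one hypothetical patient record. The column prefixes `per` and `TN` follow the parameter names used later in the methodology (per_0 to per_2 and TN_1 to TN_4); the helper `one_hot`, the record keys, and the sample values are assumptions made only for this example.

```python
def one_hot(value, categories, prefix):
    """Return one binary column per category; exactly one column is 1."""
    return {f"{prefix}_{c}": int(value == c) for c in categories}


# Hypothetical record: performance status 1, tumor size 3.
record = {"performance": 1, "tumor_size": 3}

encoded = {
    **one_hot(record["performance"], [0, 1, 2], "per"),
    **one_hot(record["tumor_size"], [1, 2, 3, 4], "TN"),
}
print(encoded)
```

The binary attributes (dyspnoea, cough, diabetes mellitus) need no encoding, so after this step every column of the table is Boolean and the table can be read as a soft set.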

Methodology
The algorithm measures the lung cancer mortality risk for patients within one year after lung cancer resection, based on 5 parameters, using the new soft near set approach.
11. Compute $R(\{p_i\}, P)$ for all $p_i \in U$ (the disease degree of the patient $p_i$).
12. Represent $(p_i, R(\{p_i\}, P))$ for all $p_i \in U$ in a statistical model.
13. If $0.1 < R(\{p_i\}, P) \le 0.5$, then $p_i$ will be a surviving patient.
14. If $0.5 < R(\{p_i\}, P) \le 1.0$, then $p_i$ will be dead within one year.
Step 1: Input the Boolean-valued information system corresponding to the considered soft set S = (F, A) on U, as shown in Table 4.
Step 2: Compute the set cor(A).
From Table 5, we can deduce that Cough and Dia_M cannot be canceled; see Table 7.
Step 4: From Table 4, compute every set of parameters A′ ⊆ A such that X/A′ = X/A. Since the values of all attributes are not equivalent, no attribute can be dropped; as a result, Table 6 remains unchanged.
The set of parameters A′ will be A′ = {per_0, per_1, per_2, Dysp, Cough, TN_1, TN_2, TN_3, TN_4, Dia_M}, and the Boolean-valued information system corresponding to the soft set S′ = (F, A′) is presented in Table 4. The soft set (F, A′) on P is given in tabular form in Table 8.
By using Definitions 4.1 and 4.5, the soft nearness degree between p1 and s1 in the soft set S′ = (F, A′) over the set (U ∪ P) can be calculated. Table 9 presents the soft nearness degrees between every element in the universal set U and every element in the set of standard patients P.
Step 11: Compute $R(\{p_i\}, P)$ for all $p_i \in U$ (the disease degree of the patient $p_i$). From Table 9, we can deduce Table 10, where $p_i \in U$.
By using Definition 4.7 and Table 9, the nearness degree between the singleton set $\{p_i\}$, for all $p_i \in U$, and the set of standard patients P can be calculated and arranged in Table 11.
To analyze these results, we can draw Figure 1, which plots this degree for each patient. Let the degree of survival for a patient $p_i$ be λ. From Figure 1, we can deduce the following decision rules:
If 0.1 < λ ≤ 0.5, then $p_i$ will be a surviving patient.
If 0.5 < λ ≤ 1.0, then $p_i$ will be dead within one year.
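The threshold rules on λ can be written as a small function. This is a sketch of the stated rules only; the function name `classify_risk` and the label strings are hypothetical, and the "undetermined" branch for values outside (0.1, 1.0] is an assumption, since the text does not state a rule for that range.

```python
def classify_risk(degree):
    """Map a degree in (0.1, 1.0] to the outcome given by the decision rules."""
    if 0.1 < degree <= 0.5:
        return "survival"
    if 0.5 < degree <= 1.0:
        return "death within one year"
    # Assumption: the text gives no rule outside (0.1, 1.0].
    return "undetermined"


# The degrees 0.401 and 0.277 are the values reported for p1 and p2 in the discussion.
for patient, degree in [("p1", 0.401), ("p2", 0.277)]:
    print(patient, degree, classify_risk(degree))
```

Keeping the numeric degree alongside the label preserves the ranking information: both patients fall in the survival range, but p1 is closer to the 0.5 threshold than p2.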

Results and discussion
The classification accuracy is the percentage of cases correctly predicted by the method. It was used to measure the algorithm's performance. Table 12 contains the actual diagnosis from the dataset and the predicted diagnosis from the expert system. Table 13 shows the confusion matrix; the accuracy is 90%.
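For readers who wish to reproduce this kind of evaluation, the sketch below shows how a confusion matrix and the accuracy can be computed from actual and predicted labels. The label lists are made up for illustration and are not the contents of Table 12; the coding 0 = death, 1 = life follows the dataset description, and the helper names are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs for binary labels 0 and 1."""
    counts = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
    for a, p in zip(actual, predicted):
        counts[(a, p)] += 1
    return counts


def accuracy(actual, predicted):
    """Fraction of cases in which the predicted label equals the actual one."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Illustrative labels only (0 = death, 1 = life); not the paper's data.
actual = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 0]

matrix = confusion_matrix(actual, predicted)
acc = accuracy(actual, predicted)
print(matrix)
print(acc)
```

The accuracy equals the sum of the diagonal entries of the confusion matrix, (0, 0) and (1, 1), divided by the number of cases.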
In our work, we introduced an expert system to estimate the one-year mortality risk of patients who undergo pulmonary resection for lung cancer. We used a new concept that combines soft sets with near sets; it belongs to the field of soft computing, which is used very effectively to deal with the confusing or uncertain information that is always present in the available data. The model takes five input variables (performance, dyspnoea, cough, tumor size, and diabetes mellitus) into account and produces the 1-year mortality risk as the expert system output. The classification accuracy for one-year survival was 90.0%.
For example, the risk scale assigns patient p1 a risk of 0.401 and patient p2 a risk of 0.277; both are non-death cases, and the model classified them correctly, as shown in Table 12. The scale also tells experts that p1 has a higher risk than p2 and needs more care.
In addition, the proposed work was compared with recent studies that used the same dataset. Table 14 provides the findings; the classification accuracy of the proposed method surpassed that of all recently suggested methods. The primary database contains 470 patients with lung resection; we configured our expert system with 20 patients selected randomly from the existing data, and an accuracy of 90.0 percent was achieved.

Figure 1: The lung cancer mortality risk.

Table 11: Soft nearness degree of every singleton set in U and the set P.

Our results indicate that the novel soft near set approach is an effective and accurate diagnostic tool for determining the 1-year survival rate in lung cancer patients.

Conclusion
In this paper, we have presented the notion of soft near sets, which can be viewed as a hybrid model combining near sets with soft sets. It follows that the proposed model is more effective and useful in decision-making problems. Some fundamental concepts have been defined, and their basic properties have been deduced and proved. Finally, we have presented an application of the suggested model to decision-making for lung cancer patients. In it, we succeeded in obtaining the risk scale for every patient, and then we deduced some decision rules.