Resilience Assessment of System Process Through Fuzzy Logic: Case of COVID-19 Context

ASTESJ ISSN: 2415-6698
Article history: Received: 08 September, 2020; Accepted: 05 October, 2020; Online: 24 October, 2020

The present work is part of a research effort to provide sociotechnical systems with a decision-making tool that supports them in assessing the resilience of their processes. The ultimate objective is to fix the identified imperfections so that these systems steadily gain the strength and effectiveness needed to cope with new and existing threats and challenges. In that respect, this paper presents a framework called the 3R process, whose goal is to calculate the resilience score of a process from several parameters and to grade it on the resilience scale. To achieve this, a fuzzy model has been put in place and graphical user interfaces have been designed so that the 3R process can become an integral part of daily working practices. Finally, a real case company example in the context of COVID-19 is presented to put the proposed model into practice.


Introduction
In an increasingly uncertain and volatile world, sociotechnical systems are striving to become more resilient so that they can react to change as quickly as possible and treat challenges and mistakes as learning opportunities. Sociotechnical systems are complex systems that combine technical components (machines and technology that link the user input to the system's output) with social components (people and society); the interactions between the two are a key factor in the success or failure of system performance [1]. An obvious case in point is Nokia. At one point, this company held 32% of the global market share in smartphones and mobile phones [2]. However, as is well known, Nokia then fell from grace in the mobile phone market: it misread the North American market, did not respond correctly when Apple and Samsung introduced new products, and took too long to catch up. Eventually, it had to sell its mobile phone business to Microsoft and effectively exit a market that it had dominated only a few years earlier. Nevertheless, what makes Nokia such an interesting case of a resilient organization is that, despite this mistake, it has been rebuilding itself. It refocused on its core transmission business, reconsidered how to position its navigational services, and connected with other organizations to produce new software and new kinds of handsets. The organization did not fall apart and was able to remake itself. In fact, its stock price doubled between the sale of the business to Microsoft and recent times. Indeed, what is important for every system is to develop resilience capabilities, such as the ability to sense what is happening in its environment, to compare that with what is done inside the organization, and to make changes as necessary [3].
The question now being asked is how a system can assess its own resilience level so it can take stock of its strengths and weaknesses, threats and opportunities in terms of dealing with disturbances and returning to normal function.
In fact, several attempts have been made to find new approaches and frameworks that help in the resilience assessment. Amongst these, there are methods established to evaluate resilience of critical infrastructures, such as defining a generalized index that serves for quantifying the resilience under various scenarios [4], or designing a quantitative method to assess distinct elements in a critical infrastructure system [5].
Furthermore, different frameworks have been proposed for the assessment of supply chain resilience, for example an assessment model that measures the organizational, business and labor resilience performance of a supply chain [6], or a framework that integrates the concepts of resilience with risk management techniques in order to understand how a supply chain reacts to disruptions as a function of time [7].
The same holds for urban communities, for example a methodology for evaluating earthquake resilience [8], or a workbook that helps urban areas incorporate resilience thinking into their planning practice [9], and so forth.
Moreover, measuring and quantifying resilience has proved challenging and research works in this field are not so numerous. However, some researchers have sought to contribute to fill this gap by developing methods aimed at calculating resilience. For example, we can find quantitative models explaining how to compute resilience of manufacturing [10] or nuclear power [11] plants, physical networks [12], mass railway transportation [13], and the list goes on.
In this same perspective, this paper describes the 3R process, which is dedicated to assigning a resilience level to sociotechnical system processes. This framework is based on the resilience scale [14], a scoring system that aims to assess the level of resilience of a given process and that is composed of five echelons:
• 0. Unconscious: processes belonging to this category have no resilience processes associated with them.
• 1. Informed: these processes are aware of resilience but have no elaborated resilience processes.
• 2. Aspiring: these processes are beginners and try to set up resilience processes.
• 3. Progressing: these processes overcome less serious disruptions within a reasonable time. However, the consequences can be more devastating in serious situations.
• 4. Expert: for this kind of process, the anticipated situations are perfectly manageable. As for unforeseen situations, the processes try to adapt to changes.
The present work goes beyond this qualitative description to find out how this scale can be used by a sociotechnical system to assess the resilience of its own processes by attributing a resilience score to each process.
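As an illustration, the five echelons can be represented as an enumeration, together with a helper that maps a continuous resilience score onto the scale. This is only a minimal sketch: the names `Echelon` and `echelon_from_score` are hypothetical, and the truncation rule is an assumption (the text does not state how a score is rounded; it only reports that a score of about 1.27 is read as "informed").

```python
from enum import IntEnum


class Echelon(IntEnum):
    """The five echelons of the resilience scale, as listed in the text."""
    UNCONSCIOUS = 0
    INFORMED = 1
    ASPIRING = 2
    PROGRESSING = 3
    EXPERT = 4


def echelon_from_score(score: float) -> Echelon:
    """Map a continuous resilience score to an echelon.

    Assumption: the score is truncated to its integer part and capped at
    the highest echelon; the source does not specify the rounding rule.
    """
    if score < 0:
        raise ValueError("resilience score must be non-negative")
    return Echelon(min(int(score), Echelon.EXPERT))


print(echelon_from_score(1.27).name)
```

Under this assumed rule, any score between 1 and 2 is read as "Informed", which agrees with the reading of the 1.27 score reported later in the paper.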
For this purpose, the first section is devoted to the selection of the needed indicators. This step is important because the indicators provide the necessary information for decision-making and constitute the basis of our calculation model. To fulfil this role, however, the indicators should be relevant. This means that they should be specific (so that the results can be clearly identified), measurable and quantifiable, practical and usable, available (it should be possible to collect the data), transparent in methodology and selection, and scientifically well-grounded [15]. After this, the "3R" process is defined in the second section. This method relies mainly on three concepts: (1) resources, which can be considered, among others (time, budget, regulations…), as a constraint that may limit the achievement of the process objective [16]; (2) risks, which are undesirable events that may happen several times and can have a negative impact on the process [17]; and (3) resilience, which is the ability of a system to manage disturbances and to adapt to changes without ceasing its activity [18]. In this section, a model is derived that explains in detail the 3R process and its two steps (definition of parameters and resilience assessment); the calculation of the resilience score using fuzzy logic, a procedure that has already proved its effectiveness in many fields, is shown explicitly; and graphical user interfaces are developed. Then, a real case company example in the context of COVID-19 is presented in the penultimate section. The purpose is to check the reliability of the proposed method by assessing the resilience of a process of a large company in the face of this exceptional crisis. At the end of this paper, conclusions and opportunities for further work are presented.

Selection of indicators
The role of resilience processes is to ensure the continued availability of resources, in case of disruptions, in order to avoid failures. This objective can be achieved through resistance, recovery, and anticipation [19]. Thus, the selection of indicators will be in relation with these three missions.
As per resistance, a variety of metrics can be considered [20]:
• The mean time between failures (MTBF), which indicates the expected time between consecutive failures.
• The operational availability that expresses the percentage of time a process is available to users.
• The operational reliability, which is defined as the ability of a process to operate at the normal service level for a specified period of time.
• The Maximum Tolerable Period of Disruption (MTPD), which measures the time for which a process can remain available during a disruption before the system activity stops entirely. It depends on the business objectives of the system and is thus determined by the organization. The importance of this indicator lies in the fact that it has an impact on the recovery metrics [21]. Furthermore, the MTPD is identified as one of the inputs used to prepare the response plans [22].
In the light of the foregoing, the MTPD is selected, and the criticality of processes will be defined according to its value.
Regarding recovery, the following metrics can be listed [23]:
• Recovery Point Objective (RPO) describes the period of time in which data must be restored after a disruption.
• Recovery Time Objective (RTO) is the period during which the system must recover.
• Mean Down Time (MDT) is the mean time that a system is not operational.
• Recovery time, which is the time it takes for a process to return to normal operation after a failure has occurred. Research on the relationship between recovery and resilience shows that the recovery time depends on, among other factors, the resilience of the system and the scale of the perturbation [24]. It is also regarded as the period a system takes to return to a stable state [25]. This metric is chosen as one of the parameters used to define the process criticality.
As for anticipation, the notion of "workaround" is addressed. It may be defined, according to the Cambridge Dictionary, as "a way of dealing with a problem or making something work despite the problem, without completely solving it". In other words, it is an alternative solution that allows the operation to continue in the short term. For instance, a workaround for a raw material shortage could be borrowing from the stock of another plant. In terms of indicators, Workaround Implementation Time (WIT) and Workaround Maximum Tolerable Time Length (WMTTL) are selected. WIT is the amount of time it takes to implement an anticipated workaround for the present situation, and WMTTL is the absolute maximum length of time for which an implemented workaround can be maintained.
The selected indicators are presented in Table 1.

3R process
The 3R process is a novel method and an instrument for decision-making aimed at assessing the resilience of sociotechnical systems processes. This process is built on three essential concepts: Resource, Risk, and Resilience, and is conducted in two stages, which are the definition of parameters and the resilience assessment. A model of the 3R process is proposed in figure 1.
When this is done, the next step is to define parameters. The aim is to attribute values to the four indicators detailed in the previous section. For each one, time ranges should be defined and the process criticality level should be identified based on these intervals. As regards WIT and WMTTL, before proceeding with this exercise, the existence of workarounds should be checked, and if not found, workarounds must be put in place.
The criticality level helps determine the priority ranking of processes according to their potential risk of failure. In the present work, five process criticality levels are considered: (1) Very low, (2) Low, (3) Medium, (4) High, (5) Very high. An example of intervals for the selected indicators (MTPD, RT, WIT, and WMTTL) is given in the tables below.

MTPD
Level | Process criticality | Description | Comment
5 | Very high | MTPD < 4h | The risk of failure is very high since the process can carry on functioning for only 4 hours during the disturbance before completely ceasing its activity.
4 | High | 4h ≤ MTPD < 1 day | The risk of failure is high since the process can carry on functioning between 4 hours and 1 day during the disturbance before completely ceasing its activity.
3 | Medium | 1 day ≤ MTPD < 3 days | The risk of failure is medium since the process can carry on functioning between 1 day and 3 days during the disturbance before completely ceasing its activity.
2 | Low | 3 days ≤ MTPD < 5 days | The risk of failure is low since the process can carry on functioning between 3 and 5 days during the disturbance before completely ceasing its activity.
1 | Very low | 5 days ≤ MTPD < 1 week | The risk of failure is very low since the process can carry on functioning between 5 days and 1 week during the disturbance before completely ceasing its activity. This is a sufficient time for the process to recover.

RT
Level | Process criticality | Description | Comment
… | … | … | …
2 | Low | 1 day ≤ RT < 3 days | The risk brought about by the disruption is low since the process can recover in a period of time between 1 day and 3 days.
3 | Medium | 3 days ≤ RT < 5 days | The risk brought about by the disruption is medium since the process can only recover in a period of time between 3 and 5 days.
4 | High | 5 days ≤ RT < 1 week | The risk brought about by the disruption is high since the process can only recover in a period of time between 5 and 7 days.
5 | Very high | 1 week ≤ RT | The risk brought about by the disruption is very high since the process can only recover in more than 1 week.

WIT
Level | Process criticality | Description | Comment
… | … | … | …
2 | Low | 1 day ≤ WIT < 3 days | The risk of failure is low since the anticipated workaround can be implemented in a period of time between 1 day and 3 days.
3 | Medium | 3 days ≤ WIT < 5 days | The risk of failure is medium since the anticipated workaround can be implemented only in a period of time between 3 and 5 days.
4 | High | 5 days ≤ WIT < 1 week | The risk of failure is high since the anticipated workaround can be implemented only in a period of time between 5 and 7 days.
5 | Very high | 1 week ≤ WIT | The risk of failure is very high since the anticipated workaround can be implemented only in more than 1 week.

WMTTL
Level | Process criticality | Description | Comment
… | … | … | …
2 | Low | 5 days ≤ WMTTL < 1 week | The risk of failure is low since the anticipated workaround can be maintained for a period of time between 5 and 7 days.
1 | Very low | 1 week ≤ WMTTL | The risk of failure is very low since the anticipated workaround can be maintained for more than 1 week.

The process criticality levels for the four selected metrics are gathered in Table 6 (1/6 day stands for 4 hours). An example of application of the first part of the 3R process model is given in the process analysis template hereafter (Table 7).

Process analysis template
… | …
… | The potential threats for this process are: lab unavailable, reagent supply shortage, scanners or printers unavailable, more than half of the team missing, equipment unavailable…
Define MTPD (five levels) | Cf. Table 6
Define recovery time (five levels) | Cf. Table 6
Check the existence of workarounds | If no → identify workarounds. If yes → select a workaround.
Identify workarounds | - Lab unavailable → Externalization of quality control to another plant.
 | - More than half of the team missing → Borrow people from the production department.
 | - Reagent supply shortage → Borrow from another plant.
 | - Equipment unavailable → Delegate part of the activity to another plant.
Select a workaround | Select one of the workarounds already identified
Define WIT (five levels) | Cf. Table 6
Define WMTTL (five levels) | Cf. Table 6
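The definition of parameters amounts to mapping a measured duration onto one of the five criticality levels. A minimal sketch for the MTPD, using the example intervals given above (4 hours, 1 day, 3 days, 5 days), could look as follows. The function name `mtpd_criticality` is hypothetical, and treating every MTPD of 5 days or more as "Very low" is an assumption, since the example table only states the last interval up to 1 week.

```python
from bisect import bisect_right

# Upper bounds (in days) of the first four MTPD intervals of the example:
# MTPD < 4 h, < 1 day, < 3 days, < 5 days.
MTPD_BOUNDS_DAYS = [4 / 24, 1, 3, 5]

# Criticality levels ordered from the shortest MTPD to the longest.
MTPD_LEVELS = [
    (5, "Very high"),
    (4, "High"),
    (3, "Medium"),
    (2, "Low"),
    (1, "Very low"),  # assumption: also used for MTPD >= 1 week
]


def mtpd_criticality(mtpd_days: float) -> tuple[int, str]:
    """Return (level, label) for an MTPD expressed in days."""
    if mtpd_days < 0:
        raise ValueError("MTPD must be non-negative")
    # bisect_right makes each lower bound inclusive, e.g. 4h <= MTPD < 1 day.
    return MTPD_LEVELS[bisect_right(MTPD_BOUNDS_DAYS, mtpd_days)]


for days in (2 / 24, 4 / 24, 2, 6):
    print(days, mtpd_criticality(days))
```

The same pattern would apply to RT, WIT and WMTTL with their own bounds and with the level order reversed where a longer duration means a higher criticality (RT and WIT).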
The second and most important step of the 3R process model is the "resilience assessment". Its aim is to explain how to rate sociotechnical system processes on the resilience scale. As shown in Figure 1, this part consists in determining the inputs in order to obtain the echelon at the output. This helps to identify the areas for improvement of the process and subsequently to overcome its weaknesses.
In order to design the 3R process model, fuzzy logic is used. This is a data mining approach that enables the modelling of a decision process in a situation of uncertain reasoning or incomplete information [26]. A fuzzy inference system (FIS) is defined as a system that uses fuzzy membership functions to make a decision [27]. In our case, the Mamdani approach is adopted. This is a nonlinear mapping process based on fuzzy logic, which involves three components: fuzzification, inference rules, and defuzzification. The proposed fuzzy model is implemented in Python, a versatile and easy-to-use programming language with a large and rich set of libraries. The first step in the fuzzy model is to import scikit-fuzzy, a fuzzy logic package for Python, using the following code block:

    import numpy as np
    import skfuzzy as fuzz
    from skfuzzy import control as ctrl

Afterwards, the inputs (MTPD, recovery time (RT), workaround implementation time (WIT), and workaround maximum tolerable time length (WMTTL)) and the output (Echelon) should be defined. The input values range between 0 and 8 according to Table 6, and the output values between 0 and 5 according to the resilience scale (five echelons: 0, 1, 2, 3 and 4). The code portion is inserted hereunder:

    …
    MTPD.automf(5)
    WIT.automf(5)
    WMTTL.automf(5)
    RT.automf(5)

Then, membership functions are built based on the resilience scale and Table 6. Our choice was to go with triangular membership functions (trimf), which are frequently used in many applications of fuzzy sets owing to their simplicity and which are suitable in this case; they are defined by a lower limit a, an upper limit b, and a value m, where a < m < b [28], [29]:

    … [3, 3, 5])
    WMTTL['Low'] = fuzz.trimf(WMTTL.universe, [5, 5, 7])
    WMTTL['Very low'] = fuzz.trimf(WMTTL.universe, [7, 7, 8])
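To make the fuzzification step more concrete, the triangular membership function can be written out in plain Python, without scikit-fuzzy. The sketch below is illustrative only: the helper `trimf` is a hypothetical re-implementation (not the library function), it also accepts the degenerate case where the peak coincides with a limit, as in the triples used above, and the WMTTL value of 6 days is an arbitrary example.

```python
def trimf(x: float, a: float, m: float, b: float) -> float:
    """Triangular membership with limits a and b and peak m (a <= m <= b)."""
    if not a <= m <= b:
        raise ValueError("expected a <= m <= b")
    if x == m:
        return 1.0          # full membership at the peak
    if x <= a or x >= b:
        return 0.0          # outside the support of the triangle
    if x < m:
        return (x - a) / (m - a)   # rising edge
    return (b - x) / (b - m)       # falling edge


# Two WMTTL sets taken from the code fragment above (values in days).
WMTTL_SETS = {
    "Low": (5, 5, 7),
    "Very low": (7, 7, 8),
}

wmttl_days = 6  # arbitrary example value
memberships = {
    label: trimf(wmttl_days, *abc) for label, abc in WMTTL_SETS.items()
}
print(memberships)
```

With these sets, a workaround that can be maintained for 6 days belongs to "Low" with degree 0.5 and not at all to "Very low"; the inference rules then combine such degrees across the four inputs.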
In example 1, MTPD and RT are very low, which means that the first two missions of the resilience process (resistance and recovery) are perfectly fulfilled. It remains to be seen whether the last mission (anticipation) is also fully accomplished. Accordingly, the output (Echelon) will oscillate between progressing and expert.
In example 2, the RT decreases, which signals a much-needed effort to enhance recovery, and thus the process moves down the resilience scale from progressing to aspiring. It cannot be informed or unconscious because the resilience process is already set up with at least one mission that is effectively fulfilled (resistance).
Regarding the last example, MTPD and RT are high, which is to say that the resistance and recovery of the process are weak. Therefore, depending on the performance of anticipation, the echelon varies between informed and unconscious.
Subsequently, a control system (a nonlinear Single Input-Single Output (SISO) discrete-time process based on a set of rules [30]) is created using the code block hereunder.

According to the plot above, the resilience echelon is about 1.27. That is to say, by referring to the resilience scale, the process under study is informed: it is poorly developed with regard to resilience. An associated resilience process must be set up in order to enhance its resilience. The aim is to improve:
• the resistance of the process and its resource to the disruption, by increasing the MTPD;
• the recovery, by reducing the recovery time (RT);
• the anticipation, by putting in place well-elaborated workarounds, and by reducing WIT and increasing WMTTL for each workaround.
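The way a Mamdani system turns rule activations into a single score can be sketched in plain Python. The example below is deliberately reduced and is not the model of this paper: it uses only two inputs (MTPD and RT, in days), two hypothetical fuzzy sets per variable, and two hypothetical rules, whereas the actual model relies on four inputs, five sets per input and a larger rule base built with scikit-fuzzy. All names and set shapes are assumptions made for illustration.

```python
def trimf(x: float, a: float, m: float, b: float) -> float:
    """Triangular membership with limits a and b and peak m (a <= m <= b)."""
    if x == m:
        return 1.0
    if x <= a or x >= b:
        return 0.0
    return (x - a) / (m - a) if x < m else (b - x) / (b - m)


# Hypothetical input sets on a 0..8 day universe.
def short_duration(days: float) -> float:
    return trimf(days, 0, 0, 8)


def long_duration(days: float) -> float:
    return trimf(days, 0, 8, 8)


# Hypothetical output sets on the 0..5 echelon universe.
def low_echelon(y: float) -> float:
    return trimf(y, 0, 0, 5)


def high_echelon(y: float) -> float:
    return trimf(y, 0, 5, 5)


OUTPUT_UNIVERSE = [k / 100 for k in range(501)]  # 0.00, 0.01, ..., 5.00


def resilience_score(mtpd_days: float, rt_days: float) -> float:
    """Mamdani inference with min/max operators and centroid defuzzification."""
    # Rule 1 (hypothetical): IF MTPD is short AND RT is long THEN echelon is low.
    weak = min(short_duration(mtpd_days), long_duration(rt_days))
    # Rule 2 (hypothetical): IF MTPD is long AND RT is short THEN echelon is high.
    strong = min(long_duration(mtpd_days), short_duration(rt_days))
    # Each rule clips its output set; the clipped sets are merged with max.
    aggregated = [
        max(min(weak, low_echelon(y)), min(strong, high_echelon(y)))
        for y in OUTPUT_UNIVERSE
    ]
    area = sum(aggregated)
    if area == 0:
        raise ValueError("no rule fired for these inputs")
    # Centroid of the aggregated output set.
    return sum(y * mu for y, mu in zip(OUTPUT_UNIVERSE, aggregated)) / area


print(round(resilience_score(mtpd_days=8, rt_days=0), 2))
print(round(resilience_score(mtpd_days=0, rt_days=8), 2))
```

In this toy setting, a long MTPD combined with a short RT yields a score in the upper part of the scale, and the opposite combination yields a score in the lower part, which mirrors the reasoning of the rule examples discussed above.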
In order to make this model usable for sociotechnical systems and a core part of their processes, a user interface should be designed. For this purpose, a model is elaborated using UML language (Cf. figure 3).
As can be seen, the metamodel illustrated by figure 3 formulates the following points:
• A process has one or more resources.
• A resource is exposed to one or more risks.
• A risk can have one or several identified workarounds. It may also have no created workarounds.
• There are processes called resilience processes.
• The existence of resilience processes in sociotechnical systems helps avoid the unavailability of resources.
• An echelon (Resilience score) is attributed to each process of the sociotechnical system based on the existence of associated resilience processes.
• The missions of resilience processes are resistance, recovery and anticipation.
• Resistance is measured using MTPD.
• Recovery is measured using RT.
• Anticipation is measured using WIT and WMTTL. These two indicators characterize workarounds.
• The echelon is calculated through MTPD, RT, WIT and WMTTL.
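The structural part of this metamodel can be mirrored by a few data classes. The following is a minimal sketch under stated assumptions: the class and function names are hypothetical, the identifiers P1, P1R1, P1R1T1 and P1R1T1W1 follow the naming used in the user interface example, and the second risk as well as the WIT and WMTTL values are invented purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Workaround:
    name: str
    wit_days: float     # Workaround Implementation Time
    wmttl_days: float   # Workaround Maximum Tolerable Time Length


@dataclass
class Risk:
    name: str
    # A risk may have several workarounds, or none at all.
    workarounds: List[Workaround] = field(default_factory=list)


@dataclass
class Resource:
    name: str
    risks: List[Risk] = field(default_factory=list)


@dataclass
class Process:
    name: str
    resources: List[Resource] = field(default_factory=list)


def combinations(
    process: Process,
) -> Iterator[Tuple[Process, Resource, Risk, Workaround]]:
    """Yield every (process, resource, risk, workaround) combination."""
    for resource in process.resources:
        for risk in resource.risks:
            for workaround in risk.workarounds:
                yield process, resource, risk, workaround


def risks_without_workaround(process: Process) -> List[str]:
    """List the risks for which a workaround still has to be developed."""
    return [
        risk.name
        for resource in process.resources
        for risk in resource.risks
        if not risk.workarounds
    ]


# Illustrative data only: the durations below are not taken from the paper.
p1 = Process("P1", [Resource("P1R1", [
    Risk("P1R1T1", [Workaround("P1R1T1W1", wit_days=2, wmttl_days=6)]),
    Risk("P1R1T2"),
])])

print([w.name for *_, w in combinations(p1)])
print(risks_without_workaround(p1))
```

Each combination yielded by `combinations` is a candidate for one resilience assessment, while `risks_without_workaround` corresponds to the verification step in which missing workarounds are identified.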
Firstly, the database should be populated with the list of all the processes in a sociotechnical system. Then, for each process, the critical resources should be identified. After this, the potential threats should be presented for each resource. Next, the verification should be carried out, for each risk, to ensure that anticipated workarounds are available. If this is not the case, workarounds should be developed to compensate for the lack. Thereafter, criticality levels (Very high, high, Medium, Low, Very low) should be determined for MTPD, RT, WIT and WMTTL.
Secondly, the parameters can be defined via the dedicated user interfaces, which are developed using the Tkinter framework of Python (cf. figure 4). Then, based on the chosen criteria, the resilience of the selected process can be assessed. Figure 5 illustrates that the values of the indicators MTPD, RT, WIT and WMTTL should be entered to get the resilience score of the process. For the selected combination (Process (P1), Resource (P1R1), Threat (P1R1T1), Workaround (P1R1T1W1)), the resilience score is 1.27. That is to say, according to the resilience scale, the process P1 is informed and there is no elaborated resilience process associated with it. The first thing to be done is to enhance the resistance of the process/resource to the risk P1R1T1 by increasing the MTPD, in order to avoid the interruption of the activity, and at the same time to attempt to recover more quickly. Furthermore, the workaround implementation time should be sharply reduced, and the workaround P1R1T1W1 should remain operational for longer, at least until recovery.
On the basis of the given example, the 3R process can be considered as a decision-making tool that helps sociotechnical systems assess their processes as per resilience and hence identify improvement opportunities with the aim of building a system capable of weathering the continuing uncertainty surrounding business environment nowadays.

A real case company example: COVID-19 context
COVID-19 is having an unprecedented impact on sociotechnical systems (societies, companies…) and on people's lives, and will continue to do so into the foreseeable future. Since the first day, the selected company, which is one of the largest and best-known enterprises operating in the cosmetics sector, has been working on the COVID-19 outbreak in order to ensure its employees' safety. Country by country as well as globally, the evolution of the situation has been carefully monitored, and decisive measures to limit everyone's risk of exposure have been taken. In order to evaluate the resilience of the company processes against this worldwide crisis, the 3R process method is applied.
First, the physical distribution process (PDP) is chosen as an example. It is a part of the supply chain organization. Its aim is to ensure that the service to the consumer required by the business is met at the best possible cost by optimizing the inbound (reception and returns), internal (putaway and replenishment) and outbound (order preparation and shipping) physical flows, whilst guaranteeing quality and safety in a sustainable environment. The table below summarizes all the crisis phases that the PDP has passed through. As mentioned previously, the first step of the 3R process method is the definition of parameters. For this purpose, the human factor is selected as the resource. Then, the threat chosen, based on the table above, is capability reduction due to school closing. Finally, the workaround selected is alternating shift schedules. The four indicators are determined as follows (Table 11).

MTPD
Level | Process criticality | Description | Comment
… | … | … | …
4 | High | 1 month ≤ MTPD < 3 months | The risk of failure is high since the PDP can handle the capability reduction only for a period of time between 1 month and 3 months before completely ceasing its activity.
3 | Medium | 3 months ≤ MTPD < 5 months | The risk of failure is medium since the PDP can handle the capability reduction only for a period of time between 3 and 5 months before completely ceasing its activity.
2 | Low | 5 months ≤ MTPD < 9 months | The risk of failure is low since the PDP can handle the capability reduction for a period of time between 5 and 9 months before completely ceasing its activity.
1 | Very low | MTPD ≥ 9 months | The risk of failure is very low since the PDP can handle the capability reduction for more than 9 months before completely ceasing its activity. This is a sufficient time for the process to recover.

RT
Level | Process criticality | Description | Comment
1 | Very low | RT < 1 month | The risk brought about by the disruption is very low since the process can recover in less than 1 month.
2 | Low | 1 month ≤ RT < 3 months | The risk brought about by the disruption is low since the process can recover in a period of time between 1 month and 3 months.
3 | Medium | 3 months ≤ RT < 5 months | The risk brought about by the disruption is medium since the process can only recover in a period of time between 3 and 5 months.
4 | High | 5 months ≤ RT < 12 months | The risk brought about by the disruption is high since the process can only recover in a period of time between 5 and 12 months.
… | … | … | …

WIT
Level | Process criticality | Description | Comment
… | … | … | …
2 | Low | 1 month ≤ WIT < 3 months | The risk of failure is low since the anticipated workaround can be implemented in a period of time between 1 month and 3 months.
3 | Medium | 3 months ≤ WIT < 5 months | The risk of failure is medium since the anticipated workaround can be implemented only in a period of time between 3 and 5 months.
4 | High | 5 months ≤ WIT < 9 months | The risk of failure is high since the anticipated workaround can be implemented only in a period of time between 5 and 9 months.
5 | Very high | 9 months ≤ WIT | The risk of failure is very high since the anticipated workaround can be implemented only in more than 9 months.

WMTTL
Level | Process criticality | Description | Comment
5 | Very high | WMTTL < 1 month | The risk of failure is very high since the anticipated workaround can be maintained only for less than 1 month.
4 | High | 1 month ≤ WMTTL < 3 months | The risk of failure is high since the anticipated workaround can be maintained only for a period of time between 1 month and 3 months.
3 | Medium | 3 months ≤ WMTTL < 5 months | The risk of failure is medium since the anticipated workaround can be maintained only for a period of time between 3 and 5 months.
2 | Low | 5 months ≤ WMTTL < 9 months | The risk of failure is low since the anticipated workaround can be maintained for a period of time between 5 and 9 months.
1 | Very low | 9 months ≤ WMTTL | The risk of failure is very low since the anticipated workaround can be maintained for more than 9 months.

Afterwards, the second step, which is the resilience assessment, is initiated. The inputs are specified in the table below. By using the 3R process tool, the plot below (Figure 6) is obtained.
By inference, the physical distribution process is aspiring. In fact, the Novel Coronavirus outbreak has some particularities that differentiate it from other disruptions. As a result, the corresponding resilience process is not yet completely set up. The resistance mission is functioning very well and the anticipation mission is under development. However, the recovery mission is still very ambiguous.

Conclusion
The presented work set out a method that can be used by sociotechnical systems to evaluate the resilience of their processes. First and foremost, the resilience scale with its five echelons (0. Unconscious, 1. Informed, 2. Aspiring, 3. Progressing, 4. Expert) and three missions (resistance, recovery and anticipation) was introduced. Then, indicators related to these missions were outlined (MTPD for resistance, RT for recovery, WIT and WMTTL for anticipation). Afterwards, the 3R process was detailed. The 3Rs stand for Resource, Risk and Resilience. This framework is divided into two parts. The first one is the definition of parameters, which consists in determining process criticality levels according to the four selected indicators. Time ranges are defined for the metrics and, for each process, critical resources, potential risks and anticipated workarounds are selected, and resilience is evaluated for all possible combinations. The second part is the resilience assessment. In order to calculate the resilience score (echelon) and rank the process on the resilience scale, fuzzy logic has been used. A fuzzy model has been elaborated and user interfaces have been developed in order to enable decision makers to evaluate their systems and build their resilience. In short, the major advantages of this process can be summarized as follows. First, this is a generic method that can be applied to any system and help it to define the key indicators related to resistance (MTPD), recovery (RT) and anticipation (WIT and WMTTL), the three functions of a resilience process, and to identify plausible workarounds. Second, it provides a resilience score and a rating on the resilience scale, which can be useful for detecting and correcting weaknesses and for increasing strength and effectiveness in dealing with adverse events. Finally, by applying fuzzy logic in the 3R process, the vagueness and uncertainty faced can be overcome.
In the last section, a study was conducted to look at the impact of the COVID-19 sanitary crisis on a real case company example, and at how resilience plays a crucial role in preventing the worst from happening and in gradually resuming the normal course of events. In fact, the real case study presented shows that the pandemic situation has negatively affected the functioning of the studied process. In the aftermath, the process is working on improving its resilience, especially the recovery mission. In future work, attempts will be made to establish new methods for evaluating and enhancing the resilience of sociotechnical system processes.