Trajectory Tracking Control Optimization with Neural Network for Autonomous Vehicles

For mission-critical and time-sensitive navigation of autonomous vehicles, the controller design must exhibit excellent tracking performance with respect to the speed of convergence to the reference command and steady-state accuracy. In this article, a novel integration of a neural network with the traditional control system is proposed to adaptively obtain optimized controller parameters, resulting in improved transient and steady-state performance of motion and position control of autonomous vehicles. The proposed intelligent control scheme is applied to mobile robot navigation along an eight-shaped trajectory by optimizing a Lyapunov-based nonlinear controller. Furthermore, a Linear Quadratic Regulator-based controller is optimized with the proposed strategy to control the pitch and yaw angles of a 2-Degree-of-Freedom helicopter. The simulation results show that the proposed scheme outperforms the traditional controllers in terms of the speed of convergence to the desired trajectory and overall error minimization.


Introduction
There has been growing demand for autonomous vehicles with excellent maneuvering capabilities for both commercial and military applications [1,2]. A significant amount of work is ongoing to realize the commercial operation of autonomous cars [3,4], and Wheeled Mobile Robots (WMRs) are finding increasing use in industrial and service applications [5]. Autonomous Surface Vehicles (ASVs) have been utilized to improve port safety and for ecological as well as meteorological purposes [6], whereas Autonomous Underwater Vehicles (AUVs) are useful tools for underwater search and inspection [7]. Safe navigation requires both excellent path planning and path tracking strategies. Path planning involves the determination of an appropriate trajectory for the vehicle, whereas path tracking, which is the focus of this study, is the following of a desired trajectory. Several schemes have been reported in the literature for path planning, including [8], where a framework was proposed to synthesize sequences of maneuvers that are tracked using nonlinear controllers; other strategies are presented in [9,10]. To execute the maneuvers generated by path planners, control strategies have been proposed as well.
In [11], a Fuzzy Logic Controller (FLC) was proposed for robot tracking, and a Proportional-Integral-Derivative (PID) controller was utilized to control the speed of WMRs in [12]. The authors of [13] presented an observer-based approach, whereas a backstepping control strategy was presented in [14,15]. Sliding-mode control for WMR trajectory tracking with initial error was reported in [16], and a modular-based method was proposed in [17]. The authors of [18] proposed a Model Predictive Control (MPC) approach, and a Kalman filter-based strategy was presented in [19]. A Lyapunov-based scheme was reported in [20], and controller design by approximate linearization utilizing Taylor expansion was proposed in [21]. In order to enhance the performance of traditional control methods, Least Square Policy Iteration (LSPI) and Dynamic Heuristic Programming (DHP) algorithms were utilized in [22] for optimizing a Proportional-Derivative (PD) controller. Also, the authors of [23] utilized a neural network (NN) to describe the inverse dynamics of a biped robot with respect to output errors for the control of level walking. However, application examples were limited to scenarios where the robot is initialized at the desired starting coordinate.
In order to execute maneuvers for Unmanned Aerial Vehicles (UAVs), feedback linearization [24,25] and sliding mode control [26] have been proposed, but they have failed for certain models due to the nonlinear dynamics of the vehicle [27]. For helicopter control, a backstepping control strategy [28] and Linear Quadratic Regulator (LQR)-based control [29] have been proposed. In this study, a novel learning-based adaptive scheme utilizing a neural network is developed for autonomous vehicle trajectory tracking. Although plant models predict the vehicular motion for a given control command, their accuracy is limited by modelling errors and approximations. Also, for certain models with complex nonlinear dynamics, it is difficult to obtain a suitable controller. Since machine learning models provide a more powerful tool to describe the nonlinear dynamics of a plant given example data of plant operation, this article presents a parameterized control law designed to achieve trajectory tracking of autonomous vehicles adaptively. Rather than using a single controller, a family of controllers is obtained by utilizing the NN model to estimate the time-varying controller parameters. The scheme was applied to optimize the parameters of a Lyapunov-based nonlinear controller used to execute an eight-shaped maneuver for a mobile robot. Furthermore, LQR-based controller parameters for 2-Degree-Of-Freedom (DOF) helicopter position control were optimized using the proposed scheme. Simulation results show that the scheme outperforms the traditional control strategies in terms of faster convergence to the desired trajectory and more accurate steady-state performance, regardless of the initial shift in starting coordinates. Also, because the scheme is sample-based, it can compensate for modeling errors.
The rest of the paper is organized as follows. Section 2 presents the problem formulation. Section 3 provides illustrative applications of the proposed scheme. Simulation results are discussed in Section 4, and Section 5 summarizes the study.

Problem Formulation
A dynamic system can generally be defined in state-space as
$$\dot{x}(t) = f\big(x(t),\, u(t)\big) \tag{1}$$
The system state is represented by $x(t)$, the control input is denoted by $u(t)$, $f$ denotes a mathematical function, and $\dot{x}(t)$ is the state derivative with respect to time. Given a target signal $r(t)$, a traditional feedback control law can be computed based on the difference between the target signal and the actual system output. This difference can be denoted by $e(t)$, resulting in a control input defined as
$$u(t) = K(t)\, e(t) \tag{2}$$
where $K(t)$ denotes the control gain, which can be constant or time-varying depending on the dynamic description of the system. With certain control strategies, however, not only the error but also the states are fed back for control. For two-dimensional and higher-order systems, the states, control input and error signals are vectors, whereas the control gain can be a vector or a matrix depending on the control strategy used.
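The feedback structure described above can be illustrated with a minimal simulation sketch. It assumes a scalar system whose output equals its state, an error defined as target minus state, and a control input formed as gain times error. The first-order plant, the constant gain of 9 and the helper name `simulate_feedback` are illustrative assumptions, not taken from the study.

```python
def simulate_feedback(f, gain, reference, x0, dt=0.01, steps=1000):
    """Euler-integrate x' = f(x, u) under the feedback law u = K(t) * e(t)."""
    x, history = x0, []
    for k in range(steps):
        t = k * dt
        e = reference(t) - x      # tracking error: target minus measured state
        u = gain(t) * e           # control input from the (possibly time-varying) gain
        x = x + dt * f(x, u)      # explicit Euler step of the state equation
        history.append(x)
    return history


# Hypothetical first-order plant x' = -x + u (an assumption for illustration).
def plant(x, u):
    return -x + u


trace = simulate_feedback(plant, gain=lambda t: 9.0, reference=lambda t: 1.0, x0=0.0)
final_state = trace[-1]
```

With a fixed proportional gain this toy plant settles at K/(1 + K) of the target rather than at the target itself, which is the kind of trade-off between convergence speed and steady-state accuracy that motivates adapting the gain.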
In order to optimize the controller performance for changing system dynamics or operating regions, a neural network-based method that adaptively determines the control gain is proposed as follows. For a traditional closed-loop control system, a sample test run of the system is performed and, for each time step $\Delta t$, the state variables and the control gain are measured and arranged into the 3-tuple $\{x(t),\, x(t+\Delta t),\, K(t)\}$. The state measurement before the application of the control input is denoted as $x(t)$, whereas the next state caused by the control action is $x(t+\Delta t)$. The control gain that caused the state transition is represented as $K(t)$.
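As a rough sketch of the data arrangement described above, a logged run can be turned into (state, next state, gain) samples as follows. The function name `build_transition_tuples` and the logged values are hypothetical and serve only to show the pairing.

```python
def build_transition_tuples(states, gains):
    """Arrange a logged run into (state, next_state, gain) samples.

    `states` holds one more entry than `gains`, because gains[k] is the gain
    that was active while the system moved from states[k] to states[k + 1].
    """
    if len(states) != len(gains) + 1:
        raise ValueError("expected len(states) == len(gains) + 1")
    return [(states[k], states[k + 1], gains[k]) for k in range(len(gains))]


# Hypothetical log of a scalar system over three time steps.
logged_states = [0.0, 0.4, 0.7, 0.85]
logged_gains = [2.0, 1.5, 1.0]
samples = build_transition_tuples(logged_states, logged_gains)
```

Each sample keeps the gain next to the transition it produced, so the first two entries can later serve as network inputs and the third as the training label.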
Multiple samples of state transitions and control gains are measured for a sequence of operation; the control gain (for a time-invariant system) or the control gain parameters (for a time-varying system) are then varied for other sequences. An example of a sequence is the tracking of a particular course by a mobile robot or the change in angular orientation of a helicopter over a period of time. For training stability, the data set is normalized according to the data range. Based on the accumulated data set, as represented by the 3-tuple, a NN is trained using the state and next state as input and the control gain as the output. Hence,
$$K(t) = \mathrm{NN}\big(x(t),\, x(t+\Delta t)\big) \tag{3}$$
The NN model is made up of processing units called neurons with weighted interconnections among them [30], and the neurons with nonlinear activation functions are arranged into layers as shown in Figure 1. The input layer is the vector of input variables, which are the system states in this study. For a single hidden layer system, the hidden layer is a nonlinear combination of the input signals, and for a multiple hidden layer system, subsequent layers are nonlinear combinations of the previous layers. Knowledge is extracted from the output layer after model training. In the supervised learning employed in this study, an error function based on the difference between the actual outputs, called labels, and the predicted outputs from the NN is minimized to adjust the interconnection weights among the neurons. After several iterations, the model is said to be trained. Only a portion of the dataset is used for training; the remainder is used for testing the performance of the trained model. The mathematical description of the NN model as utilized in this study is
$$y_l = \varphi(z_l) \tag{4}$$
$$z_l^{j} = w_l^{j}\, y_{l-1} + b_l^{j} \tag{5}$$
where the output vector of layer $l$ is denoted by $y_l$. Hence, the input layer vector $y_0$ is equivalent to the state input $\{x(t),\, x(t+\Delta t)\}$. The activation function is denoted by $\varphi$, and $z_l$ represents the input vector to layer $l$. The interconnecting weights and biases from layer $l-1$ to unit $j$ of layer $l$ are represented by $w_l^{j}$ and $b_l^{j}$ respectively.
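The layer-by-layer computation and the error function described above can be sketched in plain Python. This is an illustrative forward pass only, not the training procedure used in the study. The tanh activation, the linear output layer, the 2-2-1 layout and all numeric weights are assumptions chosen to keep the example small.

```python
import math


def forward(inputs, layers, activation=math.tanh):
    """Propagate an input vector through fully connected layers.

    `layers` is a list of (weights, biases) pairs. weights[j] is the weight
    vector feeding unit j of the layer and biases[j] is that unit's bias.
    Hidden layers apply `activation`; the output layer is left linear here.
    """
    y = list(inputs)  # input-layer vector
    for index, (weights, biases) in enumerate(layers):
        # Weighted sum plus bias for every unit j of the current layer.
        z = [sum(w * v for w, v in zip(row, y)) + b for row, b in zip(weights, biases)]
        is_output_layer = index == len(layers) - 1
        y = z if is_output_layer else [activation(value) for value in z]
    return y


def mean_squared_error(predictions, labels):
    """One common choice of error function for supervised training."""
    return sum((p - t) ** 2 for p, t in zip(predictions, labels)) / len(labels)


# Hypothetical 2-2-1 network: inputs are (state, next_state), output is a gain.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer
    ([[2.0, 1.0]], [0.1]),                    # linear output layer
]
predicted_gain = forward([0.3, 0.3], layers)
loss = mean_squared_error(predicted_gain, [0.5])
```

Training would adjust the entries of `layers` to reduce `loss` over the training portion of the data, with the held-out portion used to check generalization.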
Following offline training as described above, the trained NN model is integrated as a control gain estimator in the traditional control system as shown in Figure 2. For online control gain estimation, the NN input due to the next state $x(t+\Delta t)$ is replaced with the reference signal $r(t)$. Hence,
$$K(t) = \mathrm{NN}\big(x(t),\, r(t)\big) \tag{6}$$
Then, (2) is transformed to
$$u(t) = \mathrm{NN}\big(x(t),\, r(t)\big)\, e(t) \tag{7}$$
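The online arrangement, in which the estimator receives the current state and the reference and returns the gain used by the error loop, might look like the following sketch. The function `toy_estimator` is a hand-written stand-in for a trained network and the first-order plant is hypothetical; both are assumptions for illustration.

```python
def adaptive_control_step(state, reference, gain_estimator):
    """Estimate the gain from (state, reference), then form u = K * e."""
    gain = gain_estimator(state, reference)  # outer loop: estimator replaces a fixed gain
    error = reference - state                # inner loop: tracking error
    return gain * error, gain


def toy_estimator(state, reference):
    """Stand-in for a trained NN: larger gain when far from the target."""
    return 2.0 + 8.0 * min(abs(reference - state), 1.0)


x, dt = 0.0, 0.01
for _ in range(2000):
    u, k = adaptive_control_step(x, 1.0, toy_estimator)
    x += dt * (-x + u)  # hypothetical plant x' = -x + u, Euler-integrated
```

The point of the sketch is structural: the gain is recomputed at every step from the current state and the reference, instead of being fixed at design time.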

Illustrative Examples
In order to demonstrate the effectiveness of the proposed scheme, two application case studies are presented. One is the tracking of an eight-shaped trajectory by a mobile robot, and the other is the position tracking of a 2-DOF helicopter.

Mobile Robot Trajectory Tracking
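The nonlinear kinematic model [31] of a non-holonomic wheeled mobile robot is described as
$$\dot{x} = v\cos\theta, \qquad \dot{y} = v\sin\theta, \qquad \dot{\theta} = \omega \tag{8}$$
The robot's states $x$, $y$ and $\theta$ are the Cartesian $x$, $y$ and angular displacements respectively, whereas $\dot{x}$, $\dot{y}$ and $\dot{\theta}$ are the state derivatives, $v$ corresponds to the linear velocity and $\omega$ represents the angular velocity around the vertical axis.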
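A minimal sketch of this kind of model, assuming the standard unicycle kinematics (forward speed along the heading, turn rate about the vertical axis) and explicit Euler integration, is shown below. The step size and the constant commands are illustrative choices.

```python
import math


def unicycle_step(pose, v, omega, dt):
    """Advance the pose (x, y, theta) by one explicit Euler step."""
    x, y, theta = pose
    return (
        x + dt * v * math.cos(theta),  # motion along the current heading
        y + dt * v * math.sin(theta),
        theta + dt * omega,            # heading change from the turn rate
    )


# Constant commands v = 1 m/s and omega = 1 rad/s for one full revolution.
steps = 1000
dt = 2.0 * math.pi / steps
pose = (0.0, 0.0, 0.0)
for _ in range(steps):
    pose = unicycle_step(pose, v=1.0, omega=1.0, dt=dt)
```

With constant commands the robot traces a closed polygon that approximates a circle, so the final position should lie close to the starting point while the heading has advanced by one full turn.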

Traditional Control Design
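Motion control of the mobile robot is achieved by computing appropriate linear and angular velocities to drive the robot along the desired trajectory. According to the procedure in [32], the state tracking error is generalized by defining it through a rotation matrix to obtain
$$\begin{bmatrix} e_1 \\ e_2 \\ e_3 \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_r - x \\ y_r - y \\ \theta_r - \theta \end{bmatrix} \tag{9}$$
where $x_r$, $y_r$ and $\theta_r$ are the target coordinates for the robot. Then the error dynamics in generalized coordinates are obtained by taking the derivative of (9) as
$$\dot{e}_1 = \omega e_2 - v + v_r\cos e_3, \qquad \dot{e}_2 = -\omega e_1 + v_r\sin e_3, \qquad \dot{e}_3 = \omega_r - \omega \tag{10}$$
where $v_r$ and $\omega_r$ are the reference linear and angular velocities. Making the following substitutions,
$$v = v_r\cos e_3 - u_1 \tag{11}$$
$$\omega = \omega_r - u_2 \tag{12}$$
the error dynamics are transformed to the subsequent form,
$$\dot{e}_1 = (\omega_r - u_2)\, e_2 + u_1, \qquad \dot{e}_2 = -(\omega_r - u_2)\, e_1 + v_r\sin e_3, \qquad \dot{e}_3 = u_2 \tag{13}$$
To ensure global stability for the nonlinear dynamics, the following Lyapunov function is selected,
$$V = \frac{k_2}{2}\left(e_1^2 + e_2^2\right) + \frac{e_3^2}{2} \tag{14}$$
Taking the derivative of (14) gives
$$\dot{V} = k_2\left(e_1\dot{e}_1 + e_2\dot{e}_2\right) + e_3\dot{e}_3 \tag{15}$$
By substituting for $\dot{e}_1$, $\dot{e}_2$ and $\dot{e}_3$ from (13),
$$\dot{V} = k_2 e_1 u_1 + k_2 v_r e_2 \sin e_3 + e_3 u_2 \tag{16}$$
Hence, the Lyapunov-based control law such that $\dot{V} \le 0$ is guaranteed is obtained as
$$u_1 = -k_1 e_1 \tag{17}$$
$$u_2 = -k_2 v_r \frac{\sin e_3}{e_3}\, e_2 - k_3 e_3 \tag{18}$$
The controller gains are defined as
$$k_1 = k_3 = 2\zeta\sqrt{\omega_r^2 + b\, v_r^2} \tag{19}$$
where $k_2 = b > 0$ is a constant, and $\zeta$ is the damping ratio.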
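The design steps above can be sketched in code under stated assumptions. The sketch assumes the commonly used form of this Lyapunov design: the pose error is rotated into the robot frame, the gains are scheduled on the reference velocities through a damping ratio `zeta` and a constant `b`, and the feedback terms are combined with the reference velocities. The heading-error wrapping, the small-angle guard and the default values of `zeta` and `b` are illustrative additions, not taken from the study.

```python
import math


def tracking_error(pose, ref_pose):
    """Rotate the pose error into the robot frame."""
    x, y, theta = pose
    x_ref, y_ref, theta_ref = ref_pose
    dx, dy = x_ref - x, y_ref - y
    e1 = math.cos(theta) * dx + math.sin(theta) * dy
    e2 = -math.sin(theta) * dx + math.cos(theta) * dy
    # Wrap the heading error to (-pi, pi]; this wrapping is an added safeguard.
    e3 = math.atan2(math.sin(theta_ref - theta), math.cos(theta_ref - theta))
    return e1, e2, e3


def lyapunov_control(error, v_ref, w_ref, zeta=1.0, b=10.0):
    """Return the commanded (v, omega) for the given frame error."""
    e1, e2, e3 = error
    k1 = k3 = 2.0 * zeta * math.sqrt(w_ref ** 2 + b * v_ref ** 2)
    k2 = b
    # sin(e3) / e3 tends to 1 as e3 tends to 0; guard the division.
    sinc = math.sin(e3) / e3 if abs(e3) > 1e-9 else 1.0
    u1 = -k1 * e1
    u2 = -k2 * v_ref * sinc * e2 - k3 * e3
    v = v_ref * math.cos(e3) - u1
    omega = w_ref - u2
    return v, omega


error = tracking_error((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
command = lyapunov_control(error, v_ref=1.0, w_ref=0.0, zeta=1.0, b=1.0)
```

In this example the robot is one meter behind the reference along its own heading, so the command speeds up the linear velocity and leaves the angular velocity at its reference value.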
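The control law defined by (17) and (18) provides the feedback commands, which act together with the feed-forward commands computed from the reference trajectory,
$$v_r = \pm\sqrt{\dot{x}_r^2 + \dot{y}_r^2} \tag{20}$$
$$\omega_r = \frac{\dot{x}_r\ddot{y}_r - \dot{y}_r\ddot{x}_r}{\dot{x}_r^2 + \dot{y}_r^2} \tag{21}$$
with the reference orientation $\theta_r = \operatorname{arctan2}(\dot{y}_r, \dot{x}_r) + k\pi$, where $k$ is 0 or 1 for forward or backward motion respectively. The feedback commands are adaptively computed based on NN estimation, as described in the following subsection.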
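Feed-forward driving and steering velocities are commonly obtained by differentiating the reference path. The sketch below assumes the standard expressions for a unicycle-type robot: speed from the path velocity, turn rate from the path curvature, and heading from the velocity direction with an offset of pi for reverse driving. The function name and the unit-circle check are illustrative.

```python
import math


def feedforward(dx, dy, ddx, ddy, k=0):
    """Reference speed, turn rate and heading from path derivatives.

    k = 0 selects forward motion and k = 1 backward motion.
    Assumes a nonzero reference speed (dx and dy not both zero).
    """
    speed_sq = dx * dx + dy * dy
    sign = 1.0 if k == 0 else -1.0
    v_ref = sign * math.sqrt(speed_sq)
    w_ref = (dx * ddy - dy * ddx) / speed_sq
    theta_ref = math.atan2(dy, dx) + k * math.pi
    return v_ref, w_ref, theta_ref


# Check on a unit circle x = cos(t), y = sin(t) at t = 0.
v_ref, w_ref, theta_ref = feedforward(dx=0.0, dy=1.0, ddx=-1.0, ddy=0.0)
```

On the unit circle traversed at unit rate, the reference speed and turn rate are both one and the heading points along the positive y axis, which gives a quick sanity check of the formulas.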

Controller Parameter Estimation using the Neural Network
In order to optimize the controller for fast convergence to the desired trajectory and improved steady-state performance, the NN is utilized to adaptively estimate the feedback control parameters. Sample runs of the traditional closed-loop system are performed for 15 Figure 3. The figure shows good generalization performance, as the test error is approximately equal to the training error. The trained NN estimator is integrated in the second loop of Figure 2, where a normalizer and denormalizer are embedded for better control performance. Simulation results of the adaptive closed-loop control system are presented in Section 4.
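The normalizer and denormalizer placed around the estimator can be sketched as a pair of inverse mappings. The text does not state which normalization was used, so min-max scaling is an assumption here, and the class name, the stand-in network and the numeric ranges are hypothetical.

```python
class MinMaxScaler:
    """Map values from the range seen in training data to [0, 1] and back."""

    def __init__(self, training_values):
        self.low = min(training_values)
        self.high = max(training_values)
        # Guard against a constant column, which would give a zero span.
        self.span = (self.high - self.low) or 1.0

    def normalize(self, value):
        return (value - self.low) / self.span

    def denormalize(self, scaled):
        return self.low + scaled * self.span


def estimate_gain(state, reference, network, input_scaler, gain_scaler):
    """Normalize the inputs, query the network, and rescale its output."""
    scaled_inputs = [input_scaler.normalize(state), input_scaler.normalize(reference)]
    return gain_scaler.denormalize(network(scaled_inputs))


input_scaler = MinMaxScaler([2.0, 4.0, 10.0])
gain_scaler = MinMaxScaler([0.0, 20.0])
# Hypothetical stand-in for the trained estimator: averages its scaled inputs.
gain = estimate_gain(4.0, 10.0, lambda z: sum(z) / len(z), input_scaler, gain_scaler)
```

Keeping the scalers fitted on the training data ensures that the online inputs are presented to the network in the same range it saw during training.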

2-DOF Helicopter Position Control
A Two-Input-Two-Output (TITO) Quanser 2D Helicopter model [35] is considered. The linearized model is obtained in [36] as follows.
where $\theta$ and $\psi$ are the pitch and yaw angles respectively. The first and second derivatives of the pitch and yaw angles are $\dot{\theta}$, $\dot{\psi}$, $\ddot{\theta}$ and $\ddot{\psi}$ respectively. The pitch and yaw propeller voltages $V_{mp}$ and $V_{my}$ are applied to the corresponding motors driving the propellers.

Traditional Control Design
The pitch and yaw angles of the 2-DOF helicopter were regulated using the optimal LQR-based control design presented in [29]. The control law was obtained as, The realized state feedback gains are, The system states are represented by $x(t)$, whereas $e(t)$ is the error feedback signal. The adaptive computation of the error feedback gains using the NN is described next.

Controller Parameter Estimation using the Neural Network
In order to achieve better accuracy and faster tracking convergence, the NN is employed to optimize the error feedback control gains. In this case, six sequences of 700 state transitions totaling 4195 data samples were sufficient, since the plant has been linearized. The input features are the current angular orientation

Simulation Results
Closed-loop simulation was conducted in MATLAB Simulink for both the traditional and the NN-optimized control systems. The results for different initial state shifts were compared to demonstrate the effectiveness of the proposed scheme. The desired eight-shaped trajectory for the mobile robot was defined as, The feed-forward driving and steering velocities as described by (20) and (21) are shown in Figures 5 and 6 respectively. Figure 7 is the result of an underdamped control system due to a low damping ratio, whereas Figure 8 is the result of an overdamped control system due to a high damping ratio. The result of a critically damped control system with unity damping ratio is shown in Figure 9, whereas Figure 10 is the corresponding trajectory tracking performance of the NN-optimized control system. It can be observed that the NN-optimized controller outperformed the different design modes of the Lyapunov-based controller in terms of faster convergence to the desired trajectory and steady-state error minimization. The steady state is the region starting at the point where the robot settles around the reference trajectory. Although either fairly good convergence or fairly good steady-state performance can be obtained by varying the control parameters of the Lyapunov-based controller, the two improvements are mutually exclusive. The NN-optimized controller, however, achieved better transient and steady-state performance concurrently by adaptively obtaining varying controller parameters. Also in this case, linear and angular displacements are measured in meters and radians respectively.
It can be observed, as in the previous example, that the NN-optimized controller of Figure 14 showed better control performance than the different variations of the traditional Lyapunov-based controller. To further show the performance advantage of the NN-optimized controller, the average linear and angular tracking errors over the entire trajectory are presented in Table 1 for the different initial state errors. It can be seen that both the resultant linear and angular tracking errors are minimized using the NN-optimized controller.
where Q is associated with both the state and error feedback signals. The performances of the NN-optimized controller are shown in Figures 16 and 18 for pitch and yaw angle control respectively. Since the system is linearized, modest performance improvement is observed with respect to response speed and steady-state error minimization. Table 2 shows the comparative tracking errors.
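Average tracking errors of the kind reported in the tables could be computed along the following lines for the mobile robot case. The exact metric is not stated in the text, so this sketch assumes the mean Euclidean position error and the mean absolute heading error, with the heading difference wrapped to (-pi, pi]. The sample poses are made up for the check.

```python
import math


def average_tracking_errors(actual, reference):
    """Mean position error and mean absolute heading error over a run.

    Both arguments are equally long lists of (x, y, theta) poses.
    """
    n = len(actual)
    linear = sum(
        math.hypot(xr - x, yr - y)
        for (x, y, _), (xr, yr, _) in zip(actual, reference)
    ) / n
    angular = sum(
        abs(math.atan2(math.sin(tr - t), math.cos(tr - t)))
        for (_, _, t), (_, _, tr) in zip(actual, reference)
    ) / n
    return linear, angular


# Two made-up samples: a 5 m position offset, then a quarter-turn heading offset.
actual = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
reference = [(3.0, 4.0, 0.0), (0.0, 0.0, math.pi / 2.0)]
linear_error, angular_error = average_tracking_errors(actual, reference)
```

Averaging over the entire trajectory combines the transient and steady-state phases into one figure per run, which is convenient when comparing controllers across different initial state errors.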

Conclusion
A neural network-optimized control system design for autonomous vehicle navigation has been proposed in this study. The design consists of an inner error loop integrated with an outer loop that estimates the controller parameters using a neural network trained on samples from test navigation. Comparative studies of the proposed scheme with traditional methods were presented. In the first case, the neural network-optimized control system was shown to outperform a Lyapunov-based controller in terms of faster convergence to the desired trajectory and better steady-state performance for an eight-shaped maneuver. A second illustrative simulation example was conducted for the control of the pitch and yaw angles of a 2-Degree-Of-Freedom helicopter model, where improved transient and steady-state performance was observed for the NN-optimized controller over a Linear Quadratic Regulator-based controller.
Because the neural network structure allows complex and nonlinear mapping of variables, the trained estimator is able to learn the behavior of both linear and nonlinear systems. Furthermore, since the proposed scheme is sample-based, it can compensate for plant modelling errors that degrade the performance of traditional controllers in real-life applications.

Conflict of Interest
The authors declare no conflict of interest.