Synthesis of LQR Controller Based on BAT Algorithm for Furuta Pendulum Stabilization

In this study, a controller design method based on the LQR method and the BAT algorithm is presented for the Furuta pendulum stabilization system. Designing an LQR controller usually relies on the designer's experience or on trial and error to find the Q and R matrices. The BAT search algorithm is based on the characteristics of bat populations in the wild and has advantages in optimizing multivariate objective functions. Here the BAT algorithm improves the LQR controller by optimizing the linear quadratic cost function toward fast response time, low energy consumption, low overshoot, and a small number of oscillations. Swarm optimization algorithms have advantages in finding global extrema of multivariate functions. Therefore, even when the Q and R matrices have a large number of elements, these elements can be found quickly while the matrices still satisfy the Riccati equation. The controller with optimal parameters is verified through simulation in different scenarios. The performance of the proposed controller is compared with that of a conventional LQR controller, and the controller is implemented on a real system.


I. INTRODUCTION
The Furuta pendulum, or rotary inverted pendulum, consists of a control arm that rotates in the horizontal plane and a pendulum attached to that arm that rotates freely in the vertical plane. It was invented in 1992 at the Tokyo Institute of Technology by Katsuhisa Furuta and his colleagues [1]-[4]. This system is typical of the unstable, nonlinear systems of interest in classical and modern control theory. Its strong instability and nonlinearities are due to gravity and to the coupling arising from the Coriolis and centripetal forces. The Furuta pendulum is a popular object in laboratories, used to test and validate linear and nonlinear control laws. The inverted pendulum system has been studied as a model of the dynamics of many processes in controller design [5], such as balancing rockets during vertical launch, stabilizing oil rigs at sea, stabilizing aircraft during takeoff and landing, and stabilizing ship cabins. The inverted pendulum model is an underactuated mechanical system, with fewer actuators than degrees of freedom. Many domestic and foreign studies have designed control laws and control methods for the inverted pendulum system. In [7], [8], the traditional PID controller and its variants are used. Because this controller is by nature an error-driven controller, the design separates the system into two subsystems with separate control signals. In [9], [10], a fast-acting suboptimal control law is presented for the system, but the control law is quite complicated. Studies [11]-[29] present methods for synthesizing sliding mode (SMC), linear feedback, and backstepping controllers for this system, and the simulation results show the effectiveness of the designed control laws. Control law design methods based on fuzzy control theory and neural networks are presented in [30]-[34].
The LQR method is a classic method in linear controller design based on the optimization of a quadratic function. Studies [35]-[48] have presented results of this controller for an inverted pendulum system. The optimal LQR controller is designed from a model of the system structure. Its accuracy and efficiency have been verified, but it is difficult to determine the weight matrices Q and R. Many studies [49]-[53] have presented ways of choosing these matrices, but it is very difficult to satisfy the objective function and some additional criteria at the same time. This paper uses the BAT algorithm, a swarm optimization algorithm in the same family as PSO that has advantages for multivariate problems [54]-[80]. The BAT algorithm learns from the movement of bats searching for prey and avoiding obstacles to build a method for finding the extrema of an objective function [62]-[64]. Starting from the mathematical model of the system, the LQR controller is designed using swarm optimization, taking advantage of the search capability of the swarm algorithm (BAT) to optimize the matrices of the LQR controller. In addition, when updating the matrix parameter values, we use weighting techniques to incorporate the designer's experience into the algorithm's search. The purpose is to obtain the globally optimal matrices Q and R, from which the LQR method yields the optimal state feedback control matrix, and to overcome the disadvantages of selecting Q and R by experience and trial and error. The controller is illustrated by simulation in software and by experimental results on a real system. The effectiveness of the optimized control law is shown by comparison with a conventional LQR control law and by its realization on an embedded system.

II. MATHEMATICAL MODEL OF FURUTA PENDULUM
The Furuta pendulum is composed of a rotating arm attached to a DC motor and a pendulum mounted at the tip of the arm, with the structure shown in Fig. 1. The pendulum moves as an inverted pendulum in a plane perpendicular to the rotating arm. Angle α is the angle of rotation of the arm and angle β is the angle of deflection of the pendulum; these two variables are used as generalized coordinates to describe the rotating inverted pendulum system. The parameters of the inverted pendulum model used in simulating and investigating the system are approximated from the experimental model: $l = 0.1$ (m), length to the pendulum center of mass; $m = 0.1$ (kg), mass of the pendulum; β, pendulum angle (rad), and $\dot{\beta}$, pendulum angular velocity (rad/s); $J = m(2l)^2/3$, moment of inertia of the pendulum link about its center of mass; α, motor shaft position (rad); $J_0 = 0.001$ (kg·m²), moment of inertia of the swing arm. Assuming that friction at the joints is negligible, the Euler-Lagrange method gives a Lagrange function of the form (1).
where τ is the torque generated by the motor. Substituting equation (1) into (2) gives the system of equations of the inverted pendulum (3).
Converting the system of equations (3) to state-space form with $x_1 = \alpha$; $x_2 = \dot{\alpha}$; $x_3 = \beta$; $x_4 = \dot{\beta}$ gives the system of equations (4).
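For orientation, the derivation steps described above can be written in their general textbook form. This is only a sketch of the Euler-Lagrange formalism under the stated assumptions (generalized coordinates α and β, negligible joint friction, motor torque acting on the arm, written here as τ); it does not reproduce the specific expressions (1)-(4) of this model, and the symbols $T$ and $V$ for kinetic and potential energy are notation introduced here.

```latex
\begin{aligned}
L(\alpha,\beta,\dot{\alpha},\dot{\beta}) &= T - V, \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{\alpha}} - \frac{\partial L}{\partial \alpha} &= \tau, \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{\beta}} - \frac{\partial L}{\partial \beta} &= 0, \\
x &= \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}^{T}
   = \begin{bmatrix} \alpha & \dot{\alpha} & \beta & \dot{\beta} \end{bmatrix}^{T}.
\end{aligned}
```

The right-hand side of the β equation is zero because the pendulum joint is unactuated, which is what makes the system underactuated.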

LQR Method
The LQR method is one of the most classic optimal controller design methods in the development of modern control theory [49]-[53]. LQR control determines a control law for a given system that minimizes one or more quality indices. The optimal LQR control problem is to find the control input $u^*(t)$ that drives the system to a steady state while ensuring that the performance index J reaches its minimum value; the expression for J is given in (6).
Here $u^* = -R^{-1}B^T P x = -Kx$, and $P$ is the solution of the algebraic Riccati equation [7], [19] $PA + A^T P + Q - PBR^{-1}B^T P = 0$. In a study (13), the matrices Q and R constrain each other. The value of Q is proportional to the anti-interference ability of the system: increasing Q enhances the anti-interference ability and shortens the settling time of the system, but at the same time it increases system oscillations and energy consumption. Increasing R reduces the power consumption of the system but lengthens the settling time. Therefore, the key to the design is to find the correct weight matrices Q and R. Once Q and R are fixed, the state feedback matrix K is uniquely determined. However, in the usual LQR design process the choice of Q and R depends entirely on experience and trial and error. The resulting subjectivity leads to an imperfect controller design and affects the control efficiency.
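The relation between Q, R, the Riccati solution, and the feedback gain can be illustrated numerically. The following is a minimal sketch using SciPy's `solve_continuous_are`; the plant is a hypothetical double integrator chosen only because its gain is known in closed form, not the Furuta pendulum model (4), and the helper name `lqr_gain` is introduced here for illustration.

```python
import numpy as np
from scipy.linalg import solve_continuous_are


def lqr_gain(A, B, Q, R):
    """Solve PA + A^T P + Q - P B R^{-1} B^T P = 0 and return (K, P)."""
    P = solve_continuous_are(A, B, Q, R)
    K = np.linalg.solve(R, B.T @ P)  # K = R^{-1} B^T P
    return K, P


# Hypothetical plant (assumption, NOT the pendulum model): double integrator.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.diag([1.0, 1.0])   # state weights
R = np.array([[1.0]])     # control weight

K, P = lqr_gain(A, B, Q, R)

# Residual of the Riccati equation and closed-loop poles of A - BK.
residual = P @ A + A.T @ P + Q - P @ B @ np.linalg.solve(R, B.T @ P)
closed_loop_poles = np.linalg.eigvals(A - B @ K)
```

For this plant the gain is K = [1, √3], and all closed-loop poles lie in the left half-plane. Repeating the call with a larger R gives smaller gains, consistent with the trade-off described above.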

Basics of BAT Algorithm
The standard BAT algorithm was developed by Xin-She Yang [54]. The main features of BAT are based on the echolocation behavior of microbats. Since BAT uses frequency tuning, it is the first algorithm of its kind to combine optimization and computational intelligence. Each bat is encoded with a velocity $v_i^t$ and a position $x_i^t$, at iteration $t$, in the $d$-dimensional search space. The position can be thought of as a solution vector for a given objective function. Among the $n$ bats of the population, the current best solution $x^*$ found so far is stored during the iterative search. The algorithm uses the following approximation or idealization rules [54], [55]:
− All bats use echolocation to sense distance, and they also know the difference between food, prey, and obstacles.
− A bat flies randomly with velocity $v_i$ at position $x_i$. It can automatically adjust the frequency $f$ (or wavelength) of the emitted pulses and adjust the pulse emission rate $r \in [0, 1]$, depending on the proximity of the target.
− Although the loudness can vary in many ways, it is assumed that it varies from a large value $A_0$ to a minimum value $A_{\min}$.
For the sake of simplicity, many studies do not use ray tracing in this algorithm, but instead vary the frequency $f$ or the wavelength $\lambda$ to suit different applications, depending on ease of implementation and other factors. The algorithm flowchart is presented in [54].
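The rules above can be sketched as a small program. This is a minimal illustration of the frequency-tuned velocity and position updates with constant loudness and pulse emission rate, assuming a box-constrained search space; the function name `bat_minimize`, the frequency range, the random-walk scale 0.1, and the sphere test function are all choices made here for illustration, not values from this paper.

```python
import numpy as np


def bat_minimize(f, lower, upper, n_bats=20, n_iter=100,
                 f_lo=0.0, f_hi=2.0, loudness=0.5, pulse_rate=0.5, seed=0):
    """Minimal bat algorithm sketch with constant loudness and pulse rate."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    x = rng.uniform(lower, upper, size=(n_bats, dim))  # bat positions
    v = np.zeros((n_bats, dim))                        # bat velocities
    cost = np.array([f(xi) for xi in x])
    i_best = int(np.argmin(cost))
    best, best_cost = x[i_best].copy(), float(cost[i_best])
    history = [best_cost]
    for _ in range(n_iter):
        for i in range(n_bats):
            # Frequency tuning, then velocity and position update.
            freq = f_lo + (f_hi - f_lo) * rng.random()
            v[i] = v[i] + (x[i] - best) * freq
            y = x[i] + v[i]
            # Local search: random walk around the current best solution.
            if rng.random() > pulse_rate:
                y = best + 0.1 * loudness * rng.standard_normal(dim)
            y = np.clip(y, lower, upper)
            cost_new = float(f(y))
            # Conditional acceptance (uniform draw compared with loudness).
            if cost_new <= cost[i] and rng.random() < loudness:
                x[i], cost[i] = y, cost_new
            if cost_new < best_cost:
                best, best_cost = y.copy(), cost_new
        history.append(best_cost)
    return best, best_cost, history


def sphere(z):
    """Illustrative test objective with its minimum at the origin."""
    return float(np.sum(z ** 2))


best, best_cost, history = bat_minimize(sphere, [-5.0] * 3, [5.0] * 3)
```

The returned `history` records the best cost after each iteration and is non-increasing by construction, which makes it convenient for plotting convergence.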

Synthesis of LQR Control Law Based on BAT Algorithm for the Furuta Pendulum System
The block diagram of controller synthesis using the LQR method with the BAT algorithm is shown in Fig. 2.

Fig. 2. Controller synthesis block diagram
The objective function of the LQR method for the Furuta pendulum system, with the matrices Q and R chosen as positive definite diagonal matrices, has the form in equation (7).
where T is the survey time used by the BAT algorithm and $q_1, q_2, q_3, q_4$ are the diagonal elements of Q. The problem is to find values of $q_1, q_2, q_3, q_4$ and of the weight R such that J is less than a given value while the matrices Q, R still satisfy the Riccati equation. At the same time, when the search values are updated during optimization, the authors propose several techniques for updating the searched parameters according to the desired quality criteria of the system output, such as fast response time, small overshoot, and low power consumption. These update techniques are modeled through an update factor for each value in the algorithm, prioritizing the importance of the control goal of each output state.
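To make the role of the objective function concrete, the sketch below evaluates a finite-horizon quadratic cost of the standard form, the integral from 0 to T of $x^T Q x + u^T R u$ with $u = -Kx$, for a given set of diagonal weights. It assumes a linear model and uses a hypothetical double integrator instead of the pendulum model (4); the function name `lqr_cost` and the integration tolerances are choices made here, and the exact form of (7) in this paper may differ.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are


def lqr_cost(A, B, q_diag, r, x0, T):
    """Return (J, K): cost over [0, T] of the LQR law built from diag(q), r."""
    Q = np.diag(q_diag)
    R = np.array([[r]])
    P = solve_continuous_are(A, B, Q, R)
    K = np.linalg.solve(R, B.T @ P)          # state feedback gain
    A_cl = A - B @ K                          # closed-loop dynamics
    W = Q + K.T @ R @ K                       # x^T W x = x^T Q x + u^T R u

    def rhs(t, z):
        x = z[:-1]
        # Augment the state with the running cost so both integrate together.
        return np.append(A_cl @ x, x @ W @ x)

    sol = solve_ivp(rhs, (0.0, T), np.append(x0, 0.0), rtol=1e-8, atol=1e-10)
    return float(sol.y[-1, -1]), K


# Hypothetical plant (assumption, NOT the pendulum model): double integrator.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
J, K = lqr_cost(A, B, q_diag=[1.0, 1.0], r=1.0,
                x0=np.array([1.0, 0.0]), T=10.0)
```

A search algorithm such as BAT can call a function of this kind once per candidate weight vector and compare the returned J against the required threshold. For this example J is close to the infinite-horizon value $x_0^T P x_0 = \sqrt{3}$, because the state has essentially decayed by T = 10 s.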
The main parts of the bat algorithm are summarized step by step together with the algorithm listing; the final step is as follows:
− In the last step, find the best solution (line 16), the current best solution is updated.

Numerical Simulation Results
To synthesize the LQR controller, the authors first choose the following parameter determination scenario: bring the pendulum from the initial state $x_0 = [x_1(0)\ 0\ 0\ 0]^T$ to the equilibrium position $[0\ 0\ 0\ 0]^T$ within 10 (s) such that the value of the objective function J is less than the given threshold of 25, with the control goals prioritized as follows: the stabilization time is less than 2 (s) without overshoot of the arm, and the control torque is less than 5 (N·m). The initial parameter values of the matrices Q, R are $q_1 = 5.0$; $q_2 = 5.0$; $q_3 = 5.0$; $q_4 = 5.0$; R = 5.0. The controller parameters found with the traditional LQR method are given in (8).
The update weights are $w_{1,3} = 0.2$, $w_{2,4} = 1$, $w_5 = 0.1$. The initial data for the BAT algorithm are the number of bats $N_p = 28$, the loudness A = 0.33, and the pulse emission rate r = 0.28. Running the algorithm with these parameters yields the diagonal values of Q and R in 14 search steps; the values at each step are shown in Fig. 3. The coefficients related to the pendulum angle position, $q_3$, and the arm position, $q_1$, first increase rapidly, ensuring a fast effect on these variables, and then decrease slightly. The coefficients related to the angular velocities, $q_2$ and $q_4$, decrease rapidly to 0.1, showing that changes in these variables have little effect on the objective function J. The parameter R first increases relative to its initial value and then decreases slightly, which shows that the amplitude of the control signal during the transient will be smaller than with the control law obtained from the initial Q, R matrices. The final values that satisfy the requirements of the objective function are: $q_1 = 5.6247$; $q_2 = 0.10$; $q_3 = 8.8897$; $q_4 = 0.10$; R = 5.8040. With these values, solving the Riccati equation gives the controller parameters in (9).
In the first scenario, with the same initial condition as above, the response of system (4) to the two control laws (8) and (9) is shown in Fig. 4. Fig. 4(a) shows the response of the arm deflection angle: the system moves to the desired position without overshoot and with a fast response time. To bring the system quickly to the desired point, the pendulum oscillates with a larger amplitude in the transient phase but stabilizes faster at the equilibrium point, as shown in Fig. 4(b). This is achieved because the technique of adding weights to the searched variables in the objective function produces a control torque that takes the response time into account. This is also reflected in the value of controller (9): its gains have the same signs as those of the conventional LQR controller, but their values give preference to the variables $x_1$, $x_2$. The comparison of control quality is shown in Table I. At the initial time the control signal amplitude of the conventional LQR law is larger, whereas the control signal of the proposed LQR law is larger during the transient period. This indicates that the objective function J with the BAT algorithm and the weighting techniques guarantees only fast response.
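Quality indicators of the kind compared in this scenario, such as stabilization time and overshoot, can be computed directly from a sampled response. The sketch below is one possible way to do so; the function names, the absolute tolerance band of 0.02, and the exponential test signal are illustrative assumptions and are not taken from Table I.

```python
import numpy as np


def settling_time(t, y, y_ref, band):
    """First sample time after which |y - y_ref| stays within `band`."""
    outside = np.abs(y - y_ref) > band
    if not outside.any():
        return float(t[0])
    last = int(np.nonzero(outside)[0][-1])   # last sample outside the band
    if last == len(t) - 1:
        return float("inf")                   # never settles in the record
    return float(t[last + 1])


def overshoot(y, y0, y_ref):
    """Largest excursion past y_ref in the direction of travel (>= 0)."""
    direction = np.sign(y_ref - y0)
    return float(max(0.0, np.max(direction * (y - y_ref))))


# Illustrative response (assumption): monotone decay from 1 rad to 0 rad.
t = np.linspace(0.0, 10.0, 1001)
y = np.exp(-2.0 * t)

ts = settling_time(t, y, y_ref=0.0, band=0.02)
os_value = overshoot(y, y0=1.0, y_ref=0.0)
```

For this monotone test signal the overshoot is zero and the settling time is 1.96 s, the first grid point at which exp(-2t) drops below 0.02. Applied to the arm and pendulum responses of both control laws, the same two functions would give directly comparable numbers.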

Experimental Results
The real-time implementation of the controller was performed using the STM32F4 Discovery board and the STM32Cube IDE embedded programming software (Fig. 7). The STM32F4 microcontrollers are powerful real-time controllers with fast processing and low cost. The embedded control system for the Furuta pendulum was built at the Control System Design Laboratory, Department of Automation Engineering, Le Quy Don University of Technology. The embedded control system is divided into data acquisition, signal processing, and output control, and includes relative encoder sensors, stepless rheostats, power amplifiers, and electrical circuits. In addition, the system is connected to a computer via an RS232 port so that system parameters can be monitored through software. The purpose of the system is to stabilize the pendulum vertically at the set arm angle positions, and the sampling time is 2 ms. Fig. 8 shows the connection of the blocks of the real Furuta pendulum system. The experimental scenario uses the two control laws LQR and LQR_BAT, with the arm set value changing to 1 (rad) and -1 (rad), respectively. Experimental results for the angles α, β of the swing arm and pendulum are shown in Fig. 9 (a) and (b). The experimental results show that the LQR control law designed with the BAT algorithm responds well on the real system. Compared with the conventional LQR control law, the stabilization time is shorter, there are fewer oscillations, and the deviation from the set value is smaller. However, the control law was designed using only a mathematical model, without accounting for all properties of the real system, such as the position accuracy of the motor gearbox and the friction between joints. Therefore, the results on the real system are not as good as in simulation.

V. CONCLUSION
Based on the theory of the LQR controller combined with the bat swarm algorithm (BAT algorithm), the authors have synthesized an LQR controller whose Q, R matrix parameters are optimized to control the balance of the Furuta pendulum system. The simulation and experimental results show that building an optimal LQR controller with matrix parameters Q, R combined with weighting techniques ensures that the objective function meets the requirements while giving priority to some desired quality indicators. Comparison with the classic LQR controller shows the advantages of the proposed method. In addition, the controller has been realized on a real system and has proven effective. The simulation and experimental results demonstrate that the method can design a feedback controller that satisfies the quadratic criterion function and achieves better control quality, owing to the ability of the BAT algorithm to find global solutions. In future studies, we will conduct an online search for the Q and R matrices on real models, combined with the Lyapunov function method for the case when the system is far from the equilibrium point.

Original BAT algorithm
Input: Bat population $q_i = (q_{i1}, \dots, q_{i4}, R_i)$ for $i = 1 \dots N_p$, MAX_FE
Output: The best solution qbest, the solution of the Riccati equation (Kbest), and its corresponding value Jmin = min(J(Kbest)).
Line 1: init_bat();
Line 2: eval = evaluate_the_new_population;
Line 3: Jmin = find_the_best_solution(Kbest); {initialization}
Line 4: while termination_condition_not_meet do
Line 5: for i = 0 to Np do
Lines 6-12: …
Line 13: if Jnew ≤ Ji and N(0,1) < Ai then
Line 14: qi = y; K_besti = K; Ji = Jnew;
Line 15: end if {save the best solution conditionally}
Line 16: Jmin = find_the_best_solution(Kbest);
Line 17: end for
Line 18: end while

− The first step is initialization (lines 1-3). In this step, we initialize the parameters of the algorithm, generate and evaluate the initial population, and then determine the best solution qbest for the population.
− The second step is to generate a new solution (line 6). Here, virtual bats are moved in the search space according to the updating rules of the bat algorithm.
− The third step is a local search step (lines 7-10). The best solution is improved using random walks. In this step, the five values to be searched (the four diagonal elements of Q and the element of R), indexed $i = \overline{1,5}$, are updated with weights $w_i$ ($i = \overline{1,5}$) that assign different priorities according to the control objective, $q_{\text{new}} = q_{\text{old}} + w_i\,\xi_i$, where $\xi$ is a $1 \times 5$ vector of random numbers. In addition, the parameters are bounded so that the matrices Q, R are always positive, which ensures the existence of solutions of the Riccati equation.
− In the fourth step, evaluate the new solution (line 11), the evaluation of the new solution is carried out.
− In the fifth step, save the best solution conditionally (lines 13-15), conditional archiving of the best solution takes place.
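The weighted local search step with positivity bounds can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the random term is drawn from a standard normal distribution, the lower bound 0.1 and upper bound 100 are hypothetical limits (the lower one suggested only by the values at which $q_2$ and $q_4$ settle), and the function name `weighted_local_step` is introduced here. The initial values and weights are those quoted in the simulation section.

```python
import numpy as np


def weighted_local_step(q_best, weights, rng, lower=0.1, upper=100.0):
    """Random walk around q_best, scaled per parameter, kept strictly positive."""
    step = weights * rng.standard_normal(q_best.size)  # assumed distribution
    return np.clip(q_best + step, lower, upper)        # assumed bounds


rng = np.random.default_rng(0)

# Search vector: four diagonal elements of Q followed by the element of R.
q_best = np.array([5.0, 5.0, 5.0, 5.0, 5.0])
# Weights from the text: w_{1,3} = 0.2, w_{2,4} = 1, w_5 = 0.1.
weights = np.array([0.2, 1.0, 0.2, 1.0, 0.1])

q_new = weighted_local_step(q_best, weights, rng)

# Rebuild the weighting matrices; positivity keeps both positive definite.
Q_new = np.diag(q_new[:4])
R_new = np.array([[q_new[4]]])
```

A larger weight lets the corresponding parameter move further per step, which is one way to encode the priority of each control goal in the search. Clipping to a positive lower bound keeps Q and R positive definite so that the Riccati equation remains solvable for every candidate.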

Fig. 3. Values of matrix parameters Q, R through search steps by BAT algorithm


The control signals of the two control laws are shown in Fig. 5.

Fig. 5. Control signal

In the second scenario, an external noise component or model error is present, with the two noise components taking random values in the range [-4; 4], and the arm set value changes in a staircase pattern. The response of the system is shown in Fig. 6 (a) and (b). The graphs show that with the proposed control law the response of the lever arm (Fig. 6 (a)) is better: the response time is shorter, and the steady-state error stays within ±0.05 (rad), compared with ±0.1 (rad) for the conventional LQR, while the arm remains stable at the set value. For the pendulum, both control laws keep it stable under the influence of random noise, and the designed control law keeps the system stable within ±0.01 (rad). However, in the transient phase the overshoot is larger than with the conventional LQR controller.

Fig. 6.Angular response of arm (a) and pendulum (b) when there is a noise component and the set value changes

Fig. 9. Swing arm angle response (a) and pendulum deflection angle (b) on an experimental model

TABLE I. COMPARISON TABLE OF CONTROL QUALITY INDICATORS