Demonstration of MPC implementation by reverse engineering
E. N. Hartley, Control Group, Department of Engineering, University of Cambridge.
email: edward.hartley@eng.cam.ac.uk
The objective of this tutorial is to demonstrate how to reproduce the results from the example in the paper
[1] E. N. Hartley and J. M. Maciejowski, "Designing output-feedback predictive controllers by reverse engineering existing LTI controllers", IEEE Transactions on Automatic Control, 2013. DOI: 10.1109/TAC.2013.2258781
The source code to this tutorial can be found in ../reverse_engineering_tutorial.m
Contents
- Prerequisites
- Objectives
- Initial data
- Augment plant model with disturbance model
- Closed loop system matrix with baseline controller
- Decomposition into observer-compensator form
- Compute Moore-Penrose Pseudo-inverse of T and its nullspace
- State space system matrices for the observer and static Youla parameter
- Form the state space matrices for the prefilter
- Form the (unconstrained) target calculator
- Compare control systems so far
- Find inverse-optimal Q, R, S matrices to match Kc1
- Form the MPC controller using YALMIP
- Simulation (four separate systems)
- Plotting the simulation results
- References
Prerequisites
This demonstration assumes that the reader has the following software:
- MATLAB (any recent-ish version)
- YALMIP [6] (can be obtained from http://tbxmanager.com )
- An SDP solver compatible with YALMIP (e.g. SDPT3 or SeDuMi) (can be obtained from http://tbxmanager.com )
Objectives
This tutorial demonstrates how to take a state space plant model, and an existing discrete-time reference tracking controller with integral action, and transform these into an equivalent predictive controller (when constraints are inactive), by exploiting observer-gain controller realisations, and extending these to include a target calculator and pre-filter.
The objective of this exercise is to
- Transform the baseline controller into (a) an observer; (b) a target calculator; (c) a gain; and (d) a prefilter.
- Use inverse optimality to find an MPC cost function that yields a controller that is equivalent to the control gain computed in the previous step.
- Enhance the MPC-based realisation of the baseline controller by adding input and state constraints.
Define the following mathematical symbols:
$x(k)$ is the plant state
$u(k)$ is the plant input
$y(k)$ is the plant output
$r(k)$ is the reference signal
$d(k)$ is an unmeasured input disturbance
The plant model takes the form
$$x(k+1) = A\,x(k) + B\,\left(u(k) + d(k)\right)$$
$$y(k) = C\,x(k)$$
The baseline controller takes the form
$$x_K(k+1) = A_K\,x_K(k) + B_K\,\left(y(k) - r(k)\right)$$
$$u(k) = C_K\,x_K(k) + D_K\,\left(y(k) - r(k)\right)$$
Initial data
% Sampling period
Ts = 0.25;
% Plant matrices
A = [1 0.25; 0 1];
B = 1/500*[0.03125; 0.25];
C = [1 0];
D = 0;
% Baseline controller matrices
kp = 125; kd = 350; tau = 0.3;
K1d = tf([(kp*Ts+2*kd), (kp*Ts-2*kd)], [Ts+2*tau, Ts-2*tau], Ts);
K2d = tf([20.25, -19.75], [20 -20], Ts);
K0d = [-K1d*K2d];
K0d = ss(K0d);
[AK,BK,CK,DK] = ssdata(K0d)
AK =
1.4118 -0.8235
0.5000 0
BK =
32
0
CK =
13.0133 -26.1415
DK =
-871.0478
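As a quick cross-check of the recursions defined above, the first closed-loop step can be reproduced with the numeric matrices just printed. The sketch below is illustrative Python (not part of the original MATLAB tutorial), hard-coded to the 2-state case:

```python
# Illustrative cross-check (not from the tutorial): one closed-loop step of
# the plant/controller recursions, using the numeric matrices printed above.

A  = [[1.0, 0.25], [0.0, 1.0]]
B  = [0.03125 / 500, 0.25 / 500]   # B as a flat column vector
C  = [1.0, 0.0]
AK = [[1.4118, -0.8235], [0.5000, 0.0]]
BK = [32.0, 0.0]
CK = [13.0133, -26.1415]
DK = -871.0478

def step(x, xk, r):
    """u = CK*xk + DK*(y - r); then update controller and plant states."""
    y = C[0] * x[0] + C[1] * x[1]
    e = y - r
    u = CK[0] * xk[0] + CK[1] * xk[1] + DK * e
    xk_next = [AK[0][0] * xk[0] + AK[0][1] * xk[1] + BK[0] * e,
               AK[1][0] * xk[0] + AK[1][1] * xk[1] + BK[1] * e]
    x_next = [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u,
              A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u]
    return x_next, xk_next, u

# From rest (x = xk = 0) with r = 0.01, the first input is
# u = DK * (0 - 0.01) = 8.710478
x, xk, u = step([0.0, 0.0], [0.0, 0.0], 0.01)
```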
Augment plant model with disturbance model
Before converting to an observer-based form, one can augment the plant model with a disturbance model. This is a common procedure in MPC design for offset-free target tracking.
Aaug = [A B;
zeros(1,2) 1];
Baug = [B;0];
Caug = [C 0];
Daug = D;
nx = 2; % Number of real states
nd = 1; % Number of disturbance states
nxd = nx+nd; % Total states
nu = 1; % Number of inputs
ny = 1; % Number of outputs
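The block structure of the augmentation can be sketched in a few lines of illustrative Python (the `augment` helper is hypothetical, not from the tutorial): the disturbance state enters through the same channel as the input (the column B) and is modelled as a constant, $d(k+1) = d(k)$.

```python
# Sketch of the disturbance-augmented model Aaug = [[A, B], [0, 1]]:
# the disturbance shares the input channel B and is held constant.
A = [[1.0, 0.25], [0.0, 1.0]]
B = [0.03125 / 500, 0.25 / 500]

def augment(A, B):
    """Return Aaug = [[A, B], [0, 1]] and Baug = [B; 0] as nested lists."""
    n = len(A)
    Aaug = [A[i] + [B[i]] for i in range(n)] + [[0.0] * n + [1.0]]
    Baug = B + [0.0]
    return Aaug, Baug

Aaug, Baug = augment(A, B)
```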
Closed loop system matrix with baseline controller
% Form closed loop system matrix
Acl = [Aaug+Baug*DK*Caug, Baug*CK;
       BK*Caug,           AK];
% Check closed loop eigenvalues
eig(Acl)
ans =
   0.5579
   0.9115 + 0.1192i
   0.9115 - 0.1192i
   0.9764
   1.0000
Decomposition into observer-compensator form
The baseline controller can be transformed into an observer-compensator form by considering the state of the baseline controller as an estimate of a linear transformation of the plant state:
$$x_K(k) = T\,x_{aug}(k).$$
The transformation matrix $T$ can be computed by solving a non-symmetric Riccati equation of the form:
$$\left[\begin{array}{cc}-T & I\end{array}\right]
\left[\begin{array}{cc}A_{aug}+B_{aug}D_KC_{aug} & B_{aug}C_K\\ B_KC_{aug} & A_K\end{array}\right]
\left[\begin{array}{c}I\\T\end{array}\right] = 0.$$
For a full derivation see references [2] and [3].
The non-symmetric Riccati equation is solved using an invariant subspace-based method. (For details, see e.g. references [4] or [5].)
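In outline, the invariant-subspace method works as follows: after re-ordering the Schur form so that the eigenvalues assigned to the state-feedback dynamics appear first, partition the orthogonal basis $U$ conformally with the plant and controller states and recover

$$U = \left[\begin{array}{cc} U_{11} & U_{12}\\ U_{21} & U_{22} \end{array}\right], \qquad T = U_{21}U_{11}^{-1},$$

where the columns of $\left[\begin{array}{c}U_{11}\\U_{21}\end{array}\right]$ span the invariant subspace of $A_{cl}$ associated with the selected eigenvalues. This is what the `ordschur` call and the matrix right-division step below implement.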
Compute Schur decomposition of Acl
[schur_U, schur_T] = schur(Acl);
Check ordering of eigenvalues
schur_ordeig = ordeig(schur_T)
schur_ordeig = 0.5579 0.9115 + 0.1192i 0.9115 - 0.1192i 0.9764 1.0000
Re-order the Schur decomposition so that the closed loop poles are distributed between observer and pure state feedback dynamics as per Option 1 in the paper.
cluster = zeros(1,size(Acl,1));
for k1 = [1 0.976 0.558]
    % Identify the positions of the eigenvalues in the vector that should
    % be in the pure state feedback dynamics. We have to do this, since
    % the ordering in the initial Schur decomposition varies between
    % versions of MATLAB.
    [~,I] = min(abs(schur_ordeig - k1)); % abs() makes "closest eigenvalue" explicit
    cluster(I) = 1;
end
% Reorder the Schur decomposition
[ordschur_U, ordschur_T] = ordschur(schur_U, schur_T, cluster);
ordschur_ordeig = ordeig(ordschur_T)
ordschur_ordeig =
   0.5579
   0.9764
   1.0000
   0.9115 + 0.1192i
   0.9115 - 0.1192i
Compute solution T to non-symmetric Riccati equation
T = ordschur_U(nxd+1:end,1:nxd)/ordschur_U(1:nxd,1:nxd)
T =
   1.0e+03 *
   -2.4376   -0.9519    0.0174
   -1.2436   -0.4386    0.0087
Verify that T solves the non-symmetric Riccati equation. The value of "tmp" should be zero, or very close to it.
tmp = [-T eye(size(T,1))]*Acl*[eye(size(T,2));T]
tmp =
1.0e-08 *
0.3351 0.1121 -0.0023
0.1693 0.0568 -0.0012
Compute Moore-Penrose Pseudo-inverse of T and its nullspace
Tplus = pinv(T); Tnull = null(T);
Compute initial candidate controller and observer gains
Kc_cand = DK*Caug + CK*T;
AKf_cand = Tplus*BK - Baug*DK;
Verify closed loop poles for state feedback and observer error dynamics
eig_state = eig(Aaug+Baug*Kc_cand)
eig_obs = eig(Aaug-AKf_cand*Caug)
eig_state =
0.9764
0.5579
1.0000
eig_obs =
0.9115 + 0.1192i
0.9115 - 0.1192i
1.0000
Use the degrees of freedom in the pseudo-inverse of T to move the extra observer error dynamics to somewhere more sensible. This corresponds to scenario 1(c) in the paper [1]. The theoretical explanation of how this is done is presented in Section III(b) of [1].
% Q_y = diag([1e-6, 1e-7, 1e-3]); % Process noise
% R_y = 1.2e-7;                   % Measurement noise
% KNt = find_KN([Aaug-AKf_cand*Caug, AKf_cand; zeros(1,nxd), 0]',[Caug, -1]',[Tnull;0]',1e4*blkdiag(Q_y,R_y),0);
% KN = KNt';
% KN = 7.2277;
KN = 7.7698;
Kc = Kc_cand;              % Equivalent control gain
AKf = AKf_cand + Tnull*KN; % Observer gain
Kf = Aaug\AKf;
DQ = DK - Kc*Kf;           % Static Youla parameter
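The nullspace term added above leaves the controller equivalence intact because $T_{\perp} = \mathrm{null}(T)$ satisfies $TT_{\perp} = 0$, so

$$T\left(A_{Kf}^{cand} + T_{\perp}K_N\right) = T A_{Kf}^{cand} + \left(TT_{\perp}\right)K_N = T A_{Kf}^{cand}.$$

The algebraic constraint that defines the observer-compensator decomposition is therefore satisfied for any $K_N$; the free parameter only relocates the extra observer pole.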
Split the control gain into the contribution from the state estimate and the contribution from the disturbance estimate:
$$u(k) = K_{c1}\,\hat{x}(k) + K_{cd}\,\hat{d}(k).$$
Kc1 = Kc_cand(:,1:nx);
Kcd = Kc_cand(:,nx+1:end);
Verify closed loop poles for state feedback and observer error dynamics
eig_state = eig(Aaug+Baug*Kc)
eig_obs = eig(Aaug-AKf*Caug)
eig_state =
0.9764
0.5579
1.0000
eig_obs =
0.9115 + 0.1192i
0.9115 - 0.1192i
0.9559
State space system matrices for the observer and static Youla parameter
Aobs = Aaug-AKf*Caug;
Bobs = [Baug AKf];
Cobs = [eye(nxd)-Kf*Caug;
        -DQ*Caug];
Dobs = [zeros(nxd,nu), Kf;
        zeros(size(DQ,1),nu), DQ];
Form the state space matrices for the prefilter
The prefilter has the same state dynamics as the observer; however, the output is modified to ensure that the generated internal reference signal is compatible with the external reference. See Section V(b) of [1] for a theoretical derivation of how this is done.
M = ( [-Kc1 1]*inv([A-eye(nx) B; C zeros(ny,nd)])*[zeros(nx);C] ) + Kc1;
M_Kc1 = [M;Kc1];
Kcmd = [zeros(nu,nxd), 0, 1;
        Kc1, Kcd, 1, 0];
gain1 = inv(M_Kc1)*Kcmd;
Qpre = C*gain1(:,5); % Contribution to reference state from Youla parameter
nq = size(Qpre,1);
Apre = Aobs;
Bpre = AKf;
Cpre = C*gain1(:,1:4)*Cobs;
Dpre = C*gain1(:,1:4)*[Kf;DQ];
Form the (unconstrained) target calculator
The target calculator simply computes a target equilibrium pair $(x_s, u_s)$ satisfying:
$$\left[\begin{array}{cc}(A-I) & B\\ C & 0\end{array}\right]
\left[\begin{array}{c}x_{s}\\u_{s}\end{array}\right]
=
\left[\begin{array}{cc}-B & 0\\ 0 & I\end{array}\right]
\left[\begin{array}{c}\hat{d}(k)\\ y_{ref}(k)\end{array}\right]$$
where $\hat{d}(k)$ is the estimated input disturbance and $y_{ref}(k)$ is the output of the pre-filter computed in the previous section.
TGT_L = [A-eye(nx) B;
C 0];
TGT_R = blkdiag(-B, eye(ny));
TGT_mat = TGT_L\TGT_R;
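For this particular plant the target equations can also be solved by hand, which makes a useful sanity check: the target reduces to $x_s = [y_{ref};\,0]$ and $u_s = -\hat{d}$. The sketch below (illustrative Python, not part of the tutorial) confirms this by solving the 3-by-3 system with a small Gaussian elimination.

```python
# Solve TGT_L * [xs; us] = TGT_R * [dhat; yref] for a concrete case,
# using Gaussian elimination with partial pivoting (pure Python).

def solve(M, b):
    """Solve M z = b for square M (nested lists), returning z."""
    n = len(M)
    M = [row[:] + [b[i]] for i, row in enumerate(M)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z

Bcol = [0.03125 / 500, 0.25 / 500]
TGT_L = [[0.0, 0.25, Bcol[0]],   # (A - I) and B, first row
         [0.0, 0.0,  Bcol[1]],   # (A - I) and B, second row
         [1.0, 0.0,  0.0]]       # C and 0
dhat, yref = 0.003, 0.01
rhs = [-Bcol[0] * dhat, -Bcol[1] * dhat, yref]
xs1, xs2, us = solve(TGT_L, rhs)
# By hand: us = -dhat, xs2 = 0, xs1 = yref
```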
Compare control systems so far
In this section we will verify that the computed unconstrained observer, gain, target calculator and prefilter combined together give the same step response as the baseline controller.
Build the closed-loop baseline
ss_plant = ss(A,B,C,D,Ts);  % Plant
ss_k0 = ss(AK,BK,CK,DK,Ts); % Controller
% Closed loop system
% Inputs: reference
% Outputs: y
ss_orig_cl = -feedback(ss_plant*ss_k0,1,1);
Build the observer-gain-target-calculator-prefilter-based controller
ss_obs = ss(Aobs,Bobs,Cobs,Dobs,Ts);
ss_pre = ss(Apre,Bpre,Cpre,Dpre,Ts);
% Target calculator
% Inputs: xhat(1), xhat(2), dhat, yref
% Outputs: xhat(1), xhat(2), xs(1), xs(2), us
ss_xhat_tgtcalc = blkdiag(eye(nx), TGT_mat);
% Connect prefilter and observer so that Youla parameter can provide xref
% Inputs: u, y, r
% Outputs: xhat(1), xhat(2), dhat, yref
ss_obs_pre = [eye(3) zeros(3,1) zeros(3,1);
              zeros(1,3) Qpre 1]*blkdiag(ss_obs, ss_pre);
% Connect target calculator to prefilter and observer
% Inputs: u, y, r
% Outputs: xhat(1), xhat(2), xs(1), xs(2), us
ss_tgt_obs_pre = ss_xhat_tgtcalc*ss_obs_pre;
% State feedback law
% Inputs: xhat(1), xhat(2), xs(1), xs(2), us
% Outputs: u
ss_Kfb = [Kc1, -Kc1, 1];
% Complete observer-based controller
ss_ctrl = ss_Kfb*ss_tgt_obs_pre;
% Add direct input feedthrough to plant model
% Inputs: u
% Outputs: u, y
ss_plant_ft = [1; ss(A,B,C,D,Ts)];
% Create forward path
% Inputs: u, y, r
% Outputs: u, y
ss_forward = ss_plant_ft*ss_ctrl;
% Close the loop
% Inputs: u, y, r
% Outputs: u, y
ss_cl = feedback(ss_plant_ft*ss_ctrl, [eye(2); zeros(1,2)], 1);
Compare the two unconstrained responses
subplot(1,2,1);
step(ss_orig_cl, 40);
title('Original')
subplot(1,2,2);
step(ss_cl(2,3), 40);
title('Observer/tgt calc based');
We now have a control system based on an observer, steady-state target calculator and reference pre-filter which gives the same closed-loop response as the original system. The objective of the reverse engineering exercise is now to replace the control gain $K_{c1}$ with an MPC controller so that input and output constraints can be imposed.
Find inverse-optimal Q, R, S matrices to match Kc1
In this section, we find inverse-optimal cost matrices Q, R, S for the infinite-horizon discrete-time LQR problem for the system
$$x(k+1) = A\,x(k) + B\,u(k)$$
such that the optimal feedback gain for the cost
$$J = \sum_{k=0}^{\infty} x(k)^T Q\,x(k) + u(k)^T R\,u(k) + 2\,x(k)^T S\,u(k)$$
satisfies (in the positive feedback convention)
$$K_{c1} = -(B^T P B + R)^{-1}(B^T P A + S^T),$$
where P solves the associated discrete-time algebraic Riccati equation. See Section IV of [1] for full details. We use YALMIP (reference [6]) as an easy way to specify and solve the corresponding semi-definite program.
P = sdpvar(nx, nx, 'symmetric');
Q = sdpvar(nx, nx, 'symmetric');
R = sdpvar(nu, nu, 'symmetric');
S = sdpvar(nx, nu, 'full');
CON = [];                        % Variable to specify constraints
CON = CON + (R >= 1e-3*eye(nu)); % Sanity constraint
CON = CON + ([Q S; S' R] >= 0);  % Convexity constraint
CON = CON + (P >= 0);            % PSD constraint
CON = CON + (Q >= 0);            % PSD constraint
CON = CON + (R >= 0);            % PSD constraint
% Riccati constraint
CON = CON + ((A'*P*A - P - Kc1'*(B'*P*B + R)*Kc1 + Q) == 0);
% Matching constraint
CON = CON + ((B'*P*B + R)*Kc1 + (B'*P*A + S') == 0);
% Stability constraint
CON = CON + (((A+B*Kc1)'*P*(A+B*Kc1) - P + Q + Kc1'*R*Kc1 + Kc1'*S' + S*Kc1) <= 0);
% Minimise the 2-norm of the cross terms in the extended LQR cost function
OBJ = norm(S,2)^2;
% Run YALMIP
solvesdp(CON, OBJ, sdpsettings('solver', 'sedumi', 'removeequalities', 1, 'verbose', 0));
Warning: Rank deficient, rank = 5, tol = 6.027241e-08.
Q = double(Q)
R = double(R)
S = double(S)
P = double(P)
Q =
7.0242 77.5057
77.5057 856.2515
R =
0.0010
S =
0.0839
0.9254
P =
0.2454 0.5769
0.5769 9.0595
The solution is not unique (so it does not have to exactly match that in the paper). However, the resulting control gain should match. Note that we use a positive feedback convention, whereas the MATLAB dlqr command uses a negative feedback convention.
Kc1
Kc1_test = -dlqr(A,B,Q,R,S)
Kc1 =
  -83.4038  -921.0096
Kc1_test =
  -83.4038  -921.0096
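The matching condition can also be checked numerically from the printed matrices: in the positive feedback convention, $K_{c1} = -(B^TPB+R)^{-1}(B^TPA+S^T)$. The sketch below is illustrative Python (not part of the tutorial); because the printed P, R and S are rounded to four decimal places, the recovered gain agrees only to within about one percent.

```python
# Recover Kc1 from the printed (rounded) inverse-optimal matrices via
# Kc1 = -(B'PB + R)^{-1} (B'PA + S'). Pure Python, 2-state case.
A = [[1.0, 0.25], [0.0, 1.0]]
B = [0.03125 / 500, 0.25 / 500]
P = [[0.2454, 0.5769], [0.5769, 9.0595]]
R = 0.0010
S = [0.0839, 0.9254]                     # S is nx-by-nu, here a column

BtP = [B[0]*P[0][0] + B[1]*P[1][0],      # B' P  (1x2)
       B[0]*P[0][1] + B[1]*P[1][1]]
BtPB = BtP[0]*B[0] + BtP[1]*B[1]         # scalar
BtPA = [BtP[0]*A[0][0] + BtP[1]*A[1][0], # B' P A  (1x2)
        BtP[0]*A[0][1] + BtP[1]*A[1][1]]
Kc1 = [-(BtPA[0] + S[0]) / (BtPB + R),
       -(BtPA[1] + S[1]) / (BtPB + R)]
# Expect roughly [-83.4, -921.0]; rounding in the printed values limits
# the agreement to around one percent.
```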
Form the MPC controller using YALMIP
We now use YALMIP [6] to construct a constrained MPC controller of the form
$$\min_{u(0),\dots,u(N-1)} \sum_{k=0}^{N-1}\left[(x(k)-x_s)^TQ(x(k)-x_s) + (u(k)-u_s)^TR(u(k)-u_s) + 2(x(k)-x_s)^TS(u(k)-u_s)\right] + (x(N)-x_s)^TP(x(N)-x_s)$$
subject to:
$$x(k+1) = A\,x(k) + B\,u(k), \quad k = 0,\dots,N-1$$
$$u_{min} \le u(k) \le u_{max}, \quad k = 0,\dots,N-1$$
$$x_{min} \le x(k) \le x_{max}, \quad k = 1,\dots,N$$
N = 20;                                   % Prediction horizon
umin = -20; umax = 20;                    % Input constraint
xmin = [-1000;-0.04]; xmax = [1000;0.04]; % State constraint
% Recompute P with the DARE
P = dare(A,B,Q,R,S);
CON = [];           % Constraint variable
OBJ = 0;            % Objective variable
x = sdpvar(nx,N+1); % Predicted state trajectory
u = sdpvar(nu,N);   % Predicted input trajectory
x0 = x(:,1);        % Current measured state
u0 = u(:,1);        % Applied input
xs = sdpvar(nx,1);  % Target equilibrium state
us = sdpvar(nu,1);  % Target equilibrium input
% Prediction dynamics
for k1 = 1:N
    CON = CON + ( x(:,k1+1) == A*x(:,k1) + B*u(:,k1) );
end
% Cost function
for k1 = 1:N
    OBJ = OBJ + (x(:,k1)-xs)'*Q*(x(:,k1)-xs) ...
              + (u(:,k1)-us)'*R*(u(:,k1)-us) ...
              + (x(:,k1)-xs)'*S*(u(:,k1)-us) ...
              + (u(:,k1)-us)'*S'*(x(:,k1)-xs); % Stage cost
end
OBJ = OBJ + (x(:,N+1)-xs)'*P*(x(:,N+1)-xs);    % Terminal cost
% Input constraints
for k1 = 1:N
    CON = CON + ( umin <= u(:,k1) <= umax );
end
% State constraints
for k1 = 2:N+1
    CON = CON + ( xmin <= x(:,k1) <= xmax );
end
% Build Optimizer object
ctrl = optimizer(CON, OBJ, sdpsettings('solver', 'quadprog'), [x0;xs;us], u0);
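One design choice worth noting: the formulation above keeps the predicted states as explicit decision variables linked by equality constraints. An alternative is to "condense" the QP by eliminating the states via $x(k) = A^k x(0) + \sum_{j=0}^{k-1} A^{k-1-j}Bu(j)$. The sketch below (illustrative Python with a short horizon, not part of the tutorial) builds this condensed prediction and checks it against stepping the dynamics directly.

```python
# Condensed prediction x(N) = A^N x0 + sum_j A^(N-1-j) B u(j),
# checked against direct simulation. Pure Python, 2-state case.
A = [[1.0, 0.25], [0.0, 1.0]]
B = [0.03125 / 500, 0.25 / 500]

def matvec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

def matmul(M, N2):
    return [[M[0][0]*N2[0][0] + M[0][1]*N2[1][0], M[0][0]*N2[0][1] + M[0][1]*N2[1][1]],
            [M[1][0]*N2[0][0] + M[1][1]*N2[1][0], M[1][0]*N2[0][1] + M[1][1]*N2[1][1]]]

def predict(x0, u_seq):
    """x(N) from the condensed prediction formula."""
    N = len(u_seq)
    powers = []                       # A^0 ... A^(N-1)
    Apow = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(N):
        powers.append(Apow)
        Apow = matmul(A, Apow)        # ends as A^N
    xN = matvec(Apow, x0)
    for j, u in enumerate(u_seq):
        Ab = matvec(powers[N - 1 - j], B)
        xN = [xN[0] + Ab[0]*u, xN[1] + Ab[1]*u]
    return xN

def simulate(x0, u_seq):
    """x(N) by stepping the dynamics directly."""
    x = x0
    for u in u_seq:
        ax = matvec(A, x)
        x = [ax[0] + B[0]*u, ax[1] + B[1]*u]
    return x

u_seq = [8.7, -2.0, 0.5]
x_pred = predict([0.01, 0.0], u_seq)
x_sim = simulate([0.01, 0.0], u_seq)
```

Both routes give the same prediction; the condensed form trades a smaller, denser QP against the sparse structure exploited by the explicit-state formulation.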
Simulation (four separate systems)
This section runs four concurrent simulations in response to a step input and a combined step/pulse disturbance.
- Baseline controller with a small perturbation
- MPC controller with a small perturbation
- Baseline controller with a large perturbation
- MPC controller with a large perturbation
% Plant and controller state variables and reference signals
x_0 = [0;0]; d_0 = 0.003; r_0 = 0.01; xk_0 = [0;0];                       % Baseline, small perturbation
x_1 = [0;0]; d_1 = 0.003; r_1 = 0.01; xpre_1 = [0;0;0]; xobs_1 = [0;0;0]; % MPC, small perturbation
x_2 = [0;0]; d_2 = 0.03; r_2 = 0.10; xk_2 = [0;0];                        % Baseline, large perturbation
x_3 = [0;0]; d_3 = 0.03; r_3 = 0.10; xpre_3 = [0;0;0]; xobs_3 = [0;0;0];  % MPC, large perturbation
% Trajectory logs
Nsim = 500;
xlog0 = zeros(nx,Nsim+1); xlog0(:,1) = x_0; ulog0 = zeros(nu,Nsim);
xlog1 = zeros(nx,Nsim+1); xlog1(:,1) = x_1; ulog1 = zeros(nu,Nsim);
xlog2 = zeros(nx,Nsim+1); xlog2(:,1) = x_2; ulog2 = zeros(nu,Nsim);
xlog3 = zeros(nx,Nsim+1); xlog3(:,1) = x_3; ulog3 = zeros(nu,Nsim);
Tdist = 201;
Bd1 = B;
for k1 = 1:Nsim
    % System 0: Baseline controller with small perturbation
    if k1 > Tdist, d1 = d_0; else d1 = 0; end;  % Disturbance (constant)
    if k1 == Tdist, d2 = d_0; else d2 = 0; end; % Disturbance (impulse)
    u_0 = CK*xk_0 + DK*(C*x_0 - r_0);           % Compute input
    xk_0 = AK*xk_0 + BK*(C*x_0 - r_0);          % Update controller state
    x_0 = A*x_0 + B*u_0 + 0*Bd1*d1 + d2;        % Update plant state
    xlog0(:,k1+1) = x_0; ulog0(:,k1) = u_0;     % Log data

    % System 1: MPC with small perturbation
    if k1 > Tdist, d1 = d_1; else d1 = 0; end;  % Disturbance (constant)
    if k1 == Tdist, d2 = d_1; else d2 = 0; end; % Disturbance (impulse)
    xqhat_1 = Cobs*xobs_1 + [Kf;DQ]*(C*x_1);    % Current state and Youla param estimate
    yref1 = Cpre*xpre_1 + Dpre*r_1 + Qpre*xqhat_1(4); % Filtered reference
    xs_us1 = TGT_mat*[xqhat_1(3);yref1];        % Steady state target
    u_1 = ctrl{[xqhat_1(1:2);xs_us1]};          % Input signal
    %u_1 = [Kc1 -Kc1 1]*[xqhat_1(1:2);xs_us1];
    xpre_1 = Apre*xpre_1 + Bpre*r_1;            % Update prefilter state
    xobs_1 = Aobs*xobs_1 + Bobs*[u_1;C*x_1];    % Update observer state
    x_1 = A*x_1 + B*u_1 + 0*Bd1*d1 + d2;        % Update plant state
    xlog1(:,k1+1) = x_1; ulog1(:,k1) = u_1;     % Log data

    % System 2: Baseline controller with large perturbation
    if k1 > Tdist, d1 = d_2; else d1 = 0; end;  % Disturbance (constant)
    if k1 == Tdist, d2 = d_2; else d2 = 0; end; % Disturbance (impulse)
    u_2 = CK*xk_2 + DK*(C*x_2 - r_2);           % Compute input
    xk_2 = AK*xk_2 + BK*(C*x_2 - r_2);          % Update controller state
    x_2 = A*x_2 + B*u_2 + Bd1*d1 + d2;          % Update plant state
    xlog2(:,k1+1) = x_2; ulog2(:,k1) = u_2;     % Log data

    % System 3: MPC with large perturbation
    if k1 > Tdist, d1 = d_3; else d1 = 0; end;  % Disturbance (constant)
    if k1 == Tdist, d2 = d_3; else d2 = 0; end; % Disturbance (impulse)
    xqhat_3 = Cobs*xobs_3 + [Kf;DQ]*(C*x_3);    % Current state and Youla param estimate
    yref3 = Cpre*xpre_3 + Dpre*r_3 + Qpre*xqhat_3(4); % Filtered reference
    xs_us3 = TGT_mat*[xqhat_3(3);yref3];        % Steady state target
    u_3 = ctrl{[xqhat_3(1:2);xs_us3]};          % Input signal
    xpre_3 = Apre*xpre_3 + Bpre*r_3;            % Update prefilter state
    xobs_3 = Aobs*xobs_3 + Bobs*[u_3;C*x_3];    % Update observer state
    x_3 = A*x_3 + B*u_3 + Bd1*d1 + d2;          % Update plant state
    xlog3(:,k1+1) = x_3; ulog3(:,k1) = u_3;     % Log data
end
Plotting the simulation results
figure;
set(gcf, 'Position', [0 0 800 400]);
subplot(4,3,1); plot([0:Nsim]*Ts, xlog0(1,:)); set(gca, 'xlim', [0 100], 'ylim', [0 0.02]);
subplot(4,3,2); plot([0:Nsim]*Ts, xlog0(2,:)); set(gca, 'xlim', [0 100], 'ylim', [-0.01 0.01]); title('Baseline, small perturbation');
subplot(4,3,3); plot([0:Nsim-1]*Ts, ulog0(1,:)); set(gca, 'xlim', [0 100], 'ylim', [-10 10]);
subplot(4,3,4); plot([0:Nsim]*Ts, xlog1(1,:)); set(gca, 'xlim', [0 100], 'ylim', [0 0.02]);
subplot(4,3,5); plot([0:Nsim]*Ts, xlog1(2,:)); set(gca, 'xlim', [0 100], 'ylim', [-0.01 0.01]); title('MPC, small perturbation');
subplot(4,3,6); plot([0:Nsim-1]*Ts, ulog1(1,:)); set(gca, 'xlim', [0 100], 'ylim', [-10 10]);
subplot(4,3,7); plot([0:Nsim]*Ts, xlog2(1,:)); set(gca, 'xlim', [0 100], 'ylim', [0 0.2]);
subplot(4,3,8); plot([0:Nsim]*Ts, xlog2(2,:)); set(gca, 'xlim', [0 100], 'ylim', [-0.1 0.1]); title('Baseline, large perturbation');
subplot(4,3,9); plot([0:Nsim-1]*Ts, ulog2(1,:)); set(gca, 'xlim', [0 100], 'ylim', [-100 100]);
subplot(4,3,10); plot([0:Nsim]*Ts, xlog3(1,:)); set(gca, 'xlim', [0 100], 'ylim', [0 0.2]);
subplot(4,3,11); plot([0:Nsim]*Ts, xlog3(2,:)); set(gca, 'xlim', [0 100], 'ylim', [-0.1 0.1]); title('MPC, large perturbation');
subplot(4,3,12); plot([0:Nsim-1]*Ts, ulog3(1,:)); set(gca, 'xlim', [0 100], 'ylim', [-20 20]);
k3 = 1;
for k1 = 1:4
    subplot(4,3,k3); xlabel('State 1'); k3 = k3+1; grid on;
    subplot(4,3,k3); xlabel('State 2'); k3 = k3+1; grid on;
    subplot(4,3,k3); xlabel('Input'); k3 = k3+1; grid on;
end
The results show that the MPC controller with the small perturbation gives the same closed-loop response as the baseline controller (since no constraints are active). With the larger perturbation, however, the MPC controller modifies its response to enforce the input and state constraints.
References
[1] Hartley, E. N. and Maciejowski, J. M. "Designing output-feedback predictive controllers by reverse engineering existing LTI controllers", IEEE Transactions on Automatic Control, 2013. DOI: 10.1109/TAC.2013.2258781
[2] Bender, D. J. and Fowell, R. A. "Computing the estimator-controller form of a compensator". International Journal of Control, 41(6):1565-1575, 1985. DOI: 10.1080/0020718508961215
[3] Alazard, D. and Apkarian, P. "Exact observer-based structures for arbitrary compensators", International Journal of Robust and Nonlinear Control, 9(2):101-118, 1999. DOI: 10.1002/(SICI)1099-1239(199902)9:2<101::AID-RNC400>3.0.CO;2-U
[4] Zhou, K., Doyle, J. and Glover, K. "Robust and Optimal Control", Prentice Hall, 1996.
[5] Laub, A. J., "A Schur method for solving algebraic Riccati equations", IEEE Transactions on Automatic Control, AC-24(6):913-921, 1979. DOI: 10.1109/TAC.1979.1102178
[6] Löfberg, J. "YALMIP: A toolbox for modeling and optimization in MATLAB". In Proceedings of the CACSD Conference, Taipei, Taiwan, 2004. DOI: 10.1109/CACSD.2004.1393890. Download from: tbxManager or YALMIP Wiki