Inverted pendulum on a cart

48.4. Inverted pendulum on a cart#

todo

Update \(\texttt{control}\) library version, to have \(\texttt{ct.poles()}\)
…

Contents.

Equations of motion: non-linear equations, linearized equations around equilibria; augmented equations for reference tracking
Control:
- full-state feedback: this controller is used later in the observed-state feedback, exploiting separation principle
- observed-state feedback: combination of observer and controller
Verification:
- test control laws on non-linear equations

Here, algebraic Riccati equations (ARE) are solved for infinite time-horizon problems of optimal full-state feedback and optimal observer for stochastic disturbances:

48.4.1. Equations of motion with Lagrangian mechanics#

See Classical Mechanics: Lagrangian Mechanics

48.4.1.1. Non-linear equations#

Second-order dynamical equation of the mechanical system reads

\[\begin{split}\begin{aligned} ( M + m ) \ddot{x} - m \ell \ddot{\theta} \cos \theta + m \ell \dot{\theta}^2 \sin \theta & = F_x - c_x \dot{x} \\ - m \ell \ddot{x} \cos \theta + m \ell^2 \ddot{\theta} - m g \ell \sin \theta & = C_\theta - c_\theta \dot{\theta} \ . \end{aligned}\end{split}\]

or at the first order, \(\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u})\),

\[\begin{split}\left\{\begin{aligned} \dot{x} & = v \\ \dot{\theta} & = \omega \\ (M+m) \, \dot{v} - m \ell \cos\theta \, \dot{\omega} & = - m \ell \omega^2 \sin \theta + F_x - c_x v \\ - m \ell \cos \theta \, \dot{v} + m \ell^2 \dot{\omega} & = m g \ell \sin \theta + C_\theta - c_\theta \omega \ . \end{aligned}\right.\end{split}\]

48.4.1.2. Linearized equations around equilibria#

Linearized equation around the unstable equilibrium. First-order system equation around the unstable equilibium becomes

\[\begin{split}\begin{aligned} \begin{bmatrix} M + m & - m \ell \\ -m \ell & m \ell^2 \end{bmatrix} \begin{bmatrix} \ddot{x} \\ \ddot{\theta} \end{bmatrix} + \begin{bmatrix} c_x & \cdot \\ \cdot & c_\theta \end{bmatrix} \begin{bmatrix} \dot{x} \\ \dot{\theta} \end{bmatrix} + \begin{bmatrix} \cdot & \cdot \\ \cdot & - m g \ell \end{bmatrix} \begin{bmatrix} \dot{x} \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} F_x \\ C_\theta \end{bmatrix} \ . \end{aligned}\end{split}\]

or

\[\mathbf{M} \ddot{\mathbf{z}} + \mathbf{C} \dot{\mathbf{z}} + \mathbf{K} \mathbf{z} = \mathbf{f}\]

State-space (first-order) representation. With state and input

\[\begin{split}\mathbf{x} = \begin{bmatrix} \mathbf{z} \\ \dot{\mathbf{z}} \end{bmatrix} \quad , \quad \mathbf{u} = \mathbf{f} \ ,\end{split}\]

the first-order state-space representation of the system reads

\[\begin{split}\left\{\begin{aligned} \dot{\mathbf{x}} & = \begin{bmatrix} \cdot & \mathbf{I} \\ -\mathbf{M}^{-1} \mathbf{K} & - \mathbf{M}^{-1} \mathbf{C} \end{bmatrix} \mathbf{x} + \begin{bmatrix} \cdot \\ \mathbf{M}^{-1} \end{bmatrix} \mathbf{u} \\ \mathbf{y} & = \mathbf{C} \mathbf{x} + \mathbf{D} \mathbf{x} \ . \end{aligned}\right.\end{split}\]

This the equation we’re interested in, when studying the inverted pendulum on a cart.

48.4.1.3. Augmented system for tracking reference signal#

Let \(\mathbf{y}_{\text{ref}}\) a reference signal. An augmented system can be defined in order to used optimal control. Let

\[\begin{split}\begin{aligned} \dot{\mathbf{x}} & = \mathbf{A} \mathbf{x} + \mathbf{B} \mathbf{u} \\ \mathbf{y} & = \mathbf{C} \mathbf{x} + \mathbf{D} \mathbf{u} \end{aligned}\end{split}\]

the state-space representation of the plant. Let \(\mathbf{y}_{\text{ref}}\) a desired output and the integral error

\[\mathbf{e}_{\text{int}}(t) := \int_{\tau=0}^{t} \left\{ \mathbf{y}(\tau) - \mathbf{y}_{\text{ref}}(\tau) \right\} d \tau \ ,\]

as a new state with dynamical equation

\[\dot{\mathbf{e}}_{\text{int}} = \mathbf{y}(t) - \mathbf{y}_{\text{ref}}(t) = \mathbf{C} \mathbf{x} + \mathbf{D} \mathbf{u} - \mathbf{y}_{\text{ref}} \ .\]

The optimal control is applied to the augmented system

\[\begin{split}\underbrace{\begin{bmatrix} \dot{\mathbf{x}} \\ \dot{\mathbf{e}}_{\text{int}} \end{bmatrix}}_{\dot{\mathbf{z}}} = \underbrace{\begin{bmatrix} \mathbf{A} & \cdot \\ \mathbf{C} & \cdot \end{bmatrix}}_{\hat{\mathbf{A}}} \underbrace{\begin{bmatrix} \mathbf{x} \\ \mathbf{e}_{\text{int}} \end{bmatrix}}_{\mathbf{z}} + \underbrace{\begin{bmatrix} \mathbf{B} \\ \mathbf{D} \end{bmatrix}}_{\hat{\mathbf{B}}_u} \mathbf{u} + \underbrace{\begin{bmatrix} \cdot \\ -\mathbf{I} \end{bmatrix}}_{\hat{\mathbf{B}}_{ref}} \mathbf{y}_{ref}\end{split}\]

\[\mathbf{y} = \hat{\mathbf{C}} \mathbf{z} + \hat{\mathbf{D}}_u \mathbf{u} \ .\]

Optimal control framework provides the opitmal gain matrix \(\hat{\mathbf{K}}\), so that \(\mathbf{u} = - \hat{\mathbf{K}} \mathbf{z}\) and the closed loop system becomes

\[\begin{split}\begin{aligned} \dot{\mathbf{z}} & = \left( \hat{\mathbf{A}} - \hat{\mathbf{B}}_u \hat{\mathbf{K}} \right) \mathbf{z} + \hat{\mathbf{B}}_{\text{ref}} \mathbf{y}_{\text{ref}} \\ \mathbf{y} & = \left( \hat{\mathbf{C}} - \hat{\mathbf{D}}_u \mathbf{K} \right) \mathbf{z} \ . \end{aligned}\end{split}\]

If the output of the system is the angle \(\theta(t)\), with reference signal \(\mathbf{y}_{\text{ref}} = \theta_{\text{ref}}\), the dynamical system is a SISO system, whose state-space representation is

…

48.4.2. Libraries#

[ 0.00000000e+00+0.j  0.00000000e+00+0.j  0.00000000e+00+0.j
 -1.01602637e+01+0.j  9.65526368e+00+0.j -4.99999873e-03+0.j]

48.4.3. Control#

48.4.3.1. Full-state feedback#

Show code cell source Hide code cell source

# LQR Weights
Q = np.diag([10, 10, 1, 1, 10, 10]) #

R = np.diag([10, 10])
# R_values = np.linspace(0.001, .01, 4)  # [0.001, 0.01, 0.1]
# R_values = np.logspace(-4, 4, 9)

omega_freq = np.logspace(-2, 2, 1000)

fig, ax = plt.subplots(2,2, figsize=(8, 8))

# for R in R_values:

# Compute LQR Gain
# K, S, E = ct.lqr(A, B, Q, R)
Ka, Sa, Ea = ct.lqr(sys_aug, Q, R)

# #> Loop transfer function L(s) = K * (sI - A)^-1 * B
# #> Open-loop TF, L(s) = G(s) R(s), with R(s) = K
# # from the error to the output
# A_ol = np.block([
#     [     A - B @ Ka[:,:2], - B @ Ka[:,2:]],
#     [      np.zeros((1,2)), np.zeros((1, 1))]
# ])
# B_ol = np.block([[.0], [.0], [1.]])
# C_ol = np.block([[ C - D @ Ka[:,:2], - D @ Ka[:,2]]])
# D_ol = np.zeros((1,1))
# sys_ol = ct.ss(A_ol, B_ol, C_ol, D_ol)

#> Closed-loop TF
# from the disturbance signal to the output
# Simulate closed-loop response with reference input
A_cl = Aa - Ba @ Ka
B_ref = np.block([[np.zeros((4,2))], [-np.eye(2)]])
# C_cl = np.block([[np.eye(2), np.zeros((2,4))]])
# D_cl = np.zeros((2, 2))
C_cl = np.eye(6)
D_cl = np.zeros((6,2))

sys_cl = ct.ss(A_cl, B_ref, C_cl, D_cl)

evals_cl = ct.poles(sys_cl)

# #> Frequency response
# mag_L, phase_L, omega_L = ct.frequency_response(sys_ol, omega_freq)

#> Plots
# #> Bode plots (first line) of L(s)
# ax[0,0].loglog(omega_L, mag_L, label=f'R={R}')
# ax[0,1].semilogx(omega_L, np.degrees(phase_L), label=f'R={R}')
# #> Nyquist diagram of L(s)
# ax[1,0].plot(mag_L*np.cos(phase_L), mag_L*np.sin(phase_L), label=f'R={R}')
#> Eigenvalues of the closed-loop system
ax[1,1].plot(np.real(evals_cl), np.imag(evals_cl), 'x', label=f'R={R}')

#> Critical point and circle in Nyquist plot
theta_v = np.linspace(0, 2*np.pi, 100)
# ax[1,0].plot([-1], [0], 'o', ms=5, color="black")
# ax[1,0].plot(np.cos(theta_v), np.sin(theta_v), '--', color="black")

# #> Formatting plots
# ax[0,0].set_title('Bode Magnitude'); ax[0,0].grid(True); ax[0,0].legend()
# ax[0,1].set_title('Bode Phase'); ax[0,1].grid(True)
# ax[1,0].set_title('Nyquist Plot'); ax[1,0].grid(True); ax[1,0].set_aspect('equal'); ax[1,0].set_xlim([-5, 5]); ax[1,0].set_ylim([-5, 5])
ax[1,1].set_title('Eigenvalues of the closed-loop TF'); ax[1,1].grid(True); ax[1,1].set_aspect('equal')
ax[1,1].set_xlim([-5.5, .5]); ax[1,1].set_ylim([-3., 3.])

fig.set_tight_layout(True)

print(evals_cl)

plt.show()

[-160.29249401+0.j           -3.03748266+0.j
   -1.65844928+1.43107063j   -1.65844928-1.43107063j
   -1.03177742+0.j           -1.03711996+0.j        ]

../../_images/ab1a4676435b3eb1a4f4cd893ca28be7d11105ba243860a3d3cf86cdffd575d2.png

todo. Update \(\texttt{control}\) library

48.4.3.2. Feed-back on state estimate#

48.4.3.2.1. Observer design#

System equations, without disturbances/noise

\[\begin{split}\left\{\begin{aligned} \dot{\mathbf{x}} & = \mathbf{A} \mathbf{x} + \mathbf{B} \mathbf{u} \\ \mathbf{y} & = \mathbf{C} \mathbf{x} + \mathbf{D} \mathbf{u} \ , \end{aligned}\right.\end{split}\]

Observer (state estimator) equations

\[\begin{split}\left\{\begin{aligned} \hat{\mathbf{y}} & = \mathbf{C} \hat{\mathbf{x}} + \mathbf{D} \mathbf{u} \\ \dot{\hat{\mathbf{x}}} & = \mathbf{A} \hat{\mathbf{x}} + \mathbf{B} \mathbf{u} + \mathbf{L} \left( \mathbf{y} - \hat{\mathbf{y}} \right) \\ & = \left( \mathbf{A} - \mathbf{L} \mathbf{C} \right) \hat{\mathbf{x}} + \mathbf{L} \mathbf{C} \mathbf{x} + \mathbf{B} \mathbf{u} \ , \end{aligned}\right.\end{split}\]

so that the dynamical equation of the error \(\boldsymbol\varepsilon := \hat{\mathbf{x}} - \mathbf{x}\) reads

\[\dot{\boldsymbol\varepsilon} = \left( \mathbf{A} - \mathbf{L} \mathbf{C} \right) \boldsymbol\varepsilon \ .\]

The dynamical equations of the augmented system, plant+observer, reads

\[\begin{split}\begin{bmatrix} \dot{\mathbf{x}} \\ \dot{\boldsymbol\varepsilon} \end{bmatrix} = \begin{bmatrix} \mathbf{A} & \mathbf{0} \\ \mathbf{0} & \mathbf{A} - \mathbf{L} \mathbf{C} \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \boldsymbol\varepsilon \end{bmatrix} + \begin{bmatrix} \mathbf{B} \\ \mathbf{0} \end{bmatrix} \mathbf{u} \ ,\end{split}\]

and the observer design here is the design of a matrix \(\mathbf{L}\) that makes the dynamics of the error \(\boldsymbol\varepsilon\), and thus the matrix \(\mathbf{A} - \mathbf{L} \mathbf{C}\), asymptotically stable.

48.4.3.2.2. Control design#

Control design is performed as the state is known, and then applied to the estimated state \(\hat{\mathbf{x}} = \mathbf{x} + \boldsymbol\varepsilon\), as \(\mathbf{u} = - \mathbf{K} \hat{\mathbf{x}}\).

Here the state is augmented with an integrator

\[\mathbf{e}_{\text{int}}(t) := \int_{\tau=0}^{t} \{ \mathbf{y}(\tau) - \mathbf{y}_{\text{ref}}(\tau) \} d \tau \ ,\]

to get tracking of slow signals and then the state of the augmented system is \(\mathbf{z} = \begin{bmatrix} \mathbf{x} \\ \mathbf{e}_{\text{int}} \end{bmatrix}\), and optimal control produces a feedback on the augmented state

\[\mathbf{u} = - \mathbf{K} \mathbf{z} = - \mathbf{K}_{\mathbf{x}} \mathbf{x} - \mathbf{K}_{\mathbf{e}} \mathbf{e}_{\text{int}} \ .\]

The integral error can be measured and needs no observer. Thus, the feedback on the observed state becomes

\[\mathbf{u} = - \mathbf{K}_{\mathbf{x}} \hat{\mathbf{x}} - \mathbf{K}_{\mathbf{e}} \mathbf{e}_{\text{int}} \ .\]

48.4.3.2.3. Combination observer + controller#

State-space representation of the open-loop system from the tracking error \(\mathbf{e}\) to the output \(\mathbf{y}\) reads

\[\begin{split}\left\{\begin{aligned} \dot{\mathbf{x}} & = \mathbf{A} \mathbf{x} + \mathbf{B} \mathbf{u} \\ \mathbf{y} & = \mathbf{C} \mathbf{x} + \mathbf{D} \mathbf{u} \\ \dot{\boldsymbol\varepsilon} & = \left( \mathbf{A} - \mathbf{L} \mathbf{C} \right) \boldsymbol\varepsilon \\ \dot{\mathbf{e}}_{\text{int}} & = \mathbf{e} = \mathbf{y} - \mathbf{y}_{\text{ref}} = \mathbf{C} \mathbf{x} + \mathbf{D} \mathbf{u} - \mathbf{y}_{\text{ref}} \\ \end{aligned}\right.\end{split}\]

\[\hat{\mathbf{x}} = \mathbf{x} + \boldsymbol\varepsilon\]

\[\begin{split}\begin{aligned} \mathbf{u} & = - \mathbf{K}_{\mathbf{x}} \hat{\mathbf{x}} - \mathbf{K}_{\mathbf{e}} \mathbf{e}_{\text{int}} \\ & = - \mathbf{K}_{\mathbf{x}} \mathbf{x} - \mathbf{K}_{\mathbf{x}} \boldsymbol\varepsilon - \mathbf{K}_{\mathbf{e}} \mathbf{e}_{\text{int}} \\ \end{aligned}\end{split}\]

The open-loop system reads

\[\begin{split}\begin{aligned} \begin{bmatrix} \dot{\mathbf{x}} \\ \dot{\boldsymbol\varepsilon} \\ \dot{\mathbf{e}}_{\text{int}} \end{bmatrix} & = \begin{bmatrix} \mathbf{A} & \cdot & \cdot \\ \cdot & \mathbf{A} - \mathbf{L} \mathbf{C} & \cdot \\ \mathbf{C} & \cdot & \cdot \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \boldsymbol\varepsilon \\ \mathbf{e}_{\text{int}} \end{bmatrix} + \begin{bmatrix} \mathbf{B} \\ \cdot \\ \mathbf{D} \end{bmatrix} \mathbf{u} + \begin{bmatrix} \cdot \\ \cdot \\ - \mathbf{I} \end{bmatrix} \mathbf{y}_{\text{ref}} \\ \mathbf{y} & = \begin{bmatrix} \mathbf{C} & \cdot & \cdot \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \boldsymbol\varepsilon \\ \mathbf{e}_{\text{int}} \end{bmatrix} + \mathbf{D} \mathbf{u} \end{aligned}\end{split}\]

while the closed-loop system becomes

\[\begin{split}\begin{aligned} \begin{bmatrix} \dot{\mathbf{x}} \\ \dot{\boldsymbol\varepsilon} \\ \dot{\mathbf{e}}_{\text{int}} \end{bmatrix} & = \begin{bmatrix} \mathbf{A}-\mathbf{B}\mathbf{K}_{\mathbf{x}} & -\mathbf{B}\mathbf{K}_{\mathbf{x}} & -\mathbf{B}\mathbf{K}_{\mathbf{e}} \\ \cdot & \mathbf{A} - \mathbf{L} \mathbf{C} & \cdot \\ \mathbf{C}-\mathbf{D}\mathbf{K}_{\mathbf{x}} & -\mathbf{D}\mathbf{K}_{\mathbf{x}} & -\mathbf{D}\mathbf{K}_{\mathbf{e}} \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \boldsymbol\varepsilon \\ \mathbf{e}_{\text{int}} \end{bmatrix} + \begin{bmatrix} \cdot \\ \cdot \\ - \mathbf{I} \end{bmatrix} \mathbf{y}_{\text{ref}} \\ \mathbf{y} & = \begin{bmatrix} \mathbf{C}-\mathbf{D}\mathbf{K}_{\mathbf{x}} & -\mathbf{D}\mathbf{K}_{\mathbf{x}} & -\mathbf{D}\mathbf{K}_{\mathbf{e}} \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ \boldsymbol\varepsilon \\ \mathbf{e}_{\text{int}} \end{bmatrix} \ . \end{aligned}\end{split}\]

If the system has no direct influence of input on measurement output, i.e. \(\mathbf{D} = \mathbf{0}\), the system is triangular.

48.4.3.2.4. State estimator with process and measurement noise#

\[\begin{split}\left\{\begin{aligned} \dot{\mathbf{x}} & = \mathbf{A} \mathbf{x} + \mathbf{B} \mathbf{u} + \mathbf{B}_d \mathbf{d}\\ \mathbf{y} & = \mathbf{C} \mathbf{x} + \mathbf{D} \mathbf{u} + \mathbf{D}_d \mathbf{d} + \mathbf{D}_r \mathbf{r} \ , \end{aligned}\right.\end{split}\]

The state estimator problem with \(\mathbf{D}_d = \mathbf{0}\) and \(\mathbf{D}_\mathbf{r} = \mathbf{I}\) can be solved with the function \(\texttt{lqe}\) of the \(\texttt{control}\) library,

\[\begin{split}\begin{aligned} & \texttt{control.lqe(sys, Edd, Err, Edr)} \\ & \texttt{control.lqe(A, Bd, C, Edd, Err, Edr)} \\ \end{aligned}\end{split}\]

Optimal observer matrix
[[  3.44960563   3.08045383]
 [  3.08045383  34.19623533]
 [ 10.69448739  58.84138659]
 [ 57.12488834 589.43585336]]

"\n#> Method 2.\n# Use LQR duality: lqe(A, B, C, V, W) is dual to lqr(A.T, C.T, V, W)\n# Note: we use C.T because it replaces B in the dual problem\nL_transposed, P, E = ct.lqr(A.T, C.T, V, W)\n\n# The actual observer gain is the transpose of the 'feedback' result\nL = L_transposed.T\n"

48.4.4. Closed-loop testing#

48.4.4.1. Linearized system#

../../_images/d64123210fb8a47f6e170cab01fcd8ba24ff91b4be6cb413c18aeaf8de8fcccc.png

48.4.4.2. Non-linear system, around the unstable equilibrium#

Show code cell source Hide code cell source

from scipy.integrate import solve_ivp
from scipy.interpolate import interp1d

# solve_ivp(fun, t_span, y0, method="RK45", ..., args=tuple)

# fun(t, z, args=tuple)
def f(t, z, args):
    """
    State, z = [ x, vareps, e_int ]
    """

    #> args is supped to be a 1-element tuple with a dict
    args_di = args

    #> System parameters
    g, l = args_di["g"], args_di["l"]
    m, M = args_di["m"],  args_di["M"]
    cx, ca = args_di["cx"], args_di["ca"]

    #> Linearized system matrices
    A, B, C, D = args_di["A"], args_di["B"], args_di["C"], args_di["D"]

    #> Optimal control matrix
    Kx, Ke = args_di["Kx"], args_di["Ke"]
    #> State-estimator L matrix
    L = args_di["L"]

    #> Reference input: select the proper y_ref for the current time 't'
    y_ref_func = args_di["y_ref_func"]
    yref = y_ref_func(t) # Interpolates to get the value at time t
    # pos_ref, angle_ref = yref[0], yref[1]

    #> Motor physical limits
    motor_clipping = args_di["motor_clipping"]
    u_max = args_di["u_max"]

    #> State variables
    x, th, v, om, eps, e_int = z[0], z[1], z[2], z[3], z[4:8], z[8:]
    state = z[:4]

    #> Feed-back control
    u = - Kx @ state - Kx @ eps - Ke @ e_int

    if ( motor_clipping ):
        u = np.clip(u, -u_max, u_max)

        #> Need for anti-windup
        # if (u_ideal > u_max and (C @ x + D @ u - yref) > 0) or (u_ideal < -u_max and (C @ x + D @ u - yref) < 0):
        #     de_int = 0 # Stop integrating...
        # else:
        #     de_int = C @ x + D @ u - yref

    M_vel = np.array([
         [            M+m, -m*l*np.cos(th)],
         [-m*l*np.cos(th),          m*l**2]
    ])

    dx     = v
    dth    = om
    d_vel  = np.linalg.inv(M_vel) @ \
             np.array([
                  [-m*l*om**2*np.sin(th) + u[0] - cx*v],
                  [ m*g*l*np.sin(th) + u[1] - ca*om]
             ])
    deps   = ( A - L @ C ) @ eps
    de_int = C @ state + D @ u - yref

    return np.concatenate(([dx], [dth], d_vel.ravel(), deps, de_int))

../../_images/9f45519c042064019dde228dab8af9e914f3c2ac63d15fba30e601d877d21978.png