31.2.2. Full-state feedback (OLD)#
Optimal control can be recast as a constrained optimization problem: an extremum - an optimum - of an objective function \(J\) must be found, subject to constraints that include the equations of motion. Some constraints may be incorporated into an augmented objective function \(\widetilde{J}\) with the method of Lagrange multipliers.
Finite-time vs. infinite-time horizon.
…
31.2.2.1. Generic ODE without exogenous inputs#
$$\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t)) \ ,$$

with initial condition \(\mathbf{x}(0) = \mathbf{x}_0\).
The objective function combines (weights) the error on a desired performance and the control input, in order to obtain the desired behavior with feasible control - control that the actuators can provide without saturation, avoiding unnecessarily high power input and overly sharp behavior, …
As an example, if the goal of the control \(\mathbf{u}\) is to keep the system around \(\mathbf{x} = \mathbf{0}\), the cost function to be minimized can be designed as

$$J = \frac{1}{2} \int_0^T \left( \mathbf{x}^T \mathbf{Q} \mathbf{x} + 2 \mathbf{x}^T \mathbf{S} \mathbf{u} + \mathbf{u}^T \mathbf{R} \mathbf{u} \right) dt \ ,$$

with \(\mathbf{Q} \ge 0\), and \(\mathbf{R} > 0\) and symmetric.
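As a quick numerical illustration (a hypothetical 2-state, 1-input problem with the cross term \(\mathbf{S}\) taken as zero for simplicity; all values are assumptions), the requirements on the weights and the resulting non-negative running cost can be checked in a few lines of Python:

```python
import numpy as np

# Hypothetical weights for a 2-state, 1-input problem (illustrative values)
Q = np.diag([1.0, 0.1])   # Q >= 0: penalizes the state error
R = np.array([[0.2]])     # R > 0: penalizes the control effort

# Symmetry and definiteness checks via eigenvalues
assert np.allclose(Q, Q.T) and np.all(np.linalg.eigvalsh(Q) >= 0)
assert np.allclose(R, R.T) and np.all(np.linalg.eigvalsh(R) > 0)

# Running cost at a sample state/input: non-negative by construction
x, u = np.array([0.5, -1.0]), np.array([0.3])
print(x @ Q @ x + u @ R @ u)
```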
Constrained optimization
So that
Solution of the problem
Replacing \(\mathbf{u}\)
with \(\mathbf{u} = - \mathbf{R}^{-1} \left( \mathbf{S}^T \mathbf{x} + \partial_\mathbf{u} \mathbf{f}^T \boldsymbol\lambda \right)\).
Gradient descent
Start from a control law \(\mathbf{u}^{(0)}(t)\),
the state equation is integrated forward in time to get \(\mathbf{x}^{(0)}(t)\),
Lagrange multiplier equation is integrated backward in time to get \(\boldsymbol\lambda^{(0)}(t)\),
The control law is updated with an increment \(\delta \mathbf{u}\) that is proportional to the gradient of the cost function (with “opposite direction”, so that \(\delta \widetilde{J} < 0\)), i.e. \(\mathbf{u}^{(1)}(t) = \mathbf{u}^{(0)}(t) + \delta \mathbf{u}^{(0)}(t)\) with \(\delta \mathbf{u}^{(0)}(t) = - c \nabla_{\mathbf{u}(t)} J^{(0)}\), where \(c > 0\) is a positive step and \(\nabla_{\mathbf{u}} J = \mathbf{S}^T \mathbf{x} + \mathbf{R} \mathbf{u} + \partial_{\mathbf{u}} \mathbf{f}^T \boldsymbol\lambda\).
Repeat previous steps with the updated control law, until convergence.
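The iteration above can be sketched for a scalar LTI plant (the plant, the weights \(q\) and \(r\), the step \(c\), and the horizon are all illustrative assumptions; both sweeps use explicit Euler, and the cross term \(S\) is zero):

```python
import numpy as np

# Assumed test problem: scalar plant x' = a x + b u,
# cost J = 1/2 ∫ (q x^2 + r u^2) dt on [0, T]
a, b, q, r = -1.0, 1.0, 1.0, 0.1
T, n = 2.0, 2000
dt = T / n
x0 = 1.0

def forward(u):
    """Step 2: integrate the state equation forward in time."""
    x = np.empty(n + 1); x[0] = x0
    for k in range(n):
        x[k + 1] = x[k] + dt * (a * x[k] + b * u[k])
    return x

def backward(x):
    """Step 3: integrate the costate equation backward, lambda(T) = 0."""
    lam = np.empty(n + 1); lam[n] = 0.0
    for k in range(n, 0, -1):
        # lambda' = -(q x + a lambda): adjoint of the dynamics
        lam[k - 1] = lam[k] + dt * (q * x[k] + a * lam[k])
    return lam

def cost(x, u):
    return 0.5 * dt * np.sum(q * x[:-1] ** 2 + r * u ** 2)

u = np.zeros(n)                  # step 1: initial guess u^(0)
c = 0.5                          # positive gradient step
for it in range(200):            # step 5: repeat until convergence
    x = forward(u)
    lam = backward(x)
    grad = r * u + b * lam[:-1]  # nabla_u J = R u + (df/du)^T lambda (S = 0)
    u = u - c * grad             # step 4: descent update
print(cost(forward(u), u))
```

In practice the step \(c\) would be chosen by a line search, and the sweeps would use a higher-order integrator.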
31.2.2.2. LTI#
Constrained optimization
As \(\mathbf{f}(\mathbf{x},\mathbf{u}) = \mathbf{A} \mathbf{x} + \mathbf{B}\mathbf{u}\), here \(\partial_{\mathbf{x}} \mathbf{f} = \mathbf{A}\) and \(\partial_{\mathbf{u}} \mathbf{f} = \mathbf{B}\). For the linear system (31.8), \(\mathbf{M} = \mathbf{I}\). Thus the equations (31.7) become
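As a sanity check on these Jacobians, central finite differences applied to a hypothetical 2-state plant recover \(\mathbf{A}\) and \(\mathbf{B}\) (all numerical values are assumptions):

```python
import numpy as np

# Hypothetical LTI plant f(x, u) = A x + B u (illustrative values)
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B = np.array([[0.0], [1.0]])
f = lambda x, u: A @ x + B @ u

x, u, eps = np.array([0.3, -1.2]), np.array([0.7]), 1e-6

# Central-difference Jacobians: column j perturbs the j-th component
dfdx = np.column_stack([
    (f(x + eps * e, u) - f(x - eps * e, u)) / (2 * eps)
    for e in np.eye(2)])
dfdu = np.column_stack([
    (f(x, u + eps * e) - f(x, u - eps * e)) / (2 * eps)
    for e in np.eye(1)])

print(np.allclose(dfdx, A), np.allclose(dfdu, B))  # → True True
```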
31.2.2.2.1. Infinite-horizon full-state feedback#
Since the full state is measured and fed back, no observer is needed. The system is assumed to be stabilizable, so that the infinite-horizon cost can be kept finite. The augmented cost function reads
with given initial conditions \(\mathbf{x}(0) = \mathbf{x}_0\), so that \(\delta \mathbf{x}_0 = \mathbf{0}\). Using calculus of variations, the variations of the cost function w.r.t. \(\mathbf{x}\), \(\mathbf{u}\), \(\boldsymbol{\lambda}\) read
From the variation w.r.t. \(\mathbf{u}\), since \(\mathbf{R} > 0\) and thus invertible,
Now, assuming the relation \(\boldsymbol{\lambda} = \mathbf{P} \mathbf{x}\), it follows
and comparing the two different expressions of \(\dot{\boldsymbol{\lambda}}\), if the equality holds for any \(\mathbf{x}\), the dynamical Riccati equation for \(\mathbf{P}\) is derived as

$$- \dot{\mathbf{P}} = \mathbf{P} \widetilde{\mathbf{A}} + \widetilde{\mathbf{A}}^T \mathbf{P} - \mathbf{P} \mathbf{B} \mathbf{R}^{-1} \mathbf{B}^T \mathbf{P} + \widetilde{\mathbf{Q}} \ ,$$
where \(\widetilde{\mathbf{A}} = \mathbf{A} - \mathbf{B} \mathbf{R}^{-1} \mathbf{S}^T \) and \(\widetilde{\mathbf{Q}} = \mathbf{Q} - \mathbf{S} \mathbf{R}^{-1} \mathbf{S}^T\). The Riccati equation is a non-linear dynamical matrix equation in \(\mathbf{P}\). Algorithms exist for computing the solution of both the dynamical and the algebraic equation, see Example 31.1.
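A minimal numpy sketch of the backward-in-time integration of the dynamical Riccati equation (explicit Euler from the terminal condition \(\mathbf{P}(T) = \mathbf{0}\); the plant and the weights are assumptions). For \(T\) large enough, \(\mathbf{P}(0)\) approaches the steady-state solution:

```python
import numpy as np

# Assumed data (illustrative): 2-state plant, Q >= 0, R > 0, cross term S
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.1])
R = np.array([[0.2]])
S = np.zeros((2, 1))

Ri = np.linalg.inv(R)
At = A - B @ Ri @ S.T          # A tilde
Qt = Q - S @ Ri @ S.T          # Q tilde

def riccati_rhs(P):
    # dP/dt = -(P At + At^T P - P B R^-1 B^T P + Qt)
    return -(P @ At + At.T @ P - P @ B @ Ri @ B.T @ P + Qt)

T, n = 20.0, 20000
dt = T / n
P = np.zeros((2, 2))            # terminal condition P(T) = 0
for _ in range(n):              # integrate backward in time
    P = P - dt * riccati_rhs(P)
print(P)                        # close to the steady (ARE) solution
```

Note that the Euler step preserves the symmetry of \(\mathbf{P}\) exactly, since the right-hand side of the Riccati equation is symmetric whenever \(\mathbf{P}\) is.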
Once \(\mathbf{P}\) is evaluated, the control law reads

$$\mathbf{u} = - \mathbf{R}^{-1} \left( \mathbf{S}^T + \mathbf{B}^T \mathbf{P} \right) \mathbf{x} \ .$$
For the infinite horizon, the algebraic Riccati equation (ARE) for the steady state is obtained by setting \(\dot{\mathbf{P}} = \mathbf{0}\). For an LTI system its solution is a constant matrix \(\mathbf{P}\), and the control law is thus a proportional feedback on the full state of the system,

$$\mathbf{u} = - \mathbf{G} \, \mathbf{x} \ ,$$
with \(\mathbf{G} = \mathbf{R}^{-1} \left( \mathbf{S}^T + \mathbf{B}^T \mathbf{P} \right)\).
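One standard algorithm for the ARE builds the associated Hamiltonian matrix and extracts \(\mathbf{P}\) from its stable invariant subspace; SciPy packages this as `scipy.linalg.solve_continuous_are` (which also accepts the cross term). A numpy-only sketch, with an assumed plant and weights:

```python
import numpy as np

# Assumed data (illustrative): 2-state plant, Q >= 0, R > 0, cross term S
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.1])
R = np.array([[0.2]])
S = np.zeros((2, 1))

Ri = np.linalg.inv(R)
At = A - B @ Ri @ S.T          # A tilde
Qt = Q - S @ Ri @ S.T          # Q tilde

# Hamiltonian matrix of the ARE  At^T P + P At - P B R^-1 B^T P + Qt = 0
H = np.block([[At, -B @ Ri @ B.T],
              [-Qt, -At.T]])
w, V = np.linalg.eig(H)
stable = V[:, w.real < 0]       # basis of the stable invariant subspace
X1, X2 = stable[:2, :], stable[2:, :]
P = np.real(X2 @ np.linalg.inv(X1))

G = Ri @ (S.T + B.T @ P)        # full-state feedback gain, u = -G x
print(G)
```

The resulting closed-loop matrix \(\mathbf{A} - \mathbf{B} \mathbf{G}\) is stable by construction, since the feedback places the closed-loop eigenvalues at the stable eigenvalues of the Hamiltonian.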
Example 31.1 (Solution of Riccati equation)
…
Properties. The stabilizing solution \(\mathbf{P}\) of the ARE is symmetric and positive semi-definite; it is positive definite if the pair \((\widetilde{\mathbf{A}}, \widetilde{\mathbf{Q}}^{1/2})\) is observable.
…