Balanced truncation

40.1. Balanced truncation#

While Kalman decomposition provides the theoretical results for the existence of 4 subspaces of a linear dynamical system, and the transformation of coordinates that partition the dynamical systems using reachability and (non-)observatilibty as criteria, this decomposition is likely to be almost useless in many applications: it relies on a sharp distinction between zero and non-zero observability/controllability while these properties can be represented with non-binary numbers but real numbers, and suffers from the limits of finite algebra (the result of a numerical algorithm involving real numbers is unlikely to be \(0\), and thus it’s likely that we need to answer questions like “is \(10^{-9}\) zero enough?” in some step of the algorithm). Moreover, the most interesting part of the system is likely to be the reachable and observable part1: balanced truncation doesn’t suffer from the issues of a full Kalman decomposition while ranking dynamics of the system using controllability and reachability as a criterion.

For finite-time horizon problems, controllability Gramian \(\mathbf{W}_c\) and observability Gramian \(\mathbf{W}_o\) can be computed as the solution of the Lyapunov equation

\[\begin{split}\begin{aligned} \mathbf{0} & = \mathbf{A} \mathbf{W}_c + \mathbf{W}_c \mathbf{A}^T + \mathbf{B} \mathbf{B}^T \\ \mathbf{0} & = \mathbf{A}^T \mathbf{W}_o + \mathbf{W}_o \mathbf{A} + \mathbf{C}^T \mathbf{C} \ . \end{aligned}\end{split}\]

Balanced truncation exploits two pairs of combined scaling+rotation transformations, obtained with singular value decomposition of the Gramians of the system.

Method 1.

Singular value decomposition of the controllability Gramian \(\mathbf{W}_c = \mathbf{U}_c \boldsymbol\Sigma_c^2 \mathbf{U}_c^*\)
First coordinate transformation, \(\mathbf{x} = \mathbf{T}_c \mathbf{x}_1\), with \(\mathbf{T}_c = \mathbf{U}_c \boldsymbol\Sigma_c\), so that \(\mathbf{A}_1 = \mathbf{T}_c^{-1} \mathbf{A} \mathbf{T}_c\), and the observability Gramian of the transformed system can be expressed as \(\mathbf{W}_{o,1} = \mathbf{T}_c^{T} \mathbf{W}_o \mathbf{T}_c\)
Singular value decomposition of \(\mathbf{W}_{o,1} = \mathbf{U}_{o,1} \boldsymbol\Sigma_{o,1}^2 \mathbf{U}_{o,1}^*\)
Second transformation \(\mathbf{x}_1 = \mathbf{T}_o \mathbf{x}_2\), with \(\mathbf{T}_o = \mathbf{U}_{o,1} \boldsymbol\Sigma_{o,1}^{-1/2}\)
Thus, the Gramians in the twice transformed system, with \(\mathbf{T} = \mathbf{T}_c \mathbf{T}_o\), read

\[\begin{split}\begin{aligned} \mathbf{W}_{c,2} & = \mathbf{T}^{-1} \mathbf{W}_{c} \mathbf{T}^{-T} = \left( \boldsymbol\Sigma_{o,1}^{1/2} \mathbf{U}_{o,1}^* \boldsymbol\Sigma_c^{-1} \mathbf{U}_c^* \right) \left( \mathbf{U}_c \boldsymbol\Sigma_c^2 \mathbf{U}_c^* \right) \left( \mathbf{U}_c \boldsymbol\Sigma_c^{-1} \mathbf{U}_{o,1} \boldsymbol \Sigma_{o,1}^{1/2} \right) & = \boldsymbol\Sigma_{o,1} \\ \mathbf{W}_{o,2} & = \mathbf{T}_o^T \mathbf{W}_{o,1} \mathbf{T}_o = \left( \boldsymbol\Sigma_{o,1}^{-1/2} \mathbf{U}_{o,1}^* \right) \left( \mathbf{U}_{o,1} \boldsymbol\Sigma_{o,1}^2 \mathbf{U}_{o,1}^* \right) \left( \mathbf{U}_{o,1} \boldsymbol\Sigma_{o,1}^{-1/2} \right) & = \boldsymbol\Sigma_{o,1} \ , \end{aligned}\end{split}\]

i.e. they’re equal. The pair of transformations makes Gramians equal and diagonal, and the elements of the diagonal - called Hankel values of the system - simultaneously measure the controllability and the observability of the system.

todo here inverse of the diagonal singular value matrices are assumed to exists, i.e. there’s no non-observable or non-controllable state.

The transformed system is then partitioned as

\[\begin{split}\left\{ \begin{aligned} \frac{d}{dt} \begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \end{bmatrix} & = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix} \begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \end{bmatrix} + \begin{bmatrix} \mathbf{B}_{1} \\ \mathbf{B}_{2} \end{bmatrix} \mathbf{u} \\ \mathbf{y} & = \begin{bmatrix} \mathbf{C}_1 & \mathbf{C}_2 \end{bmatrix} \begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \end{bmatrix} + \mathbf{D} \mathbf{u} \end{aligned} \right.\end{split}\]

with \(\mathbf{x} = \mathbf{T} \mathbf{z}\), and thus

\[\mathbf{z} = \mathbf{T}^{-1} \mathbf{x} = \boldsymbol\Sigma_{o,1}^{1/2} \mathbf{U}_{o,1}^* \boldsymbol\Sigma_c^{-1} \mathbf{U}_c^*\]

\[\begin{split}\begin{aligned} \begin{bmatrix} \mathbf{z}_1 \\ \mathbf{z}_2 \end{bmatrix} & = \begin{bmatrix} \boldsymbol\Sigma_{o,1;1}^{1/2} & \mathbf{0} \\ \mathbf{0} & \boldsymbol\Sigma_{o,1;2}^{1/2} \end{bmatrix} \begin{bmatrix} \mathbf{U}_{o,1;11}^* & \mathbf{U}_{o,1;12}^* \\ \mathbf{U}_{o,1;21}^* & \mathbf{U}_{o,1;22}^* \end{bmatrix} \begin{bmatrix} \boldsymbol\Sigma_{c ;1}^{-1 } & \mathbf{0} \\ \mathbf{0} & \boldsymbol\Sigma_{c;2}^{-1} \end{bmatrix} \begin{bmatrix} \mathbf{U}_{c;11}^* & \mathbf{U}_{c;12}^* \\ \mathbf{U}_{c;21}^* & \mathbf{U}_{c;22}^* \end{bmatrix} \mathbf{x} = \\ & = \begin{bmatrix} \boldsymbol\Sigma_{o,1;1}^{1/2}\mathbf{U}_{o,1;11}^* & \boldsymbol\Sigma_{o,1;1}^{1/2}\mathbf{U}_{o,1;12}^* \\ \boldsymbol\Sigma_{o,1;2}^{1/2} \mathbf{U}_{o,1;21}^* & \boldsymbol\Sigma_{o,1;2}^{1/2} \mathbf{U}_{o,1;22}^* \end{bmatrix} \begin{bmatrix} \boldsymbol\Sigma_{c ;1}^{-1 }\mathbf{U}_{c;11}^* & \boldsymbol\Sigma_{c ;1}^{-1 }\mathbf{U}_{c;12}^* \\ \boldsymbol\Sigma_{c;2}^{-1}\mathbf{U}_{c;21}^* & \boldsymbol\Sigma_{c;2}^{-1}\mathbf{U}_{c;22}^* \end{bmatrix} \mathbf{x} = \\ \end{aligned}\end{split}\]

todo Show if this algorithm allows to compute only the desired part of the svd, allowing the incremental computation, and not requiring the full svd computation

Method 2.

Cholesky decomposition (for positive semi-definite matrices) of the controllability and observability Gramians \(\mathbf{W}_c = \mathbf{L}_c \mathbf{L}_c^*\), \(\mathbf{W}_o = \mathbf{L}_o \mathbf{L}_o^*\)
Compute the SVD of the product of the Cholesky factors, \(\mathbf{L}_o^* \mathbf{L}_c = \mathbf{U} \boldsymbol\Sigma \mathbf{V}^*\)
Partition

\[\begin{split} \boldsymbol\Sigma = \begin{bmatrix} \boldsymbol\Sigma_1 & \mathbf{0} \\ \mathbf{0} & \boldsymbol\Sigma_2 \end{bmatrix} \quad , \quad \mathbf{U} = \begin{bmatrix} \ \mathbf{U}_1 & \mathbf{U}_2 \ \end{bmatrix} \quad , \quad \mathbf{V} = \begin{bmatrix} \ \mathbf{V}_1 & \mathbf{V}_2 \ \end{bmatrix} \end{split}\]
Projection matrices \(\mathbf{T} = \mathbf{L}_c \mathbf{V}_1 \boldsymbol\Sigma_1^{-1/2}\), \(\widetilde{\mathbf{T}} = \boldsymbol\Sigma_1^{-1/2} \mathbf{U}_1^* \mathbf{L}_o^*\)2,

\[\begin{split}\begin{aligned} \mathbf{A}_r & = \widetilde{\mathbf{T}} \mathbf{A} \mathbf{T} \\ \mathbf{B}_r & = \widetilde{\mathbf{T}} \mathbf{B} \\ \mathbf{C}_r & = \mathbf{C} \mathbf{T} \\ \mathbf{D}_r & = \mathbf{D} \\ \end{aligned}\end{split}\]

1: The non-observable or non-controllable parts of a system should not contain any unstable dynamics, otherwise it can be measured by the sensors, and/or controlled by the actuators. Observability and controllability are strictly related to the choice of sensors and actuators. If any uncontrollable unstable dynamics exists, a different choice of actuators may solve this issue. If any unobservable unstable dynamics exists, a different choice of sensors may solve the issue.
2: This can be interpreted as a Petrov-Galerkin projection (todo link to Finite Element Method?), i.e. a model reduction with different base and test basis: here, base vectors are the columns of the matrix \(\mathbf{T}\) while test vectors are the rows of the matrix \(\widetilde{\mathbf{T}}\).