Track Fit by the Kalman Filter and Smoother

As already noted, Kalman filtering is one of the basic principles of the CATS track reconstruction strategy: Kalman filters and Kalman-type parameter estimators are embedded in the track-fitting routines of the package.

The Kalman filter addresses the general problem of trying to estimate the state vector $ R$ of a discrete-time process that is governed by the linear stochastic difference equation

$\displaystyle R_k=A_{k-1} R_{k-1} + \nu_{k-1},\qquad k=1,\dots,N,$ (4.1)

where the matrix $ A_{k-1}$ relates the state at step $ k-1$ to the state at step $ k$, and $ \{\nu\}$ is the process noise, a sequence of independent Gaussian variables which can, for example, account for the influence of multiple scattering on the state vector. Within the model (4.1) a track in the Pattern Tracker can be described as a straight-line motion in the presence of Gaussian disturbances:

\begin{displaymath}\begin{array}{ll}
x_k=x_{k-1}+t_{x\,k-1}(z_k-z_{k-1}),& \qquad y_k=y_{k-1}+t_{y\,k-1}(z_k-z_{k-1}),\\
t_{x\,k}=t_{x\,k-1}+\nu_{x\,k-1},& \qquad t_{y\,k}=t_{y\,k-1}+\nu_{y\,k-1}.
\end{array}\end{displaymath} (4.2)

The state vector $ R_k=\Bigl(x,\,y,\,t_x,\,t_y\Bigr)^T$ describes the track parameters taken at the detector plane $ z=z_k$, where $ T$ denotes transposition. The random variables $ \nu_{x\,k}$, $ \nu_{y\,k}$ describe the influence of multiple scattering on the track when it passes through the detector plane $ z=z_k$, $ k=0,1,\dots$. According to the track model (4.2), the matrix $ A$ in equation (4.1) has the form

$\displaystyle A_k=\left(\begin{array}{cccc}
1 & 0 & \Delta z_k & 0\\
0 & 1 & 0 & \Delta z_k\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{array}\right),
\quad \Delta z_k=z_k-z_{k-1},\quad k=1,2,\dots.
$
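As a concrete illustration, the straight-line transport of equations (4.1)-(4.2) can be sketched in a few lines of Python. This is an illustrative sketch only (the function name and the NumPy dependency are assumptions, not part of CATS):

```python
import numpy as np

def transport_matrix(dz):
    """Straight-line transport matrix A for the state (x, y, tx, ty)
    over a step dz = z_k - z_{k-1}."""
    A = np.eye(4)
    A[0, 2] = dz  # x_k = x_{k-1} + tx * dz
    A[1, 3] = dz  # y_k = y_{k-1} + ty * dz
    return A

# Propagate a state over dz = 10: a track at (x, y) = (1, 2)
# with slopes (tx, ty) = (0.1, -0.2)
R = np.array([1.0, 2.0, 0.1, -0.2])
R_next = transport_matrix(10.0) @ R  # -> [2.0, 0.0, 0.1, -0.2]
```

The slopes are left unchanged by the deterministic part of the transport; the multiple-scattering noise $\nu$ enters only through the covariance, as described next.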

In CATS the detector volumes and insensitive walls are treated as so-called thin scatterers [ranger]. For such scatterers, the non-zero elements $ q_{ij}$ of the covariance matrix $ Q_k$ of the noise vector $ \nu=(\nu_x,\nu_y)$ are equal to

\begin{displaymath}
\begin{array}{l}
q_{33}=p^2\sigma^2_{MS}(1+t_x^2)(1+t_x^2+t_y^2)\sqrt{1+t_x^2+t_y^2},\\
q_{34}=q_{43}=p^2\sigma^2_{MS}\,t_x t_y(1+t_x^2+t_y^2)\sqrt{1+t_x^2+t_y^2},\\
q_{44}=p^2\sigma^2_{MS}(1+t_y^2)(1+t_x^2+t_y^2)\sqrt{1+t_x^2+t_y^2},
\end{array}\end{displaymath}

where $ p$ is an external estimate of the inverse momentum of the particle, $ \sigma_{MS}^2$ is the mean variance of the multiple scattering angle for a 1 GeV particle.
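The thin-scatterer noise might be coded as follows. This is a hedged sketch (names are hypothetical); the off-diagonal element $q_{34}=q_{43}\propto t_x t_y$ follows the standard projected multiple-scattering covariance, with the $\sqrt{1+t_x^2+t_y^2}$ factor accounting for the effective material thickness along the inclined track:

```python
import numpy as np

def thin_scatterer_Q(tx, ty, p_inv, sigma_ms):
    """4x4 process-noise matrix Q for a thin scatterer. Only the slope
    components (tx, ty) pick up multiple-scattering noise; the position
    components are unaffected. p_inv is the external inverse-momentum
    estimate, sigma_ms the scattering angle spread for a 1 GeV particle."""
    t2 = 1.0 + tx**2 + ty**2
    common = p_inv**2 * sigma_ms**2 * t2 * np.sqrt(t2)
    Q = np.zeros((4, 4))
    Q[2, 2] = common * (1.0 + tx**2)          # q_33
    Q[3, 3] = common * (1.0 + ty**2)          # q_44
    Q[2, 3] = Q[3, 2] = common * tx * ty      # q_34 = q_43
    return Q
```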

The input to the filter is a sequence of measurements $ \{u\}$ which are described by a linear function of the state vector $ R$

$\displaystyle u_k=H_k R_k + \eta_k,\qquad k=1,\dots,N,$ (4.3)

where the matrix $ H_k$ in the measurement equation (4.3) relates the state to the measurement $ u_k$, and $ \eta_k$ is a sequence of Gaussian random variables with the covariance matrix $ V_k$.

The ITR and OTR have different measurement models. In addition, the OTR measurement model needs linearization. The matrix $ H_k$ for the ITR reads:

$\displaystyle H_k=\Bigl[\cos\alpha_k, \:-\sin\alpha_k, \: 0, \: 0\Bigr],
$

where $ \alpha_k$ is the rotation angle of the strips in the ITR plane $ z=z_k$. The linearized measurement matrix for the OTR has the form:

$\displaystyle H_k=\frac{1}{\sqrt{1+t_{uk}^2}}
\Bigl[\cos\alpha_k, \:-\sin\alpha_k,
\: -\frac{\Delta u_k t_{uk}\cos\alpha_k}{1+t_{uk}^2},
\:\frac{\Delta u_k t_{uk}\sin\alpha_k}{1+t_{uk}^2}\Bigr],
$

where $ \alpha_k$ is the rotation angle of the sensitive wires in the OTR plane $ z=z_k$, and

$\displaystyle t_{uk}=t_{xk}\cos\alpha_k-t_{yk}\sin\alpha_k, \quad
\Delta u_k=x_k\cos\alpha_k-y_k\sin\alpha_k-w_k,
$

where $ w_k$ is the $ x$-coordinate of the $ k$-th wire in the rotated coordinate system.
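The two measurement models can be sketched as follows. This is an illustrative assumption, not the CATS implementation; the slope-derivative terms of the OTR matrix follow the linearization of the normalized track-to-wire distance and should be read against the formulas above:

```python
import numpy as np

def H_itr(alpha):
    """ITR strip measurement matrix: u = x*cos(a) - y*sin(a),
    independent of the track slopes."""
    return np.array([np.cos(alpha), -np.sin(alpha), 0.0, 0.0])

def H_otr(alpha, R, w):
    """Linearized OTR measurement matrix around the state
    R = (x, y, tx, ty) for a wire at rotated coordinate w."""
    x, y, tx, ty = R
    tu = tx * np.cos(alpha) - ty * np.sin(alpha)   # slope along the strip normal
    du = x * np.cos(alpha) - y * np.sin(alpha) - w # track-to-wire distance
    norm = 1.0 / np.sqrt(1.0 + tu**2)
    return norm * np.array([np.cos(alpha),
                            -np.sin(alpha),
                            -du * tu * np.cos(alpha) / (1.0 + tu**2),
                             du * tu * np.sin(alpha) / (1.0 + tu**2)])
```

For a track parallel to the beam axis ($t_u = 0$) the OTR matrix reduces to the ITR form, as expected.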

Let us define $ \widehat{R}_k$ to be the state estimate at step $ k$ after processing the measurement $ u_k$. The main idea of the Kalman filter is that the optimal (in the mean-square sense) estimate $ \widehat{R}_k$ is the sum of an extrapolated estimate $ \widetilde{R}_k$ and a weighted difference between the actual measurement $ u_k$ and the measurement prediction $ H_k\widetilde{R}_k$

$\displaystyle \widehat{R}_k=\widetilde{R}_k+K_k(u_k - H_k\widetilde{R}_k),$    where $\displaystyle \widetilde{R}_k=A_{k-1}\widehat{R}_{k-1}.
$

The matrix $ K_k$ is called the filter gain and is chosen to minimize the sum of diagonal elements of an estimation error covariance matrix $ \widehat{\Gamma}_k$. By definition,

$\displaystyle \widehat{\Gamma}_k={\bf E}\Bigl[(R_k-\widehat{R}_k)(R_k-\widehat{R}_k)^T\Bigr],
$

where E denotes the mathematical expectation. Note that for both detectors, OTR and ITR, the measurement models are scalar. In this case the minimization leads to the following formula for $ K_k$

$\displaystyle K_k=\frac{\widetilde{\Gamma}_k H_k^T}{V_k+H_k\widetilde{\Gamma}_k H_k^T},
$

where $ \widetilde{\Gamma}_k$ is the extrapolation of the estimation error covariance matrix $ \widehat{\Gamma}_{k-1}$. The formula for $ \widetilde{\Gamma}_k$ follows from equation (4.1):

$\displaystyle \widetilde{\Gamma}_k=A_{k-1}\widehat{\Gamma}_{k-1}A^T_{k-1} + Q_{k-1}.
$

The new minimized value of the error covariance matrix $ \widehat{\Gamma}_k$ is defined by the equation

$\displaystyle \widehat{\Gamma}_k=\left(I-K_k H_k\right)\widetilde{\Gamma}_k,
$

where $ I$ is the identity matrix.

The computational algorithm of the discrete Kalman filter consists of two steps:

  1. Prediction step -- extrapolation of the estimate $ \widehat{R}$ and the error covariance matrix $ \widehat{\Gamma}$ to the next step of the algorithm.

    $\displaystyle \widetilde{R}_k=A_{k-1}\widehat{R}_{k-1},\qquad \widetilde{\Gamma}_k=A_{k-1}\widehat{\Gamma}_{k-1}A^T_{k-1}+Q_{k-1}.$ (4.4)

  2. Filtering step
    1. the gain matrix calculation

      $\displaystyle K_k=\frac{\widetilde{\Gamma}_k H_k^T}{V_k+H_k\widetilde{\Gamma}_k H_k^T},
$ (4.5)

    2. the updated estimate

      $\displaystyle \widehat{R}_k=\widetilde{R}_k+K_k(u_k - H_k\widetilde{R}_k),$ (4.6)

    3. the updated error covariance matrix

      $\displaystyle \widehat{\Gamma}_k=\left(I-K_k H_k\right)\widetilde{\Gamma}_k.$ (4.7)

The two steps, prediction and filtering, are repeated until all measurements are processed.
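The prediction-filtering recursion (4.4)-(4.7) can be sketched in Python for scalar measurements. All names and the NumPy dependency are illustrative assumptions, not part of CATS:

```python
import numpy as np

def kalman_filter(R0, G0, steps):
    """Discrete Kalman filter with scalar measurements. `steps` is a list of
    (A, Q, H, u, V) tuples: transport matrix A, process noise Q, measurement
    row H, measured value u and its variance V. Returns the filtered
    estimates and covariances after each step."""
    R, G = R0.copy(), G0.copy()
    history = []
    for A, Q, H, u, V in steps:
        # prediction step (4.4)
        R = A @ R
        G = A @ G @ A.T + Q
        # filtering step: gain (4.5), estimate (4.6), covariance (4.7)
        K = G @ H / (V + H @ G @ H)
        R = R + K * (u - H @ R)
        G = (np.eye(len(R)) - np.outer(K, H)) @ G
        history.append((R.copy(), G.copy()))
    return history

# Toy example with a 2-component state (x, tx): exact measurements of x
# on the line x(z) = 2 + 0.5 z at z = 1, 2, 3, with a diffuse prior.
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # dz = 1 per step
Q = np.zeros((2, 2))
H = np.array([1.0, 0.0])
steps = [(A, Q, H, 2.5, 0.01), (A, Q, H, 3.0, 0.01), (A, Q, H, 3.5, 0.01)]
R_fit, _ = kalman_filter(np.zeros(2), 1e6 * np.eye(2), steps)[-1]
# R_fit is close to [3.5, 0.5]: position at z = 3 and slope of the line
```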

In order to speed up the CATS reconstruction procedure, all fitting routines are written using the optimized numerical implementation of the Kalman filter algorithm described in [CATS-NIM]. In general, there are several ways to reduce the computational cost of the standard Kalman filter/smoother algorithm.

The most time-consuming parts of the filtering step are the calculation of the gain matrix and the updated covariance matrix. Let us rewrite (4.5) as follows

$\displaystyle K_k=s_k^{-1}B_k,
$

where the vector $ B$ and the scalar $ s$ are

$\displaystyle B_k=\widetilde{\Gamma}_k H_k^T,\qquad s_k=V_k+B_k^T H_k^T.
$

In terms of $ B$ and $ s$ the filtering step can be simplified. The optimized update of the triangular covariance matrix reads

$\displaystyle \widehat{\Gamma}_{ij}=\widetilde{\Gamma}_{ij}-B_i K_j, \quad i=1,\dots, 4, \quad j=i,\dots, 4.$ (4.8)
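A sketch of this optimized scalar-measurement update, under the same illustrative assumptions as before (hypothetical names, NumPy): the gain needs no matrix inversion, only a division by the scalar $s_k$, and the covariance update (4.8) touches only the upper triangle, with the lower triangle filled in by symmetry.

```python
import numpy as np

def scalar_update(G_pred, H, V, R_pred, u):
    """Optimized filtering step for a scalar measurement via the
    auxiliary vector B and scalar s."""
    B = G_pred @ H            # B_k = Gamma~_k H_k^T
    s = V + B @ H             # s_k = V_k + B_k^T H_k^T  (a scalar)
    K = B / s                 # gain (4.5), no matrix inversion needed
    R = R_pred + K * (u - H @ R_pred)
    n = len(R)
    G = G_pred.copy()
    for i in range(n):        # eq. (4.8): only the upper triangle
        for j in range(i, n):
            G[i, j] -= B[i] * K[j]
            G[j, i] = G[i, j] # restore symmetry
    return R, G
```

Since $K_i B_j = B_i B_j / s = B_i K_j$, this reproduces $(I-K_kH_k)\widetilde{\Gamma}_k$ exactly while roughly halving the covariance arithmetic.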

In order to get smoothed estimates at each point of a track, CATS employs the standard backward Kalman smoother. The smoother is a recursive algorithm that starts with the last point $ k=N$ and proceeds backwards: at point $ k=N-1$ it updates the estimate and its covariance matrix using the filtered values at that point and the smoothed values at point $ k=N$, which coincide with those given by the Kalman filter. The update of the estimate $ \widehat{R}_{N-1}$ is described as follows

$\displaystyle \widehat{R}_{N-1}^s=\widehat{R}_{N-1}+C_{N-1}\left(\widehat{R}^s_N-A_{N-1}\widehat{R}_{N-1}\right),
$

where superscript ``s'' denotes a smoothed value, $ C_{N-1}$ is the smoother gain given by the equation

$\displaystyle C_{N-1}=\widehat{\Gamma}_{N-1}A_{N-1}^T\widetilde{\Gamma}_{N}^{-1}.
$

The calculation of $ C_{N-1}$ requires the inversion of the $ 4\times4$ symmetric matrix $ \widetilde{\Gamma}_N$ at each point. The smoothed covariance matrix reads

$\displaystyle \widehat{\Gamma}^s_{N-1}=\widehat{\Gamma}_{N-1}+C_{N-1}\left(\widehat{\Gamma}^s_N-\widetilde{\Gamma}_{N}
\right)C_{N-1}^T.
$

By definition $ \widehat{\Gamma}^s_N=\widehat{\Gamma}_N$ and $ \widehat{R}^s_N=\widehat{R}_N$. After updating point $ k=N-1$ the smoother proceeds with point $ k=N-2$ and so on until it reaches the first point $ k=0$. To update the $ k$-th point the smoother uses smoothed values of $ \widehat{R}^s$ and $ \widehat{\Gamma}^s$ already calculated at the previous, $ (k+1)$-th, point.
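The backward recursion can be sketched as follows, again as an illustrative assumption (hypothetical names, NumPy) rather than the CATS implementation:

```python
import numpy as np

def kalman_smoother(filtered, predicted, A_list):
    """Backward Kalman smoother. `filtered` holds the (R^, Gamma^) pairs
    from the filter pass, `predicted` the extrapolated Gamma~ at each step,
    and A_list[k] the transport matrix from step k to step k+1."""
    N = len(filtered) - 1
    Rs, Gs = filtered[N]          # initialization: smoothed = filtered at k = N
    out = [(Rs, Gs)]
    for k in range(N - 1, -1, -1):
        R, G = filtered[k]
        A = A_list[k]
        G_pred = predicted[k + 1]
        C = G @ A.T @ np.linalg.inv(G_pred)   # smoother gain
        Rs = R + C @ (Rs - A @ R)             # smoothed estimate
        Gs = G + C @ (Gs - G_pred) @ C.T      # smoothed covariance
        out.append((Rs, Gs))
    out.reverse()                 # return in the order k = 0, ..., N
    return out
```

Note that the smoothed covariance at each point can only shrink (or stay equal) relative to the filtered one, since the smoother folds in the information from all later measurements.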

Yury Gorbunov 2010-10-21