Track Fit by the Kalman Filter and Smoother

As already noted, Kalman filtering is one of the basic principles of the CATS track reconstruction strategy: Kalman filters and Kalman-type parameter estimators are embedded in the track-fitting routines of the package.

The Kalman filter addresses the general problem of trying to estimate the state vector $ R$ of a discrete-time process that is governed by the linear stochastic difference equation

$\displaystyle R_k=A_{k-1} R_{k-1} + \nu_{k-1},\qquad k=1,\dots,N,$ (4.1)

where the matrix $ A_{k-1}$ relates the state at step $ k-1$ to the state at step $ k$, and $ \{\nu\}$ is the process noise, a sequence of independent Gaussian variables which can, for example, account for the influence of multiple scattering on the state vector. Within the model (4.1) a track in the Pattern Tracker can be described as a straight-line motion in the presence of Gaussian disturbances:

\begin{displaymath}\begin{array}{ll}
x_k=x_{k-1}+t_{x\,k-1}(z_k-z_{k-1}),& \qquad y_k=y_{k-1}+t_{y\,k-1}(z_k-z_{k-1}),\\
t_{x\,k}=t_{x\,k-1}+\nu_{x\,k-1},& \qquad t_{y\,k}=t_{y\,k-1}+\nu_{y\,k-1}.
\end{array}\end{displaymath} (4.2)

The state vector $ R_k=\Bigl(x,\,y,\,t_x,\,t_y\Bigr)^T$ describes the track parameters taken at the detector plane $ z=z_k$, where $ T$ denotes transposition. The random variables $ \nu_{x\,k}$, $ \nu_{y\,k}$ describe the influence of multiple scattering on the track when it passes through the detector plane $ z=z_k$, $ k=0,1,\dots$. According to the track model (4.2), the matrix $ A$ in equation (4.1) has the form

$\displaystyle A_k=\left(\begin{array}{cccc}
1 & 0 & \Delta z_k & 0\\
0 & 1 & 0 & \Delta z_k\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{array}\right),
\quad \Delta z_k=z_k-z_{k-1},\quad k=1,2,\dots.
$
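As a concrete illustration, the straight-line transport of equations (4.1)-(4.2) can be sketched in a few lines of Python. This is an illustrative sketch only (the function name and the NumPy dependency are assumptions, not part of CATS):

```python
import numpy as np

def transport_matrix(dz):
    """Straight-line transport matrix A for the state (x, y, tx, ty)
    over a step dz = z_k - z_{k-1}."""
    A = np.eye(4)
    A[0, 2] = dz  # x_k = x_{k-1} + tx * dz
    A[1, 3] = dz  # y_k = y_{k-1} + ty * dz
    return A

# Propagate a state over dz = 10: a track at (x, y) = (1, 2)
# with slopes (tx, ty) = (0.1, -0.2)
R = np.array([1.0, 2.0, 0.1, -0.2])
R_next = transport_matrix(10.0) @ R  # -> [2.0, 0.0, 0.1, -0.2]
```

The slopes are left unchanged by the deterministic part of the transport; the multiple-scattering noise $\nu$ enters only through the covariance, as described next.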

In CATS the detector volumes and insensitive walls are treated as so-called thin scatterers [ranger]. For such scatterers, the non-zero elements $ q_{ij}$ of the covariance matrix $ Q_k$ of the noise vector $ \nu=(\nu_x,\nu_y)$ are equal to

\begin{displaymath}
\begin{array}{l}
q_{33}=p^2\sigma^2_{MS}(1+t_x^2)(1+t_x^2+t_y^2)\sqrt{1+t_x^2+t_y^2},\\
q_{34}=q_{43}=p^2\sigma^2_{MS}\,t_x t_y(1+t_x^2+t_y^2)\sqrt{1+t_x^2+t_y^2},\\
q_{44}=p^2\sigma^2_{MS}(1+t_y^2)(1+t_x^2+t_y^2)\sqrt{1+t_x^2+t_y^2},
\end{array}\end{displaymath}

where $ p$ is an external estimate of the inverse momentum of the particle, $ \sigma_{MS}^2$ is the mean variance of the multiple scattering angle for a 1 GeV particle.
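The thin-scatterer noise might be coded as follows. This is a hedged sketch (names are hypothetical); the off-diagonal element $q_{34}=q_{43}\propto t_x t_y$ follows the standard projected multiple-scattering covariance, with the $\sqrt{1+t_x^2+t_y^2}$ factor accounting for the effective material thickness along the inclined track:

```python
import numpy as np

def thin_scatterer_Q(tx, ty, p_inv, sigma_ms):
    """4x4 process-noise matrix Q for a thin scatterer. Only the slope
    components (tx, ty) pick up multiple-scattering noise; the position
    components are unaffected. p_inv is the external inverse-momentum
    estimate, sigma_ms the scattering angle spread for a 1 GeV particle."""
    t2 = 1.0 + tx**2 + ty**2
    common = p_inv**2 * sigma_ms**2 * t2 * np.sqrt(t2)
    Q = np.zeros((4, 4))
    Q[2, 2] = common * (1.0 + tx**2)          # q_33
    Q[3, 3] = common * (1.0 + ty**2)          # q_44
    Q[2, 3] = Q[3, 2] = common * tx * ty      # q_34 = q_43
    return Q
```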

The input to the filter is a sequence of measurements $ \{u\}$ which are described by a linear function of the state vector $ R$

$\displaystyle u_k=H_k R_k + \eta_k,\qquad k=1,\dots,N,$ (4.3)

where the matrix $ H_k$ in the measurement equation (4.3) relates the state to the measurement $ u_k$, and $ \eta_k$ is a sequence of Gaussian random variables with the covariance matrix $ V_k$.

The ITR and OTR have different measurement models. In addition, the OTR measurement model needs linearization. The matrix $ H_k$ for the ITR reads:

$\displaystyle H_k=\Bigl[\cos\alpha_k, \:-\sin\alpha_k, \: 0, \: 0\Bigr],
$

where $ \alpha_k$ is the rotation angle of the strips in the ITR plane $ z=z_k$. The linearized measurement matrix for the OTR has the form:

$\displaystyle H_k=\frac{1}{\sqrt{1+t_{uk}^2}}
\Bigl[\cos\alpha_k, \:-\sin\alpha_k,
\: -\frac{\Delta u_k t_{uk}\cos\alpha_k}{1+t_{uk}^2},
\:\frac{\Delta u_k t_{uk}\sin\alpha_k}{1+t_{uk}^2}\Bigr],
$

where $ \alpha_k$ is the rotation angle of the sensitive wires in the OTR plane $ z=z_k$, and

$\displaystyle t_{uk}=t_{xk}\cos\alpha_k-t_{yk}\sin\alpha_k, \quad
\Delta u_k=x_k\cos\alpha_k-y_k\sin\alpha_k-w_k,
$

where $ w_k$ is the $ x$-coordinate of the $ k$-th wire in the rotated coordinate system.
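The two measurement models can be sketched as follows. This is an illustrative assumption, not the CATS implementation; the slope-derivative terms of the OTR matrix follow the linearization of the normalized track-to-wire distance and should be read against the formulas above:

```python
import numpy as np

def H_itr(alpha):
    """ITR strip measurement matrix: u = x*cos(a) - y*sin(a),
    independent of the track slopes."""
    return np.array([np.cos(alpha), -np.sin(alpha), 0.0, 0.0])

def H_otr(alpha, R, w):
    """Linearized OTR measurement matrix around the state
    R = (x, y, tx, ty) for a wire at rotated coordinate w."""
    x, y, tx, ty = R
    tu = tx * np.cos(alpha) - ty * np.sin(alpha)   # slope along the strip normal
    du = x * np.cos(alpha) - y * np.sin(alpha) - w # track-to-wire distance
    norm = 1.0 / np.sqrt(1.0 + tu**2)
    return norm * np.array([np.cos(alpha),
                            -np.sin(alpha),
                            -du * tu * np.cos(alpha) / (1.0 + tu**2),
                             du * tu * np.sin(alpha) / (1.0 + tu**2)])
```

For a track parallel to the beam axis ($t_u = 0$) the OTR matrix reduces to the ITR form, as expected.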

Let us define $ \widehat{R}_k$ to be the state estimate at step $ k$ after processing the measurement $ u_k$. The main idea of the Kalman filter is that the optimal (in the mean-square sense) estimate $ \widehat{R}_k$ is the sum of an extrapolated estimate $ \widetilde{R}_k$ and a weighted difference between the actual measurement $ u_k$ and the measurement prediction $ H_k\widetilde{R}_k$

$\displaystyle \widehat{R}_k=\widetilde{R}_k+K_k(u_k - H_k\widetilde{R}_k),$    where $\displaystyle \widetilde{R}_k=A_{k-1}\widehat{R}_{k-1}.
$

The matrix $ K_k$ is called the filter gain and is chosen to minimize the sum of diagonal elements of an estimation error covariance matrix $ \widehat{\Gamma}_k$. By definition,

$\displaystyle \widehat{\Gamma}_k={\bf E}\Bigl[(R_k-\widehat{R}_k)(R_k-\widehat{R}_k)^T\Bigr],
$

where E denotes the mathematical expectation. Note that for both detectors, OTR and ITR, the measurement models are scalar. In this case the minimization leads to the following formula for $ K_k$

$\displaystyle K_k=\frac{\widetilde{\Gamma}_k H_k^T}{V_k+H_k\widetilde{\Gamma}_k H_k^T},
$

where $ \widetilde{\Gamma}_k$ is the extrapolation of the estimation error covariance matrix $ \widehat{\Gamma}_{k-1}$. The formula for $ \widetilde{\Gamma}_k$ follows from equation (4.1):

$\displaystyle \widetilde{\Gamma}_k=A_{k-1}\widehat{\Gamma}_{k-1}A^T_{k-1} + Q_{k-1}.
$

The new minimized value of the error covariance matrix $ \widehat{\Gamma}_k$ is defined by the equation

$\displaystyle \widehat{\Gamma}_k=\left(I-K_k H_k\right)\widetilde{\Gamma}_k,
$

where $ I$ is the identity matrix.

The computational algorithm of the discrete Kalman filter consists of two steps:

  1. Prediction step -- extrapolation of the estimate $ \widehat{R}$ and the error covariance matrix $ \widehat{\Gamma}$ to the next step of the algorithm.

    $\displaystyle \widetilde{R}_k=A_{k-1}\widehat{R}_{k-1},\qquad \widetilde{\Gamma}_k=A_{k-1}\widehat{\Gamma}_{k-1}A^T_{k-1}+Q_{k-1}.$ (4.4)

  2. Filtering step
    1. the gain matrix calculation

      $\displaystyle K_k=\frac{\widetilde{\Gamma}_k H_k^T}{V_k+H_k\widetilde{\Gamma}_k H_k^T},
$ (4.5)

    2. the updated estimate

      $\displaystyle \widehat{R}_k=\widetilde{R}_k+K_k(u_k - H_k\widetilde{R}_k),$ (4.6)

    3. the updated error covariance matrix

      $\displaystyle \widehat{\Gamma}_k=\left(I-K_k H_k\right)\widetilde{\Gamma}_k.$ (4.7)

The two steps, prediction and filtering, are repeated until all measurements are processed.
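The prediction-filtering recursion (4.4)-(4.7) can be sketched in Python for scalar measurements. All names and the NumPy dependency are illustrative assumptions, not part of CATS:

```python
import numpy as np

def kalman_filter(R0, G0, steps):
    """Discrete Kalman filter with scalar measurements. `steps` is a list of
    (A, Q, H, u, V) tuples: transport matrix A, process noise Q, measurement
    row H, measured value u and its variance V. Returns the filtered
    estimates and covariances after each step."""
    R, G = R0.copy(), G0.copy()
    history = []
    for A, Q, H, u, V in steps:
        # prediction step (4.4)
        R = A @ R
        G = A @ G @ A.T + Q
        # filtering step: gain (4.5), estimate (4.6), covariance (4.7)
        K = G @ H / (V + H @ G @ H)
        R = R + K * (u - H @ R)
        G = (np.eye(len(R)) - np.outer(K, H)) @ G
        history.append((R.copy(), G.copy()))
    return history

# Toy example with a 2-component state (x, tx): exact measurements of x
# on the line x(z) = 2 + 0.5 z at z = 1, 2, 3, with a diffuse prior.
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # dz = 1 per step
Q = np.zeros((2, 2))
H = np.array([1.0, 0.0])
steps = [(A, Q, H, 2.5, 0.01), (A, Q, H, 3.0, 0.01), (A, Q, H, 3.5, 0.01)]
R_fit, _ = kalman_filter(np.zeros(2), 1e6 * np.eye(2), steps)[-1]
# R_fit is close to [3.5, 0.5]: position at z = 3 and slope of the line
```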

In order to speed up the CATS reconstruction procedure, all fitting routines are written using the optimized numerical implementation of the Kalman filter algorithm described in [CATS-NIM]. In general, there are several ways to reduce the computational cost of the standard Kalman filter/smoother algorithm.

The most time-consuming parts of the filtering step are the calculation of the gain matrix and the updated covariance matrix. Let us rewrite (4.5) as follows

$\displaystyle K_k=s_k^{-1}B_k,
$

where the vector $ B$ and the scalar $ s$ are

$\displaystyle B_k=\widetilde{\Gamma}_k H_k^T,\qquad s_k=V_k+B_k^T H_k^T.
$

In terms of $ B$ and $ s$ the filtering step can be simplified. The optimized update of the triangular covariance matrix reads

$\displaystyle \widehat{\Gamma}_{ij}=\widetilde{\Gamma}_{ij}-B_i K_j, \quad i=1,\dots, 4, \quad j=i,\dots, 4.$ (4.8)
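A sketch of this optimized scalar-measurement update, under the same illustrative assumptions as before (hypothetical names, NumPy): the gain needs no matrix inversion, only a division by the scalar $s_k$, and the covariance update (4.8) touches only the upper triangle, with the lower triangle filled in by symmetry.

```python
import numpy as np

def scalar_update(G_pred, H, V, R_pred, u):
    """Optimized filtering step for a scalar measurement via the
    auxiliary vector B and scalar s."""
    B = G_pred @ H            # B_k = Gamma~_k H_k^T
    s = V + B @ H             # s_k = V_k + B_k^T H_k^T  (a scalar)
    K = B / s                 # gain (4.5), no matrix inversion needed
    R = R_pred + K * (u - H @ R_pred)
    n = len(R)
    G = G_pred.copy()
    for i in range(n):        # eq. (4.8): only the upper triangle
        for j in range(i, n):
            G[i, j] -= B[i] * K[j]
            G[j, i] = G[i, j] # restore symmetry
    return R, G
```

Since $K_i B_j = B_i B_j / s = B_i K_j$, this reproduces $(I-K_kH_k)\widetilde{\Gamma}_k$ exactly while roughly halving the covariance arithmetic.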

In order to get smoothed estimates at each point of a track, CATS employs the standard backward Kalman smoother. The smoother is a recursive algorithm that starts with the last point $ k=N$ and proceeds backwards: at point $ k=N-1$ it updates the estimate and its covariance matrix using the filtered values at that point and the smoothed values at point $ k=N$, which coincide with those given by the Kalman filter. The update of the estimate $ \widehat{R}_{N-1}$ is described as follows

$\displaystyle \widehat{R}_{N-1}^s=\widehat{R}_{N-1}+C_{N-1}\left(\widehat{R}^s_N-A_{N-1}\widehat{R}_{N-1}\right),
$

where superscript ``s'' denotes a smoothed value, $ C_{N-1}$ is the smoother gain given by the equation

$\displaystyle C_{N-1}=\widehat{\Gamma}_{N-1}A_{N-1}^T\widetilde{\Gamma}_{N}^{-1}.
$

The calculation of $ C_{N-1}$ requires the inversion of the $ 4\times4$ symmetric matrix $ \widetilde{\Gamma}_N$ at each point. The smoothed covariance matrix reads

$\displaystyle \widehat{\Gamma}^s_{N-1}=\widehat{\Gamma}_{N-1}+C_{N-1}\left(\widehat{\Gamma}^s_N-\widetilde{\Gamma}_{N}
\right)C_{N-1}^T.
$

By definition $ \widehat{\Gamma}^s_N=\widehat{\Gamma}_N$ and $ \widehat{R}^s_N=\widehat{R}_N$. After updating point $ k=N-1$ the smoother proceeds with point $ k=N-2$ and so on until it reaches the first point $ k=0$. To update the $ k$-th point the smoother uses smoothed values of $ \widehat{R}^s$ and $ \widehat{\Gamma}^s$ already calculated at the previous, $ (k+1)$-th, point.
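The backward recursion can be sketched as follows, again as an illustrative assumption (hypothetical names, NumPy) rather than the CATS implementation:

```python
import numpy as np

def kalman_smoother(filtered, predicted, A_list):
    """Backward Kalman smoother. `filtered` holds the (R^, Gamma^) pairs
    from the filter pass, `predicted` the extrapolated Gamma~ at each step,
    and A_list[k] the transport matrix from step k to step k+1."""
    N = len(filtered) - 1
    Rs, Gs = filtered[N]          # initialization: smoothed = filtered at k = N
    out = [(Rs, Gs)]
    for k in range(N - 1, -1, -1):
        R, G = filtered[k]
        A = A_list[k]
        G_pred = predicted[k + 1]
        C = G @ A.T @ np.linalg.inv(G_pred)   # smoother gain
        Rs = R + C @ (Rs - A @ R)             # smoothed estimate
        Gs = G + C @ (Gs - G_pred) @ C.T      # smoothed covariance
        out.append((Rs, Gs))
    out.reverse()                 # return in the order k = 0, ..., N
    return out
```

Note that the smoothed covariance at each point can only shrink (or stay equal) relative to the filtered one, since the smoother folds in the information from all later measurements.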

Yury Gorbunov 2010-10-21