Requirements for the TPC Simulator Project
(Draft 1.0, Feb. 28, 1998)
This note defines the functionality of the new package TpcResponseSimulator. It is intended to serve as the blueprint for the writing of the general TPC response simulator package in C++.
The detail with which the package treats each of these effects should be flexible, allowing optimization of the balance of CPU time and precision of results for a given analysis task. In particular, the two approaches to the TPC simulation embodied in the current packages tss (explicit generation of electronics signal, with output of raw, digitized data) and tfs (bypassing the generation of raw data and cluster finding, outputting space points instead) should be included in the new package.
The algorithmic basis for the response simulations should be that developed in detail by the ALEPH TPC group (described in the book by Blum and Rolandi, more schematically in the ALEPH handbook), expanded in ref (ii). The package should contain parametrizations of the TPC response at a variety of levels of detail, with clear paths to tuning the relevant parameters in the model to all available test data.
The main development of the package should be in the direction of a production level tool that accurately simulates the interaction between tracks in order to estimate the effects on physics of the environment (especially high track density) in the chamber. This is in contrast to a near-first-principles calculation of the chamber response to isolated tracks. Where these two requirements conflict, hooks should be created to allow the development of detailed simulations to occur separately or at a later date.
Two sources of ionization of the TPC gas should be treated: charged tracks and laser events. The charged tracks create clusters whose distribution of amplitudes (= number of electrons in cluster) is Landau-like, whereas the laser creates uniformly distributed ionization, resulting in Poisson distributed cluster amplitudes.
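The laser case can be illustrated with a short sketch using the standard library's Poisson generator. The function name and the meanElectrons parameter are hypothetical and not taken from any existing package. The Landau-like distribution for charged tracks is not shown, since the standard library offers no such generator and the appropriate parametrization is still to be chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draw cluster amplitudes (number of electrons per cluster) for laser
// ionization, assumed Poisson distributed about a tunable mean.
// meanElectrons is an illustrative parameter, not a measured STAR value.
std::vector<int> sampleLaserClusters(std::size_t nClusters,
                                     double meanElectrons,
                                     std::mt19937& rng) {
    assert(meanElectrons > 0.0);
    std::poisson_distribution<int> poisson(meanElectrons);
    std::vector<int> amplitudes;
    amplitudes.reserve(nClusters);
    for (std::size_t i = 0; i < nClusters; ++i) {
        amplitudes.push_back(poisson(rng));
    }
    return amplitudes;
}
```

Passing the random engine by reference keeps the sequence reproducible for a given seed, which helps when comparing simulation options.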
The magnetic field should be continuously variable from zero to the nominal maximum of the STAR TPC. Both ideal (cylindrical) fields and field maps should be handled.
The following field-related physics should be accounted for:
The essential parameters of the gas for these calculations (overall drift velocity, which may vary from event to event; omega-tau) should not be parametrized as a function of E and B but should be read from a database or set externally.
There should be no hard-wired geometry constants: all geometry constants should be read from an external database, even if they will never or rarely change (such as readout chamber wire pitch, clock frequency, or number of time buckets). All geometrical parameters must be read from a database which is also common to gstar and the reconstruction routines. The development of the geometry database is urgent and critical.
A variety of coordinate systems are needed by TpcResponseSimulator, corresponding for instance to those local to a pad row, a sector, a supersector, and the TPC. An interface should be developed between TPC and the geometry database, isolating the TpcResponseSimulator modules from the actual implementation of the database, and delivering coordinates in the required coordinate systems.
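One possible shape for such an interface is an abstract base class that simulator modules program against, with the database-backed implementation hidden behind it. All names below (Point3D, TpcGeometryProvider, StubGeometry, isValidPadRow) are hypothetical, and the stub's rotation convention and pad-row count are placeholders for illustration, not the STAR geometry.

```cpp
#include <cassert>
#include <cmath>

struct Point3D { double x, y, z; };

// Hypothetical abstract interface: simulator code sees only this class,
// never the database implementation behind it.
class TpcGeometryProvider {
public:
    virtual ~TpcGeometryProvider() = default;
    virtual int numberOfPadRows(int sector) const = 0;
    // Convert a point from the global TPC frame to a sector-local frame.
    virtual Point3D globalToSector(const Point3D& global, int sector) const = 0;
};

// Test stand-in with placeholder values. A real implementation would
// fetch these quantities from the geometry database.
class StubGeometry : public TpcGeometryProvider {
public:
    StubGeometry(int padRows, double sectorAngleRad)
        : padRows_(padRows), sectorAngleRad_(sectorAngleRad) {
        assert(padRows > 0);
    }
    int numberOfPadRows(int /*sector*/) const override { return padRows_; }
    Point3D globalToSector(const Point3D& g, int sector) const override {
        // Placeholder convention: rotate about z by sector * sectorAngleRad.
        const double phi = sector * sectorAngleRad_;
        const double c = std::cos(phi), s = std::sin(phi);
        return Point3D{g.x * c + g.y * s, -g.x * s + g.y * c, g.z};
    }
private:
    int padRows_;
    double sectorAngleRad_;
};

// Example consumer: depends only on the interface.
bool isValidPadRow(const TpcGeometryProvider& geo, int sector, int row) {
    return row >= 1 && row <= geo.numberOfPadRows(sector);
}
```

With this arrangement the database implementation can change without touching the simulator modules, and a stub can stand in for it in tests.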
For TpcResponseSimulator, essential geometrical parameters include:
There should be the option to apply the inverse of calibration constants appropriate to a real or fictional data run, with the option to dither these quantities with a resolution appropriate to the calibration data. Examples of calibration constants are:
Tables of hot, dead and noisy pads, chips, cards or sectors should be optionally applied when calculating pad responses.
An interface to the calibration database should be developed to isolate TpcResponseSimulator code from actual implementation of the database.
Conversion from dE/dx to electrons requires knowledge of the ionization potential of the gas.
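As a minimal sketch of the conversion, assuming the gas is characterized by a single effective energy per liberated electron: the mean number of electrons is the energy loss divided by that energy. The function and parameter names are hypothetical, and the gas parameter is passed in rather than hard-wired, so that it can come from the database.

```cpp
#include <cassert>

// Mean number of ionization electrons for a given energy loss.
// meanEnergyPerElectronEv is an assumed effective gas parameter, to be
// supplied externally (for example from the calibration database).
double meanIonizationElectrons(double energyLossEv,
                               double meanEnergyPerElectronEv) {
    assert(energyLossEv >= 0.0);
    assert(meanEnergyPerElectronEv > 0.0);
    return energyLossEv / meanEnergyPerElectronEv;
}
```

For example, an energy loss of 2600 eV with an effective 26 eV per electron gives a mean of 100 electrons. The 26 eV figure is used only to make the arithmetic concrete.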
How to set the average gain? Perhaps by comparison to mean of maximum ADC value in a cluster for min ionizing tracks.
Pad response function: formulae for the induction of charge on the padrow parallel and perpendicular to the wire direction and for convolution of the diffused charge profile with the shaper response are given in ref. (ii). Care should be taken to minimize the calls to the special functions, for instance by calculating or loading tables of the integrals for the expected range of parameters upon program initialization. During program execution, the integrals are then calculated by interpolation of the table entries. Standard math libraries exist to carry out this procedure to the required resolution.
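The tabulate-then-interpolate procedure might look like the following sketch. It uses std::erf as a stand-in for whichever special functions the pad response formulae of ref. (ii) actually require, and the class name ErfTable is hypothetical. The table is filled once at construction and lookups use linear interpolation, clamped at the table edges.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Precomputed table of std::erf on a uniform grid with linear interpolation.
// std::erf stands in for the special functions of the pad response model.
class ErfTable {
public:
    ErfTable(double xMin, double xMax, std::size_t nPoints)
        : xMin_(xMin),
          step_((xMax - xMin) / static_cast<double>(nPoints - 1)),
          values_(nPoints) {
        assert(nPoints >= 2 && xMax > xMin);
        for (std::size_t i = 0; i < nPoints; ++i) {
            values_[i] = std::erf(xMin_ + static_cast<double>(i) * step_);
        }
    }

    // Interpolated lookup; arguments outside the table range are clamped.
    double operator()(double x) const {
        const double t = (x - xMin_) / step_;
        if (t <= 0.0) return values_.front();
        if (t >= static_cast<double>(values_.size() - 1)) return values_.back();
        const std::size_t i = static_cast<std::size_t>(t);
        const double frac = t - static_cast<double>(i);
        return values_[i] + frac * (values_[i + 1] - values_[i]);
    }

private:
    double xMin_;
    double step_;
    std::vector<double> values_;
};
```

The grid spacing sets the accuracy: for linear interpolation the error scales with the square of the step, so the table size can be chosen to match the required resolution.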
Charge that is nominally deposited outside a padrow "volume" can drift into it or induce charge on its pads from the adjacent wires. For the inner sectors, charge in the "pseudopadrows" defined in GSTAR should be drifted and amplified to account for this effect. For the outer sectors, charge in each padrow should contribute also to neighbouring padrows. The consequence is that a window of three (pseudo)padrow volumes must be accessible at once: the padrow in which the current hit lies, and those adjacent to it. Care must be exercised to ensure efficient handling of data. See the description of data access and looping in reference (iii).
An important issue to investigate is the detail with which the charge deposition is modelled across the pad. For finite pad crossing angle or dip angle, the ionization statistics generate an irresolution in the induced charge pattern within the padrows, as well as correlations between adjacent padrows in the outer sectors. See discussion in Appendix A by Roy Bossingham. The current tss calculates the helical trajectory over the padrow from the information in the hit generated by GSTAR (mean location and 3-momentum and total energy deposition of track segment) and divides up the energy deposition randomly, conserving the total. The compelling arguments in favour of this approach are the calculation of correlations in cluster shape and simplicity of parametrization. However, the computational cost vs. gain in precision must be evaluated in comparison to coarser parametrizations, and both should be implemented for the purpose of this comparison.
The effect of diffusion on the "declustering" (see Blum and Rolandi, section 6.2.3) should be taken into account. This interacts with the previous point. Due to the absence of field wires, an irresolution in the time of arrival at the pad can occur due to varying path lengths of drifting electrons in the vicinity of the amplification cell (see Fig. 6.2 of Blum and Rolandi; also measurements by W. Betts et al.). However, diffusion may wash this effect out. What is the expected magnitude, and should it be implemented?
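One way to divide an energy deposition randomly while conserving the total is successive binary subdivision of the step. The sketch below is illustrative only: the function names are hypothetical, and the uniformly distributed split fraction is a placeholder assumption, not the cluster-model parametrization used in tss.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Recursively split `energy` into 2^depth sub-segments.
// Placeholder assumption: the left share is a uniform random fraction;
// a physics-based share distribution would replace this draw.
void subdivide(double energy, int depth, std::mt19937& rng,
               std::vector<double>& out) {
    if (depth == 0) {
        out.push_back(energy);
        return;
    }
    std::uniform_real_distribution<double> fraction(0.0, 1.0);
    const double left = energy * fraction(rng);
    subdivide(left, depth - 1, rng, out);
    subdivide(energy - left, depth - 1, rng, out);  // remainder conserves total
}

// Split the total energy loss of one step into 2^depth ordered segments.
std::vector<double> splitEnergyLoss(double totalEnergy, int depth,
                                    std::mt19937& rng) {
    assert(totalEnergy >= 0.0 && depth >= 0 && depth < 20);
    std::vector<double> segments;
    segments.reserve(static_cast<std::size_t>(1) << depth);
    subdivide(totalEnergy, depth, rng, segments);
    return segments;
}
```

Because each split hands the remainder to the second half, the segment energies sum to the input total up to floating-point rounding, whatever share distribution is substituted.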
The difference in practice of correlated vs uncorrelated noise should be established (is there a STAR Note on this?). If correlated noise is important, an appropriate model should be implemented (or parametrized?). What about multiple sources of noise?
Optionally, pulse shape up to long drift time (tens of microseconds) should be available to calculate possible baseline shift at high track density.
Optionally, charge should be integrated as a function of time. Preamp saturation should be accounted for (what is known from bench tests on preamp saturation?). The main reason to implement this is that low energy delta electrons spiral in the magnetic field, circling over and over above a limited number of pads. These pads will saturate and the charge coming later in the drift time will not be registered. In other words, the delta electrons drill holes in the charge deposition. This effect will increase with increasing track density. (To make a proper estimate of this effect, GSTAR must properly generate and propagate the delta electrons. This is known to be reasonably well done in Geant, but must be retested.)
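The "hole drilling" effect can be illustrated with a deliberately crude sketch: charge on one pad is integrated bucket by bucket, and once an assumed saturation level is reached, later charge is not registered. The function name, the hard saturation threshold, and the absence of any preamp recovery are all simplifying assumptions. The real behaviour would have to come from bench tests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Registered charge per time bucket for one pad, with a hard saturation
// limit on the integrated charge. No preamp recovery is modelled here.
std::vector<double> registerChargeWithSaturation(
        const std::vector<double>& chargePerBucket,
        double saturationCharge) {
    assert(saturationCharge > 0.0);
    std::vector<double> registered(chargePerBucket.size(), 0.0);
    double integrated = 0.0;
    for (std::size_t i = 0; i < chargePerBucket.size(); ++i) {
        const double room = saturationCharge - integrated;
        if (room <= 0.0) break;  // saturated: later charge is lost
        const double accepted =
            chargePerBucket[i] < room ? chargePerBucket[i] : room;
        registered[i] = accepted;
        integrated += accepted;
    }
    return registered;
}
```

For input charges {3, 4, 5, 2} and a saturation level of 8, the registered charges are {3, 4, 1, 0}: the third bucket is truncated and the fourth is lost entirely.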
Z to time conversion is done assuming a drift velocity and clock frequency. The effect of the gate delay should be accounted for in defining the effective drift volume.
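A minimal sketch of the conversion follows, under the assumption that the gate delay simply shifts the start of sampling so that charge arriving earlier is not recorded. The function name, the units, and this treatment of the delay are illustrative; drift velocity and clock frequency are passed in so that they can come from the database.

```cpp
#include <cassert>
#include <cmath>

// Time-bucket index for charge drifting driftLengthCm to the readout plane.
// Returns -1 if the charge arrives before sampling starts (assumed effect
// of the gate delay). Units: cm, cm/microsecond, MHz, microseconds.
int timeBucket(double driftLengthCm, double driftVelocityCmPerUs,
               double clockFrequencyMHz, double gateDelayUs) {
    assert(driftVelocityCmPerUs > 0.0 && clockFrequencyMHz > 0.0);
    const double arrivalUs = driftLengthCm / driftVelocityCmPerUs;
    const double sinceGateUs = arrivalUs - gateDelayUs;
    if (sinceGateUs < 0.0) return -1;
    return static_cast<int>(std::floor(sinceGateUs * clockFrequencyMHz));
}
```

A caller would additionally compare the result with the number of time buckets, which is itself a database quantity, to reject charge arriving after sampling ends.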
Digitization occurs after addition of noise, possible baseline shift, etc. Data should be packed appropriately into 10 bits (nonlinear ADC), including saturation effects.
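The saturation part of this step can be sketched as below. The function name and the linear gain parameter are assumptions, and the nonlinear ADC response is intentionally left out: it would be applied as a separate mapping whose form is not specified here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert an analogue signal (noise and baseline shift already added)
// into a 10-bit count, clamped to the range 0..1023.
// adcCountsPerUnit is an assumed linear gain; any nonlinear response
// would be layered on top of this step.
std::uint16_t digitize10Bit(double signal, double adcCountsPerUnit) {
    assert(adcCountsPerUnit > 0.0);
    const double maxCount = 1023.0;  // 2^10 - 1
    double counts = std::floor(signal * adcCountsPerUnit);
    if (counts < 0.0) counts = 0.0;
    if (counts > maxCount) counts = maxCount;  // ADC saturation
    return static_cast<std::uint16_t>(counts);
}
```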
Zero suppression algorithm should be applied last.
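As an illustration of what a zero suppression step might do, the sketch below keeps only runs of consecutive time buckets above a threshold, and only if the run is long enough. The threshold-and-minimum-length rule, the Sequence structure, and the function name are assumptions for illustration, not the actual front-end algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One surviving run of ADC samples on a pad.
struct Sequence {
    std::size_t firstBucket;         // index of the first kept time bucket
    std::vector<std::uint16_t> adc;  // samples in the run
};

// Keep runs of at least minLength consecutive samples above threshold.
std::vector<Sequence> zeroSuppress(const std::vector<std::uint16_t>& samples,
                                   std::uint16_t threshold,
                                   std::size_t minLength) {
    assert(minLength >= 1);
    std::vector<Sequence> kept;
    std::size_t i = 0;
    while (i < samples.size()) {
        if (samples[i] <= threshold) { ++i; continue; }
        const std::size_t start = i;
        while (i < samples.size() && samples[i] > threshold) ++i;
        if (i - start >= minLength) {
            const auto first = samples.begin() + static_cast<std::ptrdiff_t>(start);
            const auto last = samples.begin() + static_cast<std::ptrdiff_t>(i);
            kept.push_back(Sequence{start, std::vector<std::uint16_t>(first, last)});
        }
    }
    return kept;
}
```

For samples {0, 1, 5, 6, 7, 0, 9, 0, 4, 4} with threshold 2 and minimum length 2, two runs survive: buckets 2 to 4 and buckets 8 to 9. The isolated sample in bucket 6 is dropped.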
What are appropriate output tables?
The current tfs parametrizes sections 8-10 above plus the effect of the cluster finder. What is appropriate implementation in this framework?
Sampling from real cluster distributions: extraction of real clusters from isolated tracks (low multiplicity events) in place of sections 8-10 above. While this technique has the obvious advantage of reality, it also has some problems: either it draws from an infinite library of clusters, or the clusters are only approximately appropriate to the MC track to which they are assigned (PID, momentum, dip angle, phasing wrt sector boundaries and pads, hot, dead and noisy pads, etc.). How is the library catalogued for efficient and accurate access to data? These are solvable but not simple problems. Dave H. points out the special importance of this approach near sector edges. Needs investigation.
**************************************************
The present ionization treatment in tss, due to Mike Lisa, starts with the Geant energy loss over a step (one pad row), then allows successive binary subdivision of the step, with each subdivision drawing its fractional share of the total energy loss according to Mike's parameterization of a cluster-model subroutine (mine). In principle, this should be accurate: Geant's PAI model is pretty good, as long as the total energy loss over a step is above the Ar atomic levels (see SN0249); and cluster-size measurements exist for Ar/CH4. The results should be better than one would get from using equally short steps directly in Geant.
In practice, the shape of the tss resolution vs. crossing angle curves is not quite what we expect, and we ran into some convergence problems in the numerical routine---so the specific implementation should be rechecked---but I don't know of anything conceptually wrong with the approach.
We also know that tss runs slowly, but the code has not been profiled. At one point, Iwona made some changes to the noise routines and got a large speedup, suggesting that the dominant CPU usage could be unrelated to the ionization treatment. An off-the-cuff reaction is that we shouldn't simplify tss before we see realistic results; one can always provide faster options if the full-blown simulation is shown to be unnecessary for a particular application.
Questions for a new tss are:
I can only answer (1) for myself: I'd like to use it to improve the tcl cluster finder; we currently have no other way to study its operation in a dense hit environment. Similarly, I would like to be able to reproduce tracking in a dense track environment, including the full range of ionization.
We can't really answer (3), because we lack basic information, but on the one hand, we should make tss as simple as possible and no simpler; on the other hand, we shouldn't put in more detail than we can afford. These are probably contradictory, but 10-20% accuracy may be practical.
So, consider (2): what do we get with ionization simulation, compared to sampling from a parameterized resolution? The short answer is that one automatically includes several significant correlations:
Aside from correlations, we get some other things: