Offline QA Shifts

Peter Jacobs, July 11, 2000

This document is a first try at describing procedures for the Offline QA shift crew. As you will see, there are a number of open questions concerning what should be done during this shift and how to do it, whose answers we will have only after we gain experience with real data. Please give feedback to the QA experts on what you find confusing, what could be done better, and what doesn't make any sense to you.

Scope of the Offline QA shift activities

Large scale production of real data: data that will be used for physics analysis
Large scale production of MC data: MC data that will be used for detailed physics studies and corrections for data analysis.
Nightly tests of real and MC data: limited number of events run in the DEV or NEW branches of the library. These are used to test the libraries and validate them prior to a new release and migration DEV->NEW->PRO.
Express queue of real data: a small fraction (~5%?) of real data will be channeled to an express production queue during the running of the experiment, to serve as feedback to the crews running the experiment. The results of this production should be reported as soon as possible, typically at the 5 p.m. meeting in the counting house.

The autoQA system can apply arbitrary sets of "tests" to the scalars extracted from the data by the QA macros, raising errors or warnings when these tests are failed. Which tests and what cuts to apply to real data are complex issues that can only be addressed after we gain some experience. Consequently, the automated testing aspects of the autoQA framework will be applied only at a very low level for real data for this summer's run. The decision about the quality of the data will have to be made by the shift crew, i.e. you, by looking at the data in detail.

Use of autoQA

autoQA web page

here

There have however been many changes behind the scenes. The major changes are

autoQA now interfaces to the MySQL databases. It queries the Production File Catalog for completed jobs, and writes QA information back to a QA database. The latter can be used in future in the tag DB or some other mechanism, once a reliable QA cycle is established.
autoQA can now handle the range of data classes specified in the introduction.
All QA ROOT jobs are now run on rcas under LSF. This change was necessary in anticipation of a large volume of QA processes once large scale data taking starts. This of course also introduces another layer of complexity into the QA framework, and monitoring of autoQA jobs on rcas will be part of the QA shift work.

Offline QA Shift Tasks

Which runs to examine?

Since the autoQA mechanism queries the File Catalog once an hour (for real data, less frequently for other data classes) and submits QA batch jobs on rcas, there may be a significant delay between when production is run and when the QA results become available. We will have to monitor this process and adjust the procedures as necessary. Feedback on this point from the shift crew is essential.

How to look at a run

Select "Real Data Production" from the pulldown menu in the banner.
Use the pulldown menus to compose a DB query that includes the run you are interested in. The simplest procedure at the moment is to specify the runID and leave all other fields at "any". In the near future these selections will include trigger, calibration and geometry information. Note that the default for "QA status" is "done".
Press "Display Datasets". A listing of all catalogued runs corresponding to you query will appear in the upper frame.
To examine the QA histograms, press the "QA details" button. In the lower panel, a set of links to the histogram files will appear. The format is gzipped postscript. If your browser is set up to launch ghostview for files of type "ps", these files will be automatically unzipped and displayed. Otherwise, you will have to do something more complicated, such as save the file and view it another way. Note that if the macro "bfcread_hist_to_ps" is reported to have crashed, some or all histograms may be missing.
To examine the QA scalars and tests, scroll past the histogram links in the lower panel and push the button. Tables of scalars for all the data branches will appear in the auxilliary window.
To commpare the QA scalars to similar runs, press the "Compare reports" button. Details on how to procede are found in the autoQA documentation. Note that until more refined selections are available for real data (e.g. comparing runs with idenitical trigger conditions and processing chains), this facility will be of limited utility. Note also that the planned functionality of automatically comparing to a standard reference run has not yet been implemented, for similar reasons.

What QA data to examine

The principal QA tool is the histograms, generated by bfcread_hist_to_ps. The number of QA histograms has grown enormously over the past six months and needs to be pruned back to be useful to the non-expert. This work is going on now (week of July 10) and more information will be forthcoming.

Description of all the macros run by autoQA is found here. This documentation is important for understanding the meaning of the QA scalars.

Here are some general guidelines on what to report:

Status of run - completed, if not give error status (segmentation violation etc)
Macros that crashed
Macros whose QA status is not "O.K." (At present, this means simply that there is no data in the branch that macro is trying to read. No additional tests are applied to the data.)
Anomalous histograms and scalars - this is necessarily vague at this point.

How to report results

QA hypernews forum

starqa-hn@coburn.star.bnl.gov

The autoQA framework has a "comment" facility that allows the user to annotate particular runs or to enter a "global comment" that will appear chronologically in the listing of all runs. These are displayed together with the datasets, and while not appropriate for lengthy reports, can serve as flags for specific problems and supply hyperlinks to longer reports. Note that this is not a high security system (anyone can alter or delete you messages).

You do not need the QA Expert's password to use this facility. Press the button "Add or edit comments" in the upper right part of the upper panel. You will be asked for some identifying string that will be attached to your comments. Enter you name and press return. You will have to press "Display Datasets" again, at which point a button "Add global comment" will appear below the pulldown menus, and each run listing will have an "Add comment" button. Follow the instructions. Messages are interpreted as html, so links to other pages can be introduced. One possibility is to enter the hyperlink to the QA report you have sent to starqa-hn. This can obviously be automated, but it isn't yet and doing it by hand should be straightforward.

Checking QA jobs on rcas

Peter Jacobs