STAR Computing STAR HyperNews  

Re: DB: are there issues or not? 

Forum: EMC2 Sub-Systems
Re: DB: are there issues or not? (James Dunlop)
Date: 2006, Feb 01
From: David Relyea relyea

(moved thread as per Jamie's request)

Hey Jerome - I totally misunderstood what you were getting at, but now I
get it.  My code *does* take specific snapshots at specific timestamps
throughout the database.  Those timestamps do not change.  Basically, I
use the timestamps which correspond to the status files, and they have
been final for quite a while.  Moreover:

1.  The "no" referred to the answer I was waiting for Alex to give me
about the BEMC channel swap code.  I *do* want to know whether or not the
BEMC channel swap code could affect this result, since it's entirely
possible that StEmcDbHandler is giving different channels than it was
before.  But I need his verification on that, and frankly, he committed
the code far earlier (and I recompile pretty regularly), so I'm dubious.

2.  I log my keystrokes, my files have timestamps, and I save all my ROOT
sessions.  I'm anal.  The record is:

On Jan 19, I compared my ped files to the db, and saw minor differences.

On Jan 24, I did a second comparison, prompted by Mike+Adam, since my
pedestal checker had omitted abs().  This comparison differed from the one
on the 19th in that I noticed bad pedestals in one crate for the last two
weeks of the run.  We'd seen this long ago, but because I had left out the
abs(), I hadn't caught it until Mike+Adam noticed it.  I asked Alex to
insert my pedestal files for those two weeks.
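
To make the abs() issue concrete, here is a minimal Python sketch of a
3-sigma check with and without the absolute value.  It is not the actual
checker (that is a ROOT macro); the function names and the channel values
are made up for illustration.  Without abs(), only channels where my
pedestal sits above the db value can ever be flagged.

```python
import math

def flag_signed(mine, db, err_mine, err_db, n_sigma=3.0):
    """Buggy variant: no abs(), so negative differences are never flagged."""
    combined = math.sqrt(err_mine**2 + err_db**2)
    return (mine - db) > n_sigma * combined

def flag_abs(mine, db, err_mine, err_db, n_sigma=3.0):
    """Fixed variant: flags a difference of either sign."""
    combined = math.sqrt(err_mine**2 + err_db**2)
    return abs(mine - db) > n_sigma * combined

# Hypothetical channels: (my pedestal, db pedestal, my error, db error)
channels = {
    "above": (40.0, 30.0, 1.0, 1.0),
    "below": (30.0, 40.0, 1.0, 1.0),
    "agrees": (30.5, 30.0, 1.0, 1.0),
}

# Channels the fixed check flags but the buggy check silently skips
missed = [name for name, c in channels.items()
          if flag_abs(*c) and not flag_signed(*c)]
print(missed)
```

The "below" channel is the kind of entry the original checker would have
skipped.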

At some point, he and Michael exchanged email (which I got) about deleting
those records.  They were deleted.  He then inserted the new pedestal
files.  He announced it (on the BEMC list) on Jan 26 or 27.

On Jan 27, I compared my pedestals to the db, and saw *completely*
different differences than what I'd seen three days prior.  *Not* for the
files that were just uploaded - those tables match.

First, in the one day before those tables, there are bizarre zeroes in
several channels.

Second, throughout the run, some channels whose pedestals had formerly
disagreed with my own (491) no longer disagreed.  Some still did.  And
many more new disagreements had popped up.  The smoking gun is:

(old)
2005-05-13 03:16:41     1588    6386    3055    173.358
(new)
2005-05-13 03:16:41     1588    6386    3170    165.424

The columns are timestamp, channel number, my pedestal, db pedestal, and
sqrt(my error^2 + db error^2).  The db value for this channel has changed.
Not by much, but it changed.  And its timestamp is nowhere near any that
were changed (the changes done by Alex were for times in June).
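
A comparison like this can be automated.  The following is a minimal
Python sketch, assuming both difference files use the six
whitespace-separated fields shown above (date, time, channel, my pedestal,
db pedestal, combined error).  The helper names are hypothetical and this
is not the code I actually ran.

```python
def parse_row(line):
    """Split one difference-file row into a (timestamp, channel) key and values."""
    date, time, channel, mine, db, err = line.split()
    return (f"{date} {time}", int(channel)), (float(mine), float(db), float(err))

def changed_db_values(old_lines, new_lines):
    """Return rows present in both snapshots whose db pedestal differs."""
    old = dict(parse_row(l) for l in old_lines if l.strip())
    new = dict(parse_row(l) for l in new_lines if l.strip())
    changes = []
    for key in sorted(old.keys() & new.keys()):
        old_db, new_db = old[key][1], new[key][1]
        if old_db != new_db:
            changes.append((key, old_db, new_db))
    return changes

old_lines = ["2005-05-13 03:16:41     1588    6386    3055    173.358"]
new_lines = ["2005-05-13 03:16:41     1588    6386    3170    165.424"]

changes = changed_db_values(old_lines, new_lines)
print(changes)
```

Note that a row only shows up in a difference file if it exceeded the
threshold in that run, so this only catches channels that were flagged
both times.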

So, according to the output from StEmcDbHandler, something has changed in
the db.  I have concrete evidence of this.  I don't know how the changes
occurred.  The changes are for timestamps unrelated to those which Michael
+ Alex altered.


There is a db problem.


Until that problem is solved, the spin group, under very real time
pressure, would like to use the pedestals created by the status code.

***On 1/19, there were *no* significant differences between this code and
the db, except for channel 491 (which the db had wrong) and channels near
the end of the run (which again, were wrong in the db).  These files can
be viewed, within 3 pedestal widths, as a mirror of the db pedestals as of
1/19.***

There is a second issue regarding the pedestals for 2004 released results,
being prepared for publication.  I have not checked them anywhere near as
thoroughly as I have 2005.  I'm not prepared to comment on the validity of
the 2004 db pedestals yet.

But for 2005, please figure out what happened with the db.  I'm sure
people have logged what they did, so it should be straightforward to
figure out what might have happened.

Dave


On Tue, 31 Jan 2006, Jerome LAURET wrote:

>     No = no you did not use a timestamp, right ??
>
>     Using this macro, I would not be able to tell what tables you compared
> with what. As mentioned (although I messed up the description a little),
> there WERE entries added to the DB as late as 2006-01-27 15:01:36 and
> you would agree this would explain the perceived discrepancies without
> further details ... In general, it is not a reliable method (for the
> purpose of making a statement on DB fluctuation) to do a time-line sweep
> of all db values and compare with a previous snapshot ... By nature, it
> changes in real time so the only statements you can make are (a)
> something has changed / was added to the database and (b) a relative
> statement on quality if any [QA].
>
>     You can ONLY make useful statements about reproducibility (an
> entirely different story) based on a snapshot at T0 with a new check for
> the same T=T0 timestamp.
>
>     If I missed something, let me know.
>
> David Relyea wrote:
> > Hey Jerome - really weird - I didn't see your original email.  Sorry
> > about not replying (since this is important!).
> >
> > Before anything - Alex - I just remembered that the BEMC channel swap
> > code must sit somewhere.  Is that code being used by StEmcDbHandler?
> > In other words, are there channel swaps that are messing up Marcia's
> > code?
> >
> > Let's pretend the answer is no, since I've already written this email.
> > =)
> >
> > When I originally ran tests, my code spat out differences of 3 sigma
> > between my peds and the database.  However, I didn't have an abs() in
> > my code, so I was missing a bunch of them.  Regardless, the
> > differences it spat out are in
> > /protected/spin/relyea/pedestaldifferences/2005pedDiffsPREdbChange.txt
> > This is what I've always gotten.
> >
> > And then I requested, and we properly went through the steps of,
> > Michael deleting some rows from the db, followed by insertion of new
> > pedestal files.  I got all emails - it happened smoothly.
> >
> > On Friday, I checked to make sure (just for redundancy's sake) that
> > all the new pedestal files went in properly.  I added to my code the
> > TMath::Abs() statement as well.  I got completely different results,
> > so I undid the TMath::Abs() (not hard! lol) and ran it again.
> > Identical code as before.
> >
> > The results of that check are sitting in
> > /protected/spin/relyea/pedestaldifferences/2005pedDiffsPOSTdbChange.txt
> >
> > As can be seen, they're different.  First, as for the recent change,
> > something clearly happened to tables on 2005-06-09, since a large bank of
> > channels in those files has a perfectly zero pedestal.  Second, if you
> > look at these files, look at this line:
> >
> > (old)
> > 2005-05-13 03:16:41 1588 6386 3055 173.358
> > (new)
> > 2005-05-13 03:16:41 1588 6386 3170 165.424
> >
> > The columns are timestamp, channel number, my pedestal, db pedestal,
> > and sqrt(my error^2 + db error^2).  In other words, the db value for
> > this channel has changed.  Not by much, but it changed.
> >
> > The more I look at this, the more it looks like the value of every
> > pedestal being read back from the db shifted slightly.  Makes no
> > sense, but explains a lot of the barely-three-sigma differences I now
> > see (which I did not see before).  I could be wrong - something else
> > could be the problem.  The most urgent problem is the bank of channels
> > on June 9 that are zero.  But beyond that, I'd like to know why these
> > little changes occurred.
> >
> > Please let me know if you want anything.  Just fyi, my code (which is
> > just Marcia's code modified) is in
> > /star/u/relyea/star/2005/StRoot/StEmcUtil/macros/checkDbTable.C
> >
> > Dave
> >
> > On Tue, 31 Jan 2006, Jerome LAURET wrote:
> >
> >
> >>  David,
> >>  Just asking again: did you use a timestamp and if
> >> so, which timestamp was this and what data range was used??
> >> If the "problem" is resolved, let me know too ...
> >>
> >>  Thanks,
> >>
> >> Jerome LAURET wrote:
> >>
> >>>  Question:
> >>>
> >>> -> David: Do you use a timestamp ; what was it?? What are the
> >>>     event time ranges ??
> >>> -> Alex: [see below first] why don't we have any entry
> >>>     times in 2005 or 2006 for bemcPed ??
> >>>
> >>>  Apart from that, we need to agree on a more stringent
> >>> procedure. Since I was just informed of this, here are a few
> >>> observations:
> >>>
> >>> - The latest values in the db for bemcPed have currently
> >>>    entryTime 2006-01-27 15:01:36 (a few days ago). So, a
> >>>    timestamp for analysis IS always needed.
> >>>
> >>> - A request was made last week to delete bemcPed values
> >>>    from  20050610 to 20050621 - David, you were on this
> >>>    Email. 69 rows were dropped from the db.
> >>>
> >>> - Databases are synchronized
> >>>
> >>>  I propose the following which will apply to BOTH B-EMC
> >>> and E-EMC:
> >>>
> >>> * From now on, we will NOT delete values in the db, in bulk or
> >>>    individually, but will rather de-activate them at worst.
> >>> * All de-activation requests will be duly documented
> >>> * All requests to delete entries shall come and be explained
> >>>    at global meetings (analysis or collaboration) and therefore
> >>>    self-documented.
> >>> * We will double check that there is NO WAY for ANYONE (apart
> >>>    from the db Leader) to delete ANY entries from those dbs and
> >>>    those sub-systems.
> >>> * We would rather have the used timestamps clearly indicated
> >>>    for any analysis as reference (not "I ran last week" but "I ran
> >>>    last week using timestamp XXXXX").
> >>>
> >>>  I further suggest this to be the start of a "pragmatic
> >>> database procedure" as we have code procedures ensuring stability.
> >>>
> >>>  OK?
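
The timestamp and de-activation points quoted above can be pictured with a
toy model.  This is purely illustrative Python: it does not use or
describe the actual STAR database API.  The record fields (begin_time,
entry_time, active), the selection rule and all values are assumptions
made for the sketch; only the entry time 2006-01-27 15:01:36 comes from
the thread.

```python
from dataclasses import dataclass

@dataclass
class Record:
    begin_time: str   # start of validity (event time), "YYYY-MM-DD hh:mm:ss"
    entry_time: str   # when the row was inserted into the db
    value: float
    active: bool = True

def lookup(records, event_time, cutoff):
    """Value valid at event_time, using only active rows entered by `cutoff`."""
    visible = [r for r in records
               if r.active
               and r.begin_time <= event_time
               and r.entry_time <= cutoff]
    if not visible:
        return None
    # Assumed rule: latest validity start wins, ties broken by latest entry.
    best = max(visible, key=lambda r: (r.begin_time, r.entry_time))
    return best.value

records = [
    Record("2005-06-01 00:00:00", "2005-07-01 12:00:00", 30.0),
    Record("2005-06-10 00:00:00", "2005-07-01 12:00:00", 31.0),
    Record("2005-06-10 00:00:00", "2006-01-27 15:01:36", 32.0),
]
event = "2005-06-15 00:00:00"

# Pinned to an analysis timestamp: rows entered later are invisible.
pinned = lookup(records, event, cutoff="2006-01-01 00:00:00")
# No effective pin: the newest insertion wins.
latest = lookup(records, event, cutoff="2006-02-01 00:00:00")

# De-activating (rather than deleting) the 2005-06-10 rows exposes the
# older record as the new default, and can be undone by flipping the flag.
records[1].active = False
records[2].active = False
exposed = lookup(records, event, cutoff="2006-02-01 00:00:00")
print(pinned, latest, exposed)
```

In this model, a check repeated with the same cutoff returns the same
value no matter what is inserted afterwards, which is the T=T0 comparison
described above.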



On Tue, 31 Jan 2006, James Dunlop wrote:

> Hi All,
> Renaming the thread so others go beyond the subject.
> Please respond if there are issues
> or not with reproducibility of the db.  Right now, that
> is the big deal.
>                          --Jamie
>
> >     As far as I understand, I do not see any evidence of DB issues but
> > rather agitations/irritation at this stage. My question stands in trying
> > to understand the real issues (if any). If users do not use timestamps
> > for example, replacing one db by another home-grown db will not
> > help much ...
> >
> >     Additionally, I proposed
> >
> > - a more stringent db approach where no records would ever be deleted
> >   without a prior documentation of such action (self-documenting and
> >   reversible db actions).
> >
> > - a "please stop" referencing to db without mentioning the conditions
> >   under which you have used it (timestamp). The information is NOT
> >   useful as the db entries BY NATURE, will evolve.
> >
> > Joanna Kiryluk wrote:
> > > Hi Alex, Jerome,
> > >
> > > At today's jet meeting it was reported that there are still "problems with
> > > DB", and a decision was made that people involved in jet analysis
> > > (2005 and 2004) do not use public DB, but a local directory instead,
> > > see agenda item 1d and action item 1 and 2 (bottom of the page):
> > > http://www.star.bnl.gov/protected/spin/mmiller/JWGmeetings/jet_01312006.htm
> > >
> > > Mike promised to summarize DB "issues" (I don't know what current problems
> > > are, and if they are only 2005 related, since I have not run my
> > > 2004 analysis code recently; what I heard was that "results are
> > > not reproducible"), which if I understand Alex's reply correctly,
> > > Alex is not aware of?
> > >
> > > The question is do we still have db problems?  I think this is what Jerome
> > > is trying to find out ...
> > >
> > > Thanks, Joanna
> > >
> > > On Tue, 31 Jan 2006, Jerome LAURET wrote:
> > >
> > >
> > >>     Hello Alex,
> > >>
> > > >>     There is then a pending issue with bulk table insertion: we know that
> > > >> one-by-one insertions do not lead to corruption as it stands. Whatever the
> > > >> cause (TBF), this is not a show-stopper.
> > > >>
> > > >>     The initial post was however one of those "I had a result last week
> > > >> and now it is different". This can come from three sources (independent
> > > >> of the first issue):
> > > >> - records were added and the analysis was NOT using a timestamp
> > > >> - records were deleted, exposing values becoming new defaults
> > > >> - misinterpretation all the way and offset between expectations and
> > > >>   realities
> > > >>
> > > >>     The first one is a question for Dave. The second prompts the
> > > >> proposal of never deleting records but rather de-activating them (keep
> > > >> trace of things nicely). We can delete later as proposed (public
> > > >> information / wide audience ; no longer any issues with speculations
> > > >> and time waste).  I cannot do anything for the third bullet apart from
> > > >> more documentation.
> > >>
> > >>     Would the general emotionally re-assuring plan for db use be
> > >> fine by the B/E-EMC software coordinators?
> > >>
> > >> Alexandre Suaide wrote:
> > >>
> > >>> Hi Jerome
> > >>>
> > >>> I agree that the DB should be more stable since the first table is
> > >>> saved in there. The problems we were having in the last weeks
> > >>> are mostly due to corrupted tables because we were saving them
> > >>> in batch.
> > >>>
> > >>> In the future we do not intend to save/delete many tables as we did
> > >>> recently. Of course one/two tables can be bad because the data that
> > >>> were used to create them are bad and in this case they need to be
> > >>> deactivated. I just see no reason for keeping tables that have
> > >>> corruption in DB and nobody had used them before.
> > >>>
> > >>> About the calibration issue we are discussing in the last few days we
> > >>> finally found that everything is ok. We were just seeing anti-protons
> > >>> that, because Nature plays with us, give a very nice electron-like
> > >>> peak in the wrong place :(
> > >>>
> > >>> Regards
> > >>>
> > >>> Alex
> > >>>
> > >>> Jerome LAURET wrote:
> > >>>
> > >>>
> > >>>
> > >>>> Question:
> > >>>>
> > >>>> -> David: Do you use a timestamp ; what was it?? What are the
> > >>>>    event time ranges ??
> > >>>> -> Alex: [see below first] why don't we have any entry
> > >>>>    times in 2005 or 2006 for bemcPed ??
> > >>>>
> > >>>> Apart from that, we need to agree on a more stringent
> > >>>> procedure. Since I was just informed of this, here are a few
> > >>>> observations:
> > >>>>
> > >>>> - The latest values in the db for bemcPed have currently
> > >>>>   entryTime 2006-01-27 15:01:36 (a few days ago). So, a
> > >>>>   timestamp for analysis IS always needed.
> > >>>>
> > >>>> - A request was made last week to delete bemcPed values
> > >>>>   from  20050610 to 20050621 - David, you were on this
> > >>>>   Email. 69 rows were dropped from the db.
> > >>>>
> > >>>> - Databases are synchronized
> > >>>>
> > >>>> I propose the following which will apply to BOTH B-EMC
> > >>>> and E-EMC:
> > >>>>
> > >>>> * From now on, we will NOT delete values in the db, in bulk or
> > >>>>   individually, but will rather de-activate them at worst.
> > >>>> * All de-activation requests will be duly documented
> > >>>> * All requests to delete entries shall come and be explained
> > >>>>   at global meetings (analysis or collaboration) and therefore
> > >>>>   self-documented.
> > >>>> * We will double check that there is NO WAY for ANYONE (apart
> > >>>>   from the db Leader) to delete ANY entries from those dbs and
> > >>>>   those sub-systems.
> > >>>> * We would rather have the used timestamps clearly indicated
> > >>>>   for any analysis as reference (not "I ran last week" but "I ran
> > >>>>   last week using timestamp XXXXX").
> > >>>>
> > >>>> I further suggest this to be the start of a "pragmatic
> > >>>> database procedure" as we have code procedures ensuring stability.
> > >>>>
> > >>>> OK?
> > >>>>
> > >>>> David Relyea wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>> Hey - I've sent email to Alex separately - I'm seeing very, very weird
> > >>>>> results in the 2005 bemc pedestal table right now.  I'm basically getting
> > >>>>> values back which were not present in my prior tests, and my code hasn't
> > >>>>> changed.
> > >>>>>
> > >>>>> When did you do this analysis?  If it's several days old, I can attest
> > >>>>> that the pedestals were fine then.  If it was Thursday, I'm wondering if
> > >>>>> you were hit by the same thing I'm seeing.
> > >>>>>
> > >>>>> Alex - please look at my email.  I want to know if something has changed,
> > >>>>> since... well, something has changed!
> > >>>>>
> > >>>>> Dave
> > >>>>>
> > >>>>> On Fri, 27 Jan 2006, Mauro Cosentino wrote:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>> Hello All,
> > >>>>>> (I'm sending this to both lists, HF and EMC-2)
> > >>>>>>
> > >>>>>> yesterday I accomplished a small analysis on the issue of the strange
> > >>>>>> behavior of the p/E distribution for "pp200/2005 trgsetupname=Jpsi" data
> > >>>>>> set.
> > >>>>>> First of all, I "split" the BEMC into East and West (modules [1-60],
> > >>>>>> [61-120]), and I've got this
> > >>>>>>
> > >>>>>> http://www.dfn.if.usp.br/~mcosent/doutor/star/JPsi/PoE_East.gif
> > >>>>>> http://www.dfn.if.usp.br/~mcosent/doutor/star/JPsi/PoE_West.gif
> > >>>>>>
> > >>>>>> As it can be seen, the 2 peak pattern remains!  The following step then
> > >>>>>> was to check out a point that some defended that the peak ~0.5 should be
> > >>>>>> the maximum of the hadron p/E distribution. So I made a p/E plot for
> > >>>>>> hadron over the same data set, and scaled by the right tail of the
> > >>>>>> electron p/E. Drawing them together we have this
> > >>>>>>
> > >>>>>> http://www.dfn.if.usp.br/~mcosent/doutor/star/JPsi/pOverE_plushadr.gif
> > >>>>>>
> > >>>>>> it's easy to see that the maximum of the hadron p/E distribution is
> > >>>>>> above 1, so the point stated above does not seem to be consistent!
> > >>>>>> To check possible different calibration gains I made two 2D plots, one
> > >>>>>> with p/E vs. eta and another with p/E vs. phi. The results are
> > >>>>>>
> > >>>>>> http://www.dfn.if.usp.br/~mcosent/doutor/star/JPsi/PoE_x_eta.gif
> > >>>>>> http://www.dfn.if.usp.br/~mcosent/doutor/star/JPsi/PoE_x_phi.gif
> > >>>>>>
> > >>>>>> It is possible to see two different groups of maxima, one around 1.0 and
> > >>>>>> the other one close to 0.5. I think this confirms the hypothesis of 2
> > >>>>>> different calibrations and helps us to find the calibration for each tower.
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Mauro
> > >>>>>>
> > >>>>>>
> > >>>>>> -------------------------------------------------------------
> > >>>>>> Visit this STAR HyperNews message (to reply or unsubscribe) at:
> > >>>>>> http://www.star.bnl.gov/HyperNews-star/get/emc2/1916.html
> > >> --
> > >>
> > >>              ,,,,,
> > >>             ( o o )
> > >>          --m---U---m--
> > >>              Jerome
> > >>
> > --
> >
> >              ,,,,,
> >             ( o o )
> >          --m---U---m--
> >              Jerome
> --
> James C Dunlop                            dunlop@bnl.gov
> Room 1-168, Bldg. 510                     Ph: (631)344-7781
> Brookhaven National Laboratory            Fax: (631)344-4206
> Upton, NY 11973                           Mobile: (631)834-7782

1. Re: DB: are there issues or not? by Adam Kocoloski, 2006, Feb 01
(_ Re: DB: are there issues or not? by Alexandre A. P. Suaide, 2006, Feb 01
(_ Re: DB: are there issues or not? by Jerome Lauret, 2006, Feb 01
(_ Re: DB: are there issues or not? by David Relyea, 2006, Feb 01
