Search for memory leaks in STAR software
This page is intended to
help people solve memory-leak problems in their STAR software.
It's a summary of what I've learned by debugging StRoot/StFtpcTrackMaker
and StRoot/StFtpcClusterMaker. Therefore it doesn't cover all possible problems!
It is always a good idea to check that your code handles memory allocation
correctly. I myself was 100% sure that our software was leak free. I thought
Insure++ did not know what it was talking about... But I had to realize that we
had several severe problems (i.e. leaks) and that Insure++ kept quiet once they
were fixed. Conclusion: If Insure++ tells you something is wrong, you have a
problem! Fix it.
Tools to use
There are several (more or less) useful tools to help you
find problems in your code. To get started I suggest compiling your code
with Insure++
and running it. How to do this can be found on this page set up by Akio
(see also here). Usually
Insure++ produces lots of lines of output. Sometimes the messages you get
are so cryptic that you don't know what to do, but believe me: Insure++ is right
in (almost) every statement. Still, the only thing I learned from using Insure++
was that I had problems in my code. To track these problems down is another
story.
Getting good advice from Insure++ is difficult because of the memory
management of ROOT. TObjArray and TClonesArray in particular allocate
memory themselves, and while doing so they confuse Insure++.
What I did was to use StRoot/StarClassLibrary/StMemoryInfo. Documentation on that can be found here
(StarClassLibrary p. 71). I took one StMemoryInfo::snapshot() before and one
after every new and delete statement and then called StMemoryInfo::print().
(Look at 'total allocated space'. Only the differences are important.)
Bracketing each statement this way makes sure you count only the bytes
allocated by that specific statement.
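The bracketing described above can be sketched in plain C++. This is only an illustration under stated assumptions: MemProbe is a hypothetical stand-in for StMemoryInfo (it is not the StarClassLibrary API), Track is a made-up class, and the byte counting is done by Track's own operator new and operator delete so that the example compiles without STAR or ROOT.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <iostream>
#include <new>
#include <string>

// Hypothetical stand-in for StMemoryInfo: remembers the last two readings
// of a global byte counter and reports their difference.
struct MemProbe {
    static long liveBytes;  // bytes currently allocated by the counting class below
    long previous = 0;
    long current = 0;

    void snapshot() { previous = current; current = liveBytes; }
    long delta() const { return current - previous; }
    void print(const std::string& tag, std::ostream& out = std::cout) const {
        out << tag << " " << delta() << "\n";
    }
};
long MemProbe::liveBytes = 0;

// Made-up class whose allocations are counted (illustration only).
struct Track {
    double hits[16];

    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        MemProbe::liveBytes += static_cast<long>(size);
        return p;
    }
    static void operator delete(void* p, std::size_t size) {
        if (!p) return;
        MemProbe::liveBytes -= static_cast<long>(size);
        std::free(p);
    }
};

// Takes one snapshot before and one after a single statement and returns
// the number of bytes that statement gained (positive) or released (negative).
long bytesAround(const std::function<void()>& statement) {
    MemProbe probe;
    probe.snapshot();  // before
    statement();
    probe.snapshot();  // after
    return probe.delta();
}
```

Wrapping a statement as bytesAround([&] { t = new Track; }) should report sizeof(Track) bytes, and the matching delete the same amount with the opposite sign. A pair that does not cancel is a candidate leak.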
I know, you usually have hundreds of new and delete statements being executed,
especially if they are within loops. But I can't help it. You may start by
putting snapshots in StRoot/StChain/StMaker.cxx
before and after ret = maker->Make() (in Int_t
StMaker::Make()). (Make sure you print the name of the current maker as
well, otherwise the output of the different makers gets mixed up.) I don't think
this will help you a lot, because as soon as you write something to StAF
tables the memory isn't released before the next cycle, so you count a lot of
bytes you usually don't care about. Still, by running several events you get a
feeling for whether memory usage keeps growing and which piece of code is to
blame.
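A rough sketch of this per-maker bookkeeping, written without STAR so that it compiles on its own: Step, growthPerStep and usedBytes are hypothetical names, not StMaker code. The memory reading is passed in as a callable, on the assumption that a real probe could be plugged in at that point. Each step is bracketed by two readings, the step name is printed with every difference, and the differences are summed over several events so that steady growth stands out from per-event noise.

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one maker: a name plus the work done per event.
struct Step {
    std::string name;
    std::function<void()> make;
};

// Runs every step once per event, reads the memory usage before and after
// each step, prints the difference together with the step name, and
// returns the accumulated difference per step.
std::map<std::string, long> growthPerStep(const std::vector<Step>& steps,
                                          int nEvents,
                                          const std::function<long()>& usedBytes) {
    std::map<std::string, long> growth;
    for (int event = 0; event < nEvents; ++event) {
        for (const Step& step : steps) {
            const long before = usedBytes();
            step.make();
            const long after = usedBytes();
            growth[step.name] += after - before;
            std::cout << "event " << event << " " << step.name << " "
                      << (after - before) << "\n";
        }
    }
    return growth;
}
```

A step whose accumulated value keeps rising with the number of events is the one to instrument in more detail. A value that is large but stays constant is more likely memory that is released only in the next cycle.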
Analysis of output
Once you have prepared your code with all these
snapshots and print commands, you run it... and you end up with a lot of
output. I redirected this output stream to a file. This was necessary because
I typically produced 1.3 million lines of output for processing one event! As a
result I had to develop a small macro to handle all of it. For that purpose I
printed not only the result of StMemoryInfo::print() but also a unique line
identifying the position in the code. The macro should add up everything which
belongs together (like all occurrences of 'new Track'). This should cancel with
everything like 'delete Track'. Unfortunately this isn't so easy, because some
(most?) classes allocate memory internally as well. But that's life.
My macro boiled these 1.3 million lines down to about 10 pages of
output. Within these 10 pages I looked for entries which cancelled to
0. Again: usually they don't, but they should! Finally (after several weeks of
working on that) I was able to find every missing byte. Or let's
say it like this: I realized where my leaking bytes went!
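The summing step can be sketched as follows. This is not the original macro. The log format is an assumption made for the example: one record per line of the form "new Track 136" or "delete Track -136", i.e. operation, class name and byte difference, with all other lines ignored. The function names are hypothetical as well.

```cpp
#include <iostream>
#include <map>
#include <sstream>
#include <string>

// Sums the byte differences per class name over all "new"/"delete" records.
// Lines that do not match the assumed three-field format are skipped.
std::map<std::string, long> netBytesPerClass(std::istream& log) {
    std::map<std::string, long> net;
    std::string line;
    while (std::getline(log, line)) {
        std::istringstream fields(line);
        std::string op, className;
        long delta = 0;
        if (!(fields >> op >> className >> delta)) continue;
        if (op != "new" && op != "delete") continue;
        net[className] += delta;
    }
    return net;
}

// Prints every class whose records do not cancel to 0 and returns how many
// such classes there are.
int reportUnbalanced(const std::map<std::string, long>& net, std::ostream& out) {
    int count = 0;
    for (const auto& entry : net) {
        if (entry.second != 0) {
            out << entry.first << ": " << entry.second << " bytes not cancelled\n";
            ++count;
        }
    }
    return count;
}
```

A class that shows up in the report is not necessarily leaking, since memory allocated internally by a class also ends up in its sum. It does tell you where to look next.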
Known problems and traps
With the method explained above several things
make your search for memory leaks a difficult task. This is due to the use of
code not equipped with 'snapshots' (usually outside of your own classes). Here I
present a list of what I've found to be difficult to track down:
StMessageManager: Every string you send to StMessageManager
(StMessMgr)
allocates memory! Because the size of these strings may depend on some runtime
conditions, the amount of memory allocated may change for each run. To take this
into account put snapshot()'s around StMessageManager lines.
TClonesArray: In contrast to TObjArray, TClonesArray
allocates some additional memory to keep track of the class it is working on. This
memory looks like a leak because it isn't released before the end of the ROOT
session. (I'm not sure if this is entirely true but it isn't released inside
your maker, for sure!)
Templates: Code that uses templates produces 'leaks'
(at least they look like leaks) and confuses Insure++ as well. Take
care!
Arrays: Insure++ doesn't know what to do with arrays passed to
a function. Even if you pass the array size as well it sees just the pointer
to the first array element and thinks you are pointing (inside the function) to
some arbitrary location in memory.
StAF tables: These
are black holes for your memory! Just hope that somebody has done it right.
Once you have tracked down every byte and found every leak, you should run Insure++
again. There should be no messages left over, except probably for cases where
you pass arrays to functions or use templates.
That's it. Good luck.
Markus Oldenburg Last modified: Wed Apr 20 11:25:17 CEST 2005