The RooUnfold package is now documented and maintained on GitLab.
Please refer to https://gitlab.cern.ch/RooUnfold/RooUnfold and update your bookmarks.
This web page refers to old versions of the software and is kept for archival interest only.
Please let me know if you use this software. This will further encourage me to continue working on it, and I will let you know about any future updates.
See this overview of RooUnfold or the references below for more information.
To cite the RooUnfold package in a publication, you can refer to this web page and/or the paper:
Tim Adye, in Proceedings of the PHYSTAT 2011 Workshop on Statistical Issues Related to Discovery Claims in Search Experiments and Unfolding, CERN, Geneva, Switzerland, 17–20 January 2011, edited by H.B. Prosper and L. Lyons, CERN–2011–006, pp. 313–318.
The unfolding procedure reconstructs the true distribution T_j from the measured distribution M_i, taking into account the measurement uncertainties due to statistical fluctuations in the finite measured sample (without these uncertainties, the problem could be solved uniquely by inverting the response matrix). RooUnfold provides several algorithms for solving this problem.
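As a minimal illustration of the remark about matrix inversion, the sketch below folds a two-bin truth distribution through a response matrix and recovers it by inverting that matrix. It is plain C++ and does not use RooUnfold; the names fold and invert2x2 are hypothetical and chosen only for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// Fold a truth distribution through the response matrix:
// measured[i] = sum_j R[i][j] * truth[j], where R[i][j] is the probability
// that an event from truth bin j is measured in bin i.
Vec2 fold(const Mat2& R, const Vec2& truth) {
    Vec2 measured{0.0, 0.0};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            measured[i] += R[i][j] * truth[j];
    return measured;
}

// Invert a 2x2 matrix analytically (the determinant must be non-zero).
Mat2 invert2x2(const Mat2& R) {
    const double det = R[0][0] * R[1][1] - R[0][1] * R[1][0];
    assert(std::fabs(det) > 1e-12);
    Mat2 inv;
    inv[0][0] =  R[1][1] / det;  inv[0][1] = -R[0][1] / det;
    inv[1][0] = -R[1][0] / det;  inv[1][1] =  R[0][0] / det;
    return inv;
}
```

Applying fold(invert2x2(R), measured) to a measured distribution that has no fluctuations returns the truth that produced it. The inverse has negative off-diagonal elements, so fluctuations in a real measured sample are amplified rather than averaged out.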
The iterative and SVD unfolding algorithms require a regularisation parameter to prevent the statistical fluctuations being interpreted as structure in the true distribution. It is therefore necessary to optimise this parameter for the number of bins and sample size, using Monte Carlo samples of the same size as the data. These samples can also be used to measure the effectiveness of the unfolding and hence provide estimates of the systematic errors that result from the procedure (testing).
Note that for this last step (in particular), it is important to use Monte Carlo samples with truth distributions that are statistically and systematically independent of the sample used in training (such samples would anyway be used in a systematics analysis, eg. using a different generator, or reweighting variations within the a-priori uncertainties in the truth distribution). After all, if the Monte Carlo were a perfect model of the data, we could use the Monte Carlo truth information directly and dispense with unfolding altogether!
The bin-by-bin method assumes no migration of events between bins (eg. resolution is much smaller than the bin size and no systematic shifts). This is of course trivial to implement without resorting to the RooUnfold machinery, but is included in the package to allow simple comparison with the other methods.
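To show how little machinery the bin-by-bin method needs, here is a minimal stand-alone sketch. It assumes the correction factor for each bin is the ratio of training truth to training measured events; it is not necessarily what RooUnfoldBinByBin does internally (it ignores fakes and errors, for instance), and the function name binByBin is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bin-by-bin correction: scale each measured bin by the ratio of training
// truth to training measured events in the same bin. Bins with no training
// measured events are left at zero, since no factor can be derived for them.
std::vector<double> binByBin(const std::vector<double>& trainTruth,
                             const std::vector<double>& trainMeasured,
                             const std::vector<double>& dataMeasured) {
    assert(trainTruth.size() == trainMeasured.size());
    assert(trainTruth.size() == dataMeasured.size());
    std::vector<double> unfolded(dataMeasured.size(), 0.0);
    for (std::size_t i = 0; i < dataMeasured.size(); ++i) {
        if (trainMeasured[i] > 0.0)
            unfolded[i] = dataMeasured[i] * trainTruth[i] / trainMeasured[i];
    }
    return unfolded;
}
```

Because each bin is corrected independently, any migration between bins in the data that differs from the training sample goes uncorrected.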
To use RooUnfold, we must first supply the response matrix object RooUnfoldResponse. It can be constructed like this:
    RooUnfoldResponse response (nbins, x_lo, x_hi);
or, if different truth and measured binning is required,
    RooUnfoldResponse response (nbins_measured, x_lo_measured, x_hi_measured, nbins_true, x_lo_true, x_hi_true);
or, if multi-dimensional or non-uniform binning is required,
    RooUnfoldResponse response (hist_measured, hist_truth);
In that last case, hist_measured and hist_truth are used to specify the dimensions of the distributions (the histogram contents are not used here), eg. for 2D or 3D distributions or non-uniform binning.
This RooUnfoldResponse object is often most easily filled by looping over the training sample and calling response.Fill(x_measured,x_true) or, for events that were not measured due to detection inefficiency, response.Miss(x_true)
    if (measurement_ok) response.Fill (x_measured, x_true);
    else                response.Miss (x_true);
Alternatively, the response matrix can be constructed from a pre-existing TH2D 2-dimensional histogram (with truth and measured distribution TH1D histograms for normalisation).
This response object can be passed directly to the unfolding object, or written to a ROOT file for use at a later stage (search for examples/RooUnfoldTest.cxx's stage parameter for an example of how to do this).
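The Fill/Miss bookkeeping described above can be pictured with a toy stand-in written in plain C++. This is only a sketch: ToyResponse is a hypothetical class, not RooUnfoldResponse, and it takes bin indices directly instead of looking up axis values in a histogram.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a response object (hypothetical, not RooUnfoldResponse):
// bins are plain indices rather than histogram axis values.
class ToyResponse {
public:
    ToyResponse(std::size_t nMeasured, std::size_t nTruth)
        : counts_(nMeasured, std::vector<double>(nTruth, 0.0)),
          truth_(nTruth, 0.0) {
        assert(nMeasured > 0 && nTruth > 0);
    }

    // Event generated in truth bin j and reconstructed in measured bin i.
    void Fill(std::size_t i, std::size_t j) {
        counts_.at(i).at(j) += 1.0;
        truth_.at(j) += 1.0;
    }

    // Event generated in truth bin j but lost to detection inefficiency.
    void Miss(std::size_t j) { truth_.at(j) += 1.0; }

    // P(measured in bin i | generated in truth bin j); 0 if bin j is empty.
    double Probability(std::size_t i, std::size_t j) const {
        const double n = truth_.at(j);
        return n > 0.0 ? counts_.at(i).at(j) / n : 0.0;
    }

    // Fraction of events from truth bin j that were measured at all.
    double Efficiency(std::size_t j) const {
        double sum = 0.0;
        for (std::size_t i = 0; i < counts_.size(); ++i)
            sum += Probability(i, j);
        return sum;
    }

private:
    std::vector<std::vector<double>> counts_;  // [measured bin][truth bin]
    std::vector<double> truth_;                // all generated events per truth bin
};
```

Counting missed events in the truth total but not in the matrix is what lets the efficiency of each truth bin be read off as a column sum.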
To do the unfolding (either to try different regularisation parameters, for testing, or for real data), create a RooUnfold object and pass it the test / measured distribution (as a histogram) and the response object.
    RooUnfoldBayes unfold (&response, hist_measured, iterations);
or
    RooUnfoldSvd unfold (&response, hist_measured, kterm);
or
    RooUnfoldBinByBin unfold (&response, hist_measured);
hist_measured is a pointer to a TH1D (or TH2D for the 2D case) histogram of the measured distribution (it should have the same binning as the response matrix).
The classes RooUnfoldBayes, RooUnfoldSvd, and RooUnfoldBinByBin all inherit from RooUnfold and implement the different algorithms. The integer iterations (for RooUnfoldBayes) or kterm (RooUnfoldSvd) is the regularisation parameter. (Note that RooUnfoldSvd's kterm parameter is also known as tau in the code. That usage is incompatible with the literature, so we adopt k here.)
The reconstructed truth distribution (with errors) can be obtained with the Hreco() method.
    TH1D* hist_reco= (TH1D*) unfold.Hreco();
The result can also be obtained as a TVectorD with full TMatrixD covariance matrix.
Multi-dimensional distributions can also be unfolded, though this does not work for the SVD method, and the interface is rather clumsy (we hope to improve this).
See the class documentation for details of the RooUnfold and RooUnfoldResponse public methods.
A very simple example of RooUnfold's use is given in examples/RooUnfoldExample.cxx. More complete tests, using different toy MC distributions, are in examples/RooUnfoldTest.cxx and examples/RooUnfoldTest2D.cxx.
For RooUnfoldBayes, the regularisation parameter specifies the number of iterations, starting with the training sample truth (iterations=0). You should choose a small integer greater than 0 (we use 4 in the examples). Since only a few iterations are needed, a reasonable performance can usually be obtained without fine-tuning the parameter.
The optimal regularisation parameter can be selected by finding the largest value up to which the errors remain reasonable (ie. do not become much larger than previous values). This will give the smallest systematic errors (reconstructed distribution least biased by the training truth), without too-large statistical errors. Since the statistical errors grow quite rapidly beyond this point, but the systematic bias changes quite slowly below it, it can be prudent to reduce the regularisation parameter a little below this optimal point.
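The role of the iteration count can be seen in a stand-alone sketch of iterative Bayesian unfolding in the style of D'Agostini. This is an illustration under simplifying assumptions (no error propagation, no smoothing, no fakes), not the RooUnfoldBayes implementation; the function name iterativeBayes and its signature are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // R[i][j] = P(measured bin i | truth bin j)

// Iterative Bayesian unfolding (sketch only). `prior` plays the role of the
// training truth; with iterations == 0 it is returned unchanged.
Vec iterativeBayes(const Mat& R, const Vec& measured, Vec prior, int iterations) {
    const std::size_t nMeas = R.size();
    const std::size_t nTruth = prior.size();
    assert(measured.size() == nMeas);

    // Efficiency of each truth bin: eff[j] = sum_i R[i][j].
    Vec eff(nTruth, 0.0);
    for (std::size_t i = 0; i < nMeas; ++i) {
        assert(R[i].size() == nTruth);
        for (std::size_t j = 0; j < nTruth; ++j) eff[j] += R[i][j];
    }

    Vec estimate = prior;
    for (int it = 0; it < iterations; ++it) {
        Vec next(nTruth, 0.0);
        for (std::size_t i = 0; i < nMeas; ++i) {
            // Expected content of measured bin i under the current estimate.
            double expected = 0.0;
            for (std::size_t j = 0; j < nTruth; ++j) expected += R[i][j] * estimate[j];
            if (expected <= 0.0) continue;
            // Bayes' theorem shares measured[i] among the truth bins,
            // and dividing by eff[j] corrects for undetected events.
            for (std::size_t j = 0; j < nTruth; ++j) {
                if (eff[j] > 0.0)
                    next[j] += measured[i] * R[i][j] * estimate[j] / (expected * eff[j]);
            }
        }
        estimate = next;  // the result becomes the prior for the next iteration
    }
    return estimate;
}
```

Each iteration moves the estimate away from the prior and towards what the data prefer. With fluctuating data, later iterations also pull in the fluctuations, which is why a small number of iterations is recommended above.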
For RooUnfoldSvd, the unfolding is something like a Fourier expansion in "result to be obtained" vs "MC truth input". Low frequencies are assumed to be systematic differences between the training MC and the data, which should be retained in the output. High frequencies are assumed to arise from statistical fluctuations in data and unfortunately get numerically enhanced without proper regularisation. Choosing the regularisation parameter, k (kterm), effectively determines up to which frequencies the terms in the expansion are kept. (Actually, this is not quite true: we don't use a hard cut-off but a smooth one.)
The correct choice of k is of particular importance for the SVD method. A too-small value will bias the unfolding result towards the MC truth input; a too-large value will give a result that is dominated by unphysically enhanced statistical fluctuations. This needs to be tuned for any given distribution, number of bins, and approximate sample size, with k between 2 and the number of bins. (Using k=1 means you get only the training truth input as result without any corrections. You basically regularise away any differences, and only keep the leading term which is, by construction, the MC truth input.)
Höcker and Kartvelishvili's paper (section 7) describes how to choose the optimum value for k.
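The "smooth cut-off" can be sketched as a set of damping factors applied to the terms of the expansion. The sketch assumes the Tikhonov-style filter from Höcker and Kartvelishvili's paper, f_i = s_i^2 / (s_i^2 + tau) with tau = s_k^2, where s_i are the singular values in decreasing order. That RooUnfoldSvd applies exactly this form is an assumption here, and the function name dampingFactors is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Smooth damping factors f_i = s_i^2 / (s_i^2 + tau), with tau = s_k^2.
// `singular` holds the singular values in decreasing order; k is 1-based.
std::vector<double> dampingFactors(const std::vector<double>& singular, std::size_t k) {
    assert(k >= 1 && k <= singular.size());
    const double tau = singular[k - 1] * singular[k - 1];
    std::vector<double> f(singular.size(), 0.0);
    for (std::size_t i = 0; i < singular.size(); ++i) {
        const double s2 = singular[i] * singular[i];
        if (s2 + tau > 0.0) f[i] = s2 / (s2 + tau);
    }
    return f;
}
```

Under this assumption, terms with singular values well above s_k are kept almost in full, the k-th term is halved, and terms well below s_k are strongly suppressed rather than dropped outright.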
The $ROOTSYS environment variable should point to the top-level ROOT directory, $ROOTSYS/bin should be in your $PATH, and $ROOTSYS/lib should be in your library path ($LD_LIBRARY_PATH on most Unix systems). In recent versions of ROOT (from 5.18), this can be most easily achieved using ROOT's thisroot.(c)sh setup script. Eg. to use the CERN AFS version 5.28/00a on Scientific Linux 4/5 (x32) from a Bourne-type shell:
    shell> . /afs/cern.ch/sw/lcg/app/releases/ROOT/5.28.00a/slc4_ia32_gcc34/root/bin/thisroot.sh
For further details, consult the ROOT "Getting Started" documentation, or your local system administrator.
Download RooUnfold-1.1.1.tar.gz (or other versions here) and unpack
    shell> tar zxf RooUnfold-1.1.1.tar.gz
    shell> cd RooUnfold-1.1.1
Use GNU make. Just type
    shell> make
to build the RooUnfold shared library.
If using ROOT (CINT or ACLiC), the library can be loaded automatically when a RooUnfold class is first used. This only works if your current directory is the RooUnfold top-level directory (containing libRooUnfold.so) or that directory has been added to your dynamic path. Otherwise, you can load the library with gSystem->Load(), eg.
    root [0] gSystem->Load("/home/RooUnfold-1.1.1/libRooUnfold");
    root [1] RooUnfoldResponse response(10,-1,1);
To use ACLiC, you also need to add the RooUnfold headers to the include path. This can be done with .include src from the ROOT command prompt, eg.
    root [0] .include src
    root [1] .x MyCode.cxx+
To build stand-alone, you need to point the compiler at the RooUnfold headers in the src subdirectory and link with the -lRooUnfold library. Alternatively, you can use make (with RooUnfold's GNUmakefile) to compile and link your own code, eg.
    make MyProgram
will compile MyProgram.cxx and link with RooUnfold.
examples/RooUnfoldExample.cxx makes a simple test of RooUnfold.
    shell> root
    root [0] .x examples/RooUnfoldExample.cxx
More involved tests, allowing different toy MC PDFs to be used for training and testing, can be found in examples/RooUnfoldTest.cxx (which uses test class RooUnfoldTestHarness). To run RooUnfoldTest from within ROOT:
    root [1] .x examples/RooUnfoldTest.cxx
The example programs can also be run from the shell command line.
    shell> make bin
    shell> ./RooUnfoldTest
and the output appears in RooUnfoldTest.ps.
You can specify parameters for RooUnfoldTest (either as an argument to the routine, or as parameters to the program), eg.
    root [2] RooUnfoldTest("method=2 ftestx=3")
or
    shell> ./RooUnfoldTest method=2 ftestx=3
Use RooUnfoldTest -h or RooUnfoldTest("-h") to list all the parameters and their defaults.
method specifies the unfolding algorithm to use:
0 no unfolding (output copied from measured input)
1 Bayes
2 SVD
3 bin-by-bin
4 TUnfold
5 matrix inversion
ftrainx and ftestx specify training and test PDFs:
0 flat distribution
1 top-hat distribution over mean ± 3 x width
2 Gaussian
3 Double-sided exponential decay
4 Breit-Wigner
5 Double Breit-Wigner, peaking at mean-width and mean+width
6 exponential
7 Gaussian resonance on an exponential background
The centre and width scale of these signal PDFs can be specified with the mtrainx and wtrainx (and mtestx and wtestx) parameters respectively. A flat background of fraction btrainx (and btestx) is added.
Detector effects are modelled with a variable shift (xbias in the centre, dropping to 0 near the edges), a smearing of xsmear bins, as well as a variable efficiency (between effxlo at xlo and effxhi at xhi).
For 2D and 3D examples look at RooUnfoldTest2D and RooUnfoldTest3D.
examples/RooUnfoldTest.cxx, examples/RooUnfoldTest2D.cxx, and examples/RooUnfoldTest3D.cxx use RooFit to generate the toy distributions. (RooFit is not required to use the RooUnfold classes from another program, eg. examples/RooUnfoldExample.cxx). Hand-coded alternatives are provided if ROOT was not built with RooFit enabled (eg. --enable-roofit not specified). This version generates peaked signal events over their full range, so may have fewer events within the range than requested. To disable the use of RooFit, #define NOROOFIT before loading RooUnfoldTest*.cxx
    root [0] #define NOROOFIT 1
    root [1] .x examples/RooUnfoldTest.cxx
For the stand-alone case, use
    shell> make bin NOROOFIT=1
to build (this is the default if RooFit is not available).
RooUnfoldSvd does not work correctly for multi-dimensional distributions (gives a warning).
The UseOverflow() setting only works for 1D histograms.
The principal RooUnfold developer and contact is Tim Adye (RAL, T.J.Adye@rl.ac.uk).
The TUnfold interface, matrix inversion, and bin-by-bin algorithms as well as the error and parameter analysis frameworks were written by Richard Claridge (RAL).
The SVD algorithm was written by Kerstin Tackmann (LBNL) for the unfolding of the hadronic mass spectrum in B→Xulν.
An initial implementation of the iterative Bayesian algorithm was written by Fergus Wilson (RAL).
The latest development version of RooUnfold is maintained in the project's Subversion code repository, and can be viewed here (WebSVN, or viewvc). It can be checked out using:
shell> svn co https://svnsrv.desy.de/public/unfolding/RooUnfold/trunk RooUnfold
11 October 2012: Add formal citation to PHYSTAT 2011 paper.
17 November 2011: Add Loading RooUnfold section to detail how to load the RooUnfold libraries.
10 October 2011: update to version 1.1.1:
RooUnfoldResponse(measured,truth,response) constructor and Setup method to take 0 or empty histogram for measured and/or truth in the case where there are no fakes and/or inefficiencies. With 0, a 1D histogram is assumed.
RooUnfoldResponse::Vfakes. This fixes a crash reported by Katharina Bierwagen.
30 September 2011: update to version 1.1.0:
RooUnfold::SetMeasured can also be passed vectors for measured distribution and errors, or a matrix for the covariance matrix. SetMeasuredCov just sets the covariance matrix. GetMeasuredCov retrieves it.
bincorr=-0.5 gives an anti-correlation of 0.5 between neighbouring bins, a correlation of 0.25 between next-to-neighbours, etc. This input correlation matrix is also plotted for comparison with the unfolded output correlation matrix.
Fake(xmeas) to fill, or else add fakes to measured input histogram.
RooUnfoldTest fakexlo=0.1 fakexhi=0.2 (and similarly for y and z for 2D/3D) for linearly varying background level between 0.1 at xlo and 0.2 at xhi as a fraction of the number of measured events.
RooUnfoldBayes::UnfoldingMatrix accessor.
SetMeasuredCov or in the histogram errors) instead of assumed multinomial errors based on the number of measured events in each bin (as given by D'Agostini). If the user didn't specify any errors, then ROOT's default assumption of √N is used for each bin, which is similar to (and perhaps more appropriate than) a multinomial distribution if there are more than a few bins.
libRooUnfold.rootmap, which can be used to load RooUnfold in PyROOT.
Added examples/RooUnfoldExample.py, equivalent to RooUnfoldExample.cxx. If ROOT was built with PyROOT support and the correct version of Python is used, examples/RooUnfoldExample.py can be executed as-is.
"make help" target.
RooUnfold::RunToy() returns new RooUnfold object. RunToy() replaces old Runtoy() method, which returned a histogram. This also fixes a memory leak reported by Seth Zenz.
RooUnfold::New when name but no title is specified.
RooUnfold::PrintTable for when truth histogram isn't specified.
RooUnfoldTest.root (previously just response object was written there).
RooUnfoldTest ploterrors=1 no longer calculates χ2 distribution. This allows the comparison of the errors to be run alone, which can be very much faster. Use ploterrors=2 to include the χ2 plot.
verbose=2.
6 May 2011: reference RooUnfold write-up from PHYSTAT 2011.
9 February 2011: reference the PHYSTAT 2011 workshop on unfolding and deconvolution.
14 January 2011: update to version 1.0.3:
bayes.for for comparison with our RooUnfoldBayes (they implement the same algorithm). RooUnfoldDagostini is not normally compiled, but will be if bayes.for and bayes_c.for (download) are copied into the src directory.
RooUnfoldResponse::Add method, suggested by Seth Zenz.
13 September 2010: update to version 1.0.2:
RooUnfold::ErrorTreatment enum: no error treatment (RooUnfold::kNoError), use bin-by-bin errors (RooUnfold::kErrors) or full covariance matrix (RooUnfold::kCovariance) propagated through the unfolding, or covariance matrix from the variation of the results in toy MC tests (RooUnfold::kCovToy). This last method should be more accurate, especially for RooUnfoldBayes.
RooUnfoldTUnfold is a simple (though not yet fully-featured) interface to ROOT's TUnfold class (requires ROOT 5.22 and above).
RooUnfoldInvert performs a simple inversion of the response matrix.
RooUnfoldBinByBin code (no longer uses RooUnfoldBayesImpl).
RooUnfoldResponse::UseOverflow(). This currently only works for 1D histograms.
RooUnfold::New(), creates unfolding object based on the RooUnfold::Algorithm enum (RooUnfold::kNone, RooUnfold::kBayes, RooUnfold::kSVD, RooUnfold::kBinByBin, RooUnfold::kTUnfold, or RooUnfold::kInvert).
Clone() method for RooUnfold and its subclasses.
Chi2(hTrue) calculates the χ2 of the unfolded results with respect to a true distribution, Vmeasured() and Emeasured() return the measured distribution and its errors as vectors, ErecoV() returns unfolding errors as a vector, SetResponse() and SetMeasured() allow the unfolding inputs to be changed separately, SetRegParm() and GetRegParm() provide a common method to access the regularisation parameter, and SetNToys() and NToys() access the number of MC tests used in error calculation with the RooUnfold::kCovToy setting.
Impl() methods return the unfolding implementation object for some algorithms.
RooUnfold::PrintTable() also shows the error, residual, and pull for each bin.
wpaper and hpaper (plot paper width and height), draw=0 (disables histogram drawing), ploterrors=1 (error analysis using RooUnfoldErrors), and plotparms=1 (regularisation parameter analysis using RooUnfoldParms), overflow (1=unfolding uses overflows, 2=show under/overflow bins on test histograms).
30 July 2010: use thisroot.sh in ROOT setup example. Update RooUnfoldTest help files.
3 June 2010: reference the Alliance Workshop on Unfolding and Data Correction.
20 May 2010: update to version 0.2.2:
kterm and ntoys correctly (broken in 0.2.1).
RooUnfold::SetVerbose(level): 0=warnings, 1=verbose (default, as before), 2=debug, 3=detailed.
19 May 2010: update to version 0.2.1:
Setup and Clear) and to perform unfolding on demand, rather than on construction.
    root [0] gSystem->Load("libRooUnfold")
    root [1] .include src
    root [2] .L examples/RooUnfoldTest.cxx+
    root [3] RooUnfoldTest()
22nd January 2010: update to version 0.2.0:
RooUnfoldBayesImpl::train should use data input rather than MC input for n(Ej). This is _nEstj[] rather than _nEj[]. The upshot of this bug was that only the final iteration of the Bayesian unfolding had any effect on the result (though a single iteration still goes some way!). Problem reported by Jan Kapitan.
RooUnfoldBayesImpl. Client no longer uses train() or trainBinByBin() directly. This is now done in unfold() or unfoldBinByBin() (which specify the unfolding algorithm parameters, iterations and smoothit), since the training now requires the unfolding input. Of course users of RooUnfoldBayes and RooUnfoldBinByBin won't see any difference, since they wrap RooUnfoldBayesImpl.
RooUnfoldBayesImpl::train.
RooUnfoldBayesImpl::train should normalise _nCi to 1 for the initial P0(C) rather than Nobs.
RooUnfoldBayes: fix if there are fewer measured than truth bins.
RooUnfoldBayesImpl::getCovariance option doUnfoldSystematic to enable systematic calculation. It remains disabled by default: I'm not sure it is correct, it is very slow, and the effect should be small with good MC statistics.
RooUnfoldSvd may not work very well for multi-dimensional distributions, so print warning if it's tried.
RooUnfold::PrintTable improvements: Don't show residuals and pulls for "empty" bins (both content=0 and error=0) and don't include them in the χ2. Fix bin numbering (no under/overflow, which aren't included in the unfolding) and show 2D/3D bin numbers. Also fixed for different number of measured and truth bins. Print test truth (hTrue), which is now optional.
RooUnfoldBayesImpl::train: remove redundant TStopwatch timer.
examples/RooUnfoldTest, RooUnfoldTest2D, and new RooUnfoldTest3D. They now use test harness classes, RooUnfoldTestHarness, RooUnfoldTestHarness2D, and RooUnfoldTestHarness3D respectively. Test parameters can be specified on the command-line (or ROOT prompt): use RooUnfoldTest -h or RooUnfoldTest("-h") for details. New PDFs, which now include a constant background by default. Improved plots.
make html target.
14th October 2009: update to version 0.1.9:
RooUnfold::PrintTable prints a table of the results and χ2/DF. This is called from RooUnfoldTest.
RooUnfold derived classes. Don't repeat RooUnfold::Setup when constructing derived classes.
kterm is negative or greater than the number of bins.
RooUnfoldResponse::Fill() and Miss(). Defaults to 1.0, so no change if not specified. Support for variable binning in 1D case as suggested by Seth Zenz.
RooUnfoldResponse::ApplyToTruth based on idea and code from Seth Zenz.
RooUnfoldBayesImpl does not call getCovariance after unfolding (now done automatically in RooUnfoldBayes).
RooUnfoldResponse::GetBin and FindBin, which return global bin for multi-dimensional histogram corresponding to vector index or x value. Bug reported by Peter Waller.
RooUnfoldBayesImpl: added a getChi2 method. Added 1D smoothing. Speeded up covariance matrix.
RooUnfoldTest.
RooUnfoldTest can test different numbers of bins in truth and measured distributions.
RooUnfoldTest2D only calculates errors with the Bayes algorithm for 25 or fewer bins. It takes a long time for more bins (goes as the 4th power of the number of bins).
RooUnfoldTest: Added a pulls histogram. Draw a line at y=0 on residual plot. Added simple checking of some command line parameters.
RooUnfoldTest smear method so it now works for different x-axis ranges. New test distribution of exponential decaying background and a resonance (i.e. Higgs-like).
GNUmakefile. If $ROOTSYS/test/Makefile.arch doesn't exist, it gets settings from root-config as suggested by Peter Waller. Use -lRooFitCore if available (seems to be needed with ROOT 5.18 Cygwin). make ROOTBUILD=debug for debug build. Allow make VERBOSE=1 to display compilation commands.
13th August 2008: add brief instructions for setting up ROOT.
13th May 2008: updated RooUnfold slides for a talk I gave today. Updated SPIRES URL.
23rd January 2008: update to version 0.1.5:
RooUnfoldTest2D.ps file name.
tau parameter to kterm to match Hoecker and Kartvelishvili's usage (they use k for the last term used in the expansion).
2nd August 2007: update to version 0.1.4 with these changes to the SVD algorithm from Kerstin Tackmann:
Clone. This should get rid of the segfaults that Jochen has been seeing.
TUnfHisto::GetCov that could lead to small numerical changes in the estimated uncertainties.
12th July 2007: mention TH1D and TH2D classes explicitly. RooUnfold only supports histograms of doubles, not eg. TH1F.
17th April 2007: first public version.