Please let me know if you use this software. This will further encourage me to continue working on it, and I will let you know about any future updates.
See this overview of RooUnfold
or the references below for more information.
To cite the RooUnfold package in a publication, you can refer
to this web page and/or the paper:
Tim Adye, in Proceedings of the PHYSTAT 2011 Workshop on Statistical Issues Related to Discovery Claims in Search Experiments and Unfolding, CERN, Geneva, Switzerland, 17–20 January 2011, edited by H.B. Prosper and L. Lyons, CERN–2011–006, pp. 313–318.
The unfolding procedure reconstructs the true distribution T_j from the measured distribution M_i, taking into account the measurement uncertainties due to statistical fluctuations in the finite measured sample (without these uncertainties, the problem could be solved uniquely by inverting the response matrix). RooUnfold provides several algorithms for solving this problem.
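The remark about inverting the response matrix can be illustrated with a two-bin toy. The sketch below is not RooUnfold code: the names Vec2, Mat2, fold and unfoldByInversion are made up for illustration, and R[i][j] is assumed to be the probability that an event from true bin j is measured in bin i.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Illustrative types only (not part of RooUnfold).
using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// Fold a true distribution through the response: M_i = sum_j R_ij T_j.
Vec2 fold(const Mat2& R, const Vec2& T) {
    return {R[0][0] * T[0] + R[0][1] * T[1],
            R[1][0] * T[0] + R[1][1] * T[1]};
}

// Unfold by direct inversion of a 2x2 response matrix.
Vec2 unfoldByInversion(const Mat2& R, const Vec2& M) {
    const double det = R[0][0] * R[1][1] - R[0][1] * R[1][0];
    assert(std::fabs(det) > 1e-12 && "response matrix must be invertible");
    return {( R[1][1] * M[0] - R[0][1] * M[1]) / det,
            (-R[1][0] * M[0] + R[0][0] * M[1]) / det};
}
```

With R = {{0.6, 0.4}, {0.4, 0.6}} and T = (100, 50), folding gives M = (80, 70) and inversion returns (100, 50). Shifting the first measured bin by 5 events, to (85, 70), moves the inverted result to (115, 40), which shows how plain inversion amplifies statistical fluctuations.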
The iterative and SVD unfolding algorithms require a regularisation parameter to prevent the statistical fluctuations being interpreted as structure in the true distribution. It is therefore necessary to optimise this parameter for the number of bins and sample size, using Monte Carlo samples of the same size as the data. These samples can also be used to measure the effectiveness of the unfolding and hence provide estimates of the systematic errors that result from the procedure (testing).
Note that for this last step (in particular), it is important to use Monte Carlo samples with truth distributions that are statistically and systematically independent of the sample used in training (such samples would anyway be used in a systematics analysis, eg. using a different generator, or reweighting variations within the a-priori uncertainties in the truth distribution). After all, if the Monte Carlo were a perfect model of the data, we could use the Monte Carlo truth information directly and dispense with unfolding altogether!
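One simple way to quantify how well an unfolded test sample reproduces its known truth is a χ2 built from the bin errors. The function below is a minimal sketch under stated assumptions: chi2Diagonal is a hypothetical helper (not a RooUnfold method), it uses only the diagonal errors and so ignores bin-to-bin correlations, and bins without a positive error are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: chi2 between unfolded result and test truth,
// using diagonal errors only (correlations are ignored in this sketch).
double chi2Diagonal(const std::vector<double>& unfolded,
                    const std::vector<double>& errors,
                    const std::vector<double>& truth) {
    assert(unfolded.size() == errors.size());
    assert(unfolded.size() == truth.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < unfolded.size(); ++i) {
        if (errors[i] <= 0.0) continue;  // skip bins without a usable error
        const double pull = (unfolded[i] - truth[i]) / errors[i];
        sum += pull * pull;
    }
    return sum;
}
```

Since unfolding generally introduces correlations between bins, a full treatment would use the covariance matrix; this diagonal form is only a quick first check.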
The bin-by-bin method assumes no migration of events between bins (eg. resolution is much smaller than the bin size and no systematic shifts). This is of course trivial to implement without resorting to the RooUnfold machinery, but is included in the package to allow simple comparison with the other methods.
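As an illustration of how little machinery the bin-by-bin method needs, here is a sketch using plain vectors. It assumes the common formulation in which each measured bin is scaled by the ratio of training truth to training measured content; binByBinUnfold is a hypothetical name and this is not taken from the RooUnfoldBinByBin source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: scale each measured data bin by the training
// truth/measured ratio for that bin (no migration between bins assumed).
std::vector<double> binByBinUnfold(const std::vector<double>& trainTruth,
                                   const std::vector<double>& trainMeasured,
                                   const std::vector<double>& dataMeasured) {
    assert(trainTruth.size() == trainMeasured.size());
    assert(trainTruth.size() == dataMeasured.size());
    std::vector<double> unfolded(dataMeasured.size(), 0.0);
    for (std::size_t i = 0; i < dataMeasured.size(); ++i) {
        if (trainMeasured[i] > 0.0) {  // leave bins with no training events at 0
            unfolded[i] = dataMeasured[i] * trainTruth[i] / trainMeasured[i];
        }
    }
    return unfolded;
}
```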
To use RooUnfold, we must first supply the response matrix object, RooUnfoldResponse.
It can be constructed like this:
    RooUnfoldResponse response (nbins, x_lo, x_hi);
or, if different truth and measured binning is required,
    RooUnfoldResponse response (nbins_measured, x_lo_measured, x_hi_measured, nbins_true, x_lo_true, x_hi_true);
or, if different binning is required,
    RooUnfoldResponse response (hist_measured, hist_truth);
In that last case, hist_measured and hist_truth are used to specify the dimensions of the distributions (the histogram contents are not used here), eg. for 2D or 3D distributions or non-uniform binning.
The RooUnfoldResponse object is often most easily filled by looping over the training sample and calling response.Fill (x_measured, x_true) or, for events that were not measured due to detection inefficiency, response.Miss (x_true), eg.
    if (measurement_ok) response.Fill (x_measured, x_true);
    else                response.Miss (x_true);
Alternatively, the response matrix can be constructed from a pre-existing TH2D 2-dimensional histogram (with truth and measured distribution TH1D histograms for normalisation).
The response object can be passed directly to the unfolding object, or written to a ROOT file for use at a later stage (search the example code for the stage parameter for an example of how to do this).
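To make the meaning of filling and missing concrete, the following sketch accumulates a response matrix with plain containers. ToyResponse is a hypothetical stand-in written for this illustration (1D, uniform binning only); it mimics the Fill/Miss calling pattern described above but is not the RooUnfoldResponse implementation, and the Probability method is an assumed name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D response object with uniform binning.
class ToyResponse {
public:
    ToyResponse(std::size_t nbins, double lo, double hi)
        : nbins_(nbins), lo_(lo), hi_(hi),
          counts_(nbins, std::vector<double>(nbins, 0.0)),
          truth_(nbins, 0.0) {
        assert(nbins > 0 && hi > lo);
    }

    // Event generated at xTrue and reconstructed at xMeasured.
    void Fill(double xMeasured, double xTrue) {
        const long i = bin(xMeasured), j = bin(xTrue);
        if (j < 0) return;                 // truth outside range: ignored here
        truth_[j] += 1.0;
        if (i >= 0) counts_[i][j] += 1.0;  // measured outside range counts as lost
    }

    // Event generated at xTrue but not reconstructed (inefficiency).
    void Miss(double xTrue) {
        const long j = bin(xTrue);
        if (j >= 0) truth_[j] += 1.0;
    }

    // Estimated P(measured in bin i | true in bin j), including inefficiency.
    double Probability(std::size_t i, std::size_t j) const {
        return truth_[j] > 0.0 ? counts_[i][j] / truth_[j] : 0.0;
    }

private:
    // Bin index for x, or -1 if x is outside [lo, hi).
    long bin(double x) const {
        if (x < lo_ || x >= hi_) return -1;
        const long b = static_cast<long>((x - lo_) / (hi_ - lo_) * nbins_);
        const long last = static_cast<long>(nbins_) - 1;
        return b < last ? b : last;
    }

    std::size_t nbins_;
    double lo_, hi_;
    std::vector<std::vector<double>> counts_;  // counts_[measured][true]
    std::vector<double> truth_;                // all generated events per true bin
};
```

Because missed events enter only the truth normalisation, each column of the matrix sums to the efficiency of that true bin rather than to 1.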
To do the unfolding (either to try different regularisation parameters, for testing, or for real data), create a RooUnfold object and pass it the test / measured distribution (as a histogram) and the response object:
    RooUnfoldBayes unfold (&response, hist_measured, iterations);
or
    RooUnfoldSvd unfold (&response, hist_measured, kterm);
or
    RooUnfoldBinByBin unfold (&response, hist_measured);
hist_measured is a pointer to a TH1D (or TH2D for the 2D case) histogram of the measured distribution (it should have the same binning as the response matrix). The classes RooUnfoldBayes, RooUnfoldSvd, and RooUnfoldBinByBin all inherit from RooUnfold and implement the different algorithms. The integer iterations (for RooUnfoldBayes) or kterm (RooUnfoldSvd) is the regularisation parameter. (Note that RooUnfoldSvd's kterm parameter is also known as tau in the code. That usage is incompatible with the literature, so we adopt k here.)
The reconstructed truth distribution (with errors) can be obtained with the Hreco() method:
    TH1D* hist_reco= (TH1D*) unfold.Hreco();
The result can also be obtained as a
Multi-dimensional distributions can also be unfolded, though this does not work for the SVD method, and the interface is rather clumsy (we hope to improve this).
See the class documentation for details of the
RooUnfoldResponse public methods.
A very simple example of RooUnfold's use is given in examples/RooUnfoldExample.cxx.
More complete tests, using different toy MC distributions, are in examples/RooUnfoldTest.cxx.
For RooUnfoldBayes, the regularisation parameter specifies the number of iterations, starting with the training sample truth (iterations=0). You should choose a small integer greater than 0 (we use 4 in the examples). Since only a few iterations are needed, a reasonable performance can usually be obtained without fine-tuning the parameter.
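The role of the iteration count can be seen in a compact sketch of iterative Bayesian unfolding on plain vectors. This is an illustration under stated assumptions, not the RooUnfoldBayes code: iterativeUnfold, Vec and Mat are hypothetical names, R[i][j] is taken to be the probability that an event from true bin j is measured in bin i, and no smoothing or error propagation is included.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // R[i][j] = P(measured bin i | true bin j)

// Hypothetical helper: iterative Bayesian unfolding without smoothing or errors.
Vec iterativeUnfold(const Mat& R, const Vec& measured,
                    const Vec& trainTruth, int iterations) {
    const std::size_t nMeas = measured.size(), nTrue = trainTruth.size();
    assert(R.size() == nMeas);
    for (const Vec& row : R) assert(row.size() == nTrue);

    Vec estimate = trainTruth;  // iterations=0: the training truth itself
    for (int it = 0; it < iterations; ++it) {
        Vec next(nTrue, 0.0);
        for (std::size_t i = 0; i < nMeas; ++i) {
            // Expected content of measured bin i under the current estimate.
            double folded = 0.0;
            for (std::size_t j = 0; j < nTrue; ++j) folded += R[i][j] * estimate[j];
            if (folded <= 0.0) continue;
            // Share measured[i] among the true bins using Bayes' theorem.
            for (std::size_t j = 0; j < nTrue; ++j)
                next[j] += measured[i] * R[i][j] * estimate[j] / folded;
        }
        // Correct for efficiency: eff_j = sum_i R[i][j].
        for (std::size_t j = 0; j < nTrue; ++j) {
            double eff = 0.0;
            for (std::size_t i = 0; i < nMeas; ++i) eff += R[i][j];
            next[j] = eff > 0.0 ? next[j] / eff : 0.0;
        }
        estimate = next;
    }
    return estimate;
}
```

Each pass uses the previous estimate as the prior, so the result starts at the training truth and moves further from it with every iteration.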
The optimal regularisation parameter can be selected by finding the largest value up to which the errors remain reasonable (ie. do not become much larger than previous values). This will give the smallest systematic errors (reconstructed distribution least biased by the training truth), without too-large statistical errors. Since the statistical errors grow quite rapidly beyond this point, but the systematic bias changes quite slowly below it, it can be prudent to reduce the regularisation parameter a little below this optimal point.
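The selection rule described above can be written as a small scan over a summary error obtained for each parameter value. This is only a sketch: chooseRegParm is a hypothetical helper, the summary error (for instance the quadratic sum of the bin errors) has to be supplied by the caller, and the growth threshold of 1.5 is an arbitrary assumption standing in for "much larger than previous values".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// errors[n] is a summary error obtained with regularisation parameter n+1.
// Returns the largest parameter reached before the error grows by more than
// a factor maxGrowth (an assumed threshold) relative to the previous value.
int chooseRegParm(const std::vector<double>& errors, double maxGrowth = 1.5) {
    assert(!errors.empty() && maxGrowth >= 1.0);
    std::size_t best = 0;
    for (std::size_t n = 1; n < errors.size(); ++n) {
        if (errors[n] > maxGrowth * errors[n - 1]) break;  // errors start to blow up
        best = n;
    }
    return static_cast<int>(best) + 1;
}
```

Following the advice above, one might then use a value slightly below the one returned.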
For RooUnfoldSvd, the unfolding is something like a Fourier expansion in "result to be obtained" vs "MC truth input". Low frequencies are assumed to be systematic differences between the training MC and the data, which should be retained in the output. High frequencies are assumed to arise from statistical fluctuations in data and unfortunately get numerically enhanced without proper regularisation. Choosing the regularisation parameter, k (kterm), effectively determines up to which frequencies the terms in the expansion are kept. (Actually, this is not quite true: we don't use a hard cut-off but a smooth one.)
The correct choice of k is of particular importance for the SVD method. A too-small value will bias the unfolding result towards the MC truth input; a too-large value will give a result that is dominated by unphysically enhanced statistical fluctuations. This needs to be tuned for any given distribution, number of bins, and approximate sample size, with k between 2 and the number of bins. (Using k=1 means you get only the training truth input as result without any corrections. You basically regularise away any differences, and only keep the leading term which is, by construction, the MC truth input.)
Höcker and Kartvelishvili's paper (section 7) describes how to choose the optimum value for k.
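The smooth cut-off mentioned above can be pictured as a damping factor applied to each term of the expansion. The sketch below assumes the form s_i^2 / (s_i^2 + tau) with tau = s_k^2, where s_i are the singular values in decreasing order; this is the form usually quoted for SVD unfolding, but whether RooUnfoldSvd applies exactly this expression is not stated here, and svdFilterFactors is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Damping factor for each expansion term, assuming tau = s_k^2
// (singularValues sorted in decreasing order, k counted from 1).
std::vector<double> svdFilterFactors(const std::vector<double>& singularValues,
                                     std::size_t k) {
    assert(k >= 1 && k <= singularValues.size());
    const double tau = singularValues[k - 1] * singularValues[k - 1];
    assert(tau > 0.0);
    std::vector<double> factors;
    factors.reserve(singularValues.size());
    for (double s : singularValues) factors.push_back(s * s / (s * s + tau));
    return factors;
}
```

Terms with singular values well above s_k are kept almost unchanged, the k-th term is halved, and terms well below s_k are strongly suppressed, which is what makes the cut-off smooth rather than hard.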
The $ROOTSYS environment variable should point to the top-level ROOT directory, $ROOTSYS/bin should be in your $PATH, and $ROOTSYS/lib should be in your library path ($LD_LIBRARY_PATH on most Unix systems). In recent versions of ROOT (from 5.18), this can be most easily achieved using ROOT's thisroot.sh setup script. Eg. to use the CERN AFS version 5.28/00a on Scientific Linux 4/5 (x32) from a Bourne-type shell:
shell> . /afs/cern.ch/sw/lcg/app/releases/ROOT/5.28.00a/slc4_ia32_gcc34/root/bin/thisroot.sh
For further details, consult the ROOT "Getting Started" documentation, or your local system administrator.
Download RooUnfold-1.1.1.tar.gz (or other versions here) and unpack:
shell> tar zxf RooUnfold-1.1.1.tar.gz
shell> cd RooUnfold-1.1.1
Use GNU make. Just type
shell> make
to build the RooUnfold shared library.
If using ROOT (CINT or ACLiC), the library can be loaded automatically when
a RooUnfold class is first used. This only works if your current directory is the
RooUnfold top-level directory (containing libRooUnfold.so) or that directory has been
added to your dynamic path. Otherwise, you can load the library with
root  gSystem->Load("/home/RooUnfold-1.1.1/libRooUnfold");
root  RooUnfoldResponse response(10,-1,1);
To use ACLiC, you also need to add the RooUnfold headers to the include path.
This can be done with
.include src from the ROOT command prompt, eg.
root  .include src
root  .x MyCode.cxx+
To build stand-alone, you need to specify the RooUnfold headers in the src subdirectory and specify the RooUnfold shared library when linking. Alternatively, you can use make (with RooUnfold's makefile) to compile and link your own code, eg. make MyProgram will compile MyProgram.cxx and link with RooUnfold.
examples/RooUnfoldExample.cxx makes a simple test of RooUnfold.
shell> root
root  .x examples/RooUnfoldExample.cxx
More involved tests, allowing different toy MC PDFs to be used for training and testing, can be found in examples/RooUnfoldTest.cxx (which uses test class RooUnfoldTestHarness). To run RooUnfoldTest from within ROOT:
root  .x examples/RooUnfoldTest.cxx
The example programs can also be run from the shell command line.
shell> make bin
shell> ./RooUnfoldTest
and the output appears in
You can specify parameters for RooUnfoldTest (either as an argument to the routine, or as parameters to the program), eg.
root  RooUnfoldTest("method=2 ftestx=3")
or
shell> ./RooUnfoldTest method=2 ftestx=3
Use RooUnfoldTest("-h") to list all the parameters and their defaults.
method specifies the unfolding algorithm to use:
    0  no unfolding (output copied from measured input)
    5  matrix inversion
ftestx specify training and test PDFs:
    0  flat distribution
    1  top-hat distribution over mean ± 3 x width
    3  Double-sided exponential decay
    5  Double Breit-Wigner, peaking at mean-width and mean+width
    6  exponential
    7  Gaussian resonance on an exponential background
The centre and width scale of these signal PDFs can be specified with parameters (wtestx for the test width scale). A flat background is added (fraction btestx for the test sample). Detector effects are modelled with a variable shift (xbias in the centre, dropping to 0 near the edges), a smearing of xsmear bins, as well as a variable efficiency.
For 2D and 3D examples look at examples/RooUnfoldTest3D.cxx. The tests use RooFit to generate the toy distributions. (RooFit is not required to use the RooUnfold classes from another program, eg. examples/RooUnfoldExample.cxx). Hand-coded alternatives are provided if ROOT was not built with RooFit enabled (eg. --enable-roofit not specified). This version generates peaked signal events over their full range, so may have fewer events within the range than requested.
To disable the use of RooFit, #define NOROOFIT before loading the test code, eg.
root  #define NOROOFIT 1
root  .x examples/RooUnfoldTest.cxx
For the stand-alone case, use
shell> make bin NOROOFIT=1
to build (this is the default if RooFit is not available).
RooUnfoldSvd does not work correctly for multi-dimensional distributions (gives a warning).
Use of overflow bins (UseOverflow() setting) only works for 1D histograms.
The principal RooUnfold developer and contact is Tim Adye (RAL, T.J.Adye@rl.ac.uk).
The TUnfold interface, matrix inversion, and bin-by-bin algorithms as well as the error and parameter analysis frameworks were written by Richard Claridge (RAL).
The SVD algorithm was written by Kerstin Tackmann (LBNL) for the unfolding of the hadronic mass spectrum in B→Xulν.
An initial implementation of the iterative Bayesian algorithm was written by Fergus Wilson (RAL).
The latest development version of RooUnfold is maintained in the project's Subversion code repository, and can be viewed here (WebSVN, or viewvc). It can be checked out using:
shell> svn co https://svnsrv.desy.de/public/unfolding/RooUnfold/trunk RooUnfold
11 October 2012: Add formal citation to PHYSTAT 2011 paper.
17 November 2011: Add Loading RooUnfold section to detail how to load the RooUnfold libraries.
10 October 2011: update to version 1.1.1:
Setup method to take 0 or empty histogram for truth in the case where there are no fakes and/or inefficiencies. With 0, a 1D histogram is assumed.
RooUnfoldResponse::Vfakes. This fixes a crash reported by Katharina Bierwagen.
30 September 2011: update to version 1.1.0:
RooUnfold::SetMeasured can also be passed vectors for measured distribution and errors, or a matrix for the covariance matrix.
SetMeasuredCov just sets the covariance matrix.
bincorr=-0.5 gives an anti-correlation of 0.5 between neighbouring bins, a correlation of 0.25 between next-to-neighbours, etc. This input correlation matrix is also plotted for comparison with the unfolded output correlation matrix.
Fake(xmeas) to fill, or else add fakes to measured input histogram.
RooUnfoldTest fakexlo=0.1 fakexhi=0.2 (and similarly for z for 2D/3D) for linearly varying background level between 0.1 at xlo and 0.2 at xhi as a fraction of the number of measured events.
SetMeasuredCov or in the histogram errors) instead of assumed multinomial errors based on the number of measured events in each bin (as given by D'Agostini). If the user didn't specify any errors, then ROOT's default assumption of √N is used for each bin, which is similar to (and perhaps more appropriate than) a multinomial distribution if there are more than a few bins.
libRooUnfold.rootmap, which can be used to load RooUnfold in PyROOT. Added examples/RooUnfoldExample.py, equivalent to RooUnfoldExample.cxx. If ROOT was built with PyROOT support and the correct version of Python is used, examples/RooUnfoldExample.py can be executed as-is.
"make help" target.
RooUnfold::RunToy() returns new RooUnfold object.
Runtoy() method, which returned a histogram. This also fixes a memory leak reported by Seth Zenz.
RooUnfold::New when name but no title is specified.
RooUnfold::PrintTable for when truth histogram isn't specified.
RooUnfoldTest.root (previously just response object was written there).
RooUnfoldTest ploterrors=1 no longer calculates χ2 distribution. This allows the comparison of the errors to be run alone, which can be very much faster. Use ploterrors=2 to include the χ2 plot.
6 May 2011: reference RooUnfold write-up from PHYSTAT 2011.
9 February 2011: reference the PHYSTAT 2011 workshop on unfolding and deconvolution.
14 January 2011: update to version 1.0.3:
bayes.for for comparison with our RooUnfoldBayes (they implement the same algorithm). RooUnfoldDagostini is not normally compiled, but will be if bayes_c.for (download) are copied into the
RooUnfoldResponse::Add method, suggested by Seth Zenz.
13 September 2010: update to version 1.0.2:
RooUnfold::ErrorTreatment enum: no error treatment (RooUnfold::kNoError), use bin-by-bin errors (RooUnfold::kErrors) or full covariance matrix (RooUnfold::kCovariance) propagated through the unfolding, or covariance matrix from the variation of the results in toy MC tests (RooUnfold::kCovToy). This last method should be more accurate, especially for RooUnfoldBayes.
RooUnfoldTUnfold is a simple (though not yet fully-featured) interface to ROOT's TUnfold class (requires ROOT 5.22 and above).
RooUnfoldInvert performs a simple inversion of the response matrix.
RooUnfoldBinByBin code (no longer uses
RooUnfoldResponse::UseOverflow(). This currently only works for 1D histograms.
RooUnfold::New(), creates unfolding object based on the
RooUnfold and its subclasses.
Chi2(hTrue) calculates the χ2 of the unfolded results with respect to a true distribution,
Emeasured() return the measured distribution and its errors as vectors,
ErecoV() returns unfolding errors as a vector,
SetMeasured() allow the unfolding inputs to be changed separately,
GetRegParm() provide a common method to access the regularisation parameter, and
NToys() access the number of MC tests used in error calculation with the
Impl() methods return the unfolding implementation object for some algorithms.
RooUnfold::PrintTable() shows also the error, residual, and pull for each bin.
hpaper (plot paper width and height),
draw=0 (disables histogram drawing),
ploterrors=1 (error analysis using
plotparms=1 (regularisation parameter analysis using
1=unfolding uses overflows,
2=show under/overflow bins on test histograms).
30 July 2010: use thisroot.sh in ROOT setup example. Update RooUnfoldTest help files.
3 June 2010: reference the Alliance Workshop on Unfolding and Data Correction.
20 May 2010: update to version 0.2.2:
ntoys correctly (broken in 0.2.1).
RooUnfold::SetVerbose(level): 0=warnings, 1=verbose (default, as before), 2=debug, 3=detailed.
19 May 2010: update to version 0.2.1:
Clear) and to perform unfolding on demand, rather than on construction.
root  gSystem->Load("libRooUnfold")
root  .include src
root  .L examples/RooUnfoldTest.cxx+
root  RooUnfoldTest()
22nd January 2010: update to version 0.2.0:
RooUnfoldBayesImpl::train should use data input rather than MC input for n(Ej). This is _nEj. The upshot of this bug was that only the final iteration of the Bayesian unfolding had any effect on the result (though a single iteration still goes some way!). Problem reported by Jan Kapitan.
RooUnfoldBayesImpl. Client no longer uses trainBinByBin() directly. This is now done in unfoldBinByBin() (which specify the unfolding algorithm parameters, smoothit), since the training now requires the unfolding input. Of course users of RooUnfoldBinByBin won't see any difference, since they wrap
_nCi to 1 for the initial P0(C) rather than Nobs.
RooUnfoldBayes: fix if there are fewer measured than truth bins.
doUnfoldSystematic to enable systematic calculation. It remains disabled by default: I'm not sure it is correct, it is very slow, and the effect should be small with good MC statistics.
RooUnfoldSvd may not work very well for multi-dimensional distributions, so print warning if it's tried.
RooUnfold::PrintTable improvements: Don't show residuals and pulls for "empty" bins (both content=0 and error=0) and don't include them in the χ2. Fix bin numbering (no under/overflow, which aren't included in the unfolding) and show 2D/3D bin numbers. Also fixed for different number of measured and truth bins. Print test truth (hTrue), which is now optional.
RooUnfoldBayesImpl::train: remove redundant
RooUnfoldTest2D, and new RooUnfoldTest3D. They now use test harness classes, RooUnfoldTestHarness3D respectively. Test parameters can be specified on the command-line (or ROOT prompt): use RooUnfoldTest("-h") for details. New PDFs, which now include a constant background by default. Improved plots.
14th October 2009: update to version 0.1.9:
RooUnfold::PrintTable prints a table of the results and χ2/DF. This is called from
RooUnfold derived classes. Don't repeat RooUnfold::Setup when constructing derived classes.
kterm is negative or greater than the number of bins.
Miss(). Defaults to 1.0, so no change if not specified. Support for variable binning in 1D case as suggested by Seth Zenz.
RooUnfoldResponse::ApplyToTruth based on idea and code from Seth Zenz.
RooUnfoldBayesImpl does not call getCovariance after unfolding; now done automatically in
FindBin, which return global bin for multi-dimensional histogram corresponding to vector index or x value. Bug reported by Peter Waller.
RooUnfoldBayesImpl: added a getChi2 method. Added 1D smoothing. Speeded up covariance matrix.
RooUnfoldTest can test different numbers of bins in truth and measured distributions.
RooUnfoldTest2D only calculates errors with the Bayes algorithm for 25 or fewer bins. It takes a long time for more bins (goes as the 4th power of the number of bins).
RooUnfoldTest: Added a pulls histogram. Draw a line at y=0 on residual plot. Added simple checking of some command line parameters.
RooUnfoldTest smear method so it now works for different x-axis ranges. New test distribution of exponential decaying background and a resonance (i.e. Higgs-like).
$ROOTSYS/test/Makefile.arch doesn't exist, it gets settings from root-config as suggested by Peter Waller. Use -lRooFitCore if available (seems to be needed with ROOT 5.18 Cygwin).
make ROOTBUILD=debug for debug build. Allow make VERBOSE=1 to display compilation commands.
13th August 2008: add brief instructions for setting up ROOT.
13th May 2008: updated RooUnfold slides for a talk I gave today. Updated SPIRES URL.
23rd January 2008: update to version 0.1.5:
kterm to match Hoecker and Kartvelishvili's usage (they use k for the last term used in the expansion).
2nd August 2007: update to version 0.1.4 with these changes to the SVD algorithm from Kerstin Tackmann:
Clone. This should get rid of the segfaults that Jochen has been seeing.
TUnfHisto::GetCov that could lead to small numerical changes in the estimated uncertainties.
12th July 2007: mention classes explicitly. RooUnfold only supports histograms of doubles, not eg.
17th April 2007: first public version.