
PIDCalib Package

About The Package

This package enables the user to properly calibrate the DLL distributions of their signal mode, and thus calculate the efficiency of a given cut accurately.

It is assumed that the DLL of a particular track depends on a small set of kinematic or event variables. The set of these variables has changed during development of the package; at the time of writing, track momentum, pseudorapidity and the number of tracks in the event are the most commonly used.

Through "golden modes" that can be reconstructed without the use of the RICH detectors, it is possible to acquire pure calibration samples of pions and kaons. The DLL distributions of these tracks show the true response of the RICH detectors to tracks with the same kinematics.

The calibration and signal samples under study (acquired by some sort of fit, and appropriate background subtraction) are then binned in the chosen variables. In the limit where the bins are infinitely fine, and assuming that the DLL depends only on the kinematic and event variables used, the DLL distribution of tracks in a particular bin of the calibration sample should be single-valued (this is the crucial principle on which the analysis is built). The populations of each bin in the calibration and signal samples are compared, and their ratio is assigned as the weight for that bin. The true DLL distribution of the signal tracks is then given by the DLL distribution of the calibration sample weighted bin by bin.
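The bin-by-bin weighting described above can be sketched in standalone C++. This is only an illustration and not the package's own code: the Track struct, the binWeights function and the precomputed bin index are hypothetical names, and the weight is taken as the plain ratio of signal to calibration populations in each bin.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical track record (not a PIDCalib type): 'bin' is the index of the
// kinematic bin the track falls in, 'dll' is its DLL value.
struct Track {
    std::size_t bin;
    double dll;
};

// Per-bin weight = (signal population) / (calibration population).
// Bins with no calibration tracks get weight 0, since there is nothing
// in the calibration sample to reweight there.
std::vector<double> binWeights(const std::vector<Track>& calib,
                               const std::vector<Track>& signal,
                               std::size_t nBins) {
    assert(nBins > 0 && "need at least one bin");
    std::vector<double> nCal(nBins, 0.0), nSig(nBins, 0.0), w(nBins, 0.0);
    // Count tracks per bin; tracks outside the binning are ignored.
    for (const Track& t : calib) {
        if (t.bin < nBins) nCal[t.bin] += 1.0;
    }
    for (const Track& t : signal) {
        if (t.bin < nBins) nSig[t.bin] += 1.0;
    }
    for (std::size_t i = 0; i < nBins; ++i) {
        if (nCal[i] > 0.0) w[i] = nSig[i] / nCal[i];
    }
    return w;
}
```

Each calibration track would then enter the weighted DLL distribution with the weight of its bin. An overall normalisation of the weights does not change the shape of that distribution.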

The weighting process is repeated for each track in the final state of the signal mode, to obtain the DLL distributions for all tracks.

Getting the Code

Instructions for getting and compiling the code, along with some more technical information, can be found here.

Usage

Test_Data.cpp

Test_Data.cpp is the script provided with the package to do the weighting with data. You give it the kinematics of your signal sample from data as input, and it weights the data calibration samples using just momentum and pseudorapidity as the weighting variables. I have attached a heavily annotated version of Test_Data.cpp, which should answer most usage questions.

The other scripts provided for the weighting procedure are Test_MC.cpp, Test_EvtMC.cpp and Test_EvtData.cpp. Test_MC.cpp does the same as Test_Data.cpp on Monte Carlo, and is used primarily for validation of the procedure and systematic calculations. Test_EvtData.cpp does the same as Test_Data.cpp, but the number of tracks in the event is used as a weighting variable in addition to track momentum and pseudorapidity. Test_EvtMC.cpp is the Monte Carlo counterpart of Test_EvtData.cpp.

Example Results

Here are some example plots of the kind that Test_Data.cpp should produce. They are drawn by running the PlotWeightedDLLs.cpp script located in PIDPerfScripts/scripts/ in the package.

Left: Example results when weighting the calibration sample of pions with the kinematics of a signal track.
Right: Results of weighting the calibration sample of kaons with kinematics of another mode.

On these plots the black points are the DLL distributions of the unweighted calibration samples, and the red points show the DLL distributions of the calibration sample once it has been weighted with a signal sample. The plot on the left is for a pion track, the one on the right for a kaon. Please ignore the "LHCb Preliminary" and "LHCb Monte Carlo" labels.

Note that, although PlotWeightedDLLs.cpp gives you the option of plotting the DLL distribution of your signal, this is meaningless when plotting results of weighting with data: the DLL distribution of the signal is what the weighting has determined, and the one already stored is probably incorrect, since it has not been calibrated with this procedure. Plotting the signal DLL distribution is mainly useful for Monte Carlo validation of the procedure and for systematic calculations, discussed in a subsequent section. The use of the PlotWeightedDLLs script is illustrated with another annotated attachment.

Plotting Performance Curves

Use Test_PerfCalc.cpp, located in PIDPerfScripts/src, to do this. The code is fairly self-explanatory; please check the member functions of Perf_Calculator (in PIDPerfTools/PIDPerfTools/PerfCalculator.h) to see what kinds of graphs you can draw.

A simple example is the efficiency of a DLL cut as a function of the cut value, drawn with the "Perf_Scan" function; an example is shown below.

This curve is obtained from the red curve in the kaon example in the previous section: each point is the fraction of that curve's integral that passes the cut, evaluated at a different cut value.
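The idea behind such an efficiency scan can be sketched in standalone C++. This is a minimal illustration and not the Perf_Scan implementation: WeightedDLL, cutEfficiency and scanEfficiency are hypothetical names, and the selection is assumed to be DLL > cut, which the text above does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record: one calibration track with its DLL value and the
// weight assigned to its kinematic bin.
struct WeightedDLL {
    double dll;
    double weight;
};

// Fraction of the total weight with DLL above the cut
// (assumption: the selection is DLL > cut).
double cutEfficiency(const std::vector<WeightedDLL>& sample, double cut) {
    double pass = 0.0, total = 0.0;
    for (const WeightedDLL& t : sample) {
        total += t.weight;
        if (t.dll > cut) pass += t.weight;
    }
    return total > 0.0 ? pass / total : 0.0;
}

// Efficiency at nSteps evenly spaced cut values between lo and hi inclusive.
std::vector<double> scanEfficiency(const std::vector<WeightedDLL>& sample,
                                   double lo, double hi, std::size_t nSteps) {
    assert(nSteps >= 2 && "need at least two scan points");
    std::vector<double> eff;
    eff.reserve(nSteps);
    for (std::size_t i = 0; i < nSteps; ++i) {
        const double cut = lo + (hi - lo) * static_cast<double>(i) /
                                    static_cast<double>(nSteps - 1);
        eff.push_back(cutEfficiency(sample, cut));
    }
    return eff;
}
```

For a selection of the form DLL < cut, the comparison in cutEfficiency would simply be reversed.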

Systematics

There is a systematic contribution to the uncertainty on the DLL distributions that comes from the weighting procedure itself. For example, a coarse binning causes a loss of information, since the appropriate weight is averaged over the whole bin. Similarly, the procedure assumes that the DLL depends only on the kinematic and event variables used, while it could also depend on other variables.

The systematic inherent in the weighting is determined using Monte Carlo. The Monte Carlo version of the calibration sample used for computing the efficiencies in section 4 is weighted with Monte Carlo kinematic distributions of the signal tracks. This is done using Test_MC.cpp or Test_EvtMC.cpp, which work in a similar way to Test_Data.cpp, explained in the attachment.

Example output of Test_MC and subsequent plotting of the DLL distributions is shown below.

Left: MC signal and unweighted calibration sample DLL distributions example.
Right: MC signal and weighted calibration sample DLL distributions.

The blue points are the Monte Carlo signal, the black points are the unweighted calibration sample, and the red points are the calibration sample after weighting with the kinematics of the signal sample. Notice that red and blue agree better than black and blue. This is how the process is validated with Monte Carlo. A difference still exists between the red and blue distributions, however, and this is the systematic uncertainty.

The systematic is then the difference between the efficiencies calculated from the red and blue DLL distributions above. So one must run the output of Test_MC through Test_PerfCalc (in exactly the same way as for plotting the efficiency curves in data) and subtract the two graphs plotted by the Perf_Scan function. An example of the systematic uncertainty in efficiency plotted as a function of DLL cut is shown below.
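The subtraction of the two efficiency curves amounts to a point-by-point difference, which can be sketched in standalone C++. The function name weightingSystematic is hypothetical, and the sketch assumes both curves were evaluated at the same DLL cut values. It does not use any class from the package.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Point-by-point difference between the efficiency curve of the weighted
// Monte Carlo calibration sample and that of the Monte Carlo signal.
// Assumption: both vectors hold efficiencies at the same DLL cut values.
std::vector<double> weightingSystematic(
    const std::vector<double>& effWeightedCalib,
    const std::vector<double>& effSignalMC) {
    assert(effWeightedCalib.size() == effSignalMC.size() &&
           "curves must be sampled at the same cuts");
    std::vector<double> diff(effWeightedCalib.size());
    for (std::size_t i = 0; i < diff.size(); ++i) {
        diff[i] = effWeightedCalib[i] - effSignalMC[i];
    }
    return diff;
}
```

The sign is kept here, so a positive value means the weighted calibration sample overestimates the efficiency at that cut.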

Example of a weighting systematic as a function of DLL cut.

As you can see, in the region of interest (cuts around zero), the systematic uncertainty is approximately 0.5%. This value varies with the specific kinematics of the track being studied, however, and may not always be so low.

Analysis Specific Details

This section contains details of how this package was used in the measurement of the branching ratio of Bs -> DK* relative to Bd -> DRho; the analysis page is here.

First, the binning scheme used 32 bins in momentum over the range (0, 150) GeV/c and 4 bins in pseudorapidity over the range (1.5, 5). The number of tracks per event was not used because the first two variables were found to have the dominant effect on PID efficiency. Since statistics were limited in this analysis, binning in only two variables greatly reduced the statistical error.
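To make this binning scheme concrete, here is a minimal standalone C++ sketch that maps a track to a flat bin index. It assumes uniform bin widths, which the text above does not state. The function name binIndex and the constants are hypothetical and are not taken from the package.

```cpp
#include <cassert>

// Binning quoted in the text; uniform bin widths are an assumption.
constexpr int kNP = 32;         // momentum bins
constexpr double kPMin = 0.0;   // GeV/c
constexpr double kPMax = 150.0; // GeV/c
constexpr int kNEta = 4;        // pseudorapidity bins
constexpr double kEtaMin = 1.5;
constexpr double kEtaMax = 5.0;

// Returns a flat index in [0, kNP * kNEta), or -1 if the track lies outside
// the binned range (NaN inputs also return -1).
int binIndex(double p, double eta) {
    const bool inRange = (p >= kPMin && p < kPMax &&
                          eta >= kEtaMin && eta < kEtaMax);
    if (!inRange) return -1;
    const int ip = static_cast<int>((p - kPMin) / (kPMax - kPMin) * kNP);
    const int ie =
        static_cast<int>((eta - kEtaMin) / (kEtaMax - kEtaMin) * kNEta);
    assert(ip >= 0 && ip < kNP && ie >= 0 && ie < kNEta);
    return ie * kNP + ip;
}
```

With 32 x 4 = 128 bins, a sample of a few tens of events leaves most bins empty, which shows why adding a third binning variable was not practical in this analysis.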

In fact, since this analysis was done with a very limited number of events, the weighting was not done with events from data at all. Instead the kinematics of Monte Carlo samples were used to weight the calibration sample, with the assumption that momentum and pseudorapidity were well simulated in Monte Carlo. This, however, did not remove the problem of low statistics because the use of Monte Carlo kinematics to weight the calibration sample introduced a second systematic uncertainty from the possible differences between these distributions in data and Monte Carlo.

This second systematic required the data (only 36 events in Bs -> DK*) to be binned in two dimensions and used in the weighting (although this could not be used to calculate the efficiency itself, for technical reasons not detailed here).

In the end, in this analysis we were able to quote efficiencies calculated from this procedure with maximum total errors of roughly 5%. This is an improvement on previous figures, and without the hack of using Monte Carlo kinematics instead of data this error would be considerably lower.

-- EdmundSmith - 2011-08-10

Topic attachments
Attachment, History, Size, Date, Who, Comment
MCsig_Wgt_D0Kbs.png, r1, 13.7 K, 2011-08-11 13:47, EdmundSmith, MC signal and weighted calibration samples DLL.
MCsig_unWgt_D0Kbs.png, r1, 13.8 K, 2011-08-11 13:47, EdmundSmith, MC signal and unweighted calibration samples DLL.
PlotWeightedDLLs.cpp, r1, 9.8 K, 2011-08-11 12:43, EdmundSmith, Annotated version of PlotWeightedDLLs.cpp
Test_Data.cpp, r1, 13.2 K, 2011-08-10 15:15, EdmundSmith, Heavily annotated version of Test_Data.cpp
Wgt_unWgt_D0Kbs.png, r1, 13.7 K, 2011-08-11 11:50, EdmundSmith, Example results of the weighting.
Wgt_unWgt_RhoPiPlus.png, r1, 14.0 K, 2011-08-11 12:12, EdmundSmith, Example results of the weighting for pions.
eff_curve_D0Kbs.png, r1, 14.6 K, 2011-08-11 13:19, EdmundSmith, Efficiency as a function of DLL cut.
weight_sys_D0Kbs.png, r1, 19.4 K, 2011-08-11 13:56, EdmundSmith, Weighting systematic example as a function of DLL cut.
Topic revision: r4 - 2011-12-06 - EdmundSmith