Difference: HowToUseThePIDCalibrationPackage (1 vs. 4)

Revision 4 2011-12-06 - EdmundSmith

 
META TOPICPARENT name="HowTos"

PIDCalib Package

About The Package

This package enables the user to properly calibrate the DLL distributions of their signal mode, and thus calculate the efficiency of a given cut accurately.

It is assumed that the DLL of a particular track depends on a small set of kinematic or event variables. The set of these variables has changed during development of the package; at the time of writing, track momentum, pseudorapidity and number of tracks in the event are the most commonly used.

Through "golden modes" that can be reconstructed without the use of the RICH detectors, it is possible to acquire pure calibration samples of pions and kaons. The DLL distributions of these tracks show the true response of the RICH detectors to tracks with the same kinematics.

The calibration and signal samples under study (acquired by some sort of fit, and appropriate background subtraction) are then binned in the chosen variables. In the limit where the bins are infinitely fine, and provided the assumption that the DLL depends only on the kinematic and event variables used is correct, the DLL distribution of tracks in a particular bin of the calibration sample should be single-valued (this is the crucial principle on which the analysis is built). The populations of each bin in the calibration and signal samples are compared, and their ratio is assigned as the weight of that bin. The true DLL distribution of the signal tracks is then given by the DLL distribution of the calibration sample weighted bin by bin.

The weighting process is repeated for each track in the final state of the signal mode, to obtain the DLL distributions for all tracks.
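
The weighting described above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not code from the package: the names binWeights, CalibTrack and weightedDllHistogram are hypothetical, the kinematic bins are flattened into a single index, and the weight is taken as signal population over calibration population so that the weighted calibration sample reproduces the signal kinematics.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the package): the weight of each kinematic
// bin is assumed to be signal population / calibration population.
std::vector<double> binWeights(const std::vector<double>& signalCounts,
                               const std::vector<double>& calibCounts) {
    assert(signalCounts.size() == calibCounts.size());
    std::vector<double> weights(signalCounts.size(), 0.0);
    for (std::size_t i = 0; i < weights.size(); ++i) {
        // An empty calibration bin carries no DLL information: weight stays 0.
        if (calibCounts[i] > 0.0) {
            weights[i] = signalCounts[i] / calibCounts[i];
        }
    }
    return weights;
}

// A calibration track: flattened kinematic bin index and measured DLL.
struct CalibTrack {
    std::size_t bin;
    double dll;
};

// Fill a DLL histogram from the calibration tracks, each entry weighted by
// the weight of the kinematic bin the track falls in.
std::vector<double> weightedDllHistogram(const std::vector<CalibTrack>& tracks,
                                         const std::vector<double>& weights,
                                         double dllMin, double dllMax,
                                         std::size_t nDllBins) {
    assert(nDllBins > 0 && dllMax > dllMin);
    std::vector<double> hist(nDllBins, 0.0);
    const double width = (dllMax - dllMin) / static_cast<double>(nDllBins);
    for (const CalibTrack& t : tracks) {
        // Skip tracks outside the kinematic binning or the DLL range.
        if (t.bin >= weights.size() || t.dll < dllMin || t.dll >= dllMax) {
            continue;
        }
        std::size_t k = static_cast<std::size_t>((t.dll - dllMin) / width);
        k = std::min(k, nDllBins - 1);  // guard against floating-point rounding
        hist[k] += weights[t.bin];
    }
    return hist;
}
```

The histogram returned by weightedDllHistogram plays the role of the "true" signal DLL distribution in the text; in practice the package scripts do this with their own classes, which this sketch does not attempt to reproduce.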

Getting the Code

How to get and compile the code and some more technical info can be found here.

Usage

Test_Data.cpp

Test_Data.cpp is the script provided with the package to do the weighting with data. You give it input of your signal sample kinematics from data, and it weights the data calibration samples using just momentum and pseudorapidity as the weighting variables. I have attached a heavily annotated version of Test_Data.cpp, which hopefully leaves any usage questions beyond doubt.

The other scripts provided for the weighting procedure are Test_MC.cpp, Test_EvtMC.cpp and Test_EvtData.cpp. Test_MC.cpp does the same as Test_Data.cpp on Monte Carlo, and is used primarily for validation of the procedure and systematic calculations. Test_EvtData.cpp does the same as Test_Data.cpp but uses the number of tracks in the event as a weighting variable in addition to track momentum and pseudorapidity. Test_EvtMC.cpp is the Monte Carlo counterpart of Test_EvtData.cpp.

Example Results

Here are some example plots that should be the result of Test_Data.cpp. These are drawn by running the PlotWeightedDLLs.cpp script located in PIDPerfScripts/scripts/ in the package.

Example results when weighting the calibration sample of pions with the kinematics of a signal track.
Results of weighting the calibration sample of kaons with kinematics of another mode.

On these plots the black points are the DLL distributions of the unweighted calibration samples, and the red points show the DLL distributions of the calibration sample once it has been weighted with a signal sample. The plot on the left is for a pion track, the one on the right for a kaon. Please ignore the "LHCb Preliminary" and "LHCb Monte Carlo" labels.

Note that, although PlotWeightedDLLs.cpp gives you the option of plotting the DLL distribution of your signal, this is meaningless when plotting results of weighting with data: the DLL distribution of the signal is what we have found with the weighting, and the one already stored is probably incorrect, since it has not been calibrated with this procedure. Plotting the signal DLL distribution is mainly for Monte Carlo validation of the procedure and for systematic calculations, discussed in a subsequent section. The use of the PlotWeightedDLLs script is illustrated with another annotated attachment.

Plotting Performance Curves

Use Test_PerfCalc.cpp, located in PIDPerfScripts/src, to do this. The code is fairly self-explanatory; please check the member functions of Perf_Calculator (in PIDPerfTools/PIDPerfTools/PerfCalculator.h) to see what kinds of graphs you can draw.

A simple example is the efficiency of a DLL cut as a function of the cut value, drawn with the "Perf_Scan" function; an example is shown below.

This curve is obtained by evaluating the fractional integral of the red curve in the kaon example of the previous section at different cut values.
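
As a rough illustration of what such an efficiency scan computes, the sketch below evaluates the weighted fraction of tracks passing a cut for a list of cut values. It does not use Perf_Calculator or Perf_Scan; WeightedTrack, cutEfficiency and scanEfficiency are hypothetical names, and the cut is assumed to have the form DLL > cut.

```cpp
#include <cassert>
#include <vector>

// A calibration track after weighting: its DLL and its kinematic-bin weight.
struct WeightedTrack {
    double dll;
    double weight;
};

// Fraction of the weighted sample passing the (assumed) requirement DLL > cut.
double cutEfficiency(const std::vector<WeightedTrack>& tracks, double cut) {
    double passed = 0.0;
    double total = 0.0;
    for (const WeightedTrack& t : tracks) {
        total += t.weight;
        if (t.dll > cut) {
            passed += t.weight;
        }
    }
    // An empty (or zero-weight) sample has no defined efficiency; return 0.
    return total > 0.0 ? passed / total : 0.0;
}

// Efficiency evaluated at each requested cut value, i.e. one point of the
// performance curve per entry in `cuts`.
std::vector<double> scanEfficiency(const std::vector<WeightedTrack>& tracks,
                                   const std::vector<double>& cuts) {
    std::vector<double> efficiencies;
    efficiencies.reserve(cuts.size());
    for (double cut : cuts) {
        efficiencies.push_back(cutEfficiency(tracks, cut));
    }
    return efficiencies;
}
```

Working track by track rather than from a binned histogram avoids any dependence on the DLL bin width, at the cost of looping over the sample once per cut value.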

Systematics

There is a systematic contribution to the uncertainty on the DLL distributions that comes from the weighting procedure itself. For example, a coarse binning causes loss of information, since the appropriate weight is averaged over the whole bin. Likewise, the assumption that the DLL depends only on the kinematic and event variables used may not hold, since it could depend on other variables.

The systematic inherent in the weighting is determined using Monte Carlo. The Monte Carlo version of the calibration sample used for computing the efficiencies in section 4 is weighted with Monte Carlo kinematic distributions of the signal tracks. This is done using Test_MC.cpp or Test_EvtMC.cpp, which work in a similar way to Test_Data.cpp, as explained in the attachment.

Example output of Test_MC and subsequent plotting of the DLL distributions is shown below.

MC signal and unweighted calibration sample DLL distributions example.
MC signal and weighted calibration samples DLL distributions.

The blue points are the Monte Carlo signal, the black points are the unweighted calibration sample, and the red points are the calibration sample after weighting with the kinematics of the signal sample. Notice that red and blue agree better than black and blue. This is how the process is validated with Monte Carlo. A difference still exists between the red and blue distributions, however, and this is the systematic uncertainty.

The systematic is then the difference in efficiencies calculated from the red and blue DLL distributions above. One must therefore run the output of Test_MC through Test_PerfCalc (in exactly the same way as when plotting the efficiency curves in data) and subtract the two graphs plotted by the Perf_Scan function. An example of the systematic uncertainty in efficiency plotted as a function of DLL cut is shown below.
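
The subtraction itself is simple. As a minimal sketch, assuming both efficiency curves have already been evaluated at the same cut values and stored as plain vectors (weightingSystematic is a hypothetical name, and the sign convention of weighted calibration minus Monte Carlo signal is a choice made here):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Point-by-point difference between the efficiency curve of the weighted MC
// calibration sample ("red") and that of the MC signal ("blue").
// Both curves must be evaluated at the same DLL cut values.
std::vector<double> weightingSystematic(
        const std::vector<double>& effWeightedCalib,
        const std::vector<double>& effMcSignal) {
    assert(effWeightedCalib.size() == effMcSignal.size());
    std::vector<double> difference(effWeightedCalib.size(), 0.0);
    for (std::size_t i = 0; i < difference.size(); ++i) {
        difference[i] = effWeightedCalib[i] - effMcSignal[i];
    }
    return difference;
}
```

Keeping the sign shows whether the weighted calibration sample over- or underestimates the efficiency at each cut; take the absolute value if only the size of the uncertainty is needed.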

example of a weighting systematic as a function of DLL cut.

As you can see, in the reasonable region of interest (around cuts of zero), the systematic uncertainty is approximately 0.5%, although this value varies according to the specific kinematics of the track being studied and may not always be so low.

Analysis Specific Details

This section contains details of how this package was used in the measurement of the branching ratio of Bs -> DK* relative to Bd -> DRho; the analysis page is here.

First, the binning scheme used was 32 bins in momentum in the range (0,150) GeV/c and 4 bins in pseudorapidity in the range (1.5,5). The number of tracks per event was not used because it was found that the first two variables had the dominant effect on PID efficiency. Since statistics were limited in this analysis, binning in only two variables greatly reduced the statistical error.
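
For illustration, a binning scheme of this shape can be mapped to a single bin index as below. The sketch assumes uniform bin widths in both variables, which the text does not state, and kinematicBin is a hypothetical helper rather than a function of the package.

```cpp
#include <algorithm>
#include <cassert>

// Binning quoted in the text: 32 momentum bins over (0,150) GeV/c and
// 4 pseudorapidity bins over (1.5,5). Uniform bin widths are an assumption.
constexpr int kMomentumBins = 32;
constexpr double kMomentumMin = 0.0;    // GeV/c
constexpr double kMomentumMax = 150.0;  // GeV/c
constexpr int kEtaBins = 4;
constexpr double kEtaMin = 1.5;
constexpr double kEtaMax = 5.0;

// Flattened index in [0, 128) of the (momentum, pseudorapidity) bin,
// or -1 if the track lies outside the binned region.
int kinematicBin(double momentum, double eta) {
    if (momentum < kMomentumMin || momentum >= kMomentumMax ||
        eta < kEtaMin || eta >= kEtaMax) {
        return -1;
    }
    const double pWidth = (kMomentumMax - kMomentumMin) / kMomentumBins;
    const double etaWidth = (kEtaMax - kEtaMin) / kEtaBins;
    int ip = static_cast<int>((momentum - kMomentumMin) / pWidth);
    int ie = static_cast<int>((eta - kEtaMin) / etaWidth);
    // Guard against floating-point rounding at the upper edges.
    ip = std::min(ip, kMomentumBins - 1);
    ie = std::min(ie, kEtaBins - 1);
    const int index = ie * kMomentumBins + ip;
    assert(index >= 0 && index < kMomentumBins * kEtaBins);
    return index;
}
```

With 128 bins in total and only a few tens of signal events, most bins are empty or nearly so, which is why adding a third variable would have made the statistical error worse.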

In fact, since this analysis was done with a very limited number of events, the weighting was not done with events from data at all. Instead the kinematics of Monte Carlo samples were used to weight the calibration sample, with the assumption that momentum and pseudorapidity were well simulated in Monte Carlo. This, however, did not remove the problem of low statistics because the use of Monte Carlo kinematics to weight the calibration sample introduced a second systematic uncertainty from the possible differences between these distributions in data and Monte Carlo.

This second systematic required the data (only 36 events in Bs -> DK*) to be binned in 2 dimensions and used in the weighting (although this couldn't be used to calculate the efficiency, which is an unnecessary technical detail...).

In the end, in this analysis we were able to quote efficiencies calculated from this procedure with maximum total errors of roughly 5%. This is an improvement on previous figures, and without the hack of using Monte Carlo kinematics instead of data this error would be a lot lower.

-- EdmundSmith - 2011-08-10

META FILEATTACHMENT attachment="Test_Data.cpp" attr="" comment="Heavily annotated version of Test_Data.cpp" date="1312989338" name="Test_Data.cpp" path="Test_Data.cpp" size="13498" stream="Test_Data.cpp" tmpFilename="/usr/tmp/CGItemp29225" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="Wgt_unWgt_D0Kbs.png" attr="" comment="Example results of the weighting." date="1313063457" name="Wgt_unWgt_D0Kbs.png" path="Wgt_unWgt_D0Kbs.png" size="14047" stream="Wgt_unWgt_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp29013" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="Wgt_unWgt_RhoPiPlus.png" attr="" comment="Example results of the weighting for pions." date="1313064739" name="Wgt_unWgt_RhoPiPlus.png" path="Wgt_unWgt_RhoPiPlus.png" size="14309" stream="Wgt_unWgt_RhoPiPlus.png" tmpFilename="/usr/tmp/CGItemp28861" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="PlotWeightedDLLs.cpp" attr="" comment="Annotated version of PlotWeightedDLLs.cpp" date="1313066608" name="PlotWeightedDLLs.cpp" path="PlotWeightedDLLs.cpp" size="10034" stream="PlotWeightedDLLs.cpp" tmpFilename="/usr/tmp/CGItemp28908" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="eff_curve_D0Kbs.png" attr="" comment="Efficiency as a function of DLL cut." date="1313068791" name="eff_curve_D0Kbs.png" path="eff_curve_D0Kbs.png" size="14988" stream="eff_curve_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp32892" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="MCsig_unWgt_D0Kbs.png" attr="" comment="MC signal and Unweighted calibrations samples DLL." date="1313070430" name="MCsig_unWgt_D0Kbs.png" path="MCsig_unWgt_D0Kbs.png" size="14103" stream="MCsig_unWgt_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp32951" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="MCsig_Wgt_D0Kbs.png" attr="" comment="MC signal and weighted calibrations samples DLL." date="1313070458" name="MCsig_Wgt_D0Kbs.png" path="MCsig_Wgt_D0Kbs.png" size="13989" stream="MCsig_Wgt_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp33102" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="weight_sys_D0Kbs.png" attr="" comment="Weighting systematic example as a function of DLL cut." date="1313071019" name="weight_sys_D0Kbs.png" path="weight_sys_D0Kbs.png" size="19860" stream="weight_sys_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp33081" user="EdmundSmith" version="1"

Revision 3 2011-08-11 - EdmundSmith

 
META TOPICPARENT name="HowTos"

PIDCalib Package

About The Package

This package enables the user to properly calibrate the DLL distributions of their signal mode, and thus calculate the efficiency of a given cut accurately.

It is assumed that the DLL of a particular track depends on a small set of kinematic or event variables. The set of these variables has changed during development of the package; at the time of writing, track momentum, pseudorapidity and number of tracks in the event are the most commonly used.

Through "golden modes" that can be reconstructed without the use of the RICH detectors, it is possible to acquire pure calibration samples of pions and kaons. The DLL distributions of these tracks show the true response of the RICH detectors to tracks with the same kinematics.

Changed:
<
<
The calibration and signal samples under study (acquired by some sort of fit, and appropriate background subtraction) are then binned in the chosen variables. In the limit where the bins are infinitely fine, and provided the assumption that the DLL depends only on the kinematic and event variables used is correct, the DLL distribution of tracks in a particular bin of the calibration sample should be single-valued (this is important, make sure you understand this). The populations of each bin in the calibration and signal samples are compared, and their ratio is assigned as the weight of that bin. The true DLL distribution of the signal tracks is then given by the DLL distribution of the calibration sample weighted bin by bin.
>
>
The calibration and signal samples under study (acquired by some sort of fit, and appropriate background subtraction) are then binned in the chosen variables. In the limit where the bins are infinitely fine, and provided the assumption that the DLL depends only on the kinematic and event variables used is correct, the DLL distribution of tracks in a particular bin of the calibration sample should be single-valued (this is the crucial principle on which the analysis is built). The populations of each bin in the calibration and signal samples are compared, and their ratio is assigned as the weight of that bin. The true DLL distribution of the signal tracks is then given by the DLL distribution of the calibration sample weighted bin by bin.
  The weighting process is repeated for each track in the final state of the signal mode, to obtain the DLL distributions for all tracks.

Getting the Code

How to get and compile the code and some more technical info can be found here.

Usage

Test_Data.cpp

Test_Data.cpp is the script provided with the package to do the weighting with data. You give it input of your signal sample kinematics from data, and it weights the data calibration samples using just momentum and pseudorapidity as the weighting variables. I have attached a heavily annotated version of Test_Data.cpp, which hopefully leaves any usage questions beyond doubt.

Other scripts to do the weighting procedure that are provided are Test_MC.cpp, Test_EvtMC.cpp and Test_EvtData.cpp. Test_MC.cpp performs the same as Test_Data.cpp on Monte Carlo, and is used primarily for validation of the procedure and systematic calculations. Test_EvtData.cpp does the same as Test_Data.cpp but number of tracks in the event is used as a weighting variable in addition to track momentum and pseudorapidity. You can guess what Test_EvtMC.cpp does...

Example Results

Here are some example plots that should be the result of Test_Data.cpp. These are drawn by running the PlotWeightedDLLs.cpp script located in PIDPerfScripts/scripts/ in the package.

Example results when weighting the calibration sample of pions with the kinematics of a signal track.
Results of weighting the calibration sample of kaons with kinematics of another mode.

On these plots the black points are the DLL distributions of the unweighted calibration samples, and the red points show the DLL distributions of the calibration sample once it has been weighted with a signal sample. The plot on the left is for a pion track, the one on the right for a kaon. Please ignore the "LHCb Preliminary" and "LHCb Monte Carlo" labels.

Note that, although PlotWeightedDLLs.cpp gives you the option of plotting the DLL distribution of your signal, this is meaningless when plotting results of weighting with data: the DLL distribution of the signal is what we have found with the weighting, and the one already stored is probably incorrect, since it has not been calibrated with this procedure. Plotting the signal DLL distribution is mainly for Monte Carlo validation of the procedure and for systematic calculations, discussed in a subsequent section. The use of the PlotWeightedDLLs script is illustrated with another annotated attachment.

Plotting Performance Curves

Use Test_PerfCalc.cpp to do this, located in PIDPerfScripts/src. The code here is fairly self explanatory, please check the member functions of Perf_Calculator (in PIDPerfTools/PIDPerfTools/PerfCalculator.h) to see what kinds of graphs you can draw.

A simple example is the efficiency of a DLL cut as a function of the cut, drawn with the "Perf_Scan" function, example shown below.

This curve is drawn from the fractional integral of the red curve in the kaon example in the previous section at different points.

Systematics

There is a systematic contribution to the uncertainty on the DLL distributions that comes from the weighting procedure itself, for example using a coarse binning will cause loss of information since the appropriate weight is averaged over the whole bin, or the assumption that DLL depends only on the kinematic and event variables used, while it could have a dependence on other variables.

The systematic inherent in the weighting is determined using Monte Carlo. The Monte Carlo version of the calibration sample used for computing the efficiencies in section 4 is weighted with Monte Carlo kinematic distributions of the signal tracks. This is done using Test_MC.cpp or Test_EvtMC.cpp, which work in a similar way to Test_Data.cpp, as explained in the attachment.

Example output of Test_MC and subsequent plotting of the DLL distributions is shown below.

MC signal and unweighted calibration sample DLL distributions example.
MC signal and weighted calibration samples DLL distributions.

The blue points are the Monte Carlo Signal, black are that of the unweighted calibration sample and the red are the calibration sample after weighting with the kinematics of the signal sample. Notice the better agreement of the red and blue than black and blue. This is how the process is validated with Monte Carlo, but a difference still exists between the red and blue distributions and this is the systematic uncertainty.

The systematic then is the difference in efficiencies calculated from the red and blue DLL distributions above. So one must run the output of Test_MC through Test_PerfCalc (in exactly the same way that you would to plot the efficiency curves in data) and subtract the 2 graphs plotted by the Perf_Scan function. An example of the systematic uncertainty in efficiency plotted as a function of DLL cut is shown below.

example of a weighting systematic as a function of DLL cut.

Changed:
<
<
As you can see, in the reasonable region of interest (around cuts of zero), the systematic uncertainty is approximately 0.5%.
>
>
As you can see, in the reasonable region of interest (around cuts of zero), the systematic uncertainty is approximately 0.5%, although this value varies according to the specific kinematics of the track being studied and may not always be so low.
 

Analysis Specific Details

This section contains details of how this package was used in the measurement of the branching ratio of Bs -> DK* w.r.t. Bd -> DRho, the analysis page is here.

First, the binning scheme used was 32 bins in momentum in the range (0,150)GeV/c and 4 bins in pseudorapidity the range being (1.5,5). The number of tracks per event was not used because it was found that the first 2 variables had the dominant effect on PID efficiency. Since there were limited statistics in this analysis only binning in 2 variables greatly reduced the statistical error.

In fact, since this analysis was done with a very limited number of events, the weighting was not done with events from data at all. Instead the kinematics of Monte Carlo samples were used to weight the calibration sample, with the assumption that momentum and pseudorapidity were well simulated in Monte Carlo. This, however, did not remove the problem of low statistics because the use of Monte Carlo kinematics to weight the calibration sample introduced a second systematic uncertainty from the possible differences between these distributions in data and Monte Carlo.

This second systematic required the data (only 36 events in Bs -> DK*) to be binned in 2 dimensions and used in the weighting (although this couldn't be used to calculate the efficiency, which is an unnecessary technical detail...).

In the end, in this analysis we were able to quote efficiencies calculated from this procedure with maximum total errors of roughly 5%. This is an improvement on previous figures, and without the hack of using Monte Carlo kinematics instead of data this error would be a lot lower.

-- EdmundSmith - 2011-08-10

META FILEATTACHMENT attachment="Test_Data.cpp" attr="" comment="Heavily annotated version of Test_Data.cpp" date="1312989338" name="Test_Data.cpp" path="Test_Data.cpp" size="13498" stream="Test_Data.cpp" tmpFilename="/usr/tmp/CGItemp29225" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="Wgt_unWgt_D0Kbs.png" attr="" comment="Example results of the weighting." date="1313063457" name="Wgt_unWgt_D0Kbs.png" path="Wgt_unWgt_D0Kbs.png" size="14047" stream="Wgt_unWgt_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp29013" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="Wgt_unWgt_RhoPiPlus.png" attr="" comment="Example results of the weighting for pions." date="1313064739" name="Wgt_unWgt_RhoPiPlus.png" path="Wgt_unWgt_RhoPiPlus.png" size="14309" stream="Wgt_unWgt_RhoPiPlus.png" tmpFilename="/usr/tmp/CGItemp28861" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="PlotWeightedDLLs.cpp" attr="" comment="Annotated version of PlotWeightedDLLs.cpp" date="1313066608" name="PlotWeightedDLLs.cpp" path="PlotWeightedDLLs.cpp" size="10034" stream="PlotWeightedDLLs.cpp" tmpFilename="/usr/tmp/CGItemp28908" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="eff_curve_D0Kbs.png" attr="" comment="Efficiency as a function of DLL cut." date="1313068791" name="eff_curve_D0Kbs.png" path="eff_curve_D0Kbs.png" size="14988" stream="eff_curve_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp32892" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="MCsig_unWgt_D0Kbs.png" attr="" comment="MC signal and Unweighted calibrations samples DLL." date="1313070430" name="MCsig_unWgt_D0Kbs.png" path="MCsig_unWgt_D0Kbs.png" size="14103" stream="MCsig_unWgt_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp32951" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="MCsig_Wgt_D0Kbs.png" attr="" comment="MC signal and weighted calibrations samples DLL." date="1313070458" name="MCsig_Wgt_D0Kbs.png" path="MCsig_Wgt_D0Kbs.png" size="13989" stream="MCsig_Wgt_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp33102" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="weight_sys_D0Kbs.png" attr="" comment="Weighting systematic example as a function of DLL cut." date="1313071019" name="weight_sys_D0Kbs.png" path="weight_sys_D0Kbs.png" size="19860" stream="weight_sys_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp33081" user="EdmundSmith" version="1"

Revision 2 2011-08-11 - EdmundSmith

 
META TOPICPARENT name="HowTos"
Changed:
<
<

This topic is under construction....

>
>

PIDCalib Package

 

About The Package

This package enables the user to properly calibrate the DLL distributions of their signal mode, and thus calculate the efficiency of a given cut accurately.

It is assumed that the DLL of a particular track depends on a small set of kinematic or event variables. The set of these variables has changed during development of the package; at the time of writing, track momentum, pseudorapidity and number of tracks in the event are the most commonly used.

Through "golden modes" that can be reconstructed without the use of the RICH detectors, it is possible to acquire pure calibration samples of pions and kaons. The DLL distributions of these tracks show the true response of the RICH detectors to tracks with the same kinematics.

The calibration and signal samples under study (acquired by some sort of fit, and appropriate background subtraction) are then binned in the chosen variables. In the limit where the bins are infinitely fine, and provided the assumption that the DLL depends only on the kinematic and event variables used is correct, the DLL distribution of tracks in a particular bin of the calibration sample should be single-valued (this is important, make sure you understand this). The populations of each bin in the calibration and signal samples are compared, and their ratio is assigned as the weight of that bin. The true DLL distribution of the signal tracks is then given by the DLL distribution of the calibration sample weighted bin by bin.

The weighting process is repeated for each track in the final state of the signal mode, to obtain the DLL distributions for all tracks.

Getting the Code

How to get and compile the code and some more technical info can be found here.

Usage

Test_Data.cpp

Test_Data.cpp is the script provided with the package to do the weighting with data. You give it input of your signal sample kinematics from data, and it weights the data calibration samples using just momentum and pseudorapidity as the weighting variables. I have attached a heavily annotated version of Test_Data.cpp, which hopefully leaves any usage questions beyond doubt.

Other scripts to do the weighting procedure that are provided are Test_MC.cpp, Test_EvtMC.cpp and Test_EvtData.cpp. Test_MC.cpp performs the same as Test_Data.cpp on Monte Carlo, and is used primarily for validation of the procedure and systematic calculations. Test_EvtData.cpp does the same as Test_Data.cpp but number of tracks in the event is used as a weighting variable in addition to track momentum and pseudorapidity. You can guess what Test_EvtMC.cpp does...

Example Results

Here are some example plots that should be the result of Test_Data.cpp. These are drawn by running the PlotWeightedDLLs.cpp script located in PIDPerfScripts/scripts/ in the package.

Changed:
<
<
I will put the plots here at a later date.
>
>
Example results when weighting the calibration sample of pions with the kinematics of a signal track.
Results of weighting the calibration sample of kaons with kinematics of another mode.
 
Added:
>
>
On these plots the black points are the DLL distributions of the unweighted calibration samples, and the red points show the DLL distributions of the calibration sample once it has been weighted with a signal sample. The plot on the left is for a pion track, the one on the right for a kaon. Please ignore the "LHCb Preliminary" and "LHCb Monte Carlo" labels.

Note that, although PlotWeightedDLLs.cpp gives you the option of plotting the DLL distribution of your signal, this is meaningless when plotting results of weighting with data: the DLL distribution of the signal is what we have found with the weighting, and the one already stored is probably incorrect, since it has not been calibrated with this procedure. Plotting the signal DLL distribution is mainly for Monte Carlo validation of the procedure and for systematic calculations, discussed in a subsequent section. The use of the PlotWeightedDLLs script is illustrated with another annotated attachment.

Plotting Performance Curves

Use Test_PerfCalc.cpp to do this, located in PIDPerfScripts/src. The code here is fairly self explanatory, please check the member functions of Perf_Calculator (in PIDPerfTools/PIDPerfTools/PerfCalculator.h) to see what kinds of graphs you can draw.

A simple example is the efficiency of a DLL cut as a function of the cut, drawn with the "Perf_Scan" function, example shown below.

This curve is drawn from the fractional integral of the red curve in the kaon example in the previous section at different points.

Systematics

There is a systematic contribution to the uncertainty on the DLL distributions that comes from the weighting procedure itself, for example using a coarse binning will cause loss of information since the appropriate weight is averaged over the whole bin, or the assumption that DLL depends only on the kinematic and event variables used, while it could have a dependence on other variables.

The systematic inherent in the weighting is determined using Monte Carlo. The Monte Carlo version of the calibration sample used for computing the efficiencies in section 4 is weighted with Monte Carlo kinematic distributions of the signal tracks. This is done using Test_MC.cpp or Test_EvtMC.cpp, which work in a similar way to Test_Data.cpp, as explained in the attachment.

Example output of Test_MC and subsequent plotting of the DLL distributions is shown below.

MC signal and unweighted calibration sample DLL distributions example.
MC signal and weighted calibration samples DLL distributions.

The blue points are the Monte Carlo Signal, black are that of the unweighted calibration sample and the red are the calibration sample after weighting with the kinematics of the signal sample. Notice the better agreement of the red and blue than black and blue. This is how the process is validated with Monte Carlo, but a difference still exists between the red and blue distributions and this is the systematic uncertainty.

The systematic then is the difference in efficiencies calculated from the red and blue DLL distributions above. So one must run the output of Test_MC through Test_PerfCalc (in exactly the same way that you would to plot the efficiency curves in data) and subtract the 2 graphs plotted by the Perf_Scan function. An example of the systematic uncertainty in efficiency plotted as a function of DLL cut is shown below.

example of a weighting systematic as a function of DLL cut.

As you can see, in the reasonable region of interest (around cuts of zero), the systematic uncertainty is approximately 0.5%.

 

Analysis Specific Details

Changed:
<
<
This section contains details of how this package was used for this analysis.
>
>
This section contains details of how this package was used in the measurement of the branching ratio of Bs -> DK* w.r.t. Bd -> DRho, the analysis page is here.
 
Changed:
<
<
The binning scheme used is 32 bins in p in the range (0,150) GeV/c and 4 bins in η in the range (1.5,5). This provides a way of knowing what the true ΔLL distributions of the tracks in both our signal and normalisation modes are, and therefore what the efficiency of a given PID cut is, with systematic uncertainties associated with the procedure. However, the observed numbers of events in these two modes are too small (35 in B0s → D0K*0 and 154 in B0d → D0ρ0) to perform this weighting procedure with data. Therefore, p and η distributions from higher statistics Monte Carlo samples are used to weight the calibration sample, which however introduces an additional systematic from any differences that exist between these distributions in data and Monte Carlo.
>
>
First, the binning scheme used was 32 bins in momentum in the range (0,150) GeV/c and 4 bins in pseudorapidity in the range (1.5,5). The number of tracks per event was not used because it was found that the first two variables had the dominant effect on PID efficiency. Since statistics were limited in this analysis, binning in only two variables greatly reduced the statistical error.
 
Added:
>
>
In fact, since this analysis was done with a very limited number of events, the weighting was not done with events from data at all. Instead the kinematics of Monte Carlo samples were used to weight the calibration sample, with the assumption that momentum and pseudorapidity were well simulated in Monte Carlo. This, however, did not remove the problem of low statistics because the use of Monte Carlo kinematics to weight the calibration sample introduced a second systematic uncertainty from the possible differences between these distributions in data and Monte Carlo.

This second systematic required the data (only 36 events in Bs -> DK*) to be binned in 2 dimensions and used in the weighting (although this couldn't be used to calculate the efficiency, which is an unnecessary technical detail...).

In the end, in this analysis we were able to quote efficiencies calculated from this procedure with maximum total errors of roughly 5%. This is an improvement on previous figures, and without the hack of using Monte Carlo kinematics instead of data this error would be a lot lower.

 -- EdmundSmith - 2011-08-10
Added:
>
>
 
META FILEATTACHMENT attachment="Test_Data.cpp" attr="" comment="Heavily annotated version of Test_Data.cpp" date="1312989338" name="Test_Data.cpp" path="Test_Data.cpp" size="13498" stream="Test_Data.cpp" tmpFilename="/usr/tmp/CGItemp29225" user="EdmundSmith" version="1"
Added:
>
>
META FILEATTACHMENT attachment="Wgt_unWgt_D0Kbs.png" attr="" comment="Example results of the weighting." date="1313063457" name="Wgt_unWgt_D0Kbs.png" path="Wgt_unWgt_D0Kbs.png" size="14047" stream="Wgt_unWgt_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp29013" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="Wgt_unWgt_RhoPiPlus.png" attr="" comment="Example results of the weighting for pions." date="1313064739" name="Wgt_unWgt_RhoPiPlus.png" path="Wgt_unWgt_RhoPiPlus.png" size="14309" stream="Wgt_unWgt_RhoPiPlus.png" tmpFilename="/usr/tmp/CGItemp28861" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="PlotWeightedDLLs.cpp" attr="" comment="Annotated version of PlotWeightedDLLs.cpp" date="1313066608" name="PlotWeightedDLLs.cpp" path="PlotWeightedDLLs.cpp" size="10034" stream="PlotWeightedDLLs.cpp" tmpFilename="/usr/tmp/CGItemp28908" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="eff_curve_D0Kbs.png" attr="" comment="Efficiency as a function of DLL cut." date="1313068791" name="eff_curve_D0Kbs.png" path="eff_curve_D0Kbs.png" size="14988" stream="eff_curve_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp32892" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="MCsig_unWgt_D0Kbs.png" attr="" comment="MC signal and Unweighted calibrations samples DLL." date="1313070430" name="MCsig_unWgt_D0Kbs.png" path="MCsig_unWgt_D0Kbs.png" size="14103" stream="MCsig_unWgt_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp32951" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="MCsig_Wgt_D0Kbs.png" attr="" comment="MC signal and weighted calibrations samples DLL." date="1313070458" name="MCsig_Wgt_D0Kbs.png" path="MCsig_Wgt_D0Kbs.png" size="13989" stream="MCsig_Wgt_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp33102" user="EdmundSmith" version="1"
META FILEATTACHMENT attachment="weight_sys_D0Kbs.png" attr="" comment="Weighting systematic example as a function of DLL cut." date="1313071019" name="weight_sys_D0Kbs.png" path="weight_sys_D0Kbs.png" size="19860" stream="weight_sys_D0Kbs.png" tmpFilename="/usr/tmp/CGItemp33081" user="EdmundSmith" version="1"
 

Revision 1 2011-08-10 - EdmundSmith

 
META TOPICPARENT name="HowTos"

This topic is under construction....

About The Package

This package enables the user to properly calibrate the DLL distributions of their signal mode, and thus calculate the efficiency of a given cut accurately.

It is assumed that the DLL of a particular track depends on a small set of kinematic or event variables. The set of these variables has changed during development of the package; at the time of writing, track momentum, pseudorapidity and number of tracks in the event are the most commonly used.

Through "golden modes" that can be reconstructed without the use of the RICH detectors, it is possible to acquire pure calibration samples of pions and kaons. The DLL distributions of these tracks show the true response of the RICH detectors to tracks with the same kinematics.

The calibration and signal samples under study (acquired by some sort of fit, and appropriate background subtraction) are then binned in the chosen variables. In the limit where the bins are infinitely fine, and if the assumption that the DLL depends only on the kinematic and event variables used is correct, the DLL distribution of tracks in a particular bin of the calibration sample should be single-valued (this is important, make sure you understand this). The populations of each bin in the calibration and signal samples are compared, and their ratio is assigned as the weight for that bin. The true DLL distribution of the signal tracks is then given by the DLL distribution of the calibration sample weighted bin by bin.

The weighting process is repeated for each track in the final state of the signal mode, to obtain the DLL distributions for all tracks.
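
The per-bin weighting described above can be sketched as follows. This is a minimal illustration, not the code in the package: the function name is hypothetical, both samples are normalised to unit area before taking the ratio (an assumption; the text only says the ratio of bin populations is used), and bins with no calibration tracks are given weight zero (also an assumption).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-bin weights: fraction of the signal sample in the bin divided by the
// fraction of the calibration sample in the same bin.
// Inputs are bin populations with identical binning (e.g. flattened p-eta bins).
std::vector<double> binWeights(const std::vector<double>& signalCounts,
                               const std::vector<double>& calibCounts) {
    assert(signalCounts.size() == calibCounts.size());

    double nSig = 0.0;
    double nCal = 0.0;
    for (double c : signalCounts) nSig += c;
    for (double c : calibCounts) nCal += c;

    std::vector<double> weights(signalCounts.size(), 0.0);
    if (nSig <= 0.0 || nCal <= 0.0) return weights;  // nothing to weight

    for (std::size_t i = 0; i < weights.size(); ++i) {
        // Assumption: an empty calibration bin gets weight zero.
        if (calibCounts[i] > 0.0) {
            weights[i] = (signalCounts[i] / nSig) / (calibCounts[i] / nCal);
        }
    }
    return weights;
}
```

Each calibration track then carries the weight of the bin it falls in, and filling a DLL histogram with those weights gives the estimate of the signal DLL distribution.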

Getting the Code

How to get and compile the code and some more technical info can be found here.

Usage

Test_Data.cpp

Test_Data.cpp is the script provided with the package to do the weighting with data. You give it your signal sample kinematics from data as input, and it weights the data calibration samples using just momentum and pseudorapidity as the weighting variables. I have attached a heavily annotated version of Test_Data.cpp, which hopefully answers any usage questions.

The other scripts provided for the weighting procedure are Test_MC.cpp, Test_EvtMC.cpp and Test_EvtData.cpp. Test_MC.cpp does the same as Test_Data.cpp on Monte Carlo, and is used primarily for validation of the procedure and systematic calculations. Test_EvtData.cpp does the same as Test_Data.cpp, but the number of tracks in the event is used as a weighting variable in addition to track momentum and pseudorapidity. You can guess what Test_EvtMC.cpp does...

Example Results

Here are some example plots that should be the result of Test_Data.cpp. These are drawn by running the PlotWeightedDLLs.cpp script located in PIDPerfScripts/scripts/ in the package.

I will put the plots here at a later date.

Analysis Specific Details

This section contains details of how this package was used for this analysis.

The binning scheme used is 32 bins in p in the range (0,150) GeV/c and 4 bins in η in the range (1.5,5). This provides a way of knowing what the true ΔLL distributions of the tracks in both our signal and normalisation modes are, and therefore what the efficiency of a given PID cut is, with systematic uncertainties associated with the procedure. However, the observed numbers of events in these two modes are too small (35 in B0s → D0K*0 and 154 in B0d → D0ρ0) to perform this weighting procedure with data. Therefore, p and η distributions from higher statistics Monte Carlo samples are used to weight the calibration sample, which however introduces an additional systematic from any differences that exist between these distributions in data and Monte Carlo.

-- EdmundSmith - 2011-08-10

META FILEATTACHMENT attachment="Test_Data.cpp" attr="" comment="Heavily annotated version of Test_Data.cpp" date="1312989338" name="Test_Data.cpp" path="Test_Data.cpp" size="13498" stream="Test_Data.cpp" tmpFilename="/usr/tmp/CGItemp29225" user="EdmundSmith" version="1"
 