# For user : Running the workflows ![structure](figs/comm_wf.png) All in one script : `scripts/suball.py` All the steps are summarized in the [`suball.py`](#all-in-one-script--scriptssuballpy) scripts for the existing workflows. You can simply just run ```python python scripts/suball.py --scheme ${SCHEME_FOR_STUDY} --campaign ${CAMPAIGN_FOR_SF} --year ${YEAR} --DAS_campaign "$DATA_CAMPAIGN_RGX,$MC_CAMPAIGN_RGX" #Example with 2023 Summer23 campaign python scripts/suball.py --scheme default_comissioning --campaign Summer23 --year 2023 --DAS_campaign "*Run2023D*Sep2023*,*Run3Summer23BPixNanoAODv12-130X*" #Example with 2024 Summer24 campaign in NanoAODv15 python scripts/suball.py --scheme Validation --campaign Summer24 --year 2024 --DAS_campaign "*Run2024*BTV*,*Summer24NanoAODv15-BTV*" ``` This wrap up the steps mentioned above as a streamline to obtained the required info The only missing item need to do manually is to change the updated correction in `AK4_parameter.py` as written [here](#correction-files-configurations) Each steps are also explained in detailed below, this can be obtain by ## 0. Make the dataset json files Use `fetch.py` in folder `scripts/` to obtain your samples json files for the predefined workflow with the refined MC. For more flexible usage please find [details](scripts.md#fetchpy--create-input-json). The fetch script reads the predefine data & MC samples dataset name and output the json file to `metadata/$CAMPAIGN/`, but to find the exact dataset for BTV studies, we usually need to specify the `DAS_campaign`. ``` python scripts/fetch.py -c {campaign} --year {args.year} --from_workflow {wf} --DAS_campaign {DAS_campaign} {--overwrite} {--executor futures -w 12} # campaign : the campaign name like Summer23,Winter22 # year : data taking years 2022/2023... # wf: workflow name like ttdilep_sf, ctag_Wc_sf # DAS_campaign: Input the campaign name for DAS to search appropriate campaigns, use in dataset construction , please do `campaign1,campaign2,campaign3`. Also supports "auto" (hard-coded!) if campaign and year are specified. # overwrite (bool): recreate the exist json file # Use `--executor futures -j 12 -w 12` to parallelize with, e.g., 12 cores ``` :::{caution} Do not make the file list greater than 4k files to avoid scaleout issues in various site (file open limit). ::: :::{tip} If `gfal-ls` does not work on your machine, reset the gfal-python with `GFAL_PYTHONBIN=/usr/bin/python3`. :::
Quick exercise — fetch example (Summer24) Try running the following command: ```python python scripts/fetch.py -c Summer24 --year 2024 -wf ttdilep_sf --DAS_campaign "*Run2024*-BTV*,*Summer24NanoAODv15-BTV*" --skipvalidation --overwrite ``` If you need more information, please run with the `--verbose` flag. Can you spot the difference if you run with `--executor futures`? Also try out a different workflow, for example, `ctag_DY_sf` See some samples missing? Since 2024 DY samples are lepton flavor split, so you would have to replace the present samples in the `src/BTVNanoCommissioning/utils/sample.py` by the current: `"DYto2E-4Jets_Bin-MLL-50_TuneCP5_13p6TeV_madgraphMLM-pythia8", "DYto2Mu-4Jets_Bin-MLL-50_TuneCP5_13p6TeV_madgraphMLM-pythia8"`. Please, add them respectively to the `src/BTVNanoCommissioning/helpers/xsection.py`
## 1. Correction files configurations & add new correction files (Optional) If the correction files are not supported yet by jsonpog-integration, you can still try with custom input data. All the `lumiMask`, correction files (SFs, pileup weight), and JEC, JER files are under `BTVNanoCommissioning/src/data/` following the substructure `${type}/${year}_${campaign}/${files}`(except `DC` and `Prescales`). | Type | File type | Comments| | :---: | :---: | :---: | | `DC` |`.json` | Masked good lumi-section used for physics analysis| | `Prescales` | `.json.` | HLT paths for prescaled triggers| | `PU` | `.pkl.gz` or `.histo.root` | Pileup reweight files, matched MC to data| | `MUO` | `.histo.root` | Muon ID/Iso/Reco/Trigger SFs| | `EGM` | `.histo.root` | Electron ID/Iso/Reco/Trigger SFs| | `BTV` | `.csv` or `.root` | b-tagger, c-tagger SFs| | `JME` | `.txt` | JER, JEC files| | `JPCalib` | `.root` | Jet probablity calibration, used in LTSV methods| Create a `dict` entry under `correction_config` with dedicated campaigns in `BTVNanoCommissioning/src/utils/AK4_parameters.py`. The official correction files collected in [jsonpog-integration](https://gitlab.cern.ch/cms-nanoAOD/jsonpog-integration) is updated by POG, except `lumiMask` and `JME` still updated by by the BTVNanoCommissioning framework user/developer. For centrally maintained correction files, no input files have to be defined anymore in the `correction_config`. The example to implemented new corrections from POG can be found in [git](https://gitlab.cern.ch/cms-nanoAOD/jsonpog-integration/-/blob/master/examples/), and the contents of the correction files are in the [summary](https://cms-analysis-corrections.docs.cern.ch/).
Example of a Run 2 corrections dictionary:

```python "2017_UL": { # Same with custom config "DC": "Cert_294927-306462_13TeV_UL2017_Collisions17_MuonJSON.json", "JME": { "MC": "Summer19UL17_V5_MC", "Run2017F": "Summer19UL17_RunF_V5_DATA", }, ### Alternatively, take the txt files in https://github.com/cms-jet/JECDatabase/tree/master/textFiles "JME": { # specified the name of JEC "name": "V1_AK4PFPuppi", # dictionary of jec text files "MC": [ "Summer23Prompt23_V1_MC_L1FastJet_AK4PFPuppi", "Summer23Prompt23_V1_MC_L2Relative_AK4PFPuppi", "Summer23Prompt23_V1_MC_L2Residual_AK4PFPuppi", "Summer23Prompt23_V1_MC_L3Absolute_AK4PFPuppi", "Summer23Prompt23_V1_MC_UncertaintySources_AK4PFPuppi", "Summer23Prompt23_V1_MC_Uncertainty_AK4PFPuppi", "Summer23Prompt23_JRV1_MC_SF_AK4PFPuppi", "Summer23Prompt23_JRV1_MC_PtResolution_AK4PFPuppi", ], "dataCv123": [ "Summer23Prompt23_RunCv123_V1_DATA_L1FastJet_AK4PFPuppi", "Summer23Prompt23_RunCv123_V1_DATA_L2Relative_AK4PFPuppi", "Summer23Prompt23_RunCv123_V1_DATA_L3Absolute_AK4PFPuppi", "Summer23Prompt23_RunCv123_V1_DATA_L2L3Residual_AK4PFPuppi", ], "dataCv4": [ "Summer23Prompt23_RunCv4_V1_DATA_L1FastJet_AK4PFPuppi", "Summer23Prompt23_RunCv4_V1_DATA_L2Relative_AK4PFPuppi", "Summer23Prompt23_RunCv4_V1_DATA_L3Absolute_AK4PFPuppi", "Summer23Prompt23_RunCv4_V1_DATA_L2L3Residual_AK4PFPuppi", ], }, ### # no config need to be specify for PU weights "LUM": None, # Alternatively, take root file as input "LUM": "puwei_Summer23.histo.root", # Btag SFs - specify $TAGGER : $TYPE-> find [$TAGGER_$TYPE] in json file "BTV": {"deepCSV": "shape", "deepJet": "shape"}, "roccor": None, # JMAR, IDs from JME- Following the scheme: "${SF_name}": "${WP}" "JMAR": {"PUJetID_eff": "L"}, "EGM": { # Electron SF - Following the scheme: "${SF_name} ${year}": "${WP}" # https://github.com/cms-egamma/cms-egamma-docs/blob/master/docs/EgammaSFJSON.md "ele_ID 2017": "wp90iso", "ele_Reco 2017": "RecoAbove20", }, "MUO":{ # Muon SF - Following the scheme: "${SF_name} ${year}": "${WP}" "mu_Reco 2017_UL": "NUM_TrackerMuons_DEN_genTracks", "mu_HLT 2017_UL": "NUM_IsoMu27_DEN_CutBasedIdTight_and_PFIsoTight", "mu_ID 2017_UL": "NUM_TightID_DEN_TrackerMuons", "mu_Iso 2017_UL": "NUM_TightRelIso_DEN_TightIDandIPCut", }, # use for BTA production, jet probablity "JPCalib": { "Run2022E": "calibeHistoWrite_Data2022F_NANO130X_v1.root", "Run2022F": "calibeHistoWrite_Data2022F_NANO130X_v1.root", "Run2022G": "calibeHistoWrite_Data2022G_NANO130X_v1.root", "MC": "calibeHistoWrite_MC2022EE_NANO130X_v1.root", }, }, ```

Quick exercise — corrections example (Summer24) Try leaving out lepton scale factors and JME corrections here: ```python "Summer24": { "DC": "Cert_Collisions2024_378981_386951_Golden.json", "LUM": "PU_weights_Summer24.histo.root", "JME": { ###<--- Try running without it # TODO: JER are a placeholder for now (July 2025) "MC": "Summer24Prompt24_V1 Summer23BPixPrompt23_RunD_JRV1", "Run2024C": "Summer24Prompt24_V1", "Run2024D": "Summer24Prompt24_V1", "Run2024E": "Summer24Prompt24_V1", "Run2024F": "Summer24Prompt24_V1", "Run2024G": "Summer24Prompt24_V1", "Run2024H": "Summer24Prompt24_V1", "Run2024I": "Summer24Prompt24_V1", }, "jetveto": {"Summer24Prompt24_RunBCDEFGHI_V1": "jetvetomap"}, "MUO": { ###<--- Try running without it "mu_ID": "NUM_TightID_DEN_TrackerMuons", "mu_Iso": "NUM_TightPFIso_DEN_TightID", }, "EGM":{ "ele_Reco 2024 Electron-ID-SF": "", "ele_ID 2024 Electron-ID-SF": "wp80iso", }, "muonSS": "", "electronSS": ["EGMScale_Compound_Ele_2024", "EGMSmearAndSyst_ElePTsplit_2024"], } ``` Can you spot the difference in distributions?
## 2. Run the workflow to get coffea files The `runner.py` handles the options to select the workflow with dedicated configuration for each campaign. The miniumum required info is ``` python runner.py --wf {wf} --json metadata/{args.campaign}/{types}_{args.campaign}_{args.year}_{wf}.json {overwrite} --campaign {args.campaign} --year {args.year} ``` :::{tip} - In case just to test your program, you can limit only one file with one chunk using iterative executor to avoid overwriting error message by `--max 1 --limit 1 --executor iterative` - In case you only want to run particular sample in your json `--only $dataset_name`, i.e. `--only TT*` or `--only MuonEG_Run2023A` - Change the numbers of scale job by `-s $NJOB` - Store the arrays by setting the flag `--isArray` - Modifying chunk size in case the jobs is to big `--chunk $N_EVENTS_PER_CHUNK` - Sometimes the global redirector is insufficient, you can increase the numbers of retries (only in parsl/dask) `--retries 30`, or skip the files `--skipbadfiles` and later reprocess the missing info by create the json with skipped files. Methods to create the json files discussed in the next part. :::
Other runner options

```python ### ====> REQUIRED <======= # --wf {validation,ttdilep_sf,ttsemilep_sf,c_ttsemilep_sf,emctag_ttdilep_sf,ctag_ttdilep_sf,ectag_ttdilep_sf,ctag_ttsemilep_sf,ectag_ttsemilep_sf,QCD_sf,QCD_smu_sf,ctag_Wc_sf,ectag_Wc_sf,ctag_DY_sf,ectag_DY_sf,BTA,BTA_addPFMuons,BTA_addAllTracks,BTA_ttbar}, --workflow {validation,ttdilep_sf,ttsemilep_sf,c_ttsemilep_sf,emctag_ttdilep_sf,ctag_ttdilep_sf,ectag_ttdilep_sf,ctag_ttsemilep_sf,ectag_ttsemilep_sf,QCD_sf,QCD_smu_sf,ctag_Wc_sf,ectag_Wc_sf,ctag_DY_sf,ectag_DY_sf,BTA,BTA_addPFMuons,BTA_addAllTracks,BTA_ttbar} # Which processor to run # -o OUTPUT, --output OUTPUT # Output histogram filename (default: hists.coffea) # --json SAMPLEJSON JSON file containing dataset and file locations (default: dummy_samples.json) # --year YEAR Year # --campaign CAMPAIGN Dataset campaign, change the corresponding correction files #=======Optional====== # ==> configurations for storing info # --isSyst {False,all,weight_only,JEC_full,JEC_reduced,JEC_reduced_JER_split,JEC_total,JP_MC} # Run with systematics (default: False) # --isArray Output root files # --noHist Not output coffea histogram # --overwrite Overwrite existing files # ==> scale out options # --executor {iterative,futures,parsl/slurm,parsl/condor,parsl/condor/naf_lite,dask/condor,dask/condor/brux,dask/slurm,dask/lpc,dask/lxplus,dask/casa,condor_standalone} # The type of executor to use (default: futures). Other options can be implemented. For example see https://parsl.readthedocs.io/en/stable/userguide/configuring.html- # `parsl/slurm` - tested at DESY/Maxwell- `parsl/condor` - tested at DESY, RWTH- `parsl/condor/naf_lite` - tested at DESY- `dask/condor/brux` - tested at BRUX (Brown U)- # `dask/slurm` - tested at DESY/Maxwell- `dask/condor` - tested at DESY, RWTH- `dask/lpc` - custom lpc/condor setup (due to write access restrictions)- `dask/lxplus` - custom # lxplus/condor setup (due to port restrictions) # -j WORKERS, --workers WORKERS # Number of workers (cores/threads) to use for multi-worker executors (e.g. futures or condor) (default: 3) # -s SCALEOUT, --scaleout SCALEOUT # Number of nodes to scale out to if using slurm/condor. Total number of concurrent threads is ``workers x scaleout`` (default: 6) # --memory MEMORY Memory used in jobs default ``(default: 4.0) # --disk DISK Disk used in jobs default ``(default: 4) # --voms VOMS Path to voms proxy, made accessible to worker nodes. By default a copy will be made to $HOME. # --chunk N Number of events per process chunk # --retries N Number of retries for coffea processor # --fsize FSIZE (Specific for dask/lxplus file splitting, default: 50) Numbers of files processed per dask-worker # --index INDEX (Specific for dask/lxplus file splitting, default: 0,0) Format: $dict_index_start,$file_index_start,$dict_index_stop,$file_index_stop. Stop indices are optional. $dict_index # refers to the index, splitted $dict_index and $file_index with ','$dict_index refers to the sample dictionary of the samples json file. $file_index refers to the N-th batch # of files per dask-worker, with its size being defined by the option --index. The job will start (stop) submission from (with) the corresponding indices. # ==> debug option # --validate Do not process, just check all files are accessible # --skipbadfiles Skip bad files. # --only ONLY Only process specific dataset or file # --limit N Limit to the first N files of each dataset in sample JSON # --max N Max number of chunks to run in total ```

Quick exercise — runner example (Summer24) Try running the following command: ```python python runner.py --wf ctag_DY_sf --json metadata/Summer24/data_Summer24_2024_ctag_DY_sf.json --overwrite --campaign Summer24 --year 2024 --outputdir Commissioning_tutorial/ ``` If you make a test run to see if the code works properly, please add `--limit 1` as a flag. Otherwise, try applying scale out on some cluster.
## 3. Dump processed information to obtain luminoisty and processed files After obtaining the `.coffea` file, we can check the processed files and obtain the luminosity in the processed files. To get the run & luminosity information for the processed events from the `.coffea` output files use the `scripts/dump_processed.py` script. This script helps you to dump the processed luminosity into a json file that can be used by the brilcalc tool to calculate the output luminosity. A list of failed lumi sections can then be obtained by comparing the original json input to the one from the `.coffea` files. We will see the luminosity info in `/pb` and the files skipped by the runner flag `--skipbadfiles` as new json ready for resubmission. ```bash python scripts/dump_processed.py -t all -c INPUT_COFFEA --json ORIGINAL_JSON_INPUT -n {args.campaign}_{args.year}_{wf} # -t {all,lumi,failed}, --type {all,lumi,failed} # Choose the function for dump luminosity(`lumi`)/failed files(`failed`) into json # -c COFFEA, --coffea COFFEA # Processed coffea files, splitted by ,. Wildcard option * available as well. # -n FNAME, --fname FNAME # Output name of jsons(with _lumi/_dataset) # -j JSONS, --jsons JSONS # Original input json files, splitted by ,. Wildcard option * available as well. ```
Quick exercise — postprocessing example (Summer24) Try running the following command: ```python python scripts/dump_processed.py -c Commissioning_tutorial/hists_ctag_DY_sf_data_Summer24_2024_ctag_DY_sf/hists_ctag_DY_sf_data_Summer24_2024_ctag_DY_sf.coffea -n ctag_DY_sf_2024 -t lumi ``` If you encounter any problems when running the script, make sure you activated [BRIL](https://twiki.cern.ch/twiki/bin/viewauth/CMS/BrilcalcQuickStart#Getting_started_without_requirin) environment properly!
## 4. Obtain data/MC plots We can obtain data/MC plots from coffea via the `scripts/plotdataMC.py` plotting script. For other possible plotting scripts see [plotting scripts](scripts.md#Plotting-code). You can specify `-v all` to plot all the variables in the `.coffea` file, or use wildcard options, e.g. `-v "*DeepJet*"` for the input variables containing `DeepJet`. :new: non-uniform rebinning is possible, specify the bins with list of edges `--autorebin 50,80,81,82,83,100.5`. ```bash python scripts/plotdataMC.py -i a.coffea,b.coffea --lumi 41500 -p ttdilep_sf -v z_mass,z_pt python scripts/plotdataMC.py -i "test*.coffea" --lumi 41500 -p ttdilep_sf -v z_mass,z_pt # with wildcard option need "" ```
Options:

``` -h, --help show this help message and exit --lumi LUMI luminosity in /pb --com COM sqrt(s) in TeV -p {ttdilep_sf,ttsemilep_sf,ctag_Wc_sf,ctag_DY_sf,ctag_ttsemilep_sf,ctag_ttdilep_sf}, --phase {dilep_sf,ttsemilep_sf,ctag_Wc_sf,ctag_DY_sf,ctag_ttsemilep_sf,ctag_ttdilep_sf} which phase space --log LOG log on y axis --norm NORM Use for reshape SF, scale to same yield as no SFs case -v VARIABLE, --variable VARIABLE variables to plot, splitted by ,. Wildcard option * available as well. Specifying `all` will run through all variables. --SF make w/, w/o SF comparisons --ext EXT prefix name -i INPUT, --input INPUT input coffea files (str), splitted different files with ','. Wildcard option * available as well. --autorebin AUTOREBIN Rebin the plotting variables, input `int` or `list`. int: merge N bins. list of number: rebin edges(non-uniform bin is possible) --xlabel XLABEL rename the label for x-axis --ylabel YLABEL rename the label for y-axis --splitOSSS SPLITOSSS Only for W+c phase space, split opposite sign(1) and same sign events(-1), if not specified, the combined OS-SS phase space is used --xrange XRANGE custom x-range, --xrange xmin,xmax --flow FLOW str, optional {None, 'show', 'sum'} Whether plot the under/overflow bin. If 'show', add additional under/overflow bin. If 'sum', add the under/overflow bin content to first/last bin. --split {flavor,sample,sample_flav} Decomposition of MC samples. Default is split to jet flavor(udsg, pu, c, b), possible to split by group of MC samples. Combination of jetflavor+ sample split is also possible ```

Quick exercise — plotting example (Summer24) Try running the following command: ```python python scripts/plotdataMC.py -i Commissioning_tutorial/hists_ctag_DY_sf_data_Summer24_2024_ctag_DY_sf/hists_ctag_DY_sf_data_Summer24_2024_ctag_DY_sf.coffea,Commissioning_tutorial/hists_ctag_DY_sf_MC_Summer24_2024_ctag_DY_sf/hists_ctag_DY_sf_MC_Summer24_2024_ctag_DY_sf.coffea --lumi YOUR_LUMI -p ctag_DY_sf -v jet0_pt ``` Run `-v all` if you want a full collection of variables. Try exploring the way to present your plots by splitting them by sample or making them log-scaled
## Reading coffea `hist` Quick tutorial to go through coffea files. Example coffea files can be found in `testfile/`. ### Structure of the file The coffea file contains histograms wrapped in a dictionary with `$dataset:{$histname:hist}`, where the `hist` is a [hist](https://hist.readthedocs.io/en/latest/) histogram that allows multidimensional binning with different datatypes, for example: ```python {'WW_TuneCP5_13p6TeV-pythia8':{ 'btagDeepFlavB_b_0': Hist( IntCategory([0, 1, 4, 5, 6], name='flav', label='Genflavour'), IntCategory([1, -1], name='osss', label='OS(+)/SS(-)'), StrCategory(['noSF'], growth=True, name='syst'), Regular(30, -0.2, 1, name='discr', label='btagDeepFlavB_b'), storage=Weight()) # Sum: WeightedSum(value=140, variance=140), 'btagDeepFlavB_bb_0': Hist( IntCategory([0, 1, 4, 5, 6], name='flav', label='Genflavour'), IntCategory([1, -1], name='osss', label='OS(+)/SS(-)'), StrCategory(['noSF'], growth=True, name='syst'), Regular(30, -0.2, 1, name='discr', label='btagDeepFlavB_bb'), storage=Weight()) # Sum: WeightedSum(value=140, variance=140), 'btagDeepFlavB_lepb_0': Hist( IntCategory([0, 1, 4, 5, 6], name='flav', label='Genflavour'), IntCategory([1, -1], name='osss', label='OS(+)/SS(-)'), StrCategory(['noSF'], growth=True, name='syst'), Regular(30, -0.2, 1, name='discr', label='btagDeepFlavB_lepb'), storage=Weight()) # Sum: WeightedSum(value=140, variance=140)}} ``` The processed file and lumi/run info are also stored for each dataset in the file. This information is used in [dump_processed info](user.md#3-dump-processed-information-to-obtain-luminoisty-and-processed-files). The stored histograms are multidimensional histograms, with the axes such as: ```python Hist( IntCategory([0, 1, 4, 5, 6], name='flav', label='Genflavour'),# different genflavor, 0 for light, 1 for PU, 2 for c, 3 for b. Always 0 for data. IntCategory([1, -1], name='osss', label='OS(+)/SS(-)'),# opposite sign or same sign, only appears in W+c workflow StrCategory(['noSF','PUUp','PUDown'], growth=True, name='syst'),# systematics variations, Regular(30, -0.2, 1, name='discr', label='btagDeepFlavB_lepb'),# discriminator distribution, the last axis is always the variable storage=Weight()) # Sum: WeightedSum(value=140, variance=140)# Value is sum of the entries, Variances is sum of the variances. ``` ### Read coffea files and explore the histogram ```python from coffea.util import load # open coffea file output=load("hists_ctag_Wc_sf_VV.coffea") # get the histogram and read the info hist=output['WW_TuneCP5_13p6TeV-pythia8']['btagDeepFlavB_lepb_0'] # addition for two histogram is possible if the axis is the same histvv=output['WW_TuneCP5_13p6TeV-pythia8']['btagDeepFlavB_lepb_0']+ output['WZ_TuneCP5_13p6TeV-pythia8']['btagDeepFlavB_lepb_0']+ output['ZZ_TuneCP5_13p6TeV-pythia8']['btagDeepFlavB_lepb_0'] # To get 1D histogram, we need to reduce the dimention # we can specify the axis we want to read, e.g. read charm jet, opposite sign events with noSF axis={'flav':3,'os':0,'syst':'noSF'} hist1d=hist[axis] #--> this is the 1D histogram Hist # you can also sum over the axis, e.g. here shows no jet flavor split and sum os+ss axis={'flav':sum,'os':sum,'syst':'noSF'} # rebin the axis is also possible, rebin the discrimnator by merged two bins into one axis={'flav':sum,'os':sum,'syst':'noSF','discr':hist.rebin(2)} ``` ### Plot the histogram You can simply plot the histogram using [mplhep](https://mplhep.readthedocs.io/en/latest/) ```python import mplhep as hep import matplotlib.pyplot as plt # set the plot style like tdr style plt.style.use(hep.style.ROOT) # make 1D histogram plot hep.histplot(hist1D) ``` ### convert coffea hist to ROOT TH1 You can use `scripts/make_template.py` to convert the histograms in the coffea files into 1D/2D ROOT histograms: ```python python scripts/make_template.py -i $INPUT_COFFEA --lumi $LUMI_IN_invPB -o $ROOTFILE_NAME -v $VARIABLE -a $HIST_AXIS ## Example python scripts/make_template.py -i "testfile/*.coffea" --lumi 7650 -o test.root -v mujet_pt -a '{"flav":0,"osss":"sum"}' --mergemap ##### # -h, --help show this help message and exit # -i INPUT, --input INPUT # Input coffea file(s), can be a regular expression contains '*' # -v VARIABLE, --variable VARIABLE # Variables to store(histogram name) # -a AXIS, --axis AXIS dict, put the slicing of histogram, specify 'sum' option as string # --lumi LUMI Luminosity in /pb , normalize the MC yields to corresponding luminosity # -o OUTPUT, --output OUTPUT # output root file name # --mergemap MERGEMAP Specify mergemap as dict, '{merge1:[dataset1,dataset2]...}' Also works with the json file with dict #### EXAMPLE MERGEMAP { "WJets": ["WJetsToLNu_TuneCP5_13p6TeV-madgraphMLM-pythia8"], "VV": [ "WW_TuneCP5_13p6TeV-pythia8", "WZ_TuneCP5_13p6TeV-pythia8", "ZZ_TuneCP5_13p6TeV-pythia8"], "TT": [ "TTTo2J1L1Nu_CP5_13p6TeV_powheg-pythia8", "TTTo2L2Nu_CP5_13p6TeV_powheg-pythia8"], "ST":[ "TBbarQ_t-channel_4FS_CP5_13p6TeV_powheg-madspin-pythia8", "TbarWplus_DR_AtLeastOneLepton_CP5_13p6TeV_powheg-pythia8", "TbarBQ_t-channel_4FS_CP5_13p6TeV_powheg-madspin-pythia8", "TWminus_DR_AtLeastOneLepton_CP5_13p6TeV_powheg-pythia8"], "data":[ "Muon_Run2022C-PromptReco-v1", "SingleMuon_Run2022C-PromptReco-v1", "Muon_Run2022D-PromptReco-v1", "Muon_Run2022D-PromptReco-v2"] } ```