Unbinned multivariate observables for global SMEFT analyses from machine learning

Raquel Ambrosio (Dipartimento di Fisica “G. Occhialini”, Università degli Studi di Milano-Bicocca and INFN, Sezione di Milano Bicocca, Piazza della Scienza 3, Milano, I-20126, Italy; Dipartimento di Fisica, Università degli Studi di Torino and INFN, Sezione di Torino, Via P. Giuria 1, Torino, 10125, Italy); Jaco Hoeve (Department of Physics and Astronomy, VU Amsterdam, Amsterdam, 1081HV, The Netherlands; Nikhef Theory Group, Science Park 105, Amsterdam, 1098 XG, The Netherlands); Maeve Madigan (DAMTP, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WA, U.K.); Juan Rojo (Department of Physics and Astronomy, VU Amsterdam, Amsterdam, 1081HV, The Netherlands; Nikhef Theory Group, Science Park 105, Amsterdam, 1098 XG, The Netherlands); Veronica Sanz (Instituto de Física Corpuscular (IFIC), Universidad de Valencia-CSIC, Valencia, E-46980, Spain; Department of Physics and Astronomy, University of Sussex, Brighton, BN1 9QH, U.K.)

Theoretical interpretations of particle physics data, such as the determination of the Wilson coefficients of the Standard Model Effective Field Theory (SMEFT), often involve the inference of multiple parameters from a global dataset. Optimizing such interpretations requires the identification of observables that exhibit the highest possible sensitivity to the underlying theory parameters. In this work we develop a flexible open-source framework, ML4EFT, enabling the integration of unbinned multivariate observables into global SMEFT fits. Compared to traditional measurements, such observables enhance the sensitivity to the theory parameters by preventing the information loss incurred when binning in a subset of final-state kinematic variables. Our strategy combines machine learning regression and classification techniques to parameterize high-dimensional likelihood ratios, using the Monte Carlo replica method to estimate and propagate methodological uncertainties. As a proof of concept we construct unbinned multivariate observables for top-quark pair and Higgs+Z production at the LHC, demonstrate their impact on the SMEFT parameter space as compared to binned measurements, and study the improved constraints associated with multivariate inputs. Since the number of neural networks to be trained scales quadratically with the number of parameters and can be fully parallelized, the ML4EFT framework is well-suited to construct unbinned multivariate observables which depend on up to tens of EFT coefficients, as required in global fits.
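
The quadratic scaling mentioned in the abstract can be illustrated with a short counting sketch. It assumes that the likelihood ratio is expanded up to quadratic order in the Wilson coefficients and that one network is trained per linear term and one per quadratic term (including cross terms). That counting convention and the function name `count_networks` are assumptions made for illustration; they are not taken from this record or from the ML4EFT code.

```python
def count_networks(n_coeffs: int) -> dict:
    """Count the terms of a quadratic expansion in n_coeffs coefficients.

    Assumption (illustrative only): one network per linear term c_i and one
    per quadratic term c_i * c_j with i <= j.
    """
    if n_coeffs < 0:
        raise ValueError("n_coeffs must be non-negative")
    linear = n_coeffs
    # Unordered pairs (i, j) with i <= j, so squared terms are included.
    quadratic = n_coeffs * (n_coeffs + 1) // 2
    return {"linear": linear, "quadratic": quadratic, "total": linear + quadratic}


if __name__ == "__main__":
    for n in (1, 2, 10, 30):
        print(n, count_networks(n))
```

Under this assumption the total grows as n(n + 3)/2. Each term can be trained independently of the others, which is what makes full parallelization possible.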

{
  "_oai": {
    "updated": "2023-06-24T00:34:18Z", 
    "id": "oai:repo.scoap3.org:76254", 
    "sets": [
      "JHEP"
    ]
  }, 
  "authors": [
    {
      "affiliations": [
        {
          "country": "Italy", 
          "value": "Dipartimento di Fisica \u201cG. Occhialini\u201d, Universit\u00e0 degli Studi di Milano-Bicocca and INFN, Sezione di Milano Bicocca, Piazza della Scienza 3, Milano, I-20126, Italy", 
          "organization": "Universit\u00e0 degli Studi di Milano-Bicocca and INFN, Sezione di Milano Bicocca"
        }, 
        {
          "country": "Italy", 
          "value": "Dipartimento di Fisica, Universit\u00e0 degli Studi di Torino and INFN, Sezione di Torino, Via P. Giuria 1, Torino, 10125, Italy", 
          "organization": "Universit\u00e0 degli Studi di Torino and INFN, Sezione di Torino"
        }
      ], 
      "surname": "Ambrosio", 
      "email": "raquel.gomezambrosio@unito.it", 
      "full_name": "Ambrosio, Raquel", 
      "given_names": "Raquel"
    }, 
    {
      "affiliations": [
        {
          "country": "Netherlands", 
          "value": "Department of Physics and Astronomy, VU Amsterdam, Amsterdam, 1081HV, The Netherlands", 
          "organization": "VU Amsterdam"
        }, 
        {
          "country": "Netherlands", 
          "value": "Nikhef Theory Group, Science Park 105, Amsterdam, 1098 XG, The Netherlands", 
          "organization": "Nikhef Theory Group"
        }
      ], 
      "surname": "Hoeve", 
      "email": "j.j.ter.hoeve@vu.nl", 
      "full_name": "Hoeve, Jaco", 
      "given_names": "Jaco"
    }, 
    {
      "affiliations": [
        {
          "country": "UK", 
          "value": "DAMTP, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WA, U.K.", 
          "organization": "University of Cambridge"
        }
      ], 
      "surname": "Madigan", 
      "email": "mum20@cam.ac.uk", 
      "full_name": "Madigan, Maeve", 
      "given_names": "Maeve"
    }, 
    {
      "affiliations": [
        {
          "country": "Netherlands", 
          "value": "Department of Physics and Astronomy, VU Amsterdam, Amsterdam, 1081HV, The Netherlands", 
          "organization": "VU Amsterdam"
        }, 
        {
          "country": "Netherlands", 
          "value": "Nikhef Theory Group, Science Park 105, Amsterdam, 1098 XG, The Netherlands", 
          "organization": "Nikhef Theory Group"
        }
      ], 
      "surname": "Rojo", 
      "email": "j.rojo@vu.nl", 
      "full_name": "Rojo, Juan", 
      "given_names": "Juan"
    }, 
    {
      "affiliations": [
        {
          "country": "Spain", 
          "value": "Instituto de F\u00edsica Corpuscular (IFIC), Universidad de Valencia-CSIC, Valencia, E-46980, Spain", 
          "organization": "Universidad de Valencia-CSIC"
        }, 
        {
          "country": "UK", 
          "value": "Department of Physics and Astronomy, University of Sussex, Brighton, BN1 9QH, U.K.", 
          "organization": "University of Sussex"
        }
      ], 
      "surname": "Sanz", 
      "email": "veronica.sanz@uv.es", 
      "full_name": "Sanz, Veronica", 
      "given_names": "Veronica"
    }
  ], 
  "titles": [
    {
      "source": "Springer", 
      "title": "Unbinned multivariate observables for global SMEFT analyses from machine learning"
    }
  ], 
  "dois": [
    {
      "value": "10.1007/JHEP03(2023)033"
    }
  ], 
  "publication_info": [
    {
      "page_end": "66", 
      "journal_title": "Journal of High Energy Physics", 
      "material": "article", 
      "journal_volume": "2023", 
      "artid": "JHEP03(2023)033", 
      "year": 2023, 
      "page_start": "1", 
      "journal_issue": "3"
    }
  ], 
  "$schema": "http://repo.scoap3.org/schemas/hep.json", 
  "acquisition_source": {
    "date": "2023-06-24T00:31:16.457422", 
    "source": "Springer", 
    "method": "Springer", 
    "submission_number": "40dbafc0122611ee91ac6ee2827c7def"
  }, 
  "page_nr": [
    66
  ], 
  "license": [
    {
      "url": "https://creativecommons.org/licenses//by/4.0", 
      "license": "CC-BY-4.0"
    }
  ], 
  "copyright": [
    {
      "holder": "The Author(s)", 
      "year": "2023"
    }
  ], 
  "control_number": "76254", 
  "record_creation_date": "2023-03-08T03:30:25.544789", 
  "_files": [
    {
      "checksum": "md5:8fff870d7bd92375a30641b91ff48057", 
      "filetype": "xml", 
      "bucket": "cc7ea3ac-7347-415d-93f6-6f3adb8cfe35", 
      "version_id": "4a41a0ec-4897-4138-b51f-79ba3ec9780b", 
      "key": "10.1007/JHEP03(2023)033.xml", 
      "size": 17360
    }, 
    {
      "checksum": "md5:b8087ef9694e1d1c2e1ab0376335da9d", 
      "filetype": "pdf/a", 
      "bucket": "cc7ea3ac-7347-415d-93f6-6f3adb8cfe35", 
      "version_id": "3a1e7f33-058c-40ed-a21d-6549adc4d2fa", 
      "key": "10.1007/JHEP03(2023)033_a.pdf", 
      "size": 4539889
    }
  ], 
  "collections": [
    {
      "primary": "Journal of High Energy Physics"
    }
  ], 
  "arxiv_eprints": [
    {
      "categories": [
        "hep-ph", 
        "hep-ex"
      ], 
      "value": "2211.02058"
    }
  ], 
  "abstracts": [
    {
      "source": "Springer", 
      "value": "Theoretical interpretations of particle physics data, such as the determination of the Wilson coefficients of the Standard Model Effective Field Theory (SMEFT), often involve the inference of multiple parameters from a global dataset. Optimizing such interpretations requires the identification of observables that exhibit the highest possible sensitivity to the underlying theory parameters. In this work we develop a flexible open-source framework, ML4EFT, enabling the integration of unbinned multivariate observables into global SMEFT fits. Compared to traditional measurements, such observables enhance the sensitivity to the theory parameters by preventing the information loss incurred when binning in a subset of final-state kinematic variables. Our strategy combines machine learning regression and classification techniques to parameterize high-dimensional likelihood ratios, using the Monte Carlo replica method to estimate and propagate methodological uncertainties. As a proof of concept we construct unbinned multivariate observables for top-quark pair and Higgs+Z production at the LHC, demonstrate their impact on the SMEFT parameter space as compared to binned measurements, and study the improved constraints associated with multivariate inputs. Since the number of neural networks to be trained scales quadratically with the number of parameters and can be fully parallelized, the ML4EFT framework is well-suited to construct unbinned multivariate observables which depend on up to tens of EFT coefficients, as required in global fits."
    }
  ], 
  "imprints": [
    {
      "date": "2023-03-06", 
      "publisher": "Springer"
    }
  ]
}
Published on: 06 March 2023
Publisher: Springer
Published in: Journal of High Energy Physics, Volume 2023 (2023), Issue 3, Pages 1-66
DOI: https://doi.org/10.1007/JHEP03(2023)033
arXiv: 2211.02058
Copyrights: The Author(s)
Licence: CC-BY-4.0
