Autoencoders for unsupervised anomaly detection in high energy physics

Thorben Finke (Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany) ; Michael Krämer (Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany) ; Alessandro Morandini (Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany) ; Alexander Mück (Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany) ; Ivan Oleksiyuk (Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany)

Autoencoders are widely used in machine learning applications, in particular for anomaly detection. Hence, they have been introduced in high energy physics as a promising tool for model-independent new physics searches. We scrutinize the usage of autoencoders for unsupervised anomaly detection based on reconstruction loss to show their capabilities, but also their limitations. As a particle physics benchmark scenario, we study the tagging of top jet images in a background of QCD jet images. Although we reproduce the positive results from the literature, we show that the standard autoencoder setup cannot be considered as a model-independent anomaly tagger by inverting the task: due to the sparsity and the specific structure of the jet images, the autoencoder fails to tag QCD jets if it is trained on top jets even in a semi-supervised setup. Since the same autoencoder architecture can be a good tagger for a specific example of an anomaly and a bad tagger for a different example, we suggest improved performance measures for the task of model-independent anomaly detection. We also improve the capability of the autoencoder to learn non-trivial features of the jet images, such that it is able to achieve both top jet tagging and the inverse task of QCD jet tagging with the same setup. However, we want to stress that a truly model-independent and powerful autoencoder-based unsupervised jet tagger still needs to be developed.
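
As a rough illustration of anomaly detection based on reconstruction loss, the sketch below fits a model on "background" samples only and uses the per-sample reconstruction error as the anomaly score. It is not the authors' setup: a linear autoencoder solved in closed form via PCA stands in for a trained neural network, and synthetic Gaussian data stands in for jet images. The function names, dimensions, and data are all hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(x_train, latent_dim):
    """Fit a linear autoencoder in closed form via PCA (SVD).

    Assumption: this stands in for a trained neural-network autoencoder.
    """
    mean = x_train.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def reconstruction_loss(x, mean, components):
    """Per-sample mean squared reconstruction error (the anomaly score)."""
    latent = (x - mean) @ components.T   # encode into the bottleneck
    x_rec = latent @ components + mean   # decode back to input space
    return np.mean((x - x_rec) ** 2, axis=1)

rng = np.random.default_rng(0)
n_features, latent_dim = 20, 3

# Hypothetical "background": samples close to a 3-dimensional subspace.
basis = rng.normal(size=(latent_dim, n_features))
background = (rng.normal(size=(2000, latent_dim)) @ basis
              + 0.05 * rng.normal(size=(2000, n_features)))
# Hypothetical "anomalies": isotropic noise without that structure.
anomalies = rng.normal(size=(500, n_features))

# Train on background only, then score held-out background and anomalies.
mean, components = fit_linear_autoencoder(background[:1500], latent_dim)
bkg_scores = reconstruction_loss(background[1500:], mean, components)
anom_scores = reconstruction_loss(anomalies, mean, components)

print("median background score:", np.median(bkg_scores))
print("median anomaly score:   ", np.median(anom_scores))
```

In this toy case the anomalies reconstruct worse than the background, so a cut on the score would tag them. The toy data has none of the sparsity or structure of jet images, so it does not reproduce the failure of the inverse task described in the abstract.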

{
  "_oai": {
    "updated": "2021-09-23T13:44:10Z", 
    "id": "oai:repo.scoap3.org:63099", 
    "sets": [
      "JHEP"
    ]
  }, 
  "authors": [
    {
      "affiliations": [
        {
          "country": "Germany", 
          "value": "Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany", 
          "organization": "RWTH Aachen University"
        }
      ], 
      "surname": "Finke", 
      "email": "finke@physik.rwth-aachen.de", 
      "full_name": "Finke, Thorben", 
      "given_names": "Thorben"
    }, 
    {
      "affiliations": [
        {
          "country": "Germany", 
          "value": "Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany", 
          "organization": "RWTH Aachen University"
        }
      ], 
      "surname": "Kr\u00e4mer", 
      "email": "mkraemer@physik.rwth-aachen.de", 
      "full_name": "Kr\u00e4mer, Michael", 
      "given_names": "Michael"
    }, 
    {
      "affiliations": [
        {
          "country": "Germany", 
          "value": "Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany", 
          "organization": "RWTH Aachen University"
        }
      ], 
      "surname": "Morandini", 
      "email": "morandini@physik.rwth-aachen.de", 
      "full_name": "Morandini, Alessandro", 
      "given_names": "Alessandro"
    }, 
    {
      "affiliations": [
        {
          "country": "Germany", 
          "value": "Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany", 
          "organization": "RWTH Aachen University"
        }
      ], 
      "surname": "M\u00fcck", 
      "email": "mueck@physik.rwth-aachen.de", 
      "full_name": "M\u00fcck, Alexander", 
      "given_names": "Alexander"
    }, 
    {
      "affiliations": [
        {
          "country": "Germany", 
          "value": "Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Aachen, D-52056, Germany", 
          "organization": "RWTH Aachen University"
        }
      ], 
      "surname": "Oleksiyuk", 
      "email": "ivan.oleksiyuk@rwth-aachen.de", 
      "full_name": "Oleksiyuk, Ivan", 
      "given_names": "Ivan"
    }
  ], 
  "titles": [
    {
      "source": "Springer", 
      "title": "Autoencoders for unsupervised anomaly detection in high energy physics"
    }
  ], 
  "dois": [
    {
      "value": "10.1007/JHEP06(2021)161"
    }
  ], 
  "publication_info": [
    {
      "page_end": "32", 
      "journal_title": "Journal of High Energy Physics", 
      "material": "article", 
      "journal_volume": "2021", 
      "artid": "JHEP06(2021)161", 
      "year": 2021, 
      "page_start": "1", 
      "journal_issue": "6"
    }
  ], 
  "$schema": "http://repo.scoap3.org/schemas/hep.json", 
  "acquisition_source": {
    "date": "2021-09-23T12:33:57.208594", 
    "source": "Springer", 
    "method": "Springer", 
    "submission_number": "f8313adc1c6911ecb53772fd3742099d"
  }, 
  "page_nr": [
    32
  ], 
  "license": [
    {
      "url": "https://creativecommons.org/licenses/by/4.0", 
      "license": "CC-BY-4.0"
    }
  ], 
  "copyright": [
    {
      "holder": "The Author(s)", 
      "year": "2021"
    }
  ], 
  "control_number": "63099", 
  "record_creation_date": "2021-06-29T00:30:25.541967", 
  "_files": [
    {
      "checksum": "md5:ac472a02edc1cb0c2e08c24d9c70ab69", 
      "filetype": "xml", 
      "bucket": "56dd176d-5c07-4019-a233-2c12a2ace86a", 
      "version_id": "d3e522ce-bb67-4298-bc0a-5619ff2edfd7", 
      "key": "10.1007/JHEP06(2021)161.xml", 
      "size": 12839
    }, 
    {
      "checksum": "md5:f17478bc0afefa07f16277db6523e341", 
      "filetype": "pdf/a", 
      "bucket": "56dd176d-5c07-4019-a233-2c12a2ace86a", 
      "version_id": "73b1157a-7194-45bf-a361-8763e22fd6c2", 
      "key": "10.1007/JHEP06(2021)161_a.pdf", 
      "size": 14169406
    }
  ], 
  "collections": [
    {
      "primary": "Journal of High Energy Physics"
    }
  ], 
  "arxiv_eprints": [
    {
      "categories": [
        "hep-ph", 
        "cs.LG", 
        "physics.data-an"
      ], 
      "value": "2104.09051"
    }
  ], 
  "abstracts": [
    {
      "source": "Springer", 
      "value": "Autoencoders are widely used in machine learning applications, in particular for anomaly detection. Hence, they have been introduced in high energy physics as a promising tool for model-independent new physics searches. We scrutinize the usage of autoencoders for unsupervised anomaly detection based on reconstruction loss to show their capabilities, but also their limitations. As a particle physics benchmark scenario, we study the tagging of top jet images in a background of QCD jet images. Although we reproduce the positive results from the literature, we show that the standard autoencoder setup cannot be considered as a model-independent anomaly tagger by inverting the task: due to the sparsity and the specific structure of the jet images, the autoencoder fails to tag QCD jets if it is trained on top jets even in a semi-supervised setup. Since the same autoencoder architecture can be a good tagger for a specific example of an anomaly and a bad tagger for a different example, we suggest improved performance measures for the task of model-independent anomaly detection. We also improve the capability of the autoencoder to learn non-trivial features of the jet images, such that it is able to achieve both top jet tagging and the inverse task of QCD jet tagging with the same setup. However, we want to stress that a truly model-independent and powerful autoencoder-based unsupervised jet tagger still needs to be developed."
    }
  ], 
  "imprints": [
    {
      "date": "2021-06-28", 
      "publisher": "Springer"
    }
  ]
}
Published on:
28 June 2021
Publisher:
Springer
Published in:
Journal of High Energy Physics, Volume 2021 (2021)
Issue 6
Pages 1-32
DOI:
https://doi.org/10.1007/JHEP06(2021)161
arXiv:
2104.09051
Copyrights:
The Author(s)
Licence:
CC-BY-4.0
