Journal: Machine Learning: Science and Technology
Abbreviation: Mach. Learn.: Sci. Technol.
Publisher: IOP Publishing
21 results
Search Results
Publications 1 - 10 of 21
- Lightweight jet reconstruction and identification as an object detection task
  Item type: Journal Article
  Machine Learning: Science and Technology
  Pol, Adrian Alan; Aarrestad, Thea; Govorkova, Ekaterina; et al. (2022)
  We apply object detection techniques based on deep convolutional blocks to end-to-end jet identification and reconstruction tasks encountered at the CERN large hadron collider (LHC). Collision events produced at the LHC and represented as an image composed of calorimeter and tracker cells are given as an input to a Single Shot Detection network. The algorithm, named PFJet-SSD, performs simultaneous localization, classification and regression tasks to cluster jets and reconstruct their features. This all-in-one single feed-forward pass gives advantages in terms of execution time and an improved accuracy w.r.t. traditional rule-based methods. A further gain is obtained from network slimming, homogeneous quantization, and optimized runtime for meeting memory and latency constraints of a typical real-time processing environment. We experiment with 8-bit and ternary quantization, benchmarking their accuracy and inference latency against a single-precision floating-point. We show that the ternary network closely matches the performance of its full-precision equivalent and outperforms the state-of-the-art rule-based algorithm. Finally, we report the inference latency on different hardware platforms and discuss future applications.
- Process tomography of structured optical gates with convolutional neural networks
  Item type: Journal Article
  Machine Learning: Science and Technology
  Jaouni, Tareq; Di Colandrea, Francesco; Amato, Lorenzo; et al. (2024)
  Efficient and accurate characterization of an experimental setup is a critical requirement in any physical setting. In the quantum realm, the characterization of an unknown operator is experimentally accomplished via Quantum Process Tomography (QPT). This technique combines the outcomes of different projective measurements to reconstruct the underlying process matrix, typically extracted from maximum-likelihood estimation. Here, we exploit the logical correspondence between optical polarization and two-level quantum systems to retrieve the complex action of structured metasurfaces within a QPT-inspired context. In particular, we investigate a deep-learning approach that allows for fast and accurate reconstructions of space-dependent SU(2) operators by only processing a minimal set of measurements. We train a convolutional neural network based on a scalable U-Net architecture to process entire experimental images in parallel. Synthetic processes are reconstructed with average fidelity above 90%. The performance of our routine is experimentally validated in the case of space-dependent polarization transformations acting on a classical laser beam. Our approach further expands the toolbox of data-driven approaches to QPT and shows promise in the real-time characterization of complex optical gates.
- A deep neural network to search for new long-lived particles decaying to jets
  Item type: Journal Article
  Machine Learning: Science and Technology
  The CMS Collaboration (2020)
  A tagging algorithm to identify jets that are significantly displaced from the proton-proton (pp) collision region in the CMS detector at the LHC is presented. Displaced jets can arise from the decays of long-lived particles (LLPs), which are predicted by several theoretical extensions of the standard model. The tagger is a multiclass classifier based on a deep neural network, which is parameterised according to the proper decay length cτ₀ of the LLP. A novel scheme is defined to reliably label jets from LLP decays for supervised learning. Samples of pp collision data, recorded by the CMS detector at a centre-of-mass energy of 13 TeV, and simulated events are used to train the neural network. Domain adaptation by backward propagation is performed to improve the simulation modelling of the jet class probability distributions observed in pp collision data. The potential performance of the tagger is demonstrated with a search for long-lived gluinos, a manifestation of split supersymmetric models. The tagger provides a rejection factor of 10 000 for jets from standard model processes, while maintaining an LLP jet tagging efficiency of 30%–80% for gluinos with 1 mm ≤ cτ₀ ≤ 10 m. The expected coverage of the parameter space for split supersymmetry is presented.
- ChemLit-QA: a human evaluated dataset for chemistry RAG tasks
  Item type: Journal Article
  Machine Learning: Science and Technology
  Wellawatte, Geemi P.; Guo, Huixuan; Lederbauer, Magdalena; et al. (2025)
  Retrieval-Augmented Generation (RAG) is a widely used strategy in Large-Language Models (LLMs) to extrapolate beyond the inherent pre-trained knowledge. Hence, RAG is crucial when working in data-sparse fields such as Chemistry. The evaluation of RAG systems is commonly conducted using specialized datasets. However, existing datasets, typically in the form of scientific Question-Answer-Context (QAC) triplets or QA pairs, are often limited in size due to the labor-intensive nature of manual curation or require further quality assessment when generated through automated processes. This highlights a critical need for large, high-quality datasets tailored to scientific applications. We introduce ChemLit-QA, a comprehensive, expert-validated, open-source dataset comprising over 1,000 entries specifically designed for chemistry. Our approach involves the initial generation and filtering of a QAC dataset using an automated framework based on GPT-4 Turbo, followed by rigorous evaluation by chemistry experts. Additionally, we provide two supplementary datasets: ChemLit-QA-neg focused on negative data, and ChemLit-QA-multi focused on multihop reasoning tasks for LLMs, which complement the main dataset on hallucination detection and more reasoning-intensive tasks.
- Transfer learning application of self-supervised learning in ARPES
  Item type: Journal Article
  Machine Learning: Science and Technology
  Ekahana, Sandy A.; Winata, Genta I.; Soh, Y; et al. (2023)
  There is a growing recognition that electronic band structure is a local property of materials and devices, and there is steep growth in capabilities to collect the relevant data. New photon sources, from small-laboratory-based lasers to free electron lasers, together with focusing beam optics and advanced electron spectrometers, are beginning to enable angle-resolved photoemission spectroscopy (ARPES) in scanning mode with a spatial resolution of near to and below microns, two to three orders of magnitude smaller than what has been typical for ARPES hitherto. The results are vast data sets inhabiting a five-dimensional subspace of the ten-dimensional space spanned by two scanning dimensions of real space, three of reciprocal space, three of spin-space, time, and energy. In this work, we demonstrate that recent developments in representational learning (self-supervised learning) combined with k-means clustering can help automate the labeling and spatial mapping of dispersion cuts, thus saving precious time relative to manual analysis, albeit with low performance. Finally, we introduce a few-shot learning (k-nearest neighbor) in representational space where we selectively choose one (k = 1) image reference for each known label and subsequently label the rest of the data with respect to the nearest reference image. This last approach demonstrates the strength of self-supervised learning to automate image analysis in ARPES in particular and can be generalized to any scientific image analysis.
- Unravelling physics beyond the standard model with classical and quantum anomaly detection
  Item type: Journal Article
  Machine Learning: Science and Technology
  Schuhmacher, Julian; Boggia, Laura; Belis, Vasilis; et al. (2023)
  Much hope for finding new physics phenomena at microscopic scale relies on the observations obtained from High Energy Physics experiments, like the ones performed at the Large Hadron Collider (LHC). However, current experiments do not indicate clear signs of new physics that could guide the development of additional Beyond Standard Model (BSM) theories. Identifying signatures of new physics out of the enormous amount of data produced at the LHC falls into the class of anomaly detection and constitutes one of the greatest computational challenges. In this article, we propose a novel strategy to perform anomaly detection in a supervised learning setting, based on the artificial creation of anomalies through a random process. For the resulting supervised learning problem, we successfully apply classical and quantum support vector classifiers (CSVC and QSVC respectively) to identify the artificial anomalies among the SM events. Even more promising, we find that, by employing an SVC trained to identify the artificial anomalies, it is possible to identify realistic BSM events with high accuracy. In parallel, we also explore the potential of quantum algorithms for improving the classification accuracy and provide plausible conditions for the best exploitation of this novel computational paradigm.
- Distilling particle knowledge for fast reconstruction at high-energy physics experiments
  Item type: Journal Article
  Machine Learning: Science and Technology
  Bal, Aritra; Brandes, Tristan; Iemmi, Fabio; et al. (2024)
  Knowledge distillation is a form of model compression that allows artificial neural networks of different sizes to learn from one another. Its main application is the compactification of large deep neural networks to free up computational resources, in particular on edge devices. In this article, we consider proton-proton collisions at the High-Luminosity Large Hadron Collider (HL-LHC) and demonstrate a successful knowledge transfer from an event-level graph neural network (GNN) to a particle-level small deep neural network (DNN). Our algorithm, DistillNet, is a DNN that is trained to learn about the provenance of particles, as provided by the soft labels that are the GNN outputs, to predict whether or not a particle originates from the primary interaction vertex. The results indicate that for this problem, which is one of the main challenges at the HL-LHC, there is minimal loss during the transfer of knowledge to the small student network, while improving significantly the computational resource needs compared to the teacher. This is demonstrated for the distilled student network on a CPU, as well as for a quantized and pruned student network deployed on a field-programmable gate array. Our study proves that knowledge transfer between networks of different complexity can be used for fast artificial intelligence (AI) in high-energy physics that improves the expressiveness of observables over non-AI-based reconstruction algorithms. Such an approach can become essential at the HL-LHC experiments, e.g. to comply with the resource budget of their trigger stages.
- Guided quantum compression for high dimensional data classification
  Item type: Journal Article
  Machine Learning: Science and Technology
  Belis, Vasilis; Odagiu, Patrick; Grossi, Michele; et al. (2024)
  Quantum machine learning provides a fundamentally different approach to analyzing data. However, many interesting datasets are too complex for currently available quantum computers. Present quantum machine learning applications usually diminish this complexity by reducing the dimensionality of the data, e.g. via auto-encoders, before passing it through the quantum models. Here, we design a classical-quantum paradigm that unifies the dimensionality reduction task with a quantum classification model into a single architecture: the guided quantum compression model. We exemplify how this architecture outperforms conventional quantum machine learning approaches on a challenging binary classification problem: identifying the Higgs boson in proton-proton collisions at the LHC. Furthermore, the guided quantum compression model shows better performance compared to the deep learning benchmark when using solely the kinematic variables in our dataset.
- Evaluation of synthetic and experimental training data in supervised machine learning applied to charge-state detection of quantum dots
  Item type: Journal Article
  Machine Learning: Science and Technology
  Darulová, Jana; Troyer, Matthias; Cassidy, Maja C. (2021)
  Automated tuning of gate-defined quantum dots is a requirement for large-scale semiconductor-based qubit initialisation. An essential step of these tuning procedures is charge-state detection based on charge stability diagrams. Using supervised machine learning to perform this task requires a large dataset for models to train on. In order to avoid hand labelling experimental data, synthetic data has been explored as an alternative. While providing a significant increase in the size of the training dataset compared to using experimental data, using synthetic data means that classifiers are trained on data sourced from a different distribution than the experimental data that is part of the tuning process. Here we evaluate the prediction accuracy of a range of machine learning models trained on simulated and experimental data, and their ability to generalise to experimental charge stability diagrams in two-dimensional electron gas and nanowire devices. We find that classifiers perform best on either purely experimental or a combination of synthetic and experimental training data, and that adding common experimental noise signatures to the synthetic data does not dramatically improve the classification accuracy. These results suggest that experimental training data as well as realistic quantum dot simulations and noise models are essential in charge-state detection using supervised machine learning.
- Operationally meaningful representations of physical systems in neural networks
  Item type: Journal Article
  Machine Learning: Science and Technology
  Poulsen Nautrup, Hendrik; Metger, Tony; Iten, Raban; et al. (2022)
  To make progress in science, we often build abstract representations of physical systems that meaningfully encode information about the systems. Such representations ignore redundant features and treat parameters such as velocity and position separately because they can be useful for making statements about different experimental settings. Here, we capture this notion by formally defining the concept of operationally meaningful representations. We present an autoencoder architecture with attention mechanism that can generate such representations and demonstrate it on examples involving both classical and quantum physics. For instance, our architecture finds a compact representation of an arbitrary two-qubit system that separates local parameters from parameters describing quantum correlations.