Highlights

In addition to general support for all kinds of data analysis across the facility, the group focuses on research and developments in specific project areas.

Artificial intelligence

Machine learning is now widespread in many fields, but people often lose sight of what it actually means: taking advantage of mathematical techniques to automate and optimize processes. In a facility as large and complex as the European XFEL, we need all the automation we can get to keep the machine running at high performance, so we are exploring several methods that let scientists worry about their science rather than the inner workings of the detectors or the FEL beam.

Automating Serial Femtosecond Crystallography
Serial Femtosecond Crystallography experiments provide an excellent way of gaining insight into the structure of crystals, but the analysis chain required for the collected data is complex and requires tuning several parameters. Scientists need feedback on the quality of their data during the experiment to judge whether it is proceeding as expected and, if not, to adjust it appropriately.

Our machine learning team, in collaboration with colleagues from the Center for Free-Electron Laser Science and the scientific instruments, has developed a technique which automatically assesses the data quality as the experiment progresses. The quality is evaluated as a function of the analysis chain parameters, which are then optimized to maximize it. As a result, scientists no longer have to tune the analysis chain themselves and can concentrate on spotting sub-optimal data quality and reacting to it by changing the experiment parameters.
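The idea of treating data quality as a function of the analysis chain parameters can be sketched in a few lines. The quality metric below is a hypothetical stand-in (a smooth surface imitating, say, an indexing rate as a function of a peak-finding threshold and a minimum peak count), and a generic SciPy optimizer stands in for the production machinery:

```python
import numpy as np
from scipy.optimize import minimize

def indexing_rate(params):
    """Hypothetical data-quality metric: the fraction of frames the
    analysis chain indexes successfully, as a function of two chain
    parameters.  Here it is a smooth toy surface peaking at (2.0, 5.0)."""
    threshold, min_peaks = params
    return np.exp(-((threshold - 2.0) ** 2 + 0.1 * (min_peaks - 5.0) ** 2))

# Maximize the quality metric by minimizing its negative, so the
# scientist never has to hand-tune the chain parameters themselves.
result = minimize(lambda p: -indexing_rate(p), x0=[0.5, 1.0],
                  method="Nelder-Mead")

best_threshold, best_min_peaks = result.x
print(f"best parameters: threshold={best_threshold:.2f}, "
      f"min_peaks={best_min_peaks:.2f}")
```

In the real system the metric is computed from the live data stream, so the optimum tracks the experiment as it evolves.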

Transforming low-resolution measurements into high-resolution spectrometry
Understanding the photon characteristics, for example each pulse’s spectrum, is fundamental at the EuXFEL. The issue: the best analysis of our beam happens if we stop the beam and analyse it with a high-resolution spectrometer! We can gain some insight into the beam quality from low-resolution spectrometers, but this often requires complex manual data analysis by experts.

In collaboration with our X-ray Photon Diagnostics colleagues, we have developed a technique which takes simultaneous measurements from the low- and high-resolution spectrometers and learns how to map one onto the other, with no parameter fine-tuning needed. As a result, we can use the learned mapping to convert low-resolution data into high-resolution data without stopping the beam. Insight into the explainability of the model is gained from a detailed uncertainty calculation of that mapping. The main idea is to project the data into a sub-space containing only the most relevant information, so that data selection happens automatically.
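The core of the approach, projecting onto a low-dimensional sub-space and learning a linear map from paired measurements, can be sketched with NumPy alone. The data below are synthetic (spectra built from a handful of fixed features; real FEL spectra are spikier), and the sketch omits the uncertainty calculation described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for paired spectrometer data: each pulse spectrum is a
# combination of 8 fixed spectral features.  The high-resolution device
# samples 200 points; the low-resolution one sees a 10x-downsampled,
# noisy view of the same pulse.
x = np.linspace(0.0, 1.0, 200)
features = np.exp(-((x - np.linspace(0.2, 0.8, 8)[:, None]) / 0.05) ** 2)
weights = rng.uniform(0.0, 1.0, (500, 8))
hi = weights @ features                               # (500, 200) high-res
lo = hi.reshape(500, 20, 10).mean(axis=2)             # (500, 20) low-res
lo += 0.005 * rng.standard_normal(lo.shape)           # detector noise

# Project the low-resolution data onto its leading principal components,
# keeping only the most relevant directions, then learn a linear map
# from that sub-space to the high-resolution spectra by least squares.
lo_mean = lo.mean(axis=0)
_, _, vt = np.linalg.svd(lo - lo_mean, full_matrices=False)
basis = vt[:10]                                       # leading components
z = (lo - lo_mean) @ basis.T                          # sub-space coords
coeffs, *_ = np.linalg.lstsq(
    np.hstack([z, np.ones((500, 1))]), hi, rcond=None)

# With the mapping learned, a low-resolution shot alone yields a
# high-resolution estimate -- without interrupting the beam.
z0 = (lo[0] - lo_mean) @ basis.T
pred = np.concatenate([z0, [1.0]]) @ coeffs
print("mean reconstruction error:", float(np.abs(pred - hi[0]).mean()))
```

The sub-space projection is what makes the method parameter-free in practice: the leading components are selected by the data itself.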

Multi-modular detector geometry optimization
Our detectors at the EuXFEL are complex devices made of several modules which can be moved independently, giving scientists a lot of freedom in how they take their data. The disadvantage of that freedom is that the position of the modules relative to the X-ray beam is rarely known with high precision. We rely on irradiating standard samples to estimate the correct detector alignment, and then spend a lot of time and manpower analysing that data to discover how the modules have shifted relative to a previous experiment.

Our team has devised a solution for the case where the standard sample comes from a powder diffraction experiment, which produces rings on the detector: a perfectly aligned detector produces concentric rings. By establishing a set of mathematical transformations, we can define a simple metric whose optimization aligns the detector modules. Leveraging Bayesian optimization, this is done in very few iterations.
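A minimal sketch of such a ring-concentricity metric is below, on synthetic data with two "modules" and one hidden shift. For brevity a generic SciPy optimizer stands in for the Bayesian optimization used in production, and all geometry is simplified to a single ring and a single shifted module:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Toy powder pattern: points on a ring of radius 80 pixels around the
# beam centre, split across two detector "modules".  Module B is
# secretly shifted by (3.5, -2.0) pixels, so its ring segment is
# off-centre relative to module A's.
true_shift = np.array([3.5, -2.0])
angles = rng.uniform(0.0, 2.0 * np.pi, 2000)
ring = 80.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
module_b = angles > np.pi                      # half the points on module B
data = ring.copy()
data[module_b] += true_shift                   # the misalignment
data += 0.2 * rng.standard_normal(data.shape)  # measurement noise

def ring_metric(shift):
    """Spread of radial distances after undoing a candidate shift of
    module B; minimal (noise only) when the rings are concentric."""
    corrected = data.copy()
    corrected[module_b] -= shift
    return np.hypot(corrected[:, 0], corrected[:, 1]).std()

result = minimize(ring_metric, x0=[1.0, 1.0], method="Nelder-Mead")
print("recovered shift:", np.round(result.x, 2))
```

Because the metric is cheap to evaluate and smooth near the optimum, a sample-efficient optimizer such as Bayesian optimization needs only a handful of evaluations per module.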

Finding ice: the million-euro question
The accelerator and the EuXFEL detectors are our most important pieces of machinery, and we do our best to keep them running as often as we can and without damage. One source of damage is ice crystals inadvertently formed during sample injection: the ice scatters the X-ray beam intensely, which can damage detector pixels.

In collaboration with the SPB/SFX scientific instrument, we have devised a solution which detects such events by analysing data from a monitoring camera and automatically blocks the beam whenever damage is imminent. Previously a human operator was needed for this, and inevitably reacted with some delay, while the software reacts as soon as ice is formed!
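The detect-and-veto loop can be illustrated with a few lines of Python. The frames, thresholds, and function names below are illustrative, not the deployed SPB/SFX implementation; the point is only the shape of the logic, a per-frame test wired directly to a shutter action:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy monitoring-camera frames (64x64 greyscale).  Ice crystals in the
# injection stream show up as small, very bright scattering spots;
# normal frames contain only diffuse background.
normal = rng.uniform(0.0, 0.3, (64, 64))
icy = normal.copy()
icy[30:33, 40:43] = 0.95                       # bright ice-like spot

def ice_detected(frame, brightness=0.8, min_pixels=4):
    """Flag a frame if enough pixels exceed the brightness threshold."""
    return int((frame > brightness).sum()) >= min_pixels

def monitor(frame, close_shutter):
    """React as soon as ice appears -- no human in the loop."""
    if ice_detected(frame):
        close_shutter()

events = []
monitor(normal, lambda: events.append("shutter closed"))
monitor(icy, lambda: events.append("shutter closed"))
print(events)
```

Running at the camera's frame rate, such a check reacts within a frame period, far faster than a human operator watching the same feed.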

And more ...
We have many other projects in the pipeline including predictive maintenance in our complex control systems and automatic data clustering. We invite you to join us and be part of the EuXFEL automation team!

Metadata tracking and exploration

Experiments at the European XFEL often rely on a huge amount of metadata: quantities describing the primary data product. These are often collected manually in shared spreadsheets, which can be tedious and error prone. To address this, we are developing a system that automatically gathers any metadata defined by the user and automates certain kinds of analysis. A quantity could be as simple as a motor position or as complex as the result of a user-defined fitting procedure. The system offers rich, interactive visualization of the collected quantities, with the aim of making beamtimes more efficient. To comprehensively account for all the experiment’s steps, sample details as well as further analysis results can be added and inspected at any point in time.
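The pattern of user-defined metadata can be sketched generically: a user registers quantities as plain functions of a run, and the system evaluates them for every run and collects the results into one browsable table. All names and the decorator interface below are hypothetical, for illustration only, not the real EuXFEL system:

```python
import pandas as pd

VARIABLES = {}

def variable(func):
    """Register a user-defined metadata quantity (hypothetical API)."""
    VARIABLES[func.__name__] = func
    return func

@variable
def motor_position(run):
    # as simple as reading back a motor position
    return run["motor_mm"]

@variable
def peak_height(run):
    # stand-in for an arbitrary user-defined fitting procedure
    return max(run["trace"])

# Toy "runs"; in practice these would be read from the recorded data.
runs = [
    {"run": 1, "motor_mm": 12.5, "trace": [0.1, 0.9, 0.4]},
    {"run": 2, "motor_mm": 13.0, "trace": [0.2, 1.3, 0.5]},
]
table = pd.DataFrame(
    {"run": r["run"], **{name: f(r) for name, f in VARIABLES.items()}}
    for r in runs
)
print(table)
```

Collecting everything into one table is what replaces the shared spreadsheet: the values are computed automatically and consistently for every run, and the same table feeds the interactive visualization.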

Data reduction

The European XFEL generates a large amount of scientific data, which poses a challenge for storage capacity and complicates data analysis. To address this issue, a new data processing and retention strategy has been implemented based on the European XFEL scientific data policy. Users are asked to reduce the amount of data retained on hard-disk storage to a defined limit, which is only a fraction of the originally acquired raw data. Data reduction can be achieved by selecting a subset of the data unmodified or by altering/transforming the data.

The Data Department offers users procedures and tools to assist them with data reduction activities, aiming to automate the process wherever possible. The selection can be done manually based on the quality and scientific value of data, or by automatic tools which detect illuminated frames, or those containing sample diffraction patterns from serial femtosecond crystallography or single particle imaging. Data can also be selected in the spatial dimension, e.g. by defining a detector region of interest or by storing a single module, where applicable. Other methods include data compression and integration of data over spatial and/or temporal dimensions.
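Lit-frame selection, one of the methods listed above, can be sketched in NumPy. The data and the selection threshold below are synthetic and illustrative, not the production detection logic:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy detector run: 100 frames of 128x128 pixels.  Most frames contain
# only dark background; the 20 frames where a pulse actually hit the
# sample carry substantially more total intensity.
frames = rng.poisson(0.05, (100, 128, 128)).astype(np.float32)
lit = rng.choice(100, 20, replace=False)       # the illuminated frames
frames[lit] += rng.poisson(2.0, (20, 128, 128))

# Lit-frame selection: keep only frames whose summed intensity stands
# well above the typical (background-dominated) frame of the run.
totals = frames.sum(axis=(1, 2))
keep = totals > 2 * np.median(totals)
reduced = frames[keep]
print(f"kept {int(keep.sum())} of {len(frames)} frames "
      f"({reduced.nbytes / frames.nbytes:.0%} of the volume)")
```

Selection in the temporal dimension like this composes with the other methods mentioned, a spatial region of interest, compression, or integration, to reach the retention target.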

As a reduction example, the picture above demonstrates the effect of applying photonisation and compression in combination with lit-frame selection for different pulse patterns. This example is taken from an X-ray photon correlation experiment, where an overall reduction to 0.7% of the raw volume was achieved, saving 423 TiB of storage space.

Read more here.

Automatic data processing services

European XFEL aims to provide facility users with a scientifically utilizable dataset as the primary data product. Depending on the experiment, this may range from the raw data received directly from hardware to the result of multiple processing steps. For this purpose, the Data Department develops data processing services that apply common and established analysis steps automatically and immediately after data acquisition.

For the X-ray imaging detectors used at European XFEL, in particular the custom burst-mode detectors AGIPD, LPD and DSSC, this includes the automatic correction and transformation from raw data to gain-calibrated data on a photon energy scale.

The necessary calibration data is taken and analyzed by detector experts beforehand, while noise-related effects are compensated by frequent calibrations performed automatically in between measurements. All calibration data is stored in a database indexed by time and the exact operating condition of the detector for later reference.
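The "indexed by time and operating condition" lookup can be illustrated with a small sketch. The store layout and names below are hypothetical, not the real calibration database schema; the point is that a query for a given condition returns the newest constants valid at the requested time:

```python
import bisect
from datetime import datetime

# Hypothetical calibration store: maps a detector operating condition
# (e.g. gain mode and bias voltage) to a time-ordered list of constants.
store = {
    ("high-gain", 300): [
        (datetime(2024, 1, 10), {"offset": 101.2}),
        (datetime(2024, 3, 2), {"offset": 99.8}),
    ],
}

def lookup(condition, when):
    """Return the newest constants for `condition` valid at `when`."""
    entries = store[condition]
    times = [t for t, _ in entries]
    i = bisect.bisect_right(times, when) - 1
    if i < 0:
        raise LookupError("no calibration before requested time")
    return entries[i][1]

constants = lookup(("high-gain", 300), datetime(2024, 2, 1))
print(constants)
```

Indexing by time is also what makes later reprocessing reproducible: a query for a past date returns exactly the constants that were valid then.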

Recorded data can be processed directly after acquisition on the Maxwell computing cluster, with the results written to files accompanying the raw data. This processing may include additional correction or reduction steps not available in real time, for publication-quality analysis, and each run is documented by a generated report to assess the results. Since October 2021, the processed data can be reproduced exactly from the raw data at any later point in time.
In addition to correcting X-ray pixel detectors, the offline processing also includes event reconstruction for delay line detectors and is currently planned to be expanded to more general processing steps such as azimuthal integration.

Read more here.