J/A+A/708/A224 XMM-Newton supervised flare detection (Pasquato+, 2026)
Stellar flare detection in XMM-Newton with gradient boosted trees.
Pasquato M., Marelli M., De Luca A., Salvaterra R., Carenini G.,
Belfiore A., Tiengo A., Esposito P.
<Astron. Astrophys. 708, A224 (2026)>
=2026A&A...708A.224P        (SIMBAD/NED BibCode)
ADC_Keywords: X-ray sources ; Stars, flare
Keywords: stars: activity - stars: flare - X-rays: binaries - X-rays: bursts -
X-rays: general - X-rays: stars
Abstract:
The EXTraS project, based on data collected with the XMM-Newton
observatory, provided us with a vast amount of light
curves for X-ray sources. For each light curve, EXTraS also provided
us with a set of features (https://extras.inaf.it). We extract from
the EXTraS database a tabular dataset of 31832 variable sources by
108 features. Of these, 13851 sources were manually labeled as
stellar flares or non-flares based on direct visual inspection.
We employ a supervised learning approach to produce a catalog of
stellar flares from our dataset, which we release to the community. We
leverage explainable AI tools and interpretable features to better
understand our classifier.
We train a gradient boosting classifier on 80% of the data for which
labels are available. We compute permutation feature importance
scores, visualize feature space using UMAP, and analyze some false
positive and false negative data points with the help of Shapley
additive explanations - an AI explainability technique used to measure
the importance of each feature in determining the classifier's
prediction for each instance.
On the test set made up of the remaining 20% of our labeled data, we
obtain an accuracy of 97.1%, with a precision of 82.4% and a recall of
73.3%. Our classifier outperforms a simple criterion based on fitting
the light curve with a flare template and significantly surpasses a
gradient-boosted classifier trained only on model-independent
features. False positives appear related to flaring light curves that
are not associated with a stellar counterpart, while false negatives
often correspond to multiple flares or otherwise peculiar or noisy
curves.
We apply our trained classifier to currently unlabeled sources,
releasing the largest catalog of X-ray stellar flares to date. We
estimate that integrating our classifier into the astronomers'
workflow will reduce the time spent visually inspecting light curves
by approximately half compared to an approach based on flare template
fitting, with implications for the classification of sources whose
variability is less well established within EXTraS as well as for
other catalogs and, possibly, forthcoming missions.
Description:
This catalogue provides predicted flare probabilities for 31832
variable X-ray sources observed by XMM-Newton. The predictions are
obtained using a statistical learning model (gradient boosted trees)
applied to EXTraS project features.
Each row corresponds to one XMM-Newton source. The catalogue lists the
observation and source identifiers, sky coordinates (pipeline values
and, when available, XMM values), the predicted probability that the
source is a flare, and the predicted class (flare or non-flare). A
subset of sources was visually
inspected and used as training data; for these sources the visual
classification is also reported.
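For the visually inspected sources, the predicted class can be compared
with the visual label. The following is a minimal Python sketch of the
standard accuracy, precision and recall definitions used in the abstract.
The function name flare_metrics and the toy input are illustrative
assumptions, not part of the catalogue or of the authors' code. Metrics
computed over all labeled rows would include training data, so they need
not match the test-set values quoted above.

```python
def flare_metrics(pairs):
    """Compute accuracy, precision and recall.

    pairs: iterable of (predicted_flare, visual_label) booleans,
    one tuple per visually inspected source.
    """
    tp = fp = tn = fn = 0
    for predicted, actual in pairs:
        if predicted and actual:
            tp += 1          # flare correctly predicted
        elif predicted and not actual:
            fp += 1          # non-flare predicted as flare
        elif not predicted and actual:
            fn += 1          # flare missed by the classifier
        else:
            tn += 1          # non-flare correctly rejected
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else None,
        "precision": tp / (tp + fp) if (tp + fp) else None,
        "recall": tp / (tp + fn) if (tp + fn) else None,
    }


# Toy input for illustration only; these are NOT catalogue values.
toy_pairs = [(True, True), (True, False), (False, True),
             (False, False), (False, False)]
metrics = flare_metrics(toy_pairs)
```

Precision is the fraction of predicted flares confirmed by eye, while
recall is the fraction of visually identified flares that the classifier
recovers.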
File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
catalog.dat       98    31832   Source catalogue
--------------------------------------------------------------------------------
See also:
IX/69 : XMM-Newton Serendipitous Source Catalogue 4XMM-DR13 (Webb+, 2023)
Byte-by-byte Description of file: catalog.dat
--------------------------------------------------------------------------------
   Bytes Format Units Label         Explanations
--------------------------------------------------------------------------------
   1-  9 I9     ---   ObsID         Observation identifier (OBS_ID)
  11- 13 I3     ---   Source        Source number within observation (SRC_NUM)
  15- 27 F13.9  deg   RAdeg         Right ascension from pipeline (J2000)
                                      (PPS_RA)
  29- 42 F14.10 deg   DEdeg         Declination from pipeline (J2000)
                                      (PPS_DEC)
  44- 56 F13.9  deg   RAXdeg        ?=- XMM-Newton right ascension (J2000)
                                      (XMM_RA)
  58- 71 F14.10 deg   DEXdeg        ?=- XMM-Newton declination (J2000)
                                      (XMM_DEC)
  73- 92 E20.18 ---   PredFlareProb Predicted probability of flare
                                      (predictedflareprobability)
      94 I1     ---   PredFlare     Predicted flare flag (TRUE/FALSE)
                                      (predicted_flare)
      96 I1     ---   TrainDatapt   Training set flag (TRUE/FALSE)
                                      (training_datapoint)
      98 A1     ---   TrainFlare    [YN-] Visual flare label
                                      (trainingflarelabel)
--------------------------------------------------------------------------------
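The byte ranges above can be turned into a fixed-width reader. The
following is a minimal Python sketch using only the standard library.
The function names, the use of the original column names (given in
parentheses in the table) as dictionary keys, and the assumption that the
I1 flags are stored as 1/0 are all illustrative; they are not confirmed
by this ReadMe. The sample line is synthetic, not taken from catalog.dat.

```python
def _float_or_none(text):
    # "?=-" in the table means a lone "-" marks a missing value.
    text = text.strip()
    return None if text in ("", "-") else float(text)


def _flag(text):
    # Assumption: TRUE/FALSE flags are written as 1/0 (format I1).
    text = text.strip()
    return None if text == "" else bool(int(text))


def _visual_label(text):
    # "Y" -> True, "N" -> False, "-" (not inspected) or blank -> None.
    return {"Y": True, "N": False}.get(text.strip())


# (key, first byte, last byte, converter); bytes are 1-indexed, inclusive.
COLUMNS = [
    ("OBS_ID", 1, 9, int),
    ("SRC_NUM", 11, 13, int),
    ("PPS_RA", 15, 27, float),
    ("PPS_DEC", 29, 42, float),
    ("XMM_RA", 44, 56, _float_or_none),
    ("XMM_DEC", 58, 71, _float_or_none),
    ("predictedflareprobability", 73, 92, float),
    ("predicted_flare", 94, 94, _flag),
    ("training_datapoint", 96, 96, _flag),
    ("trainingflarelabel", 98, 98, _visual_label),
]


def parse_line(line):
    """Parse one 98-byte record of catalog.dat into a dict."""
    line = line.rstrip("\n").ljust(98)
    return {key: conv(line[first - 1:last])
            for key, first, last, conv in COLUMNS}


# Synthetic record built to match the byte layout (illustration only).
sample = (f"{123456789:9d} {7:3d} {123.456789012:13.9f} "
          f"{-12.3456789012:14.10f} {'-':>13} {'-':>14} "
          f"{0.95:20.12e} 1 0 -")
record = parse_line(sample)
```

To read the whole file, parse_line can be applied to each line of
catalog.dat opened in text mode.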
Acknowledgements:
Mario Pasquato, mario.pasquato(at)inaf.it
License: CC-BY-4.0 [see https://spdx.org/licenses/]
(End) Patricia Vannier [CDS] 01-Apr-2026