J/A+A/704/A70 The main sequence-white dwarf valley (Ranaivomanana+, 2025)
Unsupervised learning for variability detection with Gaia DR3 photometry.
The main sequence-white dwarf valley.
Ranaivomanana P., Johnston C., Iorio G., Groot P.J., Uzundag M., Kupfer T.,
Aerts C.
<Astron. Astrophys. 704, A70 (2025)>
=2025A&A...704A..70R        (SIMBAD/NED BibCode)
ADC_Keywords: Stars, variable ; Stars, subdwarf ; Binaries, orbits ; Optical
Keywords: methods: data analysis - methods: statistical -
techniques: photometric - surveys - subdwarfs -
stars: variables: general
Abstract:
The unprecedented volume and quality of data from space- and
ground-based telescopes present an opportunity for machine learning to
identify new classes of variable stars and peculiar systems that may
have been overlooked by traditional methods. The region between the
main sequence and white-dwarf sequence in the colour-magnitude
diagram (CMD) hosts a variety of astrophysically valuable and poorly
characterised objects, including hot subdwarfs, pre-white dwarfs, and
interacting binaries. Extending prior methodological work, this study
investigates the potential of an unsupervised learning approach to scale
effectively to larger stellar populations, including objects in
crowded fields, and without the need for pre-selected catalogues,
specifically focusing on 13405 sources selected from Gaia DR3 and
lying in the selected region of the CMD. Our methodology incorporates
unsupervised clustering techniques based primarily on statistical
features extracted from Gaia DR3 epoch photometry. We used the
t-distributed stochastic neighbour embedding (t-SNE) algorithm to
identify variability classes, their subtypes, and spurious variability
induced by instrumental effects. Feature importance was evaluated
using SHapley Additive exPlanations (SHAP) values to identify the most
influential parameters associated with each cluster. The clustering
results revealed distinct groups, including hot subdwarfs, cataclysmic
variables (CVs), eclipsing binaries, and objects in crowded fields,
such as those in the Andromeda (M31) field. Several potential stellar
subtypes also emerged within these clusters, such as pulsating hot
subdwarfs exhibiting pure or hybrid (pressure and/or gravity) modes
within the hot subdwarf cluster. Magnetic CVs and dwarf novae appeared
in the CV cluster. Feature evaluation further enabled the
identification of a cluster dominated purely by photometric
variability, as well as clusters associated with instrumental effects
and crowded fields. Notably, objects previously labelled as RR Lyrae
were found in an unexpected region of the CMD, potentially due to
either unreliable astrometric measurements (e.g. due to binarity) or
alternative evolutionary pathways. This study emphasises the
robustness of the proposed method in finding variable objects in a
large region of the Gaia CMD, including variable hot subdwarfs and
CVs, while demonstrating its efficiency in detecting variability in
extended stellar populations. The proposed unsupervised learning
framework demonstrates scalability to large datasets and yields
promising results in identifying stellar subclasses.
Description:
List of 13405 targets located in the valley between the main sequence
and the white dwarf sequence. The table provides the resulting
t-SNE embeddings along with classifications from the Gaia SOS
pipeline, the Gaia machine-learning classifier, and the literature.
File Summary:
--------------------------------------------------------------------------------
FileName     Lrecl  Records  Explanations
--------------------------------------------------------------------------------
ReadMe          80        .  This file
tablea2.dat    268    13405  Targets in the valley between the main sequence
                             and the white dwarf sequence
--------------------------------------------------------------------------------
See also:
I/355 : Gaia DR3 Part 1. Main source (Gaia Collaboration, 2022)
Byte-by-byte Description of file: tablea2.dat
--------------------------------------------------------------------------------
  Bytes  Format Units Label        Explanations
--------------------------------------------------------------------------------
  1- 19  I19    ---   GaiaDR3      Gaia DR3 source ID
 21- 40  F20.16 deg   RAdeg        Right Ascension (ICRS) at Ep=2016.0
 42- 61  F20.16 deg   DEdeg        Declination (ICRS) at Ep=2016.0
 63- 72  F10.7  mag   Gmag         Gaia G-band magnitude
 74- 92  F19.16 mag   GMAG         Gaia G-band absolute magnitude
 94-106  F13.9  ---   BP-RP        Gaia BP-RP colour
108-128  F21.16 d     Per          Gaia G-band period
130-151  F22.18 ---   tSNEComp1    t-SNE component 1
153-174  F22.18 ---   tSNEComp2    t-SNE component 2
176-179  A4     ---   Cluster      Cluster name
181-195  A15    ---   GaiaSOSClass Gaia SOS classification
197-211  A15    ---   GaiaMLClass  Gaia machine learning classification
213-248  A36    ---   LitClass     Classification from literature
250-268  A19    ---   r_LitClass   Literature classification reference
--------------------------------------------------------------------------------
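Reading example:
The byte ranges above are 1-indexed and inclusive, so a record can be split
with plain string slicing. The following is a minimal Python sketch, not part
of the original catalogue. The names COLUMNS, parse_row and read_table are
hypothetical helpers, and treating blank fields as missing values (None) is
an assumption about how empty entries are stored in tablea2.dat.

```python
# Column layout copied from the byte-by-byte description
# (label, first byte, last byte, converter); bytes are 1-indexed, inclusive.
COLUMNS = [
    ("GaiaDR3", 1, 19, int),
    ("RAdeg", 21, 40, float),
    ("DEdeg", 42, 61, float),
    ("Gmag", 63, 72, float),
    ("GMAG", 74, 92, float),
    ("BP-RP", 94, 106, float),
    ("Per", 108, 128, float),
    ("tSNEComp1", 130, 151, float),
    ("tSNEComp2", 153, 174, float),
    ("Cluster", 176, 179, str),
    ("GaiaSOSClass", 181, 195, str),
    ("GaiaMLClass", 197, 211, str),
    ("LitClass", 213, 248, str),
    ("r_LitClass", 250, 268, str),
]


def parse_row(line):
    """Parse one fixed-width record into a dict keyed by column label."""
    row = {}
    for label, first, last, convert in COLUMNS:
        # Convert 1-indexed inclusive bytes to a 0-indexed Python slice.
        text = line[first - 1:last].strip()
        # Assumption: an all-blank field means the value is missing.
        row[label] = convert(text) if text else None
    return row


def read_table(path="tablea2.dat"):
    """Read every non-empty record of the table into a list of dicts."""
    with open(path, encoding="utf-8") as handle:
        return [parse_row(line.rstrip("\n")) for line in handle if line.strip()]
```

Calling read_table() in the directory holding tablea2.dat should give one
dict per source. As an alternative, astropy.io.ascii.read with format="cds"
and readme="ReadMe" can parse the table directly from this description.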
Acknowledgements:
Princy Ranaivomanana, rtprincy(at)gmail.com
(End) Patricia Vannier [CDS] 28-Oct-2025