J/A+A/697/A107 Carbon star identification in Gaia DR3 (Ye+, 2025)
Deep learning interpretability analysis for carbon star identification in
Gaia DR3.
Ye S., Cui W.Y., Li Y.B., Luo A.L., Jones H.R.A.
<Astron. Astrophys. 697, A107 (2025)>
=2025A&A...697A.107Y        (SIMBAD/NED BibCode)
ADC_Keywords: Stars, carbon ; Optical
Keywords: methods: analytical - methods: data analysis - catalogs
Abstract:
A large fraction of Asymptotic Giant Branch (AGB) stars develop
carbon-rich atmospheres during their evolution. Based on their color
and luminosity, these carbon stars can be easily distinguished from
many other kinds of stars. However, numerous G, K, and M giants also
occupy the same region as carbon stars on the HR diagram. Despite
this, their spectra exhibit differences, especially in the prominent
CN molecular bands.
We aim to distinguish carbon stars from other kinds of stars using
Gaia's XP spectra, while providing attributional interpretations of
the key features necessary for identification and discovering
additional key spectral features.
We propose a classification model named "GaiaNet", an improved
one-dimensional convolutional neural network specifically designed for
handling Gaia's XP spectra. We utilized the SHAP interpretability
model to determine SHAP values for each feature in a spectrum,
enabling us to explain the output of the "GaiaNet" model and
provide further meaningful analysis.
Compared to four traditional machine-learning methods, the "GaiaNet"
model exhibits an average classification accuracy improvement of
approximately 0.3% on the validation set, with the highest accuracy
reaching 100%. Utilizing the SHAP model, we present a clear
spectroscopic heatmap highlighting molecular band absorption features
primarily distributed around CN773.3 and CN895.0, and summarize five
key feature regions for carbon star identification. Upon applying the
trained classification model to the CSTAR sample with Gaia
"xpsampledmean" spectra, we obtained 451 new candidate carbon stars
as a by-product.
Our algorithm is capable of discerning subtle feature differences from
low-resolution spectra of Gaia, thereby assisting us in effectively
identifying carbon stars with typically higher temperatures and weaker
CN features, while providing compelling attributive explanations. The
interpretability analysis of deep learning holds significant potential
in spectral identification.
Description:
The catalog has 451 entries, the most likely carbon star candidates
selected by our model. The model was trained on a positive sample
selected from CSTARXPMG and a negative sample selected from
CSTARXPMnonG. All candidates are sorted in descending order of
confidence.
File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
tableb1.dat       99      451   Main information of carbon star candidates
--------------------------------------------------------------------------------
See also:
I/355 : Gaia DR3 Part 1. Main source (Gaia Collaboration, 2022)
Byte-by-byte Description of file: tableb1.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label      Explanations
--------------------------------------------------------------------------------
   1-  2  A2    ---     Notes      [ab ] Note (1)
   3- 21  I19   ---     GaiaDR3    Gaia DR3 Source ID
  23- 32  F10.6 deg     RAdeg      Right ascension (ICRS) at Ep=2016.0
  34- 43  F10.6 deg     DEdeg      Declination (ICRS) at Ep=2016.0
  45- 57  A13   ---     MainType   Main type given by SIMBAD
  59- 83  A25   ---     OtherTypes Other types given by SIMBAD (2)
  85- 91  F7.5  ---     Confidence Confidence given by GaiaNet
  93- 99  F7.5  ---     SHAPvalue  Sum of SHAP values given by SHAP model
--------------------------------------------------------------------------------
Note (1): Notes as follows:
    a = the carbon star candidate is also among the C-rich AGB star
        candidates of Sanders & Matsunaga (2023MNRAS.521.2745S)
    b = the carbon star candidate is labeled as a carbon star by
        LAMOST's pipeline
Note (2): Only the 6 top SIMBAD other types are given.
--------------------------------------------------------------------------------
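The byte ranges above are enough to read tableb1.dat without any
catalog-specific tooling. The following is a minimal sketch in Python, not
part of the original catalog: the names COLUMNS, parse_row and read_table
are hypothetical helpers, and the sample row is synthetic (built only to
match the 99-byte layout), not an actual catalog entry.

```python
# Column layout copied from the byte-by-byte description of tableb1.dat.
# (label, first_byte, last_byte, converter); bytes are 1-indexed, inclusive.
COLUMNS = [
    ("Notes", 1, 2, str),
    ("GaiaDR3", 3, 21, int),
    ("RAdeg", 23, 32, float),
    ("DEdeg", 34, 43, float),
    ("MainType", 45, 57, str),
    ("OtherTypes", 59, 83, str),
    ("Confidence", 85, 91, float),
    ("SHAPvalue", 93, 99, float),
]


def parse_row(line):
    """Split one fixed-width record into a dict keyed by column label."""
    row = {}
    for label, first, last, convert in COLUMNS:
        field = line[first - 1:last].strip()
        if convert is str:
            row[label] = field
        else:
            # Blank numeric fields are returned as None (an assumption;
            # the ReadMe does not say whether blanks occur).
            row[label] = convert(field) if field else None
    return row


def read_table(path="tableb1.dat"):
    """Read every non-empty record of the table into a list of dicts."""
    with open(path, encoding="ascii") as handle:
        return [parse_row(line.rstrip("\n")) for line in handle if line.strip()]


# Synthetic 99-byte record used only to exercise the parser.
sample = (
    "a "
    + "1234567890123456789"
    + " " + f"{123.456789:10.6f}"
    + " " + f"{-45.678901:10.6f}"
    + " " + f"{'C*':<13}"
    + " " + f"{'LP*|V*':<25}"
    + " " + f"{0.99876:7.5f}"
    + " " + f"{0.12345:7.5f}"
)

row = parse_row(sample)
# The Notes field may hold 'a', 'b', both, or neither; a set makes the
# membership tests of Note (1) straightforward.
flags = set(row["Notes"])
```

With the real file in the working directory, read_table() would return one
dict per candidate; "a" in set(r["Notes"]) then selects the candidates that
carry note a. Readers who use astropy may prefer its CDS table reader, which
is designed to take the column layout from a ReadMe of this kind.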
Acknowledgements:
Shuo Ye, yeshuo(at)bao.ac.cn
(End) Patricia Vannier [CDS] 31-Dec-2024