J/A+A/657/A62 OB stars spectral classification automated tool (Kyritsis+, 2022)
A new automated tool for the spectral classification of OB stars.
Kyritsis E., Maravelias G., Zezas A., Bonfini P., Kovlakas K., Reig P.
<Astron. Astrophys. 657, A62 (2022)>
=2022A&A...657A..62K (SIMBAD/NED BibCode)
ADC_Keywords: Stars, early-type ; Stars, OB ; Stars, Be ; Spectral types ;
MK spectral classification
Keywords: stars: early-type - stars: massive - X-rays: binaries -
methods: statistical - stars: emission-line, Be
Abstract:
As an increasing number of spectroscopic surveys become available, an
automated approach to spectral classification becomes necessary. Due
to the significance of the massive stars, it is of great importance to
identify the phenomenological parameters of these stars (e.g., the
spectral type), which can be used as proxies to their physical
parameters (e.g., mass and temperature).
In this work, we aim to use the random forest (RF) algorithm to
develop a tool for the automated spectral classification of OB-type
stars according to their sub-types.
We used the regular RF algorithm, the probabilistic RF (PRF), which is
an extension of RF that incorporates uncertainties, and we introduced
the KDE-RF method which is a combination of the kernel-density
estimation and the RF algorithm. We trained the algorithms on the
equivalent width (EW) of characteristic absorption lines measured in
high-quality spectra (Signal-to-Noise (S/N)>50) from large Galactic
(LAMOST, GOSSS) and extragalactic surveys (2dF, VFTS) with available
spectral types and luminosity classes. By following an adaptive
binning approach, we grouped the labels of these data in 11 spectral
classes within the O2-B9 range. We examined which of the
characteristic spectral lines (features) are more important for the
classification based on a number of feature selection methods, and we
searched for the optimal hyperparameters of the classifiers to achieve
the best performance.
From the feature-screening process, we find that the full set of 17
spectral lines is needed to reach the maximum performance per spectral
class. We find that the overall accuracy score is ∼70%, with similar
results across all approaches. We apply our model to other
observational data sets providing examples of the potential
application of our classifier to real science cases. We find that it
performs well for both single massive stars and for the companion
massive stars in Be X-ray binaries, especially for data of similar
quality to the training sample. In addition, we propose a reduced
ten-features scheme that can be applied to large data sets with lower
S/N∼20-50.
The similarity in the performances of our models indicates the
robustness and the reliability of the RF algorithm when it is used for
the spectral classification of early-type stars. The score of ∼70% is
high if we consider (a) the complexity of such multiclass
classification problems (i.e., 11 classes), (b) the intrinsic scatter
of the EW distributions within the examined spectral classes, and (c)
the diversity of the training set since we use data obtained from
different surveys with different observing strategies. In addition,
the approach presented in this work is applicable to products from
different surveys in terms of quality (e.g., different resolution) and
different formats (e.g., absolute or normalized flux), while our
classifier is agnostic to the luminosity class of a star, and, as much
as possible, it is metallicity independent.
Description:
Spectral type classification of stars in the IACOB project
(Simon-Diaz et al., 2015hsa8.conf..576S) based on the random forest
classifier developed in this work (Kyritsis et al., 2022,
arXiv:2110.10669). For comparison, we also tabulate the spectral types
provided by the IACOB database.
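Comparing the two classifications requires placing a published sub-type
(e.g., B3) into one of the 11 classes used by the classifier (e.g., B3-B4).
The sketch below shows one way to do that. The class labels are taken from
the column names of table6.dat; the helper names `to_class` and `agreement`
are hypothetical, and treating an exact class match as "agreement" is an
assumption of this example, not necessarily the criterion used in the paper.

```python
# The 11 class labels, as listed in the byte-by-byte description of table6.dat.
CLASSES = ["O2-O6", "O7", "O8", "O9", "B0", "B1", "B2",
           "B3-B4", "B5-B7", "B8", "B9"]


def to_class(sptype):
    """Map a sub-type such as 'B3' to the class label containing it.

    Returns None when the input is not an O or B sub-type covered by CLASSES.
    """
    sptype = sptype.strip()
    if len(sptype) < 2 or not sptype[1].isdigit():
        return None
    letter, digit = sptype[0], int(sptype[1])
    for label in CLASSES:
        parts = label.split("-")          # "B3-B4" -> ["B3", "B4"]; "B1" -> ["B1"]
        if parts[0][0] != letter:
            continue
        low = int(parts[0][1:])
        high = int(parts[-1][1:])
        if low <= digit <= high:
            return label
    return None


def agreement(pairs):
    """Fraction of (published, predicted) pairs falling in the same class.

    Assumption: a pair agrees only when the classes match exactly.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(1 for published, predicted in pairs
               if to_class(published) == predicted.strip())
    return hits / len(pairs)
```

Here `pairs` would be built from the IACOBSpType and RFSpType columns of
each record.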
File Summary:
--------------------------------------------------------------------------------
 FileName    Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe          80        .   This file
table6.dat     119      328   Spectral type classification of stars in
                               the IACOB survey
--------------------------------------------------------------------------------
Byte-by-byte Description of file: table6.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label       Explanations
--------------------------------------------------------------------------------
   1-  9  A9    ---     ID          Object's ID in the IACOB database
  11- 12  A2    ---     IACOBSpType Published spectral type
  14- 18  A5    ---     RFSpType    Predicted spectral class based on
                                      the RF method
  20- 25  F6.4  ---     Prob        Probability of predicted spectral class
  27- 42  A16   ---     ConfLevel   Confidence level flag
                                      (see Sec. 5.1 of the paper)
  44- 49  F6.4  ---     B0          Probability for class B0
  51- 56  F6.4  ---     B1          Probability for class B1
  58- 63  F6.4  ---     B2          Probability for class B2
  65- 70  F6.4  ---     B3-B4       Probability for class B3-B4
  72- 77  F6.4  ---     B5-B7       Probability for class B5-B7
  79- 84  F6.4  ---     B8          Probability for class B8
  86- 91  F6.4  ---     B9          Probability for class B9
  93- 98  F6.4  ---     O2-O6       Probability for class O2-O6
 100-105  F6.4  ---     O7          Probability for class O7
 107-112  F6.4  ---     O8          Probability for class O8
 114-119  F6.4  ---     O9          Probability for class O9
--------------------------------------------------------------------------------
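The byte ranges above are enough to read table6.dat without any external
library. The following is a minimal sketch in plain Python: the column
layout is copied from the description above, while the function names
`parse_record` and `read_table6` and the default path are hypothetical.
Blank numeric fields are returned as None, which is an assumption about how
missing values would appear.

```python
# Column layout copied from the byte-by-byte description
# (byte positions are 1-based and inclusive).
COLUMNS = [
    ("ID", 1, 9, str),
    ("IACOBSpType", 11, 12, str),
    ("RFSpType", 14, 18, str),
    ("Prob", 20, 25, float),
    ("ConfLevel", 27, 42, str),
    ("B0", 44, 49, float),
    ("B1", 51, 56, float),
    ("B2", 58, 63, float),
    ("B3-B4", 65, 70, float),
    ("B5-B7", 72, 77, float),
    ("B8", 79, 84, float),
    ("B9", 86, 91, float),
    ("O2-O6", 93, 98, float),
    ("O7", 100, 105, float),
    ("O8", 107, 112, float),
    ("O9", 114, 119, float),
]

RECORD_LENGTH = 119  # Lrecl of table6.dat, from the File Summary


def parse_record(line):
    """Parse one fixed-width record into a dict keyed by column label."""
    # Pad short lines so that slicing never runs past the end.
    padded = line.rstrip("\n").ljust(RECORD_LENGTH)
    row = {}
    for label, start, end, cast in COLUMNS:
        field = padded[start - 1:end].strip()
        if cast is float:
            # Assumption: an empty numeric field means "no value".
            row[label] = float(field) if field else None
        else:
            row[label] = field
    return row


def read_table6(path="table6.dat"):
    """Read all non-empty records of table6.dat into a list of dicts."""
    with open(path) as handle:
        return [parse_record(line) for line in handle if line.strip()]
```

Calling `read_table6()` in the directory holding the data file should
return one dict per record, 328 in total according to the File Summary.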
Acknowledgements:
Elias Kyritsis, ekyritsis(at)physics.uoc.gr
References:
Simon-Diaz et al., 2015hsa8.conf..576S,
The IACOB spectroscopic database: recent updates and first data release
(End) Patricia Vannier [CDS] 30-Dec-2021