J/A+A/657/A62  OB stars spectral classification automated tool (Kyritsis+, 2022)

A new automated tool for the spectral classification of OB stars.
    Kyritsis E., Maravelias G., Zezas A., Bonfini P., Kovlakas K., Reig P.
    <Astron. Astrophys. 657, A62 (2022)>
    =2022A&A...657A..62K        (SIMBAD/NED BibCode)
ADC_Keywords: Stars, early-type ; Stars, OB ; Stars, Be ; Spectral types ;
              MK spectral classification
Keywords: stars: early-type - stars: massive - X-rays: binaries -
          methods: statistical - stars: emission-line, Be

Abstract:
    As an increasing number of spectroscopic surveys become available, an
    automated approach to spectral classification becomes necessary. Due to
    the significance of the massive stars, it is of great importance to
    identify the phenomenological parameters of these stars (e.g., the
    spectral type), which can be used as proxies to their physical
    parameters (e.g., mass and temperature). In this work, we aim to use the
    random forest (RF) algorithm to develop a tool for the automated
    spectral classification of OB-type stars according to their sub-types.
    We used the regular RF algorithm, the probabilistic RF (PRF), which is
    an extension of RF that incorporates uncertainties, and we introduced
    the KDE-RF method which is a combination of the kernel-density
    estimation and the RF algorithm. We trained the algorithms on the
    equivalent width (EW) of characteristic absorption lines measured in
    high-quality spectra (Signal-to-Noise (S/N)>50) from large Galactic
    (LAMOST, GOSSS) and extragalactic surveys (2dF, VFTS) with available
    spectral types and luminosity classes. By following an adaptive binning
    approach, we grouped the labels of these data in 11 spectral classes
    within the O2-B9 range. We examined which of the characteristic spectral
    lines (features) are more important for the classification based on a
    number of feature selection methods, and we searched for the optimal
    hyperparameters of the classifiers to achieve the best performance.
    From the feature-screening process, we find that the full set of 17
    spectral lines is needed to reach the maximum performance per spectral
    class. We find that the overall accuracy score is ∼70%, with similar
    results across all approaches. We apply our model in other
    observational data sets providing examples of the potential application
    of our classifier to real science cases. We find that it performs well
    for both single massive stars and for the companion massive stars in Be
    X-ray binaries, especially for data of similar quality to the training
    sample. In addition, we propose a reduced ten-features scheme that can
    be applied to large data sets with lower S/N∼20-50. The similarity in
    the performances of our models indicates the robustness and the
    reliability of the RF algorithm when it is used for the spectral
    classification of early-type stars. The score of ∼70% is high if we
    consider (a) the complexity of such multiclass classification problems
    (i.e., 11 classes), (b) the intrinsic scatter of the EW distributions
    within the examined spectral classes, and (c) the diversity of the
    training set since we use data obtained from different surveys with
    different observing strategies. In addition, the approach presented in
    this work is applicable to products from different surveys in terms of
    quality (e.g., different resolution) and different formats (e.g.,
    absolute or normalized flux), while our classifier is agnostic to the
    luminosity class of a star, and, as much as possible, it is metallicity
    independent.

Description:
    Spectral type classification of stars in the IACOB project (Simon-Diaz
    et al., 2015hsa8.conf..576S) based on the random forest classifier
    developed in this work (Kyritsis et al., 2022, arXiv: 2110.10669). For
    comparison we also tabulate the spectral types provided from the IACOB
    database.

File Summary:
--------------------------------------------------------------------------------
 FileName    Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe          80        .   This file
table6.dat     119      328   Spectral type classification of stars in the
                               IACOB survey
--------------------------------------------------------------------------------

Byte-by-byte Description of file: table6.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label       Explanations
--------------------------------------------------------------------------------
   1-  9  A9    ---     ID          Object's ID in the IACOB database
  11- 12  A2    ---     IACOBSpType Published spectral type
  14- 18  A5    ---     RFSpType    Predicted spectral class based on the RF
                                     method
  20- 25  F6.4  ---     Prob        Probability of predicted spectral class
  27- 42  A16   ---     ConfLevel   Confidence level flag (see Sec. 5.1 of the
                                     paper)
  44- 49  F6.4  ---     B0          Probability for class B0
  51- 56  F6.4  ---     B1          Probability for class B1
  58- 63  F6.4  ---     B2          Probability for class B2
  65- 70  F6.4  ---     B3-B4       Probability for class B3-B4
  72- 77  F6.4  ---     B5-B7       Probability for class B5-B7
  79- 84  F6.4  ---     B8          Probability for class B8
  86- 91  F6.4  ---     B9          Probability for class B9
  93- 98  F6.4  ---     O2-O6       Probability for class O2-O6
 100-105  F6.4  ---     O7          Probability for class O7
 107-112  F6.4  ---     O8          Probability for class O8
 114-119  F6.4  ---     O9          Probability for class O9
--------------------------------------------------------------------------------

Acknowledgements:
    Elias Kyritsis, ekyritsis(at)physics.uoc.gr

References:
    Simon-Diaz et al., 2015hsa8.conf..576S, The IACOB spectroscopic database:
        recent updates and first data release
(End) Patricia Vannier [CDS] 30-Dec-2021
The document above follows the rules of the Standard Description for Astronomical Catalogues; from this documentation it is possible to generate an f77 program to load the files into arrays or line by line.
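
The byte-by-byte description of table6.dat can also be applied directly in a short script. The following is a minimal Python sketch, not a tool distributed with the catalogue: the names COLUMNS, parse_line and read_table6 are hypothetical, the default path "table6.dat" assumes the file has been downloaded to the working directory, and treating blank numeric fields as missing values (None) is an assumption of this example.

```python
from typing import Dict, Iterator, Union

Value = Union[str, float, None]

# (label, first byte, last byte, is_float); bytes are 1-indexed and inclusive,
# transcribed from the byte-by-byte description of table6.dat.
COLUMNS = [
    ("ID", 1, 9, False),
    ("IACOBSpType", 11, 12, False),
    ("RFSpType", 14, 18, False),
    ("Prob", 20, 25, True),
    ("ConfLevel", 27, 42, False),
    ("B0", 44, 49, True),
    ("B1", 51, 56, True),
    ("B2", 58, 63, True),
    ("B3-B4", 65, 70, True),
    ("B5-B7", 72, 77, True),
    ("B8", 79, 84, True),
    ("B9", 86, 91, True),
    ("O2-O6", 93, 98, True),
    ("O7", 100, 105, True),
    ("O8", 107, 112, True),
    ("O9", 114, 119, True),
]


def parse_line(line: str) -> Dict[str, Value]:
    """Split one fixed-width record into a dict keyed by column label."""
    record: Dict[str, Value] = {}
    for label, first, last, is_float in COLUMNS:
        # Convert the 1-indexed inclusive byte range to a Python slice.
        field = line[first - 1:last].strip()
        if is_float:
            # Assumption of this sketch: a blank numeric field means "missing".
            record[label] = float(field) if field else None
        else:
            record[label] = field
    return record


def read_table6(path: str = "table6.dat") -> Iterator[Dict[str, Value]]:
    """Yield one parsed record per non-empty line of the table."""
    with open(path, encoding="ascii") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line.rstrip("\n"))
```

A typical use would be `rows = list(read_table6())`, followed by, for example, comparing `row["IACOBSpType"]` with `row["RFSpType"]` for each row. Readers that understand the CDS ReadMe format natively, if available, avoid transcribing the byte ranges by hand.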