J/A+A/704/A155 Massive stars in the Milky Way (Monsalves+, 2025)
Astrophotometric search for massive stars in the Milky Way confronting random
forest predictions with available spectroscopy.
Monsalves N., Bayo A., Jaque Arancibia M., Bodensteiner J.,
Guerrero Caneppa A., Sanchez-Saez P., Angeloni R.
<Astron. Astrophys. 704, A155 (2025)>
=2025A&A...704A.155M (SIMBAD/NED BibCode)
ADC_Keywords: Stars, early-type ; Surveys ; Photometry
Keywords: methods: data analysis - methods: statistical -
techniques: spectroscopic - catalogs; surveys - stars: early-type
Abstract:
Massive stars play a significant role in different branches of
astronomy, from shaping the processes of star and planet formation to
influencing the evolution and chemical enrichment of the distant
universe. Despite their high astrophysical significance, these objects
are rare and difficult to detect. With Gaia's advent, we now possess
extensive kinematic and photometric data for a significant portion of
the Galaxy that can unveil, among other things, new populations of massive
star candidates. In order to produce bona fide bright (G magnitude <12)
massive-star candidate lists (threshold set to spectral type B2 or
earlier but with slight changes in this threshold also explored) in
the Milky Way to be followed up by future massive
spectroscopic surveys, we developed a Gaia DR3 plus literature data
based methodology. We trained a balanced random forest (BRF) with the
spectral types from the Skiff compilation as labels. Our approach
yields a completeness of ∼80% and a purity ranging from 0.51±0.02
for probabilities between 0.6 and 0.7, up to 0.85±0.05 for the
0.9-1.0 range. To externally validate our methodology, we searched for
and analyzed archival spectra of moderate- to high-probability (p>0.6)
candidates that are not contained in our catalog of labels. Our
independent spectral validation confirms the expected performance of
the BRF, spectroscopically classifying 300 stars as B3 or earlier (due
to observational constraints imposed on the B0-3 range), including 107
new stars. Based on the most conservative yields of our methodology,
our candidate list could increase the number of bright massive stars
by ∼50%. As a byproduct, we developed an automatic methodology for
spectral typing optimized for LAMOST spectra, based on line detection
and characterization that guides a decision path.
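
The completeness and purity figures quoted above can be read in terms of
true positives (TP), false positives (FP) and false negatives (FN). The
following is a minimal Python sketch, assuming the standard definitions
(the paper's exact evaluation procedure is not described in this file).
The function names and all counts are illustrative placeholders; only the
two purity values 0.51 and 0.85 come from the abstract.

```python
def purity(tp, fp):
    """Fraction of selected candidates that are genuine: TP / (TP + FP)."""
    return tp / (tp + fp)


def completeness(tp, fn):
    """Fraction of genuine massive stars that are recovered: TP / (TP + FN)."""
    return tp / (tp + fn)


def expected_confirmed(n_candidates, bin_purity):
    """Expected number of genuine massive stars among n_candidates."""
    return n_candidates * bin_purity


# Placeholder counts, chosen only to exercise the functions.
demo_purity = purity(tp=85, fp=15)
demo_completeness = completeness(tp=80, fn=20)

# Purity values for the two probability bins quoted in the abstract;
# the candidate count per bin (1000) is hypothetical.
low_bin = expected_confirmed(1000, 0.51)   # probabilities 0.6 to 0.7
high_bin = expected_confirmed(1000, 0.85)  # probabilities 0.9 to 1.0
print(demo_purity, demo_completeness, low_bin, high_bin)
```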
Description:
Catalog of bright massive star candidates in the Milky Way (G<12),
identified using a Balanced Random Forest classifier trained on
literature spectral types and validated with available spectroscopy.
This release provides:
(1) candidate massive stars predicted by the classifier,
(2) the parent magnitude-limited sample used for model training and
evaluation,
(3) auxiliary codification used during preprocessing and encoding.
The catalog combines Gaia DR3 astrometry and photometry with 2MASS
near-infrared photometry and literature spectral classifications to
identify probable early-type massive stars suitable for spectroscopic
follow-up.
File Summary:
--------------------------------------------------------------------------------
FileName Lrecl Records Explanations
--------------------------------------------------------------------------------
ReadMe 80 . This file
candms.dat 411 9089 Massive star candidates, excluded from
training labels
g12par.dat 285 99387 Parent catalog used as training label source
auxcode.dat 458 99387 Auxiliary spectral type codification of
the parent catalog
--------------------------------------------------------------------------------
See also:
I/355 : Gaia DR3 Part 1. Main source (Gaia Collaboration, 2022)
Byte-by-byte Description of file: candms.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 19 I19 --- GaiaDR3 Gaia DR3 identifier
21- 40 F20.16 deg RAdeg Right ascension (ICRS) at Ep=2016.0
42- 61 F20.16 deg DEdeg Declination (ICRS) at Ep=2016.0
63- 82 A20 --- skiff-type First Skiff spectral type
84-271 A188 --- skiff-type-list All Skiff spectral types
273-287 A15 --- sp-Li2021 Li (2021) spectral type
289-341 A53 --- sp-ALSII ALS II spectral type
343-346 A4 --- sp-GOSSS GOSSS spectral type
348-355 A8 --- sp-Chini2012 Chini et al. (2012) spectral type
357-375 A19 --- sp-IACOB IACOB spectral type
377-395 A19 --- sp-simbad SIMBAD spectral type
397-401 A5 --- sp-Monsalves2025 Spectral type from this work
403-407 F5.3 --- B2-cut-prob BRF probability for the B2 threshold
409 I1 --- news [0/1] New candidate flag
411 I1 --- OB [0/1] OB flag
--------------------------------------------------------------------------------
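
The byte ranges in the table above are 1-indexed and inclusive, so a
record of candms.dat can be sliced directly. The following is a minimal
Python sketch using only the standard library; the helper names
(parse_record, make_demo_line) are illustrative, and the demo record is
made up to exercise the parser. It is NOT a real catalog row.

```python
# (label, first byte, last byte, type) copied from the candms.dat table.
CANDMS_FIELDS = [
    ("GaiaDR3", 1, 19, int),
    ("RAdeg", 21, 40, float),
    ("DEdeg", 42, 61, float),
    ("skiff-type", 63, 82, str),
    ("skiff-type-list", 84, 271, str),
    ("sp-Li2021", 273, 287, str),
    ("sp-ALSII", 289, 341, str),
    ("sp-GOSSS", 343, 346, str),
    ("sp-Chini2012", 348, 355, str),
    ("sp-IACOB", 357, 375, str),
    ("sp-simbad", 377, 395, str),
    ("sp-Monsalves2025", 397, 401, str),
    ("B2-cut-prob", 403, 407, float),
    ("news", 409, 409, int),
    ("OB", 411, 411, int),
]


def parse_record(line, fields=CANDMS_FIELDS):
    """Slice one fixed-width record; blank fields become None."""
    record = {}
    for label, start, end, cast in fields:
        raw = line[start - 1:end].strip()
        record[label] = cast(raw) if raw else None
    return record


def make_demo_line(values, fields=CANDMS_FIELDS, width=411):
    """Build a synthetic fixed-width record from a {label: text} mapping."""
    buf = [" "] * width
    for label, start, end, _cast in fields:
        text = values.get(label, "")
        buf[start - 1:end] = text.ljust(end - start + 1)
    return "".join(buf)


# Made-up values for illustration only (not taken from the catalog).
demo_values = {
    "GaiaDR3": "1234567890123456789",
    "RAdeg": "123.4567890123456789",
    "DEdeg": "-12.3456789012345678",
    "sp-Monsalves2025": "B1",
    "B2-cut-prob": "0.873",
    "news": "1",
    "OB": "0",
}
demo_record = parse_record(make_demo_line(demo_values))
print(demo_record["GaiaDR3"], demo_record["B2-cut-prob"])
```

Spectral-type columns left blank in a record come back as None, which
keeps missing literature classifications distinct from empty strings.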
Byte-by-byte Description of file: g12par.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 19 I19 --- GaiaDR3 Gaia DR3 identifier
21- 25 I5 --- label ?=-9999 Training label
27- 31 A5 --- B2-cut-split Training split
33- 37 F5.3 --- B2-cut-prob BRF probability for the B2 threshold
39- 45 F7.4 mas plx Gaia parallax
47- 55 F9.4 --- RPlx Parallax over error
57- 61 F5.2 mag Jmag 2MASS J magnitude
63- 67 F5.2 mag Hmag 2MASS H magnitude
69- 73 F5.2 mag Ksmag 2MASS Ks magnitude
75- 82 F8.4 mas/yr pmRA Proper motion in RA
84- 88 F5.2 mag RPmag Gaia RP magnitude
90- 97 F8.4 mas/yr pmDE Proper motion in DEC
99-104 F6.3 mag G-RP G-RP Gaia colour index
106-111 F6.3 mag BP-RP BP-RP Gaia colour index
113-117 F5.2 mag Gmag Gaia G magnitude
119-123 F5.2 mag BPmag Gaia BP magnitude
125-130 F6.4 mas e_plx Parallax error
132-137 F6.3 mag G-BP G-BP Gaia colour index
139-144 F6.3 mag J-Ks J-Ks Colour index
146-151 F6.4 mas/yr e_pmRA Proper motion RA error
153-158 F6.3 mag Ks-RP Ks-RP Colour index
160-165 F6.3 mag H-Ks H-Ks Colour index
167-172 F6.4 --- Sig5dmax Gaia astrometric_sigma5d_max
174-180 F7.4 --- RUWE Gaia renormalised unit weight error (ruwe)
182-190 F9.4 --- gofAL Gaia astrometric_gof_al (goodness of fit)
192-197 F6.3 mag H-RP H-RP Colour index
199-204 F6.4 --- IPDgofha Gaia ipd_gof_harmonic_amplitude
206-215 F10.4 --- sepsi Gaia astrometric_excess_noise_sig
217-222 F6.3 mag J-RP J-RP Colour index
224-229 F6.3 mag J-BP J-BP Colour index
231-236 F6.4 --- epsi Gaia astrometric_excess_noise
238-243 F6.3 mag J-H J-H Colour index
245-250 F6.4 mas/yr e_pmDE Proper motion DEC error
252-257 F6.3 mag H-BP H-BP Colour index
259-264 F6.3 mag J-G J-G Colour index
266-271 F6.3 mag Ks-BP Ks-BP Colour index
273-278 F6.3 mag Ks-G Ks-G Colour index
280-285 F6.3 mag H-G H-G Colour index
--------------------------------------------------------------------------------
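
The "?=-9999" note on the label column marks a null value, which should
be mapped to a missing value before use. The following is a minimal
Python sketch that reads only the first four columns (bytes 1-37) of a
g12par.dat record and selects unlabeled sources above a probability
threshold. The selection rule (no label and p > 0.6) is an assumption
based on the abstract, not a documented recipe. The function names and
demo rows are made up, and the values taken by B2-cut-split are not
listed in this file, so that field is left blank in the demo.

```python
NULL_LABEL = -9999  # null marker declared as "?=-9999" for the label column


def parse_g12par_head(line):
    """Parse bytes 1-37 of a g12par.dat record (first four columns only)."""
    label = int(line[20:25])
    split = line[26:31].strip()
    return {
        "GaiaDR3": int(line[0:19]),
        "label": None if label == NULL_LABEL else label,
        "B2-cut-split": split or None,
        "B2-cut-prob": float(line[32:37]),
    }


def select_candidates(records, p_min=0.6):
    """Keep unlabeled sources whose BRF probability exceeds p_min."""
    return [
        r for r in records
        if r["label"] is None and r["B2-cut-prob"] > p_min
    ]


# Synthetic rows (NOT catalog data), truncated to the first 37 bytes.
# The label value 1 is a placeholder; its meaning is not defined here.
demo_rows = [
    f"{1000000000000000001:>19} {-9999:>5} {'':<5} {0.912:5.3f}",
    f"{1000000000000000002:>19} {1:>5} {'':<5} {0.955:5.3f}",
    f"{1000000000000000003:>19} {-9999:>5} {'':<5} {0.120:5.3f}",
]
demo_records = [parse_g12par_head(row) for row in demo_rows]
demo_selected = select_candidates(demo_records)
print([r["GaiaDR3"] for r in demo_selected])
```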
Byte-by-byte Description of file: auxcode.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 19 I19 --- GaiaDR3 Gaia DR3 source identifier
21- 40 F20.16 deg RAdeg Right ascension (ICRS) at Ep=2016.0
42- 61 F20.16 deg DEdeg Declination (ICRS) at Ep=2016.0
63- 67 I5 --- sk-type-mean ?=-9999 Skiff type code mean
70- 78 E9.4 --- sk-type-sd ?=1e+20 Skiff type code stddev
80-315 A236 --- sk-type-list Skiff type list
318-324 E7.3 --- sk-lum-mean ?=1e+20 Skiff luminosity code mean
326-334 E9.4 --- sk-lum-sd ?=1e+20 Skiff luminosity code stddev
336-337 I2 --- sk-bin [0/10] Skiff binary flag
339 I1 --- sk-BpHgMn [0/5] Skiff Bp/HgMn flag
341 I1 --- sk-C [0/2] Skiff C-type flag
343 I1 --- sk-OB [0/4] Skiff OB flag
345 I1 --- sk-OBe [0/2] Skiff OBe flag
347 I1 --- sk-OIafpe [0/1] Skiff OIafpe flag
349 I1 --- sk-OfWN [0/1] Skiff Of/WN flag
351 I1 --- sk-Ofp [0/4] Skiff Of?p flag
353 I1 --- sk-PN [0/1] Skiff PN flag
355 I1 --- sk-S [0] Skiff S-type flag
357 I1 --- sk-WD [0] Skiff white dwarf flag
359-360 I2 --- sk-WR [0/10] Skiff WR flag
362 I1 --- sk-no-sub [0/5] No subtype flag
364 I1 --- sk-sub-rng [0/4] Subtype range
366-367 I2 --- sk-unc-type [0/11] Uncodifiable type
369-370 I2 --- sk-no-lum [0/15] No luminosity class
372 I1 --- sk-lum-rng [0/3] Luminosity range
374 I1 --- sk-lum-rng-nc [0/2] Non-consec. range
376 I1 --- sk-lum-rng-unc [0] Uncertain luminosity range
378 I1 --- sk-unc-lum [0/2] Uncodifiable luminosity class
380 I1 --- sk-pec-broad [0/9] Peculiarity broadening
382-383 I2 --- sk-pec-em [0/13] Peculiarity emission
385 I1 --- sk-pec-misc [0/7] Peculiarity misc
387 I1 --- sk-pec-NC [0/4] Peculiarity NC
389 I1 --- sk-pec-unc [0/5] Uncertain peculiarity
391-399 E9.4 --- sb-type-code ?=1e+20 SIMBAD type code
401-408 E8.4 --- sb-lum-code ?=1e+20 SIMBAD luminosity code
410 I1 --- sb-bin [0/1] SIMBAD binary flag
412 I1 --- sb-BpHgMn [0/1] SIMBAD Bp/HgMn flag
414 I1 --- sb-C [0/1] SIMBAD C-type flag
416 I1 --- sb-OB [0/1] SIMBAD OB flag
418 I1 --- sb-OBe [0/1] SIMBAD OBe flag
420 I1 --- sb-OIafpe [0/1] SIMBAD OIafpe flag
422 I1 --- sb-OfWN [0/1] SIMBAD Of/WN flag
424 I1 --- sb-Ofp [0/1] SIMBAD Of?p flag
426 I1 --- sb-PN [0/1] SIMBAD PN flag
428 I1 --- sb-S [0/1] SIMBAD S-type flag
430 I1 --- sb-WD [0/1] SIMBAD white dwarf flag
432 I1 --- sb-WR [0/1] SIMBAD WR flag
434 I1 --- sb-no-sub [0/1] No subtype flag
436 I1 --- sb-sub-rng [0/1] Subtype range
438 I1 --- sb-unc-type [0/1] Uncodifiable type
440 I1 --- sb-no-lum [0/1] No luminosity class
442 I1 --- sb-lum-rng [0/1] Luminosity range
444 I1 --- sb-lum-rng-nc [0/1] Non-consec. range
446 I1 --- sb-lum-rng-unc [0/1] Uncertain luminosity range
448 I1 --- sb-unc-lum [0/1] Uncodifiable luminosity class
450 I1 --- sb-pec-broad [0/1] Peculiarity broadening
452 I1 --- sb-pec-em [0/1] Peculiarity emission
454 I1 --- sb-pec-misc [0/1] Peculiarity misc
456 I1 --- sb-pec-NC [0/1] Peculiarity NC
458 I1 --- sb-pec-unc [0/1] Uncertain peculiarity
--------------------------------------------------------------------------------
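
Several auxcode.dat columns declare null markers ("?=-9999" for integer
codes, "?=1e+20" for E-format codes). The following is a minimal Python
sketch of a field reader that maps those markers to None. The helper
names (read_field, place) are illustrative, and the demo record is
synthetic: its values, including the sb-type-code of 2.0, are made up
and carry no catalog meaning.

```python
def read_field(line, start, end, cast, null=None):
    """Read bytes start..end (1-indexed, inclusive); nulls become None."""
    raw = line[start - 1:end].strip()
    if not raw:
        return None
    value = cast(raw)
    if null is not None and value == null:
        return None
    return value


def place(buf, start, text):
    """Write text into a character buffer starting at 1-indexed byte start."""
    buf[start - 1:start - 1 + len(text)] = text


# Build a synthetic 458-byte record (NOT a real catalog row).
buf = [" "] * 458
place(buf, 1, "1234567890123456789")  # GaiaDR3, bytes 1-19
place(buf, 63, "-9999")               # sk-type-mean, null marker
place(buf, 70, "1e+20")               # sk-type-sd, null marker
place(buf, 343, "1")                  # sk-OB flag
place(buf, 391, "2.000e+00")          # sb-type-code, placeholder value
demo_line = "".join(buf)

sk_type_mean = read_field(demo_line, 63, 67, int, null=-9999)
sk_type_sd = read_field(demo_line, 70, 78, float, null=1e20)
sk_ob = read_field(demo_line, 343, 343, int)
sb_type_code = read_field(demo_line, 391, 399, float, null=1e20)
print(sk_type_mean, sk_type_sd, sk_ob, sb_type_code)
```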
Acknowledgements:
Nicolas Monsalves, nicolas.monsalves(at)userena.cl
(End) Patricia Vannier [CDS] 02-Mar-2026