V/160                  SHBoost 2024                           (Khalatyan+, 2024)

Transferring spectroscopic stellar labels to 217 million Gaia DR3 XP stars
with SHBoost.
    Khalatyan A., Anders F., Chiappini C., Queiroz A.B.A., Nepal S.,
    dal Ponte M., Jordi C., Guiglion G., Valentini M., Torralba Elipe G.,
    Steinmetz M., Pantaleoni-Gonzalez M., Malhotra S., Jimenez-Arranz O.,
    Enke H., Casamiquela L., Ardevol J.
    <Astron. Astrophys. 691, A98 (2024)>
    =2024A&A...691A..98K    (SIMBAD/NED BibCode)
    =2024yCat.5160....0K
ADC_Keywords: Milky Way ; Stars, distances ; Effective temperatures ;
              Radial velocities ; Space velocities ; Abundances, [Fe/H]
Keywords: catalogs - stars: general - stars: statistics - Galaxy: general -
          Galaxy: stellar content - Galaxy: structure

Abstract:
    With Gaia Data Release 3 (DR3), new and improved astrometric,
    photometric, and spectroscopic measurements for 1.8 billion stars have
    become available. Alongside this wealth of new data, however, there are
    challenges in finding efficient and accurate computational methods for
    their analysis. In this paper, we explore the feasibility of using
    machine learning regression as a method of extracting basic stellar
    parameters and line-of-sight extinctions from spectro-photometric data.
    To this end, we built a stable gradient-boosted random-forest regressor
    (xgboost), trained on spectroscopic data, capable of producing output
    parameters with reliable uncertainties from Gaia DR3 data (most notably
    the low-resolution XP spectra), without ground-based spectroscopic
    observations. Using Shapley additive explanations, we interpret how the
    predictions for each star are influenced by each data feature. For the
    training and testing of the network, we used high-quality parameters
    obtained from the StarHorse code for a sample of around eight million
    stars observed by major spectroscopic stellar surveys, complemented by
    curated samples of hot stars, very metal-poor stars, white dwarfs, and
    hot sub-dwarfs. The training data cover the whole sky, all Galactic
    components, and almost the full magnitude range of the Gaia DR3 XP
    sample of more than 217 million objects that also have reported
    parallaxes. We have achieved median uncertainties of 0.20mag in V-band
    extinction, 0.01dex in logarithmic effective temperature, 0.20dex in
    surface gravity, 0.18dex in metallicity, and 12% in mass (over the full
    Gaia DR3 XP sample, with considerable variations in precision as a
    function of magnitude and stellar type).
    We succeeded in predicting competitive results based on Gaia DR3 XP
    spectra compared to classical isochrone or spectral-energy distribution
    fitting methods we employed in earlier works, especially for parameters
    AV and Teff, along with the metallicity values. Finally, we showcase
    some potential applications of this new catalogue, including extinction
    maps, metallicity trends in the Milky Way, and extended maps of young
    massive stars, metal-poor stars, and metal-rich stars.

Description:
    We use an xgboost regression to produce a catalogue of stellar
    properties derived from Gaia DR3 XP spectra, astrometry, and
    multi-wavelength photometry. This catalogue, referred to as SHBoost,
    comprises the extinction, effective temperature, surface gravity,
    [M/H], and mass estimates for more than 217 million stars.

See also:
    I/352 : Distances to 1.47 billion stars in Gaia EDR3 (Bailer-Jones+, 2021)

File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
shboost.sam      390     1000   Data model of the Gaia DR3 SHBoost catalogue
--------------------------------------------------------------------------------

Byte-by-byte Description of file: shboost.sam
--------------------------------------------------------------------------------
   Bytes Format Units   Label         Explanations
--------------------------------------------------------------------------------
   1- 19 I19    ---     GaiaDR3       Gaia DR3 source_id (source_id)
  21- 35 F15.11 deg     RAdeg         Right ascension (ICRS) at Ep=2016.0 (ra)
  37- 51 F15.11 deg     DEdeg         Declination (ICRS) at Ep=2016.0 (dec)
  53- 61 F9.6   mag     AV            Line-of-sight extinction at λ=5420Å, AV,
                                        xgboost point estimate (xgb_av)
  63- 71 F9.6   mag     AVmean        Line-of-sight extinction at λ=5420Å, AV,
                                        xgboost-distribution mean value
                                        (xgbdistavmean)
  73- 83 F11.6  mag     s_AVmean      Line-of-sight extinction at λ=5420Å, AV,
                                        xgboost-distribution standard
                                        deviation (xgbdistavstd)
  85- 92 F8.6   [K]     logTeff       Effective temperature, xgboost point
                                        estimate (xgb_logteff)
  94-101 F8.6   [K]     logTeffmean   Effective temperature,
                                        xgboost-distribution mean value
                                        (xgbdistlogteffmean)
 103-111 F9.6   [K]     s_logTeffmean Effective temperature,
                                        xgboost-distribution standard
                                        deviation (xgbdistlogteffstd)
 113-121 F9.6   [cm/s2] logg          Surface gravity, xgboost point estimate
                                        (xgb_logg)
 123-131 F9.6   [cm/s2] loggmean      Surface gravity, xgboost-distribution
                                        mean value (xgbdistloggmean)
 133-142 F10.6  [cm/s2] s_loggmean    Surface gravity, xgboost-distribution
                                        standard deviation (xgbdistloggstd)
 144-152 F9.6   [-]     Met           Metallicity, xgboost point estimate
                                        (xgb_met)
 154-162 F9.6   [-]     Metmean       Metallicity, xgboost-distribution mean
                                        value (xgbdistmetmean)
 164-172 F9.6   [-]     s_Metmean     Metallicity, xgboost-distribution
                                        standard deviation (xgbdistmetstd)
 174-182 F9.6   Msun    Mass          Stellar mass, xgboost point estimate
                                        (xgb_mass)
 184-192 F9.6   Msun    Massmean      Stellar mass, xgboost-distribution mean
                                        value (xgbdistmassmean)
 194-205 E12.6  Msun    s_Massmean    Stellar mass, xgboost-distribution
                                        standard deviation (xgbdistmassstd)
 207-216 F10.6  pc      Dist          Distance estimate from the literature
                                        (dist)
 218-227 F10.6  pc      b_Dist        ? 16th distance percentile from the
                                        literature (dist_lower)
 229-238 F10.6  pc      B_Dist        ? 84th distance percentile from the
                                        literature (dist_upper)
     240 I1     ---     f_Dist        [0/2] Distance flag (dist_flag) (1)
 242-249 F8.5   mag     (BP-RP)0      ? Dereddened colour, derived with
                                        gaiaedr3photutils (bprp0)
 251-259 F9.5   mag     GMAG0         Absolute magnitude, derived with
                                        gaiaedr3photutils (mg0)
 261-270 F10.5  kpc     Xgal          Galactocentric Cartesian X coordinate,
                                        derived from dist and assuming
                                        X0 = -8.2 kpc (xg)
 272-281 F10.5  kpc     Ygal          Galactocentric Cartesian Y coordinate,
                                        derived from dist and assuming
                                        Y0 = 0 kpc (yg)
 283-292 F10.5  kpc     Zgal          Cartesian Z coordinate, derived from dist
                                        and assuming Z0 = 0 (zg)
 294-302 F9.5   kpc     Rgal          Galactocentric planar distance, derived
                                        from Xgal and Ygal (rg)
 304-313 F10.4  km/s    VX            ? Galactic Cartesian velocity in X
                                        direction (vxg)
 315-324 F10.4  km/s    VY            ? Galactic Cartesian velocity in Y
                                        direction (vyg)
 326-336 F11.4  km/s    VZ            ? Galactic Cartesian velocity in Z
                                        direction (vzg)
 338-347 F10.4  km/s    VR            ? Galactic radial velocity (vrg)
 349-358 F10.4  km/s    Vphi          ? Galactic angular velocity (vphig)
 360-380 A21    ---     InputFlag     SHBoost input flag (xgb_inputflag)
     382 I1     ---     q_AV          [0/1] AV output quality flag (=0 if
                                        xgbdistavstd<0.3) (xgbavoutputflag)
     384 I1     ---     q_logTeff     [0/1] logTeff output quality flag (=0 if
                                        xgbdistlogteffstd<0.1)
                                        (xgblogteffoutputflag)
     386 I1     ---     q_logg        [0/1] logg output quality flag (=0 if
                                        xgbdistloggstd<0.3) (xgbloggoutputflag)
     388 I1     ---     q_Met         [0/1] [M/H] output quality flag (=0 if
                                        xgbdistmetstd<0.3) (xgbmetoutputflag)
     390 I1     ---     q_Mass        [0/1] Mass output quality flag (=0 if
                                        xgbdistmassstd/xgbdistmassmean<0.3)
                                        (xgbmassoutputflag)
--------------------------------------------------------------------------------
Note (1): Distance flag as follows:
    0 = StarHorse EDR3, Anders et al., 2022A&A...658A..91A, Cat. I/354
    1 = Bailer-Jones et al., 2021AJ....161..147B, Cat. I/352, photogeo
    2 = Bailer-Jones et al., 2021AJ....161..147B, Cat. I/352, geo
--------------------------------------------------------------------------------

Acknowledgements:
    Friedrich Anders, fanders(at)fqa.ub.edu
(End) Francois-Xavier Pineau, Patricia Vannier [CDS] 07-Oct-2024
The document above follows the rules of the Standard Description for
Astronomical Catalogues; from this documentation it is possible to generate
an f77 program to load the files into arrays or line by line.