J/MNRAS/517/5325 Star ages with ML and GALAH DR3 spectra (Hayden+, 2022)
The GALAH survey chemical clocks.
Hayden M.R., Sharma S., Bland-Hawthorn J., Spina L., Buder S., Ciuca I.,
Asplund M., Casey A.R., De Silva G.M., D'Orazi V., Freeman K.C., Kos J.,
Lewis G.F., Lin J., Lind K., Martell S.L., Schlesinger K.J., Simpson J.D.,
Zucker D.B., Zwitter T., Chen B., Cotar K., Feuillet D., Horner J.,
Joyce M., Nordlander T., Stello D., Tepper-Garcia T., Ting Y.-S., Wang P.,
Wittenmyer R., Wyse R.
<Mon. Not. R. Astron. Soc. 517, 5325-5339 (2022)>
=2022MNRAS.517.5325H (SIMBAD/NED BibCode)
ADC_Keywords: Milky Way ; Stars, dwarfs ; Stars, giant ; Spectroscopy ;
Optical ; Positional data ; Abundances ; Stars, ages
Keywords: Galaxy: abundances - Galaxy: kinematics and dynamics -
Galaxy: stellar content - Galaxy: structure
Abstract:
We present the first large-scale study that demonstrates how ages can
be determined for large samples of stars through Galactic chemical
evolution. Previous studies found that the elemental abundances of a
star correlate directly with its age and metallicity. Using this
knowledge, we derive ages for 214577 stars in GALAH DR3 using only
overall metallicities and chemical abundances. Stellar ages are
estimated via the machine learning algorithm XGBoost for stars
belonging to the Milky Way disc with metallicities in the range
-1 < [Fe/H] < 0.5, using main-sequence turn-off stars as our training
set. We find that stellar ages for the bulk of GALAH DR3 are precise
to 1-2 Gyr using this method. With these ages, we replicate many
recent results on the age-kinematic trends of the nearby disc,
including the solar neighbourhood's age-velocity dispersion
relationship and the larger global velocity dispersion relations of
the disc found using Gaia and GALAH. These results show that chemical
abundance variations at a given birth radius are small, and that
strong chemical tagging of stars directly to birth clusters may prove
difficult with our current elemental abundance precision. Our results
highlight the need to measure abundances for as many nucleosynthetic
production sites as possible in order to estimate reliable ages from
chemistry. Our methods open a new door into studies of the kinematic
structure and evolution of the disc, as ages may potentially be
estimated to a precision of 1-2 Gyr for a large fraction of stars in
existing spectroscopic surveys.
Description:
As explained in the introduction, we estimate the ages of several
hundred thousand stars directly from their chemical abundances as
measured in GALAH DR3 (Buder et al. 2021MNRAS.506..150B, Cat.
J/MNRAS/506/150). We use main-sequence turn-off (MSTO) stars as a
training set for Bayesian and machine learning models for age
estimation, and attempt to estimate ages for stars across the H-R
diagram based on chemical abundances alone. GALAH DR3 provides
abundance determinations for nearly 30 elements, allowing us to be
selective and choose elements that are well estimated for a large
fraction of the stellar sample while also covering the different
nucleosynthetic production sites in the Milky Way.
As mentioned in section 2, we mainly use GALAH DR3 spectroscopic data
(supplemented with K2-HERMES and TESS-HERMES survey data) obtained
with the HERMES spectrograph, which covers four wavelength ranges
(4713-4903Å, 5648-5873Å, 6478-6737Å, and 7585-7887Å). Stellar
atmospheric parameters and individual abundances are derived using
the SME code. In this analysis, we use a selection of 13
well-measured elemental abundances from GALAH DR3: [Fe/H], [Mg/Fe],
[Ca/Fe], [Ti I/Fe], [Si/Fe], [O/Fe], [Mn/Fe], [Cr/Fe], [Na/Fe],
[K/Fe], [Y/Fe], [Ba/Fe], and [Sc/Fe], which span a range of
nucleosynthetic production sites. For the data selection, we apply
quality cuts on spectral SNR, stellar parameter and abundance flags,
Teff, [Fe/H], and other quantities (see the criteria table in section
2). This yields a sample of 155519 dwarf stars and 92645 giant stars
for which we determine ages.
For the training set we use MSTO and subgiant stars, with ages
computed with the code bstep, which gives Bayesian estimates of
intrinsic stellar and astrometric parameters. In addition to the
abundance and stellar parameter quality conditions applied to the
rest of the GALAH sample, we apply additional criteria to the MSTO
training set on logg, SNR, age τ and its uncertainty σ_τ (see the
criteria in section 2). As reported in section 3, ages are then
estimated with a Bayesian model (section 3.1) and finally with the
XGBoost gradient boosting algorithm (section 3.2) for the whole
sample of 214577 stars, using training and test sets. The XGBoost age
results are presented in table.dat.
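The training and test procedure summarised above (Note (2) of the
byte-by-byte description quotes a 70/30 split and fivefold cross
validation) can be illustrated with a short Python sketch. It is not
the authors' code: the function names, the random seed and the integer
stand-ins for star identifiers are hypothetical, the XGBoost fit
itself is omitted, and the actual split used in the paper is the one
recorded in the Test and Train flags of table.dat.

```python
import random


def split_train_test(star_ids, train_fraction=0.7, seed=0):
    """Shuffle the identifiers and cut them into training and test lists."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    shuffled = list(star_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def k_folds(items, k=5):
    """Yield (fit, validate) lists for k-fold cross validation."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield fit, validate


# Hypothetical stand-ins for MSTO star identifiers.
ids = list(range(100))
train, test = split_train_test(ids)
folds = list(k_folds(train, k=5))
```

Each pair in `folds` would be used to fit a candidate model on `fit`
and score it on `validate`; the held-out `test` list is kept aside for
the final check against overfitting.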
File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
table.dat        109   214577   Chemical clock ages estimated with ML XGBoost
                                  algorithm for our selected Milky Way stars
--------------------------------------------------------------------------------
See also:
J/MNRAS/506/150 : The GALAH+ Survey DR3 (Buder+, 2021)
J/MNRAS/503/3279 : GALAH survey. Galactic disc with open clusters (Spina+,2021)
J/MNRAS/489/176 : Dynamical heating across the Milky Way disc
(Mackereth+, 2019)
J/MNRAS/473/2004 : TESS-HERMES Survey Data Release 1 catalog (Sharma+, 2018)
J/MNRAS/475/5487 : Stellar properties of KIC stars (Silva Aguirre+, 2018)
J/MNRAS/456/3655 : Masses and ages of red giants (Martig+, 2016)
J/MNRAS/474/2580 : Temporal evolution of neutron-capture elements (Spina+,2018)
J/A+A/645/A85 : Age dissection of the Milky Way discs (Miglio+, 2021)
J/A+A/640/A81 : Abundances of 72 solar-type stars (Nissen+, 2020)
J/A+A/639/A127 : Age-chemical-clocks-metallicity relations (Casali+, 2020)
J/A+A/627/A117 : Equivalent widths for six M67 stars (Liu+, 2019)
J/A+A/562/A71 : Chemical abundances of solar neighbourhood dwarfs
(Bensby+, 2014)
J/A+A/530/A138 : Geneva-Copenhagen survey re-analysis (Casagrande+, 2011)
J/ApJ/887/114 : VLA ∼8-10GHz obs. of WISE Galactic HII regions
(Wenger+, 2019)
J/ApJ/887/80 : Gas phase oxygen abundances for HII regions (Kreckel+, 2019)
J/ApJ/865/68 : Abundances for 79 Sun-like stars within 100pc (Bedell+,2018)
J/ApJ/858/28 : Mixing-length parameter for a sample of KIC stars
(Viani+, 2018)
J/ApJ/823/114 : The Cannon: a new approach to determine masses (Ness+, 2016)
J/ApJ/817/40 : High-resolution NIR spectra of local giants (Feuillet+,2016)
J/ApJ/808/16 : The Cannon: a new approach to determine abundances
(Ness+, 2015)
J/ApJ/722/1373 : ω Centauri giants abundances (Johnson+, 2010)
V/117 : Geneva-Copenhagen Survey of Solar neighbourhood
(Holmberg+, 2007)
III/284 : APOGEE-2 data from DR16 (Johnsson+, 2020)
I/345 : Gaia DR2 (Gaia Collaboration, 2018)
I/337 : Gaia DR1 (Gaia Collaboration, 2016)
Byte-by-byte Description of file: table.dat
--------------------------------------------------------------------------------
  Bytes Format Units Label      Explanations
--------------------------------------------------------------------------------
  1- 15 I15    ---   GALAH      GALAH identifier from GALAH+ DR3 of Buder et
                                  al. 2021MNRAS.506..150B, Cat. J/MNRAS/506/150
                                  (galah_id)
 17- 32 A16    ---   2MASS      2MASS identifier from 2MASS All-Sky Catalogue
                                  2003yCat.2246....0C, Cat. II/246 (2mass_id)
 34- 50 F17.14 Gyr   Age        Chemical clock age estimated with the XGBoost
                                  algorithm detailed in section 3.2; ages
                                  recommended for use (age_xgboost)
 52- 67 F16.14 Gyr   e_Age      Total combined age error from the XGBoost
                                  algorithm (ageerrorxgboost) (1)
     69 I1     ---   Test       Flag: 1 (true) for the 5190 stars selected for
                                  the test set used to verify XGBoost, 0 (false)
                                  otherwise (istestingstar) (2)
     71 I1     ---   Train      Flag: 1 (true) for the 12109 stars selected
                                  for the XGBoost training set, 0 (false)
                                  otherwise (istrainingstar) (2)
 73- 90 F18.15 Gyr   AgeS2021   Chemical clock age derived by Sharma et al.
                                  (2021MNRAS.506.1761S) using Bayes' theorem;
                                  not recommended for use (agesharma2021bayes)
 92-109 F18.15 Gyr   e_AgeS2021 Error on the chemical clock age AgeS2021;
                                  not recommended for use
                                  (ageerrorsharma2021_bayes)
--------------------------------------------------------------------------------
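The byte ranges above map directly onto a fixed-width parser. The
following minimal Python sketch is not part of the original catalogue:
the names FIELDS and parse_record are hypothetical, treating blank
fields as None is an assumption, and the sample record is synthetic,
built only to match the column widths.

```python
# Byte ranges (1-indexed, inclusive) copied from the byte-by-byte description.
FIELDS = [
    ("GALAH", 1, 15, int),
    ("2MASS", 17, 32, str),
    ("Age", 34, 50, float),
    ("e_Age", 52, 67, float),
    ("Test", 69, 69, int),
    ("Train", 71, 71, int),
    ("AgeS2021", 73, 90, float),
    ("e_AgeS2021", 92, 109, float),
]


def parse_record(line):
    """Split one 109-byte record of table.dat into a dict of typed values.

    Blank fields are returned as None (an assumption of this sketch).
    """
    record = {}
    for label, start, end, cast in FIELDS:
        raw = line[start - 1:end].strip()
        record[label] = cast(raw) if raw else None
    return record


# Synthetic record with made-up values, laid out in the formats
# I15, A16, F17.14, F16.14, I1, I1, F18.15 and F18.15.
sample = (
    f"{123456789012345:15d} {'01234567-0123456':<16s} "
    f"{5.25:17.14f} {1.5:16.14f} {0:1d} {1:1d} "
    f"{6.0:18.15f} {2.0:18.15f}"
)
record = parse_record(sample)
```

To read the real file, apply parse_record to each line with the
trailing newline removed; the Train and Test flags can then be used to
recover the training and test subsets.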
Note (1): This is calculated using a Monte Carlo of the observational errors
for each star, combined in quadrature with the systematic errors
intrinsic to the method (see the bottom panel of figure 8 in section
3.2, which shows the random age errors due to abundance
uncertainties as a function of SNR).
Note (2): As mentioned in section 3.2, we split our initial sample of 15424
MSTO stars into a training set (70 per cent of the original MSTO
sample) and a test set (30 per cent of the MSTO sample). The training
set uses the 13 chemical abundances described in the data section as
input, along with the desired output parameter of age τ determined
from MSTO isochrone fitting. We ran a fivefold cross-validation grid
search over several hundred thousand XGBoost hyperparameter
combinations to obtain the model that best reproduced the trends of
the test set while trying to minimize overfitting of the training
set.
--------------------------------------------------------------------------------
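The error combination described in Note (1) can be sketched in Python
as follows. This is an illustration under stated assumptions, not the
authors' implementation: the function name total_age_error is
hypothetical, the random error is assumed to be the sample standard
deviation of ages predicted from Monte Carlo perturbed abundances, and
the numbers are made up rather than taken from the catalogue.

```python
import math
import statistics


def total_age_error(mc_ages, systematic_error):
    """Combine Monte Carlo scatter and a systematic term in quadrature.

    mc_ages: ages (Gyr) predicted for one star from perturbed abundances.
    systematic_error: method-intrinsic error (Gyr) for that star.
    """
    # Assumption: the random error is the sample standard deviation.
    random_error = statistics.stdev(mc_ages)
    # Quadrature sum: sqrt(random_error**2 + systematic_error**2).
    return math.hypot(random_error, systematic_error)


# Made-up example: three Monte Carlo ages and a 0.5 Gyr systematic term.
example = total_age_error([4.0, 5.0, 6.0], 0.5)
```

Because the two terms add in quadrature, the larger of the two
dominates the total; the quoted e_Age can never be smaller than either
component.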
History:
From electronic version of the journal
(End) Luc Trabelsi [CDS] 21-Oct-2025