J/A+A/706/A341 Selection function Gaia DR3 open cluster census (Hunt+, 2026)
The selection function of the Gaia DR3 open cluster census.
Hunt E.L., Cantat-Gaudin T., Anders F., Malhotra S., Spina L.,
Castro-Ginard A., Cavallo L.
<Astron. Astrophys. 706, A341 (2026)>
=2026A&A...706A.341H (SIMBAD/NED BibCode)
ADC_Keywords: Milky Way ; Surveys ; Clusters, open ; Optical ; Models
Keywords: methods: data analysis - galaxy: disk - galaxy: evolution -
open clusters and associations: general
Abstract:
Open clusters are among the most useful and widespread tracers of
Galactic structure. The completeness of the Galactic open cluster
census, however, remains poorly understood. For the first time ever,
we establish the selection function of an entire open cluster census,
publishing our results as an open-source Python package for use by the
community. Our work is valid for the Hunt & Reffert
(2024A&A...686A..42H, Cat. J/A+A/686/A42) catalogue of clusters in
Gaia DR3. We developed and open-sourced our cluster simulator from our
first work. Then, we performed 80590 injections and retrievals of
simulated open clusters to test the Hunt & Reffert
(2024A&A...686A..42H, Cat. J/A+A/686/A42) catalogue's sensitivity. We
fit a logistic model of cluster detectability that depends only on a
cluster's number of stars, median parallax error, Gaia data density,
and a user-specified significance threshold. We find that our simple
model accurately predicts cluster detectability, with a 94.53%
accuracy on our training data that is comparable to a machine-learning
based model with orders of magnitude more parameters. Our model itself
offers numerous insights on why certain clusters are detected. We
briefly use our model to show that cluster detectability depends on
non-intuitive parameters, such as a cluster's proper motion, and we
show that even a modest 25 km/s boost to a cluster's orbital speed can
result in an almost 3 times higher detection probability, depending on
its position. In addition, we publish our raw cluster injections and
retrievals and cluster memberships, which could be used for a number
of other science cases -- such as estimating cluster membership
incompleteness. Using our results, selection effect-corrected studies
are now possible with the open cluster census. Our work will enable a
number of brand new types of study, such as detailed comparisons
between the Milky Way's cluster census and recent extragalactic
cluster samples.
Description:
In this work, we established the first ever global selection function
of the OC census, which is valid for the Hunt and Reffert
(2024A&A...686A..42H, Cat. J/A+A/686/A42) catalogue. Building on our
work in Paper I (Hunt and Reffert, 2021A&A...646A.104H, Cat.
J/A+A/646/A104), we created an accurate cluster simulator and
open-sourced it in the ocelot Python package. We performed 80 590 cluster
injections and retrievals into Gaia DR3, varying a wide range of
different parameters in order to thoroughly investigate cluster
detectability.
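As a rough illustration of how injection and retrieval results translate into a completeness estimate, the sketch below computes an empirical recovery fraction at a chosen significance threshold. It is not the logistic model of the paper. The function name, the input layout (one best-detection cst value per injected cluster, with None for a cluster that was not recovered) and the sample values are assumptions made for this example.

```python
def recovery_fraction(best_cst, threshold):
    """Fraction of injected clusters whose best detection has cst >= threshold.

    best_cst maps each injected cluster ID to the cst of its best detection,
    or to None when the cluster was not recovered at all (assumed layout).
    """
    if not best_cst:
        raise ValueError("no injected clusters supplied")
    recovered = sum(
        1 for cst in best_cst.values() if cst is not None and cst >= threshold
    )
    return recovered / len(best_cst)


# Synthetic values for illustration only, not taken from the catalogue.
sample = {1: 12.5, 2: None, 3: 4.0, 4: 7.2}
fraction_at_5 = recovery_fraction(sample, 5.0)  # 2 of 4 clusters pass
fraction_at_3 = recovery_fraction(sample, 3.0)  # 3 of 4 clusters pass
```

Raising the threshold can only lower the recovered fraction, which is why the threshold enters the selection function as a user-specified parameter.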
File Summary:
--------------------------------------------------------------------------------
FileName Lrecl Records Explanations
--------------------------------------------------------------------------------
ReadMe 80 . This file
clusters.dat 826 233917 Simulated clusters (table B.1)
members.csv 643 50873539 All simulated stars and any recovered Gaia stars
for any given detection of a cluster, from
which the membership of any cluster can be
reconstructed (table B.2)
--------------------------------------------------------------------------------
See also:
J/A+A/646/A104 : Improving the open cluster census. I. (Hunt+, 2021)
J/A+A/686/A42 : Improving the open cluster census. III. (Hunt+, 2024)
Byte-by-byte Description of file: clusters.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 7 I7 --- Cluster Simulated cluster ID (internal numbering
scheme) (cluster_id) (1)
9- 11 A3 --- Det Unique three-character detection ID
given to this cluster
(detection_id) (2)
13- 17 A5 --- BestDet Flag indicating if this detection was
the best detection attempt of this
given cluster, based on
simple heuristics (best_detection)
19- 20 I2 --- minClSize Minimum cluster size parameter of
HDBSCAN used for this cluster detection
(min_cluster_size)
22- 44 F23.20 --- cst ?=- Cluster significance test score of
this cluster (cst) (3)
46- 50 I5 --- Nstars Total number of stars in the cluster
detection, assuming it was detected,
including true members and false
members (n_stars)
52- 56 I5 --- Nsim Number of member stars simulated for
this cluster and injected into Gaia
data (n_simulated)
58- 62 I5 --- NsimDet Number of simulated member stars that
were included in the detection of this
cluster, assuming it was detected
(n_simulated_detected)
64- 68 I5 --- Nnoise Number of member stars that are not true
members (n_noise)
70- 73 I4 --- NHunt23 Number of member stars that are in
common with a known cluster in
Hunt & Reffert (2023,
Cat. J/A+A/686/A42) (n_hunt23)
75- 96 E22.20 --- fracNstarsSim Fraction of detected stars that are
simulated, i.e. true member stars
(frac_n_stars_simulated)
98-119 F22.20 --- fracNstarsNoise Fraction of detected member stars that
are not true members
(frac_n_stars_noise)
121-142 F22.20 --- fracNstarsH23 Fraction of detected member stars that
are in common with a known cluster in
Hunt & Reffert (2023,
Cat. J/A+A/686/A42)
(frac_n_stars_hunt23)
144-165 E22.20 --- fracSimuDet Fraction of stars simulated for this
cluster that were detected as member
stars (frac_simulated_detected)
167-188 F22.18 deg RAdeg Right ascension (J2015.5) (ra)
190-212 F23.19 deg DEdeg Declination (J2015.5) (dec)
214-236 F23.19 deg GLON Galactic longitude (l)
238-260 E23.20 deg GLAT Galactic latitude (b)
262-281 F20.14 pc Dist Cluster distance (distance)
283-305 E23.20 mas/yr pmRA Proper motion in right ascension
multiplied by cos of declination (pmra)
307-329 E23.20 mas/yr pmDE Proper motion in declination (pmdec)
331-353 E23.20 mas/yr pmGLON Proper motion in Galactic longitude
(pml)
355-377 E23.20 mas/yr pmGLAT Proper motion in Galactic latitude (pmb)
379-402 F24.19 km/s RV Cluster radial velocity
(radial_velocity)
404-427 F24.17 pc Xpos Cartesian X coordinate in
Galactocentric coordinates (x)
429-452 F24.17 pc Ypos Cartesian Y coordinate in
Galactocentric coordinates (y)
454-477 F24.18 pc Zpos Cartesian Z coordinate in
Galactocentric coordinates (z)
479-498 F20.14 pc rho Galactocentric cylindrical rho
coordinate (2D distance from Galactic
center) (rho)
500-518 F19.16 pc phi Galactocentric cylindrical phi
coordinate (phi)
520-539 F20.15 km/s Vphi Velocity in the phi direction (v_phi)
541-565 F25.20 km/s Vrho Velocity in the rho direction (v_rho)
567-588 E22.19 km/s Vz Velocity in the z direction (v_z)
590-607 F18.14 km/s Vcirc Circular velocity at the location of
this cluster based on adopted Milky Way
potential (v_circ)
609-627 F19.17 pc Rcore Cluster core radius (King model)
(r_core)
629-648 F20.16 pc Rtidal Cluster tidal radius (King model)
(r_tidal)
650-669 F20.16 pc Rjacobi Cluster Jacobi radius (r_jacobi)
671-691 F21.15 Msun Mass Cluster mass (mass)
693-710 F18.16 [yr] logAge Log (base 10) of cluster age in years
(log_age)
712-732 F21.18 mag Ext Cluster extinction (extinction)
734-755 E22.20 mag diffExt Gaussian standard deviation of cluster
differential extinction
(differential_extinction)
757-779 E23.20 [-] [M/H] Cluster metallicity as [M/H], following
the PARSEC isochrone metallicity
definition (metallicity)
781-799 F19.17 --- VirRatio Cluster virial ratio (0.5=virialized)
(virial_ratio)
801-807 F7.4 mag m10 Value of m10 parameter from Gaia DR3
selection function (m10)
809-826 F18.15 mag magRybizki Median magnitude of stars removed from
Hunt & Reffert (2023,
Cat. J/A+A/686/A42) sample of Gaia DR3
data (m_rybizki)
--------------------------------------------------------------------------------
Note (1): Each simulated cluster may appear multiple times in this table,
depending on how many times it was detected at different HDBSCAN
min_cluster_size runs.
Note (2): Detection IDs can be used to reconstruct the membership of a given
detection, by querying all stars in Table B.2 that include the same detection
ID in their detection_id column.
Note (3): See Hunt & Reffert (2021A&A...646A.104H) for a definition.
--------------------------------------------------------------------------------
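The byte ranges above can be applied directly to each fixed-width record. The following is a minimal sketch using only the Python standard library: it covers just the first seven columns, treats "-" as the null value of cst (as flagged by "?=-"), and parses synthetic lines built by a hypothetical make_line helper. The "True"/"False" spelling of BestDet and all sample values are assumptions.

```python
# Byte ranges (1-indexed, inclusive) copied from the table above.
FIELDS = [
    ("Cluster", 1, 7, int),
    ("Det", 9, 11, str),
    ("BestDet", 13, 17, str),
    ("minClSize", 19, 20, int),
    ("cst", 22, 44, float),
    ("Nstars", 46, 50, int),
    ("Nsim", 52, 56, int),
]


def parse_record(line):
    """Parse the leading columns of one clusters.dat record into a dict."""
    record = {}
    for label, start, end, cast in FIELDS:
        raw = line[start - 1:end].strip()
        if raw in ("", "-"):  # blank or "-" marks a missing value
            record[label] = None
        elif cast is str:
            record[label] = raw
        else:
            record[label] = cast(raw)
    return record


def make_line(cluster, det, best, mcs, cst, nstars, nsim):
    """Hypothetical helper: build a synthetic record with the same layout."""
    return (
        f"{cluster:>7} {det:<3} {best:<5} {mcs:>2} "
        f"{cst:>23} {nstars:>5} {nsim:>5}"
    )


# Synthetic records for illustration only, not taken from the catalogue.
detected = parse_record(make_line(42, "a1B", "True", 10, "12.5", 130, 200))
missed = parse_record(make_line(43, "c9Z", "False", 20, "-", 0, 85))
```

Slicing with line[start - 1:end] converts the 1-indexed inclusive byte ranges into Python's 0-indexed half-open slices. For the full table, a reader that understands this ReadMe layout (for example the CDS reader in astropy.io.ascii) avoids retyping all byte ranges by hand.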
Description of file: members.csv
--------------------------------------------------------------------------------
Column unit Label Explanations
--------------------------------------------------------------------------------
1 --- Det Comma-separated list of all unique cluster
detection IDs that this star belongs to, if any
(detection_id)
2 --- Cluster Original ID of simulated cluster (cluster_id)
3 --- Sim Original ID of simulated member star within
simulated cluster (simulated_id)
4 --- SourceId Gaia DR3 source ID for false positive cluster
members (source_id)
5 --- SimuStar Flag indicating if star was simulated
(simulated_star)
6 deg RAdeg Right ascension (J2015.5) (ra)
7 deg DEdeg Declination (J2015.5) (dec)
8 deg GLON Galactic longitude (l)
9 deg GLAT Galactic latitude (b)
10 mas/yr pmRA Proper motion in right ascension multiplied by
cos of declination (pmra)
11 mas/yr e_pmRA Error on proper motion in right ascension
multiplied by cos of declination (pmra_error)
12 mas/yr pmDE Proper motion in declination (pmdec)
13 mas/yr e_pmDE Error on proper motion in declination
(pmdec_error)
14 mas plx Parallax of source (parallax)
15 mas e_plx Error on parallax of source (parallax_error)
16 mas/yr pmRAtrue True proper motion in right ascension multiplied
by cos of declination (simulated stars only)
(pmra_true)
17 mas/yr pmDEtrue True proper motion in declination
(simulated stars only) (pmdec_true)
18 mas plxtrue True parallax (simulated stars only)
(parallax_true)
19 km/s RVtrue True source radial velocity (simulated stars only)
(radial_velocity_true)
20 mag Gmag Magnitude in Gaia DR3 G band (phot_g_mean_mag)
21 mag BPmag Magnitude in Gaia DR3 G_BP band (phot_bp_mean_mag)
22 mag RPmag Magnitude in Gaia DR3 G_RP band (phot_rp_mean_mag)
23 mag BP-RP BP minus RP colour (bp_rp)
24 mag G-RP G minus RP colour (g_rp)
25 mag BP-G BP minus G colour (bp_g)
26 e-/s e_FG Mean flux error in G band (phot_g_mean_flux_error)
27 e-/s e_FBP Mean flux error in G_BP band
(phot_bp_mean_flux_error)
28 e-/s e_FRP Mean flux error in G_RP band
(phot_rp_mean_flux_error)
29 mag Gmagtrue True G magnitude (simulated stars only)
(phot_g_mean_mag_true)
30 mag BPmagtrue True G_BP magnitude (simulated stars only)
(phot_bp_mean_mag_true)
31 mag RPmagtrue True G_RP magnitude (simulated stars only)
(phot_rp_mean_mag_true)
32 Msun Mass True mass (simulated stars only) (mass)
33 K Temp True temperature (simulated stars only)
(temperature)
34 Lsun Lum True luminosity (simulated stars only)
(luminosity)
35 mag Ext True extinction (simulated stars only)
(extinction)
36 [cm/s2] logg True surface gravity (simulated stars only)
(log_g)
37 --- NComp Number of companions of the star
(simulated stars only) (companions)
38 --- NCompUn Number of companions that are unresolved
(simulated stars only) (unresolved_companions)
--------------------------------------------------------------------------------
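Following Note (2) of clusters.dat, the membership of one detection can be rebuilt by keeping every star whose first column contains that detection ID. The sketch below uses only the Python standard library. The header names, the quoting of multi-ID fields and the sample rows are assumptions, so check them against the actual first line of members.csv before use.

```python
import csv
import io


def members_of(rows, detection_id):
    """Yield the rows whose comma-separated ID list contains detection_id."""
    for row in rows:
        # Split the list and compare whole IDs, so "a1" never matches "a1B".
        ids = {d.strip() for d in row["detection_id"].split(",")}
        if detection_id in ids:
            yield row


# Synthetic rows for illustration only, with an assumed header line.
SAMPLE = """\
detection_id,cluster_id,simulated_id,source_id,simulated_star
"a1B,c9Z",42,0,,True
a1B,42,1,,True
,42,2,,True
a1B,42,,1000000000000000001,False
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
members = list(members_of(reader, "a1B"))
```

Because members_of is a generator, the same function can be fed a csv.DictReader over the open file, so the roughly 50 million records are filtered one row at a time rather than loaded into memory.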
Acknowledgements:
Emily Hunt, emily.lauren.hunt(at)univie.ac.at
(End) Patricia Vannier [CDS] 16-Jan-2026