J/AJ/168/210 Classification results from CatBoost & SPE in 4XMM-DR13 (Ma+, 2024)
Search for young stellar objects within 4XMM-DR13 using CatBoost and SPE.
Ma X., Zhang Y., Zhang J., Li C., Kang Z., Li Ji
<Astron. J., 168, 210 (2024)>
=2024AJ....168..210M
ADC_Keywords: Models ; QSOs ; YSOs ; Galaxies ; X-ray sources ; Stars, normal
Keywords: Astronomy data analysis ; Astronomy databases ; Classification ;
Astrostatistics techniques ; Young stellar objects ; Quasars ;
X-ray sources
Abstract:
Classifying and summarizing large data sets from different sky survey
projects is essential for various subsequent scientific research. By
combining data from 4XMM-DR13, Sloan Digital Sky Survey (SDSS) DR18,
and CatWISE, we formed an XMM-WISE-SDSS sample that included
information in the X-ray, optical, and infrared bands. By cross
matching this sample with data sets from known spectral
classifications from SDSS and LAMOST, we obtained a training data set
containing stars, galaxies, quasars, and young stellar objects (YSOs).
Two machine learning methods, CatBoost and Self-Paced Ensemble (SPE),
were used to train and construct machine learning models through
training sets to classify the XMM-WISE-SDSS sample. Notably, the SPE
classifier showed excellent performance in YSO classification,
identifying 1102 YSO candidates from 160,545 sources, including 258
known YSOs. Then we further verify whether these candidates are YSOs
by the spectra in LAMOST and the identification in the SIMBAD and
VizieR databases. Finally there are 412 unidentified YSO candidates.
The discovery of these new YSOs is an important addition to existing
YSO samples and will deepen our understanding of star formation and
evolution. Moreover we provided a classification catalog for the whole
XMM-WISE-SDSS sample.
Description:
In this work, we utilize data from three different catalogs:
XMM-Newton, SDSS, and CatWISE.
The X-ray Multi-Mirror Mission (XMM-Newton) satellite was launched by
the European Space Agency (ESA) and deployed on 1999 December 10. It
has a high sensitivity across a broad energy spectrum, ranging from
0.1 to 12keV. For our analysis, we use data from 4XMM-DR13 (IX/69) which
includes source detections from 13,243 European Photon Imaging Camera
(EPIC) observations. We primarily use the total band flux (f8),
which covers the energy range 0.2-12keV, along with the f9
X-ray band flux, which spans 0.5-4.5keV.
Since its inception in 1998, the Sloan Digital Sky Survey (SDSS) has
observed stars, galaxies, quasars, and various celestial bodies almost
continuously in the optical and infrared using the 2.5-m wide-angle
optical telescope at Apache Point Observatory in New Mexico, United
States. From SDSS-V DR18, we mainly use the u, g, r, i, and z band
data. The approximate limiting magnitudes for each band are 22.0 (u),
23.0 (g), 22.5 (r), 22.0 (i) and 20.5 (z).
The CatWISE2020 catalog (see II/365) comprises 1,890,715,640 sources.
These sources were selected from Wide-Field Infrared Survey Explorer
(WISE) and NEOWISE measurements at 3.4um and 4.6um (W1 and W2),
spanning from 2010-Jan-7 to 2018-Dec-13.
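
The abstract states that the three catalogs are combined by positional
cross-matching. The sketch below shows one simple way such a match could be
done: a brute-force nearest-neighbour search on angular separation. It is
not the authors' procedure. The function names, the 3 arcsec matching
radius, and the coordinates are all assumptions made for illustration.

```python
import math


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula).

    All inputs are in degrees.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def nearest_match(source, catalog, radius_arcsec):
    """Return (index, separation_arcsec) of the closest catalog entry
    within radius_arcsec of source, or None if there is no such entry.

    source is an (ra, dec) pair in degrees; catalog is a list of such pairs.
    """
    best = None
    for i, (ra, dec) in enumerate(catalog):
        sep = angular_sep_deg(source[0], source[1], ra, dec) * 3600.0
        if sep <= radius_arcsec and (best is None or sep < best[1]):
            best = (i, sep)
    return best


# Made-up positions, not taken from any of the catalogs.
xmm_source = (150.0000, 2.0000)
optical = [(150.0003, 2.0002), (150.0100, 2.0100), (10.0, -5.0)]

# The 3 arcsec radius is an assumed value, not one quoted in this file.
match = nearest_match(xmm_source, optical, radius_arcsec=3.0)
```

A brute-force loop is only practical for small samples. For catalogs of the
sizes quoted above, a spatially indexed matcher would be needed.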
File Summary:
--------------------------------------------------------------------------------
 FileName    Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe          80        .   This file
table3.dat      84   160545   The classification result of the XMM-WISE-SDSS
                              sample by CatBoost and Self-Paced Ensemble (SPE)
--------------------------------------------------------------------------------
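
The File Summary gives a record length (Lrecl) of 84 and 160545 records for
table3.dat. A minimal sketch of a consistency check on a downloaded copy is
given here. The function name check_records and the local file path in the
usage note are assumptions for illustration. Records shorter than Lrecl are
accepted, since trailing blanks may have been stripped.

```python
EXPECTED_LRECL = 84         # Lrecl of table3.dat from the File Summary
EXPECTED_RECORDS = 160545   # Records of table3.dat from the File Summary


def check_records(lines, lrecl=EXPECTED_LRECL, n_records=None):
    """Return a list of problem descriptions; an empty list means the
    lines are consistent with the given record length and count.

    lines is any iterable of text lines (for example an open file).
    """
    problems = []
    count = 0
    for lineno, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        count += 1
        if len(record) > lrecl:
            problems.append(
                f"line {lineno}: {len(record)} characters exceeds Lrecl={lrecl}"
            )
    if n_records is not None and count != n_records:
        problems.append(f"found {count} records, expected {n_records}")
    return problems
```

With a local copy of the file (the path is hypothetical), this could be
called as check_records(open("table3.dat"), n_records=EXPECTED_RECORDS).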
See also:
II/360 : Gaia DR2 x AllWISE catalogue (Marton+, 2019)
II/365 : The CatWISE2020 cat. (updated version 28-01-21) (Marocco+, 2021)
IX/69 : XMM-Newton Serendipitous Source Cat. 4XMM-DR13 (Webb+, 2023)
VII/273 : The Half Million Quasars (HMQ) catalogue (Flesch, 2015)
V/156 : LAMOST DR7 catalogs (Luo+, 2019)
VII/289 : SDSS quasar cat., sixteenth data release (DR16Q) (Lyke+, 2020)
VII/294 : The Million Quasars (Milliquas) cat., version 8 (Flesch, 2023)
J/ApJ/802/60 : Structure of young stellar clusters. II. (Kuhn+, 2015)
J/MNRAS/458/3479 : SVM selection of WISE YSO Candidates (Marton+, 2016)
J/ApJ/902/114 : Stellar X-ray activity.I.Chandra, Gaia & GALEX (Wang+, 2020)
J/ApJ/920/132 : NEOWISE 6.5yr variability in YSOs (Park+, 2021)
J/MNRAS/503/5263 : Sorting of 4XMM-DR9 srcs by machine learning (Zhang+, 2021)
J/ApJS/268/36 : XMM-N view of M31 (New-ANGELS). I. X-ray cat. (Huang+, 2023)
J/ApJS/267/7 : YSO candidates from LAMOST LRS DR9 & ZTF (Zhang+, 2023)
http://www.sdss.org/ : SDSS home page
Byte-by-byte Description of file: table3.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1- 15  I15    ---     ID        Source ID from XMM (SRCID)
  17- 36  F20.16 deg     RAdeg     Right ascension (ICRS) (SR_RA)
  38- 58  E21.16 deg     DEdeg     Declination (ICRS) (SR_DEC)
  60- 65  A6     ---     ClassC    Class given by CatBoost (Class_CatBoost) (1)
  67- 71  F5.3   ---     pC        [0.2/1] Classification probability
                                    by CatBoost (P_catBoost)
  73- 78  A6     ---     ClassS    Class given by Self-Paced Ensemble (SPE)
                                    (Class_SPE) (2)
  80- 84  F5.3   ---     pS        [0.2/1]? Classification probability
                                    by SPE (P_SPE)
--------------------------------------------------------------------------------
Note (1): Class occurrences from CatBoost (Dorogush+ 2018arXiv181011363V; see
    Section 3.1) as follows:
    GALAXY = 49849 occurrences
    QSO    = 90916 occurrences
    STAR   = 19233 occurrences
    YSO    =   547 occurrences
Note (2): Class occurrences from SPE as follows:
    GALAXY = 60545 occurrences
    QSO    = 78696 occurrences
    STAR   = 20277 occurrences
    YSO    =  1102 occurrences
--------------------------------------------------------------------------------
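
The byte-by-byte description above can be turned into a small fixed-width
reader. The following is a minimal sketch, not an official CDS tool. The
byte ranges and labels are copied from the table. The helper names
(FIELDS, parse_record, class_counts) and the sample record are made up for
illustration. Blank fields are returned as None, which matches the "?"
(nullable) flag on pS.

```python
from collections import Counter

# (label, first_byte, last_byte, converter); bytes are 1-indexed and
# inclusive, as in the byte-by-byte description of table3.dat.
FIELDS = [
    ("ID", 1, 15, int),
    ("RAdeg", 17, 36, float),
    ("DEdeg", 38, 58, float),
    ("ClassC", 60, 65, str),
    ("pC", 67, 71, float),
    ("ClassS", 73, 78, str),
    ("pS", 80, 84, float),
]


def parse_record(line):
    """Parse one fixed-width record into a dict keyed by column label."""
    row = {}
    for label, start, end, convert in FIELDS:
        text = line[start - 1:end].strip()
        row[label] = convert(text) if text else None
    return row


def class_counts(rows, label):
    """Count class occurrences for the given label ('ClassC' or 'ClassS')."""
    return Counter(row[label] for row in rows)


# Synthetic record built to the same column layout; the values are
# invented and do not correspond to any source in the catalog.
sample = (f"{200001101010001:15d} {10.5:20.16f} {-1.25:21.16f} "
          f"{'QSO':<6} {0.912:5.3f} {'GALAXY':<6} {0.534:5.3f}")

row = parse_record(sample)
```

Applied to every line of the file, class_counts(rows, "ClassC") and
class_counts(rows, "ClassS") give tallies that can be compared with
Notes (1) and (2).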
History:
From electronic version of the journal
(End) Robin Leichtnam [CDS] 25-Jun-2025