J/ApJ/941/104 Classification of Chandra sources (Yang+, 2022)
Classifying Unidentified X-Ray Sources in the Chandra Source Catalog Using
a Multiwavelength Machine-learning Approach.
Yang H., Hare J., Kargaltsev O., Volkov I., Chen S., Rangelov B.
<Astrophys. J., 941, 104 (2022)>
=2022ApJ...941..104Y (SIMBAD/NED BibCode)
ADC_Keywords: Active gal. nuclei ; Binaries, X-ray ; X-ray sources
Keywords: Catalogs - X-ray sources - Classification - Random Forests -
X-ray binary stars - Active galactic nuclei - X-ray stars -
Young stellar objects - Cataclysmic variable stars -
Astrostatistics tools - X-ray surveys - Compact objects
Abstract:
The rapid increase in serendipitous X-ray source detections requires
the development of novel approaches to efficiently explore the nature
of X-ray sources. If even a fraction of these sources could be
reliably classified, it would enable population studies for various
astrophysical source types on a much larger scale than currently
possible. Classification of large numbers of sources from multiple
classes characterized by multiple properties (features) must be done
automatically and supervised machine learning (ML) seems to provide
the only feasible approach. We perform classification of Chandra
Source Catalog version 2.0 (CSCv2) sources to explore the potential of
the ML approach and identify various biases, limitations, and
bottlenecks that present themselves in these kinds of studies. We
establish the framework and present a flexible and expandable Python
pipeline, which can be used and improved by others. We also release
the training data set of 2941 X-ray sources with confidently
established classes. In addition to providing probabilistic
classifications of 66,369 CSCv2 sources (21% of the entire CSCv2
catalog), we perform several narrower-focused case studies (high-mass
X-ray binary candidates and X-ray sources within the extent of the
H.E.S.S. TeV sources) to demonstrate some possible applications of our
ML approach. We also discuss future possible modifications of the
presented pipeline, which are expected to lead to substantial
improvements in classification confidences.
Description:
The following tables present the properties of the good CSCv2 sample
(GCS), the training data set (TD), and the X-ray sources within the
extent of unidentified HESS sources, together with their classification
results from the multiwavelength machine-learning method (MUWCLASS).
File Summary:
--------------------------------------------------------------------------------
FileName Lrecl Records Explanations
--------------------------------------------------------------------------------
ReadMe 80 . This file
table8.dat 473 66369 Properties and classification results of the
GCS sources using MUWCLASS
table9.dat 543 2941 Properties and classification results of the
TD sources using MUWCLASS
table10.dat 566 2000 Properties and classification results of the
HESS field sources using MUWCLASS
--------------------------------------------------------------------------------
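The Lrecl and Records columns above can serve as a quick integrity check
after download. The following Python sketch is illustrative only and is not
part of the catalog or of the MUWCLASS pipeline; the names FILE_SUMMARY and
check_table are hypothetical. It assumes plain ASCII files (one character
per byte) and tolerates lines shorter than Lrecl, since trailing blanks may
have been stripped.

```python
# Record length (Lrecl) and record count per file, copied from the
# File Summary above.
FILE_SUMMARY = {
    "table8.dat": (473, 66369),
    "table9.dat": (543, 2941),
    "table10.dat": (566, 2000),
}


def check_table(lines, lrecl, records):
    """Return a list of problems found in an iterable of text lines.

    A line longer than `lrecl` characters or a line count different
    from `records` is reported; an empty list means no problem found.
    """
    problems = []
    n = 0
    for n, line in enumerate(lines, start=1):
        width = len(line.rstrip("\n"))
        if width > lrecl:
            problems.append(f"line {n}: {width} bytes exceeds Lrecl {lrecl}")
    if n != records:
        problems.append(f"{n} records found, {records} expected")
    return problems


# Possible usage (assumes table8.dat is in the working directory):
#     with open("table8.dat") as fh:
#         print(check_table(fh, *FILE_SUMMARY["table8.dat"]))
```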
See also:
IX/57 : The Chandra Source Catalog (CSC), Release 2.0 (Evans+, 2019)
Byte-by-byte Description of file: table8.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 21 A21 --- CSCv2 CSCv2 source name (1)
23- 33 F11.7 deg RAdeg CSCv2 Right Ascension (J2000)
35- 45 F11.7 deg DEdeg CSCv2 Declination (J2000)
47- 50 F4.2 arcsec PU CSCv2 errellipser0 value (2)
52- 57 F6.2 --- S/N CSCv2 X-ray significance
59- 66 E8.3 mW/m2 Fs Average soft (0.5-1.2 keV) band flux
68- 72 E5.1 mW/m2 e_Fs The 1-sigma uncertainty in Fs
74- 81 E8.3 mW/m2 Fm Average medium (1.2-2 keV) band flux
83- 87 E5.1 mW/m2 e_Fm The 1-sigma uncertainty in Fm
89- 96 E8.3 mW/m2 Fh Average hard (2-7 keV) band flux
98-102 E5.1 mW/m2 e_Fh The 1-sigma uncertainty in Fh
104-111 E8.3 mW/m2 Fb Average broad (0.5-7 keV) band flux
113-117 E5.1 mW/m2 e_Fb The 1-sigma uncertainty in Fb
119-123 F5.3 --- PIntra ? Intra-observation variability
probability (3)
125-129 F5.3 --- PInter ? Inter-observation variability probability
131-136 F6.3 mag Gmag ? Gaia EDR3 G band magnitude
138-142 F5.3 mag e_Gmag ? The 1-sigma uncertainty in Gmag
144-149 F6.3 mag BPmag ? Gaia EDR3 BP band magnitude
151-155 F5.3 mag e_BPmag ? The 1-sigma uncertainty in BPmag
157-162 F6.3 mag RPmag ? Gaia EDR3 RP band magnitude
164-168 F5.3 mag e_RPmag ? The 1-sigma uncertainty in RPmag
170-174 F5.2 mag Jmag ? 2MASS J band magnitude
176-179 F4.2 mag e_Jmag ? The 1-sigma uncertainty in Jmag
181-185 F5.2 mag Hmag ? 2MASS H band magnitude
187-191 F5.2 mag e_Hmag ? The 1-sigma uncertainty in Hmag
193-197 F5.2 mag Kmag ? 2MASS K band magnitude
199-202 F4.2 mag e_Kmag ? The 1-sigma uncertainty in Kmag
204-210 F7.4 mag W1mag ? W1 band magnitude (4)
212-217 F6.4 mag e_W1mag ? The 1-sigma uncertainty in W1mag
219-225 F7.4 mag W2mag ? W2 band magnitude (4)
228-233 F6.4 mag e_W2mag ? The 1-sigma uncertainty in W2mag
235-240 F6.3 mag W3mag ? W3 band magnitude (4)
242-246 F5.3 mag e_W3mag ? The 1-sigma uncertainty in W3mag
248-254 F7.3 mas plx ? Gaia EDR3 absolute stellar parallax
256-260 F5.3 mas e_plx ? The 1-sigma uncertainty in plx
262-268 F7.3 mas/yr pm ? Gaia EDR3 total proper motion
270-276 F7.1 pc rgeo ? Median of geometric distance (5)
278-284 F7.1 pc b_rgeo ? 16th percentile of rgeo
286-292 F7.1 pc B_rgeo ? 84th percentile of rgeo
294-299 F6.4 --- PAGN Classification probability as an AGN
301-306 F6.4 --- PCV Classification probability as a CV
308-314 F7.4 --- PHM* Classification probability as a HM-STAR
316-322 F7.4 --- PHMXB Classification probability as a HMXB
324-330 F7.4 --- PLM* Classification probability as a LM-STAR
332-337 F6.4 --- PLMXB Classification probability as a LMXB
339-344 F6.4 --- PNS Classification probability as a NS
346-351 F6.4 --- PYSO Classification probability as a YSO
353-358 F6.4 --- e_PAGN The 1-sigma uncertainty in PAGN
360-365 F6.4 --- e_PCV The 1-sigma uncertainty in PCV
367-372 F6.4 --- e_PHM* The 1-sigma uncertainty in PHM*
374-379 F6.4 --- e_PHMXB The 1-sigma uncertainty in PHMXB
381-386 F6.4 --- e_PLM* The 1-sigma uncertainty in PLM*
388-393 F6.4 --- e_PLMXB The 1-sigma uncertainty in PLMXB
395-400 F6.4 --- e_PNS The 1-sigma uncertainty in PNS
402-407 F6.4 --- e_PYSO The 1-sigma uncertainty in PYSO
409-415 A7 --- Class Predicted class (6)
417-422 F6.4 --- ClassP Classification probability (7)
424-429 F6.4 --- e_ClassP The 1-sigma uncertainty in ClassP
431-440 F10.3 --- CT Classification confidence threshold
442-473 A32 --- Flags Compilation of CSCv2 flags (8)
--------------------------------------------------------------------------------
Note (1): In the form 2CXO Jhhmmss.s+ddmmss.
Note (2): The major radius of the 95% confidence level position error ellipse.
Note (3): Highest value of Kuiper's test variability probability across
all observations available in CSCv2.
Note (4): From AllWISE, CatWISE2020 and unWISE catalogs.
Note (5): From Gaia EDR3 Distances catalog.
Note (6): The class with the highest classification probability among the
eight classes.
Note (7): Probability of the predicted class, as calculated by MUWCLASS.
Note (8): Including conf flag (conf), extent flag (extent), and
pileup flag (pileup), joined by a |.
--------------------------------------------------------------------------------
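The byte ranges above are 1-indexed and inclusive, and a "?" in the
Explanations column marks a field that may be blank. The Python sketch
below shows one way to slice a record of table8.dat accordingly. It is an
illustration under those assumptions, not the authors' reader; the names
TABLE8_SUBSET and parse_record are hypothetical, and only a handful of
columns are listed for brevity.

```python
from typing import Callable, Dict, Optional, Tuple

# label -> (first byte, last byte, converter), copied from the
# byte-by-byte description of table8.dat (1-indexed, inclusive).
TABLE8_SUBSET: Dict[str, Tuple[int, int, Callable[[str], object]]] = {
    "CSCv2": (1, 21, str),
    "RAdeg": (23, 33, float),
    "DEdeg": (35, 45, float),
    "Gmag": (131, 136, float),  # marked "?": may be blank
    "Class": (409, 415, str),
    "ClassP": (417, 422, float),
}


def parse_record(line: str, spec=TABLE8_SUBSET) -> Dict[str, Optional[object]]:
    """Slice one fixed-width record; blank fields are returned as None."""
    record = {}
    for label, (first, last, convert) in spec.items():
        # Byte b (1-indexed) is character b - 1 in a Python string, so the
        # inclusive range first..last is the slice [first - 1:last].
        raw = line[first - 1:last].strip()
        record[label] = convert(raw) if raw else None
    return record
```

The same function can be pointed at table9.dat or table10.dat by passing a
spec built from their byte ranges, which differ from those of table8.dat.
Readers who prefer an existing tool may find that the CDS reader in
astropy (astropy.io.ascii.read with the readme argument) handles this file
directly.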
Byte-by-byte Description of file: table9.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 21 A21 --- CSCv2 CSCv2 source name (1)
23- 33 F11.7 deg RAdeg CSCv2 Right Ascension (J2000)
35- 45 F11.7 deg DEdeg CSCv2 Declination (J2000)
47- 50 F4.2 arcsec PU CSCv2 errellipser0 value (2)
52- 57 F6.2 --- S/N CSCv2 X-ray significance
59- 66 E8.3 mW/m2 Fs Average soft (0.5-1.2 keV) band flux
68- 72 E5.1 mW/m2 e_Fs The 1-sigma uncertainty in Fs
74- 81 E8.3 mW/m2 Fm Average medium (1.2-2 keV) band flux
83- 87 E5.1 mW/m2 e_Fm The 1-sigma uncertainty in Fm
89- 97 E9.4 mW/m2 Fh Average hard (2-7 keV) band flux
99-103 E5.1 mW/m2 e_Fh The 1-sigma uncertainty in Fh
105-113 E9.4 mW/m2 Fb Average broad (0.5-7 keV) band flux
115-119 E5.1 mW/m2 e_Fb The 1-sigma uncertainty in Fb
121-125 F5.3 --- PIntra ? Intra-observation variability
probability (3)
127-131 F5.3 --- PInter ? Inter-observation variability probability
133-138 F6.3 mag Gmag ? Gaia EDR3 G band magnitude
140-144 F5.3 mag e_Gmag ? The 1-sigma uncertainty in Gmag
146-151 F6.3 mag BPmag ? Gaia EDR3 BP band magnitude
153-157 F5.3 mag e_BPmag ? The 1-sigma uncertainty in BPmag
159-164 F6.3 mag RPmag ? Gaia EDR3 RP band magnitude
166-170 F5.3 mag e_RPmag ? The 1-sigma uncertainty in RPmag
172-176 F5.2 mag Jmag ? 2MASS J band magnitude
178-182 F5.2 mag e_Jmag ? The 1-sigma uncertainty in Jmag
184-188 F5.2 mag Hmag ? 2MASS H band magnitude
190-193 F4.2 mag e_Hmag ? The 1-sigma uncertainty in Hmag
195-199 F5.2 mag Kmag ? 2MASS K band magnitude
201-204 F4.2 mag e_Kmag ? The 1-sigma uncertainty in Kmag
206-213 F8.4 mag W1mag ? W1 band magnitude (4)
215-220 F6.4 mag e_W1mag ? The 1-sigma uncertainty in W1mag
222-229 F8.4 mag W2mag ? W2 band magnitude (4)
231-236 F6.4 mag e_W2mag ? The 1-sigma uncertainty in W2mag
238-243 F6.2 mag W3mag ? W3 band magnitude (4)
245-248 F4.2 mag e_W3mag ? The 1-sigma uncertainty in W3mag
250-255 F6.2 mas plx ? Gaia EDR3 absolute stellar parallax
257-260 F4.2 mas e_plx ? The 1-sigma uncertainty in plx
262-267 F6.3 mas/yr PM ? Gaia EDR3 total proper motion
269-275 F7.1 pc rgeo ? Median of geometric distance (5)
277-283 F7.1 pc b_rgeo ? 16th percentile of rgeo
285-291 F7.1 pc B_rgeo ? 84th percentile of rgeo
293-298 F6.4 --- PAGN Classification probability as an AGN
300-305 F6.4 --- PCV Classification probability as a CV
307-312 F6.4 --- PHM* Classification probability as a HM-STAR
314-319 F6.4 --- PHMXB Classification probability as a HMXB
321-326 F6.4 --- PLM* Classification probability as a LM-STAR
328-333 F6.4 --- PLMXB Classification probability as a LMXB
335-340 F6.4 --- PNS Classification probability as a NS
342-347 F6.4 --- PYSO Classification probability as a YSO
349-354 F6.4 --- e_PAGN The 1-sigma uncertainty in PAGN
356-361 F6.4 --- e_PCV The 1-sigma uncertainty in PCV
363-368 F6.4 --- e_PHM* The 1-sigma uncertainty in PHM*
370-375 F6.4 --- e_PHMXB The 1-sigma uncertainty in PHMXB
377-382 F6.4 --- e_PLM* The 1-sigma uncertainty in PLM*
384-389 F6.4 --- e_PLMXB The 1-sigma uncertainty in PLMXB
391-396 F6.4 --- e_PNS The 1-sigma uncertainty in PNS
398-403 F6.4 --- e_PYSO The 1-sigma uncertainty in PYSO
405-411 A7 --- Class Predicted class (6)
413-418 F6.4 --- ClassP Classification probability (7)
420-425 F6.4 --- e_ClassP The 1-sigma uncertainty in ClassP
427-434 F8.3 --- CT Classification confidence threshold
436-467 A32 --- Flags Compilation of CSCv2 flags (8)
469-500 A32 --- Catalog Source name from literature-verified
catalogs (9)
502-508 A7 --- TClass True class of source from the TD
510-543 A34 --- ClassR Reference of classifications of the
TD source
--------------------------------------------------------------------------------
Note (1): In the form 2CXO Jhhmmss.s+ddmmss.
Note (2): The major radius of the 95% confidence level position error ellipse.
Note (3): Highest value of Kuiper's test variability probability across
all observations available in CSCv2.
Note (4): From AllWISE, CatWISE2020 and unWISE catalogs.
Note (5): From Gaia EDR3 Distances catalog.
Note (6): The class with the highest classification probability among the
eight classes.
Note (7): Probability of the predicted class, as calculated by MUWCLASS.
Note (8): Including conf flag (conf), extent flag (extent), and
pileup flag (pileup), joined by a |.
Note (9): For the classification of TD sources.
--------------------------------------------------------------------------------
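Because table9.dat carries both the predicted class (Class) and the true
class (TClass), the two can be compared row by row. The Python sketch below
illustrates Note (6), taking the class with the highest of the eight
probabilities, and computes the fraction of rows where Class and TClass
agree. It is a hedged illustration: the function names are hypothetical,
the class labels are assumed to be spelled as in the Explanations column
(AGN, CV, HM-STAR, ...), and ties are broken by list order, which the
ReadMe does not specify.

```python
from typing import Dict, Iterable, Mapping, Optional, Tuple

# Class labels as spelled in the Explanations column (an assumption about
# how they appear in the Class and TClass fields).
CLASSES = ("AGN", "CV", "HM-STAR", "HMXB", "LM-STAR", "LMXB", "NS", "YSO")


def predicted_class(probs: Dict[str, float]) -> Tuple[str, float]:
    """Return (label, probability) of the most probable class.

    On a tie, the class listed first in CLASSES wins; this tie rule is
    an assumption made for the sketch.
    """
    best = max(CLASSES, key=lambda name: probs[name])
    return best, probs[best]


def agreement_fraction(rows: Iterable[Mapping[str, str]]) -> Optional[float]:
    """Fraction of rows whose Class equals TClass, or None if no rows."""
    pairs = [(r["Class"].strip(), r["TClass"].strip()) for r in rows]
    if not pairs:
        return None
    return sum(pred == true for pred, true in pairs) / len(pairs)
```

The agreement fraction computed this way is a simple bookkeeping quantity;
it should not be read as the performance metric reported in the paper.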
Byte-by-byte Description of file: table10.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 21 A21 --- CSCv2 CSCv2 source name (1)
23- 33 F11.7 deg RAdeg CSCv2 Right Ascension (J2000)
35- 45 F11.7 deg DEdeg CSCv2 Declination (J2000)
47- 51 F5.2 arcsec PU CSCv2 errellipser0 value (2)
53- 58 F6.2 --- S/N CSCv2 X-ray significance
60- 67 E8.3 mW/m2 Fs Average soft (0.5-1.2 keV) band flux
69- 73 E5.1 mW/m2 e_Fs The 1-sigma uncertainty in Fs
75- 82 E8.3 mW/m2 Fm Average medium (1.2-2 keV) band flux
84- 88 E5.1 mW/m2 e_Fm The 1-sigma uncertainty in Fm
90- 97 E8.3 mW/m2 Fh Average hard (2-7 keV) band flux
99-103 E5.1 mW/m2 e_Fh The 1-sigma uncertainty in Fh
105-112 E8.3 mW/m2 Fb Average broad (0.5-7 keV) band flux
114-118 E5.1 mW/m2 e_Fb The 1-sigma uncertainty in Fb
120-124 F5.3 --- PIntra ? Intra-observation variability
probability (3)
126-130 F5.3 --- PInter ? Inter-observation variability probability
132-137 F6.3 mag Gmag ? Gaia EDR3 G band magnitude
139-143 F5.3 mag e_Gmag ? The 1-sigma uncertainty in Gmag
145-150 F6.3 mag BPmag ? Gaia EDR3 BP band magnitude
152-156 F5.3 mag e_BPmag ? The 1-sigma uncertainty in BPmag
158-163 F6.3 mag RPmag ? Gaia EDR3 RP band magnitude
165-169 F5.3 mag e_RPmag ? The 1-sigma uncertainty in RPmag
171-175 F5.2 mag Jmag ? 2MASS J band magnitude
177-180 F4.2 mag e_Jmag ? The 1-sigma uncertainty in Jmag
182-186 F5.2 mag Hmag ? 2MASS H band magnitude
188-191 F4.2 mag e_Hmag ? The 1-sigma uncertainty in Hmag
193-197 F5.2 mag Kmag ? 2MASS K band magnitude
199-202 F4.2 mag e_Kmag ? The 1-sigma uncertainty in Kmag
204-211 F8.4 mag W1mag ? W1 band magnitude (4)
213-218 F6.4 mag e_W1mag ? The 1-sigma uncertainty in W1mag
220-226 F7.4 mag W2mag ? W2 band magnitude (4)
228-233 F6.4 mag e_W2mag ? The 1-sigma uncertainty in W2mag
235-240 F6.2 mag W3mag ? W3 band magnitude (4)
242-245 F4.2 mag e_W3mag ? The 1-sigma uncertainty in W3mag
247-252 F6.2 mas plx ? Gaia EDR3 absolute stellar parallax
254-257 F4.2 mas e_plx ? The 1-sigma uncertainty in plx
259-265 F7.3 mas/yr PM ? Gaia EDR3 total proper motion
267-272 F6.1 pc rgeo ? Median of geometric distance (5)
274-279 F6.1 pc b_rgeo ? 16th percentile of rgeo
281-287 F7.1 pc B_rgeo ? 84th percentile of rgeo
289-294 F6.4 --- PAGN Classification probability as an AGN
296-301 F6.4 --- PCV Classification probability as a CV
303-308 F6.4 --- PHM* Classification probability as a HM-STAR
310-315 F6.4 --- PHMXB Classification probability as a HMXB
317-322 F6.4 --- PLM* Classification probability as a LM-STAR
324-329 F6.4 --- PLMXB Classification probability as a LMXB
331-336 F6.4 --- PNS Classification probability as a NS
338-342 F5.3 --- PYSO Classification probability as a YSO
344-349 F6.4 --- e_PAGN The 1-sigma uncertainty in PAGN
351-356 F6.4 --- e_PCV The 1-sigma uncertainty in PCV
358-363 F6.4 --- e_PHM* The 1-sigma uncertainty in PHM*
365-370 F6.4 --- e_PHMXB The 1-sigma uncertainty in PHMXB
372-377 F6.4 --- e_PLM* The 1-sigma uncertainty in PLM*
379-384 F6.4 --- e_PLMXB The 1-sigma uncertainty in PLMXB
386-391 F6.4 --- e_PNS The 1-sigma uncertainty in PNS
393-397 F5.3 --- e_PYSO The 1-sigma uncertainty in PYSO
399-405 A7 --- Class Predicted class (6)
407-410 F4.2 --- ClassP Classification probability (7)
412-415 F4.2 --- e_ClassP The 1-sigma uncertainty in ClassP
417-422 F6.3 --- CT Classification confidence threshold
424-455 A32 --- Flags Compilation of CSCv2 flags (8)
457-488 A32 --- Catalog Source name from literature-verified
catalogs (9)
490-521 A32 --- TClass True class of source from the TD
523-556 A34 --- ClassR Reference of classifications of the
TD source
558-566 A9 --- HESS HESS name that the CSCv2 source resides in
--------------------------------------------------------------------------------
Note (1): In the form 2CXO Jhhmmss.s+ddmmss.
Note (2): The major radius of the 95% confidence level position error ellipse.
Note (3): Highest value of Kuiper's test variability probability across
all observations available in CSCv2.
Note (4): From AllWISE, CatWISE2020 and unWISE catalogs.
Note (5): From Gaia EDR3 Distances catalog.
Note (6): The class with the highest classification probability among the
eight classes.
Note (7): Probability of the predicted class, as calculated by MUWCLASS.
Note (8): Including conf flag (conf), extent flag (extent), and
pileup flag (pileup), joined by a |.
Note (9): For the classification of TD sources.
--------------------------------------------------------------------------------
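A typical use of table10.dat is to summarize the predicted classes of the
X-ray sources inside each HESS source. The Python sketch below groups
already-parsed rows by the HESS column and counts classes above a chosen
ClassP cut, and also splits the Flags field on "|" as described in
Note (8). The function names and the min_prob parameter are hypothetical,
and the exact content of the Flags field (assumed here to be flag names
separated by "|") is not spelled out in this ReadMe.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Mapping


def split_flags(flags: str) -> List[str]:
    """Split a Flags string on '|' and drop empty pieces."""
    return [part.strip() for part in flags.split("|") if part.strip()]


def class_counts_by_hess(
    rows: Iterable[Mapping[str, object]], min_prob: float = 0.0
) -> Dict[str, Dict[str, int]]:
    """Count predicted classes per HESS source.

    Each row needs the keys 'HESS', 'Class' and 'ClassP'. Rows with
    ClassP below `min_prob` are skipped; the default keeps every row.
    """
    counts = defaultdict(Counter)
    for row in rows:
        if row["ClassP"] >= min_prob:
            counts[row["HESS"].strip()][row["Class"].strip()] += 1
    return {name: dict(per_class) for name, per_class in counts.items()}
```

The min_prob cut is a simple illustration and is not the same quantity as
the CT column, whose definition is given in the paper rather than here.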
Acknowledgements:
Hui Yang, huiyang(at)gwmail.gwu.edu
(End) Prepared by [AAS], Patricia Vannier [CDS] 30-Jun-2023