J/ApJ/941/104       Classification of Chandra sources              (Yang+, 2022)

Classifying Unidentified X-Ray Sources in the Chandra Source Catalog Using a
Multiwavelength Machine-learning Approach.
    Yang H., Hare J., Kargaltsev O., Volkov I., Chen S., Rangelov B.
    <Astrophys. J., 941, 104 (2022)>
    =2022ApJ...941..104Y    (SIMBAD/NED BibCode)
ADC_Keywords: Active gal. nuclei ; Binaries, X-ray ; X-ray sources
Keywords: Catalogs - X-ray sources - Classification - Random Forests -
          X-ray binary stars - Active galactic nuclei - X-ray stars -
          Young stellar objects - Cataclysmic variable stars -
          Astrostatistics tools - X-ray surveys - Compact objects

Abstract:
    The rapid increase in serendipitous X-ray source detections requires the
    development of novel approaches to efficiently explore the nature of X-ray
    sources. If even a fraction of these sources could be reliably classified,
    it would enable population studies for various astrophysical source types
    on a much larger scale than currently possible. Classification of large
    numbers of sources from multiple classes characterized by multiple
    properties (features) must be done automatically and supervised machine
    learning (ML) seems to provide the only feasible approach. We perform
    classification of Chandra Source Catalog version 2.0 (CSCv2) sources to
    explore the potential of the ML approach and identify various biases,
    limitations, and bottlenecks that present themselves in these kinds of
    studies. We establish the framework and present a flexible and expandable
    Python pipeline, which can be used and improved by others. We also release
    the training data set of 2941 X-ray sources with confidently established
    classes. In addition to providing probabilistic classifications of 66,369
    CSCv2 sources (21% of the entire CSCv2 catalog), we perform several
    narrower-focused case studies (high-mass X-ray binary candidates and X-ray
    sources within the extent of the H.E.S.S. TeV sources) to demonstrate some
    possible applications of our ML approach. We also discuss future possible
    modifications of the presented pipeline, which are expected to lead to
    substantial improvements in classification confidences.
Description:
    The following tables present the properties and classification results of
    the good CSCv2 sample (GCS), the training dataset (TD), and the X-ray
    sources within the unidentified HESS sources using the multiwavelength
    machine-learning method (MUWCLASS).

File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
table8.dat       473    66369   Properties and classification results of the
                                 GCS sources using MUWCLASS
table9.dat       543     2941   Properties and classification results of the
                                 TD sources using MUWCLASS
table10.dat      566     2000   Properties and classification results of the
                                 HESS field sources using MUWCLASS
--------------------------------------------------------------------------------

See also:
   IX/57 : The Chandra Source Catalog (CSC), Release 2.0 (Evans+, 2019)

Byte-by-byte Description of file: table8.dat
--------------------------------------------------------------------------------
  Bytes Format Units  Label    Explanations
--------------------------------------------------------------------------------
  1- 21 A21    ---    CSCv2    CSCv2 source name (1)
 23- 33 F11.7  deg    RAdeg    CSCv2 Right Ascension (J2000)
 35- 45 F11.7  deg    DEdeg    CSCv2 Declination (J2000)
 47- 50 F4.2   arcsec PU       CSCv2 errellipser0 value (2)
 52- 57 F6.2   ---    S/N      CSCv2 X-ray significance
 59- 66 E8.3   mW/m2  Fs       Average soft (0.5-1.2 keV) band flux
 68- 72 E5.1   mW/m2  e_Fs     The 1-sigma uncertainty in Fs
 74- 81 E8.3   mW/m2  Fm       Average medium (1.2-2 keV) band flux
 83- 87 E5.1   mW/m2  e_Fm     The 1-sigma uncertainty in Fm
 89- 96 E8.3   mW/m2  Fh       Average hard (2-7 keV) band flux
 98-102 E5.1   mW/m2  e_Fh     The 1-sigma uncertainty in Fh
104-111 E8.3   mW/m2  Fb       Average broad (0.5-7 keV) band flux
113-117 E5.1   mW/m2  e_Fb     The 1-sigma uncertainty in Fb
119-123 F5.3   ---    PIntra   ? Intra-observation variability probability (3)
125-129 F5.3   ---    PInter   ? Inter-observation variability probability
131-136 F6.3   mag    Gmag     ? Gaia EDR3 G band magnitude
138-142 F5.3   mag    e_Gmag   ? The 1-sigma uncertainty in Gmag
144-149 F6.3   mag    BPmag    ? Gaia EDR3 BP band magnitude
151-155 F5.3   mag    e_BPmag  ? The 1-sigma uncertainty in BPmag
157-162 F6.3   mag    RPmag    ? Gaia EDR3 RP band magnitude
164-168 F5.3   mag    e_RPmag  ? The 1-sigma uncertainty in RPmag
170-174 F5.2   mag    Jmag     ? 2MASS J band magnitude
176-179 F4.2   mag    e_Jmag   ? The 1-sigma uncertainty in Jmag
181-185 F5.2   mag    Hmag     ? 2MASS H band magnitude
187-191 F5.2   mag    e_Hmag   ? The 1-sigma uncertainty in Hmag
193-197 F5.2   mag    Kmag     ? 2MASS K band magnitude
199-202 F4.2   mag    e_Kmag   ? The 1-sigma uncertainty in Kmag
204-210 F7.4   mag    W1mag    ? W1 band magnitude (4)
212-217 F6.4   mag    e_W1mag  ? The 1-sigma uncertainty in W1mag
219-225 F7.4   mag    W2mag    ? W2 band magnitude (4)
228-233 F6.4   mag    e_W2mag  ? The 1-sigma uncertainty in W2mag
235-240 F6.3   mag    W3mag    ? W3 band magnitude (4)
242-246 F5.3   mag    e_W3mag  ? The 1-sigma uncertainty in W3mag
248-254 F7.3   mas    plx      ? Gaia EDR3 absolute stellar parallax
256-260 F5.3   mas    e_plx    ? The 1-sigma uncertainty in plx
262-268 F7.3   mas/yr pm       ? Gaia EDR3 total proper motion
270-276 F7.1   pc     rgeo     ? Median of geometric distance (5)
278-284 F7.1   pc     b_rgeo   ? 16th percentile of rgeo
286-292 F7.1   pc     B_rgeo   ? 84th percentile of rgeo
294-299 F6.4   ---    PAGN     Classification probability as an AGN
301-306 F6.4   ---    PCV      Classification probability as a CV
308-314 F7.4   ---    PHM*     Classification probability as a HM-STAR
316-322 F7.4   ---    PHMXB    Classification probability as a HMXB
324-330 F7.4   ---    PLM*     Classification probability as a LM-STAR
332-337 F6.4   ---    PLMXB    Classification probability as a LMXB
339-344 F6.4   ---    PNS      Classification probability as a NS
346-351 F6.4   ---    PYSO     Classification probability as a YSO
353-358 F6.4   ---    e_PAGN   The 1-sigma uncertainty in PAGN
360-365 F6.4   ---    e_PCV    The 1-sigma uncertainty in PCV
367-372 F6.4   ---    e_PHM*   The 1-sigma uncertainty in PHM*
374-379 F6.4   ---    e_PHMXB  The 1-sigma uncertainty in PHMXB
381-386 F6.4   ---    e_PLM*   The 1-sigma uncertainty in PLM*
388-393 F6.4   ---    e_PLMXB  The 1-sigma uncertainty in PLMXB
395-400 F6.4   ---    e_PNS    The 1-sigma uncertainty in PNS
402-407 F6.4   ---    e_PYSO   The 1-sigma uncertainty in PYSO
409-415 A7     ---    Class    Predicted class (6)
417-422 F6.4   ---    ClassP   Classification probability (7)
424-429 F6.4   ---    e_ClassP The 1-sigma uncertainty in ClassP
431-440 F10.3  ---    CT       Classification confidence threshold
442-473 A32    ---    Flags    Compilation of CSCv2 flags (8)
--------------------------------------------------------------------------------
Note (1): In the form 2CXO Jhhmmss.s+ddmmss.
Note (2): The major radius of the 95% confidence level position error ellipse.
Note (3): Highest value of Kuiper's test variability probability across all
          observations available in CSCv2.
Note (4): From AllWISE, CatWISE2020 and unWISE catalogs.
Note (5): From Gaia EDR3 Distances catalog.
Note (6): Of the source with the highest classification probability among
          eight classes.
Note (7): Of the predicted class calculated from MUWCLASS.
Note (8): Including conf flag (conf), extent flag (extent), and pileup flag
          (pileup), joined by a |.
--------------------------------------------------------------------------------

Byte-by-byte Description of file: table9.dat
--------------------------------------------------------------------------------
  Bytes Format Units  Label    Explanations
--------------------------------------------------------------------------------
  1- 21 A21    ---    CSCv2    CSCv2 source name (1)
 23- 33 F11.7  deg    RAdeg    CSCv2 Right Ascension (J2000)
 35- 45 F11.7  deg    DEdeg    CSCv2 Declination (J2000)
 47- 50 F4.2   arcsec PU       CSCv2 errellipser0 value (2)
 52- 57 F6.2   ---    S/N      CSCv2 X-ray significance
 59- 66 E8.3   mW/m2  Fs       Average soft (0.5-1.2 keV) band flux
 68- 72 E5.1   mW/m2  e_Fs     The 1-sigma uncertainty in Fs
 74- 81 E8.3   mW/m2  Fm       Average medium (1.2-2 keV) band flux
 83- 87 E5.1   mW/m2  e_Fm     The 1-sigma uncertainty in Fm
 89- 97 E9.4   mW/m2  Fh       Average hard (2-7 keV) band flux
 99-103 E5.1   mW/m2  e_Fh     The 1-sigma uncertainty in Fh
105-113 E9.4   mW/m2  Fb       Average broad (0.5-7 keV) band flux
115-119 E5.1   mW/m2  e_Fb     The 1-sigma uncertainty in Fb
121-125 F5.3   ---    PIntra   ? Intra-observation variability probability (3)
127-131 F5.3   ---    PInter   ? Inter-observation variability probability
133-138 F6.3   mag    Gmag     ? Gaia EDR3 G band magnitude
140-144 F5.3   mag    e_Gmag   ? The 1-sigma uncertainty in Gmag
146-151 F6.3   mag    BPmag    ? Gaia EDR3 BP band magnitude
153-157 F5.3   mag    e_BPmag  ? The 1-sigma uncertainty in BPmag
159-164 F6.3   mag    RPmag    ? Gaia EDR3 RP band magnitude
166-170 F5.3   mag    e_RPmag  ? The 1-sigma uncertainty in RPmag
172-176 F5.2   mag    Jmag     ? 2MASS J band magnitude
178-182 F5.2   mag    e_Jmag   ? The 1-sigma uncertainty in Jmag
184-188 F5.2   mag    Hmag     ? 2MASS H band magnitude
190-193 F4.2   mag    e_Hmag   ? The 1-sigma uncertainty in Hmag
195-199 F5.2   mag    Kmag     ? 2MASS K band magnitude
201-204 F4.2   mag    e_Kmag   ? The 1-sigma uncertainty in Kmag
206-213 F8.4   mag    W1mag    ? W1 band magnitude (4)
215-220 F6.4   mag    e_W1mag  ? The 1-sigma uncertainty in W1mag
222-229 F8.4   mag    W2mag    ? W2 band magnitude (4)
231-236 F6.4   mag    e_W2mag  ? The 1-sigma uncertainty in W2mag
238-243 F6.2   mag    W3mag    ? W3 band magnitude (4)
245-248 F4.2   mag    e_W3mag  ? The 1-sigma uncertainty in W3mag
250-255 F6.2   mas    plx      ? Gaia EDR3 absolute stellar parallax
257-260 F4.2   mas    e_plx    ? The 1-sigma uncertainty in plx
262-267 F6.3   mas/yr PM       ? Gaia EDR3 total proper motion
269-275 F7.1   pc     rgeo     ? Median of geometric distance (5)
277-283 F7.1   pc     b_rgeo   ? 16th percentile of rgeo
285-291 F7.1   pc     B_rgeo   ? 84th percentile of rgeo
293-298 F6.4   ---    PAGN     Classification probability as an AGN
300-305 F6.4   ---    PCV      Classification probability as a CV
307-312 F6.4   ---    PHM*     Classification probability as a HM-STAR
314-319 F6.4   ---    PHMXB    Classification probability as a HMXB
321-326 F6.4   ---    PLM*     Classification probability as a LM-STAR
328-333 F6.4   ---    PLMXB    Classification probability as a LMXB
335-340 F6.4   ---    PNS      Classification probability as a NS
342-347 F6.4   ---    PYSO     Classification probability as a YSO
349-354 F6.4   ---    e_PAGN   The 1-sigma uncertainty in PAGN
356-361 F6.4   ---    e_PCV    The 1-sigma uncertainty in PCV
363-368 F6.4   ---    e_PHM*   The 1-sigma uncertainty in PHM*
370-375 F6.4   ---    e_PHMXB  The 1-sigma uncertainty in PHMXB
377-382 F6.4   ---    e_PLM*   The 1-sigma uncertainty in PLM*
384-389 F6.4   ---    e_PLMXB  The 1-sigma uncertainty in PLMXB
391-396 F6.4   ---    e_PNS    The 1-sigma uncertainty in PNS
398-403 F6.4   ---    e_PYSO   The 1-sigma uncertainty in PYSO
405-411 A7     ---    Class    Predicted class (6)
413-418 F6.4   ---    ClassP   Classification probability (7)
420-425 F6.4   ---    e_ClassP The 1-sigma uncertainty in ClassP
427-434 F8.3   ---    CT       Classification confidence threshold
436-467 A32    ---    Flags    Compilation of CSCv2 flags (8)
469-500 A32    ---    Catalog  Source name from literature verified catalogs (9)
502-508 A7     ---    TClass   True class of source from the TD
510-543 A34    ---    ClassR   Reference of classifications of the TD source
--------------------------------------------------------------------------------
Note (1): In the form 2CXO Jhhmmss.s+ddmmss.
Note (2): The major radius of the 95% confidence level position error ellipse.
Note (3): Highest value of Kuiper's test variability probability across all
          observations available in CSCv2.
Note (4): From AllWISE, CatWISE2020 and unWISE catalogs.
Note (5): From Gaia EDR3 Distances catalog.
Note (6): Of the source with the highest classification probability among
          eight classes.
Note (7): Of the predicted class calculated from MUWCLASS.
Note (8): Including conf flag (conf), extent flag (extent), and pileup flag
          (pileup), joined by a |.
Note (9): For the classification of TD sources.
--------------------------------------------------------------------------------

Byte-by-byte Description of file: table10.dat
--------------------------------------------------------------------------------
  Bytes Format Units  Label    Explanations
--------------------------------------------------------------------------------
  1- 21 A21    ---    CSCv2    CSCv2 source name (1)
 23- 33 F11.7  deg    RAdeg    CSCv2 Right Ascension (J2000)
 35- 45 F11.7  deg    DEdeg    CSCv2 Declination (J2000)
 47- 51 F5.2   arcsec PU       CSCv2 errellipser0 value (2)
 53- 58 F6.2   ---    S/N      CSCv2 X-ray significance
 60- 67 E8.3   mW/m2  Fs       Average soft (0.5-1.2 keV) band flux
 69- 73 E5.1   mW/m2  e_Fs     The 1-sigma uncertainty in Fs
 75- 82 E8.3   mW/m2  Fm       Average medium (1.2-2 keV) band flux
 84- 88 E5.1   mW/m2  e_Fm     The 1-sigma uncertainty in Fm
 90- 97 E8.3   mW/m2  Fh       Average hard (2-7 keV) band flux
 99-103 E5.1   mW/m2  e_Fh     The 1-sigma uncertainty in Fh
105-112 E8.3   mW/m2  Fb       Average broad (0.5-7 keV) band flux
114-118 E5.1   mW/m2  e_Fb     The 1-sigma uncertainty in Fb
120-124 F5.3   ---    PIntra   ? Intra-observation variability probability (3)
126-130 F5.3   ---    PInter   ? Inter-observation variability probability
132-137 F6.3   mag    Gmag     ? Gaia EDR3 G band magnitude
139-143 F5.3   mag    e_Gmag   ? The 1-sigma uncertainty in Gmag
145-150 F6.3   mag    BPmag    ? Gaia EDR3 BP band magnitude
152-156 F5.3   mag    e_BPmag  ? The 1-sigma uncertainty in BPmag
158-163 F6.3   mag    RPmag    ? Gaia EDR3 RP band magnitude
165-169 F5.3   mag    e_RPmag  ? The 1-sigma uncertainty in RPmag
171-175 F5.2   mag    Jmag     ? 2MASS J band magnitude
177-180 F4.2   mag    e_Jmag   ? The 1-sigma uncertainty in Jmag
182-186 F5.2   mag    Hmag     ? 2MASS H band magnitude
188-191 F4.2   mag    e_Hmag   ? The 1-sigma uncertainty in Hmag
193-197 F5.2   mag    Kmag     ? 2MASS K band magnitude
199-202 F4.2   mag    e_Kmag   ? The 1-sigma uncertainty in Kmag
204-211 F8.4   mag    W1mag    ? W1 band magnitude (4)
213-218 F6.4   mag    e_W1mag  ? The 1-sigma uncertainty in W1mag
220-226 F7.4   mag    W2mag    ? W2 band magnitude (4)
228-233 F6.4   mag    e_W2mag  ? The 1-sigma uncertainty in W2mag
235-240 F6.2   mag    W3mag    ? W3 band magnitude (4)
242-245 F4.2   mag    e_W3mag  ? The 1-sigma uncertainty in W3mag
247-252 F6.2   mas    plx      ? Gaia EDR3 absolute stellar parallax
254-257 F4.2   mas    e_plx    ? The 1-sigma uncertainty in plx
259-265 F7.3   mas/yr PM       ? Gaia EDR3 total proper motion
267-272 F6.1   pc     rgeo     ? Median of geometric distance (5)
274-279 F6.1   pc     b_rgeo   ? 16th percentile of rgeo
281-287 F7.1   pc     B_rgeo   ? 84th percentile of rgeo
289-294 F6.4   ---    PAGN     Classification probability as an AGN
296-301 F6.4   ---    PCV      Classification probability as a CV
303-308 F6.4   ---    PHM*     Classification probability as a HM-STAR
310-315 F6.4   ---    PHMXB    Classification probability as a HMXB
317-322 F6.4   ---    PLM*     Classification probability as a LM-STAR
324-329 F6.4   ---    PLMXB    Classification probability as a LMXB
331-336 F6.4   ---    PNS      Classification probability as a NS
338-342 F5.3   ---    PYSO     Classification probability as a YSO
344-349 F6.4   ---    e_PAGN   The 1-sigma uncertainty in PAGN
351-356 F6.4   ---    e_PCV    The 1-sigma uncertainty in PCV
358-363 F6.4   ---    e_PHM*   The 1-sigma uncertainty in PHM*
365-370 F6.4   ---    e_PHMXB  The 1-sigma uncertainty in PHMXB
372-377 F6.4   ---    e_PLM*   The 1-sigma uncertainty in PLM*
379-384 F6.4   ---    e_PLMXB  The 1-sigma uncertainty in PLMXB
386-391 F6.4   ---    e_PNS    The 1-sigma uncertainty in PNS
393-397 F5.3   ---    e_PYSO   The 1-sigma uncertainty in PYSO
399-405 A7     ---    Class    Predicted class (6)
407-410 F4.2   ---    ClassP   Classification probability (7)
412-415 F4.2   ---    e_ClassP The 1-sigma uncertainty in ClassP
417-422 F6.3   ---    CT       Classification confidence threshold
424-455 A32    ---    Flags    Compilation of CSCv2 flags (8)
457-488 A32    ---    Catalog  Source name from literature verified catalogs (9)
490-521 A32    ---    TClass   True class of source from the TD
523-556 A34    ---    ClassR   Reference of classifications of the TD source
558-566 A9     ---    HESS     HESS name that the CSCv2 source resides in
--------------------------------------------------------------------------------
Note (1): In the form 2CXO Jhhmmss.s+ddmmss.
Note (2): The major radius of the 95% confidence level position error ellipse.
Note (3): Highest value of Kuiper's test variability probability across all
          observations available in CSCv2.
Note (4): From AllWISE, CatWISE2020 and unWISE catalogs.
Note (5): From Gaia EDR3 Distances catalog.
Note (6): Of the source with the highest classification probability among
          eight classes.
Note (7): Of the predicted class calculated from MUWCLASS.
Note (8): Including conf flag (conf), extent flag (extent), and pileup flag
          (pileup), joined by a |.
Note (9): For the classification of TD sources.
--------------------------------------------------------------------------------

Acknowledgements:
    Hui Yang, huiyang(at)gwmail.gwu.edu
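
Notes (6) and (8) describe how two columns relate to the others: Class is the
class with the highest of the eight per-class probabilities, and Flags is a
"|"-joined string. The Python sketch below illustrates both under stated
assumptions. The function names, the example probabilities, and the example
flag string are hypothetical and are not taken from the tables; the class
labels follow the column explanations (AGN, CV, HM-STAR, HMXB, LM-STAR, LMXB,
NS, YSO). This is not the MUWCLASS implementation.

```python
from typing import List, Sequence, Tuple

# Class labels in the column order PAGN, PCV, PHM*, PHMXB, PLM*, PLMXB, PNS, PYSO.
CLASSES = ["AGN", "CV", "HM-STAR", "HMXB", "LM-STAR", "LMXB", "NS", "YSO"]


def predicted_class(probabilities: Sequence[float]) -> Tuple[str, float]:
    """Return (label, probability) of the most probable class, as in Note (6).

    On an exact tie the first class in CLASSES order is returned; that
    tie-breaking rule is an assumption of this sketch.
    """
    if len(probabilities) != len(CLASSES):
        raise ValueError("expected one probability per class")
    best = max(range(len(CLASSES)), key=lambda i: probabilities[i])
    return CLASSES[best], probabilities[best]


def split_flags(flags: str) -> List[str]:
    """Split a Flags value on '|' (Note 8), dropping empty pieces."""
    return [piece.strip() for piece in flags.split("|") if piece.strip()]


# Hypothetical values for illustration only (not a real catalogue row).
example_probs = [0.05, 0.10, 0.02, 0.03, 0.60, 0.05, 0.05, 0.10]
example_label, example_p = predicted_class(example_probs)
example_flags = split_flags("conf|pileup")
```

With the made-up probabilities above, the fifth column (PLM*) is the largest,
so the predicted class is LM-STAR with ClassP = 0.60.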
(End) Prepared by [AAS], Patricia Vannier [CDS] 30-Jun-2023
The document above follows the rules of the Standard Description for Astronomical Catalogues; from this documentation it is possible to generate an f77 program to load the files into arrays or line by line.
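
The same byte-by-byte information can drive a loader in other languages. The
Python sketch below slices one fixed-width record using a few of the
table8.dat byte ranges listed above (1-indexed, inclusive). The names
TABLE8_FIELDS, parse_record, and read_table are hypothetical, only six of the
columns are listed, and demo_line is a synthetic record built for
illustration rather than a row from the catalogue.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (start_byte, end_byte, label, converter), copied from the table8.dat
# byte-by-byte description; extend the list to read more columns.
Field = Tuple[int, int, str, Callable[[str], object]]

TABLE8_FIELDS: List[Field] = [
    (1, 21, "CSCv2", str),
    (23, 33, "RAdeg", float),
    (35, 45, "DEdeg", float),
    (47, 50, "PU", float),
    (52, 57, "S/N", float),
    (119, 123, "PIntra", float),  # nullable column ("?" in the description)
]


def parse_record(line: str, fields: List[Field]) -> Dict[str, Optional[object]]:
    """Slice one fixed-width line; blank fields are returned as None."""
    record: Dict[str, Optional[object]] = {}
    for start, end, label, convert in fields:
        raw = line[start - 1:end].strip()  # convert 1-indexed bytes to a slice
        record[label] = convert(raw) if raw else None
    return record


def read_table(path: str, fields: List[Field]) -> List[Dict[str, Optional[object]]]:
    """Parse every non-empty line of a fixed-width data file."""
    with open(path) as handle:
        return [parse_record(line.rstrip("\n"), fields)
                for line in handle if line.strip()]


# Synthetic 473-byte record: name, RA, Dec, PU, S/N filled; the rest blank.
demo_line = (
    "2CXO J000000.0+000000" + " "
    + f"{10.5:11.7f}" + " "
    + f"{-20.25:11.7f}" + " "
    + f"{0.75:4.2f}" + " "
    + f"{12.34:6.2f}"
).ljust(473)

demo_record = parse_record(demo_line, TABLE8_FIELDS)
```

Here demo_record["RAdeg"] is 10.5 and demo_record["PIntra"] is None because
that field is blank. For real use, astropy.io.ascii.read with format="cds" and
readme="ReadMe" may be a more convenient route, since it reads the column
layout from the ReadMe itself.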