J/MNRAS/503/5263 Sorting of 4XMM-DR9 sources by machine learning  (Zhang+, 2021)

Classification of 4XMM-DR9 sources by machine learning.
    Zhang Y., Zhao Y., Wu X.-B.
   <Mon. Not. R. Astron. Soc., 503, 5263-5273 (2021)>
   =2021MNRAS.503.5263Z    (SIMBAD/NED BibCode)
ADC_Keywords: X-ray sources ; Optical ; Infrared sources ; Galaxies ; QSOs ;
              Stars, normal
Keywords: methods: data analysis - methods: statistical -
          astronomical data bases: miscellaneous; catalogues -
          stars: general - galaxies: general

Abstract:
    The ESA's X-ray Multi-mirror Mission (XMM-Newton) created a new
    high-quality version of the XMM-Newton serendipitous source catalogue,
    4XMM-DR9, which provides a wealth of information for observed sources.
    The 4XMM-DR9 catalogue is correlated with the Sloan Digital Sky Survey
    (SDSS) DR12 photometric data base and the AllWISE data base; we then get
    X-ray sources with information from the X-ray, optical, and/or infrared
    bands and obtain the XMM-WISE, XMM-SDSS, and XMM-WISE-SDSS samples.
    Based on the large spectroscopic surveys of SDSS and the Large Sky Area
    Multi-object Fiber Spectroscopic Telescope (LAMOST), we cross-match the
    XMM-WISE-SDSS sample with sources of known spectral classes, and obtain
    known samples of stars, galaxies, and quasars. The distribution of
    stars, galaxies, and quasars as well as all spectral classes of stars in
    2D parameter space is presented. Various machine-learning methods are
    applied to different samples from different bands. The better classified
    results are retained. For the sample from the X-ray band, a
    rotation-forest classifier performs the best. For the sample from the
    X-ray and infrared bands, a random-forest algorithm outperforms all
    other methods. For the samples from the X-ray, optical, and/or infrared
    bands, the LogitBoost classifier shows its superiority. Thus, all X-ray
    sources in the 4XMM-DR9 catalogue with different input patterns are
    classified by their respective models that are created by these best
    methods. Their membership of and membership probabilities for individual
    X-ray sources are assigned. The classified result will be of great value
    for the further research of X-ray sources in greater detail.
Description:
    First, we cross-matched the 4XMM-DR9 catalogue (Webb et al.
    2020A&A...641A.136W, Cat. IX/59) with the SDSS DR12 photometric data
    base (Alam et al. 2015ApJS..219...12A, Cat. V/147) and the AllWISE data
    base (Cutri et al. 2013, Cat. II/328), which are correlated by the
    parameters of known objects. We obtained the XMM-WISE, XMM-SDSS, and
    XMM-WISE-SDSS samples, which contain X-ray sources with information from
    the X-ray, optical, and/or infrared bands.

    Secondly, we used the Large Sky Area Multi-object Fiber Spectroscopic
    Telescope (LAMOST; Cui et al. 2012RAA....12.1197C; Luo et al.
    2015RAA....15.1095L, Cat. V/146) to identify stars and galaxies, and the
    SDSS Data Release 14 Quasar catalogue (DR14Q; Paris et al.
    2018A&A...613A..51P, Cat. VII/286) to identify quasars. We created
    multiple samples of known objects in order to classify X-ray sources
    into three groups: the Galaxy class (with subclasses such as AGN, SB,
    etc.), the Star class (with subclasses such as O, A, etc.) and QSOs.

    Finally, we trained three machine-learning algorithms: rotation-forest
    (Rodriguez, Kuncheva & Alonso 2006, IEEE Trans. Pattern Analysis and
    Machine Intelligence, 28, 1619), random-forest (Breiman 2001, Machine
    Learning, 45, 5) and LogitBoost (Friedman, Hastie & Tibshirani 2000,
    Ann. Statistics, 28, 337) on input pattern parameters (see section for
    more details) in order to recognize subclasses of galaxies, stars, and
    quasars for the four sample cases (X-ray band only; X-ray and optical
    bands only; X-ray and infrared bands only; X-ray, optical and infrared
    bands).
    We assigned LogitBoost to two cases (X-ray and optical bands only;
    X-ray, optical and infrared bands), which gives two different
    classifiers, the rotation-forest classifier to the X-ray-only case, and
    the random-forest classifier to the X-ray and infrared case. Given the
    precision achieved by the algorithms when classifying and
    sub-classifying known X-ray sources from our training samples, we
    decided to keep only the three main classes (stars, galaxies and QSOs)
    to obtain the most accurate predicted classification probabilities for
    unknown X-ray sources (see sections 4 and 5).

    For the 4XMM-DR9 sources, all predicted results are given in
    table10.dat.

File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
table10.dat      108   550124   Predicted results of machine learning
                                 classifications of 4XMM-DR9 sources
--------------------------------------------------------------------------------

See also:
 IX/59   : XMM-Newton Serendipitous Source Catalogue 4XMM-DR9 (Webb+, 2020)
 V/147   : The SDSS Photometric Catalogue, Release 12 (Alam+, 2015)
 II/328  : AllWISE Data Release (Cutri+ 2013)
 V/146   : LAMOST DR1 catalogs (Luo+, 2015)
 VII/286 : SDSS quasar catalog, fourteenth data release (Paris+, 2018)

Byte-by-byte Description of file: table10.dat
--------------------------------------------------------------------------------
   Bytes Format Units Label    Explanations
--------------------------------------------------------------------------------
   1- 15 I15    ---   Source   Source ID (scrid)
  17- 35 E19.17 deg   RAdeg    Right ascension in decimal degrees (sc_ra)
                                (J2000)
  37- 56 E20.17 deg   DEdeg    Declination in decimal degrees (sc_dec)
                                (J2000)
  58- 63 A6     ---   ClassX   Source classification from the first
                                classifier (rotation forest, X-ray
                                information only) (Class_x)
  65- 69 F5.3   ---   PX       Classification probability deduced from the
                                X-ray band only (P_x)
  71- 76 A6     ---   ClassXO  ? Source classification from the third
                                classifier (LogitBoost, X-ray and optical
                                information only) (Class_xo)
  78- 82 F5.3   ---   PXO      ? Classification probability deduced from the
                                X-ray and optical bands only (P_xo)
  84- 89 A6     ---   ClassXI  ? Source classification from the second
                                classifier (random forest, X-ray and infrared
                                information only) (Class_xi)
  91- 95 F5.3   ---   PXI      ? Classification probability deduced from the
                                X-ray and infrared bands only (P_xi)
  97-102 A6     ---   ClassXIO ? Source classification from the fourth
                                classifier (LogitBoost, X-ray, optical and
                                infrared information only) (Class_xio)
 104-108 F5.3   ---   PXIO     ? Classification probability deduced from the
                                X-ray, optical and infrared bands (P_xio)
--------------------------------------------------------------------------------

History:
    From electronic version of the journal
(End) Luc Trabelsi[CDS] 16-Apr-2024
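
The byte-by-byte description of table10.dat maps directly onto a fixed-width reader. The following is a minimal Python sketch of such a reader, not part of the catalogue distribution. The names `COLUMNS`, `parse_record` and `read_table` are illustrative, and the sketch assumes that the columns flagged "?" are left blank when no value is available.

```python
# Minimal fixed-width reader for table10.dat (illustrative sketch).
# Byte positions are 1-based and inclusive, as in the byte-by-byte description.
COLUMNS = [
    ("Source",    1,  15, int),
    ("RAdeg",    17,  35, float),
    ("DEdeg",    37,  56, float),
    ("ClassX",   58,  63, str),
    ("PX",       65,  69, float),
    ("ClassXO",  71,  76, str),
    ("PXO",      78,  82, float),
    ("ClassXI",  84,  89, str),
    ("PXI",      91,  95, float),
    ("ClassXIO", 97, 102, str),
    ("PXIO",    104, 108, float),
]

RECORD_LENGTH = 108  # Lrecl of table10.dat in the File Summary


def parse_record(line):
    """Convert one 108-byte record into a dict keyed by column label."""
    # Pad short lines so that stripped trailing blanks do not shift the slices.
    line = line.rstrip("\n").ljust(RECORD_LENGTH)
    record = {}
    for label, first, last, convert in COLUMNS:
        field = line[first - 1:last].strip()
        # Assumption: a blank field means "no value" (columns flagged "?").
        record[label] = convert(field) if field else None
    return record


def read_table(path):
    """Yield one parsed record per non-empty line of the file at `path`."""
    with open(path, encoding="ascii") as handle:
        for line in handle:
            if line.strip():
                yield parse_record(line)
```

A typical use would be `for rec in read_table("table10.dat"): ...`, after which `rec["ClassXIO"]` is either a class label or `None`, depending on whether the source has optical and infrared counterparts.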