J/MNRAS/503/5263 Sorting of 4XMM-DR9 sources by machine learning  (Zhang+, 2021)
================================================================================
Classification of 4XMM-DR9 sources by machine learning.
    Zhang Y., Zhao Y., Wu X.-B.
   <Mon. Not. R. Astron. Soc., 503, 5263-5273 (2021)>
   =2021MNRAS.503.5263Z    (SIMBAD/NED BibCode)
================================================================================
ADC_Keywords: X-ray sources ; Optical ; Infrared sources ; Galaxies ; QSOs ;
              Stars, normal
Keywords: methods: data analysis - methods: statistical -
          astronomical data bases: miscellaneous; catalogues - stars: general -
          galaxies: general

Abstract:
    The ESA's X-ray Multi-mirror Mission (XMM-Newton) created a new
    high-quality version of the XMM-Newton serendipitous source catalogue,
    4XMM-DR9, which provides a wealth of information for observed sources.
    The 4XMM-DR9 catalogue is correlated with the Sloan Digital Sky Survey
    (SDSS) DR12 photometric data base and the AllWISE data base; we then
    get X-ray sources with information from the X-ray, optical, and/or
    infrared bands and obtain the XMM-WISE, XMM-SDSS, and XMM-WISE-SDSS
    samples. Based on the large spectroscopic surveys of SDSS and the
    Large Sky Area Multi-object Fiber Spectroscopic Telescope (LAMOST), we
    cross-match the XMM-WISE-SDSS sample with sources of known spectral
    classes, and obtain known samples of stars, galaxies, and quasars. The
    distribution of stars, galaxies, and quasars as well as all spectral
    classes of stars in 2D parameter space is presented. Various
    machine-learning methods are applied to different samples from
    different bands. The better classified results are retained. For the
    sample from the X-ray band, a rotation-forest classifier performs the
    best. For the sample from the X-ray and infrared bands, a
    random-forest algorithm outperforms all other methods. For the samples
    from the X-ray, optical, and/or infrared bands, the LogitBoost
    classifier shows its superiority. Thus, all X-ray sources in the
    4XMM-DR9 catalogue with different input patterns are classified by
    their respective models that are created by these best methods. Their
    membership of and membership probabilities for individual X-ray
    sources are assigned. The classified result will be of great value for
    the further research of X-ray sources in greater detail.

Description:
    First, we cross-matched the 4XMM-DR9 catalogue (Webb et al.
    2020A&A...641A.136W, Cat. IX/59) with the SDSS DR12 photometric data
    base (Alam et al. 2015ApJS..219...12A, Cat. V/147) and the AllWISE data
    base (Cutri et al. 2013, Cat. II/328). The catalogues were correlated
    using the parameters of known objects. We obtained the XMM-WISE,
    XMM-SDSS, and XMM-WISE-SDSS samples, which contain X-ray sources with
    information from the X-ray, optical, and/or infrared bands.
    Second, we identified stars and galaxies using the Large Sky Area
    Multi-object Fiber Spectroscopic Telescope (LAMOST; Cui et al.
    2012RAA....12.1197C; Luo et al. 2015RAA....15.1095L, Cat. V/146), and
    quasars using the SDSS Data Release 14 Quasar catalogue (DR14Q; Paris
    et al. 2018A&A...613A..51P, Cat. VII/286). We created multiple samples
    of known objects in order to classify X-ray sources into three groups:
    the Galaxy class (with subclasses such as AGN, SB, etc.), the Star
    class (with subclasses such as O, A, etc.), and QSOs.
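
    The positional cross-match described above can be illustrated with a
    minimal sketch. It is not the authors' procedure: the function names,
    the brute-force search and the radius_arcsec parameter are assumptions
    made for illustration, and no matching radius is given in this file.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula).

    All four inputs are in decimal degrees.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def nearest_match(source, catalogue, radius_arcsec):
    """Return the closest catalogue entry within radius_arcsec, or None.

    source and the catalogue entries are (ra_deg, dec_deg) tuples.
    radius_arcsec is a hypothetical parameter chosen by the caller.
    Brute force: every catalogue entry is compared with the source.
    """
    best, best_sep = None, radius_arcsec / 3600.0
    for entry in catalogue:
        sep = angular_separation_deg(source[0], source[1], entry[0], entry[1])
        if sep <= best_sep:
            best, best_sep = entry, sep
    return best
```

    For catalogues of this size a spatial index would normally replace the
    brute-force loop; the sketch only shows the matching criterion.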

    Finally, we trained 3 different machine learning algorithms:
    rotation-forest (Rodriguez, Kuncheva & Alonso 2006, IEEE Trans. Pattern
    Analysis and Machine Intelligence, 28, 1619), random-forest (Breiman
    2001, Machine Learning, 45, 5) and LogitBoost (Friedman, Hastie &
    Tibshirani 2000, Ann. Statistics, 28, 337) on the input pattern
    parameters (see section for more details) in order to recognize
    subclasses of galaxies, stars, and quasars for the 4 different cases
    of samples (X-ray band only; X-ray and optical bands; X-ray and
    infrared bands; X-ray, optical and infrared bands). We assigned
    LogitBoost to two cases (X-ray and optical bands; X-ray, optical and
    infrared bands), which gives two different classifiers, the
    rotation-forest classifier to the X-ray-only case, and the
    random-forest classifier to the X-ray and infrared case.
    Given the precision achieved by the algorithms when classifying and
    sub-classifying known X-ray sources from our training samples, we
    decided to keep only the three main classes (stars, galaxies and QSOs)
    to obtain the most accurate predicted classification probabilities for
    unknown X-ray sources (see sections 4 and 5). For the 4XMM-DR9
    sources, all predicted results are given in table10.dat.
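
    The assignment of classifiers to input patterns stated above can be
    written as a small lookup. This is only a sketch: the band names
    ("xray", "optical", "infrared"), the BEST_CLASSIFIER table and the
    classifier_for helper are hypothetical names introduced here, and the
    code does not train or run any model.

```python
# Input pattern (set of available bands) -> classifier retained for it,
# as stated in the Description. The band names are illustrative labels.
BEST_CLASSIFIER = {
    frozenset({"xray"}): "rotation-forest",
    frozenset({"xray", "infrared"}): "random-forest",
    frozenset({"xray", "optical"}): "LogitBoost",
    frozenset({"xray", "optical", "infrared"}): "LogitBoost",
}


def classifier_for(bands):
    """Return the name of the classifier retained for a set of bands.

    Raises ValueError for any combination that is not one of the four
    cases listed in the Description (every case includes the X-ray band).
    """
    key = frozenset(bands)
    if key not in BEST_CLASSIFIER:
        raise ValueError("unsupported input pattern: %s" % sorted(key))
    return BEST_CLASSIFIER[key]
```

    For example, classifier_for(["xray", "infrared"]) returns
    "random-forest".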

File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
table10.dat      108   550124   predicted results of machine learning
                                classifications of 4XMM-DR9 sources
--------------------------------------------------------------------------------

See also:
 IX/59   : XMM-Newton Serendipitous Source Catalogue 4XMM-DR9 (Webb+, 2020)
 V/147   : The SDSS Photometric Catalogue, Release 12 (Alam+, 2015)
 II/328  : AllWISE Data Release (Cutri+, 2013)
 V/146   : LAMOST DR1 catalogs (Luo+, 2015)
 VII/286 : SDSS quasar catalog, fourteenth data release (Paris+, 2018)

Byte-by-byte Description of file: table10.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1- 15 I15    ---     Source    Source ID (scrid)
  17- 35 E19.17 deg     RAdeg     Right ascension in decimal degrees (sc_ra)
                                  (J2000)
  37- 56 E20.17 deg     DEdeg     Declination in decimal degrees (sc_dec)
                                  (J2000)
  58- 63  A6    ---     ClassX    Source classification for the first
                                  classifier machine learning method
                                  (rotation-forest classifier X-ray
                                  information only) (Class_x)
  65- 69  F5.3  ---     PX        The classification probabilities derived
                                  for sources that only have the X-ray band
                                  (P_x)
  71- 76  A6    ---     ClassXO   ? Source classification for the third
                                  classifier machine learning method
                                  (LogitBoost classifiers X-ray and optical
                                  bands information only) (Class_xo)
  78- 82  F5.3  ---     PXO       ? The classification probabilities derived
                                  for sources that only have the X-ray and
                                  optical bands (P_xo)
  84- 89  A6    ---     ClassXI   ? Source classification for the second
                                  classifier machine learning method
                                  (random-forest classifier X-ray and
                                  infrared bands information only) (Class_xi)
  91- 95  F5.3  ---     PXI       ? The classification probabilities derived
                                  for sources that only have the X-ray and
                                  infrared bands (P_xi)
  97-102  A6    ---     ClassXIO  ? Source classification for the fourth
                                  classifier machine learning method
                                  (LogitBoost classifiers X-ray, optical
                                  and infrared bands information only)
                                  (Class_xio)
 104-108  F5.3  ---     PXIO      ? The classification probabilities derived
                                  for sources that have X-ray, optical and
                                  infrared bands (P_xio)
--------------------------------------------------------------------------------
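
    The byte ranges above can be turned into a simple fixed-width reader.
    The sketch below assumes that columns flagged with "?" are left blank
    when no value is available and returns None for them. FIELDS and
    parse_record are hypothetical names, not part of any catalogue tool.

```python
# (label, first byte, last byte, converter), copied from the byte-by-byte
# description of table10.dat. Byte positions are 1-based and inclusive.
FIELDS = [
    ("Source", 1, 15, int),
    ("RAdeg", 17, 35, float),
    ("DEdeg", 37, 56, float),
    ("ClassX", 58, 63, str),
    ("PX", 65, 69, float),
    ("ClassXO", 71, 76, str),
    ("PXO", 78, 82, float),
    ("ClassXI", 84, 89, str),
    ("PXI", 91, 95, float),
    ("ClassXIO", 97, 102, str),
    ("PXIO", 104, 108, float),
]


def parse_record(line):
    """Parse one line of table10.dat into a dict keyed by column label.

    Blank fields (assumed to mark missing values) are returned as None.
    Lines whose trailing blanks were stripped are handled as well.
    """
    record = {}
    for label, first, last, convert in FIELDS:
        text = line[first - 1:last].strip()
        record[label] = convert(text) if text else None
    return record
```

    A file can then be read with
    [parse_record(line) for line in open("table10.dat")].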

History:
    From electronic version of the journal

================================================================================
(End)                                           Luc Trabelsi[CDS]    16-Apr-2024
