J/A+A/664/A51     Feature-based asteroid taxonomy in 3D color space (Roh+, 2022)

A new approach to feature-based asteroid taxonomy in 3D color space.
I. SDSS photometric system.
    Roh D.-G., Moon H.-K., Shin M.-S., DeMeo F.E.
    <Astron. Astrophys. 664, A51 (2022)>
    =2022A&A...664A..51R        (SIMBAD/NED BibCode)

ADC_Keywords: Solar system ; Minor planets ; Photometry ; Morphology
Keywords: minor planets, asteroids: general - techniques: photometric -
          methods: statistical

Abstract:
    The taxonomic classification of asteroids has been mostly based on
    spectroscopic observations with wavelengths spanning from the visible
    (VIS) to the near-infrared (NIR). VIS-NIR spectra of ∼2500 asteroids have
    been obtained since the 1970s; the Sloan Digital Sky Survey (SDSS) Moving
    Object Catalog 4 (MOC 4) was released with ∼4x10^5 measurements of
    asteroid positions and colors in the early 2000s. A number of works then
    devised methods to classify these data within the framework of existing
    taxonomic systems. Some of these works, however, used 2D parameter space
    (e.g., gri slope vs. z-i color) that displayed a continuous distribution
    of clouds of data points resulting in boundaries that were artificially
    defined. We introduce here a more advanced method to classify asteroids
    based on existing systems. This approach is simply represented by a
    triplet of SDSS colors. The distributions and memberships of each
    taxonomic type are determined by machine learning methods in the form of
    both unsupervised and semi-supervised learning. We apply our scheme to
    MOC 4 calibrated with VIS-NIR reflectance spectra. We successfully
    separate seven different taxonomic classes (C, D, K, L, S, V, and X) for
    which we have a sufficient number of spectroscopic datasets. We found
    that the overlapping regions of taxonomic types in a 2D plane were
    separated with relatively clear boundaries in the 3D space newly defined
    in this work. Our scheme explicitly discriminates between different
    taxonomic types (e.g., K and X types), which is an improvement over
    existing systems. This new method for taxonomic classification has a
    great deal of scalability for asteroid research, such as space weathering
    in the S-complex, and the origin and evolution of asteroid families.
    We present the structure of the asteroid belt, and describe the orbital
    distribution based on our newly assigned taxonomic classifications. It is
    also possible to extend the methods presented here to other photometric
    systems, such as the Johnson-Cousins and LSST filter systems.

Description:
    Clustering is one kind of machine learning method to identify structures
    in a given dataset. The exploration of the data in the 3D color space
    informs us that the structure traced by the objects with the known
    taxonomy types can be well described by Gaussian shapes in the color
    space. We applied two clustering methods using Gaussian mixtures to
    identify known taxonomy types in the 3D color space.

    The first method (method A) uses an infinite Gaussian mixture model to
    describe the distribution of objects as a mixture of multiple Gaussian
    distributions in multi-dimensional color space without fixing the number
    of the components beforehand. This method simply tries to find a
    concentration of data in multi-dimensional color space even though we
    already know the taxonomy types of some objects with measured colors.
    Therefore, if there are not enough data to be identified as concentrated
    clusters, method A fails to recover these low-density clusters.

    The second method (method B) is a finite Gaussian mixture model that
    needs to predefine the number of Gaussian mixture components, and we
    adopt method B as a semi-supervised machine learning method that uses the
    colors of a few objects with known taxonomy types as a guide to infer the
    mixture properties. In method B new types cannot be found since the
    number of clusters is fixed as the number of the given objects with the
    known taxonomy types.

File Summary:
--------------------------------------------------------------------------------
 FileName   Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe         80        .   This file
table1.dat     18     4213   Taxonomy types in method A with the dissimilarity
                              matrix
table3.dat    183     4213   Taxonomy types in method A with the MAP estimation
                              of the parameters
table5.dat    177     4213   Taxonomy types in method B
table6.dat     18     2296   Objects with the consistent taxonomy types in
                              methods A and B
--------------------------------------------------------------------------------

Byte-by-byte Description of file: table1.dat table6.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1- 16  A16   ---     Name      Object name
      18  A1    ---     Type      Taxonomy type
--------------------------------------------------------------------------------

Byte-by-byte Description of file: table3.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1- 16  A16   ---     Name      Object name
  18- 22  A5    ---     ClustN    Cluster number (taxonomy type) (N (A))
  24- 35  E12.5 ---     Wmemb1    Membership weight for Cluster number = 1
  37- 48  E12.5 ---     Wmemb2    Membership weight for Cluster number = 2
  50- 60  E11.5 ---     Wmemb3    Membership weight for Cluster number = 3
  62- 73  E12.5 ---     Wmemb4    Membership weight for Cluster number = 4
  75- 85  E11.5 ---     Wmemb5    Membership weight for Cluster number = 5
  87- 98  E12.5 ---     Wmemb6    Membership weight for Cluster number = 6
 100-110  E11.5 ---     Wmemb7    Membership weight for Cluster number = 7
 112-122  E11.5 ---     Wmemb8    Membership weight for Cluster number = 8
 124-134  E11.5 ---     Wmemb9    Membership weight for Cluster number = 9
 136-146  E11.5 ---     Wmemb10   Membership weight for Cluster number = 10
 148-159  E12.5 ---     Wmemb11   Membership weight for Cluster number = 11
 161-171  E11.5 ---     Wmemb12   Membership weight for Cluster number = 12
 173-183  E11.5 ---     Wmemb13   Membership weight for Cluster number = 13
--------------------------------------------------------------------------------

Byte-by-byte Description of file: table5.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1- 16  A16   ---     Name      Object name
  18- 23  A6    ---     ClustN    Cluster number (taxonomy type) (NN (A))
  25- 35  E11.5 ---     Wmemb1    Membership weight for Cluster number = 1
  37- 47  E11.5 ---     Wmemb2    Membership weight for Cluster number = 2
  49- 60  E12.5 ---     Wmemb3    Membership weight for Cluster number = 3
  62- 73  E12.5 ---     Wmemb4    Membership weight for Cluster number = 4
  75- 86  E12.5 ---     Wmemb5    Membership weight for Cluster number = 5
  88- 99  E12.5 ---     Wmemb6    Membership weight for Cluster number = 6
 101-112  E12.5 ---     Wmemb7    Membership weight for Cluster number = 7
 114-125  E12.5 ---     Wmemb8    Membership weight for Cluster number = 8
 127-138  E12.5 ---     Wmemb9    Membership weight for Cluster number = 9
 140-151  E12.5 ---     Wmemb10   Membership weight for Cluster number = 10
 153-164  E12.5 ---     Wmemb11   Membership weight for Cluster number = 11
 166-177  E12.5 ---     Wmemb12   Membership weight for Cluster number = 12
--------------------------------------------------------------------------------

Acknowledgements:
    Dong-Goo Roh, rrdong9(at)kasi.re.kr

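
The byte ranges given for table3.dat are enough to read that file as
fixed-width text. The following is a minimal Python sketch of such a reader,
not part of the catalogue itself. The helper names (parse_fixed_width,
render_fixed_width, dominant_cluster) and the sample record are hypothetical
and for illustration only; picking the cluster with the largest membership
weight is an assumption about how one might summarise a row, not a statement
of how ClustN was assigned.

```python
# Column layout copied from "Byte-by-byte Description of file: table3.dat".
# Each entry is (label, first byte, last byte, converter); byte positions are
# 1-indexed and inclusive, as in the description.
TABLE3_FIELDS = [
    ("Name", 1, 16, str),
    ("ClustN", 18, 22, str),
    ("Wmemb1", 24, 35, float),
    ("Wmemb2", 37, 48, float),
    ("Wmemb3", 50, 60, float),
    ("Wmemb4", 62, 73, float),
    ("Wmemb5", 75, 85, float),
    ("Wmemb6", 87, 98, float),
    ("Wmemb7", 100, 110, float),
    ("Wmemb8", 112, 122, float),
    ("Wmemb9", 124, 134, float),
    ("Wmemb10", 136, 146, float),
    ("Wmemb11", 148, 159, float),
    ("Wmemb12", 161, 171, float),
    ("Wmemb13", 173, 183, float),
]


def parse_fixed_width(line, fields):
    """Slice one record into a dict; blank fields become None."""
    record = {}
    for label, start, end, convert in fields:
        raw = line[start - 1:end].strip()
        record[label] = convert(raw) if raw else None
    return record


def render_fixed_width(values, fields):
    """Hypothetical helper: build a synthetic record line for testing.

    Missing weights default to 0.0 and missing text fields to blanks.
    """
    width = max(end for _, _, end, _ in fields)
    chars = [" "] * width
    for label, start, end, convert in fields:
        size = end - start + 1
        if convert is float:
            text = f"{values.get(label, 0.0):{size}.5E}"
        else:
            text = f"{values.get(label, ''):<{size}}"
        chars[start - 1:end] = text[:size].ljust(size)
    return "".join(chars)


def dominant_cluster(record):
    """Return the cluster number whose membership weight is largest.

    Assumption: the largest weight is a reasonable one-number summary.
    """
    weights = {
        label: value
        for label, value in record.items()
        if label.startswith("Wmemb") and value is not None
    }
    best = max(weights, key=weights.get)
    return int(best[len("Wmemb"):])


# Synthetic record for illustration only (not taken from the catalogue).
sample = {"Name": "EXAMPLE", "ClustN": "2 (X)", "Wmemb2": 0.9, "Wmemb5": 0.1}
line = render_fixed_width(sample, TABLE3_FIELDS)
record = parse_fixed_width(line, TABLE3_FIELDS)
print(record["Name"], record["ClustN"], dominant_cluster(record))
```

To read the real file, one would apply parse_fixed_width to each line of
table3.dat; table5.dat needs its own field list built from its byte ranges,
since its columns sit at different positions.
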
(End) Patricia Vannier [CDS] 08-Jul-2022