J/A+A/664/A51 Feature-based asteroid taxonomy in 3D color space (Roh+, 2022)
A new approach to feature-based asteroid taxonomy in 3D color space.
I. SDSS photometric system.
Roh D.-G., Moon H.-K., Shin M.-S., DeMeo F.E.
<Astron. Astrophys. 664, A51 (2022)>
=2022A&A...664A..51R        (SIMBAD/NED BibCode)
ADC_Keywords: Solar system ; Minor planets ; Photometry ; Morphology
Keywords: minor planets, asteroids: general - techniques: photometric -
methods: statistical
Abstract:
The taxonomic classification of asteroids has been mostly based on
spectroscopic observations with wavelengths spanning from the visible
(VIS) to the near-infrared (NIR). VIS-NIR spectra of ∼2500 asteroids
have been obtained since the 1970s; the Sloan Digital Sky Survey
(SDSS) Moving Object Catalog 4 (MOC 4) was released with ∼4x10^5
measurements of asteroid positions and colors in the early 2000s. A
number of works then devised methods to classify these data within the
framework of existing taxonomic systems. Some of these works, however,
used 2D parameter space (e.g., gri slope vs. z-i color) that displayed
a continuous distribution of clouds of data points resulting in
boundaries that were artificially defined. We introduce here a more
advanced method to classify asteroids based on existing systems. This
approach is simply represented by a triplet of SDSS colors. The
distributions and memberships of each taxonomic type are determined by
machine learning methods in the form of both unsupervised and
semi-supervised learning. We apply our scheme to MOC 4 calibrated with
VIS-NIR reflectance spectra. We successfully separate seven different
taxonomy classifications (C, D, K, L, S, V, and X) for which we have
a sufficient number of spectroscopic datasets. We found the
overlapping regions of taxonomic types in a 2D plane were separated
with relatively clear boundaries in the 3D space newly defined in this
work. Our scheme explicitly discriminates between different taxonomic
types (e.g., K and X types), which is an improvement over existing
systems. This new method for taxonomic classification has a great deal
of scalability for asteroid research, such as space weathering in the
S-complex, and the origin and evolution of asteroid families. We
present the structure of the asteroid belt, and describe the orbital
distribution based on our newly assigned taxonomic classifications. It
is also possible to extend the methods presented here to other
photometric systems, such as the Johnson-Cousins and LSST filter
systems.
Description:
Clustering is a class of machine learning methods for identifying
structures in a given dataset. Exploring the data in the 3D color
space shows that the structure traced by objects with known taxonomy
types is well described by Gaussian shapes in that space. We applied
two clustering methods using Gaussian
mixtures to identify known taxonomy types in the 3D color space. The
first method (method A) uses an infinite Gaussian mixture model to
describe the distribution of objects as a mixture of multiple Gaussian
distributions in multi-dimensional color space without fixing the
number of the components beforehand. This method simply tries to find
a concentration of data in multi-dimensional color space even though
we already know the taxonomy types of some objects with measured
colors. Therefore, if there are not enough data to be identified as
concentrated clusters, method A fails to recover these low-density
clusters. The second method (method B) is a finite Gaussian mixture
model that requires the number of Gaussian mixture components to be
predefined. We adopt method B as a semi-supervised machine learning
method that uses the colors of a few objects with known taxonomy
types as a guide to infer the mixture properties. In method B, new
types cannot be found because the number of clusters is fixed to the
number of known taxonomy types among the given objects.
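For orientation, a Gaussian mixture in the 3D color space can be written in the generic textbook form below. This is not a formula quoted from the paper. The notation is assumed here for illustration: K components, mixing weights \pi_k, means \mu_k, covariances \Sigma_k, and \gamma_{ik} for the posterior probability that object i belongs to component k.

```latex
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),
\qquad \sum_{k=1}^{K} \pi_k = 1

\gamma_{ik} = \frac{\pi_k \,
    \mathcal{N}(\mathbf{x}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}
    {\sum_{j=1}^{K} \pi_j \,
    \mathcal{N}(\mathbf{x}_i \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}
```

In a finite mixture (method B), K is fixed in advance. In an infinite mixture (method A), K is not fixed and the number of occupied components is inferred from the data. Whether the membership weights tabulated in this catalog are exactly of the form \gamma_{ik} is an assumption and should be checked against the paper.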
File Summary:
--------------------------------------------------------------------------------
FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe           80        .   This file
table1.dat       18     4213   Taxonomy types in method A with the
                                dissimilarity matrix
table3.dat      183     4213   Taxonomy types in method A with the MAP
                                estimation of the parameters
table5.dat      177     4213   Taxonomy types in method B
table6.dat       18     2296   Objects with the consistent taxonomy types in
                                methods A and B
--------------------------------------------------------------------------------
Byte-by-byte Description of file: table1.dat table6.dat
--------------------------------------------------------------------------------
   Bytes  Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1- 16  A16    ---     Name      Object name
      18  A1     ---     Type      Taxonomy type
--------------------------------------------------------------------------------
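As a minimal sketch of reading these two-column files, the 1-based byte ranges above map directly onto Python slices: bytes 1-16 become `line[0:16]` and byte 18 becomes `line[17:18]`. The function names `parse_type_line` and `read_types` and the sample record are hypothetical, made up for illustration, and are not taken from the catalog.

```python
def parse_type_line(line):
    """Split one fixed-width record of table1.dat or table6.dat.

    Bytes 1-16 hold the object name (A16), byte 18 the taxonomy type (A1).
    """
    # Pad so that slicing still works if trailing blanks were trimmed.
    padded = line.rstrip("\n").ljust(18)
    name = padded[0:16].strip()
    tax_type = padded[17:18].strip()
    return name, tax_type


def read_types(path):
    """Return {object name: taxonomy type} for a whole file."""
    types = {}
    with open(path, encoding="ascii") as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                name, tax_type = parse_type_line(line)
                types[name] = tax_type
    return types


# Hypothetical record, not a real catalog row.
sample = "Example object".ljust(16) + " " + "S"
print(parse_type_line(sample))
```

Slicing by byte position rather than splitting on whitespace keeps object names that contain blanks intact.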
Byte-by-byte Description of file: table3.dat
--------------------------------------------------------------------------------
   Bytes  Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1- 16  A16    ---     Name      Object name
  18- 22  A5     ---     ClustN    Cluster number (taxonomy type) (N (A))
  24- 35  E12.5  ---     Wmemb1    Membership weight for Cluster number = 1
  37- 48  E12.5  ---     Wmemb2    Membership weight for Cluster number = 2
  50- 60  E11.5  ---     Wmemb3    Membership weight for Cluster number = 3
  62- 73  E12.5  ---     Wmemb4    Membership weight for Cluster number = 4
  75- 85  E11.5  ---     Wmemb5    Membership weight for Cluster number = 5
  87- 98  E12.5  ---     Wmemb6    Membership weight for Cluster number = 6
 100-110  E11.5  ---     Wmemb7    Membership weight for Cluster number = 7
 112-122  E11.5  ---     Wmemb8    Membership weight for Cluster number = 8
 124-134  E11.5  ---     Wmemb9    Membership weight for Cluster number = 9
 136-146  E11.5  ---     Wmemb10   Membership weight for Cluster number = 10
 148-159  E12.5  ---     Wmemb11   Membership weight for Cluster number = 11
 161-171  E11.5  ---     Wmemb12   Membership weight for Cluster number = 12
 173-183  E11.5  ---     Wmemb13   Membership weight for Cluster number = 13
--------------------------------------------------------------------------------
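The layout above can be read with plain string slicing. The sketch below hard-codes the thirteen weight ranges for table3.dat and returns the name, the cluster label, and the list of membership weights. The names `WEIGHT_BYTES`, `parse_table3_line`, and `make_sample_line` are hypothetical. The sample record, including its label "3 (S)", is invented to exercise the parser and says nothing about how cluster numbers map to taxonomy types in the real file.

```python
# Byte ranges (1-based, inclusive) of Wmemb1..Wmemb13, copied from the table.
WEIGHT_BYTES = [
    (24, 35), (37, 48), (50, 60), (62, 73), (75, 85), (87, 98), (100, 110),
    (112, 122), (124, 134), (136, 146), (148, 159), (161, 171), (173, 183),
]


def parse_table3_line(line):
    """Return (name, cluster label, list of 13 weights) for one record."""
    padded = line.rstrip("\n").ljust(183)
    name = padded[0:16].strip()
    cluster = padded[17:22].strip()
    weights = []
    for start, end in WEIGHT_BYTES:
        field = padded[start - 1:end].strip()
        # A blank field is kept as None instead of raising ValueError.
        weights.append(float(field) if field else None)
    return name, cluster, weights


def make_sample_line():
    """Build a hypothetical 183-byte record (not a real catalog row)."""
    row = [" "] * 183

    def put(start, end, text):
        # Right-justify the text inside its byte range.
        row[start - 1:end] = text.rjust(end - start + 1)

    put(1, 16, "Example object")
    put(18, 22, "3 (S)")
    for number, (start, end) in enumerate(WEIGHT_BYTES, start=1):
        value = 0.9 if number == 3 else 0.0
        put(start, end, format(value, ".5E"))
    return "".join(row)


name, cluster, weights = parse_table3_line(make_sample_line())
print(name, cluster, weights[2])
```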
Byte-by-byte Description of file: table5.dat
--------------------------------------------------------------------------------
   Bytes  Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1- 16  A16    ---     Name      Object name
  18- 23  A6     ---     ClustN    Cluster number (taxonomy type) (NN (A))
  25- 35  E11.5  ---     Wmemb1    Membership weight for Cluster number = 1
  37- 47  E11.5  ---     Wmemb2    Membership weight for Cluster number = 2
  49- 60  E12.5  ---     Wmemb3    Membership weight for Cluster number = 3
  62- 73  E12.5  ---     Wmemb4    Membership weight for Cluster number = 4
  75- 86  E12.5  ---     Wmemb5    Membership weight for Cluster number = 5
  88- 99  E12.5  ---     Wmemb6    Membership weight for Cluster number = 6
 101-112  E12.5  ---     Wmemb7    Membership weight for Cluster number = 7
 114-125  E12.5  ---     Wmemb8    Membership weight for Cluster number = 8
 127-138  E12.5  ---     Wmemb9    Membership weight for Cluster number = 9
 140-151  E12.5  ---     Wmemb10   Membership weight for Cluster number = 10
 153-164  E12.5  ---     Wmemb11   Membership weight for Cluster number = 11
 166-177  E12.5  ---     Wmemb12   Membership weight for Cluster number = 12
--------------------------------------------------------------------------------
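Because the byte ranges differ between tables, one alternative to hard-coding slices is to derive them from the description rows themselves. The sketch below parses rows of the form "start- end format units label explanation" with a regular expression and then applies the resulting column list to a data record. The names `ROW`, `parse_description`, and `parse_record` are hypothetical, only the first three rows of the table5.dat description are used, and the data record is invented for illustration.

```python
import re

# One description row: "start[- end] format units label explanation".
ROW = re.compile(r"^\s*(\d+)(?:\s*-\s*(\d+))?\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)$")


def parse_description(lines):
    """Return a list of (label, start, end, format) tuples."""
    columns = []
    for line in lines:
        match = ROW.match(line)
        if match is None:
            continue  # rulers and header rows do not start with a byte number
        start = int(match.group(1))
        end = int(match.group(2) or start)  # single-byte columns have no end
        columns.append((match.group(5), start, end, match.group(3)))
    return columns


def parse_record(line, columns):
    """Slice one data record according to the parsed columns."""
    record = {}
    for label, start, end, fmt in columns:
        text = line[start - 1:end].strip()
        # E and F formats are floating point; everything else stays text.
        record[label] = float(text) if fmt[0] in "EF" and text else text
    return record


description = [
    "   Bytes  Format Units   Label     Explanations",
    "   1- 16  A16    ---     Name      Object name",
    "  18- 23  A6     ---     ClustN    Cluster number (taxonomy type) (NN (A))",
    "  25- 35  E11.5  ---     Wmemb1    Membership weight for Cluster number = 1",
]
columns = parse_description(description)

# Hypothetical record covering bytes 1-35 only, not a real catalog row.
sample = "Example object".ljust(16) + " " + " 2 (C)" + " " + "1.00000E-01"
record = parse_record(sample, columns)
print(columns)
print(record)
```

The same two functions should work for the other tables in this ReadMe, since their description rows follow the same pattern, though that has only been checked here against the rows shown.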
Acknowledgements:
Dong-Goo Roh, rrdong9(at)kasi.re.kr
(End) Patricia Vannier [CDS] 08-Jul-2022