J/A+A/612/A98 APOGEE full information on classes (Garcia-Dias+, 2018)
Machine learning in APOGEE: Unsupervised spectral classification with K-means.
Garcia-Dias R., Allende Prieto C., Sanchez Almeida J., Ordovas-Pascual I.
<Astron. Astrophys. 612, A98 (2018)>
=2018A&A...612A..98G (SIMBAD/NED BibCode)
ADC_Keywords: Spectra, infrared ; MK spectral classification ;
Stellar distribution
Keywords: methods: data analysis - methods: numerical - catalogues - surveys -
techniques: spectroscopic - Galaxy: stellar content
Abstract:
The volume of data generated by astronomical surveys is growing
rapidly. Traditional analysis techniques in spectroscopy either demand
intensive human interaction or are computationally expensive. In this
scenario, machine learning, and unsupervised clustering algorithms in
particular, offer interesting alternatives. The Apache Point
Observatory Galactic Evolution Experiment (APOGEE) offers a vast data
set of near-infrared stellar spectra, which is perfect for testing
such alternatives.
Our research applies an unsupervised classification scheme based on
K-means to the massive APOGEE data set. We explore whether the data
are amenable to classification into discrete classes.
We apply the K-means algorithm to 153,847 high resolution spectra
(R∼22,500). We discuss the main virtues and weaknesses of the
algorithm, as well as our choice of parameters.
We show that a classification based on normalised spectra captures the
variations in stellar atmospheric parameters, chemical abundances, and
rotational velocity, among other factors. The algorithm is able to
separate the bulge and halo populations, and distinguish dwarfs,
sub-giants, red clump (RC), and red giant branch (RGB) stars. However,
a discrete classification in flux space does not result in a neat
organisation in parameter space. Furthermore, the lack of obvious
groups in flux space causes
the results to be fairly sensitive to the initialisation, and disrupts
the efficiency of commonly-used methods to select the optimal number
of clusters. Our classification is publicly available, including
extensive online material associated with the APOGEE Data Release 12
(DR12).
Our description of the APOGEE database can help greatly with the
identification of specific types of targets for various applications.
We find a lack of obvious groups in flux space, and identify
limitations of the K-means algorithm in dealing with this kind of
data.
Description:
Data for the classes derived in the paper. The tables provide the star
labels, the mean spectra of the classes, and the within-class standard
deviations.
File Summary:
--------------------------------------------------------------------------------
FileName Lrecl Records Explanations
--------------------------------------------------------------------------------
ReadMe 80 . This file
tableb2.dat 21 153847 Star labels
tableb3.dat 505 8575 Mean spectra of the 50 classes
tableb4.dat 505 8575 Within-class standard deviation
--------------------------------------------------------------------------------
Byte-by-byte Description of file: tableb2.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 18 A18 --- APOGEE Star ID as defined in APOGEE DR12
20- 21 I2 --- Class ? K-means class
--------------------------------------------------------------------------------
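The byte ranges above are 1-indexed and inclusive, so they map onto Python
slices as [first-1:last]. The following is a minimal sketch of reading one
record of tableb2.dat under that convention. The names StarLabel and
parse_tableb2_line are hypothetical, and the example record is made up for
illustration; it is not taken from the catalogue.

```python
from typing import NamedTuple, Optional


class StarLabel(NamedTuple):
    apogee_id: str                # bytes 1-18, format A18
    kmeans_class: Optional[int]   # bytes 20-21, format I2, may be blank


def parse_tableb2_line(line: str) -> StarLabel:
    """Parse one fixed-width record of tableb2.dat (hypothetical helper)."""
    # Bytes 1-18 (1-indexed, inclusive) correspond to the slice [0:18].
    apogee_id = line[0:18].strip()
    # Bytes 20-21 correspond to the slice [19:21]. The '?' in the
    # description marks the field as possibly blank, so map blanks to None.
    raw_class = line[19:21].strip()
    return StarLabel(apogee_id, int(raw_class) if raw_class else None)


# Made-up record, for illustration only.
example = "2M12345678+1234567  7"
label = parse_tableb2_line(example)
print(label.apogee_id, label.kmeans_class)
```

Slicing by byte position, rather than splitting on whitespace, keeps the
parser correct when the class field is blank.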
Byte-by-byte Description of file: tableb3.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 8 F8.6 --- class00 Mean normalized flux of class 0
10- 17 F8.6 --- class01 Mean normalized flux of class 1
19- 26 F8.6 --- class02 Mean normalized flux of class 2
28- 35 F8.6 --- class03 Mean normalized flux of class 3
37- 44 F8.6 --- class04 Mean normalized flux of class 4
46- 53 F8.6 --- class05 Mean normalized flux of class 5
55- 62 F8.6 --- class06 Mean normalized flux of class 6
64- 71 F8.6 --- class07 Mean normalized flux of class 7
73- 80 F8.6 --- class08 Mean normalized flux of class 8
82- 89 F8.6 --- class09 Mean normalized flux of class 9
91- 98 F8.6 --- class10 Mean normalized flux of class 10
100-107 F8.6 --- class11 Mean normalized flux of class 11
109-116 F8.6 --- class12 Mean normalized flux of class 12
118-125 F8.6 --- class13 Mean normalized flux of class 13
127-134 F8.6 --- class14 Mean normalized flux of class 14
136-143 F8.6 --- class15 Mean normalized flux of class 15
145-152 F8.6 --- class16 Mean normalized flux of class 16
154-161 F8.6 --- class17 Mean normalized flux of class 17
163-170 F8.6 --- class18 Mean normalized flux of class 18
172-179 F8.6 --- class19 Mean normalized flux of class 19
181-188 F8.6 --- class20 Mean normalized flux of class 20
190-197 F8.6 --- class21 Mean normalized flux of class 21
199-206 F8.6 --- class22 Mean normalized flux of class 22
208-216 F9.6 --- class23 Mean normalized flux of class 23
218-225 F8.6 --- class24 Mean normalized flux of class 24
227-234 F8.6 --- class25 Mean normalized flux of class 25
236-243 F8.6 --- class26 Mean normalized flux of class 26
245-252 F8.6 --- class27 Mean normalized flux of class 27
254-261 F8.6 --- class28 Mean normalized flux of class 28
263-270 F8.6 --- class29 Mean normalized flux of class 29
272-280 F9.6 --- class30 Mean normalized flux of class 30
282-289 F8.6 --- class31 Mean normalized flux of class 31
291-300 F10.6 --- class32 Mean normalized flux of class 32
302-311 F10.6 --- class33 Mean normalized flux of class 33
313-320 F8.6 --- class34 Mean normalized flux of class 34
322-332 F11.6 --- class35 Mean normalized flux of class 35
334-341 F8.6 --- class36 Mean normalized flux of class 36
343-352 F10.6 --- class37 Mean normalized flux of class 37
354-362 F9.6 --- class38 Mean normalized flux of class 38
364-373 F10.6 --- class39 Mean normalized flux of class 39
375-385 F11.6 --- class40 Mean normalized flux of class 40
387-396 F10.6 --- class41 Mean normalized flux of class 41
398-405 F8.6 --- class42 Mean normalized flux of class 42
407-418 F12.6 --- class43 Mean normalized flux of class 43
420-430 F11.6 --- class44 Mean normalized flux of class 44
432-442 F11.6 --- class45 Mean normalized flux of class 45
444-455 F12.6 --- class46 Mean normalized flux of class 46
457-466 F10.6 --- class47 Mean normalized flux of class 47
468-479 F12.6 --- class48 Mean normalized flux of class 48
481-492 F12.6 --- class49 Mean normalized flux of class 49
494-503 F10.4 0.1nm Wavelength Wavelength in vacuum
505 I1 --- Mask [0/1] 1 if the pixel is used in K-means,
0 otherwise.
--------------------------------------------------------------------------------
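Because the class columns have different widths (F8.6 up to F12.6), a reader
has to use the listed byte ranges rather than a single fixed stride. The
sketch below copies those ranges from the description above and parses one
record into 50 fluxes, a wavelength, and a mask flag. The function names are
hypothetical, and make_synthetic_row only builds a made-up record to
demonstrate the parser; it does not reproduce catalogue values.

```python
from typing import List, Sequence, Tuple

# (first byte, last byte), 1-indexed and inclusive, for class00 ... class49,
# copied from the byte-by-byte description of tableb3.dat.
CLASS_BYTES: List[Tuple[int, int]] = [
    (1, 8), (10, 17), (19, 26), (28, 35), (37, 44),
    (46, 53), (55, 62), (64, 71), (73, 80), (82, 89),
    (91, 98), (100, 107), (109, 116), (118, 125), (127, 134),
    (136, 143), (145, 152), (154, 161), (163, 170), (172, 179),
    (181, 188), (190, 197), (199, 206), (208, 216), (218, 225),
    (227, 234), (236, 243), (245, 252), (254, 261), (263, 270),
    (272, 280), (282, 289), (291, 300), (302, 311), (313, 320),
    (322, 332), (334, 341), (343, 352), (354, 362), (364, 373),
    (375, 385), (387, 396), (398, 405), (407, 418), (420, 430),
    (432, 442), (444, 455), (457, 466), (468, 479), (481, 492),
]
WAVELENGTH_BYTES = (494, 503)  # F10.4, unit 0.1nm (i.e. Angstrom)
MASK_BYTES = (505, 505)        # I1, 0 or 1


def field(line: str, span: Tuple[int, int]) -> str:
    """Return the text in a 1-indexed, inclusive byte range."""
    first, last = span
    return line[first - 1:last]


def parse_spectrum_row(line: str) -> Tuple[List[float], float, int]:
    """Parse one record: 50 class fluxes, wavelength, mask flag."""
    fluxes = [float(field(line, span)) for span in CLASS_BYTES]
    wavelength = float(field(line, WAVELENGTH_BYTES))
    mask = int(field(line, MASK_BYTES))
    return fluxes, wavelength, mask


def make_synthetic_row(fluxes: Sequence[float], wavelength: float,
                       mask: int) -> str:
    """Build a made-up 505-byte record, only to demonstrate the parser."""
    chars = [" "] * 505
    spans = CLASS_BYTES + [WAVELENGTH_BYTES]
    values = list(fluxes) + [wavelength]
    decimals = [6] * len(CLASS_BYTES) + [4]
    for (first, last), value, ndec in zip(spans, values, decimals):
        width = last - first + 1
        chars[first - 1:last] = f"{value:{width}.{ndec}f}"
    chars[MASK_BYTES[0] - 1] = str(mask)
    return "".join(chars)


# Illustrative values only.
row = make_synthetic_row([1.0 - 0.001 * k for k in range(50)], 15100.1234, 1)
fluxes, wavelength, mask = parse_spectrum_row(row)
print(len(fluxes), wavelength, mask)
```

Each record is one pixel, so collecting column k over all 8575 records gives
the mean spectrum of class k. The byte ranges listed for tableb4.dat are the
same, so the same parser should apply there.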
Byte-by-byte Description of file: tableb4.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 8 F8.6 --- e_class00 Standard deviation of normalised flux
in class 0
10- 17 F8.6 --- e_class01 Standard deviation of normalised flux
in class 1
19- 26 F8.6 --- e_class02 Standard deviation of normalised flux
in class 2
28- 35 F8.6 --- e_class03 Standard deviation of normalised flux
in class 3
37- 44 F8.6 --- e_class04 Standard deviation of normalised flux
in class 4
46- 53 F8.6 --- e_class05 Standard deviation of normalised flux
in class 5
55- 62 F8.6 --- e_class06 Standard deviation of normalised flux
in class 6
64- 71 F8.6 --- e_class07 Standard deviation of normalised flux
in class 7
73- 80 F8.6 --- e_class08 Standard deviation of normalised flux
in class 8
82- 89 F8.6 --- e_class09 Standard deviation of normalised flux
in class 9
91- 98 F8.6 --- e_class10 Standard deviation of normalised flux
in class 10
100-107 F8.6 --- e_class11 Standard deviation of normalised flux
in class 11
109-116 F8.6 --- e_class12 Standard deviation of normalised flux
in class 12
118-125 F8.6 --- e_class13 Standard deviation of normalised flux
in class 13
127-134 F8.6 --- e_class14 Standard deviation of normalised flux
in class 14
136-143 F8.6 --- e_class15 Standard deviation of normalised flux
in class 15
145-152 F8.6 --- e_class16 Standard deviation of normalised flux
in class 16
154-161 F8.6 --- e_class17 Standard deviation of normalised flux
in class 17
163-170 F8.6 --- e_class18 Standard deviation of normalised flux
in class 18
172-179 F8.6 --- e_class19 Standard deviation of normalised flux
in class 19
181-188 F8.6 --- e_class20 Standard deviation of normalised flux
in class 20
190-197 F8.6 --- e_class21 Standard deviation of normalised flux
in class 21
199-206 F8.6 --- e_class22 Standard deviation of normalised flux
in class 22
208-216 F9.6 --- e_class23 Standard deviation of normalised flux
in class 23
218-225 F8.6 --- e_class24 Standard deviation of normalised flux
in class 24
227-234 F8.6 --- e_class25 Standard deviation of normalised flux
in class 25
236-243 F8.6 --- e_class26 Standard deviation of normalised flux
in class 26
245-252 F8.6 --- e_class27 Standard deviation of normalised flux
in class 27
254-261 F8.6 --- e_class28 Standard deviation of normalised flux
in class 28
263-270 F8.6 --- e_class29 Standard deviation of normalised flux
in class 29
272-280 F9.6 --- e_class30 Standard deviation of normalised flux
in class 30
282-289 F8.6 --- e_class31 Standard deviation of normalised flux
in class 31
291-300 F10.6 --- e_class32 Standard deviation of normalised flux
in class 32
302-311 F10.6 --- e_class33 Standard deviation of normalised flux
in class 33
313-320 F8.6 --- e_class34 Standard deviation of normalised flux
in class 34
322-332 F11.6 --- e_class35 Standard deviation of normalised flux
in class 35
334-341 F8.6 --- e_class36 Standard deviation of normalised flux
in class 36
343-352 F10.6 --- e_class37 Standard deviation of normalised flux
in class 37
354-362 F9.6 --- e_class38 Standard deviation of normalised flux
in class 38
364-373 F10.6 --- e_class39 Standard deviation of normalised flux
in class 39
375-385 F11.6 --- e_class40 Standard deviation of normalised flux
in class 40
387-396 F10.6 --- e_class41 Standard deviation of normalised flux
in class 41
398-405 F8.6 --- e_class42 Standard deviation of normalised flux
in class 42
407-418 F12.6 --- e_class43 Standard deviation of normalised flux
in class 43
420-430 F11.6 --- e_class44 Standard deviation of normalised flux
in class 44
432-442 F11.6 --- e_class45 Standard deviation of normalised flux
in class 45
444-455 F12.6 --- e_class46 Standard deviation of normalised flux
in class 46
457-466 F10.6 --- e_class47 Standard deviation of normalised flux
in class 47
468-479 F12.6 --- e_class48 Standard deviation of normalised flux
in class 48
481-492 F12.6 --- e_class49 Standard deviation of normalised flux
in class 49
494-503 F10.4 0.1nm Wavelength Wavelength in vacuum
505 I1 --- Mask [0/1] 1 if the pixel is used in K-means,
0 otherwise.
--------------------------------------------------------------------------------
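One possible use of the two spectral tables together is to compare a
normalised spectrum with the class means, restricted to pixels with Mask = 1.
The sketch below is an illustration under stated assumptions, not the
authors' pipeline: it assumes the spectrum is sampled on the same pixels as
the tables, and that each class has been transposed into one list per class
(the files store one pixel per record). The function names and the toy
numbers are hypothetical. Nearest-mean assignment by squared Euclidean
distance is the standard K-means assignment step; the RMS deviation in units
of the within-class standard deviation is an extra diagnostic added here.

```python
import math
from typing import List, Sequence


def masked_sq_distance(spectrum: Sequence[float],
                       class_mean: Sequence[float],
                       mask: Sequence[int]) -> float:
    """Squared Euclidean distance over pixels flagged with mask == 1."""
    return sum((f - m) ** 2
               for f, m, use in zip(spectrum, class_mean, mask) if use == 1)


def nearest_class(spectrum: Sequence[float],
                  class_means: Sequence[Sequence[float]],
                  mask: Sequence[int]) -> int:
    """Index of the class whose mean spectrum is closest to `spectrum`."""
    distances = [masked_sq_distance(spectrum, mean, mask)
                 for mean in class_means]
    return min(range(len(distances)), key=distances.__getitem__)


def rms_sigma_deviation(spectrum: Sequence[float],
                        class_mean: Sequence[float],
                        class_std: Sequence[float],
                        mask: Sequence[int]) -> float:
    """RMS of (flux - mean) / std over used pixels with non-zero std."""
    z2: List[float] = [((f - m) / s) ** 2
                       for f, m, s, use
                       in zip(spectrum, class_mean, class_std, mask)
                       if use == 1 and s > 0]
    if not z2:
        raise ValueError("no usable pixels")
    return math.sqrt(sum(z2) / len(z2))


# Toy data with two classes and four pixels, for illustration only.
means = [[1.0, 0.9, 0.8, 1.0], [1.0, 0.5, 0.6, 1.0]]
stds = [[0.01, 0.02, 0.02, 0.0], [0.01, 0.05, 0.05, 0.0]]
mask = [1, 1, 1, 0]                 # the last pixel is ignored
spectrum = [0.99, 0.88, 0.82, 5.0]

k = nearest_class(spectrum, means, mask)
rms = rms_sigma_deviation(spectrum, means[k], stds[k], mask)
print(k, round(rms, 3))
```

Pixels with Mask = 0 are skipped in both functions, so an outlying value in
an unused pixel (the 5.0 above) does not affect the result.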
Acknowledgements:
Rafael Garcia-Dias, rafaelagd(at)gmail.com
(End) Rafael Garcia-Dias Patricia Vannier [CDS] 07-Feb-2018