J/MNRAS/476/2117 Outliers and similarity in APOGEE (Reis+, 2018)
Detecting outliers and learning complex structures with large spectroscopic
surveys - a case study with APOGEE stars.
Reis I., Poznanski D., Baron D., Zasowski G., Shahaf S.
<Mon. Not. R. Astron. Soc. 476, 2117 (2018)>
=2018MNRAS.476.2117R (SIMBAD/NED BibCode)
ADC_Keywords: Spectroscopy ; Stars, normal
Keywords: methods: data analysis - stars: general - stars: peculiar
Abstract:
In this work we apply and expand on a recently introduced outlier
detection algorithm that is based on an unsupervised random forest. We
use the algorithm to calculate a similarity measure for stellar
spectra from the Apache Point Observatory Galactic Evolution
Experiment (APOGEE). We show that the similarity measure traces
non-trivial physical properties and contains information about complex
structures in the data. We use it for visualization and clustering of
the dataset, and discuss its ability to find groups of highly similar
objects, including spectroscopic twins. Using the similarity matrix to
search the dataset for objects allows us to find objects that are
impossible to find using their best fitting model parameters. This
includes extreme objects for which the models fail, and rare objects
that are outside the scope of the model. We use the similarity measure
to detect outliers in the dataset, and find a number of previously
unknown Be-type stars, spectroscopic binaries, carbon rich stars,
young stars, and a few that we cannot interpret. Our work further
demonstrates the potential for scientific discovery when combining
machine learning methods with modern survey data.
Description:
t-SNE is a dimensionality reduction algorithm that is particularly
well suited for the visualization of high-dimensional datasets. We use
t-SNE to visualize our distance matrix.
A priori, these distances could define a space with almost as many
dimensions as objects, i.e., tens of thousands of dimensions.
Obviously, since many stars are quite similar, and their spectra are
defined by a few physical parameters, the minimal spanning space might
be smaller. By using t-SNE we can examine the structure of our sample
projected into 2D. We use our distance matrix as input to the t-SNE
algorithm and in return get a 2D map of the objects in our dataset.
For each star in a sample of 183232 APOGEE stars, we provide the
APOGEE IDs of the 99 stars with the most similar spectra (according to
the method described in the paper), ordered by similarity.
File Summary:
--------------------------------------------------------------------------------
FileName Lrecl Records Explanations
--------------------------------------------------------------------------------
ReadMe 80 . This file
apogeenn.dat 1899 183232 Nearest neighbors APOGEE IDs
distance.dat 1602 183232 Distances to nearest neighbors
tsnecoor.dat 56 193556 t-SNE coordinates (map in paper)
--------------------------------------------------------------------------------
See also:
J/AJ/146/133 : SDSS-III APOGEE DR10 stellar parameters (Meszaros+, 2013)
J/ApJ/794/125 : IN-SYNC. I. APOGEE stellar parameters (Cottaar+, 2014)
J/A+A/589/A80 : APOGEE strings (Hacar+, 2016)
J/A+A/594/A43 : APOGEE/Kepler sample stars abundances (Hawkins+, 2016)
J/MNRAS/460/3179 : APOGEE stars distance and extinction (Wang+, 2016)
Byte-by-byte Description of file: apogeenn.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 18 A18 --- Target Target name
20- 37 A18 --- NN1 1st nearest neighbor of Target object
39- 56 A18 --- NN2 2nd nearest neighbor of Target object
58- 75 A18 --- NN3 3rd nearest neighbor of Target object
77- 94 A18 --- NN4 4th nearest neighbor of Target object
96- 113 A18 --- NN5 5th nearest neighbor of Target object
115- 132 A18 --- NN6 6th nearest neighbor of Target object
134- 151 A18 --- NN7 7th nearest neighbor of Target object
153- 170 A18 --- NN8 8th nearest neighbor of Target object
172- 189 A18 --- NN9 9th nearest neighbor of Target object
191- 208 A18 --- NN10 10th nearest neighbor of Target object
210- 227 A18 --- NN11 11th nearest neighbor of Target object
229- 246 A18 --- NN12 12th nearest neighbor of Target object
248- 265 A18 --- NN13 13th nearest neighbor of Target object
267- 284 A18 --- NN14 14th nearest neighbor of Target object
286- 303 A18 --- NN15 15th nearest neighbor of Target object
305- 322 A18 --- NN16 16th nearest neighbor of Target object
324- 341 A18 --- NN17 17th nearest neighbor of Target object
343- 360 A18 --- NN18 18th nearest neighbor of Target object
362- 379 A18 --- NN19 19th nearest neighbor of Target object
381- 398 A18 --- NN20 20th nearest neighbor of Target object
400- 417 A18 --- NN21 21st nearest neighbor of Target object
419- 436 A18 --- NN22 22nd nearest neighbor of Target object
438- 455 A18 --- NN23 23rd nearest neighbor of Target object
457- 474 A18 --- NN24 24th nearest neighbor of Target object
476- 493 A18 --- NN25 25th nearest neighbor of Target object
495- 512 A18 --- NN26 26th nearest neighbor of Target object
514- 531 A18 --- NN27 27th nearest neighbor of Target object
533- 550 A18 --- NN28 28th nearest neighbor of Target object
552- 569 A18 --- NN29 29th nearest neighbor of Target object
571- 588 A18 --- NN30 30th nearest neighbor of Target object
590- 607 A18 --- NN31 31st nearest neighbor of Target object
609- 626 A18 --- NN32 32nd nearest neighbor of Target object
628- 645 A18 --- NN33 33rd nearest neighbor of Target object
647- 664 A18 --- NN34 34th nearest neighbor of Target object
666- 683 A18 --- NN35 35th nearest neighbor of Target object
685- 702 A18 --- NN36 36th nearest neighbor of Target object
704- 721 A18 --- NN37 37th nearest neighbor of Target object
723- 740 A18 --- NN38 38th nearest neighbor of Target object
742- 759 A18 --- NN39 39th nearest neighbor of Target object
761- 778 A18 --- NN40 40th nearest neighbor of Target object
780- 797 A18 --- NN41 41st nearest neighbor of Target object
799- 816 A18 --- NN42 42nd nearest neighbor of Target object
818- 835 A18 --- NN43 43rd nearest neighbor of Target object
837- 854 A18 --- NN44 44th nearest neighbor of Target object
856- 873 A18 --- NN45 45th nearest neighbor of Target object
875- 892 A18 --- NN46 46th nearest neighbor of Target object
894- 911 A18 --- NN47 47th nearest neighbor of Target object
913- 930 A18 --- NN48 48th nearest neighbor of Target object
932- 949 A18 --- NN49 49th nearest neighbor of Target object
951- 968 A18 --- NN50 50th nearest neighbor of Target object
970- 987 A18 --- NN51 51st nearest neighbor of Target object
989-1006 A18 --- NN52 52nd nearest neighbor of Target object
1008-1025 A18 --- NN53 53rd nearest neighbor of Target object
1027-1044 A18 --- NN54 54th nearest neighbor of Target object
1046-1063 A18 --- NN55 55th nearest neighbor of Target object
1065-1082 A18 --- NN56 56th nearest neighbor of Target object
1084-1101 A18 --- NN57 57th nearest neighbor of Target object
1103-1120 A18 --- NN58 58th nearest neighbor of Target object
1122-1139 A18 --- NN59 59th nearest neighbor of Target object
1141-1158 A18 --- NN60 60th nearest neighbor of Target object
1160-1177 A18 --- NN61 61st nearest neighbor of Target object
1179-1196 A18 --- NN62 62nd nearest neighbor of Target object
1198-1215 A18 --- NN63 63rd nearest neighbor of Target object
1217-1234 A18 --- NN64 64th nearest neighbor of Target object
1236-1253 A18 --- NN65 65th nearest neighbor of Target object
1255-1272 A18 --- NN66 66th nearest neighbor of Target object
1274-1291 A18 --- NN67 67th nearest neighbor of Target object
1293-1310 A18 --- NN68 68th nearest neighbor of Target object
1312-1329 A18 --- NN69 69th nearest neighbor of Target object
1331-1348 A18 --- NN70 70th nearest neighbor of Target object
1350-1367 A18 --- NN71 71st nearest neighbor of Target object
1369-1386 A18 --- NN72 72nd nearest neighbor of Target object
1388-1405 A18 --- NN73 73rd nearest neighbor of Target object
1407-1424 A18 --- NN74 74th nearest neighbor of Target object
1426-1443 A18 --- NN75 75th nearest neighbor of Target object
1445-1462 A18 --- NN76 76th nearest neighbor of Target object
1464-1481 A18 --- NN77 77th nearest neighbor of Target object
1483-1500 A18 --- NN78 78th nearest neighbor of Target object
1502-1519 A18 --- NN79 79th nearest neighbor of Target object
1521-1538 A18 --- NN80 80th nearest neighbor of Target object
1540-1557 A18 --- NN81 81st nearest neighbor of Target object
1559-1576 A18 --- NN82 82nd nearest neighbor of Target object
1578-1595 A18 --- NN83 83rd nearest neighbor of Target object
1597-1614 A18 --- NN84 84th nearest neighbor of Target object
1616-1633 A18 --- NN85 85th nearest neighbor of Target object
1635-1652 A18 --- NN86 86th nearest neighbor of Target object
1654-1671 A18 --- NN87 87th nearest neighbor of Target object
1673-1690 A18 --- NN88 88th nearest neighbor of Target object
1692-1709 A18 --- NN89 89th nearest neighbor of Target object
1711-1728 A18 --- NN90 90th nearest neighbor of Target object
1730-1747 A18 --- NN91 91st nearest neighbor of Target object
1749-1766 A18 --- NN92 92nd nearest neighbor of Target object
1768-1785 A18 --- NN93 93rd nearest neighbor of Target object
1787-1804 A18 --- NN94 94th nearest neighbor of Target object
1806-1823 A18 --- NN95 95th nearest neighbor of Target object
1825-1842 A18 --- NN96 96th nearest neighbor of Target object
1844-1861 A18 --- NN97 97th nearest neighbor of Target object
1863-1880 A18 --- NN98 98th nearest neighbor of Target object
1882-1899 A18 --- NN99 99th nearest neighbor of Target object
--------------------------------------------------------------------------------
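The layout above is regular: each of the 100 fields is 18 characters
wide and is followed by one separator byte, so field k (k = 0 for
Target, k = 1..99 for NNk) starts at byte 19*k + 1. A minimal Python
sketch of reading one record under that reading of the table follows.
The function name and the sample record are hypothetical and are not
part of the catalog.

```python
# Sketch only: offsets come from the byte-by-byte table for apogeenn.dat;
# parse_neighbors and the synthetic record below are illustrative assumptions.
FIELD_WIDTH = 18    # every field has format A18
FIELD_STRIDE = 19   # 18 characters plus one separator byte
N_NEIGHBORS = 99


def parse_neighbors(line):
    """Return (target, [NN1, ..., NN99]) from one apogeenn.dat record."""
    fields = [
        line[k * FIELD_STRIDE : k * FIELD_STRIDE + FIELD_WIDTH].strip()
        for k in range(N_NEIGHBORS + 1)
    ]
    return fields[0], fields[1:]


# Synthetic record built from made-up 18-character IDs (not catalog data).
ids = ["2M%08d+0000000" % i for i in range(N_NEIGHBORS + 1)]
record = " ".join(ids)

target, neighbors = parse_neighbors(record)
```

The synthetic record has the same length as the Lrecl listed in the
File Summary (1899), which is a quick consistency check on the offsets.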
Byte-by-byte Description of file: distance.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 18 A18 --- Target Target name
20- 34 F15.13 --- Dist1 Distance matrix to 1st nearest neighbor
of Target
36- 50 F15.13 --- Dist2 Distance matrix to 2nd nearest neighbor
of Target
52- 66 F15.13 --- Dist3 Distance matrix to 3rd nearest neighbor
of Target
68- 82 F15.13 --- Dist4 Distance matrix to 4th nearest neighbor
of Target
84- 98 F15.13 --- Dist5 Distance matrix to 5th nearest neighbor
of Target
100- 114 F15.13 --- Dist6 Distance matrix to 6th nearest neighbor
of Target
116- 130 F15.13 --- Dist7 Distance matrix to 7th nearest neighbor
of Target
132- 146 F15.13 --- Dist8 Distance matrix to 8th nearest neighbor
of Target
148- 162 F15.13 --- Dist9 Distance matrix to 9th nearest neighbor
of Target
164- 178 F15.13 --- Dist10 Distance matrix to 10th nearest neighbor
of Target
180- 194 F15.13 --- Dist11 Distance matrix to 11th nearest neighbor
of Target
196- 210 F15.13 --- Dist12 Distance matrix to 12th nearest neighbor
of Target
212- 226 F15.13 --- Dist13 Distance matrix to 13th nearest neighbor
of Target
228- 242 F15.13 --- Dist14 Distance matrix to 14th nearest neighbor
of Target
244- 258 F15.13 --- Dist15 Distance matrix to 15th nearest neighbor
of Target
260- 274 F15.13 --- Dist16 Distance matrix to 16th nearest neighbor
of Target
276- 290 F15.13 --- Dist17 Distance matrix to 17th nearest neighbor
of Target
292- 306 F15.13 --- Dist18 Distance matrix to 18th nearest neighbor
of Target
308- 322 F15.13 --- Dist19 Distance matrix to 19th nearest neighbor
of Target
324- 338 F15.13 --- Dist20 Distance matrix to 20th nearest neighbor
of Target
340- 354 F15.13 --- Dist21 Distance matrix to 21st nearest neighbor
of Target
356- 370 F15.13 --- Dist22 Distance matrix to 22nd nearest neighbor
of Target
372- 386 F15.13 --- Dist23 Distance matrix to 23rd nearest neighbor
of Target
388- 402 F15.13 --- Dist24 Distance matrix to 24th nearest neighbor
of Target
404- 418 F15.13 --- Dist25 Distance matrix to 25th nearest neighbor
of Target
420- 434 F15.13 --- Dist26 Distance matrix to 26th nearest neighbor
of Target
436- 450 F15.13 --- Dist27 Distance matrix to 27th nearest neighbor
of Target
452- 466 F15.13 --- Dist28 Distance matrix to 28th nearest neighbor
of Target
468- 482 F15.13 --- Dist29 Distance matrix to 29th nearest neighbor
of Target
484- 498 F15.13 --- Dist30 Distance matrix to 30th nearest neighbor
of Target
500- 514 F15.13 --- Dist31 Distance matrix to 31st nearest neighbor
of Target
516- 530 F15.13 --- Dist32 Distance matrix to 32nd nearest neighbor
of Target
532- 546 F15.13 --- Dist33 Distance matrix to 33rd nearest neighbor
of Target
548- 562 F15.13 --- Dist34 Distance matrix to 34th nearest neighbor
of Target
564- 578 F15.13 --- Dist35 Distance matrix to 35th nearest neighbor
of Target
580- 594 F15.13 --- Dist36 Distance matrix to 36th nearest neighbor
of Target
596- 610 F15.13 --- Dist37 Distance matrix to 37th nearest neighbor
of Target
612- 626 F15.13 --- Dist38 Distance matrix to 38th nearest neighbor
of Target
628- 642 F15.13 --- Dist39 Distance matrix to 39th nearest neighbor
of Target
644- 658 F15.13 --- Dist40 Distance matrix to 40th nearest neighbor
of Target
660- 674 F15.13 --- Dist41 Distance matrix to 41st nearest neighbor
of Target
676- 690 F15.13 --- Dist42 Distance matrix to 42nd nearest neighbor
of Target
692- 706 F15.13 --- Dist43 Distance matrix to 43rd nearest neighbor
of Target
708- 722 F15.13 --- Dist44 Distance matrix to 44th nearest neighbor
of Target
724- 738 F15.13 --- Dist45 Distance matrix to 45th nearest neighbor
of Target
740- 754 F15.13 --- Dist46 Distance matrix to 46th nearest neighbor
of Target
756- 770 F15.13 --- Dist47 Distance matrix to 47th nearest neighbor
of Target
772- 786 F15.13 --- Dist48 Distance matrix to 48th nearest neighbor
of Target
788- 802 F15.13 --- Dist49 Distance matrix to 49th nearest neighbor
of Target
804- 818 F15.13 --- Dist50 Distance matrix to 50th nearest neighbor
of Target
820- 834 F15.13 --- Dist51 Distance matrix to 51st nearest neighbor
of Target
836- 850 F15.13 --- Dist52 Distance matrix to 52nd nearest neighbor
of Target
852- 866 F15.13 --- Dist53 Distance matrix to 53rd nearest neighbor
of Target
868- 882 F15.13 --- Dist54 Distance matrix to 54th nearest neighbor
of Target
884- 898 F15.13 --- Dist55 Distance matrix to 55th nearest neighbor
of Target
900- 914 F15.13 --- Dist56 Distance matrix to 56th nearest neighbor
of Target
916- 930 F15.13 --- Dist57 Distance matrix to 57th nearest neighbor
of Target
932- 946 F15.13 --- Dist58 Distance matrix to 58th nearest neighbor
of Target
948- 962 F15.13 --- Dist59 Distance matrix to 59th nearest neighbor
of Target
964- 978 F15.13 --- Dist60 Distance matrix to 60th nearest neighbor
of Target
980- 994 F15.13 --- Dist61 Distance matrix to 61st nearest neighbor
of Target
996-1010 F15.13 --- Dist62 Distance matrix to 62nd nearest neighbor
of Target
1012-1026 F15.13 --- Dist63 Distance matrix to 63rd nearest neighbor
of Target
1028-1042 F15.13 --- Dist64 Distance matrix to 64th nearest neighbor
of Target
1044-1058 F15.13 --- Dist65 Distance matrix to 65th nearest neighbor
of Target
1060-1074 F15.13 --- Dist66 Distance matrix to 66th nearest neighbor
of Target
1076-1090 F15.13 --- Dist67 Distance matrix to 67th nearest neighbor
of Target
1092-1106 F15.13 --- Dist68 Distance matrix to 68th nearest neighbor
of Target
1108-1122 F15.13 --- Dist69 Distance matrix to 69th nearest neighbor
of Target
1124-1138 F15.13 --- Dist70 Distance matrix to 70th nearest neighbor
of Target
1140-1154 F15.13 --- Dist71 Distance matrix to 71st nearest neighbor
of Target
1156-1170 F15.13 --- Dist72 Distance matrix to 72nd nearest neighbor
of Target
1172-1186 F15.13 --- Dist73 Distance matrix to 73rd nearest neighbor
of Target
1188-1202 F15.13 --- Dist74 Distance matrix to 74th nearest neighbor
of Target
1204-1218 F15.13 --- Dist75 Distance matrix to 75th nearest neighbor
of Target
1220-1234 F15.13 --- Dist76 Distance matrix to 76th nearest neighbor
of Target
1236-1250 F15.13 --- Dist77 Distance matrix to 77th nearest neighbor
of Target
1252-1266 F15.13 --- Dist78 Distance matrix to 78th nearest neighbor
of Target
1268-1282 F15.13 --- Dist79 Distance matrix to 79th nearest neighbor
of Target
1284-1298 F15.13 --- Dist80 Distance matrix to 80th nearest neighbor
of Target
1300-1314 F15.13 --- Dist81 Distance matrix to 81st nearest neighbor
of Target
1316-1330 F15.13 --- Dist82 Distance matrix to 82nd nearest neighbor
of Target
1332-1346 F15.13 --- Dist83 Distance matrix to 83rd nearest neighbor
of Target
1348-1362 F15.13 --- Dist84 Distance matrix to 84th nearest neighbor
of Target
1364-1378 F15.13 --- Dist85 Distance matrix to 85th nearest neighbor
of Target
1380-1394 F15.13 --- Dist86 Distance matrix to 86th nearest neighbor
of Target
1396-1410 F15.13 --- Dist87 Distance matrix to 87th nearest neighbor
of Target
1412-1426 F15.13 --- Dist88 Distance matrix to 88th nearest neighbor
of Target
1428-1442 F15.13 --- Dist89 Distance matrix to 89th nearest neighbor
of Target
1444-1458 F15.13 --- Dist90 Distance matrix to 90th nearest neighbor
of Target
1460-1474 F15.13 --- Dist91 Distance matrix to 91st nearest neighbor
of Target
1476-1490 F15.13 --- Dist92 Distance matrix to 92nd nearest neighbor
of Target
1492-1506 F15.13 --- Dist93 Distance matrix to 93rd nearest neighbor
of Target
1508-1522 F15.13 --- Dist94 Distance matrix to 94th nearest neighbor
of Target
1524-1538 F15.13 --- Dist95 Distance matrix to 95th nearest neighbor
of Target
1540-1554 F15.13 --- Dist96 Distance matrix to 96th nearest neighbor
of Target
1556-1570 F15.13 --- Dist97 Distance matrix to 97th nearest neighbor
of Target
1572-1586 F15.13 --- Dist98 Distance matrix to 98th nearest neighbor
of Target
1588-1602 F15.13 --- Dist99 Distance matrix to 99th nearest neighbor
of Target
--------------------------------------------------------------------------------
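In distance.dat the Target occupies bytes 1-18 and each distance is a
15-character field followed by one separator byte, so Dist n starts at
byte 20 + 16*(n-1). The Python sketch below reads one record on that
assumption. The function name, the made-up target ID and the synthetic
distance values are illustrative only and do not come from the catalog.

```python
# Sketch only: offsets come from the byte-by-byte table for distance.dat;
# parse_distances and the synthetic record are illustrative assumptions.
TARGET_WIDTH = 18
DIST_START = 19     # 0-based offset of Dist1 (byte 20 in the table)
DIST_WIDTH = 15     # format F15.13
DIST_STRIDE = 16    # 15 characters plus one separator byte
N_NEIGHBORS = 99


def parse_distances(line):
    """Return (target, [Dist1, ..., Dist99]) from one distance.dat record."""
    target = line[:TARGET_WIDTH].strip()
    distances = []
    for n in range(N_NEIGHBORS):
        start = DIST_START + n * DIST_STRIDE
        distances.append(float(line[start : start + DIST_WIDTH]))
    return target, distances


# Synthetic record: a made-up ID and evenly spaced placeholder distances.
values = [n / 100.0 for n in range(1, N_NEIGHBORS + 1)]
record = "2M00000000+0000000 " + " ".join("%15.13f" % v for v in values)

target, distances = parse_distances(record)
```

Since the rows of distance.dat and apogeenn.dat describe the same
targets, Dist n here would be paired with NN n from the neighbor file
when both are read for the same Target.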
Byte-by-byte Description of file: tsnecoor.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 18 A18 --- Target Target name
20- 37 E18.15 --- t-SNE-X t-SNE map X coordinate
39- 56 E18.15 --- t-SNE-Y t-SNE map Y coordinate
--------------------------------------------------------------------------------
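A record of tsnecoor.dat can be read with three fixed slices taken
from the table above. The Python sketch below assumes that the two
E18.15 fields hold ordinary exponent-notation numbers that float() can
parse. The function name and the sample values are hypothetical.

```python
# Sketch only: offsets come from the byte-by-byte table for tsnecoor.dat;
# parse_tsne and the synthetic record are illustrative assumptions.
def parse_tsne(line):
    """Return (target, x, y) from one tsnecoor.dat record."""
    target = line[0:18].strip()   # bytes  1-18
    x = float(line[19:37])        # bytes 20-37, t-SNE-X
    y = float(line[38:56])        # bytes 39-56, t-SNE-Y
    return target, x, y


# Synthetic record with a made-up ID and placeholder coordinates.
record = "2M00000000+0000000 %18.10e %18.10e" % (12.5, -3.25)

target, x, y = parse_tsne(record)
```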
Acknowledgements:
Itamar Reis, itamarreis(at)mail.tau.ac.il
(End) Itamar Reis [Tel-Aviv Uni.], Patricia Vannier [CDS] 28-Dec-2017