J/A+A/522/A88 Photometric identification of BHB stars (Smith+, 2010)
Photometric identification of blue horizontal branch stars.
Smith K.W., Bailer-Jones C.A.L., Klement R.J., Xue X.X.
<Astron. Astrophys. 522, A88 (2010)>
=2010A&A...522A..88S
ADC_Keywords: Milky Way ; Stars, horizontal branch ; Photometry, SDSS
Keywords: methods: statistical - stars: horizontal-branch - Galaxy: structure
Abstract:
We investigate the performance of some common machine learning
techniques in identifying Blue Horizontal Branch (BHB) stars from
photometric data. To train the machine learning algorithms, we use
previously published spectroscopic identifications of BHB stars from
Sloan Digital Sky Survey (SDSS) data. We investigate the performance
of three different techniques, namely k nearest neighbour
classification, kernel density estimation for discriminant analysis
and a support vector machine (SVM). We discuss the performance of the
methods in terms of both completeness (what fraction of input BHB
stars are successfully returned as BHB stars) and contamination (what
fraction of contaminating sources end up in the output BHB sample). We
discuss the prospect of trading off these values, achieving lower
contamination at the expense of lower completeness, by adjusting
probability thresholds for the classification. We also discuss the
role of prior probabilities in the classification performance, and we
assess via simulations the reliability of the dataset used for
training. Overall, using no prior appears to give the best completeness,
but adopting a prior lowers the contamination. We find that the
support vector machine generally delivers the lowest contamination for
a given level of completeness, and so is our method of choice.
Finally, we classify a large sample of SDSS Data Release 7 (DR7)
photometry using the SVM trained on the spectroscopic sample. We
identify 27,074 probable BHB stars out of a sample of 294,652 stars.
We derive photometric parallaxes and demonstrate that our results are
reasonable by comparing to known distances for a selection of globular
clusters. We attach our classifications, including probabilities, as
an electronic table, so that they can be used either directly as a BHB
star catalogue, or as priors to a spectroscopic or other
classification method. We also provide our final models so that they
can be directly applied to new data.
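
The two figures of merit defined in the abstract, and the effect of moving the
probability threshold, can be illustrated with a minimal sketch. It follows the
literal wording above (contamination as the fraction of contaminating sources
that end up in the output sample); the paper's formal definition may normalise
differently, for example by the size of the output sample. The function name
and the toy inputs are hypothetical and are not taken from the catalogue.

```python
def completeness_and_contamination(probs, is_bhb, threshold=0.5):
    """Return (completeness, contamination) for a given probability cut.

    probs     : classifier probabilities of being a BHB star
    is_bhb    : reference labels (True for a spectroscopic BHB star)
    threshold : objects with prob >= threshold form the output BHB sample

    Assumption: contamination is normalised by the number of non-BHB
    input sources, following the literal wording of the abstract.
    """
    selected = [p >= threshold for p in probs]
    n_bhb = sum(1 for t in is_bhb if t)
    n_other = len(is_bhb) - n_bhb
    true_pos = sum(1 for s, t in zip(selected, is_bhb) if s and t)
    false_pos = sum(1 for s, t in zip(selected, is_bhb) if s and not t)
    completeness = true_pos / n_bhb if n_bhb else float("nan")
    contamination = false_pos / n_other if n_other else float("nan")
    return completeness, contamination


# Toy data (hypothetical): raising the threshold removes the contaminant here.
toy_probs = [0.9, 0.8, 0.6, 0.4, 0.2]
toy_truth = [True, True, False, True, False]
loose = completeness_and_contamination(toy_probs, toy_truth, threshold=0.5)
strict = completeness_and_contamination(toy_probs, toy_truth, threshold=0.7)
```

In practice the threshold would be applied to the Psvm or modP columns of
table7.dat, and a stricter cut generally costs completeness as well.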
Description:
A catalogue of classifications of candidate BHB stars is presented.
The sample is drawn from SDSS data release 7, and has then been
filtered for consistency with the classifier training set as
described in the paper.
File Summary:
--------------------------------------------------------------------------------
 FileName    Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe          80        .   This file
table7.dat     314   294652   Catalogue of SDSS BHB candidates
tablea3.dat     78      152   Model data for the one-class SVM
                              (support vector machine) model
tablea4.dat     78     2645   Model data for the two-class SVM model
--------------------------------------------------------------------------------
See also:
II/294 : The SDSS Photometric Catalog, Release 7 (Adelman-McCarthy+, 2009)
http://www.sdss.org : SDSS Home Page
Byte-by-byte Description of file: table7.dat
--------------------------------------------------------------------------------
   Bytes Format Units Label   Explanations
--------------------------------------------------------------------------------
   1- 18 A18    ---   ObjID   SDSS ObjID from PhotoObj table
  23- 26 I4     ---   plate   ?=0 SDSS plate from specObj table when available
  31- 35 I5     ---   MJD     ?=0 SDSS MJD from specObj table when available
  40- 42 I3     ---   fiber   ?=0 SDSS fiber from specObj table when available
  47- 58 F12.8  deg   RAdeg   Right ascension in decimal degrees (J2000.0) (1)
  63- 74 F12.8  deg   DEdeg   Declination in decimal degrees (J2000.0) (1)
  79- 90 F12.8  deg   GLON    Galactic longitude (1)
  95-106 F12.8  deg   GLAT    Galactic latitude (1)
 111-117 F7.4   mag   umag    SDSS u-band PSF magnitude
 120-126 F7.4   mag   gmag    SDSS g-band PSF magnitude
 129-135 F7.4   mag   rmag    SDSS r-band PSF magnitude
 138-144 F7.4   mag   imag    SDSS i-band PSF magnitude
 147-153 F7.4   mag   zmag    SDSS z-band PSF magnitude
 156-162 F7.4   mag   e_umag  SDSS u-band magnitude error
 165-171 F7.4   mag   e_gmag  SDSS g-band magnitude error
 174-180 F7.4   mag   e_rmag  SDSS r-band magnitude error
 183-189 F7.4   mag   e_imag  SDSS i-band magnitude error
 192-198 F7.4   mag   e_zmag  SDSS z-band magnitude error
 201-207 F7.4   mag   uext    u-band extinction from SDSS pipeline (2)
 210-216 F7.4   mag   gext    g-band extinction from SDSS pipeline (2)
 219-225 F7.4   mag   rext    r-band extinction from SDSS pipeline (2)
 228-234 F7.4   mag   iext    i-band extinction from SDSS pipeline (2)
 237-243 F7.4   mag   zext    z-band extinction from SDSS pipeline (2)
 246-250 A5     ---   Type    Object category, from Xue et al.,
                                2008ApJ...684.1143X (3)
 255-260 F6.4   ---   Psvm    Probability of BHB from SVM (support vector
                                machine) model (4)
 265-270 F6.4   ---   e_Psvm  Standard deviation of Psvm for 10 trials (4)
 275-280 F6.4   ---   Prior   2D prior applied to probability
 285-291 F7.5   ---   modP    Psvm probability modified by prior (5)
 296-301 F6.2   kpc   Dist    Estimated distance in kpc (6)
 306-314 F9.4   ---   e_Dist  Fractional (relative) error on the distance
--------------------------------------------------------------------------------
Note (1): RA, DE, l, and b are taken from the SDSS PhotoObj table.
Note (2): Extinctions are calculated by comparing the model magnitude
          (from the PhotoObj table) to the dereddened magnitude
          (dered_u, etc.).
Note (3): Category from the treatment by Xue et al. (2008ApJ...684.1143X):
            None  = does not appear in the Xue et al. sample,
            BHB   = classified as BHB by both D0.2-fm and cγ-bγ methods,
            BS    = classified as blue straggler by the D0.2-fm method,
            MS    = classified as main-sequence star by the D0.2-fm method,
            Other = classified as BHB star by the D0.2-fm method but not by
                    the cγ-bγ method.
Note (4): Values less than 10^-4 are set to 10^-4.
Note (5): Values less than 10^-5 are set to 10^-5.
Note (6): Computed under the assumption that the star is a BHB star;
          not valid for non-BHB stars.
--------------------------------------------------------------------------------
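
The byte ranges above are enough to read table7.dat without any external
library. The following is a minimal sketch, not an official reader: the names
COLUMNS, parse_line and dereddened_colours are illustrative, the "?=0" flag is
interpreted as "0 means no value", and the dereddened colours assume that each
extinction is simply subtracted from the corresponding PSF magnitude.

```python
# Byte ranges (1-based, inclusive) copied from the byte-by-byte description.
COLUMNS = [
    ("ObjID", 1, 18, str),
    ("plate", 23, 26, int),
    ("MJD", 31, 35, int),
    ("fiber", 40, 42, int),
    ("RAdeg", 47, 58, float),
    ("DEdeg", 63, 74, float),
    ("GLON", 79, 90, float),
    ("GLAT", 95, 106, float),
]
# The 15 photometric fields are F7.4 columns starting at byte 111, 9 bytes apart.
PHOT_LABELS = (
    [b + "mag" for b in "ugriz"]
    + ["e_" + b + "mag" for b in "ugriz"]
    + [b + "ext" for b in "ugriz"]
)
COLUMNS += [
    (label, 111 + 9 * k, 117 + 9 * k, float)
    for k, label in enumerate(PHOT_LABELS)
]
COLUMNS += [
    ("Type", 246, 250, str),
    ("Psvm", 255, 260, float),
    ("e_Psvm", 265, 270, float),
    ("Prior", 275, 280, float),
    ("modP", 285, 291, float),
    ("Dist", 296, 301, float),
    ("e_Dist", 306, 314, float),
]


def parse_line(line):
    """Convert one 314-character record of table7.dat into a dict."""
    record = {}
    for label, start, end, cast in COLUMNS:
        text = line[start - 1:end].strip()
        record[label] = cast(text) if text else None
    # "?=0" in the description: a value of 0 marks a missing spectrum.
    for label in ("plate", "MJD", "fiber"):
        if record[label] == 0:
            record[label] = None
    return record


def dereddened_colours(record):
    """Return dereddened u-g, g-r, r-i, i-z colours.

    Assumption: dereddened magnitude = PSF magnitude - extinction.
    """
    dered = {b: record[b + "mag"] - record[b + "ext"] for b in "ugriz"}
    return {
        blue + "-" + red: dered[blue] - dered[red]
        for blue, red in zip("ugri", "griz")
    }
```

A file would be read with something like
`[parse_line(l.rstrip("\n")) for l in open("table7.dat")]`. Note that Dist is
meaningful only for objects accepted as BHB stars (see Note 6).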
Byte-by-byte Description of file: tablea3.dat tablea4.dat
--------------------------------------------------------------------------------
   Bytes Format Units Label   Explanations
--------------------------------------------------------------------------------
   1- 14 F14.8  ---   y.alpha Product of weight and class label for each
                                support vector
  19- 30 F12.8  ---   u-g     Standardized dereddened u-g colour
  35- 46 F12.8  ---   g-r     Standardized dereddened g-r colour
  51- 62 F12.8  ---   r-i     Standardized dereddened r-i colour
  67- 78 F12.8  ---   i-z     Standardized dereddened i-z colour
--------------------------------------------------------------------------------
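
The model files list one support vector per record, so a decision value can in
principle be formed as the kernel-weighted sum of the y.alpha coefficients plus
an offset. The sketch below illustrates that generic SVM form only. This ReadMe
does not give the kernel, its width, the offset, or the constants used to
standardize the colours, so the radial basis function kernel and the arguments
gamma and bias are assumptions that must be replaced with the values from the
paper. The function names are illustrative.

```python
import math


def parse_model_line(line):
    """Read one record of tablea3.dat or tablea4.dat.

    Returns (y_alpha, (u-g, g-r, r-i, i-z)) using the byte ranges above.
    """
    y_alpha = float(line[0:14])
    colours = tuple(
        float(line[start - 1:end])
        for start, end in ((19, 30), (35, 46), (51, 62), (67, 78))
    )
    return y_alpha, colours


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel; ASSUMED here, not stated in this ReadMe."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def decision_value(model, x, gamma, bias):
    """Generic SVM decision value: sum_i y.alpha_i * K(sv_i, x) + bias.

    model : list of (y_alpha, colours) tuples from parse_model_line
    x     : standardized dereddened colours (u-g, g-r, r-i, i-z) of a star
    gamma, bias : hypothetical placeholders; take real values from the paper
    """
    return sum(c * rbf_kernel(sv, x, gamma) for c, sv in model) + bias
```

The input colours must be standardized in the same way as the training set
before they are compared with the support vectors, and turning a decision
value into a probability such as Psvm requires a further calibration step
that is not described here.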
Acknowledgements:
K.W. Smith, smith(at)mpia-hd.mpg.de
References:
Xue, Rix, Zhao, et al., 2008ApJ...684.1143X
(End) Patricia Vannier [CDS] 05-Aug-2010