J/MNRAS/427/2917    Classification of Hipparcos variables    (Rimoldini+, 2012)

Automated classification of Hipparcos unsolved variables.
    Rimoldini L., Dubath P., Suveges M., Lopez M., Sarro L.M., Blomme J.,
    De Ridder J., Cuypers J., Guy L., Mowlavi N., Lecoeur-Taibi I., Beck M.,
    Jan A., Nienartowicz K., Ordonez-Blanco D., Lebzelter T., Eyer L.
   <Mon. Not. R. Astron. Soc. 427, 2917 (2012)>
   =2012MNRAS.427.2917R
ADC_Keywords: Models ; Stars, variable ; Photometry, classification
Keywords: methods: data analysis - catalogues - stars: variables: general

Description:
    The Hipparcos catalogue (ESA 1997, Cat. I/239) and the AAVSO Variable
    Star Index (Watson et al., 2011, Cat. B/vsx) are employed to complement
    the training set of periodic variables of Dubath et al. (2011, Cat.
    J/MNRAS/414/2602) with irregular and non-periodic representatives,
    leading to 3881 sources in total, which describe 24 variability types.
    The attributes employed to characterize light-curve features are
    selected according to their relevance for classification. Classifier
    models are produced with random forests and a multi-stage methodology
    based on Bayesian networks, achieving overall misclassification rates
    under 12%. Both classifiers are applied to predict variability types
    for 6051 Hipparcos variables associated with uncertain or missing types
    in the literature.

File Summary:
--------------------------------------------------------------------------------
 FileName    Lrecl  Records  Explanations
--------------------------------------------------------------------------------
ReadMe          80        .  This file
table2.dat     111     3881  Training set of Hipparcos variable stars
table4.dat     101     6051  Prediction set of Hipparcos unsolved variables
table5.dat      68     6051  Predictions of variability types
tablec1.dat    176     6051  Full random forest prediction probability arrays
tablec2.dat    176     6051  Full multi-stage Bayesian nets prediction
                              probability arrays
ori.tar        512     7080  Original files
--------------------------------------------------------------------------------

See also:
            I/239 : The Hipparcos and Tycho Catalogues (ESA 1997)
            I/311 : Hipparcos, the New Reduction (van Leeuwen, 2007)
            B/vsx : AAVSO International Variable Star Index VSX
                    (Watson+, 2006-12)
 J/MNRAS/414/2602 : HIP variable automated classification (Dubath+, 2011)

Byte-by-byte Description of file: table[24].dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1-  6  I6    ---     HIP       [1/120404] Hipparcos number
   8- 12  F5.2  mag     V-I       Reddened V-I colour index in Cousins' system,
                                   as provided by ESA (1997) (1)
  14- 19  F6.2  ---     Skew      Unbiased skewness of the distribution of HIP
                                   magnitudes (Skewness) (2)
  21- 25  F5.2  [mag]   logAmp    Decadic logarithm of the difference between
                                   the faintest and the brightest values of the
                                   light-curve model (LogAmplitude) (3)
  27- 33  F7.4  [d]     logPer    Decadic logarithm of the period (LogPeriod)
                                   (4)
  35- 40  F6.2  mag     MAG       Absolute magnitude in the Hipparcos band
                                   (AbsoluteMag) (5)
  42- 48  F7.2  [-]     logFAP    [,0] Decadic logarithm of the probability
                                   that the maximum peak in the Lomb-Scargle
                                   periodogram (Scargle 1982ApJ...263..835S) is
                                   due to noise rather than the true signal (6)
  50- 54  F5.2  [-]     logP2P    Decadic logarithm of the point-to-point
                                   scatter of the time series
                                   (LogP2PscatterFoldedRaw) (7)
  56- 60  F5.2  [-]     logQSOvar Decadic logarithm of the reduced chi-square
                                   of the source variability with respect to a
                                   parametrized QSO variance model (8)
  62- 66  F5.2  [-]     logScRaw  Decadic logarithm of the ratio between the
                                   median of absolute deviations from the median
                                   of the raw time series and the median of
                                   absolute values of the residual time series
                                   (logScatterRawRes) (9)
  68- 72  F5.2  [mas]   logPlx    Decadic logarithm of the parallax value as
                                   provided by ESA (2007) (LogParallax) (10)
  74- 78  F5.2  [mag]   logSt     Decadic logarithm of the unbiased standard
                                   deviation of the residual time series
                                   (logStdDevRes) (11)
  80- 84  F5.2  [mag]   logSVar   Decadic logarithm of the average of absolute
                                   values of magnitude differences between all
                                   pairs of measurements separated by
                                   time-scales from 0.01 to 0.1 day
                                   (logShortVar) (12)
  86- 89  F4.2  ---     Sum       Ratio between the sum of squared residuals of
                                   the model from the raw data and the sum of
                                   squared deviations of the raw time series
                                   from its mean value (SumSqResRaw) (13)
  91- 95  F5.2  deg     |b|       Absolute value of the Galactic latitude of
                                   the source position (AbsGLAT) (14)
  97-101  F5.2  10-4/d  eFreq     Error estimate of the derived frequency
                                   (FrequencyError) (15)
 103-111  A9    ---     Type      Variability type, only in table2 (16)
--------------------------------------------------------------------------------
Note (1): The reddened V-I colour index in Cousins' system, as provided by
    ESA (1997, I/239).
Note (2): The unbiased skewness of the distribution of Hipparcos magnitudes,
    weighted by the inverse of squared measurement uncertainties.
Note (3): The decadic logarithm of the difference between the faintest and
    the brightest values of the light-curve model.
Note (4): The decadic logarithm of the period computed with the generalized
    Lomb-Scargle method (Zechmeister & Kurster 2009A&A...496..577Z) for
    sources with weighted skewness of the magnitude distribution smaller
    than 1.6. Periods of sources with skewness greater than 1.6 are computed
    with the classical (unweighted) Lomb-Scargle method
    (Lomb 1976Ap&SS..39..447L; Scargle 1982ApJ...263..835S).
    Limitations regarding the recovered periods are described in Sec. 4.2
    of the paper.
Note (5): The absolute magnitude in the Hipparcos band employing the
    parallax described in logPlx and neglecting interstellar absorption.
Note (6): The decadic logarithm of the probability that the maximum peak in
    the Lomb-Scargle periodogram (Scargle 1982ApJ...263..835S) is due to
    noise rather than the true signal, employing the beta distribution as
    indicated by Schwarzenberg-Czerny (1998MNRAS.301..831S). The computation
    assumed a number of independent frequencies equal to the number of
    frequencies tested divided by an oversampling factor (estimated by the
    largest value between one and the inverse of the product of the
    frequency spacing employed and the time-series duration).
Note (7): The decadic logarithm of the point-to-point scatter of the time
    series folded with twice the recovered period (measured by the sum of
    squared magnitude differences between successive measurements in phase)
    divided by the same quantity computed on the raw time series (i.e., with
    respect to successive measurements in time).
Note (8): The decadic logarithm of the reduced chi-square of the source
    variability with respect to a parametrized quasar variance model,
    denoted by χ^2_QSO/ν in Butler & Bloom (2011AJ....141...93B). Following
    Richards et al. (2011ApJ...733...10R), the parameter values employed for
    the Hipparcos data correspond to the SDSS g-band at fixed magnitude
    of 19.
Note (9): The decadic logarithm of the ratio between the median of absolute
    deviations from the median of the raw time series and the median of
    absolute values of the residual time series (obtained by subtracting
    model values from the raw time series).
Note (10): The decadic logarithm of the parallax value as from the new
    reduction of the Hipparcos raw data (van Leeuwen, 2007, I/311).
    Non-positive values of parallax are replaced by positive values randomly
    extracted from a Gaussian distribution with zero mean and standard
    deviation equal to the measurement uncertainty.
Note (11): The decadic logarithm of the unbiased standard deviation of the
    residual time series, weighted by the inverse of squared measurement
    uncertainties.
Note (12): The decadic logarithm of the average of absolute values of
    magnitude differences between all pairs of measurements separated by
    time-scales from 0.01 to 0.1 day.
Note (13): The ratio between the sum of squared residuals of the model from
    the raw data and the sum of squared deviations of the raw time series
    from its mean value.
Note (14): The absolute value of the Galactic latitude of the source
    position.
Note (15): The error estimate of the derived frequency (multiplied by
    10000), under the assumption of equidistant observations of a sinusoidal
    signal (Kovacs 1981Ap&SS..78..175K; Baliunas et al. 1985ApJ...294..310B;
    Gilliland & Fisher 1985PASP...97..285G).
Note (16): Variability types mostly from the AAVSO Variable Star Index
    (Watson et al., 2011, Cat. B/vsx; see also the "Note (G1)" below); other
    sources are detailed in the paper.
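
The byte ranges in the description of table[24].dat can be turned into a
small fixed-width reader. The following is only a minimal sketch in Python,
not code from the paper: the names TABLE24_FIELDS, parse_record and
period_days are hypothetical, and blank fields are assumed to mean "no
value". Byte positions are 1-based and inclusive, as in the table above.

```python
from typing import Optional

# (label, first byte, last byte, converter), copied from the byte-by-byte
# description of table[24].dat; positions are 1-based and inclusive.
TABLE24_FIELDS = [
    ("HIP", 1, 6, int),
    ("V-I", 8, 12, float),
    ("Skew", 14, 19, float),
    ("logAmp", 21, 25, float),
    ("logPer", 27, 33, float),
    ("MAG", 35, 40, float),
    ("logFAP", 42, 48, float),
    ("logP2P", 50, 54, float),
    ("logQSOvar", 56, 60, float),
    ("logScRaw", 62, 66, float),
    ("logPlx", 68, 72, float),
    ("logSt", 74, 78, float),
    ("logSVar", 80, 84, float),
    ("Sum", 86, 89, float),
    ("|b|", 91, 95, float),
    ("eFreq", 97, 101, float),
    ("Type", 103, 111, str),  # present in table2.dat only
]


def parse_record(line: str) -> dict:
    """Slice one fixed-width record; blank or missing fields become None
    (an assumption of this sketch, not a rule stated in the ReadMe)."""
    record = {}
    for label, first, last, convert in TABLE24_FIELDS:
        text = line[first - 1:last].strip()
        record[label] = convert(text) if text else None
    return record


def period_days(record: dict) -> Optional[float]:
    """Undo the decadic logarithm of logPer to recover the period in days."""
    log_per = record["logPer"]
    return None if log_per is None else 10.0 ** log_per
```

With the catalogue files available locally, each line of table2.dat or
table4.dat could be passed to parse_record; for table4.dat records (101
bytes) the Type entry simply comes back as None.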
--------------------------------------------------------------------------------

Byte-by-byte Description of file: table5.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1-  6  I6    ---     HIP       [1/120404] Hipparcos number
   8-  9  A2    ---     Set       Hipparcos sets from which the sources have
                                   been selected (HipparcosSet) (17)
  11- 15  A5    ---     HIPtype   Variability types as listed in HIP
                                   (HipparcosType) (18)
  17- 38  A22   ---     VXtype    Variability types as listed in AAVSO (19)
  40- 48  A9    ---     RFtype    Variability types predicted by random
                                   forests (PredictedTypeRF) (20)
  50- 58  A9    ---     MBtype    Variability types predicted by a multi-stage
                                   methodology based on Bayesian networks
                                   (PredictedTypeMB) (21)
  60- 63  F4.2  ---     prRF      [0/1] Probability of the variability type
                                   predicted by random forests (ProbabilityRF)
  65- 68  F4.2  ---     prMB      [0/1] Probability of the variability type
                                   predicted by a multi-stage methodology based
                                   on Bayesian networks (ProbabilityMB)
--------------------------------------------------------------------------------
Note (17): The Hipparcos sets from which the sources have been selected
    (U1, U2 [unsolved], and M [micro-variable]), see Sec. 2 of the paper.
Note (18): Variability types from literature as listed in the Hipparcos
    catalogue (ESA 1997, I/239), when available.
Note (19): Variability types from literature included in the AAVSO Variable
    Star Index, Version 2011-01-16 (Watson et al., 2011, B/vsx), when
    available.
Note (20): Variability types predicted by random forests (limited to single
    types only); see the types in the "Note (G1)" below.
Note (21): Variability types predicted by a multi-stage methodology based on
    Bayesian networks (limited to single types only); see the types in the
    "Note (G1)" section below.
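
Since table5.dat lists the predictions of both classifiers side by side, a
reader may want to select sources on which they agree. The sketch below is
illustrative only: parse_table5 and classifiers_agree are hypothetical names,
and the min_prob threshold is an arbitrary choice of this example, not a
criterion taken from the paper.

```python
# (label, first byte, last byte, converter), copied from the byte-by-byte
# description of table5.dat; positions are 1-based and inclusive.
TABLE5_FIELDS = [
    ("HIP", 1, 6, int),
    ("Set", 8, 9, str),
    ("HIPtype", 11, 15, str),
    ("VXtype", 17, 38, str),
    ("RFtype", 40, 48, str),
    ("MBtype", 50, 58, str),
    ("prRF", 60, 63, float),
    ("prMB", 65, 68, float),
]


def parse_table5(line: str) -> dict:
    """Slice one table5.dat record; blank fields become None."""
    record = {}
    for label, first, last, convert in TABLE5_FIELDS:
        text = line[first - 1:last].strip()
        record[label] = convert(text) if text else None
    return record


def classifiers_agree(record: dict, min_prob: float = 0.5) -> bool:
    """True when random forests and the multi-stage Bayesian networks predict
    the same type, each with probability >= min_prob.

    min_prob is a hypothetical threshold chosen for illustration only.
    """
    values = (record["RFtype"], record["MBtype"],
              record["prRF"], record["prMB"])
    if any(value is None for value in values):
        return False
    same_type = record["RFtype"] == record["MBtype"]
    return same_type and min(record["prRF"], record["prMB"]) >= min_prob
```

Raising min_prob keeps fewer sources, and the appropriate value depends on
the intended use.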
--------------------------------------------------------------------------------

Byte-by-byte Description of file: tablec?.dat
--------------------------------------------------------------------------------
   Bytes Format Units   Label     Explanations
--------------------------------------------------------------------------------
   1-  6  I6    ---     HIP       [1/120404] Hipparcos number
   8- 11  F4.2  ---     IX        Probability of the source to be of type
                                   I_X, as predicted by random forests
                                   (ProbabilityI_X) (G1)
  13- 16  F4.2  ---     LPVP      Probability of the source to be of type
                                   LPV_P, as predicted by random forests
                                   (ProbabilityLPV_P) (G1)
  18- 21  F4.2  ---     LPVX      Probability of the source to be of type
                                   LPV_X, as predicted by random forests
                                   (ProbabilityLPV_X) (G1)
  23- 26  F4.2  ---     RS+BYP    Probability of the source to be of type
                                   RS+BY_P, as predicted by random forests
                                   (ProbabilityRS+BY_P) (G1)
  28- 31  F4.2  ---     RS+BYX    Probability of the source to be of type
                                   RS+BY_X, as predicted by random forests
                                   (ProbabilityRS+BY_X) (G1)
  33- 36  F4.2  ---     BE+GCASP  Probability of the source to be of type
                                   BE+GCAS_P, as predicted by random forests
                                   (ProbabilityBE+GCAS_P) (G1)
  38- 41  F4.2  ---     BE+GCASX  Probability of the source to be of type
                                   BE+GCAS_X, as predicted by random forests
                                   (ProbabilityBE+GCAS_X) (G1)
  43- 46  F4.2  ---     SPBP      Probability of the source to be of type
                                   SPB_P, as predicted by random forests
                                   (ProbabilitySPB_P) (G1)
  48- 51  F4.2  ---     ACVP      Probability of the source to be of type
                                   ACV_P, as predicted by random forests
                                   (ProbabilityACV_P) (G1)
  53- 56  F4.2  ---     ACVX      Probability of the source to be of type
                                   ACV_X, as predicted by random forests
                                   (ProbabilityACV_X) (G1)
  58- 61  F4.2  ---     EAP       Probability of the source to be of type
                                   EA_P, as predicted by random forests
                                   (ProbabilityEA_P) (G1)
  63- 66  F4.2  ---     EAX       Probability of the source to be of type
                                   EA_X, as predicted by random forests
                                   (ProbabilityEA_X) (G1)
  68- 71  F4.2  ---     EBP       Probability of the source to be of type
                                   EB_P, as predicted by random forests
                                   (ProbabilityEB_P) (G1)
  73- 76  F4.2  ---     EWP       Probability of the source to be of type
                                   EW_P, as predicted by random forests
                                   (ProbabilityEW_P) (G1)
  78- 81  F4.2  ---     ELLP      Probability of the source to be of type
                                   ELL_P, as predicted by random forests
                                   (ProbabilityELL_P) (G1)
  83- 86  F4.2  ---     ACYGP     Probability of the source to be of type
                                   ACYG_P, as predicted by random forests
                                   (ProbabilityACYG_P) (G1)
  88- 91  F4.2  ---     ACYGX     Probability of the source to be of type
                                   ACYG_X, as predicted by random forests
                                   (ProbabilityACYG_X) (G1)
  93- 96  F4.2  ---     BCEPP     Probability of the source to be of type
                                   BCEP_P, as predicted by random forests
                                   (ProbabilityBCEP_P) (G1)
  98-101  F4.2  ---     BCEPX     Probability of the source to be of type
                                   BCEP_X, as predicted by random forests
                                   (ProbabilityBCEP_X) (G1)
 103-106  F4.2  ---     DCEPSP    Probability of the source to be of type
                                   DCEPS_P, as predicted by random forests
                                   (ProbabilityDCEPS_P) (G1)
 108-111  F4.2  ---     DCEPP     Probability of the source to be of type
                                   DCEP_P, as predicted by random forests
                                   (ProbabilityDCEP_P) (G1)
 113-116  F4.2  ---     CEP(B)P   Probability of the source to be of type
                                   CEP(B)_P, as predicted by random forests
                                   (ProbabilityCEP(B)_P) (G1)
 118-121  F4.2  ---     RRABP     Probability of the source to be of type
                                   RRAB_P, as predicted by random forests
                                   (ProbabilityRRAB_P) (G1)
 123-126  F4.2  ---     RRCP      Probability of the source to be of type
                                   RRC_P, as predicted by random forests
                                   (ProbabilityRRC_P) (G1)
 128-131  F4.2  ---     GDORP     Probability of the source to be of type
                                   GDOR_P, as predicted by random forests
                                   (ProbabilityGDOR_P) (G1)
 133-136  F4.2  ---     GDORX     Probability of the source to be of type
                                   GDOR_X, as predicted by random forests
                                   (ProbabilityGDOR_X) (G1)
 138-141  F4.2  ---     DSCTP     Probability of the source to be of type
                                   DSCT_P, as predicted by random forests
                                   (ProbabilityDSCT_P) (G1)
 143-146  F4.2  ---     DSCTX     Probability of the source to be of type
                                   DSCT_X, as predicted by random forests
                                   (ProbabilityDSCT_X) (G1)
 148-151  F4.2  ---     DSCTCP    Probability of the source to be of type
                                   DSCTC_P, as predicted by random forests
                                   (ProbabilityDSCTC_P) (G1)
 153-156  F4.2  ---     DSCTCX    Probability of the source to be of type
                                   DSCTC_X, as predicted by random forests
                                   (ProbabilityDSCTC_X) (G1)
 158-161  F4.2  ---     CWAP      Probability of the source to be of type
                                   CWA_P, as predicted by random forests
                                   (ProbabilityCWA_P) (G1)
 163-166  F4.2  ---     CWBP      Probability of the source to be of type
                                   CWB_P, as predicted by random forests
                                   (ProbabilityCWB_P) (G1)
 168-171  F4.2  ---     SXARIP    Probability of the source to be of type
                                   SXARI_P, as predicted by random forests
                                   (ProbabilitySXARI_P) (G1)
 173-176  F4.2  ---     RVP       Probability of the source to be of type
                                   RV_P, as predicted by random forests
                                   (ProbabilityRV_P) (G1)
--------------------------------------------------------------------------------

Global Notes:
Note (G1): the variability types include a suffix _P for periodic variables,
    or _X for unsolved variables. The classifications, defined in table 1 of
    the paper, are:
          I = Irregular
        LPV = Long period variables
      RS+BY = RS CVn- and BY Dra-type variables (rotating dKe or dMe stars)
    BE+GCAS = B-type emission line star and γ Cas variables
        SPB = Slowly pulsating B-type stars
        ACV = α2 CVn-type variables (rotating Ap stars)
         EA = eclipsing binaries of Algol type (detached)
         EB = eclipsing binaries of β Lyr type (semi-detached)
         EW = eclipsing binaries of W UMa type (contact)
        ELL = ellipsoidal rotating variables
       ACYG = α Cyg variables (pulsating early-type supergiants)
       BCEP = β Cep variables (pulsating early-type)
       DCEP = classical Cepheid (δ Cep type)
      DCEPS = First overtone Cepheid
     CEP(B) = Multimode Cepheid
       RRAB = RR Lyr asymmetric light curve
        RRC = RR Lyr with nearly symmetric light curve
       GDOR = γ Dor type (early F-type pulsating star)
       DSCT = δ Scuti variable (pulsating A0-F5 stars); includes SX Phe-type
              stars
      DSCTC = low-amplitude δ Scuti variables
        CWA = pulsating variables of W Vir type with period>8d
        CWB = pulsating variables of W Vir type with period<8d (BL Her-type)
      SXARI = SX Ari-type star (rotating variable of Bp type)
         RV = RV Tau-type (radially pulsating F-G supergiants)

History:
  * 29-Nov-2012: on-line version
  * 24-Jan-2013: tables 2, 4 and 5 corrected (from author)
  * 20-Aug-2013: label logScFol corrected into logScQSO (from author)

Acknowledgements:
    Lorenzo Rimoldini, lorenzo(at)rimoldini.info
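
The probability arrays of tablec?.dat (per the File Summary, tablec1.dat for
random forests and tablec2.dat for the multi-stage Bayesian networks) follow
a regular layout: after the HIP number, the 34 F4.2 columns start at byte 8
and repeat every 5 bytes. The Python sketch below exploits that regularity.
The names TYPE_NAMES, parse_probabilities and most_probable_type are
hypothetical, and skipping blank fields is an assumption of this example.

```python
# Type names in column order, as given in the explanations of tablec?.dat.
TYPE_NAMES = [
    "I_X", "LPV_P", "LPV_X", "RS+BY_P", "RS+BY_X", "BE+GCAS_P", "BE+GCAS_X",
    "SPB_P", "ACV_P", "ACV_X", "EA_P", "EA_X", "EB_P", "EW_P", "ELL_P",
    "ACYG_P", "ACYG_X", "BCEP_P", "BCEP_X", "DCEPS_P", "DCEP_P", "CEP(B)_P",
    "RRAB_P", "RRC_P", "GDOR_P", "GDOR_X", "DSCT_P", "DSCT_X", "DSCTC_P",
    "DSCTC_X", "CWA_P", "CWB_P", "SXARI_P", "RV_P",
]


def parse_probabilities(line: str):
    """Return (HIP number, {type name: probability}) for one record."""
    hip = int(line[0:6])
    probabilities = {}
    for index, name in enumerate(TYPE_NAMES):
        start = 7 + 5 * index          # 0-based offset of bytes 8, 13, 18, ...
        text = line[start:start + 4].strip()
        if text:                       # assumption: blank means no value
            probabilities[name] = float(text)
    return hip, probabilities


def most_probable_type(probabilities: dict):
    """Return the (type name, probability) pair with the largest probability."""
    name = max(probabilities, key=probabilities.get)
    return name, probabilities[name]
```

The most probable type of a record would normally be expected to match the
RFtype or MBtype entry of table5.dat for the same HIP number, which gives a
simple consistency check when reading the files.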
(End) L. Rimoldini [Geneva Obs./ISDC, Switzerland], P. Vannier [CDS] 19-Mar-2012