J/MNRAS/509/2566 VIVACE, VIrac VAriable Classification Ensemble (Molnar+, 2022)
Variable star classification across the Galactic bulge and disc with the VISTA
Variables in the Via Lactea survey.
Molnar T.A., Sanders J.L., Smith, L.C., Belokurov V., Lucas P., Minniti D.
<Mon. Not. R. Astron. Soc. 509, 2566-2592 (2022)>
=2022MNRAS.509.2566M (SIMBAD/NED BibCode)
ADC_Keywords: Surveys ; Stars, variable ; Binaries, eclipsing
Keywords: catalogues - surveys - binaries: eclipsing -
stars: variables: general - stars: variables: RR Lyrae
Abstract:
We present VIVACE, the VIrac VAriable Classification Ensemble, a
catalogue of variable stars extracted from an automated classification
pipeline for the Vista Variables in the Via Lactea (VVV) infrared
survey of the Galactic bar/bulge and southern disc. Our procedure
utilises a two-stage hierarchical classifier to first isolate likely
variable sources using simple variability summary statistics and
training sets of non-variable sources from the Gaia early third data
release, and then classify candidate variables using more detailed
light curve statistics and training labels primarily from OGLE and
VSX. The methodology is applied to point-spread-function photometry
for ∼490 million light curves from the VIRAC v2 astrometric and
photometric catalogue resulting in a catalogue of ∼1.4 million likely
variable stars, of which ∼39000 are high-confidence (classification
probability >0.9) RR Lyrae ab stars, ∼8000 RR Lyrae c/d stars, ∼187000
detached/semi-detached eclipsing binaries, ∼18000 contact eclipsing
binaries, ∼1400 classical Cepheid variables and ∼2200 Type II Cepheid
variables. Comparison with OGLE-4 suggests a completeness of around 90
per cent for RRab and ∼60 per cent for RRc/d, and a
misclassification rate for known RR Lyrae stars of around 1 per cent
for the high confidence sample. We close with two science
demonstrations of our new VIVACE catalogue: first, a brief
investigation of the spatial and kinematic properties of the RR Lyrae
stars within the disc/bulge, demonstrating the spatial elongation of
bar-bulge RR Lyrae stars is in the same sense as the more metal-rich
red giant population whilst having a slower rotation rate of ∼40
km/s/kpc; and secondly, an investigation of the Gaia EDR3 parallax
zeropoint using contact eclipsing binaries across the Galactic disc
plane and bulge.
Description:
VIVACE (VIrac VAriable Classification Ensemble)
A catalogue of variable classifications resulting from an automated
hierarchical classification algorithm applied to the VIRAC-2 catalogue
of VVV data.
VIRAC-2 uses point-spread-function photometry on all VVV images and
VVV-X images in the original VVV footprint. There is a new photometric
zeropoint calibration relative to 2MASS. The source catalogue has been
grouped into detections of the same source to produce sets of light
curves and an astrometric solution.
VIVACE applies an initial classification to the set of VIRAC-2 light
curves using a set of simple variability statistics trained on a
combined set of variable sources and constant sources identified from
Gaia EDR3. This gives a set of likely variable candidates we consider
for further processing. More detailed features are computed for these
candidates (e.g. Fourier features) and a second classification is
performed to determine the detailed variable type.
The table is split into 4 sections.
1. the properties computed from our light curve processing
(e.g. Fourier parameters),
2. recalibrated photometry from VIRAC-2,
3. variability indices computed from the VIRAC-2 photometric
timeseries,
4. Gaia EDR3 astrometry and photometry for sources cross-matched
within 0.5 arcsec (using proper motions to correct for the epoch
difference).
Note that some quantities in section 1 duplicate quantities in section 3,
but the quality cuts performed on the light curves are different so the
quantities will not exactly agree. The first stage used statistics
computed using all unambiguous detections (detections shared with
non-duplicate sources).
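The epoch correction used for the Gaia EDR3 cross-match (section 4) can be sketched as below. This is a minimal illustration, not the authors' code: it assumes a linear proper-motion propagation from the Gaia epoch (2016.0) to the VIRAC-2 epoch (2014.0), and the function names and example coordinates are hypothetical.

```python
import math

MAS_TO_DEG = 1.0 / 3.6e6  # milliarcseconds to degrees

def propagate(ra_deg, dec_deg, pmra_masyr, pmdec_masyr, dt_yr):
    """Linearly move a position by dt_yr years; pmra is pmRA*cos(Dec)."""
    dec_new = dec_deg + pmdec_masyr * dt_yr * MAS_TO_DEG
    ra_new = ra_deg + pmra_masyr * dt_yr * MAS_TO_DEG / math.cos(math.radians(dec_deg))
    return ra_new, dec_new

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation from the haversine formula, in arcsec."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0

def is_match(virac_ra, virac_dec, gaia_ra, gaia_dec, pmra, pmdec,
             gaia_epoch=2016.0, virac_epoch=2014.0, max_sep=0.5):
    """True if the epoch-corrected Gaia position lies within max_sep arcsec."""
    ra, dec = propagate(gaia_ra, gaia_dec, pmra, pmdec, virac_epoch - gaia_epoch)
    return separation_arcsec(virac_ra, virac_dec, ra, dec) <= max_sep

# Hypothetical source with proper motion (+100, -50) mas/yr
ra14, dec14 = propagate(266.0, -30.0, 100.0, -50.0, 2014.0 - 2016.0)
shift = separation_arcsec(266.0, -30.0, ra14, dec14)  # about 0.22 arcsec
```

For this invented source the two-year epoch difference moves the position by roughly 0.22 arcsec, a sizeable fraction of the 0.5 arcsec matching radius, which is why the correction matters for high proper motion stars.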
File Summary:
--------------------------------------------------------------------------------
FileName Lrecl Records Explanations
--------------------------------------------------------------------------------
ReadMe 80 . This file
vivace.dat 1539 1364732 VIrac VAriable Classification Ensemble catalog
--------------------------------------------------------------------------------
See also:
II/337 : VISTA Variables in the Via Lactea Survey DR1 (Saito+, 2012)
II/348 : VISTA Variables in the Via Lactea Survey DR2 (Minniti+, 2017)
Byte-by-byte Description of file: vivace.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 7 I7 --- VIVACE Unique VIVACE source ID,
a running index (vivace_id)
9- 18 A10 --- Class Most likely variability class
(class) (1)
20- 29 F10.8 --- Prob Predicted probability of most likely
variability class (prob) (2)
31- 53 F23.18 d Period Recommended period (corrected for
aliases where possible and matches
conventions for EW) (period)
55- 72 F18.14 deg RAdeg VIRAC-2 right ascension (ICRS) at
epoch 2014.0 if proper motion
computed (ra)
74- 92 F19.15 deg DEdeg VIRAC-2 declination (ICRS)
at epoch 2014.0 if proper
motion computed (dec)
94- 115 E22.17 deg GLON Galactic longitude computed from
VIRAC positions (l)
117- 140 E24.17 deg GLAT Galactic latitude computed from
VIRAC positions (b)
142- 145 I4 --- lsqKsEp [21/2052] Number of Ks epochs used in
cleaned light curve (kslsqn_epochs)
147- 158 E12.7 mag lsqKsAmp Amplitude computed from the
best-fitting Fourier series
(kslsqamplitude) (3)
160- 182 F23.18 d lsqPer Best period from the Fourier
least-squares routine
(lsq_period) (4)
184- 197 E14.9 d e_lsqPer Error of best period from Fourier
least-squares routine
(lsqperioderror)
199- 200 I2 --- lsqN [4/10] No. of Fourier terms used in
best-fitting Fourier series
(lsq_nterms) (5)
202- 214 E13.8 mag lsqAmp1 Amplitude of the 1st Fourier term
(lsqamp1)
216- 228 E13.8 mag lsqAmp2 Amplitude of the 2nd Fourier term
(lsqamp2)
230- 242 E13.8 mag lsqAmp3 Amplitude of the 3rd Fourier term
(lsqamp3)
244- 256 E13.8 mag lsqAmp4 Amplitude of the 4th Fourier term
(lsqamp4)
258- 272 E15.8 rad lsqphi1 Phase of the 1st Fourier term
(lsqphi1)
274- 289 E16.9 rad lsqphi2 Phase of the 2nd Fourier term
(lsqphi2)
291- 305 E15.8 rad lsqphi3 Phase of the 3rd Fourier term
(lsqphi3)
307- 321 E15.8 rad lsqphi4 Phase of the 4th Fourier term
(lsqphi4)
323- 335 E13.8 mag lsqAmpd1 Amplitude of the 1st Fourier term at
twice the best-fitting Fourier period
(lsqampdouble_1)
337- 349 E13.8 mag lsqAmpd2 Amplitude of the 2nd Fourier term at
twice the best-fitting Fourier period
(lsqampdouble_2)
351- 364 E14.9 mag lsqAmpd3 Amplitude of the 3rd Fourier term at
twice the best-fitting Fourier period
(lsqampdouble_3)
366- 378 E13.8 mag lsqAmpd4 Amplitude of the 4th Fourier term at
twice the best-fitting Fourier period
(lsqampdouble_4)
380- 394 E15.8 rad lsqphid1 Phase of the 1st Fourier term at twice
the best-fitting Fourier period
(lsqphidouble_1)
396- 410 E15.8 rad lsqphid2 Phase of the 2nd Fourier term at
twice the best-fitting Fourier period
(lsqphidouble_2)
412- 426 E15.8 rad lsqphid3 Phase of the 3rd Fourier term at
twice the best-fitting Fourier period
(lsqphidouble_3)
428- 442 E15.8 rad lsqphid4 Phase of the 4th Fourier term at
twice the best-fitting Fourier period
(lsqphidouble_4)
444- 454 F11.9 mag KsAmp Max-Min magnitude (ks_amplitude)
456- 466 F11.9 --- Ksfracdet Fraction of Ks observations beyond one
standard deviation of mean magnitude
(ksfracdetectionsoutside1sigma)
468- 484 F17.8 --- lsqdlogl Difference in log-likelihood between
Fourier fit and constant magnitude
solution (no scaling of
uncertainties) (deltalogllsq_const)
486- 500 E15.8 --- log10LSfap Base-10 logarithm of the false-alarm
probability from the Lomb-Scargle
periodogram (log10lombscarglefap)
502- 516 F15.9 --- LSmaxpow Maximum power in Lomb-Scargle
periodogram (lombscarglemaxpow)
518- 529 F12.10 --- Ksphase Maximum phase difference between
consecutive observations in
phase-folded lightcurve
(ksphasediffmax) (6)
531- 543 F13.9 --- LSpowRat Maximum Lomb-Scargle power (max_pow)
minus mean power all over standard
deviation of power
(lombscarglepowmaxdevstd_ratio)
545- 554 F10.7 --- KsphdiffRat Maximum phase difference in
phase-folded Ks light-curve minus
mean phase difference divided by
standard deviation of phase
differences (same caveat as
ksphasediffmax)
(ksphasediffmaxdevstd_ratio)
556- 560 E5.4 --- SecMin [0/1] Flag indicating whether a
significant secondary minimum is
present
(significant_second_minimum) (7)
562- 572 F11.9 --- KsminRatModel Ratio between consecutive minima in
the best-fitting Fourier model
(ksconsecminimumratiomodel) (8)
574- 587 E14.7 --- KsminRatioData ? Ratio between consecutive minima as
computed using at least three
measurements within at least 0.025
phase of minimum
(ksconsecminimumratiodata) (9)
589- 598 F10.8 --- prob1ststage [0.65/1.0] Classification probability
for the 1st stage of classification
to distinguish between constant and
variable sources (using the
proportion of similar classifications
by the trees) (prob1ststage)
600- 609 F10.8 --- ProbVar [0.5/1.0] Probability of variability
from 2nd stage classifier
(1-probability of constant source)
(prob_var)
611- 630 F20.12 --- Zscalefac ? Factor by which the Fourier model
for the Ks band needs to be scaled to
match measurements in Z band
(zscalefactor)
632- 653 F22.11 --- Zrms/Ksrms ? Ratio of the inverse-variance-
weighted (from uncertainties)
root-mean-squared (RMS) observations
in Z band compared to RMS of Ks
Fourier model evaluated at times of
Z observations (zrmsksrmsratio)
655- 674 F20.12 --- Yscalefac ? Factor by which the Fourier model
for the Ks band needs to be scaled to
match measurements in Y band
(yscalefactor)
676- 697 F22.11 --- Yrms/Ksrms ? Ratio of the inverse-variance-
weighted (from uncertainties)
root-mean-squared (RMS) observations
in Y band compared to RMS of Ks
Fourier model evaluated at times of
Y observations (yrmsksrmsratio)
699- 718 F20.12 --- Jscalefac ? Factor by which the Fourier model
for the Ks band needs to be scaled to
match measurements in J band
(jscalefactor)
720- 741 F22.11 --- Jrms/Ksrms ? Ratio of the inverse-variance-
weighted (from uncertainties)
root-mean-squared (RMS)
observations in J band compared to
RMS of Ks Fourier model evaluated at
times of J observations
(jrmsksrmsratio)
743- 762 F20.12 --- Hscalefac ? Factor by which the Fourier model
for the Ks band needs to be scaled to
match measurements in H band
(hscalefactor)
764- 776 E13.7 --- Hrms/Ksrms ? Ratio of the inverse-variance-
weighted (from uncertainties)
root-mean-squared (RMS) observations
in H band compared to RMS of Ks
Fourier model evaluated at times of
H observations (hrmsksrmsratio)
778- 791 E14.8 mag J-Ks ? Colour excess in (J-Ks) computed
using the colour of red clump stars
assuming (J-Ks)0=0.62 (ejk_rc) (10)
793- 807 E15.8 mag H-Ks ? Colour excess in (H-Ks). Red clump
assumed (H-Ks)0=0.09 (ehk_rc)
809- 820 F12.6 mag Zivwmag ? Inverse-variance-weighted Z
magnitude of all b-grade
photometry (11)
822- 830 F9.6 mag e_Zivwmag ? Inverse-variance-weighted
uncertainty on Z magnitude
(see zivwmean_mag) (12)
832- 841 F10.6 mag Yivwmag ? Inverse-variance-weighted Y
magnitude of all b-grade
photometry (11)
843- 851 F9.6 mag e_Yivwmag ? Inverse-variance-weighted
uncertainty on Y magnitude
(see yivwmean_mag) (12)
853- 865 F13.6 mag Jivwmag ? Inverse-variance-weighted
J magnitude of all b-grade
photometry (11)
867- 875 F9.6 mag e_Jivwmag ? Inverse-variance-weighted
uncertainty on J magnitude
(see jivwmean_mag) (12)
877- 887 F11.6 mag Hivwmag ? Inverse-variance-weighted
H magnitude of all b-grade
photometry (11)
889- 897 F9.6 mag e_Hivwmag ? Inverse-variance-weighted
uncertainty on H magnitude
(see hivwmean_mag) (12)
899- 907 F9.6 mag Ksivwmag Inverse-variance-weighted Ks
magnitude of all b-grade
photometry (11)
909- 913 E5.1 mag e_Ksivwmag Inverse-variance-weighted uncertainty
on Ks magnitude
(see ksivwmean_mag) (12)
915- 918 I4 --- KsNphot [1/2074] Number of Ks detections used
for statistics (ksnphot)
920- 923 I4 --- KsNEp [21/2074] Number of Ks epochs used in
computation of variability statistics
employed in 1st classification stage
(ksvarstatsn_epochs)
925- 935 F11.6 mag s_Ksmag Standard deviation of Ks magnitudes
(note slightly different quality cuts
compared to ksstdmag)
(ksstdvarstats)
937- 945 E9.2 --- KsKur Kurtosis of Ks magnitudes
(ks_kurtosis)
947- 955 E9.2 --- Ksskew Skew of Ks magnitudes (ks_skew)
957- 969 F13.6 mag Ksp0 0th percentile of Ks photometry
(ks_p0)
971- 979 F9.6 mag Ksp1 1st percentile of Ks photometry
(ks_p1)
981- 989 F9.6 mag Ksp2 2nd percentile of Ks photometry
(ks_p2)
991- 999 F9.6 mag Ksp4 4th percentile of Ks photometry
(ks_p4)
1001-1009 F9.6 mag Ksp5 5th percentile of Ks photometry
(ks_p5)
1011-1019 F9.6 mag Ksp8 8th percentile of Ks photometry
(ks_p8)
1021-1029 F9.6 mag Ksp16 16th percentile of Ks photometry
(ks_p16)
1031-1039 F9.6 mag Ksp25 25th percentile of Ks photometry
(ks_p25)
1041-1049 F9.6 mag Ksp32 32nd percentile of Ks photometry
(ks_p32)
1051-1059 F9.6 mag Ksp50 50th percentile of Ks photometry
(ks_p50)
1061-1069 F9.6 mag Ksp68 68th percentile of Ks photometry
(ks_p68)
1071-1079 F9.6 mag Ksp75 75th percentile of Ks photometry
(ks_p75)
1081-1089 F9.6 mag Ksp84 84th percentile of Ks photometry
(ks_p84)
1091-1099 F9.6 mag Ksp92 92nd percentile of Ks photometry
(ks_p92)
1101-1109 F9.6 mag Ksp95 95th percentile of Ks photometry
(ks_p95)
1111-1119 F9.6 mag Ksp96 96th percentile of Ks photometry
(ks_p96)
1121-1129 F9.6 mag Ksp98 98th percentile of Ks photometry
(ks_p98)
1131-1139 F9.6 mag Ksp99 99th percentile of Ks photometry
(ks_p99)
1141-1149 F9.6 mag Ksp100 100th percentile of Ks photometry
(ks_p100)
1151-1158 F8.6 mag Ksmad Median absolute deviation of Ks
magnitudes (ks_mad)
1160-1167 F8.6 --- Kseta von Neumann ratio of the mean square
successive magnitude difference
to the variance (ks_eta) (13)
1169-1187 F19.6 --- e_Kseta Error-weighted von Neumann ratio
(ksetae) (14)
1189-1192 I4 --- KsSin Number of pairs of measurements
(ksstetsonin) (15)
1194-1202 E9.2 --- KsSI ? Stetson I: correlation between
consecutive brightness measurements
(ksstetsoni) (16)
1204-1218 F15.6 --- KsSJ Stetson J: correlation between
consecutive brightness measurements
(ksstetsonj) (17)
1220-1227 F8.6 --- KsSK Stetson K: robust kurtosis measure
(ksstetsonk) (18)
1229-1247 I19 --- GaiaEDR3 ?=-9999 Gaia EDR3 source ID
(gaiaedr3source_id)
1249-1267 F19.14 deg RAGdeg ? Gaia EDR3 right ascension (ICRS) at
Ep=2016 (gaiaedr3ra)
1269-1287 F19.15 deg DEGdeg ? Gaia EDR3 declination (ICRS) at
Ep=2016 (gaiaedr3dec)
1289-1312 E24.17 mas Plx ? Gaia EDR3 parallax
(gaiaedr3parallax)
1314-1337 E24.17 mas/yr pmRA ? Gaia EDR3 right ascension proper
motion, pmRA*cosDE (gaiaedr3pmra)
1339-1362 E24.17 mas/yr pmDE ? Gaia EDR3 declination proper motion
(gaiaedr3pmdec)
1364-1376 F13.10 mas e_Plx ? Gaia EDR3 parallax uncertainty
(gaiaedr3parallax_error)
1378-1390 F13.10 mas/yr e_pmRA ? Gaia EDR3 right ascension proper
motion uncertainty (e_pmRA*cosDE)
(gaiaedr3pmra_error)
1392-1404 F13.10 mas/yr e_pmDE ? Gaia EDR3 declination proper motion
uncertainty (gaiaedr3pmdec_error)
1406-1420 E15.9 --- PlxpmRAcor ? Gaia EDR3 parallax pmra correlation
(gaiaedr3parallaxpmracorr)
1422-1437 E16.9 --- PlxpmDEcor ? Gaia EDR3 parallax pmdec correlation
(gaiaedr3parallaxpmdeccorr)
1439-1453 E15.8 --- pmRApmDEcor ? Gaia EDR3 pmra pmdec correlation
(gaiaedr3pmrapmdeccorr)
1455-1465 F11.7 mag Gmag ? Gaia EDR3 G magnitude
(gaiaedr3photgmean_mag)
1467-1477 F11.7 mag BPmag ? Gaia EDR3 G_BP magnitude
(gaiaedr3photbpmean_mag)
1479-1489 F11.7 mag RPmag ? Gaia EDR3 G_RP magnitude
(gaiaedr3photrpmean_mag)
1491-1513 E23.17 arcsec SepVG ? Angular separation between VIRAC2
and Gaia-EDR3 (corrected for epoch
difference using proper motion from
Gaia EDR3). Maximum separation = 0.5
(virac2gedr3sep)
1515-1525 F11.6 mag E(B-V) ? Schlegel, Finkbeiner & Davis
(1998ApJ...500..525S) E(B-V)
(ebv_schlegel)
1527-1538 F12.8 --- RUWE ? Gaia EDR3 Reduced Unit Weight Error
(gaiaedr3ruwe)
--------------------------------------------------------------------------------
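As a reading aid, the byte ranges above can be turned into a small fixed-width parser. The sketch below is illustrative only: it covers just the first four columns, the helper name parse_record is hypothetical, and the demo record is synthetic rather than taken from vivace.dat. Byte positions in this table are 1-based and inclusive, hence the first-1 offset in the slice.

```python
# (label, first byte, last byte, converter), copied from the table above
FIELDS = [
    ("VIVACE", 1, 7, int),
    ("Class", 9, 18, str),
    ("Prob", 20, 29, float),
    ("Period", 31, 53, float),
]

def parse_record(line, fields=FIELDS):
    """Slice one fixed-width record into a dict keyed by column label."""
    row = {}
    for label, first, last, convert in fields:
        raw = line[first - 1:last].strip()  # 1-based inclusive -> Python slice
        row[label] = convert(raw) if raw else None  # blank field = missing
    return row

# Synthetic record built only to exercise the parser (not real catalogue data)
demo = "{:>7d} {:<10s} {:10.8f} {:23.18f}".format(42, "RRab", 0.95, 0.55)
row = parse_record(demo)
```

Extending FIELDS with further rows of the table follows the same pattern; columns flagged "?" may be blank and would then come back as None.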
Note (1): RRab, RRcd, T2CEP, CEP, LPV, EA/EB, EW, Ell, CONST output from
classifier.
Note (2): XGBoost constructs a model for each class C by minimising a loss
function (y_C-F_C(x))^2 comparing the class labels y_C (=1 or 0 depending
on whether a member of class C or not) to the predictions F_C(x) (essentially
the proportion of the training set correctly classed by the model), where x
are the input features. Probabilities are found using a softmax function
P_C(x)=\exp\{F_C(x)\}/\sum_i\exp\{F_i(x)\}.
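The softmax step in Note (2) can be illustrated with a few lines of code. This is only a sketch of the formula, not of the XGBoost internals: the per-class scores are invented numbers, and subtracting the maximum score is a standard numerical-stability trick that leaves the probabilities unchanged.

```python
import math

def softmax(scores):
    """Map per-class scores F_C(x) to probabilities P_C(x)."""
    shift = max(scores.values())  # avoids overflow; cancels in the ratio
    expd = {c: math.exp(f - shift) for c, f in scores.items()}
    total = sum(expd.values())
    return {c: e / total for c, e in expd.items()}

# Hypothetical scores for three of the classes listed in Note (1)
scores = {"RRab": 2.0, "EW": 0.5, "CONST": -1.0}
probs = softmax(scores)
best = max(probs, key=probs.get)  # class with the highest probability
```

In this picture the Class and Prob columns correspond to the class with the largest probability and that probability, respectively.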
Note (3): This has only been computed in a post-processing step, so only
the first four Fourier terms are used whilst up to 10 terms could have been
used in the series. It should therefore be treated with some caution for
sources with lsq_nterms>4 (e.g. EA/EB).
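To make the caveat in Note (3) concrete, the sketch below evaluates a truncated Fourier series on a phase grid and measures its range. Several things here are assumptions not confirmed by this file: the series convention (a sum of A_k*cos(2*pi*k*phase + phi_k)), the reading of "amplitude" as peak-to-peak, and the function names.

```python
import math

def fourier_model(phase, amps, phis):
    """Assumed convention: sum over k of A_k*cos(2*pi*k*phase + phi_k)."""
    return sum(a * math.cos(2.0 * math.pi * k * phase + p)
               for k, (a, p) in enumerate(zip(amps, phis), start=1))

def peak_to_peak(amps, phis, n_grid=1000):
    """Range of the model over one cycle, sampled on a uniform phase grid."""
    values = [fourier_model(i / n_grid, amps, phis) for i in range(n_grid)]
    return max(values) - min(values)

# Single sinusoid of semi-amplitude 0.2 mag (invented values)
amp = peak_to_peak([0.2, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0])
```

Passing only four amplitudes and phases for a source fitted with more terms drops the higher harmonics, which is the reason sharp-featured light curves such as EA/EB can have a poorly estimated amplitude.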
Note (4): This is the 'raw' period, which sometimes does not match the true
period in the case of aliases or for EW binaries. For further work, use
period instead. This column is retained as it is used in the classification.
Note (5): min. 4, max. 10 - assessed using Akaike information criterion.
Note (6): caveat: did not correctly consider phase-wrapping i.e. equivalence
of phase=0 and phase=1
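The phase-wrapping caveat in Note (6) can be illustrated as follows. This is a sketch under one reading of the caveat, namely that the gap between the last and first folded phases was not included; the function name and the example times are hypothetical.

```python
def max_phase_gap(times, period, wrap=True):
    """Largest gap between consecutive phases of a folded light curve.

    With wrap=True the gap across phase 1 -> 0 is included as well.
    """
    phases = sorted((t / period) % 1.0 for t in times)
    gaps = [b - a for a, b in zip(phases, phases[1:])]
    if wrap and phases:
        gaps.append(phases[0] + 1.0 - phases[-1])  # gap across the wrap
    return max(gaps) if gaps else None

# Three invented epochs folding to phases 0.4, 0.5 and 0.6
times = [0.4, 1.5, 2.6]
gap_no_wrap = max_phase_gap(times, 1.0, wrap=False)  # 0.1
gap_wrap = max_phase_gap(times, 1.0, wrap=True)      # 0.8
```

In this example the light curve is sampled only near phase 0.5, so the real coverage gap is 0.8 of a cycle, while the value without wrapping suggests 0.1.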
Note (7): This is evaluated using the best-fitting Fourier model and checking if
there is a minimum in the phase range 0.35-0.65 with depth 7σ compared
to neighbouring maxima.
Note (8): Minimum depth is computed using an inverse-variance-weighted mean.
If significant_second_minimum=0 we use the minima at phase=(0,2,4,...)
compared to phase=(1,3,5,...), else if significant_second_minimum=1 the
minima are located at phase=(0,1,2,3,...) compared to
phase=(0.5,1.5,2.5,3.5,...).
Note (9): If significant_second_minimum=0, this is 1.
Note (10): In regions of high extinction the bulge red clump is too faint in J
for VVV. This means the extinction is underestimated and stars will appear
intrinsically too red.
Note (11): This is the recommended magnitude to use.
1 = B-grade photometry.
2 = within 3 sigma of the astrometric trajectory
3 = detection not ambiguously associated with more than 1 source
(i.e. blended)
4 = none of its ubercal coefficients have fewer than 100 reference sources
5 = it is on a detector with >100 secondary calibration reference sources
6 = it's not at the outer edge of the illumination map (i.e. is ≳16 pixels
from the image edge) (from Leigh Smith)
Note (12): This is the recommended magnitude uncertainty to use.
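For reference, an inverse-variance-weighted mean and its uncertainty can be sketched as below. The weighted mean follows directly from the column descriptions; the uncertainty formula, 1/sqrt(sum of weights), is the textbook expression and is assumed here rather than confirmed by this file. The function name and input values are illustrative.

```python
import math

def ivw_mean(mags, errs):
    """Inverse-variance-weighted mean and its formal uncertainty."""
    weights = [1.0 / e ** 2 for e in errs]
    wsum = sum(weights)
    mean = sum(w * m for w, m in zip(weights, mags)) / wsum
    return mean, 1.0 / math.sqrt(wsum)  # assumed standard error formula

# Two invented detections with equal 0.1 mag uncertainties
mean_mag, mean_err = ivw_mean([12.0, 12.2], [0.1, 0.1])
```

With equal uncertainties this reduces to the ordinary mean, and the formal error shrinks as 1/sqrt(N).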
Note (13): \sum_{i=1}^{N-1}\left(m_{i+1}-m_i\right)^2
/\sum_{i=1}^{N}\left(m_i-\bar{m}\right)^2
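The ratio in Note (13) is simple to evaluate for a time-ordered list of magnitudes. The sketch below follows the expression exactly as written in the note (sums rather than means); the function name and the test sequences are illustrative, not catalogue data.

```python
def von_neumann_eta(mags):
    """Sum of squared successive differences over sum of squared deviations."""
    mean = sum(mags) / len(mags)
    successive = sum((b - a) ** 2 for a, b in zip(mags, mags[1:]))
    deviations = sum((m - mean) ** 2 for m in mags)
    return successive / deviations

eta_trend = von_neumann_eta([1.0, 2.0, 3.0, 4.0])   # smooth trend -> 0.6
eta_jitter = von_neumann_eta([1.0, 2.0, 1.0, 2.0])  # alternating -> 3.0
```

Smoothly varying, well-sampled light curves give small values, while uncorrelated scatter gives larger ones, which is what makes the ratio useful as a variability index.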
Note (14): see http://isadoranun.github.io/tsfeat/
FeaturesDocumentation.html#Variability-index-$η^e$
Note (15): N_p within 1h of each other used in Stetson I
computation
Note (16): \sqrt{\frac{1}{N_p\left(N_p-1\right)}}
\sum_{i=1}^{N_p}
\left(\frac{m_{1,i}-\bar{m}}{\sigma_{1,i}}\right)
\left(\frac{m_{2,i}-\bar{m}}{\sigma_{2,i}}\right).
This is Infinity if ksstetsonin=1.
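A minimal sketch of the Stetson I expression in Note (16) is given below. It assumes the pairs of measurements (those within 1 h of each other, see Note (15)) have already been formed and that the mean magnitude is supplied by the caller; the function name, the handling of an empty pair list and the example values are assumptions.

```python
import math

def stetson_i(pairs, mean_mag):
    """Stetson I from a list of ((m1, sigma1), (m2, sigma2)) pairs."""
    n = len(pairs)
    if n == 0:
        return None      # assumption: no pairs means no value
    if n == 1:
        return math.inf  # prefactor 1/sqrt(N_p(N_p-1)) diverges, as noted
    total = sum(((m1 - mean_mag) / s1) * ((m2 - mean_mag) / s2)
                for (m1, s1), (m2, s2) in pairs)
    return math.sqrt(1.0 / (n * (n - 1))) * total

# Two invented pairs whose members deviate in the same sense
pairs = [((12.1, 0.1), (12.2, 0.1)), ((11.9, 0.1), (11.8, 0.1))]
index = stetson_i(pairs, 12.0)  # about 2.83
```

Pairs whose members deviate from the mean in the same direction contribute positively, so correlated variability drives the index up while independent noise averages towards zero.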
Note (17): \frac{1}{N-1}\sum_{i=1}^{N-1}
\mathrm{sgn}(\delta_i\delta_{i+1})|\delta_i\delta_{i+1}|
where \delta_i=\sqrt{N/(N-1)}(m_i-\bar{m})/\sigma_i
Note (18): \frac{1}{N}\sum_i^N|\delta_i|
/\sqrt{\frac{1}{N}\sum_i^N\delta_i^2}
where \delta_i=\sqrt{N/(N-1)}(m_i-\bar{m})/\sigma_i
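The Stetson J and K expressions in Notes (17) and (18) share the same scaled residuals, so they can be sketched together. This illustration assumes an unweighted mean magnitude and implements J as written in Note (17); Stetson's original definition applies a square root to the absolute product, which is offered here through the optional take_sqrt flag. Function names and input values are hypothetical.

```python
import math

def scaled_residuals(mags, errs):
    """delta_i = sqrt(N/(N-1)) * (m_i - mean) / sigma_i (unweighted mean)."""
    n = len(mags)
    mean = sum(mags) / n
    scale = math.sqrt(n / (n - 1))
    return [scale * (m - mean) / e for m, e in zip(mags, errs)]

def stetson_j(mags, errs, take_sqrt=False):
    """Signed products of consecutive residuals, averaged over N-1 pairs."""
    d = scaled_residuals(mags, errs)
    total = 0.0
    for a, b in zip(d, d[1:]):
        p = a * b
        size = math.sqrt(abs(p)) if take_sqrt else abs(p)
        total += math.copysign(size, p)
    return total / (len(d) - 1)

def stetson_k(mags, errs):
    """Mean absolute residual divided by the RMS residual."""
    d = scaled_residuals(mags, errs)
    n = len(d)
    return (sum(abs(x) for x in d) / n) / math.sqrt(sum(x * x for x in d) / n)

mags, errs = [1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]  # invented values
j_index = stetson_j(mags, errs)  # 5/9
k_index = stetson_k(mags, errs)  # about 0.894
```

For Gaussian residuals K tends towards about 0.798, so values well away from that indicate a non-Gaussian magnitude distribution.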
--------------------------------------------------------------------------------
Acknowledgements:
Jason Sanders, jason.sanders(at)ucl.ac.uk
(End) Patricia Vannier [CDS] 10-Jan-2022