J/A+A/675/A195 ZTF DR11 classification in ZTF/4MOST sky (Sanchez-Saez+, 2023)
Persistent and occasional: Searching for the variable population of the
ZTF/4MOST sky using ZTF data release 11.
Sanchez-Saez P., Arredondo J., Bayo A., Arevalo P., Bauer F.E.,
Cabrera-Vives G., Catelan M., Coppi P., Estevez P.A., Forster F.,
Hernandez-Garcia L., Huijse P., Kurtev R., Lira P., Munoz Arancibia A.M.,
Pignata G.
<Astron. Astrophys. 675, A195 (2023)>
=2023A&A...675A.195S (SIMBAD/NED BibCode)
ADC_Keywords: Active gal. nuclei ; Photometry ; Optical ; Redshifts
Keywords: galaxies: active - stars: variables: general - supernovae: general -
surveys - methods: statistical - methods: data analysis - AAVSO
Abstract:
We present a variability, color and morphology based classifier,
designed to identify multiple classes of transients, persistently
variable, and non-variable sources, from the Zwicky Transient Facility
(ZTF) Data Release 11 (DR11) light curves of extended and point
sources. The main motivation to develop this model was to identify
active galactic nuclei (AGN) at different redshift ranges to be
observed by the 4MOST Chilean AGN/Galaxy Evolution Survey (ChANGES).
Still, it serves as a more general time-domain astronomy study.
The model uses nine colors computed from CatWISE and PanSTARRS1 (PS1),
a morphology score from PS1, and 61 single-band variability features
computed from the ZTF DR11 g and r light curves. We trained two
versions of the model, one for each ZTF band, since ZTF DR11 treats
independently the light curves observed in a particular combination of
field, filter, and CCD-quadrant. We used a hierarchical local
classifier per parent node approach, where each node was composed of a
balanced random forest model. We adopted a 17-class taxonomy,
including non-variable stars and galaxies, three transients (SNIa,
SN-other, and CV/Nova), five classes of stochastic variables
(lowz-AGN, midz-AGN, highz-AGN, Blazar, and YSO), and seven classes of
periodic variables (LPV, EA, EB/EW, DSCT, RRL, CEP, and
Periodic-other).
The macro averaged precision, recall and F1-score are 0.61, 0.75, and
0.62 for the g-band model, and 0.60, 0.74, and 0.61, for the r-band
model. When grouping the four AGN classes (lowz-AGN, midz-AGN,
highz-AGN, and Blazar) into one single class, its precision, recall,
and F1-score are 1.00, 0.95, and 0.97, respectively, for both the g
and r bands. This demonstrates the good performance of the model
classifying AGN candidates. We applied the model to all the sources in
the ZTF/4MOST overlapping sky (-28≤DE≤8.5), avoiding ZTF fields
covering the Galactic bulge (|galb|≤9 and gall≤50). This area
includes 86576577 light curves in the g-band and 140409824 in the
r-band, with 20 or more observations, and with an average magnitude in
the corresponding band lower than 20.5. Only 0.73% of the g-band
light curves and 2.62% of the r-band light curves were classified as
stochastic, periodic, or transient with high probability
(Pinit≥0.9). Even though the metrics obtained for both models are
similar, we found that, in general, more reliable results are obtained
when using the g-band model. Using the latter, we identified 384242
AGN candidates (including low-, mid-, and high-redshift AGNs and
Blazars), 287156 of which have Pinit≥0.9.
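The macro-averaged scores quoted above can be illustrated with a short
sketch. It assumes the usual definition, in which precision, recall and
F1-score are computed per class and then averaged with equal weight; the
function name macro_scores is hypothetical and this is not the authors'
evaluation code. Under that definition the macro F1-score is the mean of the
per-class F1-scores, so it need not equal the harmonic mean of the macro
precision and macro recall (about 0.67 for 0.61 and 0.75, against the quoted
0.62).

```python
def macro_scores(y_true, y_pred):
    """Return (precision, recall, F1), each averaged over classes with equal weight."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

Because every class carries the same weight, rare classes influence these
averages as much as the abundant ones do.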
Description:
In this work we provide classifications of Zwicky Transient Facility
(ZTF) Data Release 11 (DR11) light curves. Here we provide
classifications for 86576577 sources in the g band and for
140409824 in the r band. We also provide the labeled sets used to
train the models for each band, and the master catalog used to
construct the labeled sets.
File Summary:
--------------------------------------------------------------------------------
FileName Lrecl Records Explanations
--------------------------------------------------------------------------------
ReadMe 80 . This file
candztfg.dat 942 86576577 Candidates from the ZTF g-band light curves
candztfr.dat 942 140409824 Candidates from the ZTF r-band light curves
lz-ztfg.dat 1003 741263 Labeled set (LS) for ZTF g-band model
lz-ztfr.dat 1003 936145 Labeled set (LS) for ZTF r-band model
mast.dat 156 1907096 Master catalog used to generate the LS
--------------------------------------------------------------------------------
See also:
https://www.ztf.caltech.edu/ztf-public-releases.html : ZTF Home Page
Byte-by-byte Description of file: candztfg.dat candztfr.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 15 I15 --- ObjectId ZTF DR11 Identification number
(objectid)
17- 26 F10.6 deg RAdeg Right ascension (J2000) (ra)
28- 37 F10.6 deg DEdeg Declination (J2000) (dec)
39- 41 I3 --- Nepochs Number of good epochs (catflags=0)
(nepochs)
43- 52 F10.6 deg GLAT Galactic latitude (gal_b)
54- 63 F10.6 deg GLON [0/360] Galactic Longitude (gal_l)
65- 76 E12.6 --- MHPSratio ?=- Ratio between the low and high
frequency variances for the MHPS model
(MHPS_ratio)
78- 89 E12.6 --- MHPSlow ?=- Low frequency variance for the MHPS
model (MHPS_low)
91-102 E12.6 --- MHPShigh ?=- High frequency variance for the MHPS
model (MHPS_high)
104-115 E12.6 mJy SPMA Amplitude of the SPM model (SPM_A)
117-128 E12.6 d SPMt0 Zero time for the SPM model (SPM_t0)
130-138 F9.5 d SPMgamma Plateau duration SPM model (SPM_gamma)
140-147 F8.6 mJy/d SPMbeta Plateau slope SPM model (SPM_beta)
149-157 F9.5 d SPMtaurise Rise time SPM model (SPMtaurise)
159-167 F9.5 d SPMtaufall Decline time SPM model (SPMtaufall)
169-178 F10.6 --- SPMchi Reduced chi2 SPM model (SPM_chi)
180-191 E12.6 mJy SPMC Baseline flux SPM model (SPM_C)
193-200 F8.6 mag Amp Half of the difference between the
median of the maximum 5% and of the
minimum 5% magnitudes (Amplitude)
202-209 F8.6 --- ADarling Test of whether a sample of data comes
from a population with a specific
distribution (AndersonDarling)
211-213 I3 --- AutocorLength Lag value where the autocorrelation
function becomes smaller than Eta_e
(Autocor_length)
215-222 F8.6 --- Beyond1Std Percentage of points with photometric
mag that lie beyond 1σ from the
mean (Beyond1Std)
224-231 F8.6 --- Con Number of three consecutive data points
brighter/fainter than 2sigma of the
light curve (Con)
233-244 E12.6 --- Etae Ratio of the mean of the squares of
successive mag differences to the
variance of the light curve (Eta_e)
246-254 F9.6 --- Gskew Median-based measure of the skew (Gskew)
256-267 E12.6 mag/d MaxSlope Maximum absolute magnitude slope between
two consecutive observations
(MaxSlope)
269-275 F7.4 mag Mean Average magnitude (Mean)
277-284 F8.6 --- MeanVar Ratio of the standard deviation to the
mean magnitude (Meanvariance)
286-293 F8.6 mag MedianAbsDev Median discrepancy of the data from the
median data (MedianAbsDev)
295-302 F8.6 --- MedianBRP Fraction of photometric points within
amplitude/10 of the median mag
(MedianBRP)
304-312 F9.6 --- PairST Fraction of increasing first differences
minus fraction of decreasing first
differences over the last 30
time-sorted mag measures
(PairSlopeTrend)
314-321 F8.6 % PerAmp Largest percentage difference between
either max or min mag and median mag
(PercentAmplitude)
323-330 F8.6 mag Q31 Difference between the 3rd and the 1st
quartile of the light curve (Q31)
332-339 F8.6 --- Rcs Range of a cumulative sum (Rcs)
341-350 F10.6 --- Skew Skewness measure (Skew)
352-362 F11.6 --- SmallKurt Small sample kurtosis of the magnitudes
(SmallKurtosis)
364-371 F8.6 mag Std Standard deviation of the light curve
(Std)
373-380 F8.6 --- StetsonK Robust kurtosis measure (StetsonK)
382-389 F8.6 --- Pvar Probability that the source is
intrinsically variable (Pvar)
391-402 E12.6 --- ExcessVar Normalized measure of the intrinsic
variability amplitude (ExcessVar)
404-413 F10.6 --- SFMLAmp rms magnitude difference of the SF,
computed over a 1yr timescale
(SFMLamplitude)
415-423 F9.6 --- SFMLgamma Logarithmic gradient of the mean change
in magnitude (SFMLgamma)
425-432 F8.6 --- IARphi Level of autocorrelation using a
discrete-time representation of a
DRW model (IAR_phi)
434-445 E12.6 mag/d LinearTrend Slope of a linear fit to the light
curve (LinearTrend)
447-458 E12.6 mag2 GPDRWsigma ?=- Amplitude of the variability from
DRW modeling (GPDRWsigma)
460-471 E12.6 d GPDRWtau ?=- Relaxation time from DRW modeling
(GPDRWtau)
473-482 F10.6 d Per Light curve period (Period)
484-496 E13.6 --- PPE Multiband Periodogram Pseudo Entropy
(PPE)
498-509 E12.6 --- PowerR1/4 Ratio between the power of the
periodogram obtained for the best
period candidate (P) and P/4
(Powerrate1/4)
511-522 E12.6 --- PowerR1/3 Ratio between the power of the
periodogram obtained for the best
period candidate (P) and P/3
(Powerrate1/3)
524-535 E12.6 --- PowerR1/2 Ratio between the power of the
periodogram obtained for the best
period candidate (P) and P/2
(Powerrate1/2)
537-548 E12.6 --- PowerR2 Ratio between the power of the
periodogram obtained for the best
period candidate (P) and 2*P
(Powerrate2)
550-561 E12.6 --- PowerR3 Ratio between the power of the
periodogram obtained for the best
period candidate (P) and 3*P
(Powerrate3)
563-574 E12.6 --- PowerR4 Ratio between the power of the
periodogram obtained for the best
period candidate (P) and 4*P
(Powerrate4)
576-583 F8.6 --- PsiCS Range of a cumulative sum applied to the
phase-folded light curve (Psi_CS)
585-592 F8.6 --- Psieta Eta_e index calculated from the folded
light curve (Psi_eta)
594-605 E12.6 mag Hmag1 Amplitude of the first component of the
harmonic series (Harmonicsmag1)
607-618 E12.6 mag Hmag2 Amplitude of the second component of the
harmonic series (Harmonicsmag2)
620-631 E12.6 mag Hmag3 Amplitude of the third component of the
harmonic series (Harmonicsmag3)
633-644 E12.6 mag Hmag4 Amplitude of the fourth component of the
harmonic series (Harmonicsmag4)
646-657 E12.6 mag Hmag5 Amplitude of the fifth component of the
harmonic series (Harmonicsmag5)
659-670 E12.6 mag Hmag6 Amplitude of the sixth component of the
harmonic series (Harmonicsmag6)
672-683 E12.6 mag Hmag7 Amplitude of the seventh component of
the harmonic series (Harmonicsmag7)
685-696 E12.6 --- HPhase2 Phase of the second component of the
harmonic series (Harmonicsphase2)
698-709 E12.6 --- HPhase3 Phase of the third component of the
harmonic series (Harmonicsphase3)
711-722 E12.6 --- HPhase4 Phase of the fourth component of the
harmonic series (Harmonicsphase4)
724-735 E12.6 --- HPhase5 Phase of the fifth component of the
harmonic series (Harmonicsphase5)
737-748 E12.6 --- HPhase6 Phase of the sixth component of the
harmonic series (Harmonicsphase6)
750-761 E12.6 --- HPhase7 Phase of the seventh component of the
harmonic series (Harmonicsphase7)
763-774 E12.6 --- Harmse Mean square error of the harmonic series
modelling (Harmonics_mse)
776-783 F8.6 --- psScore ?=- PanSTARRS1 morphology score from
Tachibana and Miller 2018 (ps_score)
785-794 F10.6 mag g-r ?=- PanSTARRS1 g-r color (gps1-rps1)
796-805 F10.6 mag r-i ?=- PanSTARRS1 r-i color (rps1-ips1)
807-816 F10.6 mag g-W1 ?=- PanSTARRS1 g - CatWISE W1 color
(gps1-W1)
818-827 F10.6 mag g-W2 ?=- PanSTARRS1 g - CatWISE W2 color
(gps1-W2)
829-838 F10.6 mag r-W1 ?=- PanSTARRS1 r - CatWISE W1 color
(rps1-W1)
840-849 F10.6 mag r-W2 ?=- PanSTARRS1 r - CatWISE W2 color
(rps1-W2)
851-860 F10.6 mag i-W1 ?=- PanSTARRS1 i - CatWISE W1 color
(ips1-W1)
862-871 F10.6 mag i-W2 ?=- PanSTARRS1 i - CatWISE W2 color
(ips1-W2)
873-881 F9.6 mag W1-W2 ?=- CatWISE W1-W2 color (W1-W2)
883-895 A13 --- PredIClass Predicted class from node_init
(predinitclass)
897-901 F5.3 --- PredIClassProb Predicted class probability from
node_init (predinitclass_prob)
903-912 A10 --- PredVarClass Predicted class from node_variable
(predvarclass)
914-918 F5.3 --- PredVClassProb ?=- Predicted class probability from
node_variable (predvarclass_prob)
920-933 A14 --- PredClass Final predicted class (pred_class)
935-942 F8.6 --- PredClassProb Final predicted class probability
(predclassprob)
--------------------------------------------------------------------------------
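The byte ranges above map directly onto string slices. The sketch below reads
a few columns from one record of candztfg.dat or candztfr.dat; it is a minimal
illustration, not an official reader. The names COLUMNS and parse_record are
hypothetical, and the handling of missing values assumes that the "?=-" note
means a field filled only with "-" characters (or left blank) is null.

```python
# (label, first byte, last byte, converter); bytes are 1-indexed and inclusive,
# copied from the byte-by-byte description of candztfg.dat / candztfr.dat.
COLUMNS = [
    ("ObjectId", 1, 15, int),
    ("RAdeg", 17, 26, float),
    ("DEdeg", 28, 37, float),
    ("Nepochs", 39, 41, int),
    ("MHPSratio", 65, 76, float),
    ("PredClass", 920, 933, str),
    ("PredClassProb", 935, 942, float),
]


def parse_record(line):
    """Slice one fixed-width record into a dict of typed values."""
    record = {}
    for label, first, last, convert in COLUMNS:
        text = line[first - 1:last].strip()
        # Assumption: blank fields and fields made only of "-" are nulls.
        if text.strip("-") == "":
            record[label] = None
        else:
            record[label] = convert(text)
    return record
```

To filter a whole file, the function could be applied line by line while
keeping only the rows of interest, for instance those with PredClassProb of
0.9 or more. The files are large, so streaming avoids loading them in memory.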
Byte-by-byte Description of file: lz-ztfg.dat lz-ztfr.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 32 A32 --- SourceId ID from source catalog (source_id)
34- 43 F10.6 deg RASdeg Right ascension from source catalog
(J2000) (source_ra)
45- 54 F10.6 deg DESdeg Declination from source catalog (J2000)
(source_dec)
56- 73 A18 d SperS Period from source catalog
(source_period)
75- 98 A24 --- SClass Class from source catalog (source_class)
100- 113 A14 --- ClassALeRCE Class for the ALeRCE broker taxonomy
(Sanchez-Saez et al.,
2021AJ....161..141S) (classALeRCE)
115- 130 A16 --- SCat Name of the source catalog (source_cat)
132- 140 F9.6 --- Sz ?=- Redshift from the source catalog
(source_redshift)
142- 155 A14 --- ClassDR11 Class adopted in this work (class_DR11)
157- 172 I16 --- ObjectId ZTF DR11 Identification number (objectid)
174- 183 F10.6 deg RAdeg Right ascension (J2000) (ra)
185- 194 F10.6 deg DEdeg Declination (J2000) (dec)
196- 199 I4 --- Nepochs Number of good epochs (catflags=0)
(nepochs)
201- 210 F10.6 deg GLAT Galactic latitude (gal_b)
212- 221 F10.6 deg GLON Galactic Longitude (gal_l)
223- 234 E12.6 --- MHPSratio ?=- Ratio between the low and high
frequency variances for the MHPS model
(MHPS_ratio)
236- 247 E12.6 --- MHPSlow ?=- Low frequency variance for the
MHPS model (MHPS_low)
249- 258 F10.6 --- MHPShigh ?=- High frequency variance for the
MHPS model (MHPS_high)
260- 269 F10.6 mJy SPMA Amplitude of the SPM model (SPM_A)
271- 281 F11.6 d SPMt0 Zero time for the SPM model (SPM_t0)
283- 291 F9.5 d SPMgamma Plateau duration SPM model (SPM_gamma)
293- 300 F8.6 mJy/d SPMbeta Plateau slope SPM model (SPM_beta)
302- 310 F9.5 d SPMtaurise Rise time SPM model (SPMtaurise)
312- 320 F9.5 d SPMtaufall Decline time SPM model (SPMtaufall)
322- 331 F10.6 --- SPMchi Reduced chi2 SPM model (SPM_chi)
333- 344 E12.6 mJy SPMC Baseline flux SPM model (SPM_C)
346- 353 F8.6 mag Amp Half of the difference between the median
of the maximum 5% and of the minimum 5%
magnitudes (Amplitude)
355- 362 F8.6 --- ADarling Test of whether a sample of data comes
from a population with a specific
distribution (AndersonDarling)
364- 366 I3 --- AutocorLength Lag value where the autocorrelation
function becomes smaller than Eta_e
(Autocor_length)
368- 375 F8.6 --- Beyond1Std Percentage of points with photometric mag
that lie beyond 1σ from the mean
(Beyond1Std)
377- 384 F8.6 --- Con Number of three consecutive data points
brighter/fainter than 2sigma of the
light curve (Con)
386- 397 E12.6 --- Etae Ratio of the mean of the squares of
successive mag differences to the
variance of the light curve (Eta_e)
399- 407 F9.6 --- Gskew Median-based measure of the skew (Gskew)
409- 420 E12.6 mag/d MaxSlope Maximum absolute magnitude slope between
two consecutive observations (MaxSlope)
422- 428 F7.4 mag Mean Average magnitude (Mean)
430- 437 F8.6 --- MeanVar Ratio of the standard deviation to the
mean magnitude (Meanvariance)
439- 446 F8.6 mag MedianAbsDev Median discrepancy of the data from the
median data (MedianAbsDev)
448- 455 F8.6 --- MedianBRP Fraction of photometric points within
amplitude/10 of the median mag
(MedianBRP)
457- 465 F9.6 --- PairST Fraction of increasing first differences
minus fraction of decreasing first
differences over the last 30 time-sorted
mag measures (PairSlopeTrend)
467- 474 F8.6 --- PertAmp Largest percentage difference between
either max or min mag and median mag
(PercentAmplitude)
476- 483 F8.6 mag Q31 Difference between the 3rd and the 1st
quartile of the light curve (Q31)
485- 492 F8.6 --- Rcs Range of a cumulative sum (Rcs)
494- 503 F10.6 --- Skew Skewness measure (Skew)
505- 515 F11.6 --- SmallKur Small sample kurtosis of the magnitudes
(SmallKurtosis)
517- 524 F8.6 mag Std Standard deviation of the light curve
(Std)
526- 533 F8.6 --- StetsonK Robust kurtosis measure (StetsonK)
535- 542 F8.6 --- Pvar Probability that the source is
intrinsically variable (Pvar)
544- 555 E12.6 --- ExcessVar Normalized measure of the intrinsic
variability amplitude (ExcessVar)
557- 566 F10.6 mag SFMLAmp rms magnitude difference of the SF,
computed over a 1yr timescale
(SFMLamplitude)
568- 576 F9.6 --- SFMLgamma Logarithmic gradient of the mean change
in magnitude (SFMLgamma)
578- 585 F8.6 --- IARphi Level of autocorrelation using a
discrete-time representation of a
DRW model (IAR_phi)
587- 595 F9.6 mag/d LinearTrend Slope of a linear fit to the light
curve (LinearTrend)
597- 608 E12.6 mag2 GPDRWsigma Amplitude of the variability from DRW
modeling (GPDRWsigma)
610- 621 E12.6 d GPDRWtau ?=- Relaxation time from DRW modeling
(GPDRWtau)
623- 632 F10.6 d Per Light curve period (Period)
634- 641 F8.6 --- PPE Multiband Periodogram Pseudo Entropy
(PPE)
643- 650 F8.6 --- PowerR1/4 Ratio between the power of the
periodogram obtained for the best period
candidate (P) and P/4 (Powerrate1/4)
652- 659 F8.6 --- PowerR1/3 Ratio between the power of the
periodogram obtained for the best period
candidate (P) and P/3 (Powerrate1/3)
661- 673 E13.6 --- PowerR1/2 Ratio between the power of the
periodogram obtained for the best period
candidate (P) and P/2 (Powerrate1/2)
675- 687 E13.6 --- PowerR2 Ratio between the power of the
periodogram obtained for the best period
candidate (P) and 2*P (Powerrate2)
689- 701 E13.6 --- PowerR3 Ratio between the power of the
periodogram obtained for the best period
candidate (P) and 3*P (Powerrate3)
703- 715 E13.6 --- PowerR4 Ratio between the power of the
periodogram obtained for the best period
candidate (P) and 4*P (Powerrate4)
717- 724 F8.6 --- PsiCS Range of a cumulative sum applied to
the phase-folded light curve (Psi_CS)
726- 733 F8.6 --- Psieta Eta_e index calculated from the folded
light curve (Psi_eta)
735- 746 E12.6 mag Hmag1 Amplitude of the first component of the
harmonic series (Harmonicsmag1)
748- 759 E12.6 mag Hmag2 Amplitude of the second component of the
harmonic series (Harmonicsmag2)
761- 772 E12.6 mag Hmag3 Amplitude of the third component of the
harmonic series (Harmonicsmag3)
774- 785 E12.6 mag Hmag4 Amplitude of the fourth component of the
harmonic series (Harmonicsmag4)
787- 798 E12.6 mag Hmag5 Amplitude of the fifth component of the
harmonic series (Harmonicsmag5)
800- 811 E12.6 mag Hmag6 Amplitude of the sixth component of the
harmonic series (Harmonicsmag6)
813- 824 E12.6 mag Hmag7 Amplitude of the seventh component of the
harmonic series (Harmonicsmag7)
826- 837 E12.6 --- HPhase2 Phase of the second component of the
harmonic series (Harmonicsphase2)
839- 850 E12.6 --- HPhase3 Phase of the third component of the
harmonic series (Harmonicsphase3)
852- 863 E12.6 --- HPhase4 Phase of the fourth component of the
harmonic series (Harmonicsphase4)
865- 876 E12.6 --- HPhase5 Phase of the fifth component of the
harmonic series (Harmonicsphase5)
878- 889 E12.6 --- HPhase6 Phase of the sixth component of the
harmonic series (Harmonicsphase6)
891- 902 E12.6 --- HPhase7 Phase of the seventh component of the
harmonic series (Harmonicsphase7)
904- 915 E12.6 --- Harmse Mean square error of the harmonic
series modelling (Harmonics_mse)
917- 924 F8.6 --- psScore ?=- PanSTARRS1 morphology score from
Tachibana and Miller 2018 (ps_score)
926- 935 F10.6 mag g-r ?=- PanSTARRS1 g-r color (gps1-rps1)
937- 945 F9.6 mag r-i ?=- PanSTARRS1 r-i color (rps1-ips1)
947- 953 F7.4 mag g-W1 ?=- PanSTARRS1 g - CatWISE W1 color
(gps1-W1)
955- 961 F7.4 mag g-W2 ?=- PanSTARRS1 g - CatWISE W2 color
(gps1-W2)
963- 969 F7.4 mag r-W1 ?=- PanSTARRS1 r - CatWISE W1 color
(rps1-W1)
971- 977 F7.4 mag r-W2 ?=- PanSTARRS1 r - CatWISE W2 color
(rps1-W2)
979- 985 F7.4 mag i-W1 ?=- PanSTARRS1 i - CatWISE W1 color
(ips1-W1)
987- 993 F7.4 mag i-W2 ?=- PanSTARRS1 i - CatWISE W2 color
(ips1-W2)
995-1003 F9.6 mag W1-W2 ?=- CatWISE W1 - W2 color (W1-W2)
--------------------------------------------------------------------------------
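Several of the single-band features listed above are simple statistics of the
magnitudes. The sketch below follows the wording of the Explanations column
for Amp, Beyond1Std and Q31. It is an illustration only: the function names
are hypothetical, and the feature extraction used for the catalog may differ
in details such as weighting by photometric errors or the percentile
interpolation.

```python
import math
import statistics


def amplitude(mags):
    """Half the difference between the medians of the largest and smallest 5% of magnitudes."""
    s = sorted(mags)
    k = max(1, math.ceil(len(s) / 20))  # number of points in each 5% tail
    return 0.5 * (statistics.median(s[-k:]) - statistics.median(s[:k]))


def beyond_1std(mags):
    """Fraction of points more than one standard deviation from the mean.

    Assumption: unweighted mean and population standard deviation.
    """
    mean = statistics.fmean(mags)
    std = statistics.pstdev(mags)
    return sum(1 for m in mags if abs(m - mean) > std) / len(mags)


def q31(mags):
    """Difference between the third and first quartiles of the magnitudes."""
    q1, _, q3 = statistics.quantiles(mags, n=4, method="inclusive")
    return q3 - q1
```

For a toy light curve with magnitudes 1, 2, ..., 20 these return 9.5, 0.4 and
9.5, respectively.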
Byte-by-byte Description of file: mast.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 33 A33 --- SourceId ID from source catalog (source_id)
35- 44 F10.6 deg RAdeg Right ascension from source catalog (J2000)
(source_ra)
46- 55 F10.6 deg DEdeg Declination from source catalog (J2000)
(source_dec)
57- 74 A18 d Per Period from source catalog (source_period)
76- 99 A24 --- Class Class from source catalog (source_class)
101-114 A14 --- ClassALeRCE Class for the ALeRCE broker taxonomy
(Sanchez-Saez et al., 2021AJ....161..141S)
(classALeRCE)
116-131 A16 --- Cat Name of the source catalog (source_cat)
133-141 F9.6 --- z ?=- Redshift from the source catalog
(source_redshift)
143-156 A14 --- ClassDR11 Class adopted in this work (class_DR11)
--------------------------------------------------------------------------------
Acknowledgements:
Paula Sanchez Saez, pasanchezsaez(at)gmail.com
(End) Patricia Vannier [CDS] 14-Apr-2023