J/A+A/576/A96 New code for PCA in spectral analysis (Bu+, 2015)
Restricted Boltzmann machine: a non-linear substitute for PCA in spectral
processing.
Bu Y., Zhao G., Luo A-l., Pan J., Chen Y.
<Astron. Astrophys. 576, A96 (2015)>
=2015A&A...576A..96B
ADC_Keywords: Models ; Spectroscopy
Keywords: methods: statistical - methods: data analysis - methods: numerical
Abstract:
Principal component analysis (PCA) is widely used to repair incomplete
spectra, to perform spectral denoising, and to reduce dimensionality.
Presently, no method has been found to be comparable to PCA on these
three problems. New methods have been proposed, but are often specific
to one problem. For example, locally linear embedding outperforms PCA
in dimensionality reduction. However, it cannot be used in spectral
denoising and spectral repairing. Wavelet transform can be used to
denoise spectra; however, it cannot be used in dimensionality
reduction.
We provide a new method that can substitute PCA in incomplete spectra
repairing, spectral denoising and spectral dimensionality reduction.
A new method, restricted Boltzmann machine (RBM), is introduced in
spectral processing. An RBM is a particular type of Markov random field
with a two-layer architecture, and it is trained using the Gibbs
sampling method. It can be used in spectral denoising, dimensionality
reduction and spectral repairing.
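
The two-layer structure and Gibbs-sampling training described above can be
sketched as follows. This is a minimal illustration in plain Python, not the
distributed MATLAB code. It assumes binary (0/1) visible units, a single
Gibbs step per update (contrastive divergence, CD-1), and toy 4-pixel
"spectra". The function names, learning rate, epoch count, and seed are all
assumptions chosen for the example.

```python
import math
import random


def sigmoid(x):
    # Clip the argument so math.exp cannot overflow.
    x = max(-50.0, min(50.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]


def sample(probs, rng):
    """Draw a binary state for each unit from its activation probability."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def train_rbm(data, n_hidden, epochs=200, lr=0.1, seed=0):
    """Train with CD-1: one Gibbs step (v0 -> h0 -> v1 -> h1) per sample."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
         for _ in range(n_visible)]
    b = [0.0] * n_visible   # visible biases
    c = [0.0] * n_hidden    # hidden biases
    for _ in range(epochs):
        for v0 in data:
            ph0 = hidden_probs(v0, W, c)    # positive phase
            h0 = sample(ph0, rng)           # sampled hidden state
            pv1 = visible_probs(h0, W, b)   # reconstruction (probabilities)
            ph1 = hidden_probs(pv1, W, c)   # negative phase
            for i in range(n_visible):
                for j in range(n_hidden):
                    W[i][j] += lr * (v0[i] * ph0[j] - pv1[i] * ph1[j])
                b[i] += lr * (v0[i] - pv1[i])
            for j in range(n_hidden):
                c[j] += lr * (ph0[j] - ph1[j])
    return W, b, c


def reconstruct(v, W, b, c):
    """Pass a visible vector through the hidden layer and back."""
    return visible_probs(hidden_probs(v, W, c), W, b)


# Toy data: four 4-pixel binary "spectra" (hypothetical values).
spectra = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
W, b, c = train_rbm(spectra, n_hidden=2)
codes = [hidden_probs(v, W, c) for v in spectra]    # low-dimensional codes
recs = [reconstruct(v, W, b, c) for v in spectra]   # reconstructed spectra
```

In this sketch the hidden activations play the role of the reduced
representation, and the reconstruction is the quantity one would inspect for
denoising or repairing. Real-valued spectra would normally be rescaled to
[0, 1] or modelled with Gaussian visible units, which this example does not
cover.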
Description:
Source code of the RBM algorithm, which can be executed in MATLAB 2013.
This code is based on the more general code by Hinton and coauthors
(Hinton & Salakhutdinov 2006).
The supplementary material contains two files. We only need to run
file rbm_dimreduction to reduce the dimension of the data. The other
file is the function file that will be called when running
rbm_dimreduction.
Data format. The data should be stored in MAT-file format. Each row of
the data matrix X=[x1^T,x2^T,...,xn^T]^T represents a
spectrum vector.
Variables in the script file rbm_dimreduction. Running rbm_dimreduction
requires two inputs: variable h, the dimension of the hidden vector;
and variable T1, the spectral data. Running rbm_dimreduction produces
the output variable rbm. Variable rbm.hiddata is the data set
of RBM components. Each row of rbm.hiddata represents the
low-dimensional projection of a spectrum vector. Variable rbm.rec is the
data set of the reconstructed spectra (denoised or repaired spectra).
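
The shapes of the inputs and outputs described above can be illustrated with
a small sketch. It is written in plain Python rather than MATLAB and assumes
a sigmoid RBM with already-trained parameters. The weights W, the biases b
and c, and the data values are hypothetical, and this file does not state
whether the distributed code computes rbm.hiddata and rbm.rec in exactly
this way.

```python
import math


def sigmoid(x):
    # Clip the argument so math.exp cannot overflow.
    x = max(-50.0, min(50.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def project(T1, W, c):
    """Map an n x d data matrix to an n x h matrix of hidden activations."""
    return [[sigmoid(c[j] + sum(row[i] * W[i][j] for i in range(len(row))))
             for j in range(len(c))]
            for row in T1]


def reconstruct(hiddata, W, b):
    """Map an n x h matrix of hidden activations back to n x d spectra."""
    return [[sigmoid(b[i] + sum(z[j] * W[i][j] for j in range(len(z))))
             for i in range(len(b))]
            for z in hiddata]


# Hypothetical inputs: n = 4 spectra (rows), d = 3 pixels each, h = 2.
h = 2
T1 = [[0.9, 0.8, 0.1],
      [0.2, 0.1, 0.7],
      [0.8, 0.9, 0.2],
      [0.1, 0.2, 0.9]]

# Hypothetical trained parameters: W is d x h, b has length d, c length h.
W = [[1.5, -1.0], [1.2, -0.8], [-1.1, 1.4]]
b = [0.0, 0.0, 0.0]
c = [0.0] * h

hiddata = project(T1, W, c)       # analogue of rbm.hiddata (n x h)
rec = reconstruct(hiddata, W, b)  # analogue of rbm.rec (n x d)

print(len(hiddata), len(hiddata[0]))  # 4 2
print(len(rec), len(rec[0]))          # 4 3
```

Each row of hiddata corresponds to the same row of T1, so row order is
preserved between the input spectra, their projections, and their
reconstructions.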
File Summary:
--------------------------------------------------------------------------------
FileName       Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
rbm_train.m       86      134   Function file called by rbm_dimreduction.m
rbm_dim.m        140       17   File used to reduce the dimension of the data
--------------------------------------------------------------------------------
Acknowledgements:
Yude Bu, 123974934(at)qq.com
(End) Patricia Vannier [CDS] 08-Apr-2015