
Thursday, May 22, 2014

[paper] Probabilistic Linear Discriminant Analysis for Acoustic Modelling

Link to the paper: http://homepages.inf.ed.ac.uk/srenals/plda-spl2014.pdf

PLDA is formulated as a generative model, in which an acoustic feature vector \boldsymbol{y}_t emitted by the j-th HMM state at time index t is expressed as

\boldsymbol{y}_t | j, m = \boldsymbol{U}_m \boldsymbol{x}_{jmt} + \boldsymbol{G}_m \boldsymbol{z}_{jm} + \boldsymbol{b}_m + \epsilon_{mt},

where m is the Gaussian component index of the GMM for state j.

\boldsymbol{z}_{jm} is the component-dependent variable, shared by all acoustic feature frames generated by the m-th Gaussian of the j-th state.

\boldsymbol{x}_{jmt} is the channel variable, which explains the per-frame variation.

In their work, the prior distributions of \boldsymbol{z}_{jm} and \boldsymbol{x}_{jmt} are assumed to be \mathcal{N}(\boldsymbol{0}, \boldsymbol{I}).
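
As a short worked step (it follows from the standard-normal priors and is not quoted from the paper), treating \boldsymbol{U}_m and \boldsymbol{G}_m as fixed parameters, the two latent terms of the model are themselves zero-mean Gaussian:

\boldsymbol{U}_m \boldsymbol{x}_{jmt} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{U}_m \boldsymbol{U}_m^\top), \quad \boldsymbol{G}_m \boldsymbol{z}_{jm} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{G}_m \boldsymbol{G}_m^\top).

If the latent dimensions are smaller than the feature dimension (an assumption here), these covariances are low-rank, so each loading matrix confines its source of variation to a subspace of the feature space.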

\boldsymbol{b}_m denotes the bias of the m-th component.

\epsilon_{mt} is the residual noise, which is Gaussian with zero mean and diagonal covariance, i.e. \epsilon_{mt} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{\Lambda}_m)
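
To make the generative story concrete, here is a minimal NumPy sketch that samples frames for a single (state j, component m) pair. It is only an illustration: the function name `sample_plda_frames`, the dimensions, and the randomly drawn parameters are all assumptions made for this example, not values or code from the paper.

```python
import numpy as np

def sample_plda_frames(U, G, b, noise_var, n_frames, rng):
    """Draw n_frames feature vectors for one (state, component) pair.

    U: (dim, x_dim) loading matrix for the per-frame channel variable
    G: (dim, z_dim) loading matrix for the shared component variable
    b: (dim,) bias; noise_var: (dim,) diagonal of the noise covariance
    """
    dim, x_dim = U.shape
    z_dim = G.shape[1]
    # z is drawn once from N(0, I) and shared by every frame.
    z = rng.standard_normal(z_dim)
    # x is drawn from N(0, I) independently for each frame.
    x = rng.standard_normal((n_frames, x_dim))
    # Residual noise: zero mean, diagonal covariance.
    eps = rng.standard_normal((n_frames, dim)) * np.sqrt(noise_var)
    y = x @ U.T + G @ z + b + eps
    return y, z

# Hypothetical toy dimensions and parameters, chosen only for illustration.
rng = np.random.default_rng(0)
dim, x_dim, z_dim = 4, 2, 3
U = rng.standard_normal((dim, x_dim))
G = rng.standard_normal((dim, z_dim))
b = rng.standard_normal(dim)
noise_var = rng.uniform(0.1, 0.5, size=dim)

y, z = sample_plda_frames(U, G, b, noise_var, n_frames=200_000, rng=rng)

# Conditioned on z, the frames are Gaussian with mean G z + b
# and covariance U U^T + diag(noise_var).
mean_theory = G @ z + b
cov_theory = U @ U.T + np.diag(noise_var)
mean_err = np.abs(y.mean(axis=0) - mean_theory).max()
cov_err = np.abs(np.cov(y, rowvar=False) - cov_theory).max()
print("max mean error:", mean_err)
print("max covariance error:", cov_err)
```

With many frames, both printed errors should be small, which checks that, for a fixed component variable, the shared term only shifts the mean while the channel variable and the noise together determine the spread of the frames.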


