Coding & Learning: [paper] Probabilistic Linear Discriminant Analysis for Acoustic Modelling

Thursday, May 22, 2014

Link to the paper: http://homepages.inf.ed.ac.uk/srenals/plda-spl2014.pdf

PLDA is formulated as a generative model, in which an acoustic feature vector $\boldsymbol{y}_t$ from the $j$-th HMM state at time index $t$ can be expressed as

$\boldsymbol{y}_t | j, m = \boldsymbol{U}_m \boldsymbol{x}_{jmt} + \boldsymbol{G}_m \boldsymbol{z}_{jm} + \boldsymbol{b}_m + \epsilon_{mt}$,

where $m$ is the index of the Gaussian component in the GMM for state $j$.

$\boldsymbol{z}_{jm}$ is the component-dependent variable, shared by all acoustic feature frames generated by the $m$-th Gaussian of the $j$-th state.

$\boldsymbol{x}_{jmt}$ is the channel variable, which explains the per-frame variation.

In their work, the prior distributions of $\boldsymbol{z}_{jm}$ and $\boldsymbol{x}_{jmt}$ are both assumed to be $\mathcal{N}(\boldsymbol{0}, \boldsymbol{I})$.

$\boldsymbol{b}_m$ denotes the bias, and $\epsilon_{mt}$ is the residual noise, which is Gaussian with zero mean and diagonal covariance, i.e. $\epsilon_{mt} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{\Lambda}_m)$.
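To make the generative story concrete, here is a minimal NumPy sketch that samples frames for a single state/component pair $(j, m)$. It is an illustration under stated assumptions, not code from the paper: the function name `sample_plda_frames`, the dimensions, and the randomly drawn parameter values are all hypothetical, and the diagonal noise covariance (written here as $\boldsymbol{\Lambda}_m$) is stored as a vector of variances. Because $\boldsymbol{z}_{jm}$ is drawn once and shared while $\boldsymbol{x}_{jmt}$ and the noise are drawn per frame, the frames conditioned on $\boldsymbol{z}_{jm}$ should have mean $\boldsymbol{G}_m \boldsymbol{z}_{jm} + \boldsymbol{b}_m$ and covariance $\boldsymbol{U}_m \boldsymbol{U}_m^\top + \boldsymbol{\Lambda}_m$, which the last lines check empirically.

```python
import numpy as np

def sample_plda_frames(U, G, b, noise_var, num_frames, rng):
    """Draw frames y_t = U x_t + G z + b + eps_t for one (state, component) pair.

    Hypothetical helper for illustration; shapes are
    U: (D, Dx), G: (D, Dz), b: (D,), noise_var: (D,) diagonal variances.
    """
    # z_jm ~ N(0, I): drawn once, shared by every frame of this component
    z = rng.standard_normal(G.shape[1])
    # x_jmt ~ N(0, I): one independent draw per frame
    x = rng.standard_normal((num_frames, U.shape[1]))
    # eps ~ N(0, diag(noise_var)): per-frame residual noise
    eps = rng.standard_normal((num_frames, U.shape[0])) * np.sqrt(noise_var)
    y = x @ U.T + G @ z + b + eps
    return y, z

rng = np.random.default_rng(0)
feat_dim, x_dim, z_dim = 5, 3, 2              # illustrative sizes (assumed)
U = rng.standard_normal((feat_dim, x_dim))    # assumed parameter values
G = rng.standard_normal((feat_dim, z_dim))
b = rng.standard_normal(feat_dim)
noise_var = rng.uniform(0.1, 0.5, size=feat_dim)

y, z = sample_plda_frames(U, G, b, noise_var, num_frames=200_000, rng=rng)

# Conditioned on z, frames are Gaussian with these moments
expected_mean = G @ z + b
expected_cov = U @ U.T + np.diag(noise_var)
mean_err = np.abs(y.mean(axis=0) - expected_mean).max()
cov_err = np.abs(np.cov(y, rowvar=False) - expected_cov).max()
print("max mean error:", mean_err)
print("max covariance error:", cov_err)
```

Both errors should shrink towards zero as `num_frames` grows, which is a quick sanity check that the sampler matches the model as described above.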
