Bayesian adaptation and combination of deep models for automatic speech recognition

Zhen Huang

Institution:	Georgia Tech
Department:
Year:	2017
Keywords:	Deep neural networks
Posted:	02/01/2018
Record ID:	2219819
Full text PDF:	http://hdl.handle.net/1853/58653

Abstract

The objective of the proposed research is to deploy a Bayesian adaptation and combination framework for deep-model-based automatic speech recognition systems, in order to combat the degradation in recognition accuracy that is typically observed when training and testing conditions are potentially mismatched. This dissertation addresses the problem in three directions. The first direction is to perform Bayesian adaptation directly on the discriminative deep neural network models. Maximum a posteriori estimation and multi-task learning techniques are employed as regularization in the deep neural network updating formula. In the second direction, the deep neural network is cast into a generative model to better leverage Bayesian techniques. Classic structured maximum a posteriori adaptation is adopted by using bottleneck features derived from deep neural networks. In the third direction, a hierarchical Bayesian system combination technique is employed to further enhance the adaptation performance by leveraging the complementarity of the discriminative and generative adaptive models.

Advisors/Committee Members: Lee, Chin-Hui (advisor), Juang, Biing-Hwang (committee member), Clements, Mark A. (committee member), Li, Geoffrey Ye (committee member), Siniscalchi, Sabato Marco (committee member).