
Speaker Localization, tracking and remote speech pickup in a conference room.

by Yasir Malik




Institution: Blekinge Institute of Technology
Department:
Year: 2009
Keywords: signal processing; speaker localization; tracking; SRP-PHAT.
Record ID: 1341486
Full text PDF: http://www.bth.se/fou/cuppsats.nsf/6753b78eb2944e0ac1256608004f0535/f9bbbe89af99cf55c12576620040b35f?OpenDocument


Abstract

Effective speech communication using microphone arrays is attracting significant research attention in speech acquisition, particularly in speaker localization and tracking. Localization techniques play an important role in automatic camera steering for videoconferencing systems and in other human-machine interfaces. Accurately estimating the Direction Of Arrival (DOA) of a source requires a suitable microphone array system with minimal internal hardware noise and an efficient localization algorithm. Many algorithms have been developed for estimating the number of sources and their DOA, such as Bayesian algorithms, Kalman filtering, Generalized Cross Correlation (GCC) and Steered Response Power (SRP). Of these, the SRP algorithm, with its steered beamforming technique, is the more robust for speaker localization with a microphone array. The Phase Transform (PHAT) has gained a lot of attention in recent research for its robust response in low-noise but reverberant environments. Combining the two as SRP-PHAT therefore yields a robust localizer for reverberant environments. This project aims to design and install a remote speech pickup system functioning as a front end to a VoIP system in the biometric lab. A large microphone array was designed, installed on the ceiling of the biometric lab and integrated with a signal processing software suite for speaker localization and tracking, with the SRP-PHAT algorithm used as the localizer. Experiments were carried out on real-time recorded data of human talkers. The algorithm gives an accurate DOA for the dominant speaker and is suitable for real-time processing.
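
As an illustration of how an SRP-PHAT localizer works in principle, the sketch below sums PHAT-weighted cross-correlations over all microphone pairs at the lags each candidate direction would produce, then picks the direction with the highest total. It is not the implementation used in the thesis: the function names, the uniform linear array, the far-field (plane wave) model, the angle measured from broadside, and the speed of sound of 343 m/s are all assumptions made here for a compact example, whereas the thesis used a large ceiling-mounted array whose geometry is not given in this abstract.

```python
import itertools
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def gcc_phat(x, y, n_fft):
    """PHAT-weighted cross-correlation of x and y.

    Index k of the result holds the correlation at lag k (mod n_fft),
    where a positive lag means x is delayed relative to y.
    """
    X = np.fft.rfft(x, n=n_fft)
    Y = np.fft.rfft(y, n=n_fft)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12  # PHAT: discard magnitude, keep phase
    return np.fft.irfft(cross, n=n_fft)


def srp_phat_doa(frames, mic_x, fs, angles_deg):
    """Return (best_angle_deg, power) for a linear array (hypothetical helper).

    frames: array of shape (n_mics, n_samples), one row per microphone.
    mic_x: microphone positions along the array axis, in metres.
    Assumed model: a plane wave from angle theta reaches the microphone
    at position x with delay x * sin(theta) / c.
    """
    n_mics, n_samples = frames.shape
    n_fft = 2 * n_samples  # zero-pad so the correlation is not circular
    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))
    power = np.zeros(len(angles))
    for i, j in itertools.combinations(range(n_mics), 2):
        cc = gcc_phat(frames[i], frames[j], n_fft)
        # Delay of mic i relative to mic j for every candidate direction.
        tau = (mic_x[i] - mic_x[j]) * np.sin(angles) / SPEED_OF_SOUND
        lags = np.rint(tau * fs).astype(int)
        power += cc[lags % n_fft]  # negative lags wrap to the end of cc
    best = int(np.argmax(power))
    return float(angles_deg[best]), power
```

Rounding each delay to the nearest sample limits the angular resolution, so neighbouring candidate angles can score identically. A real system would typically search a 2-D or 3-D grid of positions and interpolate between samples.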