|University of Manchester
|GRBAS; Machine learning; Digital Signal Processing; Voice Quality Assessment
|Full text PDF:
Vocal cord vibration is the source of voiced phonemes in speech. Voice quality depends on the nature of this vibration. Vocal cords can be damaged by infection, neck or chest injury, tumours and more serious diseases such as laryngeal cancer. This kind of physical damage can cause loss of voice quality. To support the diagnosis of such conditions and also to monitor the effect of any treatment, voice quality assessment is required. Traditionally, this is done ‘subjectively’ by Speech and Language Therapists (SLTs) who, in Europe, use a well-known assessment approach called ‘GRBAS’. GRBAS is an acronym for a five dimensional scale of measurements of voice properties. The scale was originally devised and recommended by the Japanese Society of Logopeadics and Phoniatrics and several European research publications. The proper- ties are ‘Grade’, ‘Roughness’, ‘Breathiness’, ‘Asthenia’ and ‘Strain’. An SLT listens to and assesses a person’s voice while the person performs specific vocal maneuvers. The SLT is then required to record a discrete score for the voice quality in range of 0 to 3 for each GRBAS component. In requiring the services of trained SLTs, this subjective assessment makes the traditional GRBAS procedure expensive and time-consuming to administer. This thesis considers the possibility of using computer programs to perform objective assessments of voice quality conforming to the GRBAS scale. To do this, Digital Signal Processing (DSP) algorithms are required for measuring voice features that may indicate voice abnormality. The computer must be trained to convert DSP measurements to GRBAS scores and a ‘machine learning’ approach has been adopted to achieve this. This research was made possible by the development, by Manchester Royal Infirmary (MRI) Hospital Trust, of a ‘speech database’ with the participation of clinicians, SLT’s, patients and controls. The participation of five SLTs scorers allowed norms to be established for GRBAS scoring which provided ‘reference’ data for the machine learning approach. To support the scoring procedure carried out at MRI, a software package, referred to as GRBAS Presentation and Scoring Package (GPSP), was developed for presenting voice recordings to each of the SLTs and recording their GRBAS scores. A means of assessing intra-scorer consistency was devised and built into this system. Also, the assessment of inter-scorer consistency was advanced by the invention of a new form of the ‘Fleiss Kappa’ which is applicable to ordinal as well as categorical scoring. The means of taking these assessments of scorer consistency into account when producing ‘reference’ GRBAS scores are presented in this thesis. Such reference scores are required for training the machine learning algorithms. The DSP algorithms required for feature measurements are generally well known and available as published or commercial software packages. However, an appraisal of these algorithms and the development of some DSP ‘thesis software’ was found to be necessary. Two ‘machine learning’ regression models… Advisors/Committee Members: Lujan Moreno, Mikel Lujan.