Studies on unsupervised and weakly supervised methods in computational modeling of early language acquisition

by Okko Räsänen




Institution: Aalto University
Department:
Year: 2013
Keywords: Computer science; Linguistics; computational modeling; language acquisition; pattern discovery; speech processing; cognitive modeling; speech segmentation; unsupervised learning
Record ID: 1131689
Full text PDF: https://aaltodoc.aalto.fi/handle/123456789/10247


Abstract

This thesis addresses computational modeling of early language acquisition using statistical learning mechanisms. A growing body of evidence from experimental psychology and brain imaging studies indicates that human infants are sensitive to the statistical structure of sensory input and that their ability to extract statistics from speech signals plays a central role in learning the native language. The idea of domain-general statistical learning mechanisms in language acquisition contrasts with the nativist view, in which many language-specific innate factors have traditionally been assumed to exist in the human brain. This thesis presents a series of computational studies addressing two questions: what kinds of representations are learnable from speech signals, and what kinds of computational mechanisms are needed to learn them. The core idea is to model language acquisition from the perspective of a tabula rasa agent that has no prior knowledge of language or its relevant units, such as phones, phonemes, syllables, or words, but simply comes into being with a number of generic statistical learning algorithms. When exposed to speech input in different experimental settings, these algorithms start to model recurring patterns in the data and link these patterns to contextual variables, such as simulated visual input associated with the speech contents. From a machine learning perspective, the studied methods correspond to unsupervised and weakly supervised learning algorithms, since language learning takes place without explicit supervision. These studies show that spoken words can be learned from continuous speech based on the statistical structure of the speech input, without assuming a phonetic or other linguistically motivated intermediate representation of language.
Different strategies for grounding the acoustic word patterns in their visual referents are also studied, and new methods are presented for segmenting speech into phone-like units and for clustering acoustic features into discrete categories. Finally, it is shown that frequency characteristics of the human auditory system can also be derived from the statistics of speech signals, suggesting that distributional learning in auditory perception may not be limited to learning linguistic representations of speech.

Abstract (translated from Finnish)

This doctoral thesis addresses the computational modeling of early language acquisition using statistical learning methods. A steadily growing number of studies in experimental psychology and brain research have shown that human children are sensitive to the statistical properties of sensory stimuli, and that these statistical properties play a central role in early native-language development. The idea of language acquisition as mere adaptation to the structural properties of sensory stimuli, without innate language-specific learning mechanisms, conflicts with the so-called traditional nativist view.…