
Approximation Strategies for Structure Learning in Bayesian Networks

by Teppo Niinimäki




Institution: University of Helsinki
Department:
Year: 2015
Keywords: tietojenkäsittelytiede (computer science)
Posted: 02/05/2017
Record ID: 2109874
Full text PDF: http://hdl.handle.net/10138/156050


Abstract

Bayesian networks are probabilistic graphical models that can compactly represent complex probabilistic dependencies between a set of variables. Once learned from data or constructed by other means, they can both give insight into the modeled domain and be used for probabilistic reasoning tasks, such as predicting future data points. Learning a Bayesian network consists of two tasks: discovering a graphical dependency structure over the variables, and finding the numerical parameters of a conditional distribution for each variable.

Structure discovery has attracted considerable interest in recent decades. Attention has mostly been paid to finding a structure that best fits the data under a given criterion. This optimization approach can, however, lead to noisy and partly arbitrary results due to the uncertainty caused by small amounts of data. The so-called full Bayesian approach addresses this shortcoming by learning the posterior distribution of structures. In practice, the posterior distribution is summarized by constructing a representative sample of structures, or by computing marginal posterior probabilities of individual arcs or other substructures.

This thesis presents algorithms for the full Bayesian approach to structure learning in Bayesian networks. Because the existing exact algorithms only scale to small networks of up to about 25 variables, we investigate sampling-based Monte Carlo methods. The state-of-the-art sampling algorithms draw orderings of variables along a Markov chain. We propose several improvements to this approach. First, we show that sampling partial orders instead of linear orders can lead to radically improved mixing of the Markov chain and consequently to better estimates. Second, we suggest replacing Markov chain Monte Carlo with annealed importance sampling, which can further improve the accuracy of estimates and also has other advantages, such as independent samples and easy parallelization.
Third, we propose a way to correct the bias caused by sampling orderings of variables instead of structures. Fourth, we present an algorithm that can significantly speed up per-sample computations via approximation.

In addition, the thesis proposes a new algorithm for so-called local learning of the Bayesian network structure. In local learning the task is to discover the neighborhood of a given target variable. In contrast to previous algorithms, which are based on conditional independence tests between variables, our algorithm assigns scores to larger substructures. This approach often leads to more accurate results.

Abstract (in Finnish, translated): Bayesian networks are probabilistic models that can be used to model complex statistical dependencies between variables. Owing to their intuitive graphical form, they can help in gaining a deeper understanding of the modeled phenomenon. For this reason, among others, it is desirable to construct, that is, to 'learn', Bayesian networks automatically from observational data. The most important, and at the same time most difficult, task in learning a Bayesian network is constructing the structure, which consists of a directed graph.…
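The core idea behind the order-based samplers mentioned above can be illustrated with a minimal sketch: a Metropolis chain over orderings of variables, where each step proposes swapping two positions and accepts with the usual ratio of scores. This is not the thesis's actual algorithm; in particular, the `score` function below is a hypothetical stand-in for the real ordering score, which would sum, for each variable, over all parent sets consistent with the ordering.

```python
import math
import random

def score(order):
    """Hypothetical unnormalized log-score of an ordering (toy stand-in:
    favors orderings close to the identity). A real order-MCMC score
    would marginalize over parent sets consistent with the ordering."""
    return -sum(abs(pos - var) for pos, var in enumerate(order))

def order_mcmc(n_vars, n_samples, seed=0):
    """Metropolis sampling over orderings of n_vars variables."""
    rng = random.Random(seed)
    order = list(range(n_vars))
    samples = []
    for _ in range(n_samples):
        # Propose swapping two random positions (a symmetric proposal,
        # so the Metropolis acceptance ratio needs only the scores).
        i, j = rng.sample(range(n_vars), 2)
        proposal = order[:]
        proposal[i], proposal[j] = proposal[j], proposal[i]
        if math.log(rng.random()) < score(proposal) - score(order):
            order = proposal
        samples.append(order[:])
    return samples

samples = order_mcmc(n_vars=5, n_samples=200)
```

Each sample is a full ordering; in the full Bayesian approach, quantities such as arc posterior probabilities would then be estimated by averaging over these sampled orderings, with the correction for ordering-induced bias discussed in the abstract.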