Giorgos Felekis will present his master's thesis, entitled "Generalised Variational Inference posteriors in Probabilistic Deep Learning", on Wednesday 13 January 2021 at 12.00.
Abstract
The ultimate goal of machine learning models is to make reliable predictions and automated decisions as efficiently as possible. Machine learning, and especially deep learning, has attracted a lot of attention from information engineering fields such as computer vision, speech recognition and language and image processing, but also from scientific and safety-critical decision-making fields such as physics, biology and medical diagnosis. Representing model uncertainty is of utmost importance in the latter, where mistakes can be costly: it is important to know what we do not know. Today's most popular neural network architectures (ReLU networks) only return point estimates of parameters and predictions, typically lacking any representation of uncertainty, and even when they do return probability values (multiclass classification) they suffer from overconfidence. There is therefore a need for new models, or adaptations of current architectures, that not only provide point estimates but also incorporate a measure of confidence.

In this direction, Bayesian methods provide a natural probabilistic representation of uncertainty in deep learning. Bayesian modelling is the gold-standard way to capture uncertainty, and conceptually a simple one: instead of a point estimate, the output in the Bayesian setup is a posterior distribution derived via Bayes' rule. Usually this distribution is assumed a priori to be Gaussian, so that all we need to do is predict its mean and variance. The main issue with Bayesian modelling is that it is often extremely computationally expensive and time-consuming, as most of these posterior distributions are intractable. The recent success of these methods in practice, including the emergence of the field of Bayesian Deep Learning, relies on approximate inference methods. These approximation schemes perform Bayesian reasoning at relatively low cost (in time and memory) compared to traditional methods: rather than computing the posterior distribution directly, they construct approximations of it, either stochastically (MCMC) or deterministically (Variational Inference), and thus allow Bayesian modelling to be applied to many practical tasks.

In this work we focus mainly on Variational Inference methods, first because they are currently the most popular in the research community, but also to point out core issues with these methods and hence show their inability to appropriately tackle real-world problems. This motivates a recently published framework by T. Damoulas, J. Knoblauch and J. Jewson, called Generalised Variational Inference (GVI), which generalises standard Bayesian and Variational Inference strategies and appears to overcome many of the drawbacks of standard Variational Inference. GVI can be seen as a generalisation of Bayesian inference specified by an optimisation problem over a space of probability measures with three independent arguments: a loss, a divergence and a variational family. By advocating an optimisation view of Bayesian modelling, GVI posteriors can be seen as the optimal Q-constrained solution to this optimisation problem. We motivate GVI especially in the context of Bayesian Neural Networks, where standard approaches often rest on inappropriate assumptions about the prior, the likelihood and the available computational resources.
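For concreteness, the objects above can be written out as follows; this is a minimal sketch in the notation of the GVI paper, and the thesis itself may present them differently. Given data x_1, ..., x_n, a prior \pi(\theta) and a variational family \mathcal{Q}:

  % Exact Bayesian posterior via Bayes' rule (usually intractable):
  \pi(\theta \mid x_{1:n}) = \frac{\pi(\theta)\,\prod_{i=1}^{n} p(x_i \mid \theta)}{p(x_{1:n})}

  % Standard VI: the member of Q closest to the posterior in KL divergence,
  % found by maximising the evidence lower bound (ELBO):
  q^{*}_{\mathrm{VI}} = \arg\max_{q \in \mathcal{Q}} \; \mathbb{E}_{q(\theta)}\Big[\sum_{i=1}^{n} \log p(x_i \mid \theta)\Big] - \mathrm{KL}\big(q \,\|\, \pi\big)

  % GVI: the "rule of three" P(\ell, D, Q), with a loss \ell, a divergence D
  % and a variational family Q as independent arguments:
  q^{*}_{\mathrm{GVI}} = \arg\min_{q \in \mathcal{Q}} \; \mathbb{E}_{q(\theta)}\Big[\sum_{i=1}^{n} \ell(\theta, x_i)\Big] + D\big(q \,\|\, \pi\big)

Choosing \ell(\theta, x_i) = -\log p(x_i \mid \theta) and D = KL recovers standard VI, and additionally letting \mathcal{Q} be the space of all probability measures recovers the exact Bayesian posterior; varying the three arguments instead yields the more robust GVI posteriors studied in the thesis.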
We find that, in certain cases, approximate posterior distributions derived from Generalised Variational Inference offer attractive properties with respect to uncertainty quantification, consistency and predictive performance.
Short CV
Giorgos Felekis graduated from the Department of Mathematics of the National and Kapodistrian University of Athens. Last September he completed his MSc in Machine Learning at University College London (UCL). In the past, he has worked at the Computational Intelligence Laboratory and the Institute of Informatics & Telecommunications of the “Demokritos” Research Center, in the area of Deep Learning for Remote Sensing. His research interests focus mainly on Probabilistic Deep Learning and Computational Neuroscience.