An elementary introduction to information geometry

Nielsen, Frank

arXiv.org Machine Learning 

We present a concise and modern view of the basic structures lying at the heart of Information Geometry (IG), and report some applications of those information-geometric manifolds (termed "information manifolds") in statistics (Bayesian hypothesis testing) and machine learning (statistical mixture clustering). By analogy to Information Theory (IT), pioneered by Claude Shannon [62] in 1948, which primarily considers the communication of messages over noisy transmission channels, we may define Information Sciences (IS) as the fields that study "communication" between (noisy/imperfect) data and families of models (postulated as a priori knowledge). In short, information sciences seek methods to distill information from data to models. Thus, information sciences encompass information theory but also include Probability & Statistics, Machine Learning (ML), Artificial Intelligence (AI), and Mathematical Programming, to name just a few areas. In §5.2, we review some key milestones of information geometry and report some definitions of the field by its pioneers. A modern and broad definition of information geometry can be stated as the field that studies the geometry of decision making. This definition also includes model fitting (inference), which can be interpreted as a decision problem, as illustrated in Figure 1: namely, deciding which model parameter to choose from a family of parametric models. This framework was advocated by Abraham Wald [72, 73, 17], who considered all statistical problems as statistical decision problems. Distances play a crucial role not only for measuring the goodness-of-fit of data to model (say, likelihood in statistics, classifier loss functions in ML, objective functions in mathematical programming, etc.) but also for measuring the discrepancy (or deviance) between models.
