 Hartford County







A Theorem Proofs

Neural Information Processing Systems

In this section, we present the proofs of the theorems introduced in the main paper. The proof of Theorem 2 is presented as follows. Consider a classification task where the loss function is the cross-entropy loss. This approximately holds for many applications with over-parameterized neural predictors. In this case, we have the following theorem: Theorem 3. If Equations (18) and (19) hold, that This contradicts Equation (23).


Distribution-free inference for LightGBM and GLM with Tweedie loss

arXiv.org Machine Learning

Prediction uncertainty quantification has been a key research topic in recent years for scientific and business problems. In the insurance industry (\cite{parodi2023pricing}), assessing the range of possible claim costs for individual drivers improves premium pricing accuracy. It also enables insurers to manage risk more effectively by accounting for uncertainty in accident likelihood and severity. In the presence of covariates, a variety of regression-type models are often used for modeling insurance claims, ranging from relatively simple generalized linear models (GLMs) to regularized GLMs to gradient boosting models (GBMs). Conformal predictive inference has arisen as a popular distribution-free approach for quantifying predictive uncertainty under the relatively weak assumption of exchangeability, and has been well studied in the classic linear regression setting. In this work, we propose new non-conformity measures for GLMs and GBMs with GLM-type loss. Using regularized Tweedie GLM regression and LightGBM with Tweedie loss, we demonstrate conformal prediction performance with these non-conformity measures on insurance claims data. Our simulation results favor the use of locally weighted Pearson residuals for LightGBM over the other methods considered, as the resulting intervals maintained the nominal coverage with the smallest average width.
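The split-conformal idea behind such intervals can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a Tweedie variance function mu**power, uses the absolute Pearson residual as the non-conformity score, and takes the fitted means as given (in practice they would come from a GLM or LightGBM model). The function names, the power value of 1.5, and the toy numbers are all hypothetical.

```python
import math

def pearson_score(y, mu, power=1.5):
    """Absolute Pearson residual, assuming Var(Y) is proportional to mu**power (mu > 0)."""
    return abs(y - mu) / mu ** (power / 2.0)

def conformal_quantile(scores, alpha):
    """Finite-sample split-conformal quantile of the calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))  # rank of the order statistic to use
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def prediction_interval(mu_new, q, power=1.5):
    """Invert the score: the interval half-width scales with mu_new**(power/2)."""
    half_width = q * mu_new ** (power / 2.0)
    # Claim costs are non-negative, so the lower bound is clipped at zero.
    return max(0.0, mu_new - half_width), mu_new + half_width

# Toy calibration set (made-up numbers): observed claims and predicted means.
cal_y = [0.0, 2.0, 5.0, 1.0, 0.0, 3.0, 8.0, 0.5, 4.0]
cal_mu = [1.0, 2.5, 4.0, 1.5, 0.5, 2.0, 6.0, 1.0, 3.0]

scores = [pearson_score(y, m) for y, m in zip(cal_y, cal_mu)]
q = conformal_quantile(scores, alpha=0.2)
lower, upper = prediction_interval(2.0, q)
```

Because the score divides by a power of the predicted mean, the interval widens for policies with larger predicted claims, which is the sense in which the residual is locally scaled here.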


A Variational Information Theoretic Approach to Out-of-Distribution Detection

arXiv.org Artificial Intelligence

We present a theory for the construction of out-of-distribution (OOD) detection features for neural networks. We introduce random features for OOD through a novel information-theoretic loss functional consisting of two terms: the first, based on the KL divergence, separates the resulting in-distribution (ID) and OOD feature distributions, and the second is the Information Bottleneck, which favors compressed features that retain the OOD information. We formulate a variational procedure to optimize the loss and obtain OOD features. Based on assumptions on OOD distributions, one can recover properties of existing OOD features, i.e., shaping functions. Furthermore, we show that our theory can predict a new shaping function that outperforms existing ones on OOD benchmarks. Our theory provides a general framework for constructing a variety of new features with clear explainability.
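As a small illustration of the separation term only, the KL divergence between two discrete feature distributions can be computed as below. This is a sketch under assumptions, not the paper's loss functional: the histograms are made-up values over three hypothetical feature bins, and the Information Bottleneck term is not modeled.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same bins."""
    # Bins with p_i == 0 contribute nothing; q_i is floored at eps to avoid log(0).
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical histograms of one feature on ID and OOD inputs.
id_hist = [0.7, 0.2, 0.1]
ood_hist = [0.1, 0.2, 0.7]

separation = kl_divergence(id_hist, ood_hist)
```

A larger value indicates that the feature distributes ID and OOD inputs more differently, and the divergence is zero when the two histograms coincide.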


The Value of Information in Multi-Scale Feedback Systems

arXiv.org Artificial Intelligence

Complex adaptive systems (CAS) can be described as systems of information flows dynamically interacting across scales in order to adapt and survive. CAS often consist of many components that work towards a shared goal, and interact across different informational scales through feedback loops, leading to their adaptation. In this context, understanding how information is transmitted among system components and across scales becomes crucial for understanding the behavior of CAS. Shannon entropy, a measure of syntactic information, is often used to quantify the size and rarity of messages transmitted between objects and observers, but it does not measure the value that information has for each specific observer. For this, semantic and pragmatic information have been conceptualized as describing the influence on an observer's knowledge and actions. Building on this distinction, we describe the architecture of multi-scale information flows in CAS through the concept of Multi-Scale Feedback Systems, and propose a series of syntactic, semantic and pragmatic information measures to quantify the value of information flows. While the measurement of values is necessarily context-dependent, we provide general guidelines on how to calculate semantic and pragmatic measures, and concrete examples of their calculation through four case studies: a robotic collective model, a collective decision-making model, a task distribution model, and a hierarchical oscillator model. Our results contribute to an informational theory of complexity, aiming to better understand the role played by information in the behavior of Multi-Scale Feedback Systems.
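For the syntactic measure mentioned above, Shannon entropy can be estimated from the empirical frequencies of observed messages. The sketch below covers only that syntactic quantity. The semantic and pragmatic measures proposed in the paper are context-dependent and are not reproduced here. The message labels are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(messages):
    """Entropy in bits of the empirical distribution over observed messages."""
    counts = Counter(messages)
    total = len(messages)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def surprisal(message, messages):
    """Information content in bits of one message: rarer messages carry more."""
    p = Counter(messages)[message] / len(messages)
    return -math.log2(p)

# Hypothetical message stream exchanged between two system components.
stream = ["up", "down", "up", "down"]
h = shannon_entropy(stream)
```

Neither quantity depends on who receives the message, which is the limitation that motivates the observer-specific measures described in the abstract.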


Subtitling Your Life

The New Yorker

A little over thirty years ago, when he was in his mid-forties, my friend David Howorth lost all hearing in his left ear, a calamity known as single-sided deafness. "It happened literally overnight," he said. "My doctor told me, 'We really don't understand why.' " At the time, he was working as a litigator in the Portland, Oregon, office of a large law firm. His hearing loss had no impact on his job--"In a courtroom, you can get along fine with one ear"--but other parts of his life were upended. The brain pinpoints sound sources in part by analyzing minute differences between left-ear and right-ear arrival times, the same process that helps bats and owls find prey they can't see.