AITopics

The recent success of large language models gives new urgency to the question of how model performance should be evaluated. In many tasks, models can be evaluated for the accuracy of their outputs. However, models can also be evaluated along other important dimensions. For example, we can assess models for the transparency or interpretability of their judgments (Creel 2020; Vredenburgh 2022). We can also assess models for the presence of problematic biases (Kelly 2023; Johnson 2020). Most work on biases in large language models focuses on a conception of bias closely tied to unfairness, especially as affecting marginalized social groups. However, recent work has alleged that large language models also show a number of classic cognitive biases familiar from work in the psychology of reasoning, behavioral economics, and judgment and decisionmaking (Dasgupta et al. 2022; Lin and Ng 2023; Jones and Steinhardt 2022). This development is exciting because it raises the possibility of using cognitive bias as a novel metric by which to evaluate the performance of large language models.

cognitive bias, language model, reasoning, (16 more...)

2311.10932

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Africa (0.04)
North America > Puerto Rico > Peñuelas > Peñuelas (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Jafarigol, Elaheh, Trafalis, Theodore, Razzaghi, Talayeh, Zamankhani, Mona

Exploring Machine Learning Models for Federated Learning: A Review of Approaches, Performance, and Limitations

In the growing world of artificial intelligence, federated learning is a distributed learning framework enhanced to preserve the privacy of individuals' data. Federated learning lays the groundwork for collaborative research in areas where the data is sensitive. Federated learning has several implications for real-world problems. In times of crisis, when real-time decision-making is critical, federated learning allows multiple entities to work collectively without sharing sensitive data. This distributed approach enables us to leverage information from multiple sources and gain more diverse insights. This paper is a systematic review of the literature on privacy-preserving machine learning in the last few years based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Specifically, we have presented an extensive review of supervised/unsupervised machine learning algorithms, ensemble methods, meta-heuristic approaches, blockchain technology, and reinforcement learning used in the framework of federated learning, in addition to an overview of federated learning applications. This paper reviews the literature on the components of federated learning and its applications in the last few years. The main purpose of this work is to provide researchers and practitioners with a comprehensive overview of federated learning from the machine learning point of view. A discussion of some open problems and future research directions in federated learning is also provided.
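To make the idea of "working collectively without sharing sensitive data" concrete, here is a minimal sketch of one common aggregation scheme, weighted parameter averaging (often called federated averaging). The abstract does not name this scheme; the function name `federated_average` and the toy numbers are assumptions for illustration only.

```python
def federated_average(client_weights, client_sizes):
    """Combine locally trained parameter vectors without moving raw data.

    client_weights: list of parameter vectors, one per client (hypothetical).
    client_sizes:   number of local training examples held by each client.
    Each client's parameters are weighted by its share of the total data.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two clients; only their parameters, never their records, reach the server.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only the parameter vectors and the example counts are exchanged, which is the property the abstract highlights.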

application, federated learning, learning, (12 more...)

2311.10832

Country:

North America > United States > Oklahoma (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Law Enforcement & Public Safety (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(5 more...)

Adaptive Modelling Approach for Row-Type Dependent Predictive Analysis (RTDPA): A Framework for Designing Machine Learning Models for Credit Risk Analysis in Banking Sector

Rath, Minati, Date, Hema

In many real-world datasets, rows may have distinct characteristics and require different modeling approaches for accurate predictions. In this paper, we propose an adaptive modeling approach for row-type dependent predictive analysis (RTDPA). Our framework enables the development of models that can effectively handle diverse row types within a single dataset. Our dataset from XXX bank contains two different risk categories, personal loan and agriculture loan. Each of them is categorised into four classes: standard, sub-standard, doubtful, and loss. We applied tailored data preprocessing and feature engineering to the different row types. We selected traditional machine learning predictive models and advanced ensemble techniques. Our findings indicate that all predictive approaches consistently achieve a precision rate of no less than 90%. For RTDPA, the algorithms are applied separately for each row type, allowing the models to capture the specific patterns and characteristics of each row type. This approach enables targeted predictions based on the row type, providing a more accurate and tailored classification for the given dataset. Additionally, the suggested model consistently offers decision makers in the banking sector valuable, enduring, strategic insights.
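The core mechanism described above, fitting a separate model per row type and routing each prediction by row type, can be sketched as follows. This is not the authors' pipeline: the field names `row_type` and `label`, the helper names, and the trivial majority-class "model" are all placeholders chosen so the example stays self-contained.

```python
from collections import Counter, defaultdict


def fit_majority_class(group):
    """Placeholder model: always predict the group's most frequent label."""
    label = Counter(row["label"] for row in group).most_common(1)[0][0]
    return lambda row: label


def fit_per_row_type(rows, fit=fit_majority_class):
    """Partition rows by their row type and fit one model per partition."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["row_type"]].append(row)
    return {row_type: fit(group) for row_type, group in groups.items()}


def predict(models, row):
    """Route the row to the model trained on its own row type."""
    return models[row["row_type"]](row)


# Hypothetical toy data using the loan categories and classes from the abstract.
rows = [
    {"row_type": "personal", "label": "standard"},
    {"row_type": "personal", "label": "standard"},
    {"row_type": "personal", "label": "loss"},
    {"row_type": "agriculture", "label": "doubtful"},
]
models = fit_per_row_type(rows)
```

Any real classifier could be passed as `fit`; the point of the sketch is only the per-type partitioning and dispatch.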

dataset, prediction, row type, (12 more...)

2311.10799

Country:

Asia > India > Maharashtra > Mumbai (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.89)

Industry:

Banking & Finance > Loans (1.00)
Banking & Finance > Credit (1.00)
Information Technology > Security & Privacy (0.94)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(5 more...)

Schmitt, Marvin, Radev, Stefan T., Bürkner, Paul-Christian

Fuse It or Lose It: Deep Fusion for Multimodal Simulation-Based Inference

We present multimodal neural posterior estimation (MultiNPE), a method to integrate heterogeneous data from different sources in simulation-based inference with neural networks. Inspired by advances in attention-based deep fusion learning, it empowers researchers to analyze data from different domains and infer the parameters of complex mathematical models with increased accuracy. We formulate different multimodal fusion approaches for MultiNPE (early, late, and hybrid) and evaluate their performance in three challenging numerical experiments. MultiNPE not only outperforms naïve baselines on a benchmark model, but also achieves superior inference on representative scientific models from neuroscience and cardiology. In addition, we systematically investigate the impact of partially missing data on the different fusion strategies. Across our different experiments, late and hybrid fusion techniques emerge as the methods of choice for practical applications of multimodal simulation-based inference.

fusion, inference, neural network, (17 more...)

2311.10671

Country:

Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Catulo, João Simões, Soares, Cláudia, Guimarães, Marta

Predicting the Probability of Collision of a Satellite with Space Debris: A Bayesian Machine Learning Approach

Space is becoming more crowded in Low Earth Orbit due to increased space activity. Such a dense space environment increases the risk of collisions between space objects endangering the whole space population. Therefore, the need to consider collision avoidance as part of routine operations is evident to satellite operators. Current procedures rely on the analysis of multiple collision warnings by human analysts. However, with the continuous growth of the space population, this manual approach may become unfeasible, highlighting the importance of automation in risk assessment. In 2019, ESA launched a competition to study the feasibility of applying machine learning in collision risk estimation and released a dataset that contained sequences of Conjunction Data Messages (CDMs) in support of real close encounters. The competition results showed that the naive forecast and its variants are strong predictors for this problem, which suggests that the CDMs may follow the Markov property. The proposed work investigates this theory by benchmarking Hidden Markov Models (HMM) in predicting the risk of collision between two resident space objects by using one feature of the entire dataset: the sequence of the probability in the CDMs. In addition, Bayesian statistics are used to infer a joint distribution for the parameters of the models, which allows the development of robust and reliable probabilistic predictive models that can incorporate physical or prior knowledge about the problem within a rigorous theoretical framework and provides prediction uncertainties that nicely reflect the accuracy of the predicted risk. This work shows that the implemented HMM outperforms the naive solution in some metrics, which further adds to the idea that the collision warnings may be Markovian and suggests that this is a powerful method to be further explored.
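The two predictors contrasted above can be sketched in a few lines. The naive forecast simply carries the latest risk value forward, while a discrete HMM scores an observation sequence with the standard forward algorithm. The two hidden states, the discretization of risk into symbols 0 ("low") and 1 ("high"), and every probability below are invented for illustration; they are not taken from the ESA dataset or the paper.

```python
def naive_forecast(risk_sequence):
    """Predict the final risk as the most recent value in the CDM sequence."""
    return risk_sequence[-1]


def forward_likelihood(initial, transition, emission, observations):
    """Likelihood of an observation sequence under a discrete HMM.

    initial[s]       : probability of starting in hidden state s
    transition[p][s] : probability of moving from state p to state s
    emission[s][o]   : probability that state s emits symbol o
    """
    if not observations:
        return 1.0
    n_states = len(initial)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in range(n_states))
            * emission[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state model over symbols 0 ("low risk") and 1 ("high risk").
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.9, 0.1], [0.2, 0.8]]
likelihood = forward_likelihood(initial, transition, emission, [0, 0, 1])
```

In the Bayesian setting described in the abstract, the entries of `initial`, `transition`, and `emission` would themselves carry a posterior distribution rather than fixed values; the sketch keeps them fixed for brevity.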

cdm, prediction, sequence, (14 more...)

2311.10633

Country: Europe > Portugal (0.04)

Genre:

Research Report > New Finding (0.54)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Tomlinson, Kiran, Benson, Austin R.

Graph-Based Methods for Discrete Choice

Choices made by individuals have widespread impacts: for instance, people choose between political candidates to vote for, between social media posts to share, and between brands to purchase. Moreover, data on these choices are increasingly abundant. Discrete choice models are a key tool for learning individual preferences from such data. Additionally, social factors like conformity and contagion influence individual choice. Traditional methods for incorporating these factors into choice models do not account for the entire social network and require hand-crafted features. To overcome these limitations, we use graph learning to study choice in networked contexts. We identify three ways in which graph learning techniques can be used for discrete choice: learning chooser representations, regularizing choice model parameters, and directly constructing predictions from a network. We design methods in each category and test them on real-world choice datasets, including county-level 2016 US election results and Android app installation and usage data. We show that incorporating social network structure can improve the predictions of the standard econometric choice model, the multinomial logit. We provide evidence that app installations are influenced by social context, but we find no such effect on app usage among the same participants, which instead is habit-driven. In the election data, we highlight the additional insights a discrete choice framework provides over classification or regression, the typical approaches. On synthetic data, we demonstrate the sample complexity benefit of using social information in choice models.
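As a rough illustration of two ingredients named above, the sketch below computes multinomial logit choice probabilities (a softmax over item utilities) and a Laplacian-style penalty that pulls the parameters of neighbouring choosers together. The function names, the dictionary-based inputs, and the toy graph are assumptions; the paper's actual models and regularizer may differ in detail.

```python
import math


def mnl_choice_probabilities(utilities):
    """Multinomial logit: P(item) is proportional to exp(utility of item)."""
    shift = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {item: math.exp(u - shift) for item, u in utilities.items()}
    total = sum(exp_u.values())
    return {item: value / total for item, value in exp_u.items()}


def laplacian_penalty(params, edges):
    """Sum of squared parameter differences across edges of a social graph.

    For scalar parameters this equals the quadratic form x^T L x, where L is
    the graph Laplacian, so connected choosers are encouraged to be similar.
    """
    return sum((params[i] - params[j]) ** 2 for i, j in edges)


# Hypothetical utilities for three candidates and a three-person network.
probs = mnl_choice_probabilities({"A": 1.0, "B": 0.5, "C": -0.2})
penalty = laplacian_penalty({"u": 1.0, "v": 3.0, "w": 3.0}, [("u", "v"), ("v", "w")])
```

Adding a multiple of `laplacian_penalty` to the negative log-likelihood is one way to realize "regularizing choice model parameters" with network structure.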

choice model, chooser, laplacian regularization, (15 more...)

doi: 10.1017/nws.2023.20

2205.11365

Country:

North America > United States > California (0.14)
North America > United States > South Dakota > Oglala Lakota County (0.14)
North America > United States > Utah (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Information Technology (1.00)
Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Bencomo, Gianluca M., Snell, Jake C., Griffiths, Thomas L.

Implicit Maximum a Posteriori Filtering via Adaptive Optimization

arXiv.org Machine Learning, Nov-17-2023

Bayesian filtering approximates the true underlying behavior of a time-varying system by inverting an explicit generative model to convert noisy measurements into state estimates. This process typically requires either storage, inversion, and multiplication of large matrices or Monte Carlo estimation, neither of which is practical in high-dimensional state spaces such as the weight spaces of artificial neural networks. Instead of maintaining matrices for the filtering equations or simulating particles, we specify an optimizer that defines the Bayesian filter implicitly. In the linear-Gaussian setting, we show that every Kalman filter has an equivalent formulation using K steps of gradient descent. In the nonlinear setting, our experiments demonstrate that our framework results in filters that are effective, robust, and scalable to high-dimensional systems, comparing well against the standard toolbox of Bayesian filtering solutions. We suggest that it is easier to fine-tune an optimizer than it is to specify the correct filtering equations, making our framework an attractive option for high-dimensional filtering problems.

Time-varying systems are ubiquitous in science, engineering, and machine learning. Consider a multielectrode array receiving raw voltage signals from thousands of neurons during a visual perception task. The goal is to infer some underlying neural state that is not directly observable, such that we can draw connections between neural activity and visual perception, but raw voltage signals are a sparse representation of neural activity that is shrouded in noise. To confound the problem further, the underlying neural state changes throughout time in both expected and unexpected ways. This problem, and most time-varying prediction problems, can be formalized as a probabilistic state space model where latent variables evolve over time and emit observations (Simon, 2006). One solution to such a problem is to apply a Bayesian filter, a type of probabilistic model that can infer the values of latent variables from observations.
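The link between Kalman filtering and gradient descent can be illustrated in the simplest scalar case. The sketch below is not the paper's construction (which concerns a specific K-step equivalence); it only shows that gradient descent on the maximum a posteriori objective for a Gaussian prior and Gaussian observation noise converges to the Kalman update. The function names, step size, and numbers are assumptions.

```python
def kalman_update(prior_mean, prior_var, y, obs_var):
    """Closed-form posterior mean for the model y = x + noise (scalar case)."""
    gain = prior_var / (prior_var + obs_var)
    return prior_mean + gain * (y - prior_mean)


def map_by_gradient_descent(prior_mean, prior_var, y, obs_var, steps=200, lr=0.1):
    """Minimize (y - x)^2 / (2 * obs_var) + (x - prior_mean)^2 / (2 * prior_var)."""
    x = prior_mean  # start the optimizer at the prior mean
    for _ in range(steps):
        grad = (x - y) / obs_var + (x - prior_mean) / prior_var
        x -= lr * grad
    return x


# Hypothetical values: prior N(0, 2), observation y = 1 with noise variance 0.5.
closed_form = kalman_update(0.0, 2.0, 1.0, 0.5)
iterative = map_by_gradient_descent(0.0, 2.0, 1.0, 0.5)
```

With these numbers the objective's curvature is 1/0.5 + 1/2 = 2.5, so a step size of 0.1 contracts the error by a factor of 0.75 per step and the two estimates agree to high precision. No covariance matrix is stored or inverted in the iterative version, which is the practical appeal in high dimensions.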

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

2311.10580

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Hosseini, Bamdad, Hsu, Alexander W., Taghvaei, Amirhossein

Conditional Optimal Transport on Function Spaces

arXiv.org Machine Learning, Nov-17-2023

We present a systematic study of conditional triangular transport maps in function spaces from the perspective of optimal transportation and with a view towards amortized Bayesian inference. More specifically, we develop a theory of constrained optimal transport problems that describe block-triangular Monge maps that characterize conditional measures along with their Kantorovich relaxations. This generalizes the theory of optimal triangular transport to separable infinite-dimensional function spaces with general cost functions. We further tailor our results to the case of Bayesian inference problems and obtain regularity estimates on the conditioning maps from the prior to the posterior. Finally, we present numerical experiments that demonstrate the computational applicability of our theoretical results for amortized and likelihood-free inference of functional parameters.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

2311.05672

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > New York (0.04)
North America > United States > Michigan (0.04)
(3 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)

arXiv.org Machine Learning, Nov-17-2023

Inferential Moments of Uncertain Multivariable Systems

Vanslette, Kevin

This article expands the framework of Bayesian inference and provides direct probabilistic methods for approaching inference tasks that are typically handled with information theory. We treat Bayesian probability updating as a random process and uncover intrinsic quantitative features of joint probability distributions called inferential moments. Inferential moments quantify shape information about how a prior distribution is expected to update in response to yet to be obtained information. Further, we quantify the unique probability distribution whose statistical moments are the inferential moments in question. We find a power series expansion of the mutual information in terms of inferential moments, which implies a connection between inferential theoretic logic and elements of information theory. Of particular interest is the inferential deviation, which is the expected variation of the probability of one variable in response to an inferential update of another. We explore two applications that analyze the inferential deviations of a Bayesian network to improve decision-making. We implement simple greedy algorithms for exploring sensor tasking using inferential deviations that generally outperform similar greedy mutual information algorithms in terms of root mean squared error between epistemic probability estimates and the ground truth probabilities they are estimating.
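The following sketch computes two of the quantities mentioned above for a small discrete joint distribution: the mutual information, and one reading of the "inferential deviation" as the root of the expected squared change in P(a) when the other variable is observed. That reading is an assumption based only on the sentence in the abstract, as are the function names and the table layout (`joint[a][b]`); the paper's formal definitions may differ.

```python
import math


def marginals(joint):
    """Row and column marginals of a joint probability table joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return p_a, p_b


def mutual_information(joint):
    """Mutual information in nats between the row and column variables."""
    p_a, p_b = marginals(joint)
    return sum(
        p * math.log(p / (p_a[i] * p_b[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )


def inferential_deviation(joint, a):
    """Assumed reading: sqrt of E_b[(P(a | b) - P(a))^2]."""
    p_a, p_b = marginals(joint)
    variance = sum(
        p_b[j] * (joint[a][j] / p_b[j] - p_a[a]) ** 2
        for j in range(len(p_b))
        if p_b[j] > 0
    )
    return math.sqrt(variance)


# Hypothetical joint table for two perfectly correlated binary variables.
joint = [[0.5, 0.0], [0.0, 0.5]]
mi = mutual_information(joint)
deviation = inferential_deviation(joint, a=0)
```

For independent variables both quantities are zero, since observing one variable never moves the probability of the other.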

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

2305.01841

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Lopez, L. Julian Lechuga, Rudner, Tim G. J., Shamout, Farah E.

Informative Priors Improve the Reliability of Multimodal Clinical Data Classification

arXiv.org Artificial Intelligence, Nov-16-2023

Machine learning-aided clinical decision support has the potential to significantly improve patient care. However, existing efforts in this domain for principled quantification of uncertainty have largely been limited to applications of ad-hoc solutions that do not consistently improve reliability. In this work, we consider stochastic neural networks and design a tailor-made multimodal data-driven (M2D2) prior distribution over network parameters. We use simple and scalable Gaussian mean-field variational inference to train a Bayesian neural network using the M2D2 prior. We train and evaluate the proposed approach using clinical time-series data in MIMIC-IV and corresponding chest X-ray images in MIMIC-CXR for the classification of acute care conditions. Our empirical results show that the proposed method produces a more reliable predictive model compared to deterministic and Bayesian neural network baselines.
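In Gaussian mean-field variational inference, the prior enters the training objective through a KL divergence between the factorized Gaussian posterior and the prior, which has a closed form when the prior is also Gaussian. The sketch below shows that term only. It does not reproduce the M2D2 prior, whose construction is not given in the abstract; the function names and parameter values are illustrative assumptions.

```python
import math


def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ) for scalar Gaussians."""
    return (
        math.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
        - 0.5
    )


def mean_field_kl(q_params, p_params):
    """Sum of per-parameter KL terms; valid because both sides factorize."""
    return sum(
        gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)
        for (mu_q, sigma_q), (mu_p, sigma_p) in zip(q_params, p_params)
    )


# Hypothetical two-parameter network: (mean, standard deviation) per weight.
posterior = [(0.0, 1.0), (1.0, 1.0)]
prior = [(0.0, 1.0), (0.0, 1.0)]
kl_term = mean_field_kl(posterior, prior)
```

An informative prior changes the `prior` means and standard deviations, and therefore how strongly each weight is pulled toward particular values during training.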

disease 0, neural network, reliability, (15 more...)

2312.00794

Country:

North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Plymouth County > Hanover (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.65)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)