AITopics

2311.15691

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Virginia (0.04)
North America > United States > Oregon (0.04)

Genre: Research Report > New Finding (0.47)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Law > Statutes (0.88)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Bigelow, Eric J., Lubana, Ekdeep Singh, Dick, Robert P., Tanaka, Hidenori, Ullman, Tomer D.

In-Context Learning Dynamics with Random Binary Sequences

arXiv.org Artificial IntelligenceNov-27-2023

Large language models (LLMs) trained on huge corpora of text datasets demonstrate intriguing capabilities, achieving state-of-the-art performance on tasks they were not explicitly trained for. The precise nature of LLM capabilities is often mysterious, and different prompts can elicit different capabilities through in-context learning. We propose a framework that enables us to analyze in-context learning dynamics to understand latent concepts underlying LLMs' behavioral patterns. This provides a more nuanced understanding than success-or-failure evaluation benchmarks, but does not require observing internal activations as a mechanistic interpretation of circuits would. Inspired by the cognitive science of human randomness perception, we use random binary sequences as context and study dynamics of in-context learning by manipulating properties of context data, such as sequence length. In the latest GPT-3.5+ models, we find emergent abilities to generate seemingly random numbers and learn basic formal languages, with striking in-context learning dynamics where model outputs transition sharply from seemingly random behaviors to deterministic repetition.

arxiv preprint arxiv, probability, sequence, (14 more...)

2310.17639

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Education > Curriculum > Subject-Specific Education (0.46)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Artificial IntelligenceNov-27-2023

Emerging Trends in Federated Learning: From Model Fusion to Federated X Learning

Ji, Shaoxiong, Tan, Yue, Saravirta, Teemu, Yang, Zhiqin, Vasankari, Lauri, Pan, Shirui, Long, Guodong, Walid, Anwar

Federated learning is a new learning paradigm that decouples data collection and model training via multi-party computation and model aggregation. As a flexible learning setting, federated learning has the potential to integrate with other learning frameworks. We conduct a focused survey of federated learning in conjunction with other learning algorithms. Specifically, we explore various learning algorithms to improve the vanilla federated averaging algorithm and review model fusion methods such as adaptive aggregation, regularization, clustered methods, and Bayesian methods. Following the emerging trends, we also discuss federated learning in the intersection with other learning paradigms, termed federated X learning, where X includes multitask learning, meta-learning, transfer learning, unsupervised learning, and reinforcement learning. This survey reviews the state of the art, challenges, and future directions.

federated learning, international conference, learning, (15 more...)

2102.1292

Country:

Europe > Finland > Uusimaa > Helsinki (0.04)
North America > United States (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
(2 more...)

Genre: Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)
Education (0.67)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.89)
(4 more...)

Frauen, Dennis, Imrie, Fergus, Curth, Alicia, Melnychuk, Valentyn, Feuerriegel, Stefan, van der Schaar, Mihaela

A Neural Framework for Generalized Causal Sensitivity Analysis

Unobserved confounding is common in many applications, making causal inference from observational data challenging. As a remedy, causal sensitivity analysis is an important tool to draw causal conclusions under unobserved confounding with mathematical guarantees. In this paper, we propose NeuralCSA, a neural framework for generalized causal sensitivity analysis. Unlike previous work, our framework is compatible with (i) a large class of sensitivity models, including the marginal sensitivity model, f-sensitivity models, and Rosenbaum's sensitivity model; (ii) different treatment types (i.e., binary and continuous); and (iii) different causal queries, including (conditional) average treatment effects and simultaneous effects on multiple outcomes. The generality of \frameworkname is achieved by learning a latent distribution shift that corresponds to a treatment intervention using two conditional normalizing flows. We provide theoretical guarantees that NeuralCSA is able to infer valid bounds on the causal query of interest and also demonstrate this empirically using both simulated and real-world data.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2311.16026

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Bayesian Approach to Linear Bayesian Networks

Hwang, Seyong, Lee, Kyoungjae, Oh, Sunmin, Park, Gunwoong

This study proposes the first Bayesian approach for learning high-dimensional linear Bayesian networks. The proposed approach iteratively estimates each element of the topological ordering from backward and its parent using the inverse of a partial covariance matrix. The proposed method successfully recovers the underlying structure when Bayesian regularization for the inverse covariance matrix with unequal shrinkage is applied. Specifically, it shows that the number of samples $n = \Omega( d_M^2 \log p)$ and $n = \Omega(d_M^2 p^{2/m})$ are sufficient for the proposed algorithm to learn linear Bayesian networks with sub-Gaussian and 4m-th bounded-moment error distributions, respectively, where $p$ is the number of nodes and $d_M$ is the maximum degree of the moralized graph. The theoretical findings are supported by extensive simulation studies including real data analysis. Furthermore the proposed method is demonstrated to outperform state-of-the-art frequentist approaches, such as the BHLSM, LISTEN, and TD algorithms in synthetic data.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2311.1561

Country:

Asia > South Korea > Seoul > Seoul (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Spain > Canary Islands (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Nickelsen, Daniel, Bah, Bubacarr

Improved identification accuracy in equation learning via comprehensive $\boldsymbol{R^2}$-elimination and Bayesian model selection

In the field of equation learning, exhaustively considering all possible equations derived from a basis function dictionary is infeasible. Sparse regression and greedy algorithms have emerged as popular approaches to tackle this challenge. However, the presence of multicollinearity poses difficulties for sparse regression techniques, and greedy steps may inadvertently exclude terms of the true equation, leading to reduced identification accuracy. In this article, we present an approach that strikes a balance between comprehensiveness and efficiency in equation learning. Inspired by stepwise regression, our approach combines the coefficient of determination, $R^2$, and the Bayesian model evidence, $p(\boldsymbol y|\mathcal M)$, in a novel way. Our procedure is characterized by a comprehensive search with just a minor reduction of the model space at each iteration step. With two flavors of our approach and the adoption of $p(\boldsymbol y|\mathcal M)$ for bi-directional stepwise regression, we present a total of three new avenues for equation learning. Through three extensive numerical experiments involving random polynomials and dynamical systems, we compare our approach against four state-of-the-art methods and two standard approaches. The results demonstrate that our comprehensive search approach surpasses all other methods in terms of identification accuracy. In particular, the second flavor of our approach establishes an efficient overfitting penalty solely based on $R^2$, which achieves highest rates of exact equation recovery.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2311.13265

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Germany (0.04)
Africa > The Gambia (0.04)
Africa > South Africa > Western Cape > Cape Town (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

An efficient likelihood-free Bayesian inference method based on sequential neural posterior estimation

Xiong, Yifei, Yang, Xiliang, Zhang, Sanguo, He, Zhijian

Sequential neural posterior estimation (SNPE) techniques have been recently proposed for dealing with simulation-based models with intractable likelihoods. Unlike approximate Bayesian computation, SNPE techniques learn the posterior from sequential simulation using neural network-based conditional density estimators by minimizing a specific loss function. The SNPE method proposed by Lueckmann et al. (2017) used a calibration kernel to boost the sample weights around the observed data, resulting in a concentrated loss function. However, the use of calibration kernels may increase the variances of both the empirical loss and its gradient, making the training inefficient. To improve the stability of SNPE, this paper proposes to use an adaptive calibration kernel and several variance reduction techniques. The proposed method greatly speeds up the process of training, and provides a better approximation of the posterior than the original SNPE method and some existing competitors as confirmed by numerical experiments.

artificial intelligence, bayesian inference, machine learning, (15 more...)

2311.1253

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

arXiv.org Artificial IntelligenceNov-26-2023

A Generic Stochastic Hybrid Car-following Model Based on Approximate Bayesian Computation

Jiang, Jiwan, Zhou, Yang, Wang, Xin, Ahn, Soyoung

Car following (CF) models are fundamental to describing traffic dynamics. However, the CF behavior of human drivers is highly stochastic and nonlinear. As a result, identifying the "best" CF model has been challenging and controversial despite decades of research. Introduction of automated vehicles has further complicated this matter as their CF controllers remain proprietary, though their behavior appears different than human drivers. This paper develops a stochastic learning approach to integrate multiple CF models, rather than relying on a single model. The framework is based on approximate Bayesian computation that probabilistically concatenates a pool of CF models based on their relative likelihood of describing observed behavior. The approach, while data-driven, retains physical tractability and interpretability. Evaluation results using two datasets show that the proposed approach can better reproduce vehicle trajectories for both human-driven and automated vehicles than any single CF model considered.

cf model, hybrid model, particle, (12 more...)

2312.10042

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Massachusetts (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Passenger (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)

Mathew, Sunil, Sra, Jasbir, Rowe, Daniel B.

Generation of patient specific cardiac chamber models using generative neural networks under a Bayesian framework for electroanatomical mapping

arXiv.org Artificial IntelligenceNov-26-2023

Electroanatomical mapping is a technique used in cardiology to create a detailed 3D map of the electrical activity in the heart. It is useful for diagnosis, treatment planning and real time guidance in cardiac ablation procedures to treat arrhythmias like atrial fibrillation. A probabilistic machine learning model trained on a library of CT/MRI scans of the heart can be used during electroanatomical mapping to generate a patient-specific 3D model of the chamber being mapped. The use of probabilistic machine learning models under a Bayesian framework provides a way to quantify uncertainty in results and provide a natural framework of interpretability of the model. Here we introduce a Bayesian approach to surface reconstruction of cardiac chamber models from a sparse 3D point cloud data acquired during electroanatomical mapping. We show how probabilistic graphical models trained on segmented CT/MRI data can be used to generate cardiac chamber models from few acquired locations thereby reducing procedure time and x-ray exposure. We show how they provide insight into what the neural network learns from the segmented CT/MRI images used to train the network, which provides explainability to the resulting cardiac chamber models generated by the model.

cardiac chamber model, neural network, patient specific cardiac chamber model, (13 more...)

2311.16197

Country:

North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
Asia > Middle East > Jordan (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceNov-26-2023

Ultra-Range Gesture Recognition using an RGB Camera in Human-Robot Interaction

Bamani, Eran, Nissinman, Eden, Meir, Inbar, Koenigsberg, Lisa, Sintov, Avishai

Hand gestures play a significant role in human interactions where non-verbal intentions, thoughts and commands are conveyed. In Human-Robot Interaction (HRI), hand gestures offer a similar and efficient medium for conveying clear and rapid directives to a robotic agent. However, state-of-the-art vision-based methods for gesture recognition have been shown to be effective only up to a user-camera distance of seven meters. Such a short distance range limits practical HRI with, for example, service robots, search and rescue robots and drones. In this work, we address the Ultra-Range Gesture Recognition (URGR) problem by aiming for a recognition distance of up to 25 meters and in the context of HRI. We propose a novel deep-learning framework for URGR using solely a simple RGB camera. First, a novel super-resolution model termed HQ-Net is used to enhance the low-resolution image of the user. Then, we propose a novel URGR classifier termed Graph Vision Transformer (GViT) which takes the enhanced image as input. GViT combines the benefits of a Graph Convolutional Network (GCN) and a modified Vision Transformer (ViT). Evaluation of the proposed framework over diverse test data yields a high recognition rate of 98.1%. The framework has also exhibited superior performance compared to human recognition in ultra-range distances. With the framework, we analyze and demonstrate the performance of an autonomous quadruped robot directed by human gestures in complex ultra-range indoor and outdoor environments.

gesture recognition, model certainty, recognition, (14 more...)

2311.15361

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
South America > Ecuador (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision > Gesture Recognition (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)