AITopics

2306.12027

Country: Asia > Indonesia (0.05)

Genre: Research Report (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Web (1.00)
Information Technology > Data Science > Data Mining > Web Mining (0.96)
(3 more...)

Reward Shaping via Diffusion Process in Reinforcement Learning

Kumar, Peeyush

In this article, I take inspiration from stochastic thermodynamics to derive a problem formulation for online learning in uncertain MDPs while grounded in system dynamics. The system balances the diffusion process with drif dynamics as a way to formulate the explorationexploitation trade-off. To this effect, I make an explicit link between the information entropy and the stochastic dynamics of a system coupled to an environment. I analyze various sources of entropy production: due to the decision-maker's uncertainty about the system-environment interaction characteristics; due to the stochastic nature of system dynamics; and the interaction of the decision maker's knowledge with system dynamics. This analysis provides a framework that can be formulated either as a maximum entropy program to derive efficient policies that balance the exploration and exploitation trade-off, or as a modified cost optimization program that includes informational costs and benefits.

artificial intelligence, information, machine learning, (19 more...)

2306.11885

Country:

North America > United States (0.93)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.93)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine LearningJun-20-2023

Conditional Independence Testing with Heteroskedastic Data and Applications to Causal Discovery

Günther, Wiebke, Ninad, Urmi, Wahl, jonas, Runge, Jakob

Conditional independence (CI) testing is frequently used in data analysis and machine learning for various scientific fields and it forms the basis of constraint-based causal discovery. Oftentimes, CI testing relies on strong, rather unrealistic assumptions. One of these assumptions is homoskedasticity, in other words, a constant conditional variance is assumed. We frame heteroskedasticity in a structural causal model framework and present an adaptation of the partial correlation CI test that works well in the presence of heteroskedastic noise, given that expert knowledge about the heteroskedastic relationships is available. Further, we provide theoretical consistency results for the proposed CI test which carry over to causal discovery under certain assumptions. Numerical causal discovery experiments demonstrate that the adapted partial correlation CI test outperforms the standard test in the presence of heteroskedasticity and is on par for the homoskedastic case. Finally, we discuss the general challenges and limits as to how expert knowledge about heteroskedasticity can be accounted for in causal discovery.

ci test, heteroskedasticity, pc algorithm, (12 more...)

arXiv.org Machine Learning

2306.11498

Country:

Europe > Germany > Berlin (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Spain > Canary Islands (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Statistical mechanics of continual learning: variational principle and mean-field potential

Li, Chan, Huang, Zhenye, Zou, Wenxuan, Huang, Haiping

An obstacle to artificial general intelligence is set by continual learning of multiple tasks of different nature. Recently, various heuristic tricks, both from machine learning and from neuroscience angles, were proposed, but they lack a unified theory ground. Here, we focus on continual learning in single-layered and multi-layered neural networks of binary weights. A variational Bayesian learning setting is thus proposed, where the neural networks are trained in a field-space, rather than gradient-ill-defined discrete-weight space, and furthermore, weight uncertainty is naturally incorporated, and modulates synaptic resources among tasks. From a physics perspective, we translate the variational continual learning into Franz-Parisi thermodynamic potential framework, where previous task knowledge acts as a prior and a reference as well. We thus interpret the continual learning of the binary perceptron in a teacher-student setting as a Franz-Parisi potential computation. The learning performance can then be analytically studied with mean-field order parameters, whose predictions coincide with numerical experiments using stochastic gradient descent methods. Based on the variational principle and Gaussian field approximation of internal preactivations in hidden layers, we also derive the learning algorithm considering weight uncertainty, which solves the continual learning with binary weights using multi-layered neural networks, and performs better than the currently available metaplasticity algorithm where binary synapses bear hidden continuous states and the synaptic plasticity is modulated by a heuristic regularization function. Our proposed principled frameworks also connect to elastic weight consolidation, weight-uncertainty modulated learning, and neuroscience inspired metaplasticity, providing a theory-grounded method for the real-world multi-task learning with deep networks.

artificial intelligence, bayesian inference, machine learning, (19 more...)

doi: 10.1103/PhysRevE.108.014309

2212.02846

Country:

Europe > France (0.04)
Asia > Singapore (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Education (0.92)
Health & Medicine > Therapeutic Area > Neurology (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Li, Depeng, Zeng, Zhigang

Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning

In the scenario of class-incremental learning (CIL), deep neural networks have to adapt their model parameters to non-stationary data distributions, e.g., the emergence of new classes over time. However, CIL models are challenged by the well-known catastrophic forgetting phenomenon. Typical methods such as rehearsal-based ones rely on storing exemplars of old classes to mitigate catastrophic forgetting, which limits real-world applications considering memory resources and privacy issues. In this paper, we propose a novel rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks. Our approach involves jointly optimizing a plastic CNN feature extractor and an analytical feed-forward classifier. The inaccessibility of historical data is tackled by holistically controlling the parameters of a well-trained model, ensuring that the decision boundary learned fits new classes while retaining recognition of previously learned classes. Specifically, the trainable CNN feature extractor provides task-dependent knowledge separately without interference; and the final classifier integrates task-specific knowledge incrementally for decision-making without forgetting. In each CIL session, it accommodates new tasks by attaching a tiny set of declarative parameters to its backbone, in which only one matrix per task or one vector per class is kept for knowledge retention. Extensive experiments on a variety of task sequences show that our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness. Furthermore, to make the non-growing backbone (i.e., a model with limited network capacity) suffice to train on more incoming tasks, a graceful forgetting implementation on previously learned trivial tasks is empirically investigated.

artificial intelligence, deep learning, machine learning, (18 more...)

2306.11967

Country:

North America > United States > Massachusetts (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (1.00)

Industry:

Education (1.00)
Information Technology > Security & Privacy (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Voelcker, Claas, Liao, Victor, Garg, Animesh, Farahmand, Amir-massoud

Value Gradient weighted Model-Based Reinforcement Learning

Model-based reinforcement learning (MBRL) is a sample efficient technique to obtain control policies, yet unavoidable modeling errors often lead performance deterioration. The model in MBRL is often solely fitted to reconstruct dynamics, state observations in particular, while the impact of model error on the policy is not captured by the training objective. This leads to a mismatch between the intended goal of MBRL, enabling good policy and value learning, and the target of the loss function employed in practice, future state prediction. Naive intuition would suggest that value-aware model learning would fix this problem and, indeed, several solutions to this objective mismatch problem have been proposed based on theoretical analysis. However, they tend to be inferior in practice to commonly used maximum likelihood (MLE) based approaches. In this paper we propose the Value-gradient weighted Model Learning (VaGraM), a novel method for value-aware model learning which improves the performance of MBRL in challenging settings, such as small model capacity and the presence of distracting state dimensions. We analyze both MLE and value-aware approaches and demonstrate how they fail to account for exploration and the behavior of function approximation when learning value-aware models and highlight the additional goals that must be met to stabilize optimization in the deep learning setting. We verify our analysis by showing that our loss function is able to achieve high returns on the Mujoco benchmark suite while being more robust than maximum likelihood based approaches.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2204.01464

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
(2 more...)

A Survey on Safety-Critical Driving Scenario Generation -- A Methodological Perspective

Ding, Wenhao, Xu, Chejian, Arief, Mansur, Lin, Haohong, Li, Bo, Zhao, Ding

Autonomous driving systems have witnessed a significant development during the past years thanks to the advance in machine learning-enabled sensing and decision-making algorithms. One critical challenge for their massive deployment in the real world is their safety evaluation. Most existing driving systems are still trained and evaluated on naturalistic scenarios collected from daily life or heuristically-generated adversarial ones. However, the large population of cars, in general, leads to an extremely low collision rate, indicating that the safety-critical scenarios are rare in the collected real-world data. Thus, methods to artificially generate scenarios become crucial to measure the risk and reduce the cost. In this survey, we focus on the algorithms of safety-critical scenario generation in autonomous driving. We first provide a comprehensive taxonomy of existing algorithms by dividing them into three categories: data-driven generation, adversarial generation, and knowledge-based generation. Then, we discuss useful tools for scenario generation, including simulation platforms and packages. Finally, we extend our discussion to five main challenges of current works -- fidelity, efficiency, diversity, transferability, controllability -- and research opportunities lighted up by these challenges.

artificial intelligence, deep learning, machine learning, (17 more...)

2202.02215

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California (0.14)
North America > United States > Illinois > Champaign County > Urbana (0.14)
(5 more...)

Genre: Overview (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Radev, Stefan T., Schmitt, Marvin, Pratz, Valentin, Picchini, Umberto, Köthe, Ullrich, Bürkner, Paul-Christian

JANA: Jointly Amortized Neural Approximation of Complex Bayesian Models

Neural networks trained on model simulations enable amortized inference: A pre-trained network can be stored and re-used for Bayesian inference on millions of data sets (von This work proposes "jointly amortized neural Krause et al., 2022). Crucially, most previous neural approaches approximation" (JANA) of intractable likelihood have tackled either SM or SBI in isolation, but little functions and posterior densities arising in attention has been paid to learning both tasks simultaneously. Bayesian surrogate modeling and simulation-based To address this gap, we propose JANA ("Jointly Amortized inference. We train three complementary networks Neural Approximation"), a Bayesian neural framework for in an end-to-end fashion: 1) a summary network simultaneously amortized SM and SBI, and show how it enables to compress individual data points, sets, or time novel solutions to challenging downstream tasks like series into informative embedding vectors; 2) a posterior the estimation of marginal and posterior predictive distributions network to learn an amortized approximate (see Figure 1). JANA also presents a major qualitative posterior; and 3) a likelihood network to learn an upgrade to the BayesFlow framework (Radev et al., 2020), amortized approximate likelihood. Their interaction which was originally designed for amortized SBI alone.

artificial intelligence, calibration, machine learning, (12 more...)

2302.09125

Country:

Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Towards an Improved Understanding of Software Vulnerability Assessment Using Data-Driven Approaches

Le, Triet H. M.

The thesis advances the field of software security by providing knowledge and automation support for software vulnerability assessment using data-driven approaches. Software vulnerability assessment provides important and multifaceted information to prevent and mitigate dangerous cyber-attacks in the wild. The key contributions include a systematisation of knowledge, along with a suite of novel data-driven techniques and practical recommendations for researchers and practitioners in the area. The thesis results help improve the understanding and inform the practice of assessing ever-increasing vulnerabilities in real-world software systems. This in turn enables more thorough and timely fixing prioritisation and planning of these critical security issues.

data mining, machine learning, programming language, (24 more...)

2207.11708

Country:

Asia > China (0.04)
North America > United States > Hawaii (0.04)
North America > United States > Maryland > Baltimore (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.87)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Quality (1.00)
(15 more...)

Abdulsamad, Hany, Peters, Jan

Model-Based Reinforcement Learning via Stochastic Hybrid Models

Optimal control of general nonlinear systems is a central challenge in automation. Enabled by powerful function approximators, data-driven approaches to control have recently successfully tackled challenging applications. However, such methods often obscure the structure of dynamics and control behind black-box over-parameterized representations, thus limiting our ability to understand closed-loop behavior. This paper adopts a hybrid-system view of nonlinear modeling and control that lends an explicit hierarchical structure to the problem and breaks down complex dynamics into simpler localized units. We consider a sequence modeling paradigm that captures the temporal structure of the data and derive an expectation-maximization (EM) algorithm that automatically decomposes nonlinear dynamics into stochastic piecewise affine models with nonlinear transition boundaries. Furthermore, we show that these time-series models naturally admit a closed-loop extension that we use to extract local polynomial feedback controllers from nonlinear experts via behavioral cloning. Finally, we introduce a novel hybrid relative entropy policy search (Hb-REPS) technique that incorporates the hierarchical nature of hybrid models and optimizes a set of time-invariant piecewise feedback controllers derived from a piecewise polynomial approximation of a global state-value function.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2111.06211

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre:

Overview (0.46)
Research Report (0.40)

Industry: Energy > Renewable (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)