AITopics

2303.01618

Country:

Europe > United Kingdom (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(3 more...)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Artificial IntelligenceMay-7-2023

Robust Multi-agent Communication via Multi-view Message Certification

Yuan, Lei, Jiang, Tao, Li, Lihe, Chen, Feng, Zhang, Zongzhang, Yu, Yang

Many multi-agent scenarios require message sharing among agents to promote coordination, hastening the robustness of multi-agent communication when policies are deployed in a message perturbation environment. Major relevant works tackle this issue under specific assumptions, like a limited number of message channels would sustain perturbations, limiting the efficiency in complex scenarios. In this paper, we take a further step addressing this issue by learning a robust multi-agent communication policy via multi-view message certification, dubbed CroMAC. Agents trained under CroMAC can obtain guaranteed lower bounds on state-action values to identify and choose the optimal action under a worst-case deviation when the received messages are perturbed. Concretely, we first model multi-agent communication as a multi-view problem, where every message stands for a view of the state. Then we extract a certificated joint message representation by a multi-view variational autoencoder (MVAE) that uses a product-of-experts inference network. For the optimization phase, we do perturbations in the latent space of the state for a certificate guarantee. Then the learned joint message representation is used to approximate the certificated state representation during training. Extensive experiments in several cooperative multi-agent benchmarks validate the effectiveness of the proposed CroMAC.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2305.13936

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Takahashi, Kazuki, Fukai, Tomoki, Sakai, Yutaka, Takekawa, Takashi

Goal-oriented inference of environment from redundant observations

arXiv.org Artificial IntelligenceMay-7-2023

The agent learns to organize decision behavior to achieve a behavioral goal, such as reward maximization, and reinforcement learning is often used for this optimization. Learning an optimal behavioral strategy is difficult under the uncertainty that events necessary for learning are only partially observable, called as Partially Observable Markov Decision Process (POMDP). However, the real-world environment also gives many events irrelevant to reward delivery and an optimal behavioral strategy. The conventional methods in POMDP, which attempt to infer transition rules among the entire observations, including irrelevant states, are ineffective in such an environment. Supposing Redundantly Observable Markov Decision Process (ROMDP), here we propose a method for goal-oriented reinforcement learning to efficiently learn state transition rules among reward-related "core states'' from redundant observations. Starting with a small number of initial core states, our model gradually adds new core states to the transition diagram until it achieves an optimal behavioral strategy consistent with the Bellman equation. We demonstrate that the resultant inference model outperforms the conventional method for POMDP. We emphasize that our model only containing the core states has high explainability. Furthermore, the proposed method suits online learning as it suppresses memory consumption and improves learning speed.

artificial intelligence, machine learning, state space, (17 more...)

2305.04432

Country:

Asia > Middle East > Jordan (0.04)
Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)

Genre: Research Report (0.82)

Industry: Education > Educational Setting (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Crothers, Evan, Japkowicz, Nathalie, Viktor, Herna

Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods

arXiv.org Artificial IntelligenceMay-7-2023

Machine generated text is increasingly difficult to distinguish from human authored text. Powerful open-source models are freely available, and user-friendly tools that democratize access to generative models are proliferating. ChatGPT, which was released shortly after the first edition of this survey, epitomizes these trends. The great potential of state-of-the-art natural language generation (NLG) systems is tempered by the multitude of avenues for abuse. Detection of machine generated text is a key countermeasure for reducing abuse of NLG models, with significant technical challenges and numerous open problems. We provide a survey that includes both 1) an extensive analysis of threat models posed by contemporary NLG systems, and 2) the most complete review of machine generated text detection methods to date. This survey places machine generated text within its cybersecurity and social context, and provides strong guidance for future work addressing the most critical threat models, and ensuring detection systems themselves demonstrate trustworthiness through fairness, robustness, and accountability.

large language model, machine learning, natural language, (17 more...)

2210.07321

Country:

Asia > Middle East > Iraq (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada (0.04)
(16 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry:

Media (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

arXiv.org Artificial IntelligenceMay-6-2023

Autonomous Navigation for Robot-assisted Intraluminal and Endovascular Procedures: A Systematic Review

Pore, Ameya, Li, Zhen, Dall'Alba, Diego, Hernansanz, Albert, De Momi, Elena, Menciassi, Arianna, Casals, Alicia, Denkelman, Jenny, Fiorini, Paolo, Poorten, Emmanuel Vander

Increased demand for less invasive procedures has accelerated the adoption of Intraluminal Procedures (IP) and Endovascular Interventions (EI) performed through body lumens and vessels. As navigation through lumens and vessels is quite complex, interest grows to establish autonomous navigation techniques for IP and EI for reaching the target area. Current research efforts are directed toward increasing the Level of Autonomy (LoA) during the navigation phase. One key ingredient for autonomous navigation is Motion Planning (MP) techniques. This paper provides an overview of MP techniques categorizing them based on LoA. Our analysis investigates advances for the different clinical scenarios. Through a systematic literature analysis using the PRISMA method, the study summarizes relevant works and investigates the clinical aim, LoA, adopted MP techniques, and validation types. We identify the limitations of the corresponding MP methods and provide directions to improve the robustness of the algorithms in dynamic intraluminal environments. MP for IP and EI can be classified into four subgroups: node, sampling, optimization, and learning-based techniques, with a notable rise in learning-based approaches in recent years. One of the review's contributions is the identification of the limiting factors in IP and EI robotic systems hindering higher levels of autonomous navigation. In the future, navigation is bound to become more autonomous, placing the clinician in a supervisory position to improve control precision and reduce workload.

machine learning, navigation, reinforcement learning, (20 more...)

doi: 10.1109/TRO.2023.3269384

2305.04027

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
(10 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.45)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Surgery (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(4 more...)

arXiv.org Artificial IntelligenceMay-6-2023

On Exact Sampling in the Two-Variable Fragment of First-Order Logic

Wang, Yuanhong, Pu, Juhua, Wang, Yuyi, Kuželka, Ondřej

In this paper, we study the sampling problem for first-order logic proposed recently by Wang et al. -- how to efficiently sample a model of a given first-order sentence on a finite domain? We extend their result for the universally-quantified subfragment of two-variable logic $\mathbf{FO}^2$ ($\mathbf{UFO}^2$) to the entire fragment of $\mathbf{FO}^2$. Specifically, we prove the domain-liftability under sampling of $\mathbf{FO}^2$, meaning that there exists a sampling algorithm for $\mathbf{FO}^2$ that runs in time polynomial in the domain size. We then further show that this result continues to hold even in the presence of counting constraints, such as $\forall x\exists_{=k} y: \varphi(x,y)$ and $\exists_{=k} x\forall y: \varphi(x,y)$, for some quantifier-free formula $\varphi(x,y)$. Our proposed method is constructive, and the resulting sampling algorithms have potential applications in various areas, including the uniform generation of combinatorial structures and sampling in statistical-relational models such as Markov logic networks and probabilistic logic programs.

artificial intelligence, logic & formal reasoning, machine learning, (21 more...)

2302.0273

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Czechia > Prague (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

Improving Real-Time Bidding in Online Advertising Using Markov Decision Processes and Machine Learning Techniques

Sharma, Parikshit

Real-time bidding has emerged as an effective online advertising technique. With real-time bidding, advertisers can position ads per impression, enabling them to optimise ad campaigns by targeting specific audiences in real-time. This paper proposes a novel method for real-time bidding that combines deep learning and reinforcement learning techniques to enhance the efficiency and precision of the bidding process. In particular, the proposed method employs a deep neural network to predict auction details and market prices and a reinforcement learning algorithm to determine the optimal bid price. The model is trained using historical data from the iPinYou dataset and compared to cutting-edge real-time bidding algorithms. The outcomes demonstrate that the proposed method is preferable regarding cost-effectiveness and precision. In addition, the study investigates the influence of various model parameters on the performance of the proposed algorithm. It offers insights into the efficacy of the combined deep learning and reinforcement learning approach for real-time bidding. This study contributes to advancing techniques and offers a promising direction for future research.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2305.04889

Country:

Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Asia > Singapore (0.04)
Asia > India > Rajasthan (0.04)

Genre: Research Report (0.84)

Industry:

Marketing (1.00)
Banking & Finance > Trading (0.88)
Information Technology > Services (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.42)

Emerson, Harry, Guy, Matthew, McConville, Ryan

Offline Reinforcement Learning for Safer Blood Glucose Control in People with Type 1 Diabetes

The widespread adoption of effective hybrid closed loop systems would represent an important milestone of care for people living with type 1 diabetes (T1D). These devices typically utilise simple control algorithms to select the optimal insulin dose for maintaining blood glucose levels within a healthy range. Online reinforcement learning (RL) has been utilised as a method for further enhancing glucose control in these devices. Previous approaches have been shown to reduce patient risk and improve time spent in the target range when compared to classical control algorithms, but are prone to instability in the learning process, often resulting in the selection of unsafe actions. This work presents an evaluation of offline RL for developing effective dosing policies without the need for potentially dangerous patient interaction during training. This paper examines the utility of BCQ, CQL and TD3-BC in managing the blood glucose of the 30 virtual patients available within the FDA-approved UVA/Padova glucose dynamics simulator. When trained on less than a tenth of the total training samples required by online RL to achieve stable performance, this work shows that offline RL can significantly increase time in the healthy blood glucose range from 61.6 +\- 0.3% to 65.3 +/- 0.5% when compared to the strongest state-of-art baseline (p < 0.001). This is achieved without any associated increase in low blood glucose events. Offline RL is also shown to be able to correct for common and challenging control scenarios such as incorrect bolus dosing, irregular meal timings and compression errors.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2204.03376

Country:

North America > United States (1.00)
Europe > United Kingdom (0.28)
Europe > Switzerland (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Biophysical Cybernetics of Directed Evolution and Eco-evolutionary Dynamics

Bagley, Bryce Allen

Many major questions in the theory of evolutionary dynamics can in a meaningful sense be mapped to analyses of stochastic trajectories in game theoretic contexts. Often the approach is to analyze small numbers of distinct populations and/or to assume dynamics occur within a regime of population sizes large enough that deterministic trajectories are an excellent approximation of reality. The addition of ecological factors, termed "eco-evolutionary dynamics", further complicates the dynamics and results in many problems which are intractable or impractically messy for current theoretical methods. However, an analogous but underexplored approach is to analyze these systems with an eye primarily towards uncertainty in the models themselves. In the language of researchers in Reinforcement Learning and adjacent fields, a Partially Observable Markov Process. Here we introduce a duality which maps the complexity of accounting for both ecology and individual genotypic/phenotypic types onto a problem of accounting solely for underlying information-theoretic computations rather than drawing physical boundaries which do not change the computations. Armed with this equivalence between computation and the relevant biophysics, which we term Taak-duality, we attack the problem of "directed evolution" in the form of a Partially Observable Markov Decision Process. This provides a tractable case of studying eco-evolutionary trajectories of a highly general type, and of analyzing questions of potential limits on the efficiency of evolution in the directed case.

artificial intelligence, machine learning, matrix, (16 more...)

2305.0334

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Haugh, Martin, Singal, Raghav

Counterfactual Analysis in Dynamic Latent State Models

We provide an optimization-based framework to perform counterfactual analysis in a dynamic model with hidden states. Our framework is grounded in the ``abduction, action, and prediction'' approach to answer counterfactual queries and handles two key challenges where (1) the states are hidden and (2) the model is dynamic. Recognizing the lack of knowledge on the underlying causal mechanism and the possibility of infinitely many such mechanisms, we optimize over this space and compute upper and lower bounds on the counterfactual quantity of interest. Our work brings together ideas from causality, state-space models, simulation, and optimization, and we apply it on a breast cancer case study. To the best of our knowledge, we are the first to compute lower and upper bounds on a counterfactual query in a dynamic latent-state model.

artificial intelligence, counterfactual analysis, machine learning, (16 more...)

2205.13832

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Wisconsin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.63)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)