AITopics

2411.14263

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain > Andalusia > Seville Province > Seville (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (0.89)
Government > Military (0.71)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceNov-21-2024

Learning to Cooperate with Humans using Generative Agents

Liang, Yancheng, Chen, Daphne, Gupta, Abhishek, Du, Simon S., Jaques, Natasha

Training agents that can coordinate zero-shot with humans is a key mission in multi-agent reinforcement learning (MARL). Current algorithms focus on training simulated human partner policies which are then used to train a Cooperator agent. The simulated human is produced either through behavior cloning over a dataset of human cooperation behavior, or by using MARL to create a population of simulated agents. However, these approaches often struggle to produce a Cooperator that can coordinate well with real humans, since the simulated humans fail to cover the diverse strategies and styles employed by people in the real world. We show \emph{learning a generative model of human partners} can effectively address this issue. Our model learns a latent variable representation of the human that can be regarded as encoding the human's unique strategy, intention, experience, or style. This generative model can be flexibly trained from any (human or neural policy) agent interaction data. By sampling from the latent space, we can use the generative model to produce different partners to train Cooperator agents. We evaluate our method -- \textbf{G}enerative \textbf{A}gent \textbf{M}odeling for \textbf{M}ulti-agent \textbf{A}daptation (GAMMA) -- on Overcooked, a challenging cooperative cooking game that has become a standard benchmark for zero-shot coordination. We conduct an evaluation with real human teammates, and the results show that GAMMA consistently improves performance, whether the generative model is trained on simulated populations or human datasets. Further, we propose a method for posterior sampling from the generative model that is biased towards the human data, enabling us to efficiently improve performance with only a small amount of expensive human interaction data.

agent, generative model, human data, (16 more...)

2411.13934

Country:

Asia > China > Jiangsu Province > Yancheng (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.87)
Research Report > Experimental Study (0.68)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.88)

arXiv.org Artificial IntelligenceNov-20-2024

Rethinking the Intermediate Features in Adversarial Attacks: Misleading Robotic Models via Adversarial Distillation

Zhao, Ke, Huang, Huayang, Li, Miao, Wu, Yu

Language-conditioned robotic learning has significantly enhanced robot adaptability by enabling a single model to execute diverse tasks in response to verbal commands. Despite these advancements, security vulnerabilities within this domain remain largely unexplored. This paper addresses this gap by proposing a novel adversarial prompt attack tailored to language-conditioned robotic models. Our approach involves crafting a universal adversarial prefix that induces the model to perform unintended actions when added to any original prompt. We demonstrate that existing adversarial techniques exhibit limited effectiveness when directly transferred to the robotic domain due to the inherent robustness of discretized robotic action spaces. To overcome this challenge, we propose to optimize adversarial prefixes based on continuous action representations, circumventing the discretization process. Additionally, we identify the beneficial impact of intermediate features on adversarial attacks and leverage the negative gradient of intermediate self-attention features to further enhance attack efficacy. Extensive experiments on VIMA models across 13 robot manipulation tasks validate the superiority of our method over existing approaches and demonstrate its transferability across different model variants.

adversarial attack, artificial intelligence, machine learning, (15 more...)

2411.15222

Country:

North America > United States > California > San Mateo County > Menlo Park (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Machine LearningNov-20-2024

No Free Delivery Service: Epistemic limits of passive data collection in complex social systems

Nickel, Maximilian

Rapid model validation via the train-test paradigm has been a key driver for the breathtaking progress in machine learning and AI. However, modern AI systems often depend on a combination of tasks and data collection practices that violate all assumptions ensuring test validity. Yet, without rigorous model validation we cannot ensure the intended outcomes of deployed AI systems, including positive social impact, nor continue to advance AI research in a scientifically sound way. In this paper, I will show that for widely considered inference settings in complex social systems the train-test paradigm does not only lack a justification but is indeed invalid for any risk estimator, including counterfactual and causal estimators, with high probability. These formal impossibility results highlight a fundamental epistemic issue, i.e., that for key tasks in modern AI we cannot know whether models are valid under current data collection practices. Importantly, this includes variants of both recommender systems and reasoning via large language models, and neither na\"ive scaling nor limited benchmarks are suited to address this issue. I am illustrating these results via the widely used MovieLens benchmark and conclude by discussing the implications of these results for AI in social systems, including possible remedies such as participatory data curation and open science.

possible world, social system, validity, (17 more...)

2411.13653

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
(6 more...)

Genre: Research Report (0.50)

Industry: Retail (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Artificial IntelligenceNov-19-2024

Adaptively Controllable Diffusion Model for Efficient Conditional Image Generation

Xing, Yucheng, Liu, Xiaodong, Wang, Xin

With the development of artificial intelligence, more and more attention has been put onto generative models, which represent the creativity, a very important aspect of intelligence. In recent years, diffusion models have been studied and proven to be more reasonable and effective than previous methods. However, common diffusion frameworks suffer from controllability problems. Although extra conditions have been considered by some work to guide the diffusion process for a specific target generation, it only controls the generation result but not its process. In this work, we propose a new adaptive framework, $\textit{Adaptively Controllable Diffusion (AC-Diff) Model}$, to automatically and fully control the generation process, including not only the type of generation result but also the length and parameters of the generation process. Both inputs and conditions will be first fed into a $\textit{Conditional Time-Step (CTS) Module}$ to determine the number of steps needed for a generation. Then according to the length of the process, the diffusion rate parameters will be estimated through our $\textit{Adaptive Hybrid Noise Schedule (AHNS) Module}$. We further train the network with the corresponding adaptive sampling mechanism to learn how to adjust itself according to the conditions for the overall performance improvement. To enable its practical applications, AC-Diff is expected to largely reduce the average number of generation steps and execution time while maintaining the same performance as done in the literature diffusion models.

artificial intelligence, diffusion model, machine learning, (14 more...)

2411.15199

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre:

Research Report (0.50)
Workflow (0.46)
Instructional Material (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine LearningNov-19-2024

Probably Approximately Precision and Recall Learning

Cohen, Lee, Mansour, Yishay, Moran, Shay, Shao, Han

Precision and Recall are foundational metrics in machine learning where both accurate predictions and comprehensive coverage are essential, such as in recommender systems and multi-label learning. In these tasks, balancing precision (the proportion of relevant items among those predicted) and recall (the proportion of relevant items successfully predicted) is crucial. A key challenge is that one-sided feedback--where only positive examples are observed during training--is inherent in many practical problems. For instance, in recommender systems like YouTube, training data only consists of videos that a user has actively selected, while unselected items remain unseen. Despite this lack of negative feedback in training, avoiding undesirable recommendations at test time is essential. We introduce a PAC learning framework where each hypothesis is represented by a graph, with edges indicating positive interactions, such as between users and items. This framework subsumes the classical binary and multi-class PAC learning models as well as multi-label learning with partial feedback, where only a single random correct label per example is observed, rather than all correct labels. Our work uncovers a rich statistical and algorithmic landscape, with nuanced boundaries on what can and cannot be learned. Notably, classical methods like Empirical Risk Minimization fail in this setting, even for simple hypothesis classes with only two hypotheses. To address these challenges, we develop novel algorithms that learn exclusively from positive data, effectively minimizing both precision and recall losses. Specifically, in the realizable setting, we design algorithms that achieve optimal sample complexity guarantees. In the agnostic case, we show that it is impossible to achieve additive error guarantees--as is standard in PAC learning--and instead obtain meaningful multiplicative approximations.

graph, precision, precision loss, (16 more...)

2411.13029

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > Canada > Quebec > Montreal (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.89)

Sadhukhan, Tathagata, Paul, Manit, Dwivedi, Raaz

On adaptivity and minimax optimality of two-sided nearest neighbors

arXiv.org Machine LearningNov-19-2024

Nearest neighbor (NN) algorithms have been extensively used for missing data problems in recommender systems and sequential decision-making systems. Prior theoretical analysis has established favorable guarantees for NN when the underlying data is sufficiently smooth and the missingness probabilities are lower bounded. Here we analyze NN with non-smooth non-linear functions with vast amounts of missingness. In particular, we consider matrix completion settings where the entries of the underlying matrix follow a latent non-linear factor model, with the non-linearity belonging to a \Holder function class that is less smooth than Lipschitz. Our results establish following favorable properties for a suitable two-sided NN: (1) The mean squared error (MSE) of NN adapts to the smoothness of the non-linearity, (2) under certain regularity conditions, the NN error rate matches the rate obtained by an oracle equipped with the knowledge of both the row and column latent factors, and finally (3) NN's MSE is non-trivial for a wide range of settings even when several matrix entries might be missing deterministically. We support our theoretical findings via extensive numerical simulations and a case study with data from a mobile health study, HeartSteps.

algorithm, col, neighbor, (12 more...)

2411.12965

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.62)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.41)

arXiv.org Artificial IntelligenceNov-18-2024

Reviving Dormant Memories: Investigating Catastrophic Forgetting in Language Models through Rationale-Guidance Difficulty

Sun, Huashan, Gao, Yang

Although substantial efforts have been made to mitigate catastrophic forgetting in continual learning, the intrinsic mechanisms are not well understood. In this paper, we discover that when a forgetting model passively receives an externally provided partial appropriate rationale, its performance on the forgotten task can be restored. Furthermore, by simply adding a task-agnostic prefix to the original instruction, the forgetting model can actively generate an appropriate rationale to reach the correct answer. These findings suggest that the model does not actually ``forget'' the task knowledge; instead, the degraded performance can be attributed to the failure of the original instructions in guiding the model to generate the appropriate rationales. Based on this insight, we propose the Rationale-Guidance Difficulty metric to evaluate how effectively a given instruction guides the model in generating appropriate rationales. We apply this metric to optimize the allocation of replay data in replay-based continual learning algorithm. Experimental results demonstrate that our data allocation method effectively mitigates catastrophic forgetting and maintains better model plasticity simultaneously across models.

large language model, machine learning, natural language, (17 more...)

2411.11932

Country:

Europe > Austria > Vienna (0.14)
Asia > Singapore (0.05)
Asia > Thailand > Bangkok > Bangkok (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Kurle, Richard, Klushyn, Alexej, Herbrich, Ralf

BALI: Learning Neural Networks via Bayesian Layerwise Inference

arXiv.org Machine LearningNov-18-2024

We introduce a new method for learning Bayesian neural networks, treating them as a stack of multivariate Bayesian linear regression models. The main idea is to infer the layerwise posterior exactly if we know the target outputs of each layer. We define these pseudo-targets as the layer outputs from the forward pass, updated by the backpropagated gradients of the objective function. The resulting layerwise posterior is a matrix-normal distribution with a Kronecker-factorized covariance matrix, which can be efficiently inverted. Our method extends to the stochastic mini-batch setting using an exponential moving average over natural-parameter terms, thus gradually forgetting older data. The method converges in few iterations and performs as well as or better than leading Bayesian neural network methods on various regression, classification, and out-of-distribution detection benchmarks.

international conference, natural parameter, posterior, (16 more...)

2411.12102

Country:

Asia > Indonesia > Bali (0.45)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(7 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Champion, Théophile, Grześ, Marek, Bowman, Howard

Structure learning with Temporal Gaussian Mixture for model-based Reinforcement Learning

arXiv.org Machine LearningNov-18-2024

Model-based reinforcement learning refers to a set of approaches capable of sample-efficient decision making, which create an explicit model of the environment. This model can subsequently be used for learning optimal policies. In this paper, we propose a temporal Gaussian Mixture Model composed of a perception model and a transition model. The perception model extracts discrete (latent) states from continuous observations using a variational Gaussian mixture likelihood. Importantly, our model constantly monitors the collected data searching for new Gaussian components, i.e., the perception model performs a form of structure learning (Smith et al., 2020; Friston et al., 2018; Neacsu et al., 2022) as it learns the number of Gaussian components in the mixture. Additionally, the transition model learns the temporal transition between consecutive time steps by taking advantage of the Dirichlet-categorical conjugacy. Both the perception and transition models are able to forget part of the data points, while integrating the information they provide within the prior, which ensure fast variational inference. Finally, decision making is performed with a variant of Q-learning which is able to learn Q-values from beliefs over states. Empirically, we have demonstrated the model's ability to learn the structure of several mazes: the model discovered the number of states and the transition probabilities between these states. Moreover, using its learned Q-values, the agent was able to successfully navigate from the starting position to the maze's exit.

agent, equation, update equation, (16 more...)

2411.11511

Country:

Europe > United Kingdom (0.14)
South America > Chile > Valparaíso Region > Valparaíso Province > Valparaíso (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)