AITopics

2412.09037

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.05)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology (1.00)
Health & Medicine > Consumer Health (0.46)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)
Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

arXiv.org Artificial IntelligenceDec-12-2024

Reinforcement Learning Within the Classical Robotics Stack: A Case Study in Robot Soccer

Labiosa, Adam, Wang, Zhihan, Agarwal, Siddhant, Cong, William, Hemkumar, Geethika, Harish, Abhinav Narayan, Hong, Benjamin, Kelle, Josh, Li, Chen, Li, Yuhao, Shao, Zisen, Stone, Peter, Hanna, Josiah P.

Robot decision-making in partially observable, real-time, dynamic, and multi-agent environments remains a difficult and unsolved challenge. Model-free reinforcement learning (RL) is a promising approach to learning decision-making in such domains, however, end-to-end RL in complex environments is often intractable. To address this challenge in the RoboCup Standard Platform League (SPL) domain, we developed a novel architecture integrating RL within a classical robotics stack, while employing a multi-fidelity sim2real approach and decomposing behavior into learned sub-behaviors with heuristic selection. Our architecture led to victory in the 2024 RoboCup SPL Challenge Shield Division. In this work, we fully describe our system's architecture and empirically analyze key design decisions that contributed to its success. Our approach demonstrates how RL-based behaviors can be integrated into complete robot behavior architectures.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2412.09417

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks

Gao, Chongkai, Zhang, Haozhuo, Xu, Zhixuan, Cai, Zhehao, Shao, Lin

We aim to develop a model-based planning framework for world models that can be scaled with increasing model and data budgets for general-purpose manipulation tasks with only language and vision inputs. To this end, we present FLow-centric generative Planning (FLIP), a model-based planning algorithm on visual space that features three key modules: 1. a multi-modal flow generation model as the general-purpose action proposal module; 2. a flow-conditioned video generation model as the dynamics module; and 3. a vision-language representation learning model as the value module. Given an initial image and language instruction as the goal, FLIP can progressively search for long-horizon flow and video plans that maximize the discounted return to accomplish the task. FLIP is able to synthesize long-horizon plans across objects, robots, and tasks with image flows as the general action representation, and the dense flow information also provides rich guidance for long-horizon video generation. In addition, the synthesized flow and video plans can guide the training of low-level control policies for robot execution. Experiments on diverse benchmarks demonstrate that FLIP can improve both the success rates and quality of long-horizon video plan synthesis and has the interactive world model property, opening up wider applications for future works.

arxiv preprint arxiv, machine learning, natural language, (17 more...)

2412.08261

Country:

Asia > Singapore (0.04)
Europe > Portugal > Guarda > Guarda (0.04)
Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)

Genre:

Research Report (0.50)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Learn How to Query from Unlabeled Data Streams in Federated Learning

Sun, Yuchang, Li, Xinran, Lin, Tao, Zhang, Jun

Federated learning (FL) enables collaborative learning among decentralized clients while safeguarding the privacy of their local data. Existing studies on FL typically assume offline labeled data available at each client when the training starts. Nevertheless, the training data in practice often arrive at clients in a streaming fashion without ground-truth labels. Given the expensive annotation cost, it is critical to identify a subset of informative samples for labeling on clients. However, selecting samples locally while accommodating the global training objective presents a challenge unique to FL. In this work, we tackle this conundrum by framing the data querying process in FL as a collaborative decentralized decision-making problem and proposing an effective solution named LeaDQ, which leverages multi-agent reinforcement learning algorithms. In particular, under the implicit guidance from global information, LeaDQ effectively learns the local policies for distributed clients and steers them towards selecting samples that can enhance the global model's accuracy. Extensive simulations on image and text tasks show that LeaDQ advances the model performance in various FL scenarios, outperforming the benchmarking algorithms.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2412.08138

Country:

Asia > China > Hong Kong (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Information Technology > Security & Privacy (0.94)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons

Szot, Andrew, Mazoure, Bogdan, Attia, Omar, Timofeev, Aleksei, Agrawal, Harsh, Hjelm, Devon, Gan, Zhe, Kira, Zsolt, Toshev, Alexander

We examine the capability of Multimodal Large Language Models (MLLMs) to tackle diverse domains that extend beyond the traditional language and vision tasks these models are typically trained on. Specifically, our focus lies in areas such as Embodied AI, Games, UI Control, and Planning. To this end, we introduce a process of adapting an MLLM to a Generalist Embodied Agent (GEA). GEA is a single unified model capable of grounding itself across these varied domains through a multi-embodiment action tokenizer. GEA is trained with supervised learning on a large dataset of embodied experiences and with online RL in interactive simulators. We explore the data and algorithmic choices necessary to develop such a model. Our findings reveal the importance of training with cross-domain data and online RL for building generalist agents. The final GEA model achieves strong generalization performance to unseen tasks across diverse benchmarks compared to other generalist models and benchmark-specific approaches.

large language model, machine learning, preprint arxiv, (18 more...)

2412.08442

Country:

North America > United States > Virginia (0.04)
North America > United States > Indiana (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Education (0.93)
Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

The BrowserGym Ecosystem for Web Agent Research

De Chezelles, Thibault Le Sellier, Gasse, Maxime, Drouin, Alexandre, Caccia, Massimo, Boisvert, Léo, Thakkar, Megh, Marty, Tom, Assouel, Rim, Shayegan, Sahar Omidi, Jang, Lawrence Keunho, Lù, Xing Han, Yoran, Ori, Kong, Dehan, Xu, Frank F., Reddy, Siva, Cappart, Quentin, Neubig, Graham, Salakhutdinov, Ruslan, Chapados, Nicolas, Lacoste, Alexandre

The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents, particularly those leveraging automation and Large Language Models (LLMs) for web interaction tasks. Many existing benchmarks suffer from fragmentation and inconsistent evaluation methodologies, making it challenging to achieve reliable comparisons and reproducible results. BrowserGym aims to solve this by providing a unified, gym-like environment with well-defined observation and action spaces, facilitating standardized evaluation across diverse benchmarks. Combined with AgentLab, a complementary framework that aids in agent creation, testing, and analysis, BrowserGym offers flexibility for integrating new benchmarks while ensuring consistent evaluation and comprehensive experiment management. This standardized approach seeks to reduce the time and complexity of developing web agents, supporting more reliable comparisons and facilitating in-depth analysis of agent behaviors, and could result in more adaptable, capable agents, ultimately accelerating innovation in LLM-driven automation. As a supporting evidence, we conduct the first large-scale, multi-benchmark web agent experiment and compare the performance of 6 state-of-the-art LLMs across all benchmarks currently available in BrowserGym. Among other findings, our results highlight a large discrepancy between OpenAI and Anthropic's latests models, with Claude-3.5-Sonnet leading the way on almost all benchmarks, except on vision-related tasks where GPT-4o is superior. Despite these advancements, our results emphasize that building robust and efficient web agents remains a significant challenge, due to the inherent complexity of real-world web environments and the limitations of current models.

large language model, machine learning, natural language, (19 more...)

2412.05467

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(5 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (0.54)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Robust Markov Decision Processes: A Place Where AI and Formal Methods Meet

Suilen, Marnix, Badings, Thom, Bovy, Eline M., Parker, David, Jansen, Nils

Markov decision processes (MDPs) are a standard model for sequential decision-making problems and are widely used across many scientific areas, including formal methods and artificial intelligence (AI). MDPs do, however, come with the restrictive assumption that the transition probabilities need to be precisely known. Robust MDPs (RMDPs) overcome this assumption by instead defining the transition probabilities to belong to some uncertainty set. We present a gentle survey on RMDPs, providing a tutorial covering their fundamentals. In particular, we discuss RMDP semantics and how to solve them by extending standard MDP methods such as value iteration and policy iteration. We also discuss how RMDPs relate to other models and how they are used in several contexts, including reinforcement learning and abstraction techniques. We conclude with some challenges for future work on RMDPs. Keywords: Robust Markov decision processes Dynamic programming Formal verification Reinforcement learning.

artificial intelligence, machine learning, optimization problem, (18 more...)

2411.11451

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(4 more...)

Genre:

Overview (0.68)
Instructional Material > Course Syllabus & Notes (0.52)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Towards Predictive Communication with Brain-Computer Interfaces integrating Large Language Models

Caria, Andrea

This perspective article aims at providing an outline of the state of the art and future developments towards the integration of cutting-edge predictive language models with BCI. A synthetic overview of early and more recent linguistic models, from natural language processing (NLP) models to recent LLM, that to a varying extent improved predictive writing systems, is first provided. Second, a summary of previous BCI implementations integrating language models is presented. The few preliminary studies investigating the possible combination of LLM with BCI spellers to efficiently support fast communication and control are then described. Finally, current challenges and limitations towards the full integration of LLM with BCI systems are discussed. Recent investigations suggest that the combination of LLM with BCI might drastically improve human-computer interaction in patients with motor or language disorders as well as in healthy individuals. In particular, the pretrained autoregressive transformer models, such as GPT, that capitalize from parallelization, learning through pre-training and fine-tuning, promise a substantial improvement of BCI for communication with respect to previous systems incorporating simpler language models. Indeed, among various models, the GPT-2 was shown to represent an excellent candidate for its integration into BCI although testing was only perfomed on simulated conversations and not on real BCI scenarios. Prospectively, the full integration of LLM with advanced BCI systems might lead to a big leap forward towards fast, efficient and user-adaptive neurotechnology.

large language model, machine learning, natural language, (16 more...)

2412.07355

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Non-Normal Diffusion Models

Li, Henry

Diffusion models generate samples by incrementally reversing a process that turns data into noise. We show that when the step size goes to zero, the reversed process is invariant to the distribution of these increments. This reveals a previously unconsidered parameter in the design of diffusion models: the distribution of the diffusion step $\Delta x_k := x_{k} - x_{k + 1}$. This parameter is implicitly set by default to be normally distributed in most diffusion models. By lifting this assumption, we generalize the framework for designing diffusion models and establish an expanded class of diffusion processes with greater flexibility in the choice of loss function used during training. We demonstrate the effectiveness of these models on density estimation and generative modeling tasks on standard image datasets, and show that different choices of the distribution of $\Delta x_k$ result in qualitatively different generated samples.

artificial intelligence, convergence, machine learning, (13 more...)

2412.07935

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Scarlatos, Alexander, Baker, Ryan S., Lan, Andrew

Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs

Recent advances in large language models (LLMs) have led to the development of artificial intelligence (AI)-powered tutoring chatbots, showing promise in providing broad access to high-quality personalized education. Existing works have studied how to make LLMs follow tutoring principles, but have not studied broader uses of LLMs for supporting tutoring. Up until now, tracing student knowledge and analyzing misconceptions has been difficult and time-consuming to implement for open-ended dialogue tutoring. In this work, we investigate whether LLMs can be supportive of this task: we first use LLM prompting methods to identify the knowledge components/skills involved in each dialogue turn, i.e., a tutor utterance posing a task or a student utterance that responds to it. We also evaluate whether the student responds correctly to the tutor and verify the LLM's accuracy using human expert annotations. We then apply a range of knowledge tracing (KT) methods on the resulting labeled data to track student knowledge levels over an entire dialogue. We conduct experiments on two tutoring dialogue datasets, and show that a novel yet simple LLM-based method, LLMKT, significantly outperforms existing KT methods in predicting student response correctness in dialogues. We perform extensive qualitative analyses to highlight the challenges in dialogueKT and outline multiple avenues for future work.

large language model, machine learning, natural language, (17 more...)

2409.1649

Country:

Europe > Switzerland (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.93)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)