AITopics | taxnodes:Technology: Instructional Materials

Plotting

taxnodes:Technology: Instructional Materials

News Overviews Instructional Materials AI-Alerts Classics

VideoGUI: A Benchmark for GUI Automation from Instructional Videos Kevin Qinghong Lin

Neural Information Processing SystemsMar-23-2025, 01:51:20 GMT

Graphical User Interface (GUI) automation holds significant promise for enhancing human productivity by assisting with computer tasks. Existing task formulations primarily focus on simple tasks that can be specified by a single, language-only instruction, such as "Insert a new slide." In this work, we introduce VideoGUI, a novel multi-modal benchmark designed to evaluate GUI assistants on visual-centric GUI tasks. Sourced from high-quality web instructional videos, our benchmark focuses on tasks involving professional and novel software (e.g., Adobe Photoshop or Stable Diffusion WebUI) and complex activities (e.g., video editing). VideoGUI evaluates GUI assistants through a hierarchical process, allowing for identification of the specific levels at which they may fail: (i) high-level planning: reconstruct procedural subtasks from visual conditions without language descriptions; (ii) middle-level planning: generate sequences of precise action narrations based on visual state (i.e., screenshot) and goals; (iii) atomic action execution: perform specific actions such as accurately clicking designated elements. For each level, we design evaluation metrics across individual dimensions to provide clear signals, such as individual performance in clicking, dragging, typing, and scrolling for atomic action execution. Our evaluation on VideoGUI reveals that even the SoTA large multimodal model GPT4o performs poorly on visual-centric GUI tasks, especially for high-level planning.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia (0.14)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.61)

Industry:

Education > Educational Technology > Audio & Video (0.71)
Education > Educational Technology > Media (0.61)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(3 more...)

Add feedback

f40723ed94042ea9ea36bfb5ad4157b2-Supplemental.pdf

Neural Information Processing SystemsMar-23-2025, 01:29:40 GMT

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Instructional Material > Online (0.40)
Research Report > New Finding (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)

Add feedback

f40723ed94042ea9ea36bfb5ad4157b2-Paper.pdf

Neural Information Processing SystemsMar-23-2025, 01:29:36 GMT

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Instructional Material > Online (0.40)
Research Report > New Finding (0.34)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.96)

Add feedback

5ebbbac62b968254093023f1c95015d3-Paper-Conference.pdf

Neural Information Processing SystemsMar-23-2025, 00:40:42 GMT

machine learning, rehearsal, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: Oceania > New Zealand (0.16)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Online (0.71)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Optimization Algorithm Design via Electric Circuits

Neural Information Processing SystemsMar-23-2025, 00:04:49 GMT

We present a novel methodology for convex optimization algorithm design using ideas from electric RLC circuits. Given an optimization problem, the first stage of the methodology is to design an appropriate electric circuit whose continuoustime dynamics converge to the solution of the optimization problem at hand. Then, the second stage is an automated, computer-assisted discretization of the continuous-time dynamics, yielding a provably convergent discrete-time algorithm. Our methodology recovers many classical (distributed) optimization algorithms and enables users to quickly design and explore a wide range of new algorithms with convergence guarantees.

algorithm, artificial intelligence, survey article, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.27)

Genre:

Research Report > Experimental Study (0.92)
Instructional Material > Course Syllabus & Notes (0.67)
Overview (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

ViVa: Video-Trained Value Functions for Guiding Online RL from Diverse Data

Dashora, Nitish, Ghosh, Dibya, Levine, Sergey

arXiv.org Artificial IntelligenceMar-23-2025

Online reinforcement learning (RL) with sparse rewards poses a challenge partly because of the lack of feedback on states leading to the goal. Furthermore, expert offline data with reward signal is rarely available to provide this feedback and bootstrap online learning. How can we guide online agents to the right solution without this on-task data? Reward shaping offers a solution by providing fine-grained signal to nudge the policy towards the optimal solution. However, reward shaping often requires domain knowledge to hand-engineer heuristics for a specific goal. To enable more general and inexpensive guidance, we propose and analyze a data-driven methodology that automatically guides RL by learning from widely available video data such as Internet recordings, off-task demonstrations, task failures, and undirected environment interaction. By learning a model of optimal goal-conditioned value from diverse passive data, we open the floor to scaling up and using various data sources to model general goal-reaching behaviors relevant to guiding online RL. Specifically, we use intent-conditioned value functions to learn from diverse videos and incorporate these goal-conditioned values into the reward. Our experiments show that video-trained value functions work well with a variety of data sources, exhibit positive transfer from human video pre-training, can generalize to unseen goals, and scale with dataset size.

artificial intelligence, machine learning, reinforcement learning, (11 more...)

arXiv.org Artificial Intelligence

2503.1821

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Instructional Material (0.48)
Research Report (0.42)

Industry: Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Reinforcement Learning-based Self-adaptive Differential Evolution through Automated Landscape Feature Learning

Guo, Hongshu, Ma, Sijie, Huang, Zechuan, Hu, Yuzhi, Ma, Zeyuan, Zhang, Xinglin, Gong, Yue-Jiao

arXiv.org Artificial IntelligenceMar-23-2025

Recently, Meta-Black-Box-Optimization (MetaBBO) methods significantly enhance the performance of traditional black-box optimizers through meta-learning flexible and generalizable meta-level policies that excel in dynamic algorithm configuration (DAC) tasks within the low-level optimization, reducing the expertise required to adapt optimizers for novel optimization tasks. Though promising, existing MetaBBO methods heavily rely on human-crafted feature extraction approach to secure learning effectiveness. To address this issue, this paper introduces a novel MetaBBO method that supports automated feature learning during the meta-learning process, termed as RLDE-AFL, which integrates a learnable feature extraction module into a reinforcement learning-based DE method to learn both the feature encoding and meta-level policy. Specifically, we design an attention-based neural network with mantissa-exponent based embedding to transform the solution populations and corresponding objective values during the low-level optimization into expressive landscape features. We further incorporate a comprehensive algorithm configuration space including diverse DE operators into a reinforcement learning-aided DAC paradigm to unleash the behavior diversity and performance of the proposed RLDE-AFL. Extensive benchmark results show that co-training the proposed feature learning module and DAC policy contributes to the superior optimization performance of RLDE-AFL to several advanced DE methods and recent MetaBBO baselines over both synthetic and realistic BBO scenarios. The source codes of RLDE-AFL are available at https://github.com/GMC-DRL/RLDE-AFL.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2503.18061

Country: Asia > China > Guangdong Province (0.15)

Genre:

Research Report > New Finding (0.48)
Instructional Material > Course Syllabus & Notes (0.34)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Decentralized Navigation of a Cable-Towed Load using Quadrupedal Robot Team via MARL

Chen, Wen-Tse, Nguyen, Minh, Li, Zhongyu, Sue, Guo Ning, Sreenath, Koushil

arXiv.org Artificial IntelligenceMar-23-2025

This work addresses the challenge of enabling a team of quadrupedal robots to collaboratively tow a cable-connected load through cluttered and unstructured environments while avoiding obstacles. Leveraging cables allows the multi-robot system to navigate narrow spaces by maintaining slack when necessary. However, this introduces hybrid physical interactions due to alternating taut and slack states, with computational complexity that scales exponentially as the number of agents increases. To tackle these challenges, we developed a scalable and decentralized system capable of dynamically coordinating a variable number of quadrupedal robots while managing the hybrid physical interactions inherent in the load-towing task. At the core of this system is a novel multi-agent reinforcement learning (MARL)-based planner, designed for decentralized coordination. The MARL-based planner is trained using a centralized training with decentralized execution (CTDE) framework, enabling each robot to make decisions autonomously using only local (ego) observations. To accelerate learning and ensure effective collaboration across varying team sizes, we introduce a tailored training curriculum for MARL. Experimental results highlight the flexibility and scalability of the framework, demonstrating successful deployment with one to four robots in real-world scenarios and up to twelve robots in simulation. The decentralized planner maintains consistent inference times, regardless of the team size. Additionally, the proposed system demonstrates robustness to environment perturbations and adaptability to varying load weights. This work represents a step forward in achieving flexible and efficient multi-legged robotic collaboration in complex and real-world environments.

artificial intelligence, machine learning, robot, (17 more...)

arXiv.org Artificial Intelligence

2503.18221

Country: North America > United States > California (0.14)

Genre:

Research Report > New Finding (0.68)
Instructional Material > Course Syllabus & Notes (0.54)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Add feedback

Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning

Neural Information Processing SystemsMar-22-2025, 19:02:37 GMT

However, existing algorithms and theories for learning near-optimal policies in these two settings are rather different and disconnected. Towards bridging this gap, this paper initiates the theoretical study of policy finetuning, that is, online RL where the learner has additional access to a "reference policy" µ close to the optimal policy π

artificial intelligence, machine learning, reinforcement learning, (11 more...)

Neural Information Processing Systems

Genre: Instructional Material > Online (0.40)

Industry: Leisure & Entertainment > Games (0.92)

Technology: