Think-Then-React: Towards Unconstrained Human Action-to-Reaction Generation

Tan, Wenhui, Li, Boyuan, Jin, Chuhao, Huang, Wenbing, Wang, Xiting, Song, Ruihua

arXiv.org Artificial Intelligence

Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China

Figure 1: Given a human action as input, our Think-Then-React model first thinks by generating an action description and reasoning out a reaction prompt. It then reacts to the action based on the results of this thinking process. TTR reacts in real time at every timestep and periodically re-thinks at fixed intervals (every two timesteps in the illustration) to mitigate accumulated errors.

Modeling human-like action-to-reaction generation has significant real-world applications, such as human-robot interaction and games. Despite recent advances in single-person motion generation, action-to-reaction generation remains challenging, due to the difficulty of directly predicting a reaction from an action sequence without prompts and the absence of a unified representation that effectively encodes multi-person motion. To address these challenges, we introduce Think-Then-React (TTR), a large-language-model-based framework designed to generate human-like reactions. First, with our fine-grained multimodal training strategy, TTR unifies two processes during inference: a thinking process that explicitly infers action intentions and reasons out a corresponding reaction description, which serves as a semantic prompt, and a reacting process that predicts reactions based on the input action and the inferred semantic prompt. Second, to effectively represent multi-person motion in language models, we propose a unified motion tokenizer that decouples egocentric pose and absolute space features, representing action and reaction motion with the same encoding.
Extensive experiments demonstrate that TTR outperforms existing baselines, achieving significant improvements in evaluation metrics, such as reducing FID from 3.988 to 1.942. Predicting human reactions to human actions in real-world scenarios is an online and unconstrained task, i.e., future states and text prompts are inaccessible, and it has broad applications in virtual reality, human-robot interaction, and gaming. Furthermore, Large Language Models (LLMs) have been applied to human motion generation, demonstrating superior performance (Jiang et al., 2023; Zhang et al., 2024).
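The think-then-react loop described in the abstract can be sketched as below. The `think`/`react_step` interface and the stub model are illustrative assumptions for exposition, not the authors' actual API.

```python
# Hypothetical sketch of Think-Then-React inference: react at every
# timestep, and periodically re-think to refresh the semantic prompt.

RETHINK_INTERVAL = 2  # re-think every two timesteps, as in Figure 1


def generate_reaction(model, action_tokens):
    """Generate reaction tokens online, re-thinking periodically."""
    reaction_tokens = []
    prompt = None
    for t, _ in enumerate(action_tokens):
        if t % RETHINK_INTERVAL == 0:
            # Thinking: infer an action description and reason out a
            # reaction prompt from the action observed so far.
            prompt = model.think(action_tokens[: t + 1])
        # Reacting: predict the next reaction token from the input
        # action prefix and the inferred semantic prompt.
        reaction_tokens.append(
            model.react_step(action_tokens[: t + 1], reaction_tokens, prompt)
        )
    return reaction_tokens


class StubTTR:
    """Toy stand-in for the (hypothetical) TTR model interface."""

    def think(self, action_prefix):
        # Pretend prompt; a real model would generate text here.
        return f"prompt@{len(action_prefix)}"

    def react_step(self, action_prefix, reaction_so_far, prompt):
        # Pretend reaction token conditioned on the latest action token.
        return (prompt, action_prefix[-1])
```

Note how the prompt computed at one re-think step is reused until the next one, which is what lets the model react at every timestep without paying the thinking cost each step.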


Constructing Behavior Trees from Temporal Plans for Robotic Applications

Zapf, Josh, Roveri, Marco, Martin, Francisco, Manzanares, Juan Carlos

arXiv.org Artificial Intelligence

Executing temporal plans in the real and open world requires adapting to uncertainty both in the environment and in the plan actions. A plan executor must therefore be flexible enough to dispatch actions based on the actual execution conditions. In general, this involves considering both event-based and time-based constraints between the actions in the plan. A simple temporal network (STN) is a convenient framework for specifying the constraints between actions in the plan. Likewise, a behavior tree (BT) is a convenient framework for controlling the execution flow of the actions in the plan. The principal contributions of this paper are i) an algorithm for transforming a plan into an STN, and ii) an algorithm for transforming an STN into a BT. When combined, these algorithms define a systematic approach for executing total-order (time-triggered) plans in robots operating in the real world. Our approach is based on creating a graph describing a deordered (state-triggered) plan and then creating a BT representing a partial-order (determined at runtime) plan. This approach ensures the correct execution of plans, including those with required concurrency. We demonstrate the validity of our approach within the PlanSys2 framework on real robots.
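As a toy illustration of the plan-to-STN-to-BT pipeline the abstract outlines: the data structures and the ordering rule below are simplifying assumptions for exposition, not the paper's actual algorithms. Actions separated by a positive time gap become an ordered Sequence; actions scheduled at the same time (required concurrency) are grouped under a Parallel node.

```python
# Simplified sketch: time-triggered plan -> STN edges -> behavior tree.


def plan_to_stn(timed_plan):
    """Build simple temporal constraints from a time-triggered plan.

    timed_plan: list of (action, start_time) pairs in dispatch order.
    Returns {(a, b): gap} meaning action b starts `gap` after action a.
    """
    return {
        (a, b): tb - ta
        for (a, ta), (b, tb) in zip(timed_plan, timed_plan[1:])
    }


def stn_to_bt(timed_plan, edges):
    """Derive a BT (nested tuples) from the STN constraints.

    A zero gap marks required concurrency, so those actions are grouped
    into a Parallel child; everything else is sequenced in order.
    """
    children = []
    group = [timed_plan[0][0]]
    for (a, _), (b, _) in zip(timed_plan, timed_plan[1:]):
        if edges[(a, b)] == 0:
            group.append(b)  # must run concurrently with the group
        else:
            children.append(group[0] if len(group) == 1 else ("Parallel", group))
            group = [b]
    children.append(group[0] if len(group) == 1 else ("Parallel", group))
    return ("Sequence", children)
```

For example, a plan where `grasp` and `signal` are dispatched at the same instant yields a Sequence whose middle child is a Parallel node over those two actions.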


Google DeepMind's new generative model makes Super Mario–like games from scratch

MIT Technology Review

Genie often adds this effect to the games it generates. While Genie is an in-house research project and won't be released, Guzdial notes that the Google DeepMind team says it could one day be turned into a game-making tool--something he's working on too. "I'm definitely interested to see what they build," he says.