human prompt
UniSkill: Imitating Human Videos via Cross-Embodiment Skill Representations
Kim, Hanjung, Kang, Jaehyun, Kang, Hyolim, Cho, Meedeum, Kim, Seon Joo, Lee, Youngwoon
Learning from human videos has emerged as a central paradigm in robot learning, offering a scalable approach to the scarcity of robot-specific data by leveraging large, diverse video sources. Human videos contain everyday behaviors such as human-object interactions, which could provide a rich source of skills for robot learning. Here, a central question arises: Can robots acquire cross-embodiment skill representations by watching large-scale human demonstrations? Translating human videos into robot-executable skill representations has traditionally relied on paired human-robot datasets [1, 2, 3] or predefined semantic skill labels [4, 5], both of which are difficult to scale. Recent approaches aim to bypass these requirements by learning cross-embodiment skill representations without explicit pairing or labeling [6, 7, 8, 9, 10]. However, these methods still impose constraints on data collection, such as multi-view camera setups, and task and scene alignment between human and robot demonstrations, which limit their scalability and applicability to real-world, in-the-wild human videos. To this end, we propose Universal Skill representations (UniSkill), a scalable approach for learning cross-embodiment skill representations from large-scale in-the-wild video data so that a robot can translate an unseen human demonstration into a sequence of robot-executable skill representations, as illustrated in Figure 1.
- North America > United States (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
Generative AI-Driven High-Fidelity Human Motion Simulation
Iyer, Hari, Macwan, Neel, Hude, Atharva Jitendra, Jeong, Heejin, Guo, Shenghan
Human motion simulation (HMS) supports cost-effective evaluation of worker behavior, safety, and productivity in industrial tasks. However, existing methods often suffer from low motion fidelity. This study introduces Generative-AI-Enabled HMS (G-AI-HMS), which integrates text-to-text and text-to-motion models to enhance simulation quality for physical tasks. G-AI-HMS tackles two key challenges: (1) translating task descriptions into motion-aware language using Large Language Models aligned with MotionGPT's training vocabulary, and (2) validating AI-enhanced motions against real human movements using computer vision. Posture estimation algorithms are applied to real-time videos to extract joint landmarks, and motion similarity metrics are used to compare them with AI-enhanced sequences. In a case study involving eight tasks, the AI-enhanced motions showed lower error than human-created descriptions in most scenarios, performing better in six tasks based on spatial accuracy, four tasks based on alignment after pose normalization, and seven tasks based on overall temporal similarity. Statistical analysis showed that AI-enhanced prompts significantly (p < 0.0001) reduced joint error and temporal misalignment while retaining comparable posture accuracy.
- North America > United States > Arizona (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- Europe > Switzerland (0.04)
PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization
Wang, Xinyuan, Li, Chenxi, Wang, Zhen, Bai, Fan, Luo, Haotian, Zhang, Jiayou, Jojic, Nebojsa, Xing, Eric P., Hu, Zhiting
Highly effective, task-specific prompts are often heavily engineered by experts to integrate detailed instructions and domain insights based on a deep understanding of both instincts of large language models (LLMs) and the intricacies of the target task. However, automating the generation of such expert-level prompts remains elusive. Existing prompt optimization methods tend to overlook the depth of domain knowledge and struggle to efficiently explore the vast space of expert-level prompts. Addressing this, we present PromptAgent, an optimization method that autonomously crafts prompts equivalent in quality to those handcrafted by experts. At its core, PromptAgent views prompt optimization as a strategic planning problem and employs a principled planning algorithm, rooted in Monte Carlo tree search, to strategically navigate the expert-level prompt space. Inspired by human-like trial-and-error exploration, PromptAgent induces precise expert-level insights and in-depth instructions by reflecting on model errors and generating constructive error feedback. Such a novel framework allows the agent to iteratively examine intermediate prompts (states), refine them based on error feedback (actions), simulate future rewards, and search for high-reward paths leading to expert prompts. We apply PromptAgent to 12 tasks spanning three practical domains: BIG-Bench Hard (BBH), as well as domain-specific and general NLP tasks, showing it significantly outperforms strong Chain-of-Thought and recent prompt optimization baselines. Extensive analyses emphasize its capability to craft expert-level, detailed, and domain-insightful prompts with great efficiency and generalizability.
- Workflow (0.67)
- Research Report (0.64)
- Overview (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
If AI Could Help You Take Control Of Your Life, Would You Let It? - Liwaiwai
Are you ready to experiment with using AI to take control of your workday? We all know that technology is constantly evolving and AI is no exception. In this essay collection, we're going to explore the potential of using AI to reduce your workload and increase your productivity. But keep in mind, this is an experiment, and we can't guarantee that it will work for everyone. We'll take you through the process of finding the right tasks to assign to AI, and explore the ethical considerations of reducing your work hours.
New AI Development So Advanced It's Too Dangerous To Release, Says Scientists
A group of scientists at OpenAI, a nonprofit research company supported by Elon Musk, has raised some red flags by developing an advanced AI they say is too dangerous to be released. For many years, machine learning systems have struggled with human language. Progress has been a long time coming: remember SmarterChild from the early 2000s? While it could answer simple questions, the AIM bot usually answered with "I'm sorry I do not understand the question." However, with new methods for analyzing text, AI can now answer like a human with little indication that it is a program.
Scientists Developed an AI So Advanced They Say It's Too Dangerous to Release
A group of computer scientists once backed by Elon Musk has caused some alarm by developing an advanced artificial intelligence (AI) they say is too dangerous to release to the public. OpenAI, a research non-profit based in San Francisco, says its "chameleon-like" language prediction system, called GPT–2, will only ever see a limited release in a scaled-down version, due to "concerns about malicious applications of the technology". That's because the computer model, which generates original paragraphs of text based on what it is given to 'read', is a little too good at its job. The system devises "synthetic text samples of unprecedented quality" that the researchers say are so advanced and convincing, the AI could be used to create fake news, impersonate people, and abuse or trick people on social media. "GPT–2 is trained with a simple objective: predict the next word, given all of the previous words within some text," the OpenAI team explains on its blog.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)