AITopics | tamp

Collaborating Authors

tamp

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models

Teymoorianfard, Mohammadreza, Sitaraman, Siddarth, Ma, Shiqing, Houmansadr, Amir

arXiv.org Artificial IntelligenceNov-21-2025

Video diffusion models can generate realistic and temporally consistent videos. This raises concerns about provenance, ownership, and integrity. Watermarking can help address these issues by embedding metadata directly into the content. To work well, a watermark needs enough capacity for meaningful metadata. It must also stay imperceptible and remain robust to common video manipulations. Existing methods struggle with limited capacity, extra inference cost, or reduced visual quality. We introduce VidStamp, a watermarking framework that embeds frame-level messages through the decoder of a latent video diffusion model. The decoder is fine-tuned in two stages. The first stage uses static image datasets to encourage spatial message separation. The second stage uses synthesized video sequences to restore temporal consistency. This approach enables high-capacity watermarks with minimal perceptual impact. VidStamp also supports dynamic watermarking through a control signal that selects message templates during inference. This adds flexibility and creates a second channel for communication. We evaluate VidStamp on Stable Video Diffusion (I2V), OpenSora, and Wan (T2V). The system embeds 48 bits per frame while preserving visual quality and staying robust to common distortions. Compared with VideoSeal, VideoShield, and RivaGAN, it achieves lower log P-values and stronger detectability. Its frame-wise watermarking design also enables precise temporal tamper localization, with an accuracy of 0.96, which exceeds the VideoShield baseline. Code: https://github.com/SPIN-UMass/VidStamp

artificial intelligence, diffusion model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.01406

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation

Zhou, Zihan, Garg, Animesh, Fox, Dieter, Garrett, Caelan, Mandlekar, Ajay

arXiv.org Artificial IntelligenceOct-23-2024

Robot learning has proven to be a general and effective technique for programming manipulators. Imitation learning is able to teach robots solely from human demonstrations but is bottlenecked by the capabilities of the demonstrations. Reinforcement learning uses exploration to discover better behaviors; however, the space of possible improvements can be too large to start from scratch. And for both techniques, the learning difficulty increases proportional to the length of the manipulation task. Accounting for this, we propose SPIRE, a system that first uses Task and Motion Planning (TAMP) to decompose tasks into smaller learning subproblems and second combines imitation and reinforcement learning to maximize their strengths. We develop novel strategies to train learning agents when deployed in the context of a planning system. We evaluate SPIRE on a suite of long-horizon and contact-rich robot manipulation problems. We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance, is 6 times more data efficient in the number of human demonstrations needed to train proficient agents, and learns to complete tasks nearly twice as efficiently. View https://sites.google.com/view/spire-corl-2024 for more details.

handoff section, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2410.18065

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Imaging foundation model for universal enhancement of non-ideal measurement CT

Liu, Yuxin, Ge, Rongjun, He, Yuting, Wu, Zhan, You, Chenyu, Li, Shuo, Chen, Yang

arXiv.org Artificial IntelligenceOct-2-2024

Non-ideal measurement computed tomography (NICT), which sacrifices optimal imaging standards for new advantages in CT imaging, is expanding the clinical application scope of CT images. However, with the reduction of imaging standards, the image quality has also been reduced, extremely limiting the clinical acceptability. Although numerous studies have demonstrated the feasibility of deep learning for the NICT enhancement in specific scenarios, their high data cost and limited generalizability have become large obstacles. The recent research on the foundation model has brought new opportunities for building a universal NICT enhancement model - bridging the image quality degradation with minimal data cost. However, owing to the challenges in the collection of large pre-training datasets and the compatibility of data variation, no success has been reported. In this paper, we propose a multi-scale integrated Transformer AMPlifier (TAMP), the first imaging foundation model for universal NICT enhancement. It has been pre-trained on a large-scale physical-driven simulation dataset with 3.6 million NICT-ICT image pairs, and is able to directly generalize to the NICT enhancement tasks with various non-ideal settings and body regions. Via the adaptation with few data, it can further achieve professional performance in real-world specific scenarios. Our extensive experiments have demonstrated that the proposed TAMP has significant potential for promoting the exploration and application of NICT and serving a wider range of medical scenarios.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.01591

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Bootstrapping Object-level Planning with Large Language Models

Paulius, David, Agostini, Alejandro, Quartey, Benedict, Konidaris, George

arXiv.org Artificial IntelligenceSep-18-2024

We introduce a new method that extracts knowledge from a large language model (LLM) to produce object-level plans, which describe high-level changes to object state, and uses them to bootstrap task and motion planning (TAMP) in a hierarchical manner. Existing works use LLMs to either directly output task plans or to generate goals in representations like PDDL. However, these methods fall short because they either rely on the LLM to do the actual planning or output a hard-to-satisfy goal. Our approach instead extracts knowledge from a LLM in the form of plan schemas as an object level representation called functional object-oriented networks (FOON), from which we automatically generate PDDL subgoals. Our experiments demonstrate how our method's performance markedly exceeds alternative planning strategies across several tasks in simulation.

language model, llm, robot, (15 more...)

arXiv.org Artificial Intelligence

2409.12262

Country:

Europe > Austria > Tyrol > Innsbruck (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Anticipatory Task and Motion Planning

Dhakal, Roshan, Nguyen, Duc M., Silver, Tom, Xiao, Xuesu, Stein, Gregory J.

arXiv.org Artificial IntelligenceJul-18-2024

We consider a sequential task and motion planning (tamp) setting in which a robot is assigned continuous-space rearrangement-style tasks one-at-a-time in an environment that persists between each. Lacking advance knowledge of future tasks, existing (myopic) planning strategies unwittingly introduce side effects that impede completion of subsequent tasks: e.g., by blocking future access or manipulation. We present anticipatory task and motion planning, in which estimates of expected future cost from a learned model inform selection of plans generated by a model-based tamp planner so as to avoid such side effects, choosing configurations of the environment that both complete the task and minimize overall cost. Simulated multi-task deployments in navigation-among-movable-obstacles and cabinet-loading domains yield improvements of 32.7% and 16.7% average per-task cost respectively. When given time in advance to prepare the environment, our learning-augmented planning approach yields improvements of 83.1% and 22.3%. Both showcase the value of our approach. Finally, we also demonstrate anticipatory tamp on a real-world Fetch mobile manipulator.

robot, sequence, task and motion planning, (14 more...)

arXiv.org Artificial Intelligence

2407.13694

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)

Add feedback

A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches

Zhao, Zhigen, Cheng, Shuo, Ding, Yan, Zhou, Ziyi, Zhang, Shiqi, Xu, Danfei, Zhao, Ye

arXiv.org Artificial IntelligenceJun-30-2024

Task and Motion Planning (TAMP) integrates high-level task planning and low-level motion planning to equip robots with the autonomy to effectively reason over long-horizon, dynamic tasks. Optimization-based TAMP focuses on hybrid optimization approaches that define goal conditions via objective functions and are capable of handling open-ended goals, robotic dynamics, and physical interaction between the robot and the environment. Therefore, optimization-based TAMP is particularly suited to solve highly complex, contact-rich locomotion and manipulation problems. This survey provides a comprehensive review on optimization-based TAMP, covering (i) planning domain representations, including action description languages and temporal logic, (ii) individual solution strategies for components of TAMP, including AI planning and trajectory optimization (TO), and (iii) the dynamic interplay between logic-based task planning and model-based TO. A particular focus of this survey is to highlight the algorithm structures to efficiently solve TAMP, especially hierarchical and distributed approaches. Additionally, the survey emphasizes the synergy between the classical methods and contemporary learning-based innovations such as large language models. Furthermore, the future research directions for TAMP is discussed in this survey, highlighting both algorithmic and application-specific challenges.

international conference, motion planning, tamp, (16 more...)

arXiv.org Artificial Intelligence

2404.02817

Country:

North America > United States > New York > Broome County > Binghamton (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > Massachusetts > Middlesex County > Natick (0.04)
(2 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)

Industry:

Energy (0.46)
Health & Medicine (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.67)

Add feedback

Task and Motion Planning for Execution in the Real

Pan, Tianyang, Shome, Rahul, Kavraki, Lydia E.

arXiv.org Artificial IntelligenceJun-13-2024

Task and motion planning represents a powerful set of hybrid planning methods that combine reasoning over discrete task domains and continuous motion generation. Traditional reasoning necessitates task domain models and enough information to ground actions to motion planning queries. Gaps in this knowledge often arise from sources like occlusion or imprecise modeling. This work generates task and motion plans that include actions cannot be fully grounded at planning time. During execution, such an action is handled by a provided human-designed or learned closed-loop behavior. Execution combines offline planned motions and online behaviors till reaching the task goal. Failures of behaviors are fed back as constraints to find new plans. Forty real-robot trials and motivating demonstrations are performed to evaluate the proposed framework and compare against state-of-the-art. Results show faster execution time, less number of actions, and more success in problems where diverse gaps arise. The experiment data is shared for researchers to simulate these settings. The work shows promise in expanding the applicable class of realistic partially grounded problems that robots can address.

constraint, execution, international conference, (12 more...)

arXiv.org Artificial Intelligence

2406.03641

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Energy (0.66)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)

Add feedback

k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text

Hou, Abe Bohan, Zhang, Jingyu, Wang, Yichen, Khashabi, Daniel, He, Tianxing

arXiv.org Artificial IntelligenceJun-8-2024

Recent watermarked generation algorithms inject detectable signatures during language generation to facilitate post-hoc detection. While token-level watermarks are vulnerable to paraphrase attacks, SemStamp (Hou et al., 2023) applies watermark on the semantic representation of sentences and demonstrates promising robustness. SemStamp employs locality-sensitive hashing (LSH) to partition the semantic space with arbitrary hyperplanes, which results in a suboptimal tradeoff between robustness and speed. We propose k-SemStamp, a simple yet effective enhancement of SemStamp, utilizing k-means clustering as an alternative of LSH to partition the embedding space with awareness of inherent semantic structure. Experimental results indicate that k-SemStamp saliently improves its robustness and sampling efficiency while preserving the generation quality, advancing a more effective tool for machine-generated text detection.

robustness, tamp, watermark, (15 more...)

arXiv.org Artificial Intelligence

2402.11399

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
Africa > Middle East > Egypt (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.57)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)
(2 more...)

Add feedback

D-LGP: Dynamic Logic-Geometric Program for Combined Task and Motion Planning

Xue, Teng, Razmjoo, Amirreza, Calinon, Sylvain

arXiv.org Artificial IntelligenceDec-5-2023

Many real-world sequential manipulation tasks involve a combination of discrete symbolic search and continuous motion planning, collectively known as combined task and motion planning (TAMP). However, prevailing methods often struggle with the computational burden and intricate combinatorial challenges stemming from the multitude of action skeletons. To address this, we propose Dynamic Logic-Geometric Program (D-LGP), a novel approach integrating Dynamic Tree Search and global optimization for efficient hybrid planning. Through empirical evaluation on three benchmarks, we demonstrate the efficacy of our approach, showcasing superior performance in comparison to state-of-the-art techniques. We validate our approach through simulation and demonstrate its capability for online replanning under uncertainty and external disturbances in the real world.

configuration, motion planning, optimization, (15 more...)

arXiv.org Artificial Intelligence

2312.02731

Country: Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

Imitating Task and Motion Planning with Visuomotor Transformers

Dalal, Murtaza, Mandlekar, Ajay, Garrett, Caelan, Handa, Ankur, Salakhutdinov, Ruslan, Fox, Dieter

arXiv.org Artificial IntelligenceOct-17-2023

Imitation learning is a powerful tool for training robot manipulation policies, allowing them to learn from expert demonstrations without manual programming or trial-and-error. However, common methods of data collection, such as human supervision, scale poorly, as they are time-consuming and labor-intensive. In contrast, Task and Motion Planning (TAMP) can autonomously generate large-scale datasets of diverse demonstrations. In this work, we show that the combination of large-scale datasets generated by TAMP supervisors and flexible Transformer models to fit them is a powerful paradigm for robot manipulation. To that end, we present a novel imitation learning system called OPTIMUS that trains large-scale visuomotor Transformer policies by imitating a TAMP agent. OPTIMUS introduces a pipeline for generating TAMP data that is specifically curated for imitation learning and can be used to train performant transformer-based policies. In this paper, we present a thorough study of the design decisions required to imitate TAMP and demonstrate that OPTIMUS can solve a wide variety of challenging vision-based manipulation tasks with over 70 different objects, ranging from long-horizon pick-and-place tasks, to shelf and articulated object manipulation, achieving 70 to 80% success rates. Video results and code at https://mihdalal.github.io/optimus/

arxiv preprint arxiv, demonstration, optimus, (14 more...)

arXiv.org Artificial Intelligence

2305.16309

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Germany > Berlin (0.04)

Genre:

Research Report (0.64)
Instructional Material > Course Syllabus & Notes (0.54)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback