AITopics

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Neural Information Processing SystemsDec-24-2025, 01:43:09 GMT

Exploring Social Posterior Collapse in Variational Autoencoder for Interaction Modeling

Multi-agent behavior modeling and trajectory forecasting are crucial for the safe navigation of autonomous agents in interactive scenarios. Variational Autoencoder (VAE) has been widely applied in multi-agent interaction modeling to generate diverse behavior and learn a low-dimensional representation for interacting systems. However, existing literature did not formally discuss if a VAE-based model can properly encode interaction into its latent space. In this work, we argue that one of the typical formulations of VAEs in multi-agent modeling suffers from an issue we refer to as social posterior collapse, i.e., the model is prone to ignoring historical social context when predicting the future trajectory of an agent. It could cause significant prediction errors and poor generalization performance.

exploring social posterior collapse, name change, variational autoencoder, (8 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Neural Information Processing SystemsAug-16-2025, 10:01:30 GMT

Interaction Modeling with Multiplex Attention

Modeling multi-agent systems requires understanding how agents interact.

agent, graph, trajectory, (14 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Neural Information Processing SystemsJan-15-2025, 11:43:24 GMT

Interaction Modeling with Multiplex Attention

Modeling multi-agent systems requires understanding how agents interact. Such systems are often difficult to model because they can involve a variety of types of interactions that layer together to drive rich social behavioral dynamics. Here we introduce a method for accurately modeling multi-agent systems. We present Interaction Modeling with Multiplex Attention (IMMA), a forward prediction model that uses a multiplex latent graph to represent multiple independent types of interactions and attention to account for relations of different strengths. We also introduce Progressive Layer Training, a training strategy for this architecture.

interaction modeling, multiplex attention

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Dal'Col, Lucas, Oliveira, Miguel, Santos, Vítor

Joint Perception and Prediction for Autonomous Driving: A Survey

arXiv.org Artificial IntelligenceDec-18-2024

Perception and prediction modules are critical components of autonomous driving systems, enabling vehicles to navigate safely through complex environments. The perception module is responsible for perceiving the environment, including static and dynamic objects, while the prediction module is responsible for predicting the future behavior of these objects. These modules are typically divided into three tasks: object detection, object tracking, and motion prediction. Traditionally, these tasks are developed and optimized independently, with outputs passed sequentially from one to the next. However, this approach has significant limitations: computational resources are not shared across tasks, the lack of joint optimization can amplify errors as they propagate throughout the pipeline, and uncertainty is rarely propagated between modules, resulting in significant information loss. To address these challenges, the joint perception and prediction paradigm has emerged, integrating perception and prediction into a unified model through multi-task learning. This strategy not only overcomes the limitations of previous methods, but also enables the three tasks to have direct access to raw sensor data, allowing richer and more nuanced environmental interpretations. This paper presents the first comprehensive survey of joint perception and prediction for autonomous driving. We propose a taxonomy that categorizes approaches based on input representation, scene context modeling, and output representation, highlighting their contributions and limitations. Additionally, we present a qualitative analysis and quantitative comparison of existing methods. Finally, we discuss future research directions based on identified gaps in the state-of-the-art.

artificial intelligence, machine learning, representation, (17 more...)

2412.14088

Country:

Europe > Portugal > Aveiro > Aveiro (0.05)
Europe > Switzerland (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(2 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)

Industry: Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsOct-10-2024, 05:33:01 GMT

Exploring Social Posterior Collapse in Variational Autoencoder for Interaction Modeling

Multi-agent behavior modeling and trajectory forecasting are crucial for the safe navigation of autonomous agents in interactive scenarios. Variational Autoencoder (VAE) has been widely applied in multi-agent interaction modeling to generate diverse behavior and learn a low-dimensional representation for interacting systems. However, existing literature did not formally discuss if a VAE-based model can properly encode interaction into its latent space. In this work, we argue that one of the typical formulations of VAEs in multi-agent modeling suffers from an issue we refer to as social posterior collapse, i.e., the model is prone to ignoring historical social context when predicting the future trajectory of an agent. It could cause significant prediction errors and poor generalization performance.

exploring social posterior collapse, interaction modeling, variational autoencoder, (5 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Wagner, Royden, Tas, Ömer Sahin, Steiner, Marlon, Konstantinidis, Fabian, Königshof, Hendrik, Klemp, Marvin, Fernandez, Carlos, Stiller, Christoph

SceneMotion: From Agent-Centric Embeddings to Scene-Wide Forecasts

arXiv.org Artificial IntelligenceAug-19-2024

Self-driving vehicles rely on multimodal motion forecasts to effectively interact with their environment and plan safe maneuvers. We introduce SceneMotion, an attention-based model for forecasting scene-wide motion modes of multiple traffic agents. Our model transforms local agent-centric embeddings into scene-wide forecasts using a novel latent context module. This module learns a scene-wide latent space from multiple agent-centric embeddings, enabling joint forecasting and interaction modeling. The competitive performance in the Waymo Open Interaction Prediction Challenge demonstrates the effectiveness of our approach. Moreover, we cluster future waypoints in time and space to quantify the interaction between agents. We merge all modes and analyze each mode independently to determine which clusters are resolved through interaction or result in conflict. Our implementation is available at: https://github.com/kit-mrt/future-motion

agent, trajectory, waypoint, (14 more...)

2408.01537

Country: Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.05)

Genre: Research Report (0.40)

Industry: Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.91)

arXiv.org Artificial IntelligenceMay-27-2024

Multi-level Interaction Modeling for Protein Mutational Effect Prediction

Mo, Yuanle, Hong, Xin, Gao, Bowen, Jia, Yinjun, Lan, Yanyan

Protein-protein interactions are central mediators in many biological processes. Accurately predicting the effects of mutations on interactions is crucial for guiding the modulation of these interactions, thereby playing a significant role in therapeutic development and drug discovery. Mutations generally affect interactions hierarchically across three levels: mutated residues exhibit different sidechain conformations, which lead to changes in the backbone conformation, eventually affecting the binding affinity between proteins. However, existing methods typically focus only on sidechain-level interaction modeling, resulting in suboptimal predictions. In this work, we propose a self-supervised multi-level pre-training framework, ProMIM, to fully capture all three levels of interactions with well-designed pretraining objectives. Experiments show ProMIM outperforms all the baselines on the standard benchmark, especially on mutations where significant changes in backbone conformations may occur. In addition, leading results from zero-shot evaluations for SARS-CoV-2 mutational effect prediction and antibody optimization underscore the potential of ProMIM as a powerful next-generation tool for developing novel therapeutic approaches and new drugs.

interaction, mutation, prediction, (14 more...)

2405.17802

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.92)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.57)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceJan-29-2024

FIMP: Future Interaction Modeling for Multi-Agent Motion Prediction

Woo, Sungmin, Kim, Minjung, Kim, Donghyeong, Jang, Sungjun, Lee, Sangyoun

Multi-agent motion prediction is a crucial concern in autonomous driving, yet it remains a challenge owing to the ambiguous intentions of dynamic agents and their intricate interactions. Existing studies have attempted to capture interactions between road entities by using the definite data in history timesteps, as future information is not available and involves high uncertainty. However, without sufficient guidance for capturing future states of interacting agents, they frequently produce unrealistic trajectory overlaps. In this work, we propose Future Interaction modeling for Motion Prediction (FIMP), which captures potential future interactions in an end-to-end manner. FIMP adopts a future decoder that implicitly extracts the potential future information in an intermediate feature-level, and identifies the interacting entity pairs through future affinity learning and top-k filtering strategy. Experiments show that our future interaction modeling improves the performance remarkably, leading to superior performance on the Argoverse motion forecasting benchmark.

agent, interaction, prediction, (15 more...)

2401.16189

Country: Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.50)

Industry: Transportation (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.36)

Bhattacharyya, Prarthana, Huang, Chengjie, Czarnecki, Krzysztof

SSL-Interactions: Pretext Tasks for Interactive Trajectory Prediction

arXiv.org Artificial IntelligenceJan-15-2024

This paper addresses motion forecasting in multi-agent environments, pivotal for ensuring safety of autonomous vehicles. Traditional as well as recent data-driven marginal trajectory prediction methods struggle to properly learn non-linear agent-to-agent interactions. We present SSL-Interactions that proposes pretext tasks to enhance interaction modeling for trajectory prediction. We introduce four interaction-aware pretext tasks to encapsulate various aspects of agent interactions: range gap prediction, closest distance prediction, direction of movement prediction, and type of interaction prediction. We further propose an approach to curate interaction-heavy scenarios from datasets. This curated data has two advantages: it provides a stronger learning signal to the interaction model, and facilitates generation of pseudo-labels for interaction-centric pretext tasks. We also propose three new metrics specifically designed to evaluate predictions in interactive scenes. Our empirical evaluations indicate SSL-Interactions outperforms state-of-the-art motion forecasting methods quantitatively with up to 8% improvement, and qualitatively, for interaction-heavy scenarios.

agent, prediction, pretext task, (15 more...)