AITopics | pipe

Collaborating Authors

pipe

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

PIPE: Physics-Informed Position Encoding for Alignment of Satellite Images and Time Series in Typhoon Forecasting

Neural Information Processing SystemsJun-12-2026, 02:12:29 GMT

Multimodal time series forecasting is foundational in various fields, such as utilizing satellite imagery and numerical data for predicting typhoons in climate science. However, existing multimodal approaches primarily focus on utilizing text data to help time series forecasting, leaving the visual data in existing time series datasets underexplored. Furthermore, it is challenging for models to effectively capture the physical information embedded in visual data, such as satellite imagery's temporal and geospatial context, which extends beyond images themselves. To address this gap, we propose physics-informed positional encoding (PIPE), a lightweight method that embeds physical information into vision language models (VLMs). PIPE introduces two key innovations: (1) a physics-informed positional indexing scheme for mapping physics to positional IDs, and (2) a variant-frequency positional encoding mechanism for encoding frequency information of physical variables and sequential order of tokens within the embedding space. By preserving both the physical information and sequential order information, PIPE significantly improves multimodal alignment and forecasting accuracy. Through the experiments on the most representative and the largest open-sourced satellite image dataset, PIPE achieves state-of-the-art performance in both deep learning forecasting and climate domain methods, demonstrating superiority across benchmarks, including a 12\% improvement in typhoon intensity forecasting over prior works.

artificial intelligence, data mining, proceedings, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Data Science > Data Mining (0.82)

Add feedback

MarioGPT: Open-Ended T ext2Level Generation through Large Language Models

Neural Information Processing SystemsFeb-16-2026, 10:32:35 GMT

Procedural Content Generation (PCG) is a technique to generate complex and diverse environments in an automated way.

evolutionary algorithm, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment > Games > Computer Games (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.94)

Add feedback

A graph generation pipeline for critical infrastructures based on heuristics, images and depth data

Diessner, Mike, Tarant, Yannick

arXiv.org Artificial IntelligenceDec-9-2025

Virtual representations of physical critical infrastructures, such as water or energy plants, are used for simulations and digital twins to ensure resilience and continuity of their services. These models usually require 3D point clouds from laser scanners that are expensive to acquire and require specialist knowledge to use. In this article, we present a graph generation pipeline based on photogrammetry. The pipeline detects relevant objects and predicts their relation using RGB images and depth data generated by a stereo camera. This more cost-effective approach uses deep learning for object detection and instance segmentation of the objects, and employs user-defined heuristics or rules to infer their relations. Results of two hydraulic systems show that this strategy can produce graphs close to the ground truth while its flexibility allows the method to be tailored to specific applications and its transparency qualifies it to be used in the high stakes decision-making that is required for critical infrastructures.

artificial intelligence, critical infrastructure, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2512.07269

Genre: Research Report (0.82)

Industry:

Energy (0.93)
Water & Waste Management > Water Management (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

CommonVoice-SpeechRE and RPG-MoGe: Advancing Speech Relation Extraction with a New Dataset and Multi-Order Generative Framework

Ning, Jinzhong, Tulajiang, Paerhati, Le, Yingying, Zhang, Yijia, Sun, Yuanyuan, Lin, Hongfei, Liu, Haifeng

arXiv.org Artificial IntelligenceNov-25-2025

Speech Relation Extraction (SpeechRE) aims to extract relation triplets directly from speech. However, existing benchmark datasets rely heavily on synthetic data, lacking sufficient quantity and diversity of real human speech. Moreover, existing models also suffer from rigid single-order generation templates and weak semantic alignment, substantially limiting their performance. To address these challenges, we introduce CommonVoice-SpeechRE, a large-scale dataset comprising nearly 20,000 real-human speech samples from diverse speakers, establishing a new benchmark for SpeechRE research. Furthermore, we propose the Relation Prompt-Guided Multi-Order Generative Ensemble (RPG-MoGe), a novel framework that features: (1) a multi-order triplet generation ensemble strategy, leveraging data diversity through diverse element orders during both training and inference, and (2) CNN-based latent relation prediction heads that generate explicit relation prompts to guide cross-modal alignment and accurate triplet generation. Experiments show our approach outperforms state-of-the-art methods, providing both a benchmark dataset and an effective solution for real-world SpeechRE. The source code and dataset are publicly available at https://github.com/NingJinzhong/SpeechRE_RPG_MoGe.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2509.08438

Country: Asia > China > Jiangsu Province (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

MarioGPT: Open-Ended T ext2Level Generation through Large Language Models

Neural Information Processing SystemsOct-9-2025, 04:16:15 GMT

Procedural Content Generation (PCG) is a technique to generate complex and diverse environments in an automated way.

archive, enemy, mariogpt, (15 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment > Games > Computer Games (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.94)

Add feedback

Text-to-Level Diffusion Models With Various Text Encoders for Super Mario Bros

Schrum, Jacob, Kilday, Olivia, Salas, Emilio, Hagan, Bess, Williams, Reid

arXiv.org Artificial IntelligenceAug-18-2025

Recent research shows how diffusion models can unconditionally generate tile-based game levels, but use of diffusion models for text-to-level generation is underexplored. There are practical considerations for creating a usable model: caption/level pairs are needed, as is a text embedding model, and a way of generating entire playable levels, rather than individual scenes. We present strategies to automatically assign descriptive captions to an existing dataset, and train diffusion models using both pretrained text encoders and simple transformer models trained from scratch. Captions are automatically assigned to generated scenes so that the degree of overlap between input and output captions can be compared. We also assess the diversity and playability of the resulting level scenes. Results are compared with an unconditional diffusion model and a generative adversarial network, as well as the text-to-level approaches Five-Dollar Model and MarioGPT. Notably, the best diffusion model uses a simple transformer model for text embedding, and takes less time to train than diffusion models employing more complex text encoders, indicating that reliance on larger language models is not necessary. We also present a GUI allowing designers to construct long levels from model-generated scenes.

caption, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2507.00184

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

Fly, Fail, Fix: Iterative Game Repair with Reinforcement Learning and Large Multimodal Models

Zook, Alex, Spjut, Josef, Tremblay, Jonathan

arXiv.org Artificial IntelligenceJul-18-2025

Game design hinges on understanding how static rules and content translate into dynamic player behavior - something modern generative systems that inspect only a game's code or assets struggle to capture. We present an automated design iteration framework that closes this gap by pairing a reinforcement learning (RL) agent, which playtests the game, with a large multimodal model (LMM), which revises the game based on what the agent does. In each loop the RL player completes several episodes, producing (i) numerical play metrics and/or (ii) a compact image strip summarising recent video frames. The LMM designer receives a gameplay goal and the current game configuration, analyses the play traces, and edits the configuration to steer future behaviour toward the goal. We demonstrate results that LMMs can reason over behavioral traces supplied by RL agents to iteratively refine game mechanics, pointing toward practical, scalable tools for AI-assisted game design.

large language model, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2507.12666

Genre: Research Report (0.83)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Positional Encoding meets Persistent Homology on Graphs

Verma, Yogesh, Souza, Amauri H., Garg, Vikas

arXiv.org Artificial IntelligenceJun-9-2025

The local inductive bias of message-passing graph neural networks (GNNs) hampers their ability to exploit key structural information (e.g., connectivity and cycles). Positional encoding (PE) and Persistent Homology (PH) have emerged as two promising approaches to mitigate this issue. PE schemes endow GNNs with location-aware features, while PH methods enhance GNNs with multiresolution topological features. However, a rigorous theoretical characterization of the relative merits and shortcomings of PE and PH has remained elusive. We bridge this gap by establishing that neither paradigm is more expressive than the other, providing novel constructions where one approach fails but the other succeeds. Our insights inform the design of a novel learnable method, PiPE (Persistence-informed Positional Encoding), which is provably more expressive than both PH and PE. PiPE demonstrates strong performance across a variety of tasks (e.g., molecule property prediction, graph classification, and out-of-distribution generalization), thereby advancing the frontiers of graph representation learning. Code is available at https://github.com/Aalto-QuML/PIPE.

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2506.05814

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

PIPE Planner: Pathwise Information Gain with Map Predictions for Indoor Robot Exploration

Baek, Seungjae, Moon, Brady, Kim, Seungchan, Cao, Muqing, Ho, Cherie, Scherer, Sebastian, Jeon, Jeong hwan

arXiv.org Artificial IntelligenceMar-10-2025

Abstract-- Autonomous exploration in unknown environments requires estimating the information gain of an action to guide planning decisions. While prior approaches often compute information gain at discrete waypoints, pathwise integration offers a more comprehensive estimation but is often computationally challenging or infeasible and prone to overestimation. In this work, we propose the Pathwise Information Gain with Map Prediction for Exploration (PIPE) planner, which integrates cumulative sensor coverage along planned trajectories while leveraging map prediction to mitigate overestimation. To enable efficient pathwise coverage computation, we introduce a method to efficiently calculate the expected observation mask along the planned path, significantly reducing computational overhead. Our results highlight the benefits of integrating predictive mapping with pathwise information gain for efficient and informed exploration.

exploration, information gain, sensor coverage, (11 more...)

arXiv.org Artificial Intelligence

2503.07504

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > South Korea > Ulsan > Ulsan (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Advances in Hybrid Modular Climbing Robots: Design Principles and Refinement Strategies

Poon, Ryan, Hunter, Ian

arXiv.org Artificial IntelligenceMar-10-2025

This paper explores the design strategies for hybrid pole- or trunk-climbing robots, focusing on methods to inform design decisions and assess metrics such as adaptability and performance. A wheeled-grasping hybrid robot with modular, tendon-driven grasping arms and a wheeled drive system mounted on a turret was developed to climb columns of varying diameters. Here, the key innovation is the underactuated arms that can be adjusted to different column sizes by adding or removing modular linkages, though the robot also features capabilities like self-locking (the ability of the robot to stay on the column by friction without power), autonomous grasping, and rotation around the column axis. Mathematical models describe conditions for self-locking and vertical climbing. Experimental results demonstrate the robot's efficacy in climbing and self-locking, validating the proposed models and highlighting the potential for fully automated solutions in industrial applications. This work provides a comprehensive framework for evaluating and designing hybrid climbing robots, contributing to advancements in autonomous robotics for environments where climbing tall structures is critical.

column diameter, diameter, robot, (16 more...)

arXiv.org Artificial Intelligence

2503.07423

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Shanghai > Shanghai (0.04)
(8 more...)

Genre: Research Report > New Finding (0.34)

Industry: Energy (0.93)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)

Add feedback