AITopics

Neural Information Processing Systems, Mar-27-2025, 08:11:53 GMT

Learning segmentation from point trajectories

We consider the problem of segmenting objects in videos based on their motion and no other forms of supervision. Prior work has often approached this problem by using the principle of common fate, namely the fact that the motion of points that belong to the same object is strongly correlated. However, most authors have only considered instantaneous motion from optical flow. In this work, we present a way to train a segmentation network using long-term point trajectories as a supervisory signal to complement optical flow. The key difficulty is that long-term motion, unlike instantaneous motion, is difficult to model: any parametric approximation is unlikely to capture complex motion patterns over long periods of time. We instead draw inspiration from subspace clustering approaches, proposing a loss function that seeks to group the trajectories into low-rank matrices where the motion of object points can be approximately explained as a linear combination of other point tracks. Our method outperforms the prior art on motion-based segmentation, which shows the utility of long-term motion and the effectiveness of our formulation.
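The low-rank idea in the abstract above can be illustrated with a small sketch. This is not the paper's loss function: it is a toy measure, under the assumption that each object undergoes pure 2D translation, of how much trajectory energy remains after removing a few directions with a greedy pivoted Gram-Schmidt procedure (a simple stand-in for an SVD-based low-rank fit). The names `track_row` and `residual_after_rank` are hypothetical.

```python
def track_row(point, motion):
    """Flatten one 2D trajectory into [x_0..x_{T-1}, y_0..y_{T-1}]."""
    px, py = point
    return [px + dx for dx, _ in motion] + [py + dy for _, dy in motion]


def residual_after_rank(rows, rank):
    """Energy left after projecting out `rank` greedily chosen directions.

    Pivoted Gram-Schmidt: repeatedly pick the row with the largest remaining
    norm, normalise it, and subtract its component from every row.
    """
    work = [[float(v) for v in r] for r in rows]
    for _ in range(rank):
        norms = [sum(v * v for v in r) for r in work]
        k = max(range(len(work)), key=norms.__getitem__)
        if norms[k] < 1e-12:  # nothing left to explain
            break
        q = [v / norms[k] ** 0.5 for v in work[k]]
        for r in work:
            c = sum(a * b for a, b in zip(r, q))
            for j in range(len(r)):
                r[j] -= c * q[j]
    return sum(v * v for r in work for v in r)


# Two toy objects, each translating rigidly with its own motion (assumption).
T = 5
motion_a = [(t, 0.5 * t * t) for t in range(T)]
motion_b = [(-0.3 * t * t, t) for t in range(T)]
group_a = [track_row(p, motion_a) for p in [(0, 0), (1, 0), (0, 1), (1, 1)]]
group_b = [track_row(p, motion_b) for p in [(5, 5), (6, 5), (5, 6), (6, 7)]]

# Tracks of one translating object span at most 3 directions, so the residual
# vanishes; mixing the two objects leaves unexplained energy.
same = residual_after_rank(group_a, rank=3)
mixed = residual_after_rank(group_a + group_b, rank=3)
print(f"same object: {same:.2e}, mixed objects: {mixed:.2e}")
```

In this toy setting a correct grouping yields a near-zero residual while a grouping that mixes objects does not, which is the kind of signal a low-rank grouping loss can exploit.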

machine learning, natural language, segmentation, (18 more...)

Country:

Europe > Greece (0.14)
Europe > United Kingdom > England (0.14)
Europe > Italy (0.14)
Europe > Austria (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Neural Information Processing Systems, Mar-27-2025, 07:53:41 GMT

Team-Fictitious Play for Reaching Team-Nash Equilibrium in Multi-team Games

Multi-team games, prevalent in robotics and resource management, involve team members striving for a joint best response against other teams. Team-Nash equilibrium (TNE) predicts the outcomes of such coordinated interactions. However, can teams of self-interested agents reach TNE? We introduce Team-Fictitious Play (Team-FP), a new variant of fictitious play where agents respond to the last actions of team members and the beliefs formed about other teams with some inertia in action updates. This design is essential in team coordination beyond the classical fictitious play dynamics.

agent, artificial intelligence, machine learning, (19 more...)

Country: Europe > Greece (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.92)

Neural Information Processing Systems, Mar-26-2025, 22:36:57 GMT

Fast Encoder-Based 3D from Casual Videos via Point Track Processing

Yoni Kasten (1), Wuyue Lu (2), Haggai Maron (1,3); (1) NVIDIA Research

Predicting 3D geometry in dynamic scenes is a challenging problem. In this problem setup, we are given access to multiple images of a scene taken sequentially, e.g., from a monocular video.

dyn 0, machine learning, natural language, (19 more...)

Country:

Europe > Greece (0.14)
Europe > France (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Hardware (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Neural Information Processing Systems, Mar-26-2025, 07:20:28 GMT

Upping the Game: How 2D U-Net Skip Connections Flip 3D Segmentation

In the present study, we introduce an innovative structure for 3D medical image segmentation that effectively integrates 2D U-Net-derived skip connections into the architecture of 3D convolutional neural networks (3D CNNs). Conventional 3D segmentation techniques predominantly depend on isotropic 3D convolutions for the extraction of volumetric features, which frequently engenders inefficiencies due to the varying information density across the three orthogonal axes in medical imaging modalities such as computed tomography (CT) and magnetic resonance imaging (MRI). This disparity leads to a decline in axial-slice plane feature extraction efficiency, with slice plane features being comparatively underutilized relative to features in the time-axial. To address this issue, we introduce the U-shaped Connection (uC), which uses a simplified 2D U-Net in place of standard skip connections to augment the extraction of the axial-slice plane features while concurrently preserving the volumetric context afforded by 3D convolutions. Based on uC, we further present uC 3DU-Net, an enhanced 3D U-Net backbone that integrates the uC approach to facilitate optimal axial-slice plane feature utilization. Through rigorous experimental validation on five publicly accessible datasets (FLARE2021, OIMHS, FeTA2021, AbdomenCT-1K, and BTCV), the proposed method surpasses contemporary state-of-the-art models. Notably, this performance is achieved while reducing the number of parameters and computational complexity. This investigation underscores the efficacy of incorporating 2D convolutions within the framework of 3D CNNs to overcome the intrinsic limitations of volumetric segmentation, thereby potentially expanding the frontiers of medical image analysis.

artificial intelligence, deep learning, machine learning, (18 more...)

Country:

Europe > Greece (0.14)
Europe > France (0.14)
South America > Peru (0.14)
Europe > Spain (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Rupp, Florian, Eberhardinger, Manuel, Eckert, Kai

Simulation-Driven Balancing of Competitive Game Levels with Reinforcement Learning

arXiv.org Artificial Intelligence, Mar-24-2025

The balancing process for game levels in competitive two-player contexts involves a lot of manual work and testing, particularly for non-symmetrical game levels. In this work, we frame game balancing as a procedural content generation task and propose an architecture for automatically balancing tile-based levels within the PCGRL framework (procedural content generation via reinforcement learning). Our architecture is divided into three parts: (1) a level generator, (2) a balancing agent, and (3) a reward modeling simulation. Through repeated simulations, the balancing agent receives rewards for adjusting the level towards a given balancing objective, such as equal win rates for all players. To this end, we propose new swap-based representations to improve the robustness of playability, thereby enabling agents to balance game levels more effectively and quickly compared to traditional PCGRL. By analyzing the agent's swapping behavior, we can infer which tile types have the most impact on the balance. We validate our approach in the Neural MMO (NMMO) environment in a competitive two-player scenario. In this extended conference paper, we present improved results, explore the applicability of the method to various forms of balancing beyond equal balancing, compare the performance to another search-based approach, and discuss the application of existing fairness metrics to game balancing.
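As a rough illustration of the swap-based action and the equal-win-rate objective mentioned in the abstract above, the sketch below swaps two tiles of a small grid and scores simulated outcomes. It is not the paper's implementation: the tile names, the helper names (`swap_tiles`, `tile_counts`, `balance_reward`), and the reward formula are all assumptions made for the example.

```python
from collections import Counter


def swap_tiles(level, pos_a, pos_b):
    """Return a copy of `level` with the tiles at two (row, col) positions exchanged."""
    new_level = [row[:] for row in level]
    (ra, ca), (rb, cb) = pos_a, pos_b
    new_level[ra][ca], new_level[rb][cb] = new_level[rb][cb], new_level[ra][ca]
    return new_level


def tile_counts(level):
    """Count how often each tile type occurs in the level."""
    return Counter(tile for row in level for tile in row)


def balance_reward(wins_p1, wins_p2):
    """Hypothetical reward: 1.0 for equal win counts, 0.0 when one player always wins."""
    total = wins_p1 + wins_p2
    if total == 0:
        return 0.0
    return 1.0 - abs(wins_p1 - wins_p2) / total


level = [["grass", "water"],
         ["food", "rock"]]
swapped = swap_tiles(level, (0, 0), (1, 1))

# A swap moves tiles around but never changes how many of each type exist.
print(swapped)
print(tile_counts(level) == tile_counts(swapped))
print(balance_reward(5, 5), balance_reward(10, 0))
```

Because a swap only relocates tiles, the number of tiles of each type is unchanged, which is one plausible reason a swap-based action space disturbs playability less than replacing tiles outright.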

evolutionary algorithm, machine learning, reinforcement learning, (19 more...)

doi: 10.1109/TG.2024.3399536

2503.18748

Country:

North America > United States (0.14)
Europe > Greece (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Neural Information Processing Systems, Mar-23-2025, 06:57:40 GMT

Universal In-Context Approximation By Prompting Fully Recurrent Models

Zero-shot and in-context learning enable solving tasks without model fine-tuning, making them essential for developing generative model solutions. Therefore, it is crucial to understand whether a pretrained model can be prompted to approximate any function, i.e., whether it is a universal in-context approximator. While it was recently shown that transformer models do possess this property, these results rely on their attention mechanism. Hence, these findings do not apply to fully recurrent architectures like RNNs, LSTMs, and the increasingly popular SSMs. We demonstrate that RNNs, LSTMs, GRUs, Linear RNNs, and linear gated architectures such as Mamba and Hawk/Griffin can also serve as universal in-context approximators. To streamline our argument, we introduce a programming language called LSRL that compiles to these fully recurrent architectures. LSRL may be of independent interest for further studies of fully recurrent models, such as constructing interpretability benchmarks. We also study the role of multiplicative gating and observe that architectures incorporating such gating (e.g., LSTMs, GRUs, Hawk/Griffin) can implement certain operations more stably, making them more viable candidates for practical in-context universal approximation.

machine learning, natural language, node, (19 more...)

Country:

Europe > Greece (0.14)
Europe > Austria (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Mar-23-2025

Dynamic Allocation Hypernetwork with Adaptive Model Recalibration for FCL

Qi, Xiaoming, Zhang, Jingyang, Fu, Huazhu, Yang, Guanyu, Li, Shuo, Jin, Yueming

Federated continual learning (FCL) offers an emerging pattern to facilitate the applicability of federated learning (FL) in real-world scenarios, where tasks evolve dynamically and asynchronously across clients, especially in medical scenarios. Existing server-side FCL methods in the natural domain construct a continually learnable server model by client aggregation on all-involved tasks. However, they are challenged by: (1) Catastrophic forgetting for previously learned tasks, leading to error accumulation in the server model, making it difficult to sustain comprehensive knowledge across all tasks. (2) Biased optimization due to asynchronous tasks handled across different clients, leading to the collision of optimization targets of different clients at the same time steps. In this work, we take the first step to propose a novel server-side FCL pattern in the medical domain, Dynamic Allocation Hypernetwork with adaptive model recalibration (FedDAH). It facilitates collaborative learning under the distinct and dynamic task streams across clients. To alleviate catastrophic forgetting, we propose a dynamic allocation hypernetwork (DAHyper) where a continually updated hypernetwork is designed to manage the mapping between task identities and their associated model parameters, enabling the dynamic allocation of the model across clients. For the biased optimization, we introduce a novel adaptive model recalibration (AMR) to incorporate the candidate changes of historical models into current server updates, and assign weights to identical tasks across different time steps based on the similarity for continual optimization. Extensive experiments on the AMOS dataset demonstrate the superiority of our FedDAH to other FCL methods on sites with different task streams. The code is available at https://github.com/jinlab-imvr/FedDAH.

artificial intelligence, machine learning, optimization, (16 more...)

2503.18064

Country:

Asia > Singapore (0.15)
North America > United States (0.14)
Asia > China (0.14)
Europe > Greece (0.14)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area (0.47)
Health & Medicine > Diagnostic Medicine > Imaging (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Evangelatos, Spyridon, Veroni, Eleni, Efthymiou, Vasilis, Nikolopoulos, Christos, Papadopoulos, Georgios Th., Sarigiannidis, Panagiotis

Exploring Energy Landscapes for Minimal Counterfactual Explanations: Applications in Cybersecurity and Beyond

arXiv.org Artificial Intelligence, Mar-23-2025

Counterfactual explanations have emerged as a prominent method in Explainable Artificial Intelligence (XAI), providing intuitive and actionable insights into Machine Learning model decisions. In contrast to other traditional feature attribution methods that assess the importance of input variables, counterfactual explanations focus on identifying the minimal changes required to alter a model's prediction, offering a "what-if" analysis that is close to human reasoning. In the context of XAI, counterfactuals enhance transparency, trustworthiness and fairness, offering explanations that are not just interpretable but directly applicable in the decision-making processes. In this paper, we present a novel framework that integrates perturbation theory and statistical mechanics to generate minimal counterfactual explanations in explainable AI. We employ a local Taylor expansion of a Machine Learning model's predictive function and reformulate the counterfactual search as an energy minimization problem over a complex landscape. We then model the probability of candidate perturbations using the Boltzmann distribution and use simulated annealing for iterative refinement. Our approach systematically identifies the smallest modifications required to change a model's prediction while maintaining plausibility. Experimental results on benchmark datasets for cybersecurity in Internet of Things environments demonstrate that our method provides actionable, interpretable counterfactuals and offers deeper insights into model sensitivity and decision boundaries in high-dimensional spaces.
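The pipeline outlined in the abstract above (local Taylor expansion, an energy over perturbations, Boltzmann acceptance, simulated annealing) can be sketched on a toy problem. This is a minimal illustration, not the authors' framework: the two-feature logistic model, the energy definition (squared perturbation norm plus a penalty while the first-order estimate stays below the 0.5 threshold), the cooling schedule, and all names and constants are assumptions.

```python
import math
import random

W, B = (1.0, 1.0), -1.0  # toy logistic model parameters (assumption)


def model(x):
    """Probability of the positive class under the toy logistic model."""
    z = sum(w * v for w, v in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x):
    """Gradient of the model output with respect to the input."""
    p = model(x)
    return [p * (1.0 - p) * w for w in W]


def energy(delta, f0, grad, lam=10.0, target=0.5):
    """Perturbation size plus a penalty if the linearised output misses the target."""
    f_lin = f0 + sum(g * d for g, d in zip(grad, delta))  # first-order Taylor estimate
    return sum(d * d for d in delta) + lam * max(0.0, target - f_lin)


def anneal(x, steps=5000, t_start=1.0, t_end=1e-3, step=0.1, seed=0):
    """Simulated annealing over perturbations with Boltzmann (Metropolis) acceptance."""
    rng = random.Random(seed)
    f0, grad = model(x), gradient(x)
    cur = [0.0] * len(x)
    cur_e = energy(cur, f0, grad)
    best, best_e = cur[:], cur_e
    for i in range(steps):
        temp = t_start * (t_end / t_start) ** (i / (steps - 1))  # geometric cooling
        cand = [d + rng.gauss(0.0, step) for d in cur]
        cand_e = energy(cand, f0, grad)
        # Always accept improvements; accept worse moves with probability exp(-dE / T).
        if cand_e <= cur_e or rng.random() < math.exp(-(cand_e - cur_e) / temp):
            cur, cur_e = cand, cand_e
            if cur_e < best_e:
                best, best_e = cur[:], cur_e
    return best, best_e


x0 = [0.0, 0.0]
delta, e_best = anneal(x0)
x_cf = [a + d for a, d in zip(x0, delta)]
print("original prediction:", round(model(x0), 3))
print("counterfactual prediction:", round(model(x_cf), 3))
```

The energy here trades perturbation size against reaching the decision threshold. Any real use would need a plausibility term and a check that the first-order expansion remains accurate at the returned perturbation.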

artificial intelligence, machine learning, natural language, (17 more...)

2503.18185

Country: Europe > Greece (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.90)

Marzano, Enrico, Pagliarini, Giovanni, Pasini, Riccardo, Sciavicco, Guido, Stan, Ionel Eduard

Symbolic Audio Classification via Modal Decision Tree Learning

arXiv.org Artificial Intelligence, Mar-21-2025

The range of potential applications of acoustic analysis is wide. Classification of sounds, in particular, is a typical machine learning task that has received a lot of attention in recent years. The most common approaches to sound classification are sub-symbolic, typically based on neural networks, and result in black-box models with high performance but very low transparency. In this work, we consider several audio tasks, namely, age and gender recognition, emotion classification, and respiratory disease diagnosis, and we approach them with a symbolic technique, that is, (modal) decision tree learning. We prove that such tasks can be solved using the same symbolic pipeline, which allows simple rules with very high accuracy and low complexity to be extracted. In principle, all such tasks could be associated with an autonomous conversation system, which could be useful in different contexts, such as an automatic reservation agent for a hospital or a clinic.

artificial intelligence, decision tree learning, machine learning, (12 more...)

2503.17018

Country:

Europe > Italy (0.14)
Europe > Greece (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)