AITopics

-- Social learning constitutes a fundamental framework for studying interactions among rational agents who observe each other's actions but lack direct access to individual beliefs. This paper investigates a specific social learning paradigm known as Word-of-Mouth (WoM), where a series of agents seeks to estimate the state of a dynamical system. The first agent receives noisy measurements of the state, while each subsequent agent relies solely on a degraded version of her predecessor's estimate. A defining feature of WoM is that the final agent's belief is publicly broadcast and subsequently adopted by all agents, in place of their own. We analyze this setting theoretically and through numerical simulations, noting that some agents benefit from using the belief of the last agent, while others experience performance deterioration.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2504.02913

Country:

Europe (0.28)
North America > United States (0.28)

Genre:

Research Report (0.90)
Overview (0.68)

Industry: Education > Curriculum (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Ferreira, Simon, Assaad, Charles K.

Identifying Macro Causal Effects in a C-DMG over ADMGs

Causal effect identification using causal graphs is a fundamental challenge in causal inference. While extensive research has been conducted in this area, most existing methods assume the availability of fully specified directed acyclic graphs or acyclic directed mixed graphs. However, in complex domains such as medicine and epidemiology, complete causal knowledge is often unavailable, and only partial information about the system is accessible. This paper focuses on causal effect identification within partially specified causal graphs, with particular emphasis on cluster-directed mixed graphs (C-DMGs) which can represent many different acyclic directed mixed graphs (ADMGs). These graphs provide a higher-level representation of causal relationships by grouping variables into clusters, offering a more practical approach for handling complex systems. Unlike fully specified ADMGs, C-DMGs can contain cycles, which complicate their analysis and interpretation. Furthermore, their cluster-based nature introduces new challenges, as it gives rise to two distinct types of causal effects: macro causal effects and micro causal effects, each with different properties. In this work, we focus on macro causal effects, which describe the effects of entire clusters on other clusters. We establish that the do-calculus is both sound and complete for identifying these effects in C-DMGs over ADMGs when the cluster sizes are either unknown or of size greater than one. Additionally, we provide a graphical characterization of non-identifiability for macro causal effects in these graphs.

artificial intelligence, c-dmg, causal effect, (15 more...)

2504.01551

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Industry: Health & Medicine > Epidemiology (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

ESTM: An Enhanced Dual-Branch Spectral-Temporal Mamba for Anomalous Sound Detection

Ma, Chengyuan, Jia, Peng, Guo, Hongyue, Yang, Wenming

The core challenge in industrial equipment anoma lous sound detection (ASD) lies in modeling the time-frequency coupling characteristics of acoustic features. Existing modeling methods are limited by local receptive fields, making it difficult to capture long-range temporal patterns and cross-band dynamic coupling effects in machine acoustic features. In this paper, we propose a novel framework, ESTM, which is based on a dual-path Mamba architecture with time-frequency decoupled modeling and utilizes Selective State-Space Models (SSM) for long-range sequence modeling. ESTM extracts rich feature representations from different time segments and frequency bands by fusing enhanced Mel spectrograms and raw audio features, while further improving sensitivity to anomalous patterns through the TriStat-Gating (TSG) module. Our experiments demonstrate that ESTM improves anomalous detection performance on the DCASE 2020 Task 2 dataset, further validating the effectiveness of the proposed method.

artificial intelligence, international conference, machine learning, (17 more...)

2509.02471

Country:

North America (0.47)
Asia > China (0.29)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
(2 more...)

Anumasa, Srinivas, C, Barath Chandran., Chen, Tingting, Liu, Dianbo

Data-Dependent Smoothing for Protein Discovery with Walk-Jump Sampling

Diffusion models have emerged as a powerful class of generative models by learning to iteratively reverse the noising process. Their ability to generate high-quality samples has extended beyond high-dimensional image data to other complex domains such as proteins, where data distributions are typically sparse and unevenly spread. Importantly, the sparsity itself is uneven. Empirically, we observed that while a small fraction of samples lie in dense clusters, the majority occupy regions of varying sparsity across the data space. Existing approaches largely ignore this data-dependent variability. In this work, we introduce a Data-Dependent Smoothing Walk-Jump framework that employs kernel density estimation (KDE) as a preprocessing step to estimate the noise scale $σ$ for each data point, followed by training a score model with these data-dependent $σ$ values. By incorporating local data geometry into the denoising process, our method accounts for the heterogeneous distribution of protein data. Empirical evaluations demonstrate that our approach yields consistent improvements across multiple metrics, highlighting the importance of data-aware sigma prediction for generative modeling in sparse, high-dimensional settings.

artificial intelligence, machine learning, sequence, (15 more...)

2509.02069

Country: Asia (0.14)

Genre: Research Report (0.65)

Industry: Health & Medicine (0.58)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

Huang, Yixuan, Alvina, Novella, Shanthi, Mohanraj Devendran, Hermans, Tucker

Fail2Progress: Learning from Real-World Robot Failures with Stein Variational Inference

Skill effect models for long-horizon manipulation tasks are prone to failures in conditions not covered by training data distributions. Therefore, enabling robots to reason about and learn from failures is necessary. We investigate the problem of efficiently generating a dataset targeted to observed failures. After fine-tuning a skill effect model on this dataset, we evaluate the extent to which the model can recover from failures and minimize future failures. We propose Fail2Progress, an approach that leverages Stein variational inference to generate multiple simulation environments in parallel, enabling efficient data sample generation similar to observed failures. Our method is capable of handling several challenging mobile manipulation tasks, including transporting multiple objects, organizing a constrained shelf, and tabletop organization. Through large-scale simulation and real-world experiments, we demonstrate that our approach excels at learning from failures across different numbers of objects. Furthermore, we show that Fail2Progress outperforms several baselines.

artificial intelligence, fail2progress, machine learning, (16 more...)

2509.01746

Country: Asia (0.28)

Genre: Research Report (0.82)

Industry:

Education (0.67)
Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Deleu, Tristan, Nouri, Padideh, Bengio, Yoshua, Precup, Doina

Relative Trajectory Balance is equivalent to Trust-PCL

Recent progress in generative modeling has highlighted the importance of Reinforcement Learning (RL) for fine-tuning, with KL-regularized methods in particular proving to be highly effective for both autoregressive and diffusion models. Complementing this line of work, the Relative Trajectory Balance (RTB) objective was recently introduced in the context of Generative Flow Networks (GFlowNets) to serve the same role of improving fine-tuning in sequential generative models. Building on prior work linking GFlowNets and maximum-entropy RL, we establish in this paper an equivalence between RTB and Trust-PCL, an off-policy RL method with KL regularization. This equivalence situates RTB within the broader theoretical landscape of KL-regularized RL, and clarifies its relationship to earlier methods. Leveraging this insight, we revisit an illustrative example from the RTB paper and show that KL-regularized RL methods achieve comparable performance, offering an alternative perspective to what was previously reported.

machine learning, natural language, reinforcement learning, (15 more...)

2509.01632

Country: North America > Canada > Quebec (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Mark, Konstantin, Galustian, Leonard, Kovar, Maximilian P. -P., Heid, Esther

Feynman-Kac-Flow: Inference Steering of Conditional Flow Matching to an Energy-Tilted Posterior

Institute of Materials Chemistry, TU Wien, A-1060 Vienna, Austria Conditional Flow Matching(CFM) represents a fast and high-quality approach to generative modelling, but in many applications it is of interest to steer the generated samples towards precise requirements. While steering approaches like gradient-based guidance, sequential Monte Carlo steering or Feynman-Kac steering are well established for diffusion models, they have not been extended to flow matching approaches yet. In this work, we formulate this requirement as tilting the output with an energy potential. We derive, for the first time, Feynman-Kac steering for CFM. We evaluate our approach on a set of synthetic tasks, including the generation of tilted distributions in a high-dimensional space, which is a particularly challenging case for steering approaches. We then demonstrate the impact of Feynman-Kac steered CFM on the previously unsolved challenge of generated transition states of chemical reactions with the correct chirality, where the reactants or products can have a different handedness, leading to geometric constraints of the viable reaction pathways connecting reactants and products. Code to reproduce this study is avaiable open-source at https://github.com/heid-lab/fkflow. I. INTRODUCTION Since its introduction by Lipman et al. [1], Conditional Flow Matching (CFM) has seen several interesting applications, ranging from image [1], audio [2] and video [3] generation to decision-making [4], time series modelling [5], protein modelling [6, 7] or molecular structure design [8], amongst others. CFM transforms samples from a source distribution (such as random noise) to samples following a given target distribution (such as images or molecular structures) by modelling probability paths via vector fields. It largely improves on diffusion-based methods both in quality and speed, establishing CFM as a popular generative method [1].

artificial intelligence, chirality, machine learning, (18 more...)

2509.01543

Country: Europe > Austria > Vienna (0.54)

Genre: Research Report (0.64)

Industry: Energy (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Hadžić, Armin, Papez, Milan, Pevný, Tomáš

Distillation of a tractable model from the VQ-VAE

Deep generative models with discrete latent space, such as the Vector-Quantized Variational Autoencoder (VQ-VAE), offer excellent data generation capabilities, but, due to the large size of their latent space, their probabilistic inference is deemed intractable. We demonstrate that the VQ-VAE can be distilled into a tractable model by selecting a subset of latent variables with high probabilities. This simple strategy is particularly efficient, especially if the VQ-VAE underutilizes its latent space, which is, indeed, very often the case. We frame the distilled model as a probabilistic circuit, and show that it preserves expressiveness of the VQ-VAE while providing tractable probabilistic inference. Experiments illustrate competitive performance in density estimation and conditional generation tasks, challenging the view of the VQ-VAE as an inherently intractable model.

artificial intelligence, latent space, machine learning, (18 more...)

2509.014

Country: Europe > Czechia (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Analysis of Error Sources in LLM-based Hypothesis Search for Few-Shot Rule Induction

Parab, Aishni, Lu, Hongjing, Wu, Ying Nian, Gulwani, Sumit

Inductive reasoning enables humans to infer abstract rules from limited examples and apply them to novel situations. In this work, we compare an LLM-based hypothesis search framework with direct program generation approaches on few-shot rule induction tasks. Our findings show that hypothesis search achieves performance comparable to humans, while direct program generation falls notably behind. An error analysis reveals key bottlenecks in hypothesis generation and suggests directions for advancing program induction methods. Overall, this paper underscores the potential of LLM-based hypothesis search for modeling inductive reasoning and the challenges in building more efficient systems.

large language model, machine learning, natural language, (17 more...)

2509.01016

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Mishu, Sadia Zaman, Rafiuddin, S M

Performance Analysis of Supervised Machine Learning Algorithms for Text Classification

The demand for text classification is growing significantly in web searching, data mining, web ranking, recommendation systems, and so many other fields of information and technology. This paper illustrates the text classification process on different datasets using some standard supervised machine learning techniques. Text documents can be classified through various kinds of classifiers. Labeled text documents are used to classify the text in supervised classifications. This paper applies these classifiers on different kinds of labeled documents and measures the accuracy of the classifiers. An Artificial Neural Network (ANN) model using Back Propagation Network (BPN) is used with several other models to create an independent platform for labeled and supervised text classification process. An existing benchmark approach is used to analyze the performance of classification using labeled documents. Experimental analysis on real data reveals which model works well in terms of classification accuracy.

machine learning, natural language, text classification, (13 more...)

2509.00983

Genre: Research Report (0.66)

Industry: Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.74)
(2 more...)