AITopics

2510.05446

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
North America > United States > Maryland > Baltimore (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Nomura, Kentaro, Aoki, Tatsuya, Taniguchi, Tadahiro, Horii, Takato

Decentralized Collective World Model for Emergent Communication and Coordination

We propose a fully decentralized multi-agent world model that enables both symbol emergence for communication and coordinated behavior through temporal extension of collective predictive coding. Unlike previous research that focuses on either communication or coordination separately, our approach achieves both simultaneously. Our method integrates world models with communication channels, enabling agents to predict environmental dynamics, estimate states from partial observations, and share critical information through bidirectional message exchange with contrastive learning for message alignment. Using a two-agent trajectory drawing task, we demonstrate that our communication-based approach outperforms non-communicative models when agents have divergent perceptual capabilities, achieving the second-best coordination after centralized models. Importantly, our decentralized approach with constraints preventing direct access to other agents' internal states facilitates the emergence of more meaningful symbol systems that accurately reflect environmental states. These findings demonstrate the effectiveness of decentralized communication for supporting coordination while developing shared representations of the environment.

agent, artificial intelligence, machine learning, (17 more...)

2504.03353

Country:

North America > United States (0.46)
Asia > Japan > Honshū > Kansai (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)

Rahimian, Shadi, Fritz, Mario

DP-SNP-TIHMM: Differentially Private, Time-Inhomogeneous Hidden Markov Models for Synthesizing Genome-Wide Association Datasets

Single nucleotide polymorphism (SNP) datasets are fundamental to genetic studies but pose significant privacy risks when shared. The correlation of SNPs with each other makes strong adversarial attacks such as masked-value reconstruction, kin, and membership inference attacks possible. Existing privacy-preserving approaches either apply differential privacy to statistical summaries of these datasets or offer complex methods that require post-processing and the usage of a publicly available dataset to suppress or selectively share SNPs. In this study, we introduce an innovative framework for generating synthetic SNP sequence datasets using samples derived from time-inhomogeneous hidden Markov models (TIHMMs). To preserve the privacy of the training data, we ensure that each SNP sequence contributes only a bounded influence during training, enabling strong differential privacy guarantees. Crucially, by operating on full SNP sequences and bounding their gradient contributions, our method directly addresses the privacy risks introduced by their inherent correlations. Through experiments conducted on the real-world 1000 Genomes dataset, we demonstrate the efficacy of our method using privacy budgets of $\varepsilon \in [1, 10]$ at $δ=10^{-4}$. Notably, by allowing the transition models of the HMM to be dependent on the location in the sequence, we significantly enhance performance, enabling the synthetic datasets to closely replicate the statistical properties of non-private datasets. This framework facilitates the private sharing of genomic data while offering researchers exceptional flexibility and utility.

artificial intelligence, dataset, machine learning, (19 more...)

2510.05777

Country: Europe (0.93)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies

Hong, Chunsan, An, Seonho, Kim, Min-Soo, Ye, Jong Chul

Masked diffusion models (MDMs) have recently emerged as a novel framework for language modeling. MDMs generate sentences by iteratively denoising masked sequences, filling in [MASK] tokens step by step. Although MDMs support any-order sampling, performance is highly sensitive to the choice of which position to unmask next. Prior work typically relies on rule-based schedules (e.g., max-confidence, max-margin), which provide ad hoc improvements. In contrast, we replace these heuristics with a learned scheduler. Specifically, we cast denoising as a KL-regularized Markov decision process (MDP) with an explicit reference policy and optimize a regularized objective that admits policy improvement and convergence guarantees under standard assumptions. We prove that the optimized policy under this framework generates samples that more closely match the data distribution than heuristic schedules. Empirically, across four benchmarks, our learned policy consistently outperforms max-confidence: for example, on SUDOKU, where unmasking order is critical, it yields a 20.1% gain over random and a 11.2% gain over max-confidence.

large language model, machine learning, natural language, (20 more...)

2510.05725

Genre: Research Report (0.64)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

BrowserArena: Evaluating LLM Agents on Real-World Web Navigation Tasks

Anupam, Sagnik, Brown, Davis, Li, Shuo, Wong, Eric, Hassani, Hamed, Bastani, Osbert

LLM web agents now browse and take actions on the open web, yet current agent evaluations are constrained to sandboxed environments or artificial tasks. We introduce BrowserArena, a live open-web agent evaluation platform that collects user-submitted tasks, runs Arena-style head-to-head comparisons, and uses step-level human feedback to surface failure modes. Collecting and analyzing step-level annotations on the agent traces, we identify three consistent failure modes: captcha resolution, pop-up banner removal, and direct navigation to URLs. By constructing targeted datasets to further study these tasks, we discover variations in how different language models navigate these failure modes. We find, for example, that o4-mini deploys a wider variety of strategies to circumvent captcha resolution than other models and DeepSeek-R1 consistently misleads users about pop-up banner closure. Our findings surface both the diversity and brittleness of current web agents. More broadly, our benchmarking methodology provides an approach to evaluating and understanding web agent failure modes at scale.

large language model, machine learning, natural language, (20 more...)

2510.02418

Country: North America > United States (0.67)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Zhang, William, Amin, Saurabh, Perakis, Georgia

Modular and Adaptive Conformal Prediction for Sequential Models via Residual Decomposition

Conformal prediction offers finite-sample coverage guarantees under minimal assumptions. However, existing methods treat the entire modeling process as a black box, overlooking opportunities to exploit modular structure. We introduce a conformal prediction framework for two-stage sequential models, where an upstream predictor generates intermediate representations for a downstream model. By decomposing the overall prediction residual into stage-specific components, our method enables practitioners to attribute uncertainty to specific pipeline stages. We develop a risk-controlled parameter selection procedure using family-wise error rate (FWER) control to calibrate stage-wise scaling parameters, and propose an adaptive extension for non-stationary settings that preserves long-run coverage guarantees. Experiments on synthetic distribution shifts, as well as real-world supply chain and stock market data, demonstrate that our approach maintains coverage under conditions that degrade standard conformal methods, while providing interpretable stage-wise uncertainty attribution. This framework offers diagnostic advantages and robust coverage that standard conformal methods lack.

distribution shift, prediction interval, residual component, (14 more...)

2510.04406

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Banking & Finance > Trading (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Jiang, Nan, Xie, Tengyang

Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees

This article introduces the theory of offline reinforcement learning in large state spaces, where good policies are learned from historical data without online interactions with the environment. Key concepts introduced include expressivity assumptions on function approximation (e.g., Bellman completeness vs. realizability) and data coverage (e.g., all-policy vs. single-policy coverage). A rich landscape of algorithms and results is described, depending on the assumptions one is willing to make and the sample and computational complexity guarantees one wishes to achieve. We also discuss open questions and connections to adjacent areas.

algorithm, assumption, international conference, (12 more...)

2510.04088

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Solinas, Christopher, Haluska, Radovan, Sychrovsky, David, Timbers, Finbarr, Bard, Nolan, Buro, Michael, Schmid, Martin, Sturtevant, Nathan R., Bowling, Michael

Neural Bayesian Filtering

As an example, consider the problem of tracking an autonomous robot with an unknown starting position in a d d grid (Figure 1). Suppose the agent's policy is known, and an observer sees that the agent moved a step without colliding into a wall. This information indicates how the observer should update their beliefs about the agent's position. Tracking these belief states can be challenging when they are either continuous or too large to enumerate (Solinas et al., 2023)--even when the agent's policy and the environment dynamics are known. A common approach frames belief state modeling as a Bayesian filtering problem in which a posterior is maintained and updated with each new observation. Classical Bayesian filters, such as the Kalman Filter (Kalman, 1960) and its nonlinear variants (e.g., Extended and Unscented Kalman Filters (Sorenson, 1985; Julier & Uhlmann, 2004)), assume that the underlying distributions are unimodal and approximately Gaussian. While computationally efficient, this limits their applicability in settings that do not satisfy these assumptions.

belief state, nbf, particle, (15 more...)

2510.03614

Country:

North America > Canada > Alberta (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
(2 more...)

Chen, Yujie, Chakraborty, Antik, Bhadra, Anindya

Exact and Approximate MCMC for Doubly-intractable Probabilistic Graphical Models Leveraging the Underlying Independence Model

Bayesian inference for doubly-intractable probabilistic graphical models typically involves variations of the exchange algorithm or approximate Markov chain Monte Carlo (MCMC) samplers. However, existing methods for both classes of algorithms require either perfect samplers or sequential samplers for complex models, which are often either not available, or suffer from poor mixing, especially in high dimensions. We develop a method that does not require perfect or sequential sampling, and can be applied to both classes of methods: exact and approximate MCMC. The key to our approach is to utilize the tractable independence model underlying an intractable probabilistic graphical model for the purpose of constructing a finite sample unbiased Monte Carlo (and not MCMC) estimate of the Metropolis--Hastings ratio. This innovation turns out to be crucial for scalability in high dimensions. The method is demonstrated on the Ising model. Gradient-based alternatives to construct a proposal, such as Langevin and Hamiltonian Monte Carlo approaches, also arise as a natural corollary to our general procedure, and are demonstrated as well.

estimator, sampler, unbiased estimator, (17 more...)

2510.03587

Country:

Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
North America > United States > Indiana (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Media > Film (0.69)
Leisure & Entertainment (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)