AITopics

2502.07635

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Neural Information Processing Systems, Feb-10-2025, 17:14:26 GMT

Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients

Minimizing the inclusive Kullback-Leibler (KL) divergence with stochastic gradient descent (SGD) is challenging since its gradient is defined as an integral over the posterior. Recently, multiple methods have been proposed to run SGD with biased gradient estimates obtained from a Markov chain. This paper provides the first non-asymptotic convergence analysis of these methods by establishing their mixing rate and gradient variance. To do this, we demonstrate that these methods, which we collectively refer to as Markov chain score ascent (MCSA) methods, can be cast as special cases of the Markov chain gradient descent framework. Furthermore, by leveraging this new understanding, we develop a novel MCSA scheme, parallel MCSA (pMCSA), that achieves a tighter bound on the gradient variance. We demonstrate that this improved theoretical result translates to superior empirical performance.
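As a rough illustration of the idea in this abstract, the sketch below minimizes the inclusive KL divergence KL(p || q) for a one-dimensional toy problem by ascending the score of q at the states of a Markov chain that targets p. Everything here is an assumption made for illustration: the target N(3, 1), the variational family N(mu, 1), the random-walk Metropolis kernel (the methods analyzed in the paper use other kernels), and the 1/(t+1) step size. It is not the paper's pMCSA scheme.

```python
import numpy as np

def log_p_unnorm(z):
    # Hypothetical target: unnormalized log-density of N(3, 1).
    return -0.5 * (z - 3.0) ** 2

def mcsa_sketch(n_steps=20000, proposal_scale=1.0, seed=0):
    rng = np.random.default_rng(seed)
    mu = 0.0  # variational parameter of q = N(mu, 1)
    z = 0.0   # current state of the Markov chain
    for t in range(n_steps):
        # One random-walk Metropolis step that leaves p invariant.
        z_prop = z + proposal_scale * rng.normal()
        if np.log(rng.uniform()) < log_p_unnorm(z_prop) - log_p_unnorm(z):
            z = z_prop
        # Score of q at the chain state: d/dmu log N(z; mu, 1) = z - mu.
        score = z - mu
        # Ascending the score descends the inclusive KL(p || q).
        # The gradient estimate is biased because z comes from a chain,
        # not from an exact draw of p.
        mu += score / (t + 1)
    return mu

mu_hat = mcsa_sketch()
```

With this step size, mu is simply the running mean of the chain states, so it should settle near the target mean of 3 as the chain mixes.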

artificial intelligence, machine learning, variational inference, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.75)

Neural Information Processing Systems, Feb-10-2025, 01:19:40 GMT

Analysis of Brain States from Multi-Region LFP Time-Series

Kyle R. Ulrich, David E. Carlson, Wenzhao Lian, Jana S. Borg, Kafui Dzirasa, Lawrence Carin

The local field potential (LFP) is a source of information about the broad patterns of brain activity, and the frequencies present in these time-series measurements are often highly correlated between regions. It is believed that these regions may jointly constitute a "brain state," relating to cognition and behavior. An infinite hidden Markov model (iHMM) is proposed to model the evolution of brain states, based on electrophysiological LFP data measured at multiple brain regions. A brain state influences the spectral content of each region in the measured LFP. A new state-dependent tensor factorization is employed across brain regions, and the spectral properties of the LFPs are characterized in terms of Gaussian processes (GPs). The LFPs are modeled as a mixture of GPs, with state- and region-dependent mixture weights, and with the spectral content of the data encoded in GP spectral mixture covariance kernels. The model is able to estimate the number of brain states and the number of mixture components in the mixture of GPs. A new variational Bayesian split-merge algorithm is employed for inference. The model infers state changes as a function of external covariates in two novel electrophysiological datasets, using LFP data recorded simultaneously from multiple brain regions in mice; the results are validated and interpreted by subject-matter experts.
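For readers unfamiliar with the spectral mixture covariance kernels mentioned in this abstract, the sketch below evaluates the standard one-dimensional form, k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi mu_q tau), where each component places a Gaussian bump at frequency mu_q in the spectral density. The function name and the parameter values are hypothetical, and the paper's state- and region-dependent weighting is not modeled here.

```python
import numpy as np

def spectral_mixture_kernel(t1, t2, weights, means, variances):
    # Pairwise time lags tau[i, j] = t1[i] - t2[j].
    tau = np.asarray(t1, dtype=float)[:, None] - np.asarray(t2, dtype=float)[None, :]
    k = np.zeros_like(tau)
    for w, mu, v in zip(weights, means, variances):
        # Gaussian envelope (bandwidth v) times a cosine at center frequency mu.
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * mu * tau)
    return k

# Illustrative values only: two spectral components at 4 Hz and 10 Hz.
t = np.linspace(0.0, 1.0, 50)
K = spectral_mixture_kernel(t, t, weights=[1.0, 0.5], means=[4.0, 10.0], variances=[1.0, 4.0])
```

At zero lag the kernel equals the sum of the weights, so the diagonal of K is constant, and K can be used directly as a GP covariance matrix over the sampled times.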

artificial intelligence, brain state, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Majumdar, Anirudha, Sharma, Mohit, Kalashnikov, Dmitry, Singh, Sumeet, Sermanet, Pierre, Sindhwani, Vikas

Predictive Red Teaming: Breaking Policies Without Breaking Robots

Is it possible to expose the vulnerabilities of a given robot policy with respect to changes in environmental factors such as lighting, visual distractors, and object placement without performing hardware evaluations in these scenarios? As we seek to deploy robots in environments with ever-increasing complexity, it becomes imperative to develop scalable methods for predicting how well they will generalize when faced with unseen scenarios. Performing hardware evaluations to discover vulnerabilities -- which can depend in surprising ways on the specifics of policy training and architecture -- is often prohibitively expensive to set up and execute, especially when the goal is to test the limits of safe deployment in a sufficiently diverse set of scenarios. As an example, consider a visuomotor diffusion policy [1] trained to perform pick-and-place tasks via behavior cloning (Figure 1). The policy is trained with a large dataset: more than 3K demonstrations with varied objects, locations, and visual distractors. Will the policy generalize well to a change in the height of the table by a few centimeters (as one may plausibly predict due to the variations in 2D object locations in the training dataset) compared to when a human is standing closer to the table than seen during training? If so, what is the absolute degradation of the success rate in each case? As it turns out, the above prediction is incorrect: the success rate of the policy degrades from 65% under nominal conditions to 10% by changing the table height, and remains roughly constant with a human close to the table. Predicting the relative and absolute impact of other factors (e.g., lighting, table backgrounds, object distractors; Figure 1) can be even more challenging.

large language model, machine learning, predictive red teaming, (16 more...)

2502.06575

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Zhuang, Luoting, Park, Stephen H., Skates, Steven J., Prosper, Ashley E., Aberle, Denise R., Hsu, William

Advancing Precision Oncology Through Modeling of Longitudinal and Multimodal Data

Cancer evolves continuously over time through a complex interplay of genetic, epigenetic, microenvironmental, and phenotypic changes. This dynamic behavior drives uncontrolled cell growth, metastasis, immune evasion, and therapy resistance, posing challenges for effective monitoring and treatment. However, today's data-driven research in oncology has primarily focused on cross-sectional analysis using data from a single modality, limiting the ability to fully characterize and interpret the disease's dynamic heterogeneity. Advances in multiscale data collection and computational methods now enable the discovery of longitudinal multimodal biomarkers for precision oncology. Longitudinal data reveal patterns of disease progression and treatment response that are not evident from single-timepoint data, enabling timely abnormality detection and dynamic treatment adaptation. Multimodal data integration offers complementary information from diverse sources for more precise risk assessment and targeting of cancer therapy. In this review, we survey methods of longitudinal and multimodal modeling, highlighting their synergy in providing multifaceted insights for personalized care tailored to the unique characteristics of a patient's cancer. We summarize the current challenges and future directions of longitudinal multimodal analysis in advancing precision oncology.

bioinformatics, large language model, machine learning, (20 more...)

2502.07836

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Switzerland (0.04)
(8 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)
Research Report > Promising Solution (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Biomedical Informatics (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(8 more...)

Uddin, Ashab, Sakr, Ahmed Hamdi, Zhang, Ning

Task Offloading in Vehicular Edge Computing using Deep Reinforcement Learning: A Survey

The increasing demand for Intelligent Transportation Systems (ITS) has introduced significant challenges in managing the complex, computation-intensive tasks generated by modern vehicles. Offloading tasks to external computing infrastructures such as edge computing (EC), nearby vehicles, and UAVs has become an influential solution to these challenges. However, traditional computational offloading strategies often struggle to adapt to the dynamic and heterogeneous nature of vehicular environments. In this study, we explore the potential of Reinforcement Learning (RL) and Deep Reinforcement Learning (DRL) frameworks to optimize computational offloading through adaptive, real-time decision-making, and we thoroughly investigate the Markov Decision Process (MDP) approaches in the existing literature. The paper focuses on key aspects such as standardized learning models, optimized reward structures, and collaborative multi-agent systems, aiming to advance the understanding and application of DRL in vehicular networks. Our findings offer insights into enhancing the efficiency, scalability, and robustness of ITS, setting the stage for future innovations in this rapidly evolving field.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2502.06963

Country:

Asia > South Korea (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(17 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.45)

Industry:

Transportation (1.00)
Telecommunications (1.00)
Information Technology > Security & Privacy (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Krale, Merlijn, Koops, Wietze, Junges, Sebastian, Simão, Thiago D., Jansen, Nils

Tighter Value-Function Approximations for POMDPs

Solving partially observable Markov decision processes (POMDPs) typically requires reasoning about the values of exponentially many state beliefs. Towards practical performance, state-of-the-art solvers use value bounds to guide this reasoning. However, sound upper value bounds are often computationally expensive to compute, and there is a tradeoff between the tightness of such bounds and their computational cost. This paper introduces new and provably tighter upper value bounds than the commonly used fast informed bound. Our empirical evaluation shows that, despite their additional computational overhead, the new upper bounds accelerate state-of-the-art POMDP solvers on a wide range of benchmarks.
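To make the baseline in this abstract concrete, the sketch below computes the fast informed bound (FIB) in its standard textbook form on a tiny made-up POMDP and compares it with the looser QMDP bound. The transition, observation, and reward numbers are hypothetical, and the paper's new, tighter bounds are not reproduced here.

```python
import numpy as np

def qmdp_bound(T, R, gamma, iters=500):
    # Q[s, a] = R[s, a] + gamma * sum_s' T[a, s, s'] * max_a' Q[s', a']
    Q = np.zeros_like(R)
    for _ in range(iters):
        Q = R + gamma * np.einsum("asn,n->sa", T, Q.max(axis=1))
    return Q

def fast_informed_bound(T, O, R, gamma, iters=500):
    # alpha[s, a] = R[s, a]
    #   + gamma * sum_o max_a' sum_s' T[a, s, s'] * O[a, s', o] * alpha[s', a']
    alpha = np.zeros_like(R)
    for _ in range(iters):
        # M[a, s, o, b] = sum_s' T[a, s, s'] * O[a, s', o] * alpha[s', b]
        M = np.einsum("asn,ano,nb->asob", T, O, alpha)
        alpha = R + gamma * M.max(axis=3).sum(axis=2).T
    return alpha

# Hypothetical 2-state, 2-action, 2-observation POMDP.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],    # T[a, s, s']
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],    # O[a, s', o]
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0], [0.0, 2.0]])     # R[s, a]
gamma = 0.9

fib = fast_informed_bound(T, O, R, gamma)
qmdp = qmdp_bound(T, R, gamma)

# Upper bound on the value of a belief b: max_a sum_s b[s] * alpha[s, a].
belief = np.array([0.5, 0.5])
upper_fib = (belief @ fib).max()
upper_qmdp = (belief @ qmdp).max()
```

Because the maximization in FIB sits inside the sum over observations, it can never exceed QMDP when both start from the same initial values, which is why it is the usual cheap-but-informative starting point for solvers that need sound upper bounds.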

artificial intelligence, machine learning, pomdp, (18 more...)

2502.06523

Country:

Europe > Netherlands > Gelderland > Nijmegen (0.04)
North America > United States > Michigan > Wayne County > Detroit (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(6 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Carramiñana, David, Bernardos, Ana M., Besada, Juan A., Casar, José R.

Enhancing healthcare infrastructure resilience through agent-based simulation methods

Critical infrastructures face demanding challenges due to natural and human-generated threats, such as pandemics, workforce shortages or cyber-attacks, which might severely compromise service quality. To improve system resilience, decision-makers would need intelligent tools for quick and efficient resource allocation. This article explores an agent-based simulation model that intends to capture a part of the complexity of critical infrastructure systems, particularly considering the interdependencies of healthcare systems with information and telecommunication systems. Such a model enables a simulation-based optimization approach in which the exposure of critical systems to risks is evaluated, while comparing the mitigation effects of multiple tactical and strategic decision alternatives to enhance their resilience. The proposed model is designed to be parameterizable, so that it can be adapted to risk scenarios of different severity, and it facilitates the compilation of relevant performance indicators, enabling monitoring at both agent level and system level. To validate the agent-based model, a literature-supported methodology has been used to perform cross-validation and sensitivity analysis and to test the usefulness of the proposed model through a use case. The use case analyzes the impact of a concurrent pandemic and a cyber-attack on a hospital and compares different resiliency-enhancing countermeasures using contingency tables. Overall, the use case illustrates the feasibility and versatility of the proposed approach to enhance resiliency.

artificial intelligence, machine learning, scenario, (17 more...)

doi: 10.1016/j.comcom.2025.108070

2502.06636

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Galanis, Andreas, Goldberg, Leslie Ann, Zhang, Xusheng

One-Shot Learning for k-SAT

arXiv.org Machine Learning, Feb-10-2025

Consider a $k$-SAT formula $\Phi$ where every variable appears at most $d$ times, and let $\sigma$ be a satisfying assignment of $\Phi$ sampled proportionally to $e^{\beta m(\sigma)}$ where $m(\sigma)$ is the number of variables set to true and $\beta$ is a real parameter. Given $\Phi$ and $\sigma$, can we learn the value of $\beta$ efficiently? This problem falls into a recent line of work on single-sample ("one-shot") learning of Markov random fields. The $k$-SAT setting we consider here was recently studied by Galanis, Kandiros, and Kalavasis (SODA'24) where they showed that single-sample learning is possible when roughly $d\leq 2^{k/6.45}$ and impossible when $d\geq (k+1) 2^{k-1}$. Crucially, for their impossibility results they used the existence of unsatisfiable instances which, aside from the gap in $d$, left open the question of whether the feasibility threshold for one-shot learning is dictated by the satisfiability threshold of $k$-SAT formulas of bounded degree. Our main contribution is to answer this question negatively. We show that one-shot learning for $k$-SAT is infeasible well below the satisfiability threshold; in fact, we obtain impossibility results for degrees $d$ as low as $k^2$ when $\beta$ is sufficiently large, and bootstrap this to small values of $\beta$ when $d$ scales exponentially with $k$, via a probabilistic construction. On the positive side, we simplify the analysis of the learning algorithm and obtain significantly stronger bounds on $d$ in terms of $\beta$. In particular, for the uniform case $\beta\rightarrow 0$ that has been studied extensively in the sampling literature, our analysis shows that learning is possible under the condition $d\lesssim 2^{k/2}$. This is nearly optimal (up to constant factors) in the sense that it is known that sampling a uniformly-distributed satisfying assignment is NP-hard for $d\gtrsim 2^{k/2}$.
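The sampling model in this abstract can be made concrete on a toy instance. The sketch below enumerates the satisfying assignments of a small formula by brute force and weights each by $e^{\beta m(\sigma)}$. The formula, the helper names, and the signed-integer literal encoding are assumptions chosen for illustration; exhaustive enumeration is exponential in the number of variables and says nothing about the learning algorithm or the thresholds discussed in the paper.

```python
import itertools
import math

def satisfies(formula, assignment):
    # A literal i > 0 means "variable i is true"; i < 0 means "variable |i| is false".
    return all(
        any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
        for clause in formula
    )

def weighted_distribution(formula, n_vars, beta):
    # Keep only satisfying assignments, then weight by exp(beta * m(sigma)),
    # where m(sigma) counts the variables set to true.
    sols = [a for a in itertools.product([False, True], repeat=n_vars)
            if satisfies(formula, a)]
    weights = [math.exp(beta * sum(a)) for a in sols]
    z = sum(weights)  # normalizing constant
    return {a: w / z for a, w in zip(sols, weights)}

# Hypothetical 3-SAT formula on 3 variables; each variable appears in 2 clauses.
formula = [(1, 2, 3), (-1, -2, 3)]
uniform = weighted_distribution(formula, n_vars=3, beta=0.0)
tilted = weighted_distribution(formula, n_vars=3, beta=2.0)
```

At $\beta = 0$ every satisfying assignment is equally likely, which is the uniform case highlighted in the abstract, while a positive $\beta$ shifts mass toward assignments with more variables set to true.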

artificial intelligence, machine learning, probability, (16 more...)

arXiv.org Machine Learning

2502.07135

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Zhu, Changxi, Dastani, Mehdi, Wang, Shihan

Reducing Variance Caused by Communication in Decentralized Multi-agent Deep Reinforcement Learning

In decentralized multi-agent deep reinforcement learning (MADRL), communication can help agents to gain a better understanding of the environment to better coordinate their behaviors. Nevertheless, communication may involve uncertainty, which potentially introduces variance to the learning of decentralized agents. In this paper, we focus on a specific decentralized MADRL setting with communication and conduct a theoretical analysis to study the variance that is caused by communication in policy gradients. We propose modular techniques to reduce the variance in policy gradients during training. We adopt our modular techniques into two existing algorithms for decentralized MADRL with communication and evaluate them on multiple tasks in the StarCraft Multi-Agent Challenge and Traffic Junction domains. The results show that decentralized MADRL communication methods extended with our proposed techniques not only achieve high-performing agents but also reduce variance in policy gradients during training.

communication, machine learning, reinforcement learning, (16 more...)

2502.06261

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(10 more...)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)