AITopics

2412.05723

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)

arXiv.org Artificial Intelligence Dec-7-2024

DM-SBL: Channel Estimation under Structured Interference

Wang, Yifan, Yu, Chengjie, Zhu, Jiang, Wang, Fangyong, Tu, Xingbin, Wei, Yan, Qu, Fengzhong

Channel estimation is a fundamental task in communication systems and is critical for effective demodulation. While most works deal with a simple scenario where the measurements are corrupted by additive white Gaussian noise (AWGN), this work addresses the more challenging scenario where both AWGN and structured interference coexist. Such conditions arise, for example, when a sonar/radar transmitter and a communication receiver operate simultaneously within the same bandwidth. To ensure accurate channel estimation in these scenarios, the sparsity of the channel in the delay domain and the complicated structure of the interference are jointly exploited. First, the score of the structured interference is learned via a neural network based on the diffusion model (DM), while the channel prior is modeled as a Gaussian distribution, with its variance controlling channel sparsity, similar to the setup of sparse Bayesian learning (SBL). Then, two efficient posterior sampling methods are proposed to jointly estimate the sparse channel and the interference. Nuisance parameters, such as the variance of the prior, are estimated via the expectation maximization (EM) algorithm. The proposed method is termed DM-based SBL (DM-SBL). Numerical simulations demonstrate that DM-SBL significantly outperforms conventional approaches that deal with the AWGN scenario, particularly under low signal-to-interference ratio (SIR) conditions. Beyond channel estimation, DM-SBL also shows promise for addressing other linear inverse problems involving structured interference.
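The SBL ingredient mentioned in the abstract (a zero-mean Gaussian prior per channel tap whose variance is re-estimated by EM) can be illustrated in a deliberately reduced setting. The sketch below is not the DM-SBL method: it assumes an identity measurement matrix, AWGN only, and a known noise variance, and it omits the diffusion model and the interference entirely. The function name `sbl_em_identity` and its parameters are hypothetical.

```python
def sbl_em_identity(y, noise_var, iters=200):
    """Toy SBL via EM for y_i = h_i + n_i with h_i ~ N(0, gamma_i).

    Assumptions (not from the paper): identity measurement matrix,
    AWGN with known variance noise_var > 0, no structured interference.
    """
    gamma = [1.0] * len(y)   # prior variances, one per tap
    mu = [0.0] * len(y)      # posterior means of the taps
    for _ in range(iters):
        # E-step: Gaussian posterior of each tap given the current gamma
        post_var = [g * noise_var / (g + noise_var) for g in gamma]
        mu = [g / (g + noise_var) * yi for g, yi in zip(gamma, y)]
        # M-step: gamma_i = E[h_i^2] = mu_i^2 + posterior variance
        gamma = [m * m + v for m, v in zip(mu, post_var)]
    return mu, gamma


mu, gamma = sbl_em_identity([3.0, 0.1], noise_var=1.0)
```

In this toy case the prior variance of the strong tap settles at a nonzero value, while the variance of the weak tap decays toward zero, which is how the per-tap variance ends up controlling sparsity.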

artificial intelligence, bayesian inference, machine learning, (18 more...)

2412.05582

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

arXiv.org Machine Learning Dec-7-2024

Active Sequential Posterior Estimation for Sample-Efficient Simulation-Based Inference

Griesemer, Sam, Cao, Defu, Cui, Zijun, Osorio, Carolina, Liu, Yan

Computer simulations have long presented the exciting possibility of scientific insight into complex real-world processes. Despite the power of modern computing, however, it remains challenging to systematically perform inference under simulation models. This has led to the rise of simulation-based inference (SBI), a class of machine learning-enabled techniques for approaching inverse problems with stochastic simulators. Many such methods, however, require large numbers of simulation samples and face difficulty scaling to high-dimensional settings, often making inference prohibitive under resource-intensive simulators. To mitigate these drawbacks, we introduce active sequential neural posterior estimation (ASNPE). ASNPE brings an active learning scheme into the inference loop to estimate the utility of simulation parameter candidates to the underlying probabilistic model. The proposed acquisition scheme is easily integrated into existing posterior estimation pipelines, allowing for improved sample efficiency with low computational overhead. We further demonstrate the effectiveness of the proposed method in the travel demand calibration setting, a high-dimensional inverse problem commonly requiring computationally expensive traffic simulators. Our method outperforms well-tuned benchmarks and state-of-the-art posterior estimation methods on a large-scale real-world traffic network, and demonstrates a performance advantage over non-active counterparts on a suite of SBI benchmark environments.

artificial intelligence, machine learning, modeling & simulation, (18 more...)

2412.0559

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Industry: Transportation (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Wycoff, Nathan, Singh, Lisa O., Arab, Ali, Donato, Katharine M.

Proximal Iteration for Nonlinear Adaptive Lasso

arXiv.org Machine Learning Dec-7-2024

Augmenting a smooth cost function with an $\ell_1$ penalty allows analysts to efficiently conduct estimation and variable selection simultaneously in sophisticated models and can be efficiently implemented using proximal gradient methods. However, one drawback of the $\ell_1$ penalty is bias: nonzero parameters are underestimated in magnitude, motivating techniques such as the Adaptive Lasso which endow each parameter with its own penalty coefficient. But it's not clear how these parameter-specific penalties should be set in complex models. In this article, we study the approach of treating the penalty coefficients as additional decision variables to be learned in a Maximum a Posteriori manner, developing a proximal gradient approach to joint optimization of these together with the parameters of any differentiable cost function. Beyond reducing bias in estimates, this procedure can also encourage arbitrary sparsity structure via a prior on the penalty coefficients. We compare our method to implementations of specific sparsity structures for non-Gaussian regression on synthetic and real datasets, finding our more general method to be competitive in terms of both speed and accuracy. We then consider nonlinear models for two case studies: COVID-19 vaccination behavior and international refugee movement, highlighting the applicability of this approach to complex problems and intricate sparsity structures.
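The baseline this abstract starts from, proximal gradient descent on a smooth cost plus an $\ell_1$ penalty, can be sketched for the least-squares case as follows. This is the standard ISTA iteration with a single shared penalty `lam`, not the paper's joint optimization over per-parameter penalty coefficients. The names `soft_threshold` and `ista` are illustrative.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def ista(A, y, lam, step, iters=500):
    """Minimize 0.5 * ||A w - y||^2 + lam * ||w||_1 by proximal gradient.

    A is a list of rows; step should not exceed 1 / (largest eigenvalue
    of A^T A) for the iteration to converge.
    """
    n, d = len(A), len(A[0])
    w = [0.0] * d
    for _ in range(iters):
        # Gradient of the smooth part: A^T (A w - y)
        r = [sum(A[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
        g = [sum(A[i][j] * r[i] for i in range(n)) for j in range(d)]
        # Gradient step followed by the l1 proximal (soft-threshold) step
        w = [soft_threshold(w[j] - step * g[j], step * lam) for j in range(d)]
    return w


# Identity design: the estimate is the soft-thresholded observation.
w = ista([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0, step=1.0)
```

With an identity design the observation 3.0 is estimated as 2.0 and 0.5 is set to zero, a small instance of both the variable selection and the shrinkage bias that the abstract describes.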

artificial intelligence, data mining, machine learning, (20 more...)

2412.05726

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > District of Columbia > Washington (0.04)
North America > Canada (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Modeling & Simulation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
(3 more...)

arXiv.org Artificial Intelligence Dec-6-2024

Reinforcement Learning: An Overview

Murphy, Kevin

This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement learning and sequential decision making, covering value-based RL, policy-gradient methods, model-based methods, and various other topics (including a very brief discussion of RL+LLMs).

hierarchical reinforcement learning, large language model, machine learning, (22 more...)

2412.05265

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.45)

Genre:

Research Report (1.00)
Overview (1.00)
Workflow (0.93)
Instructional Material > Course Syllabus & Notes (0.67)

Industry:

Health & Medicine (1.00)
Education (0.92)
Information Technology (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

arXiv.org Artificial Intelligence Dec-6-2024

Estimating the treatment effect over time under general interference through deep learner integrated TMLE

Guo, Suhan, Shen, Furao, Li, Ni

Understanding the effects of quarantine policies in populations with underlying social networks is crucial for public health, yet most causal inference methods fail here due to their assumption of independent individuals. We introduce DeepNetTMLE, a deep-learning-enhanced Targeted Maximum Likelihood Estimation (TMLE) method designed to estimate time-sensitive treatment effects in observational data. DeepNetTMLE mitigates bias from time-varying confounders under general interference by incorporating a temporal module and domain adversarial training to build intervention-invariant representations. This process removes associations between current treatments and historical variables, while the targeting step maintains the bias-variance trade-off, enhancing the reliability of counterfactual predictions. Using simulations of a "Susceptible-Infected-Recovered" model with varied quarantine coverages, we show that DeepNetTMLE achieves lower bias and more precise confidence intervals in counterfactual estimates, enabling optimal quarantine recommendations within budget constraints, surpassing state-of-the-art methods.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2412.04799

Country:

Asia > China > Jiangsu Province > Nanjing (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Srinivasan, Narayan, Sutton, Matthew, Drovandi, Christopher, South, Leah F

The Polynomial Stein Discrepancy for Assessing Moment Convergence

arXiv.org Machine Learning Dec-6-2024

We propose a novel method for measuring the discrepancy between a set of samples and a desired posterior distribution for Bayesian inference. Classical methods for assessing sample quality like the effective sample size are not appropriate for scalable Bayesian sampling algorithms, such as stochastic gradient Langevin dynamics, that are asymptotically biased. Instead, the gold standard is to use the kernel Stein Discrepancy (KSD), which is itself not scalable given its quadratic cost in the number of samples. The KSD and its faster extensions also typically suffer from the curse of dimensionality and can require extensive tuning. To address these limitations, we develop the polynomial Stein discrepancy (PSD) and an associated goodness-of-fit test. While the new test is not fully convergence-determining, we prove that it detects differences in the first r moments in the Bernstein-von Mises limit. We empirically show that the test has higher power than its competitors in several examples, and at a lower computational cost. Finally, we demonstrate that the PSD can help practitioners select hyper-parameters of Bayesian sampling algorithms more efficiently than competitors.
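The link between Stein operators, polynomial test functions, and moments can be seen in one dimension. The sketch below is not the paper's PSD statistic or its goodness-of-fit test. It only evaluates the Langevin Stein operator $(Tf)(x) = f'(x) + f(x)\,\frac{d}{dx}\log p(x)$ on monomials $f(x) = x^k$, assuming a standard normal target (score $-x$), and averages the result over the samples. The function name `stein_monomial_means` is hypothetical.

```python
def stein_monomial_means(samples, max_degree):
    """Sample means of the Stein operator applied to x**k, k = 0..max_degree.

    Assumes a standard normal target, whose score is -x. Each mean is zero
    in expectation under the target, so large values flag a moment mismatch.
    Cost is linear in the number of samples.
    """
    n = len(samples)
    means = []
    for k in range(max_degree + 1):
        total = 0.0
        for x in samples:
            deriv = k * x ** (k - 1) if k > 0 else 0.0  # f'(x)
            total += deriv + (-x) * x ** k               # f'(x) + f(x) * score(x)
        means.append(total / n)
    return means


matched = stein_monomial_means([-1.0, 1.0], 1)  # mean 0, second moment 1
shifted = stein_monomial_means([0.0, 2.0], 1)   # mean 1, second moment 2
```

For k = 0 the statistic reduces to minus the sample mean and for k = 1 to one minus the sample second moment, so degree-k monomials probe moments up to order k + 1 in this toy setting.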

artificial intelligence, discrepancy, machine learning, (17 more...)

2412.05135

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
(3 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

arXiv.org Artificial Intelligence Dec-6-2024

Detecting Fake News on Social Media: A Novel Reliability Aware Machine-Crowd Hybrid Intelligence-Based Method

Chai, Yidong, Shi, Kangwei, Xie, Jiaheng, Liu, Chunli, Jiang, Yuanchun, Liu, Yezheng

Fake news on social media platforms poses a significant threat to societal systems, underscoring the urgent need for advanced detection methods. The existing detection methods can be divided into machine intelligence-based, crowd intelligence-based, and hybrid intelligence-based methods. Among them, hybrid intelligence-based methods achieve the best performance but fail to consider the reliability issue in detection. In light of this, we propose a novel Reliability Aware Hybrid Intelligence (RAHI) method for fake news detection. Our method comprises three integral modules. The first module employs a Bayesian deep learning model to capture the inherent reliability within machine intelligence. The second module uses an Item Response Theory (IRT)-based user response aggregation to account for the reliability in crowd intelligence. The third module introduces a new distribution fusion mechanism, which takes the distributions derived from both machine and crowd intelligence as input, and outputs a fused distribution that provides predictions along with the associated reliability. The experiments on the Weibo dataset demonstrate the advantages of our method. This study contributes to the research field with a novel RAHI-based method, and the code is shared at https://github.com/Kangwei-g/RAHI. This study has practical implications for three key stakeholders: internet users, online platform managers, and the government.

artificial intelligence, machine learning, social media, (17 more...)

2412.06833

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Mexico (0.04)
North America > United States > Hawaii (0.04)
(9 more...)

Genre: Research Report > New Finding (0.93)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

MacDermott, Matt, Fox, James, Belardinelli, Francesco, Everitt, Tom

Measuring Goal-Directedness

arXiv.org Artificial Intelligence Dec-5-2024

We define maximum entropy goal-directedness (MEG), a formal measure of goal-directedness in causal models and Markov decision processes, and give algorithms for computing it. Measuring goal-directedness is important, as it is a critical element of many concerns about harm from AI. It is also of philosophical interest, as goal-directedness is a key aspect of agency. MEG is based on an adaptation of the maximum causal entropy framework used in inverse reinforcement learning. It can measure goal-directedness with respect to a known utility function, a hypothesis class of utility functions, or a set of random variables. We prove that MEG satisfies several desiderata and demonstrate our algorithms with small-scale experiments.

artificial intelligence, decision support system, machine learning, (19 more...)

2412.04758

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Eswatini > Manzini > Manzini (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
(2 more...)

Webber, George, Mizuno, Yuya, Howes, Oliver D., Hammers, Alexander, King, Andrew P., Reader, Andrew J.

Likelihood-Scheduled Score-Based Generative Modeling for Fully 3D PET Image Reconstruction

arXiv.org Artificial Intelligence Dec-5-2024

Medical image reconstruction with pre-trained score-based generative models (SGMs) has advantages over other existing state-of-the-art deep-learned reconstruction methods, including improved resilience to different scanner setups and advanced image distribution modeling. SGM-based reconstruction has recently been applied to simulated positron emission tomography (PET) datasets, showing improved contrast recovery for out-of-distribution lesions relative to the state-of-the-art. However, existing methods for SGM-based reconstruction from PET data suffer from slow reconstruction, burdensome hyperparameter tuning and slice inconsistency effects (in 3D). In this work, we propose a practical methodology for fully 3D reconstruction that accelerates reconstruction and reduces the number of critical hyperparameters by matching the likelihood of an SGM's reverse diffusion process to a current iterate of the maximum-likelihood expectation maximization algorithm. Using the example of low-count reconstruction from simulated [$^{18}$F]DPA-714 datasets, we show our methodology can match or improve on the NRMSE and SSIM of existing state-of-the-art SGM-based PET reconstruction while reducing reconstruction time and the need for hyperparameter tuning. We evaluate our methodology against state-of-the-art supervised and conventional reconstruction algorithms. Finally, we demonstrate a first-ever implementation of SGM-based reconstruction for real 3D PET data, specifically [$^{18}$F]DPA-714 data, where we integrate perpendicular pre-trained SGMs to eliminate slice inconsistency issues.
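For readers unfamiliar with the maximum-likelihood expectation maximization (MLEM) algorithm referenced in the abstract, the classical multiplicative update for Poisson data is sketched below on a tiny dense system. This shows only the conventional MLEM iteration, not the likelihood scheduling or the score-based model proposed in the paper. The function name `mlem`, the dense list-of-rows system matrix, and the all-ones initialization are assumptions made for illustration.

```python
def mlem(A, y, iters=50):
    """Classical MLEM for y ~ Poisson(A x) with nonnegative A and y.

    Update: x <- x / (A^T 1) * A^T (y / (A x)).
    Assumes every column of A has a positive sum (nonzero sensitivity).
    """
    n, m = len(A), len(A[0])
    sens = [sum(A[i][j] for i in range(n)) for j in range(m)]  # A^T 1
    x = [1.0] * m                                              # positive start
    for _ in range(iters):
        # Forward projection of the current image estimate
        proj = [sum(A[i][j] * x[j] for j in range(m)) for i in range(n)]
        # Ratio of measured to predicted counts (guarding empty bins)
        ratio = [y[i] / proj[i] if proj[i] > 0 else 0.0 for i in range(n)]
        # Back-project the ratios and apply the multiplicative update
        back = [sum(A[i][j] * ratio[i] for i in range(n)) for j in range(m)]
        x = [x[j] * back[j] / sens[j] for j in range(m)]
    return x


x_identity = mlem([[1.0, 0.0], [0.0, 1.0]], [2.0, 4.0])
x_shared = mlem([[1.0, 1.0]], [4.0])
```

The update keeps the estimate nonnegative and needs no step size, which is part of why MLEM iterates are a convenient reference point for a reconstruction method that aims to reduce hyperparameter tuning.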

hyperparameter, reconstruction, sgm-based reconstruction, (13 more...)

2412.04339

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Switzerland (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)