AITopics

Activity detection is an important task in the next generation grant-free multiple access. While there are a number of existing algorithms designed for this purpose, they mostly require precise information about the network, such as large-scale fading coefficients, small-scale fading channel statistics, noise variance at the access points, and user activity probability. Acquiring these information would take a significant overhead and their estimated values might not be accurate. This problem is even more severe in cell-free networks as there are many of these parameters to be acquired. Therefore, this paper sets out to investigate the activity detection problem without the above-mentioned information. In order to handle so many unknown parameters, this paper employs the Bayesian approach, where the unknown variables are endowed with prior distributions which effectively act as regularizations. Together with the likelihood function, a maximum a posteriori (MAP) estimator and a variational inference algorithm are derived. Extensive simulations demonstrate that the proposed methods, even without the knowledge of these system parameters, perform better than existing state-of-the-art methods, such as covariance-based and approximate message passing methods.

algorithm, detection, probability, (14 more...)

doi: 10.1109/TSP.2024.3361090

2401.16775

Country:

Asia > China > Hong Kong (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
(6 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Telecommunications (0.93)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Wang, Kaizheng, Shariatmadar, Keivan, Manchingal, Shireen Kudukkil, Cuzzolin, Fabio, Moens, David, Hallez, Hans

CreINNs: Credal-Set Interval Neural Networks for Uncertainty Estimation in Classification Tasks

Uncertainty estimation is increasingly attractive for improving the reliability of neural networks. In this work, we present novel credal-set interval neural networks (CreINNs) designed for classification tasks. CreINNs preserve the traditional interval neural network structure, capturing weight uncertainty through deterministic intervals, while forecasting credal sets using the mathematical framework of probability intervals. Experimental validations on an out-of-distribution detection benchmark (CIFAR10 vs SVHN) showcase that CreINNs outperform epistemic uncertainty estimation when compared to variational Bayesian neural networks (BNNs) and deep ensembles (DEs). Furthermore, CreINNs exhibit a notable reduction in computational complexity compared to variational BNNs and demonstrate smaller model sizes than DEs.

creinn, neural network, prediction, (13 more...)

2401.05043

Country: Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Graph Neural Networks with a Distribution of Parametrized Graphs

Lee, See Hian, Ji, Feng, Xia, Kelin, Tay, Wee Peng

Traditionally, graph neural networks have been trained using a single observed graph. However, the observed graph represents only one possible realization. In many applications, the graph may encounter uncertainties, such as having erroneous or missing edges, as well as edge weights that provide little informative value. To address these challenges and capture additional information previously absent in the observed graph, we introduce latent variables to parameterize and generate multiple graphs. We obtain the maximum likelihood estimate of the network parameters in an Expectation-Maximization (EM) framework based on the multiple graphs. Specifically, we iteratively determine the distribution of the graphs using a Markov Chain Monte Carlo (MCMC) method, incorporating the principles of PAC-Bayesian theory. Numerical experiments demonstrate improvements in performance against baseline models on node classification for heterogeneous graphs and graph regression on chemistry datasets.

dataset, graph, graph neural network, (11 more...)

2310.16401

Country:

North America > United States > Wisconsin (0.04)
North America > United States > Texas (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Smyl, Slawek, Bergmeir, Christoph, Dokumentov, Alexander, Long, Xueying, Wibowo, Erwin, Schmidt, Daniel

Local and Global Trend Bayesian Exponential Smoothing Models

This paper describes a family of seasonal and non-seasonal time series models that can be viewed as generalisations of additive and multiplicative exponential smoothing models, to model series that grow faster than linear but slower than exponential. Their development is motivated by fast-growing, volatile time series. In particular, our models have a global trend that can smoothly change from additive to multiplicative, and is combined with a linear local trend. Seasonality when used is multiplicative in our models, and the error is always additive but is heteroscedastic and can grow through a parameter sigma. We leverage state-of-the-art Bayesian fitting techniques to accurately fit these models that are more complex and flexible than standard exponential smoothing models. When applied to the M3 competition data set, our models outperform the best algorithms in the competition as well as other benchmarks, thus achieving to the best of our knowledge the best results of per-series univariate methods on this dataset in the literature. An open-source software package of our method is available.

algorithm, forecasting, time sery, (15 more...)

2309.1395

Country:

Europe > Austria > Vienna (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(2 more...)

Genre:

Research Report (0.64)
Overview > Growing Problem (0.62)

Industry: Information Technology (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Learning Directed Graphical Models with Optimal Transport

Vo, Vy, Le, Trung, Vuong, Long-Tung, Zhao, He, Bonilla, Edwin, Phung, Dinh

Estimating the parameters of a probabilistic directed graphical model from incomplete data remains a long-standing challenge. This is because, in the presence of latent variables, both the likelihood function and posterior distribution are intractable without further assumptions about structural dependencies or model classes. While existing learning methods are fundamentally based on likelihood maximization, here we offer a new view of the parameter learning problem through the lens of optimal transport. This perspective licenses a general framework that operates on any directed graphs without making unrealistic assumptions on the posterior over the latent variables or resorting to black-box variational approximations. We develop a theoretical framework and support it with extensive empirical evidence demonstrating the flexibility and versatility of our approach. Across experiments, we show that not only can our method recover the ground-truth parameters but it also performs comparably or better on downstream applications, notably the non-trivial task of discrete representation learning.

inference, learning directed graphical model, otp-dag, (11 more...)

2305.15927

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.50)

Industry: Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Du, Yilun, Kaelbling, Leslie

Compositional Generative Modeling: A Single Model is Not All You Need

Large monolithic generative models trained on massive amounts of data have become an increasingly dominant approach in AI research. In this paper, we argue that we should instead construct large generative systems by composing smaller generative models together. We show how such a compositional generative approach enables us to learn distributions in a more data-efficient manner, enabling generalization to parts of the data distribution unseen at training time. We further show how this enables us to program and construct new generative models for tasks completely unseen at training. Finally, we show that in many cases, we can discover separate compositional components from data.

arxiv preprint arxiv, compositional generative modeling, generative model, (11 more...)

2402.01103

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
(2 more...)

Giudice, Enrico, Kuipers, Jack, Moffa, Giusi

Bayesian Causal Inference with Gaussian Process Networks

Quantifying the causal relationships from purely observational data between variables in a system is a problem that has attracted great attention in the fields of statistics and machine learning. Full knowledge of the causal relations allows predicting the outcome of direct manipulations on the system, which can generally only be known from interventional data obtained by performing experiments such as randomized controlled trials (Eberhardt and Scheines, 2007). Predicting the effect of such manipulations without the need of costly or infeasible experiments is of great practical relevance, specifically in the fields of computational biology (Sachs et al., 2005), medicine (Richens et al., 2020) or AI (Schölkopf, 2022), since a central question concerns how a complex system will react to some treatment or outside influence of the user. Pearl's rules of do-calculus (Pearl, 2000) allow computing the intervention distributions resulting from these external manipulations from the joint distribution of the set of random variables together with a Directed Acyclic Graph (DAG). The DAG represents the qualitative causal relationships among the variables; each node in the graph represents a variable and a directed edge indicates a direct causal effect. Probabilistic models that are based on such DAGs, commonly called causal Bayesian Networks (BNs), provide conventional grounds for probabilistic causal inference, due to their compact representation of the joint distribution and their intuitive graphical description of the causal structure.

inference, intervention distribution, posterior, (14 more...)

2402.00623

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Switzerland > Basel-City > Basel (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre:

Research Report > Strength High (0.88)
Research Report > Experimental Study (0.88)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kagrecha, Anmol, Marklund, Henrik, Van Roy, Benjamin, Jeon, Hong Jun, Zeckhauser, Richard

Adaptive Crowdsourcing Via Self-Supervised Learning

Common crowdsourcing systems average estimates of a latent quantity of interest provided by many crowdworkers to produce a group estimate. We develop a new approach -- predict-each-worker -- that leverages self-supervised learning and a novel aggregation scheme. This approach adapts weights assigned to crowdworkers based on estimates they provided for previous quantities. When skills vary across crowdworkers or their estimates correlate, the weighted sum offers a more accurate group estimate than the average. Existing algorithms such as expectation maximization can, at least in principle, produce similarly accurate group estimates. However, their computational requirements become onerous when complex models, such as neural networks, are required to express relationships among crowdworkers. Predict-each-worker accommodates such complexity as well as many other practical challenges. We analyze the efficacy of predict-each-worker through theoretical and computational studies. Among other things, we establish asymptotic optimality as the number of engagements per crowdworker grows.

crowdworker, gaussian data-generating process, hyperparameter, (16 more...)

2401.13239

Country:

North America > United States (0.46)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Comprehensive Exploration of Synthetic Data Generation: A Survey

Bauer, André, Trapp, Simon, Stenger, Michael, Leppich, Robert, Kounev, Samuel, Leznik, Mark, Chard, Kyle, Foster, Ian

cvf international conference, flow transformer virtual env, restricted boltzmann machine, (17 more...)

Recent years have witnessed a surge in the popularity of Machine Learning (ML), applied across diverse domains. However, progress is impeded by the scarcity of training data due to expensive acquisition and privacy legislation. Synthetic data emerges as a solution, but the abundance of released models and limited overview literature pose challenges for decision-making. This work surveys 417 Synthetic Data Generation (SDG) models over the last decade, providing a comprehensive overview of model types, functionality, and improvements. Common attributes are identified, leading to a classification and trend analysis. The findings reveal increased model performance and complexity, with neural network-based approaches prevailing, except for privacy-preserving data generation. Computer vision dominates, with GANs as primary generative models, while diffusion models, transformers, and RNNs compete. Implications from our performance evaluation highlight the scarcity of common metrics and datasets, making comparisons challenging. Additionally, the neglect of training and computational costs in literature necessitates attention in future research. This work serves as a guide for SDG model selection and identifies crucial areas for future exploration.

2401.02524

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Ontario > Toronto (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
Law (1.00)
(6 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(8 more...)

EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling

Ren, Siyu, Wu, Zhiyong, Zhu, Kenny Q.

Neural language models are probabilistic models of human text. They are predominantly trained using maximum likelihood estimation (MLE), which is equivalent to minimizing the forward cross-entropy between the empirical data distribution and the model distribution. However, various degeneration phenomena are still widely observed when decoding from the distributions learned by such models. We establish that the forward cross-entropy is suboptimal as a distance metric for aligning human and model distribution due to its (1) recall-prioritization (2) negative diversity ignorance and (3) train-test mismatch. In this paper, we propose Earth Mover Distance Optimization (EMO) for auto-regressive language modeling. EMO capitalizes on the inherent properties of earth mover distance to address the aforementioned challenges. Due to the high complexity of direct computation, we further introduce a feasible upper bound for EMO to ease end-to-end training. Upon extensive evaluation of language models trained using EMO and MLE. We find that EMO demonstrates a consistently better language modeling performance than MLE across domains. Moreover, EMO demonstrates noteworthy enhancements in downstream performance with minimal fine-tuning on merely 25,000 sentences. This highlights the tremendous potential of EMO as a lightweight calibration method for enhancing large-scale pre-trained language models.

conference paper, forward cross-entropy, language model, (14 more...)

2310.04691

Country:

North America > Canada > Ontario > Toronto (0.04)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
(2 more...)