AITopics

2306.00858

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
(6 more...)

Genre:

Research Report (0.64)
Questionnaire & Opinion Survey (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
(2 more...)

Erdogan, Mete, Baytekin, Nuri Berke, Coban, Serhat Emre, Demir, Alper

Machine Learning and Kalman Filtering for Nanomechanical Mass Spectrometry

arXiv.org Artificial IntelligenceJun-1-2023

Nanomechanical resonant sensors are used in mass spectrometry via detection of resonance frequency jumps. There is a fundamental trade-off between detection speed and accuracy. Temporal and size resolution are limited by the resonator characteristics and noise. A Kalman filtering technique, augmented with maximum-likelihood estimation, was recently proposed as a Pareto optimal solution. We present enhancements and robust realizations for this technique, including a confidence boosted thresholding approach as well as machine learning for event detection. We describe learning techniques that are based on neural networks and boosted decision trees for temporal location and event size estimation. In the pure learning based approach that discards the Kalman filter, the raw data from the sensor are used in training a model for both location and size prediction. In the alternative approach that augments a Kalman filter, the event likelihood history is used in a binary classifier for event occurrence. Locations and sizes are predicted using maximum-likelihood, followed by a Kalman filter that continually improves the size estimate. We present detailed comparisons of the learning based schemes and the confidence boosted thresholding approach, and demonstrate robust performance for a practical realization.

artificial intelligence, likelihood, machine learning, (17 more...)

2306.00563

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
(3 more...)

Genre:

Personal (0.46)
Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.76)

Moreno-Muñoz, Pablo, Recasens, Pol G., Hauberg, Søren

On Masked Pre-training and the Marginal Likelihood

arXiv.org Artificial IntelligenceJun-1-2023

Masked pre-training removes random input dimensions and learns a model that can predict the missing values. Empirical results indicate that this intuitive form of self-supervised learning yields models that generalize very well to new domains. A theoretical understanding is, however, lacking. This paper shows that masked pre-training with a suitable cumulative scoring function corresponds to maximizing the model's marginal likelihood, which is de facto the Bayesian model selection measure of generalization. Beyond shedding light on the success of masked pre-training, this insight also suggests that Bayesian models can be trained with appropriately designed self-supervision. Empirically, we confirm the developed theory and explore the main learning principles of masked pre-training in large language models.

artificial intelligence, lml, machine learning, (18 more...)

2306.0052

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Denmark (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Karumuri, Sharmila, Bilionis, Ilias

Learning to solve Bayesian inverse problems: An amortized variational inference approach

Inverse problems, i.e., estimating parameters of physical models from experimental data, are ubiquitous in science and engineering. The Bayesian formulation is the gold standard because it alleviates ill-posedness issues and quantifies epistemic uncertainty. Since analytical posteriors are not typically available, one resorts to Markov chain Monte Carlo sampling or approximate variational inference. However, inference needs to be rerun from scratch for each new set of data. This drawback limits the applicability of the Bayesian formulation to real-time settings, e.g., health monitoring of engineered systems, and medical diagnosis. The objective of this paper is to develop a methodology that enables real-time inference by learning the Bayesian inverse map, i.e., the map from data to posteriors. Our approach is as follows. We represent the posterior distribution using a parameterization based on deep neural networks. Next, we learn the network parameters by amortized variational inference method which involves maximizing the expectation of evidence lower bound over all possible datasets compatible with the model. We demonstrate our approach by solving examples a set of benchmark problems from science and engineering. Our results show that the posterior estimates of our approach are in agreement with the corresponding ground truth obtained by Markov chain Monte Carlo. Once trained, our approach provides the posterior parameters of observation just at the cost of a forward pass of the neural network.

artificial intelligence, machine learning, posterior, (18 more...)

2305.20004

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?

Boutin, Victor, Fel, Thomas, Singhal, Lakshya, Mukherji, Rishav, Nagaraj, Akash, Colin, Julien, Serre, Thomas

An important milestone for AI is the development of algorithms that can produce drawings that are indistinguishable from those of humans. Here, we adapt the 'diversity vs. recognizability' scoring framework from Boutin et al, 2022 and find that one-shot diffusion models have indeed started to close the gap between humans and machines. However, using a finer-grained measure of the originality of individual samples, we show that strengthening the guidance of diffusion models helps improve the humanness of their drawings, but they still fall short of approximating the originality and recognizability of human drawings. Comparing human category diagnostic features, collected through an online psychophysics experiment, against those derived from diffusion models reveals that humans rely on fewer and more localized features. Overall, our study suggests that diffusion models have significantly helped improve the quality of machine-generated drawings; however, a gap between humans and machines remains -- in part explainable by discrepancies in visual strategies.

artificial intelligence, human and machine, machine learning, (19 more...)

2301.11722

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Massachusetts (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
(2 more...)

Guth, Florentin, Lempereur, Etienne, Bruna, Joan, Mallat, Stéphane

Conditionally Strongly Log-Concave Generative Models

There is a growing gap between the impressive results of deep image generative models and classical algorithms that offer theoretical guarantees. The former suffer from mode collapse or memorization issues, limiting their application to scientific data. The latter require restrictive assumptions such as log-concavity to escape the curse of dimensionality. We partially bridge this gap by introducing conditionally strongly log-concave (CSLC) models, which factorize the data distribution into a product of conditional probability distributions that are strongly log-concave. This factorization is obtained with orthogonal projectors adapted to the data distribution. It leads to efficient parameter estimation and sampling algorithms, with theoretical guarantees, although the data distribution is not globally log-concave. We show that several challenging multiscale processes are conditionally log-concave using wavelet packet orthogonal projectors. Numerical results are shown for physical fields such as the $\varphi^4$ model and weak lensing convergence maps with higher resolution than in previous works.

decomposition, log-concave generative model, projector, (15 more...)

2306.00181

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre:

Research Report (0.50)
Overview (0.34)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Geng, Sinong, Nassif, Houssam, Manzanares, Carlos A.

A Data-Driven State Aggregation Approach for Dynamic Discrete Choice Models

arXiv.org Machine LearningMay-31-2023

We study dynamic discrete choice models, where a commonly studied problem involves estimating parameters of agent reward functions (also known as "structural" parameters), using agent behavioral data. Maximum likelihood estimation for such models requires dynamic programming, which is limited by the curse of dimensionality. In this work, we present a novel algorithm that provides a data-driven method for selecting and aggregating states, which lowers the computational and sample complexity of estimation. Our method works in two stages. In the first stage, we use a flexible inverse reinforcement learning approach to estimate agent Q-functions. We use these estimated Q-functions, along with a clustering algorithm, to select a subset of states that are the most pivotal for driving changes in Q-functions. In the second stage, with these selected "aggregated" states, we conduct maximum likelihood estimation using a commonly used nested fixed-point algorithm. The proposed two-stage approach mitigates the curse of dimensionality by reducing the problem dimension. Theoretically, we derive finite-sample bounds on the associated estimation error, which also characterize the trade-off of computational complexity, estimation error, and sample complexity. We demonstrate the empirical performance of the algorithm in two classic dynamic discrete choice estimation applications.

data mining, machine learning, reinforcement learning, (20 more...)

arXiv.org Machine Learning

2304.04916

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine (1.00)
Transportation > Air (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Park, Young-Jin, Wang, Hao, Ardeshir, Shervin, Azizan, Navid

Representation Reliability and Its Impact on Downstream Tasks

Self-supervised pre-trained models extract general-purpose representations from data, and quantifying how reliable they are is crucial because many downstream models use these representations as input for their own tasks. To this end, we first introduce a formal definition of representation reliability: the representation for a given test input is considered to be reliable if the downstream models built on top of that representation can consistently generate accurate predictions for that test point. It is desired to estimate the representation reliability without knowing the downstream tasks a priori. We provide a negative result showing that existing frameworks for uncertainty quantification in supervised learning are not suitable for this purpose. As an alternative, we propose an ensemble-based method for quantifying representation reliability, based on the concept of neighborhood consistency in the representation spaces across various pre-trained models. More specifically, the key insight is to use shared neighboring points as anchors to align different representation spaces. We demonstrate through comprehensive numerical experiments that our method is capable of predicting representation reliability with high accuracy.

data mining, machine learning, natural language, (19 more...)

2306.00206

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)
(4 more...)

A Comparison of Decision Algorithms on Newcomblike Problems

Altair, Alex

When formulated using Bayesian networks, two standard decision algorithms (Evidential Decision Theory and Causal Decision Theory) can be shown to fail systematically when faced with aspects of the prisoner's dilemma and so-called "Newcomblike" problems. We describe a new form of decision algorithm, called Timeless Decision Theory, which consistently wins on these problems.

artificial intelligence, machine learning, node, (18 more...)

2306.00175

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China (0.05)
North America > United States > New York (0.05)
(8 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.52)

Deception by Omission: Using Adversarial Missingness to Poison Causal Structure Learning

Koyuncu, Deniz, Gittens, Alex, Yener, Bülent, Yung, Moti

Inference of causal structures from observational data is a key component of causal machine learning; in practice, this data may be incompletely observed. Prior work has demonstrated that adversarial perturbations of completely observed training data may be used to force the learning of inaccurate causal structural models (SCMs). However, when the data can be audited for correctness (e.g., it is crytographically signed by its source), this adversarial mechanism is invalidated. This work introduces a novel attack methodology wherein the adversary deceptively omits a portion of the true training data to bias the learned causal structures in a desired manner. Theoretically sound attack mechanisms are derived for the case of arbitrary SCMs, and a sample-efficient learning-based heuristic is given for Gaussian SCMs. Experimental validation of these approaches on real and synthetic data sets demonstrates the effectiveness of adversarial missingness attacks at deceiving popular causal structure learning algorithms.

artificial intelligence, machine learning, missingness mechanism, (18 more...)

2305.20043

Country:

North America > United States > New York > Rensselaer County > Troy (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)