AITopics

2306.05843

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(3 more...)

arXiv.org Artificial IntelligenceJun-9-2023

Ambiguity Meets Uncertainty: Investigating Uncertainty Estimation for Word Sense Disambiguation

Liu, Zhu, Liu, Ying

Word sense disambiguation (WSD), which aims to determine an appropriate sense for a target word given its context, is crucial for natural language understanding. Existing supervised methods treat WSD as a classification task and have achieved remarkable performance. However, they ignore uncertainty estimation (UE) in the real-world setting, where the data is always noisy and out of distribution. This paper extensively studies UE on the benchmark designed for WSD. Specifically, we first compare four uncertainty scores for a state-of-the-art WSD model and verify that the conventional predictive probabilities obtained at the end of the model are inadequate to quantify uncertainty. Then, we examine the capability of capturing data and model uncertainties by the model with the selected UE score on well-designed test scenarios and discover that the model reflects data uncertainty satisfactorily but underestimates model uncertainty. Furthermore, we explore numerous lexical properties that intrinsically affect data uncertainty and provide a detailed analysis of four critical aspects: the syntactic category, morphology, sense granularity, and semantic relations. The code is available at https://github.com/RyanLiut/WSD-UE.

machine learning, model uncertainty, natural language, (19 more...)

2305.13119

Country:

North America > United States > New Jersey (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
(3 more...)

Yang, Songlin, Levy, Roger P., Kim, Yoon

Unsupervised Discontinuous Constituency Parsing with Mildly Context-Sensitive Grammars

arXiv.org Artificial IntelligenceJun-9-2023

We study grammar induction with mildly context-sensitive grammars for unsupervised discontinuous parsing. Using the probabilistic linear context-free rewriting system (LCFRS) formalism, our approach fixes the rule structure in advance and focuses on parameter learning with maximum likelihood. To reduce the computational complexity of both parsing and parameter estimation, we restrict the grammar formalism to LCFRS-2 (i.e., binary LCFRS with fan-out two) and further discard rules that require O(n^6) time to parse, reducing inference to O(n^5). We find that using a large number of nonterminals is beneficial and thus make use of tensor decomposition-based rank-space dynamic programming with an embedding-based parameterization of rule probabilities to scale up the number of nonterminals. Experiments on German and Dutch show that our approach is able to induce linguistically meaningful trees with continuous and discontinuous structures

computational linguistic, machine learning, natural language, (18 more...)

2212.0914

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > China > Beijing > Beijing (0.04)
(18 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Airiau, Stéphane, Dupuis, Nicholas Kees, Grossi, Davide

Condorcet Markets

Within the classical Condorcet error model for collective binary decisions, we establish equivalence results between elections and markets, showing that the alternative that would be selected by weighed majority voting (under specific weighting schemes) corresponds to the alternative with highest price in the equilibrium of the market (under specific assumptions on the market type). This makes it possible to implement specific weighted majority elections, which are known to have superior truth-tracking performance, through information markets and, crucially, without needing to elicit voters' competences.

equation, equilibrium, equilibrium price, (15 more...)

2306.05028

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Quebec City (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kowal, Daniel R., Wu, Bohan

Monte Carlo inference for semiparametric Bayesian regression

arXiv.org Machine LearningJun-8-2023

Data transformations are essential for broad applicability of parametric regression models. However, for Bayesian analysis, joint inference of the transformation and model parameters typically involves restrictive parametric transformations or nonparametric representations that are computationally inefficient and cumbersome for implementation and theoretical analysis, which limits their usability in practice. This paper introduces a simple, general, and efficient strategy for joint posterior inference of an unknown transformation and all regression model parameters. The proposed approach directly targets the posterior distribution of the transformation by linking it with the marginal distributions of the independent and dependent variables, and then deploys a Bayesian nonparametric model via the Bayesian bootstrap. Crucially, this approach delivers (1) joint posterior consistency under general conditions, including multiple model misspecifications, and (2) efficient Monte Carlo (not Markov chain Monte Carlo) inference for the transformation and all parameters for important special cases. These tools apply across a variety of data domains, including real-valued, integer-valued, compactly-supported, and positive data. Simulation studies and an empirical application demonstrate the effectiveness and efficiency of this strategy for semiparametric Bayesian analysis with linear models, quantile regression, and Gaussian processes.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

2306.05498

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Government (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Protein Discovery with Discrete Walk-Jump Sampling

Frey, Nathan C., Berenberg, Daniel, Zadorozhny, Karina, Kleinhenz, Joseph, Lafrance-Vanasse, Julien, Hotzel, Isidro, Wu, Yan, Ra, Stephen, Bonneau, Richard, Cho, Kyunghyun, Loukas, Andreas, Gligorijevic, Vladimir, Saremi, Saeed

We resolve difficulties in training and sampling from a discrete generative model by learning a smoothed energy function, sampling from the smoothed data manifold with Langevin Markov chain Monte Carlo (MCMC), and projecting back to the true data manifold with one-step denoising. Our Discrete Walk-Jump Sampling formalism combines the maximum likelihood training of an energy-based model and improved sample quality of a score-based model, while simplifying training and sampling by requiring only a single noise level. We evaluate the robustness of our approach on generative modeling of antibody proteins and introduce the distributional conformity score to benchmark protein generative models. By optimizing and sampling from our models for the proposed distributional conformity score, 97-100% of generated samples are successfully expressed and purified and 35% of functional designs show equal or improved binding affinity compared to known functional antibodies on the first attempt in a single round of laboratory experiments. We also report the first demonstration of long-run fast-mixing MCMC chains where diverse antibody protein classes are visited in a single MCMC chain.

artificial intelligence, machine learning, natural language, (20 more...)

2306.1236

Country:

North America > United States > New York (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.97)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(3 more...)

Luo, Weijian, Zhang, Boya, Zhang, Zhihua

Entropy-based Training Methods for Scalable Neural Implicit Sampler

Efficiently sampling from un-normalized target distributions is a fundamental problem in scientific computing and machine learning. Traditional approaches like Markov Chain Monte Carlo (MCMC) guarantee asymptotically unbiased samples from such distributions but suffer from computational inefficiency, particularly when dealing with high-dimensional targets, as they require numerous iterations to generate a batch of samples. In this paper, we propose an efficient and scalable neural implicit sampler that overcomes these limitations. Our sampler can generate large batches of samples with low computational costs by leveraging a neural transformation that directly maps easily sampled latent vectors to target samples without the need for iterative procedures. To train the neural implicit sampler, we introduce two novel methods: the KL training method and the Fisher training method. The former minimizes the Kullback-Leibler divergence, while the latter minimizes the Fisher divergence. By employing these training methods, we effectively optimize the neural implicit sampler to capture the desired target distribution. To demonstrate the effectiveness, efficiency, and scalability of our proposed samplers, we evaluate them on three sampling benchmarks with different scales. These benchmarks include sampling from 2D targets, Bayesian inference, and sampling from high-dimensional energy-based models (EBMs). Notably, in the experiment involving high-dimensional EBMs, our sampler produces samples that are comparable to those generated by MCMC-based methods while being more than 100 times more efficient, showcasing the efficiency of our neural sampler. We believe that the theoretical and empirical contributions presented in this work will stimulate further research on developing efficient samplers for various applications beyond the ones explored in this study.

artificial intelligence, machine learning, sampler, (20 more...)

2306.04952

Country:

Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Zhou, Wangchunshu, Jiang, Yuchen Eleanor, Wilcox, Ethan, Cotterell, Ryan, Sachan, Mrinmaya

Controlled Text Generation with Natural Language Instructions

Large language models generate fluent texts and can follow natural language instructions to solve a wide range of tasks without task-specific training. Nevertheless, it is notoriously difficult to control their generation to satisfy the various constraints required by different applications. In this work, we present InstructCTG, a controlled text generation framework that incorporates different constraints by conditioning on natural language descriptions and demonstrations of the constraints. In particular, we first extract the underlying constraints of natural texts through a combination of off-the-shelf NLP tools and simple heuristics. We then verbalize the constraints into natural language instructions to form weakly supervised training data. By prepending natural language descriptions of the constraints and a few demonstrations, we fine-tune a pre-trained language model to incorporate various types of constraints. Compared to existing search-based or score-based methods, InstructCTG is more flexible to different constraint types and has a much smaller impact on the generation quality and speed because it does not modify the decoding procedure. Additionally, InstructCTG allows the model to adapt to new constraints without re-training through the use of few-shot task generalization and in-context learning abilities of instruction-tuned language models.

constraint, large language model, machine learning, (20 more...)

2304.14293

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
(5 more...)

Ray, Deep, Murgoitio-Esandi, Javier, Dasgupta, Agnimitra, Oberai, Assad A.

Solution of physics-based inverse problems using conditional generative adversarial networks with full gradient penalty

arXiv.org Artificial IntelligenceJun-7-2023

The solution of probabilistic inverse problems for which the corresponding forward problem is constrained by physical principles is challenging. This is especially true if the dimension of the inferred vector is large and the prior information about it is in the form of a collection of samples. In this work, a novel deep learning based approach is developed and applied to solving these types of problems. The approach utilizes samples of the inferred vector drawn from the prior distribution and a physics-based forward model to generate training data for a conditional Wasserstein generative adversarial network (cWGAN). The cWGAN learns the probability distribution for the inferred vector conditioned on the measurement and produces samples from this distribution. The cWGAN developed in this work differs from earlier versions in that its critic is required to be 1-Lipschitz with respect to both the inferred and the measurement vectors and not just the former. This leads to a loss term with the full (and not partial) gradient penalty. It is shown that this rather simple change leads to a stronger notion of convergence for the conditional density learned by the cWGAN and a more robust and accurate sampling strategy. Through numerical examples it is shown that this change also translates to better accuracy when solving inverse problems. The numerical examples considered include illustrative problems where the true distribution and/or statistics are known, and a more complex inverse problem motivated by applications in biomechanics.

artificial intelligence, cwgan, machine learning, (20 more...)

2306.04895

Country:

North America > United States > Maryland (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Germany (0.14)
Asia > Middle East > Israel > Mediterranean Sea (0.14)

Genre:

Instructional Material (0.68)
Research Report (0.64)

Industry:

Energy > Oil & Gas > Upstream (0.49)
Health & Medicine > Therapeutic Area (0.47)
Health & Medicine > Health Care Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceJun-7-2023

Efficient computation of rankings from pairwise comparisons

Newman, M. E. J.

We study the ranking of individuals, teams, or objects, based on pairwise comparisons between them, using the Bradley-Terry model. Estimates of rankings within this model are commonly made using a simple iterative algorithm first introduced by Zermelo almost a century ago. Here we describe an alternative and similarly simple iteration that provably returns identical results but does so much faster -- over a hundred times faster in some cases. We demonstrate this algorithm with applications to a range of example data sets and derive a number of results regarding its convergence.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2207.00076

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > North Carolina (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Leisure & Entertainment > Sports > Football (0.93)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)