AITopics

artificial intelligence, covariance matrix adaption evolutionary strategy, machine learning, (22 more...)

Within one decade, Deep Learning overtook the dominant solution methods for countless problems of artificial intelligence. ``Deep'' refers to the deep architectures with operations in manifolds of which there are no immediate observations. For these deep architectures some kind of structure is pre-defined -- but what is this structure? With a formal definition for structures of neural networks, neural architecture search problems and solution methods can be formulated under a common framework. Both practical and theoretical questions arise from closing the gap between applied neural architecture search and learning theory. Does structure make a difference or can it be chosen arbitrarily? This work is concerned with deep structures of artificial neural networks and examines automatic construction methods under empirical principles to shed light on the so-called ``black-box models''. Our contributions include a formulation of graph-induced neural networks that is used to pose optimisation problems for neural architecture. We analyse structural properties for different neural network objectives such as correctness, robustness or energy consumption and discuss how structure affects them. Selected automation methods for neural architecture optimisation problems are discussed and empirically analysed. With the insights gained from formalising graph-induced neural networks, analysing structural properties and comparing the applicability of neural architecture search methods qualitatively and quantitatively, we advance these methods in two ways. First, new predictive models are presented for replacing computationally expensive evaluation schemes, and second, new generative models for informed sampling during neural architecture search are analysed and discussed.

2410.09579

Country:

Europe (1.00)
North America > United States > California (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Information Technology (0.92)
Health & Medicine > Therapeutic Area > Neurology (0.67)
Leisure & Entertainment > Games (0.45)
Energy > Oil & Gas > Upstream (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

Jiao, Xiaoran, Mao, Weian, Jin, Wengong, Yang, Peiyuan, Chen, Hao, Shen, Chunhua

Predicting the change in binding free energy ($\Delta \Delta G$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $\Delta \Delta G$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained inverse folding models to $\Delta \Delta G$ prediction. We begin by analyzing the thermodynamic definition of $\Delta \Delta G$ and introducing the Boltzmann distribution to connect energy with protein conformational distribution. However, the protein conformational distribution is intractable; therefore, we employ Bayes' theorem to circumvent direct estimation and instead utilize the log-likelihood provided by protein inverse folding models for $\Delta \Delta G$ estimation. Compared to previous inverse folding-based methods, our method explicitly accounts for the unbound state of the protein complex in the $\Delta \Delta G$ thermodynamic cycle, introducing a physical inductive bias and achieving both supervised and unsupervised state-of-the-art (SoTA) performance. Experimental results on SKEMPI v2 indicate that our method achieves Spearman coefficients of 0.3201 (unsupervised) and 0.5134 (supervised), significantly surpassing the previously reported SoTA values of 0.2632 and 0.4324, respectively. Furthermore, we demonstrate the capability of our method on binding energy prediction, protein-protein docking and antibody optimization tasks.
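
The chain of reasoning in this abstract (thermodynamic definition, Boltzmann distribution, Bayes' theorem) can be written out as a short derivation. This is a sketch under stated assumptions, not necessarily the exact formulation used in the paper: $S$ denotes a sequence, $X_{\mathrm{bnd}}$ and $X_{\mathrm{unbnd}}$ denote bound and unbound structures (notation introduced here for illustration), and the wild-type and mutant are assumed to share the same pair of structures so that the structure prior cancels in the last step.

```latex
\begin{align}
% Boltzmann distribution: relative population of bound vs. unbound state
\Delta G &= G_{\mathrm{bnd}} - G_{\mathrm{unbnd}}
          = -k_B T \,\ln \frac{p(X_{\mathrm{bnd}} \mid S)}{p(X_{\mathrm{unbnd}} \mid S)}, \\
% Bayes' theorem: p(X | S) = p(S | X) p(X) / p(S); the factor p(S) cancels in the ratio
\frac{p(X_{\mathrm{bnd}} \mid S)}{p(X_{\mathrm{unbnd}} \mid S)}
         &= \frac{p(S \mid X_{\mathrm{bnd}})}{p(S \mid X_{\mathrm{unbnd}})}
            \cdot \frac{p(X_{\mathrm{bnd}})}{p(X_{\mathrm{unbnd}})}, \\
% Thermodynamic cycle: the prior ratio is identical for mutant and wild-type and drops out
\Delta\Delta G &= \Delta G^{\mathrm{mut}} - \Delta G^{\mathrm{wt}}
          = -k_B T \left[
              \ln \frac{p(S^{\mathrm{mut}} \mid X_{\mathrm{bnd}})}{p(S^{\mathrm{mut}} \mid X_{\mathrm{unbnd}})}
            - \ln \frac{p(S^{\mathrm{wt}} \mid X_{\mathrm{bnd}})}{p(S^{\mathrm{wt}} \mid X_{\mathrm{unbnd}})}
            \right].
\end{align}
```

Under these assumptions, each term $\ln p(S \mid X)$ is a quantity that an inverse folding model can supply as a log-likelihood, and the unbound-state terms are the ones the abstract says earlier inverse folding-based methods leave out.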

artificial intelligence, machine learning, prediction, (14 more...)

2410.09543

Country:

Oceania > Australia > South Australia > Adelaide (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > China (0.04)

Genre: Research Report (0.83)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

VERITAS-NLI : Validation and Extraction of Reliable Information Through Automated Scraping and Natural Language Inference

Shah, Arjun, Shah, Hetansh, Bafna, Vedica, Khandor, Charmi, Nair, Sindhu

In today's day and age where information is rapidly spread through online platforms, the rise of fake news poses an alarming threat to the integrity of public discourse, societal trust, and reputed news sources. Classical machine learning and Transformer-based models have been extensively studied for the task of fake news detection; however, they are hampered by their reliance on training data and are unable to generalize to unseen headlines. To address these challenges, we propose our novel solution, leveraging web-scraping techniques and Natural Language Inference (NLI) models to retrieve external knowledge necessary for verifying the accuracy of a headline. Our system is evaluated on a diverse self-curated evaluation dataset spanning multiple news channels and broad domains. Our best performing pipeline achieves an accuracy of 84.3%, surpassing the best classical Machine Learning model by 33.3% and Bidirectional Encoder Representations from Transformers (BERT) by 31.0%. This highlights the efficacy of combining dynamic web-scraping with Natural Language Inference to find support for a claimed headline in the corresponding externally retrieved knowledge for the task of fake news detection.

large language model, machine learning, natural language, (19 more...)

2410.09455

Country:

Asia > India (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Qatar (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Media > News (1.00)
Information Technology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.67)
Health & Medicine > Therapeutic Area > Immunology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Information Discovery in e-Commerce

Ren, Zhaochun, He, Xiangnan, Yin, Dawei, de Rijke, Maarten

conversational recommendation system, multi-turn chinese dialogue dataset, post-click conversion rate estimation, (15 more...)

Electronic commerce, or e-commerce, is the buying and selling of goods and services, or the transmitting of funds or data online. E-commerce platforms come in many kinds, with global players such as Amazon, Airbnb, Alibaba, eBay and platforms targeting specific geographic regions. Information retrieval has a natural role to play in e-commerce, especially in connecting people to goods and services. Information discovery in e-commerce concerns different types of search (e.g., exploratory search vs. lookup tasks), recommender systems, and natural language processing in e-commerce portals. The rise in popularity of e-commerce sites has made research on information discovery in e-commerce an increasingly active research area. This is witnessed by an increase in publications and dedicated workshops in this space. Methods for information discovery in e-commerce largely focus on improving the effectiveness of e-commerce search and recommender systems, on enriching and using knowledge graphs to support e-commerce, and on developing innovative question answering and bot-based solutions that help to connect people to goods and services. In this survey, an overview is given of the fundamental infrastructure, algorithms, and technical solutions for information discovery in e-commerce. The topics covered include user behavior and profiling, search, recommendation, and language technology in e-commerce.

2410.05763

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Africa > Togo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(7 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Research Report > Promising Solution (0.92)

Industry: Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > e-Commerce (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(13 more...)

arXiv.org Machine Learning, Oct-12-2024

Scalable Weibull Graph Attention Autoencoder for Modeling Document Networks

Wang, Chaojie, Liu, Xinyang, Wang, Dongsheng, Zhang, Hao, Chen, Bo, Zhou, Mingyuan

Although existing variational graph autoencoders (VGAEs) have been widely used for modeling and generating graph-structured data, most of them are still not flexible enough to approximate the sparse and skewed latent node representations, especially those of document relational networks (DRNs) with discrete observations. To analyze a collection of interconnected documents, a typical branch of Bayesian models, specifically relational topic models (RTMs), has proven its efficacy in describing both link structures and document contents of DRNs, which motivates us to incorporate RTMs into existing VGAEs to alleviate their potential issues when modeling the generation of DRNs. In this paper, moving beyond the sophisticated approximate assumptions of traditional RTMs, we develop a graph Poisson factor analysis (GPFA), which provides analytic conditional posteriors to improve the inference accuracy, and extend GPFA to a multi-stochastic-layer version named graph Poisson gamma belief network (GPGBN) to capture the hierarchical document relationships at multiple semantic levels. Then, taking GPGBN as the decoder, we combine it with various Weibull-based graph inference networks, resulting in two variants of Weibull graph auto-encoder (WGAE), equipped with model inference algorithms. Experimental results demonstrate that our models can extract high-quality hierarchical latent document representations and achieve promising performance on various graph analytic tasks.

adjacency matrix, gpgbn, representation, (14 more...)


2410.09696

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
North America > United States > California (0.04)
(7 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Neural Information Processing Systems, Oct-11-2024, 15:44:35 GMT

Recurrent Bayesian Classifier Chains for Exact Multi-Label Classification

Exact multi-label classification is the task of assigning each datapoint a set of class labels such that the assigned set exactly matches the ground truth. Optimizing for exact multi-label classification is important in domains where missing a single label can be especially costly, such as in object detection for autonomous vehicles or symptom classification for disease diagnosis. Recurrent Classifier Chains (RCCs), a recurrent neural network extension of ensemble-based classifier chains, are the state-of-the-art exact multi-label classification method for maximizing subset accuracy. However, RCCs iteratively predict classes with an unprincipled ordering, and therefore indiscriminately condition class probabilities. These disadvantages make RCCs prone to predicting inaccurate label sets. In this work we propose Recurrent Bayesian Classifier Chains (RBCCs), which learn a Bayesian network of class dependencies and leverage this network in order to condition the prediction of child nodes only on their parents.
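
The idea of conditioning each label only on its parents in a Bayesian network can be illustrated with a toy factorisation. The following is a minimal sketch, not the RBCC model itself: the three-label network, the probability tables, and the function names are all hypothetical, and the probabilities are fixed by hand rather than predicted by a recurrent network. It shows how the probability of an exact label set factorises as a product of per-label conditionals, which is the quantity that subset accuracy rewards.

```python
from itertools import product

# Hypothetical toy network over three labels: A is a root, B and C each have A as parent.
parents = {"A": [], "B": ["A"], "C": ["A"]}

# Hypothetical tables: P(label = 1 | parent values), keyed by the tuple of parent values.
cpt = {
    "A": {(): 0.6},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.3, (1,): 0.7},
}

# Topological order of the network, so parents are always assigned before children.
order = ["A", "B", "C"]


def joint_probability(assignment):
    """Probability of one complete label set: product of P(label | its parents)."""
    prob = 1.0
    for label in order:
        key = tuple(assignment[p] for p in parents[label])
        p_one = cpt[label][key]
        prob *= p_one if assignment[label] == 1 else 1.0 - p_one
    return prob


def most_probable_label_set():
    """Exhaustively score all 2^n label sets and return the most probable one."""
    candidates = (
        dict(zip(order, values)) for values in product([0, 1], repeat=len(order))
    )
    return max(candidates, key=joint_probability)


best = most_probable_label_set()
print(best, joint_probability(best))
```

Exhaustive enumeration is only feasible for a handful of labels; it is used here purely to make the factorisation explicit.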

exact multi-label classification, prediction, recurrent bayesian classifier chain

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.63)

Neural Information Processing Systems, Oct-11-2024, 15:22:34 GMT

Sample-Efficient Reinforcement Learning of Partially Observable Markov Games

This paper considers the challenging tasks of Multi-Agent Reinforcement Learning (MARL) under partial observability, where each agent only sees her own individual observations and actions that reveal incomplete information about the underlying state of the system. This paper studies these tasks under the general model of multiplayer general-sum Partially Observable Markov Games (POMGs), which is significantly larger than the standard model of Imperfect Information Extensive-Form Games (IIEFGs). We identify a rich subclass of POMGs---weakly revealing POMGs---in which sample-efficient learning is tractable. In the self-play setting, we prove that a simple algorithm combining optimism and Maximum Likelihood Estimation (MLE) is sufficient to find approximate Nash equilibria, correlated equilibria, as well as coarse correlated equilibria of weakly revealing POMGs, in a polynomial number of samples when the number of agents is small. In the setting of playing against adversarial opponents, we show that a variant of our optimistic MLE algorithm is capable of achieving sublinear regret when compared against the optimal maximin policies.

Neural Information Processing Systems, Oct-11-2024, 14:53:37 GMT

A Computationally Efficient Method for Learning Exponential Family Distributions

We consider the question of learning the natural parameters of a $k$-parameter minimal exponential family from i.i.d. samples. We focus on the setting where the support as well as the natural parameters are appropriately bounded. While the traditional maximum likelihood estimator for this class of exponential family is consistent, asymptotically normal, and asymptotically efficient, evaluating it is computationally hard. In this work, we propose a computationally efficient estimator that is consistent as well as asymptotically normal under mild conditions. We provide finite sample guarantees to achieve an $\ell_2$ error of $\alpha$ in the parameter estimation with sample complexity $O(\mathrm{poly}(k/\alpha))$ and computational complexity $O(\mathrm{poly}(k/\alpha))$.

asymptotically normal, computationally efficient method, learning exponential family distribution, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)

Neural Information Processing Systems, Oct-11-2024, 14:52:23 GMT

Non-convex Statistical Optimization for Sparse Tensor Graphical Model

We consider the estimation of sparse graphical models that characterize the dependency structure of high-dimensional tensor-valued data. To facilitate the estimation of the precision matrix corresponding to each way of the tensor, we assume the data follow a tensor normal distribution whose covariance has a Kronecker product structure. The penalized maximum likelihood estimation of this model involves minimizing a non-convex objective function. In spite of the non-convexity of this estimation problem, we prove that an alternating minimization algorithm, which iteratively estimates each sparse precision matrix while fixing the others, attains an estimator with the optimal statistical rate of convergence as well as consistent graph recovery. Notably, such an estimator achieves estimation consistency with only one tensor sample, a phenomenon not observed in previous work.
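
The Kronecker product assumption is what lets the model be estimated one mode at a time: the inverse of a Kronecker product is the Kronecker product of the inverses, so the full precision matrix is determined by the small per-mode precision matrices. The snippet below is a minimal numerical check of that identity for a two-way case, assuming NumPy is available; the dimensions, the random covariances, and the helper `random_spd` are illustrative and are not taken from the paper.

```python
import numpy as np


def random_spd(dim, rng):
    """Return a random symmetric positive-definite matrix (illustrative helper)."""
    a = rng.standard_normal((dim, dim))
    return a @ a.T + dim * np.eye(dim)


rng = np.random.default_rng(0)

# Hypothetical per-mode covariance matrices for a 3 x 4 matrix-valued sample.
cov_mode1 = random_spd(3, rng)
cov_mode2 = random_spd(4, rng)

# Covariance of the vectorised sample under the Kronecker structure (12 x 12).
full_cov = np.kron(cov_mode1, cov_mode2)

# Inverting the full covariance directly ...
full_precision = np.linalg.inv(full_cov)

# ... agrees with combining the much smaller per-mode precision matrices.
kron_of_precisions = np.kron(np.linalg.inv(cov_mode1), np.linalg.inv(cov_mode2))

precisions_match = np.allclose(full_precision, kron_of_precisions)
print(precisions_match)
```

Here the full precision has 144 entries while the two per-mode matrices have 9 and 16, which is the kind of parameter reduction that makes the high-dimensional setting tractable.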

estimation, non-convex statistical optimization, sparse tensor graphical model, (1 more...)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Neural Information Processing Systems, Oct-11-2024, 13:34:26 GMT

The Population Posterior and Bayesian Modeling on Streams

Many modern data analysis problems involve inferences from streaming data. However, streaming data is not easily amenable to the standard probabilistic modeling approaches, which assume that we condition on finite data. We develop population variational Bayes, a new approach for using Bayesian modeling to analyze streams of data. It approximates a new type of distribution, the population posterior, which combines the notion of a population distribution of the data with Bayesian inference in a probabilistic model. We study our method with latent Dirichlet allocation and Dirichlet process mixtures on several large-scale data sets.

inference, population posterior and bayesian modeling

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)