AITopics

2407.05145

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India > West Bengal > Kolkata (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning Jul-6-2024

Idiographic Personality Gaussian Process for Psychological Assessment

Chen, Yehu, Xi, Muchen, Montgomery, Jacob, Jackson, Joshua, Garnett, Roman

We develop a novel measurement framework based on a Gaussian process coregionalization model to address a long-standing debate in psychometrics: whether psychological features like personality share a common structure across the population, vary uniquely for individuals, or exhibit some combination of the two. We propose the idiographic personality Gaussian process (IPGP) framework, an intermediate model that accommodates both shared trait structure across a population and "idiographic" deviations for individuals. IPGP leverages the Gaussian process coregionalization model to handle the grouped nature of battery responses, adapted to non-Gaussian ordinal data. We further exploit stochastic variational inference for the efficient latent factor estimation required for idiographic modeling at scale. Using synthetic and real data, we show that IPGP improves both prediction of actual responses and estimation of individualized factor structures relative to existing benchmarks. In a third study, we show that IPGP also identifies unique clusters of personality taxonomies in real-world data, displaying great potential in advancing individualized approaches to psychological diagnosis and treatment.

compass, creativ, organiz, (16 more...)

2407.0497

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Education (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Zong, Yifei, Barajas-Solano, David, Tartakovsky, Alexandre M.

Randomized Physics-Informed Neural Networks for Bayesian Data Assimilation

arXiv.org Artificial Intelligence Jul-5-2024

We propose a randomized physics-informed neural network (PINN) method, or rPINN, for uncertainty quantification in inverse partial differential equation (PDE) problems with noisy data. This method is used to quantify uncertainty in the inverse PDE PINN solutions. Recently, the Bayesian PINN (BPINN) method was proposed, where the posterior distribution of the PINN parameters was formulated using Bayes' theorem and sampled using approximate inference methods such as the Hamiltonian Monte Carlo (HMC) and variational inference (VI) methods. In this work, we demonstrate that HMC fails to converge for non-linear inverse PDE problems. As an alternative to HMC, we sample the distribution by solving the stochastic optimization problem obtained by randomizing the PINN loss function. The effectiveness of the rPINN method is tested for linear and non-linear Poisson equations, and the diffusion equation with a high-dimensional space-dependent diffusion coefficient. The rPINN method provides informative distributions for all considered problems. For the linear Poisson equation, HMC and rPINN produce similar distributions, but rPINN is on average 27 times faster than HMC. For the non-linear Poisson and diffusion equations, the HMC method fails to converge because a single HMC chain cannot sample multiple modes of the posterior distribution of the PINN parameters in a reasonable amount of time.

artificial intelligence, machine learning, posterior distribution, (19 more...)

2407.04617

Country: North America > United States > Illinois (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Yusuf, Bolaji, Černocký, Jan "Honza", Saraçlar, Murat

Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units

arXiv.org Artificial Intelligence Jul-5-2024

End-to-end (E2E) keyword search (KWS) has emerged as an alternative and complementary approach to conventional keyword search, which depends on the output of automatic speech recognition (ASR) systems. While E2E methods greatly simplify the KWS pipeline, they generally have worse performance than their ASR-based counterparts, which can benefit from pretraining with untranscribed data. In this work, we propose a method for pretraining E2E KWS systems with untranscribed data, which involves using acoustic unit discovery (AUD) to obtain discrete units for untranscribed data and then learning to locate sequences of such units in the speech. We conduct experiments across languages and AUD systems: we show that finetuning such a model significantly outperforms a model trained from scratch, and the performance improvements are generally correlated with the quality of the AUD system used for pretraining.

acoustic unit, query, unit discovery, (13 more...)

2407.04652

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)
Asia > Middle East > Republic of Türkiye (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
(2 more...)

Leonelli, Manuele, Smith, Jim Q., Wright, Sophia K.

The diameter of a stochastic matrix: A new measure for sensitivity analysis in Bayesian networks

arXiv.org Artificial Intelligence Jul-5-2024

The use of Bayesian networks (BNs) as a decision support tool in business and OR has been increasing over the years, including case studies in project management (van Dorp, 2020), supply chain (Garvey et al., 2015), marketing (Hosseini, 2021), and logistics (Qazi, 2022), among others. BNs are defined by two components: a directed acyclic graph (DAG) where each node is a variable of interest and edges represent the (possibly causal) relationships between them; and a conditional probability table (CPT) for each node of the DAG reporting the probability distribution of the associated variable conditional on its parents. BNs are highly interpretable due to their graphical nature, representing the probabilistic relationships between variables, making it easy for users to understand and trace the influence of one variable on another. With explainability now recognized as critical for the use of AI in applied research (Rudin, 2019), including in OR (De Bock et al., 2023), BNs stand out by providing transparent and intuitive explanations, thereby enhancing trust and clarity in decision-making processes. The underlying DAG and the associated CPTs can be learned from data using machine learning algorithms or elicited using experts' opinions and knowledge. There is now a vast number of algorithms to learn BNs from data (e.g.

bayesian network, cpt, diameter, (16 more...)
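
The two components named in the excerpt above, a DAG and one CPT per node, can be illustrated with a minimal sketch. The two-node network (`rain` -> `wet`), its probabilities, and the helper names `joint` and `marginal` are hypothetical and chosen only for illustration; they do not come from the paper, and the paper's diameter measure is not implemented here.

```python
from itertools import product

# Hypothetical DAG: each node maps to the tuple of its parents.
parents = {"rain": (), "wet": ("rain",)}

# Hypothetical CPTs: node -> {parent values -> {node value -> probability}}.
cpt = {
    "rain": {(): {True: 0.2, False: 0.8}},
    "wet": {
        (True,): {True: 0.9, False: 0.1},
        (False,): {True: 0.1, False: 0.9},
    },
}


def joint(assignment):
    """Joint probability as the product of each node's CPT entry given its parents."""
    p = 1.0
    for node, pa in parents.items():
        key = tuple(assignment[q] for q in pa)
        p *= cpt[node][key][assignment[node]]
    return p


def marginal(node, value):
    """Marginal probability by brute-force summation over all assignments."""
    names = list(parents)
    total = 0.0
    for values in product([True, False], repeat=len(names)):
        a = dict(zip(names, values))
        if a[node] == value:
            total += joint(a)
    return total
```

Brute-force enumeration is exponential in the number of nodes, so this only suits toy networks; it is meant to show how the DAG factorizes the joint distribution, not how inference is done in practice.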

2407.04667

Country:

Asia (0.06)
Europe > United Kingdom > England > West Midlands > Coventry (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(2 more...)

Genre:

Overview (0.67)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning Jul-5-2024

Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF

Cen, Shicong, Mei, Jincheng, Goshvadi, Katayoon, Dai, Hanjun, Yang, Tong, Yang, Sherry, Schuurmans, Dale, Chi, Yuejie, Dai, Bo

Reinforcement learning from human feedback (RLHF) has demonstrated great promise in aligning large language models (LLMs) with human preference. Depending on the availability of preference data, both online and offline RLHF are active areas of investigation. A key bottleneck is understanding how to incorporate uncertainty estimation in the reward function learned from the preference data for RLHF, regardless of how the preference data is collected. While the principles of optimism or pessimism under uncertainty are well-established in standard reinforcement learning (RL), a practically implementable and theoretically grounded form amenable to large language models is not yet available, as standard techniques for constructing confidence intervals become intractable under arbitrary policy parameterizations. In this paper, we introduce a unified approach to online and offline RLHF -- value-incentivized preference optimization (VPO) -- which regularizes the maximum-likelihood estimate of the reward function with the corresponding value function, modulated by a $\textit{sign}$ to indicate whether optimism or pessimism is chosen. VPO also directly optimizes the policy with implicit reward modeling, and therefore shares a simpler RLHF pipeline similar to direct preference optimization. Theoretical guarantees of VPO are provided for both online and offline settings, matching the rates of their standard RL counterparts. Moreover, experiments on text summarization and dialog verify the practicality and effectiveness of VPO.

arxiv preprint arxiv, cal, rlhf, (14 more...)
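
One way to read the regularization described in the abstract above is the following schematic objective. The notation is an assumption made here for illustration and is not taken from the paper: $\ell(r; \mathcal{D})$ denotes the negative log-likelihood of the preference data $\mathcal{D}$ under reward $r$, $V^{\star}(r)$ the optimal value attainable under $r$, $\alpha > 0$ a regularization weight, and $s \in \{+1, -1\}$ the sign.

$$
\hat{r} \;=\; \arg\min_{r} \Big\{ \ell(r; \mathcal{D}) \;-\; s \, \alpha \, V^{\star}(r) \Big\}
$$

Under this reading, $s = +1$ biases the estimate toward rewards with higher value (optimism, the usual choice when data is collected online), while $s = -1$ biases it toward rewards with lower value (pessimism, the usual choice for a fixed offline dataset).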

2405.1932

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Zhang, Zhikun, Duan, Yiting, Wang, Xiangjun, Zhang, Mingyuan

Machine Learning for Complex Systems with Abnormal Pattern by Exception Maximization Outlier Detection Method

arXiv.org Machine Learning Jul-5-2024

This paper proposes a novel fast online methodology for outlier detection called the exception maximization outlier detection method (EMODM), which employs probabilistic models and statistical algorithms to detect abnormal patterns from the outputs of complex systems. The EMODM is based on a two-state Gaussian mixture model and demonstrates strong performance in probabilistic anomaly detection, working on real-time raw data rather than relying on special prior distribution information. We confirm this using synthetic data from two numerical cases. For real-world data, we detect the short-circuit pattern of a circuit system using EMODM from the current and voltage output of a three-phase inverter. The EMODM also finds an abnormal period due to COVID-19 in the insured unemployment data of 53 regions in the United States from 2000 to 2024. The application of EMODM to these two real-life datasets demonstrates the effectiveness and accuracy of our algorithm.

abnormal pattern, complex system, emodm, (16 more...)
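
The idea of separating a "normal" from an "abnormal" state with a two-state Gaussian mixture can be sketched with ordinary batch expectation-maximization (EM) in one dimension. This is not the authors' EMODM, which is described as a fast online method; the function names, the initialization at the data extremes, and the rule of treating the lower-weight state as abnormal are all assumptions made here for illustration.

```python
import math


def normal_pdf(x, mu, var):
    """Density of a univariate Gaussian with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_state_gmm(xs, iters=50, var_floor=1e-6):
    """Batch EM for a 1-D, two-component Gaussian mixture (illustrative only)."""
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, var_floor)
    mu = [min(xs), max(xs)]   # assumption: start the two states at the extremes
    vs = [var, var]
    w = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each state for each observation.
        resp = []
        for x in xs:
            p = [w[k] * normal_pdf(x, mu[k], vs[k]) for k in range(2)]
            total = (p[0] + p[1]) or 1e-300  # guard against underflow to zero
            resp.append([p[0] / total, p[1] / total])
        # M-step: re-estimate weight, mean, and variance of each state.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            w[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            sq = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs))
            vs[k] = max(sq / nk, var_floor)
    return w, mu, vs, resp


def flag_outliers(xs):
    """Indices whose responsibility for the rarer state exceeds 0.5."""
    w, _, _, resp = fit_two_state_gmm(xs)
    rare = 0 if w[0] < w[1] else 1
    return [i for i, r in enumerate(resp) if r[rare] > 0.5]
```

Calling `flag_outliers` on a series that is mostly near zero with a handful of large readings should return the positions of the large readings, provided the two groups are well separated; with overlapping groups the 0.5 threshold becomes a tuning choice.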

2407.04248

Country:

North America > United States > New York (0.04)
North America > United States > California (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Banking & Finance > Economy (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.35)
Health & Medicine > Therapeutic Area > Immunology (0.35)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

arXiv.org Machine Learning Jul-5-2024

RPN: Reconciled Polynomial Network Towards Unifying PGMs, Kernel SVMs, MLP and KAN

Zhang, Jiawei

In this paper, we introduce a novel deep model named Reconciled Polynomial Network (RPN) for deep function learning. RPN has a very general architecture and can be used to build models with various complexities, capacities, and levels of completeness, which all contribute to the correctness of these models. As indicated in the subtitle, RPN can also serve as the backbone to unify different base models into one canonical representation. This includes non-deep models, like probabilistic graphical models (PGMs) - such as Bayesian network and Markov network - and kernel support vector machines (kernel SVMs), as well as deep models like the classic multi-layer perceptron (MLP) and the recent Kolmogorov-Arnold network (KAN). Technically, RPN proposes to disentangle the underlying function to be inferred into the inner product of a data expansion function and a parameter reconciliation function. Together with the remainder function, RPN accurately approximates the underlying functions that govern data distributions. The data expansion functions in RPN project data vectors from the input space to a high-dimensional intermediate space, as specified by the definitions of the expansion functions. Meanwhile, RPN also introduces the parameter reconciliation functions to fabricate a small number of parameters into a higher-order parameter matrix to address the "curse of dimensionality" problem caused by the data expansions. Moreover, the remainder functions provide RPN with additional complementary information to reduce potential approximation errors. We conducted extensive empirical experiments on numerous benchmark datasets across multiple modalities, including continuous function datasets, discrete vision and language datasets, and classic tabular datasets, to investigate the effectiveness of RPN.

false 1, false true 1, true false 1, (15 more...)
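
The decomposition described in the abstract above can be written schematically as follows. The symbols are assumptions introduced here for illustration: $\kappa$ stands for the data expansion function, $\psi$ for the parameter reconciliation function, $\pi$ for the remainder function, $\mathbf{x}$ for the input, and $\mathbf{w}$ for the learnable parameters.

$$
g(\mathbf{x} \mid \mathbf{w}) \;=\; \big\langle \kappa(\mathbf{x}),\, \psi(\mathbf{w}) \big\rangle \;+\; \pi(\mathbf{x})
$$

Here $\kappa$ lifts $\mathbf{x}$ into a higher-dimensional intermediate space, $\psi$ builds a parameter matrix of matching size from a small number of parameters, and $\pi$ supplies the complementary term that reduces the remaining approximation error.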

2407.04819

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.04)
(6 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.67)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.86)

van Zwol, Björn, Jefferson, Ro, Broek, Egon L. van den

Predictive Coding Networks and Inference Learning: Tutorial and Survey

arXiv.org Machine Learning Jul-4-2024

Recent years have witnessed a growing call for renewed emphasis on neuroscience-inspired approaches in artificial intelligence research, under the banner of $\textit{NeuroAI}$. This is exemplified by recent attention gained by predictive coding networks (PCNs) within machine learning (ML). PCNs are based on the neuroscientific framework of predictive coding (PC), which views the brain as a hierarchical Bayesian inference model that minimizes prediction errors from feedback connections. PCNs trained with inference learning (IL) have potential advantages over traditional feedforward neural networks (FNNs) trained with backpropagation. While historically more computationally intensive, recent improvements in IL have shown that it can be more efficient than backpropagation with sufficient parallelization, making PCNs promising alternatives for large-scale applications and neuromorphic hardware. Moreover, PCNs can be mathematically considered as a superset of traditional FNNs, which substantially extends the range of possible architectures for both supervised and unsupervised learning. In this work, we provide a comprehensive review as well as a formal specification of PCNs, in particular placing them in the context of modern ML methods, and positioning PC as a versatile and promising framework worthy of further study by the ML community.

artificial intelligence, machine learning, pcn, (14 more...)

2407.04117

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.67)

Genre:

Research Report (1.00)
Overview (0.86)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Energy > Oil & Gas (1.00)
Law > Litigation (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Filter, Björn, Möller, Ralf, Özçep, Özgür Lütfü

Mechanisms for Data Sharing in Collaborative Causal Inference (Extended Version)

arXiv.org Artificial Intelligence Jul-4-2024

Collaborative causal inference (CCI) is a federated learning method for pooling data from multiple, often self-interested, parties, to achieve a common learning goal over causal structures, e.g. estimation and optimization of treatment variables in a medical setting. Since obtaining data can be costly for the participants and sharing unique data poses the risk of losing competitive advantages, motivating the participation of all parties through equitable rewards and incentives is necessary. This paper devises an evaluation scheme to measure the value of each party's data contribution to the common learning task, tailored to causal inference's statistical demands, by comparing completed partially directed acyclic graphs (CPDAGs) inferred from observational data contributed by the participants. The Data Valuation Scheme thus obtained can then be used to introduce mechanisms that incentivize the agents to contribute data. It can be leveraged to reward agents fairly, according to the quality of their data, or to maximize all agents' data contributions.

agent, causal effect, estimator, (14 more...)

2407.11032

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.89)