AITopics

doi: 10.58970/IJSB.2214

2501.03305

Country: Asia > Bangladesh (0.75)

Genre: Research Report > New Finding (0.47)

Industry:

Health & Medicine (1.00)
Materials > Chemicals > Agricultural Chemicals (0.34)
Food & Agriculture > Agriculture > Pest Control (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

arXiv.org Artificial Intelligence, Jan-6-2025

Information Design with Unknown Prior

Lin, Tao, Li, Ce

Classical information design models (e.g., Bayesian persuasion and cheap talk) require players to have perfect knowledge of the prior distribution of the state of the world. Our paper studies repeated persuasion problems in which the information designer does not know the prior. The information designer learns to design signaling schemes from repeated interactions with the receiver. We design learning algorithms for the information designer to achieve no regret compared to using the optimal signaling scheme with known prior, under two models of the receiver's decision-making. (1) The first model assumes that the receiver knows the prior, can perform posterior updates, and best-responds to signals. In this model, we design a learning algorithm for the information designer with $O(\log T)$ regret in the general case, and another algorithm with $\Theta(\log \log T)$ regret in the case where the receiver has only two actions. (2) The second model assumes that the receiver does not know the prior and employs a no-regret learning algorithm to take actions. We show that the information designer can achieve regret $O(\sqrt{\mathrm{rReg}(T) T})$, where $\mathrm{rReg}(T)=o(T)$ is an upper bound on the receiver's learning regret. Our work thus provides a learning foundation for the problem of information design with unknown prior.
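For orientation, the known-prior benchmark that the regret is measured against can be sketched in the simplest textbook case. This is not the paper's learning algorithm: it assumes a binary state, a receiver with two actions who takes the designer's preferred action whenever the posterior reaches a threshold (ties broken in the designer's favor), and a designer who wants that action in every state. The function names are hypothetical.

```python
def optimal_binary_persuasion(prior: float, threshold: float = 0.5):
    """Optimal signaling with a KNOWN prior (illustrative assumption-laden sketch).

    prior     -- probability that the state is 1
    threshold -- receiver acts iff posterior P(state = 1 | signal) >= threshold

    Returns (q, value), where q is the probability of recommending the
    action when the state is 0 and value is the probability the action is taken.
    The action is always recommended when the state is 1.
    """
    if not 0.0 < prior < 1.0 or not 0.0 < threshold <= 1.0:
        raise ValueError("prior must be in (0, 1) and threshold in (0, 1]")
    if prior >= threshold:
        # The receiver already acts without any information.
        return 1.0, 1.0
    # Choose q so that the posterior after a recommendation equals the threshold.
    q = prior * (1.0 - threshold) / ((1.0 - prior) * threshold)
    value = prior + (1.0 - prior) * q
    return q, value


def posterior_after_recommendation(prior: float, q: float) -> float:
    """Bayes update of P(state = 1) after seeing the recommendation signal."""
    return prior / (prior + (1.0 - prior) * q)


q, value = optimal_binary_persuasion(prior=0.3, threshold=0.5)
print(q, value, posterior_after_recommendation(0.3, q))
```

A designer who does not know the prior cannot compute q directly, which is the gap the learning algorithms in the abstract are meant to close.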

artificial intelligence, bayesian inference, machine learning, (18 more...)

2410.05533

Country:

Europe (0.68)
North America > United States > New York (0.46)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)

Genre: Research Report (0.63)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Wang, Mengxin, Zhang, Dennis J., Zhang, Heng

Large Language Models for Market Research: A Data-augmentation Approach

arXiv.org Machine Learning, Jan-6-2025

Large Language Models (LLMs) have transformed artificial intelligence by excelling in complex natural language processing tasks. Their ability to generate human-like text has opened new possibilities for market research, particularly in conjoint analysis, where understanding consumer preferences is essential but often resource-intensive. Traditional survey-based methods face limitations in scalability and cost, making LLM-generated data a promising alternative. However, while LLMs have the potential to simulate real consumer behavior, recent studies highlight a significant gap between LLM-generated and human data, with biases introduced when substituting between the two. In this paper, we address this gap by proposing a novel statistical data augmentation approach that efficiently integrates LLM-generated data with real data in conjoint analysis. Our method leverages transfer learning principles to debias the LLM-generated data using a small amount of human data. This results in statistically robust estimators with consistent and asymptotically normal properties, in contrast to naive approaches that simply substitute human data with LLM-generated data, which can exacerbate bias. We validate our framework through an empirical study on COVID-19 vaccine preferences, demonstrating its superior ability to reduce estimation error and save data and costs by 24.9% to 79.8%. In contrast, naive approaches fail to save data due to the inherent biases in LLM-generated data compared to human data. Another empirical study on sports car choices validates the robustness of our results. Our findings suggest that while LLM-generated data is not a direct substitute for human responses, it can serve as a valuable complement when used within a robust statistical framework.

large language model, machine learning, natural language, (20 more...)

2412.19363

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

arXiv.org Machine Learning, Jan-5-2025

Re-examining Granger Causality from Causal Bayesian Networks Perspective

Adedayo, S. A.

The emergence of machine learning (ML) has been phenomenal, with ML-based models outperforming human intelligence, as in the case of AlphaGo [1] and, more recently, large language models (LLMs). With these advances, ML became state-of-the-art for scientific discovery in various fields of study [2]. However, ML algorithms fail to answer the crucial questions of "what" brings about an effect and "what if"; that is, ML cannot identify causal relationships in data or answer counterfactual questions. Hence the need for causality and causal inference, a field that focuses on unravelling causal interactions in data. Characterising these interactions in complex dynamical systems is a fundamental question in science [3]. Causal structure learning (CSL)--a computational causal discovery field, taking advantage of statistics and machine learning (ML) to unravel causal relations in data--is particularly appealing because it enables us to answer counterfactual questions [4, 5, 6, 7]. We adopt Pearl's causality framework.

artificial intelligence, granger causality, machine learning, (15 more...)

2501.02672

Country:

Europe > Austria > Vienna (0.14)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.46)
Leisure & Entertainment > Games > Go (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.42)

Dia, Noé, Yantovski-Barth, M. J., Adam, Alexandre, Bowles, Micah, Perreault-Levasseur, Laurence, Hezaveh, Yashar, Scaife, Anna

IRIS: A Bayesian Approach for Image Reconstruction in Radio Interferometry with expressive Score-Based priors

arXiv.org Artificial Intelligence, Jan-5-2025

Inferring sky surface brightness distributions from noisy interferometric data in a principled statistical framework has been a key challenge in radio astronomy. In this work, we introduce Imaging for Radio Interferometry with Score-based models (IRIS). We use score-based models trained on optical images of galaxies as an expressive prior in combination with a Gaussian likelihood in the uv-space to infer images of protoplanetary disks from visibility data of the DSHARP survey conducted by ALMA. We demonstrate the advantages of this framework compared with traditional radio interferometry imaging algorithms, showing that it produces plausible posterior samples despite the use of a misspecified galaxy prior. Through coverage testing on simulations, we empirically evaluate the accuracy of this approach to generate calibrated posterior samples.

artificial intelligence, machine learning, posterior sample, (15 more...)

2501.02473

Country:

North America > Canada > Quebec > Montreal (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial Intelligence, Jan-5-2025

From Aleatoric to Epistemic: Exploring Uncertainty Quantification Techniques in Artificial Intelligence

Wang, Tianyang, Wang, Yunze, Zhou, Jun, Peng, Benji, Song, Xinyuan, Zhang, Charles, Sun, Xintian, Niu, Qian, Liu, Junyu, Chen, Silin, Chen, Keyu, Li, Ming, Feng, Pohsun, Bi, Ziqian, Liu, Ming, Zhang, Yichao, Fei, Cheng, Yin, Caitlyn Heqi, Yan, Lawrence KQ

Uncertainty quantification (UQ) is a critical aspect of artificial intelligence (AI) systems, particularly in high-risk domains such as healthcare, autonomous systems, and financial technology, where decision-making processes must account for uncertainty. This review explores the evolution of uncertainty quantification techniques in AI, distinguishing between aleatoric and epistemic uncertainties, and discusses the mathematical foundations and methods used to quantify these uncertainties. We provide an overview of advanced techniques, including probabilistic methods, ensemble learning, sampling-based approaches, and generative models, while also highlighting hybrid approaches that integrate domain-specific knowledge. Furthermore, we examine the diverse applications of UQ across various fields, emphasizing its impact on decision-making, predictive accuracy, and system robustness. The review also addresses key challenges such as scalability, efficiency, and integration with explainable AI, and outlines future directions for research in this rapidly developing area. Through this comprehensive survey, we aim to provide a deeper understanding of UQ's role in enhancing the reliability, safety, and trustworthiness of AI systems.
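One common way to separate the two kinds of uncertainty, shown here only as an illustrative sketch and not as a method taken from the review, is the entropy decomposition for an ensemble of classifiers: the entropy of the averaged prediction is the total uncertainty, the average entropy of the individual members is the aleatoric part, and their difference (the mutual information) is the epistemic part. The function names and the list-of-lists input format are assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for ONE input into its two parts.

    member_probs -- list of class-probability vectors, one per ensemble member.
    Returns (total, aleatoric, epistemic), all in nats.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    # Ensemble-averaged predictive distribution.
    mean = [sum(m[c] for m in member_probs) / n_members for c in range(n_classes)]
    total = entropy(mean)
    # Average uncertainty each member reports on its own.
    aleatoric = sum(entropy(m) for m in member_probs) / n_members
    # What remains is disagreement between members.
    epistemic = total - aleatoric
    return total, aleatoric, epistemic


# Members agree that the outcome is a coin flip: purely aleatoric.
print(decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]]))
# Members are each certain but contradict one another: purely epistemic.
print(decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]]))
```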

data mining, machine learning, natural language, (18 more...)

2501.03282

Country:

North America > United States (0.93)
Asia (0.93)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (0.93)
Energy > Power Industry (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

arXiv.org Machine Learning, Jan-5-2025

Transformers Simulate MLE for Sequence Generation in Bayesian Networks

Cao, Yuan, He, Yihan, Wu, Dennis, Chen, Hong-Yu, Fan, Jianqing, Liu, Han

Transformers (Vaswani et al. 2017) have achieved tremendous success across various fields. These models are known to be particularly strong in terms of sequence generation, and have revolutionized the way we approach problems related to text generation, translation, and scientific discoveries such as protein generation. Despite these achievements, there remains limited understanding of the theoretical capabilities of transformers as sequence generators. To theoretically understand how transformers efficiently generate sequences, several recent works have studied the power of transformers in learning specific probability models for sequential data (Ildiz et al. 2024, Rajaraman et al. 2024, Makkuva et al. 2024, Nichani et al. 2024). Specifically, Ildiz et al. (2024) studied the problem of learning Markov chains with a one-layer self-attention model, and developed identifiability and convergence guarantees under certain conditions. Rajaraman et al. (2024) studied the behavior of transformers on data drawn from k-order Markov processes, where the conditional distribution of the next variable in a sequence depends on the previous k variables, and showed that such processes can be learned well by transformers of a constant-order depth. Makkuva et al. (2024) further studied the loss function landscape of one-layer transformers in learning Markov chains. Nichani et al. (2024) studied a setting where the tokens consist of multiple sequences of samples generated from a causal network, and demonstrated that transformers can be trained to learn the causal network structure so that, when seeing a new context-query pair, they can generate predictions according to the learned causal structure and the context. However, similar to the studies of Markov chains, Nichani et al. (2024) mostly focused on the setting where each variable has at most one parent.
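As a point of reference for the k-order Markov processes mentioned above, the maximum likelihood estimate of the conditional next-symbol distribution is just a normalized count over observed contexts. The sketch below shows that counting estimator in plain Python; it involves no transformer and is not taken from any of the cited works, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def fit_markov_mle(sequence, k):
    """MLE of P(next symbol | previous k symbols) from a single sequence.

    Returns a dict mapping each observed length-k context (a tuple) to a
    dict of next-symbol probabilities. Unseen contexts are simply absent.
    """
    counts = defaultdict(Counter)
    for i in range(k, len(sequence)):
        context = tuple(sequence[i - k:i])
        counts[context][sequence[i]] += 1
    model = {}
    for context, counter in counts.items():
        total = sum(counter.values())
        # Normalizing the counts gives the maximum likelihood estimate.
        model[context] = {sym: c / total for sym, c in counter.items()}
    return model


model = fit_markov_mle("aabaabaab", k=2)
for context, dist in sorted(model.items()):
    print(context, dist)
```

A Bayesian network generalizes this picture by letting each variable depend on an arbitrary parent set instead of the k immediately preceding symbols.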

artificial intelligence, machine learning, transformer, (14 more...)

2501.02547

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Morel-Balbi, Sebastian, Kirkley, Alec

Learning when to rank: Estimation of partial rankings from sparse, noisy comparisons

arXiv.org Machine Learning, Jan-5-2025

A common task arising in various domains is that of ranking items based on the outcomes of pairwise comparisons, from ranking players and teams in sports to ranking products or brands in marketing studies and recommendation systems. Statistical inference-based methods such as the Bradley-Terry model, which extract rankings based on an underlying generative model of the comparison outcomes, have emerged as flexible and powerful tools to tackle the task of ranking in empirical data. In situations with limited and/or noisy comparisons, it is often challenging to confidently distinguish the performance of different items based on the evidence available in the data. However, existing inference-based ranking methods overwhelmingly choose to assign each item to a unique rank or score, suggesting a meaningful distinction when there is none. Here, we address this problem by developing a principled Bayesian methodology for learning partial rankings -- rankings with ties -- that distinguishes among the ranks of different items only when there is sufficient evidence available in the data. Our framework is adaptable to any statistical ranking method in which the outcomes of pairwise observations depend on the ranks or scores of the items being compared. We develop a fast agglomerative algorithm to perform Maximum A Posteriori (MAP) inference of partial rankings under our framework and examine the performance of our method on a variety of real and synthetic network datasets, finding that it frequently gives a more parsimonious summary of the data than traditional ranking, particularly when observations are sparse.
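For readers unfamiliar with the Bradley-Terry model named above, the following is a minimal sketch of the standard (full-ranking) maximum likelihood fit using the classical minorization-maximization update. It is not the partial-ranking method the abstract proposes, and the function name, the wins-matrix input format, and the fixed iteration count are assumptions made for illustration.

```python
def fit_bradley_terry(wins, iterations=200):
    """Fit Bradley-Terry scores with the classical MM (Zermelo) update.

    wins[i][j] -- number of times item i beat item j.
    Model: P(i beats j) = s_i / (s_i + s_j).
    Returns scores normalized to sum to 1. Assumes at least one recorded win.
    """
    n = len(wins)
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pairing contributes (games played) / (sum of current scores).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (scores[i] + scores[j])
                for j in range(n)
                if j != i
            )
            # Items with no comparisons keep their current score.
            updated.append(total_wins / denom if denom > 0 else scores[i])
        norm = sum(updated)
        scores = [s / norm for s in updated]
    return scores


# Three items, six games per pair; item 0 wins most often.
wins = [
    [0, 4, 5],
    [2, 0, 4],
    [1, 2, 0],
]
print(fit_bradley_terry(wins))
```

Note that this fit always returns distinct scores whenever the data differ at all, which is the behavior the abstract criticizes when comparisons are sparse or noisy.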

artificial intelligence, machine learning, ranking, (17 more...)

2501.02505

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Soccer (0.68)
Education (0.67)
Leisure & Entertainment > Games > Chess (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Machine Learning, Jan-4-2025

ED-Filter: Dynamic Feature Filtering for Eating Disorder Classification

Naseriparsa, Mehdi, Sukunesan, Suku, Cai, Zhen, Alfarraj, Osama, Tolba, Amr, Rabooki, Saba Fathi, Xia, Feng

Eating disorders (ED) are critical psychiatric problems that have alarmed the mental health community. Mental health professionals are increasingly recognizing the utility of data derived from social media platforms such as Twitter. However, the high dimensionality and extensive feature sets of Twitter data present considerable challenges for ED classification. To overcome these hurdles, we introduce a novel method, an informed branch and bound search technique known as ED-Filter. This strategy significantly mitigates the drawbacks of conventional feature selection algorithms such as filters and wrappers. ED-Filter iteratively identifies an optimal set of promising features that maximize the eating disorder classification accuracy. In order to adapt to the dynamic nature of Twitter ED data, we enhance the ED-Filter with a hybrid greedy-based deep learning algorithm. This algorithm swiftly identifies sub-optimal features to accommodate the ever-evolving data landscape. Experimental results on Twitter eating disorder data affirm the effectiveness and efficiency of ED-Filter. The method demonstrates significant improvements in classification accuracy and proves its value in eating disorder detection on social media platforms.

artificial intelligence, feature subset, machine learning, (16 more...)

2501.14785

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Information Technology > Services (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Artificial Intelligence, Jan-4-2025

Digital Twin Calibration with Model-Based Reinforcement Learning

Zheng, Hua, Xie, Wei, Ryzhov, Ilya O., Choy, Keilung

This study is motivated by optimal control applications that exhibit high complexity, high uncertainty, and very limited data [Wang et al., 2024, Zheng et al., 2023, Plotkin et al., 2017, Mirasol, 2017]. In particular, all of these challenges are present in the domain of biopharmaceutical manufacturing, used for production of essential life-saving treatments for severe and chronic diseases, including cancers, autoimmune disorders, metabolic diseases, genetic disorders, and infectious diseases such as COVID-19 [Zahavi and Weiner, 2020, Teo, 2022]. Using cells as factories, biomanufacturing involves hundreds of biological, physical, and chemical factors dynamically interacting with each other at molecular, cellular, and macroscopic levels and impacting production outcomes. Due to the complexity of these mechanisms, it is quite difficult to control production safely and effectively, especially in the presence of very limited data. Digital twins have proven very useful in guiding the control of complex physical systems [Tao et al., 2018].

digital twin calibration, machine learning, reinforcement learning, (14 more...)

2501.02205

Country:

Europe (1.00)
North America > United States (0.47)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.93)