AITopics

2406.10711

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.14)
North America > Canada > Quebec > Montreal (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report (1.00)

Industry: Law Enforcement & Public Safety (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Mastropaolo, Antonio, Escobar-Velásquez, Camilo, Linares-Vásquez, Mario

The Rise and Fall(?) of Software Engineering

arXiv.org Artificial Intelligence Jun-14-2024

Over the last ten years, the realm of Artificial Intelligence (AI) has experienced an explosion of revolutionary breakthroughs, transforming what seemed like a far-off dream into a reality that is now deeply embedded in our everyday lives. AI's widespread impact is revolutionizing virtually all aspects of human life, and software engineering (SE) is no exception. As we explore this changing landscape, we are faced with questions about what the future holds for SE and how AI will reshape the roles, duties, and methodologies within the field. The introduction of these groundbreaking technologies highlights the inevitable shift towards a new paradigm, suggesting a future where AI's capabilities may redefine the boundaries of SE, potentially even more than human input. In this paper, we aim to outline the key elements that, based on our expertise, are vital for the smooth integration of AI into SE, all while preserving the intrinsic human creativity that has been the driving force behind the field. First, we provide a brief description of the evolution of SE and AI. Afterward, we delve into the intricate interplay between AI-driven automation and human innovation, exploring how these two components can work together to advance SE practices to new methods and standards.

international conference, proceedings, software engineering, (10 more...)

2406.10141

Country:

South America > Brazil (0.05)
North America > United States > New York > New York County > New York City (0.05)
South America > Colombia (0.04)
(16 more...)

Genre: Research Report > Promising Solution (0.66)

Industry:

Information Technology (1.00)
Government (1.00)
Education (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(4 more...)

Roy, Vivekananda, Khare, Kshitij, Hobert, James P.

The data augmentation algorithm

arXiv.org Machine Learning Jun-14-2024

Data augmentation (DA) algorithms are popular Markov chain Monte Carlo (MCMC) algorithms often used for sampling from intractable probability distributions. This review article comprehensively surveys DA MCMC algorithms, highlighting their theoretical foundations, methodological implementations, and diverse applications in frequentist and Bayesian statistics. The article discusses tools for studying the convergence properties of DA algorithms. Furthermore, it presents various strategies for accelerating the convergence of DA algorithms, describes different extensions of DA algorithms, and outlines promising directions for future research. This paper aims to serve as a resource for researchers and practitioners seeking to leverage data augmentation techniques in MCMC algorithms by providing key insights and synthesizing recent developments.
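
As a rough illustration of the two-step structure of a DA sampler, the sketch below targets a Student-t distribution by augmenting it with a latent precision variable. The example is not taken from the article: the target, the function name `da_student_t`, and the parameter values are assumptions chosen for illustration. It relies on the standard fact that x | w ~ Normal(0, 1/w) with w ~ Gamma(nu/2, rate nu/2) has a Student-t(nu) marginal for x.

```python
import math
import random


def da_student_t(nu, n_samples, seed=0):
    """Run a two-step DA chain whose stationary law for x is Student-t(nu).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    x = 0.0
    draws = []
    for _ in range(n_samples):
        # Augmentation step: latent precision
        # w | x ~ Gamma(shape=(nu + 1) / 2, rate=(nu + x^2) / 2).
        # random.gammavariate expects a scale parameter, i.e. 1 / rate.
        w = rng.gammavariate((nu + 1) / 2.0, 2.0 / (nu + x * x))
        # Sampling step: x | w ~ Normal(mean=0, variance=1 / w).
        x = rng.gauss(0.0, math.sqrt(1.0 / w))
        draws.append(x)
    return draws


samples = da_student_t(nu=10.0, n_samples=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
# For nu = 10 the target has mean 0 and variance nu / (nu - 2) = 1.25,
# so the two estimates should land close to those values.
print(mean, var)
```

Both conditional draws are available in closed form here, which is what makes the augmented chain easy to run even though the example target could also be sampled directly.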

algorithm, da algorithm, markov chain, (16 more...)

2406.10464

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > California > Alameda County > Hayward (0.04)

Genre:

Research Report (1.00)
Overview (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Melo, Luckeciano C., Tigas, Panagiotis, Abate, Alessandro, Gal, Yarin

Deep Bayesian Active Learning for Preference Modeling in Large Language Models

arXiv.org Machine Learning Jun-14-2024

Leveraging human preferences for steering the behavior of Large Language Models (LLMs) has demonstrated notable success in recent years. Nonetheless, data selection and labeling are still a bottleneck for these systems, particularly at large scale. Hence, selecting the most informative points for acquiring human feedback may considerably reduce the cost of preference labeling and unleash the further development of LLMs. Bayesian Active Learning provides a principled framework for addressing this challenge and has demonstrated remarkable success in diverse settings. However, previous attempts to employ it for Preference Modeling did not meet such expectations. In this work, we identify that naive epistemic uncertainty estimation leads to the acquisition of redundant samples. We address this by proposing the Bayesian Active Learner for Preference Modeling (BAL-PM), a novel stochastic acquisition policy that not only targets points of high epistemic uncertainty according to the preference model but also seeks to maximize the entropy of the acquired prompt distribution in the feature space spanned by the employed LLM. Notably, our experiments demonstrate that BAL-PM requires 33% to 68% fewer preference labels in two popular human preference datasets and exceeds previous stochastic Bayesian acquisition policies.
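
To make the notion of ranking candidates by epistemic uncertainty concrete, here is a minimal sketch of the naive baseline only, not of BAL-PM itself. It assumes a BALD-style mutual information score computed from an ensemble of preference models; the abstract does not state which estimator is used, and the names `bald_score`, `select_most_uncertain`, and the toy pool are hypothetical. The prompt-entropy term that BAL-PM adds is not reproduced here.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) preference outcome."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def bald_score(member_probs):
    """Mutual information between the label and the model parameters.

    member_probs holds each ensemble member's probability that the first
    completion is preferred. The score is the entropy of the averaged
    prediction minus the average entropy of the members.
    """
    mean_p = sum(member_probs) / len(member_probs)
    expected_entropy = sum(binary_entropy(p) for p in member_probs) / len(member_probs)
    return binary_entropy(mean_p) - expected_entropy


def select_most_uncertain(pool, k):
    """Return the k pool keys with the highest BALD-style score."""
    ranked = sorted(pool, key=lambda name: bald_score(pool[name]), reverse=True)
    return ranked[:k]


# Hypothetical pool: candidate id -> ensemble predictions.
pool = {
    "a": [0.70, 0.70, 0.70],  # members agree: little epistemic uncertainty
    "b": [0.10, 0.90, 0.50],  # members disagree: high epistemic uncertainty
    "c": [0.45, 0.55, 0.50],  # mild disagreement
}
print(select_most_uncertain(pool, k=1))
```

A purely score-based ranking like this one can pick many near-duplicate prompts, which is the redundancy problem the abstract attributes to naive uncertainty estimation.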

bal-pm, dataset, preference modeling, (13 more...)

2406.10023

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine (0.67)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Jenkins, Porter, Selander, Michael, Jenkins, J. Stockton, Merrill, Andrew, Armstrong, Kyle

Personalized Product Assortment with Real-time 3D Perception and Bayesian Payoff Estimation

Product assortment selection is a critical challenge facing physical retailers. Effectively aligning inventory with the preferences of shoppers can increase sales and decrease out-of-stocks. However, in real-world settings the problem is challenging due to the combinatorial explosion of product assortment possibilities. Consumer preferences are typically heterogeneous across space and time, making inventory-preference alignment challenging. Additionally, existing strategies rely on syndicated data, which tends to be aggregated and low resolution and to suffer from high latency. To solve these challenges, we introduce a real-time recommendation system, which we call EdgeRec3D. Our system utilizes recent advances in 3D computer vision for perception and automatic, fine-grained sales estimation. These perceptual components run on the edge of the network and facilitate real-time reward signals. Additionally, we develop a Bayesian payoff model to account for noisy estimates from 3D LIDAR data. We rely on spatial clustering to allow the system to adapt to heterogeneous consumer preferences, and a graph-based candidate generation algorithm to address the combinatorial search problem. We test our system in real-world stores across two 6-8 week A/B tests with beverage products and demonstrate a 35% and 27% increase in sales, respectively. Finally, we monitor the deployed system for a period of 28 weeks with an observational study and show a 9.4% increase in sales.

edgerec3d, experiment, recommendation, (14 more...)

doi: 10.1145/3637528.3671518

2406.07769

Country:

North America > United States > Arizona > Maricopa County > Phoenix (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
North America > United States > Utah (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.95)

Industry: Retail (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)
(3 more...)

Petrungaro, Bruno, Kitson, Neville K., Constantinou, Anthony C.

Investigating potential causes of Sepsis with Bayesian network structure learning

Sepsis is a life-threatening and serious global health issue. This study combines knowledge with available hospital data to investigate the potential causes of Sepsis that can be affected by policy decisions. We investigate the underlying causal structure of this problem by combining clinical expertise with score-based, constraint-based, and hybrid structure learning algorithms. A novel approach to model averaging and knowledge-based constraints was implemented to arrive at a consensus structure for causal inference. The structure learning process highlighted the importance of exploring data-driven approaches alongside clinical expertise. This includes discovering relationships that are unexpected, although reasonable, from a clinical perspective. Hypothetical interventions on Chronic Obstructive Pulmonary Disease, Alcohol dependence, and Diabetes suggest that the presence of any of these risk factors in patients increases the likelihood of Sepsis. This finding, alongside measuring the effect of these risk factors on Sepsis, has potential policy implications. Recognising the importance of prediction in improving Sepsis-related health outcomes, the model built is also assessed in its ability to predict Sepsis. The predictions generated by the consensus model were assessed for their accuracy, sensitivity, and specificity. These three indicators all had results around 70%, and the AUC was 80%, which means the causal structure of the model is reasonably accurate given that the models were trained on data available for commissioning purposes only.
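
For readers unfamiliar with the three indicators, the sketch below shows how accuracy, sensitivity, and specificity are computed from a binary confusion matrix. The function name `classification_metrics` and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    tp/fn: positive cases predicted correctly/incorrectly.
    tn/fp: negative cases predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all cases predicted correctly
        "sensitivity": tp / (tp + fn),    # share of true positives detected
        "specificity": tn / (tn + fp),    # share of true negatives detected
    }


# Hypothetical counts for illustration only.
print(classification_metrics(tp=8, fp=4, tn=6, fn=2))
# -> {'accuracy': 0.7, 'sensitivity': 0.8, 'specificity': 0.6}
```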

algorithm, graph, sepsis, (16 more...)

2406.09207

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
Europe > Greece (0.04)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.30)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning

Hu, Xuemin, Li, Shen, Xu, Yingfen, Tang, Bo, Chen, Long

Offline reinforcement learning (RL) can learn optimal policies from pre-collected offline datasets without interacting with the environment, but the sampled actions of the agent often cannot cover the action distribution under a given state, resulting in the extrapolation error issue. Recent works address this issue by employing generative adversarial networks (GANs). However, these methods often suffer from insufficient constraints on policy exploration and inaccurate representation of behavior policies. Moreover, the generator in GANs fails to fool the discriminator while maximizing the expected returns of a policy. Inspired by diffusion, a generative model with powerful feature expressiveness, we propose a new offline RL method named Diffusion Policies with Generative Adversarial Networks (DiffPoGAN). In this approach, the diffusion model serves as the policy generator to generate diverse distributions of actions, and a regularization method based on maximum likelihood estimation (MLE) is developed to generate data that approximate the distribution of behavior policies. In addition, we introduce a regularization term based on the discriminator output to effectively constrain policy exploration for policy improvement. Comprehensive experiments are conducted on the datasets for deep data-driven reinforcement learning (D4RL), and experimental results show that DiffPoGAN outperforms state-of-the-art methods in offline RL.

behavior policy, discriminator, learning, (15 more...)

2406.09089

Country:

Asia > China (0.04)
North America > United States (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Franke, Michael, Tsvilodub, Polina, Carcassi, Fausto

Bayesian Statistical Modeling with Predictors from LLMs

State-of-the-art large language models (LLMs) have shown impressive performance on a variety of benchmark tasks and are increasingly used as components in larger applications, where LLM-based predictions serve as proxies for human judgements or decisions. This raises questions about the human-likeness of LLM-derived information, alignment with human intuition, and whether LLMs could possibly be considered (parts of) explanatory models of (aspects of) human cognition or language use. To shed more light on these issues, we here investigate the human-likeness of LLMs' predictions for multiple-choice decision tasks from the perspective of Bayesian statistical modeling. Using human data from a forced-choice experiment on pragmatic language use, we find that LLMs do not capture the variance in the human data at the item level. We suggest different ways of deriving full distributional predictions from LLMs for aggregate, condition-level data, and find that some, but not all, ways of obtaining condition-level predictions yield adequate fits to human data. These results suggest that assessment of LLM performance depends strongly on seemingly subtle choices in methodology, and that LLMs are at best predictors of human behavior at the aggregate, condition level, for which they are, however, not designed to, or usually used to, make predictions in the first place.

llm, prediction, random choice, (13 more...)

2406.09012

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
(9 more...)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Schmid, Nina, del Pozo, David Fernandes, Waegeman, Willem, Hasenauer, Jan

Assessment of Uncertainty Quantification in Universal Differential Equations

arXiv.org Machine Learning Jun-13-2024

Scientific Machine Learning is a new class of approaches that integrate physical knowledge and mechanistic models with data-driven techniques for uncovering governing equations of complex processes. Among the available approaches, Universal Differential Equations (UDEs) are used to combine prior knowledge in the form of mechanistic formulations with universal function approximators, like neural networks. Integral to the efficacy of UDEs is the joint estimation of parameters within mechanistic formulations and the universal function approximators using empirical data. The robustness and applicability of resultant models, however, hinge upon the rigorous quantification of uncertainties associated with these parameters, as well as the predictive capabilities of the overall model or its constituent components. With this work, we provide a formalisation of uncertainty quantification (UQ) for UDEs and investigate important frequentist and Bayesian methods. By analysing three synthetic examples of varying complexity, we evaluate the validity and efficiency of ensembles, variational inference and Markov chain Monte Carlo sampling as epistemic UQ methods for UDEs.

neural network, noise model, scenario, (15 more...)

2406.08853

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Switzerland (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Epidemiology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Negri, Marcello Massimo, Aellen, Jonathan, Roth, Volker

Injective Flows for parametric hypersurfaces

arXiv.org Machine Learning Jun-13-2024

Normalizing Flows (NFs) are powerful and efficient models for density estimation. When modeling densities on manifolds, NFs can be generalized to injective flows but the Jacobian determinant becomes computationally prohibitive. Current approaches either consider bounds on the log-likelihood or rely on some approximations of the Jacobian determinant. In contrast, we propose injective flows for parametric hypersurfaces and show that for such manifolds we can compute the Jacobian determinant exactly and efficiently, with the same cost as NFs. Furthermore, we show that for the subclass of star-like manifolds we can extend the proposed framework to always allow for a Cartesian representation of the density. We showcase the relevance of modeling densities on hypersurfaces in two settings. Firstly, we introduce a novel Objective Bayesian approach to penalized likelihood models by interpreting level-sets of the penalty as star-like manifolds. Secondly, we consider Bayesian mixture models and introduce a general method for variational inference by defining the posterior of mixture weights on the probability simplex.

injective flow, jacobian determinant, manifold, (13 more...)

2406.09116

Country:

Europe > Switzerland > Basel-City > Basel (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)