AITopics

2402.10877

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Nabi, Razieh, Hejazi, Nima S., van der Laan, Mark J., Benkeser, David

Statistical learning for constrained functional parameters in infinite-dimensional models with applications in fair machine learning

arXiv.org Machine Learning, Apr-15-2024

Constrained learning has become increasingly important, especially in the realm of algorithmic fairness and machine learning. In these settings, predictive models are developed specifically to satisfy pre-defined notions of fairness. Here, we study the general problem of constrained statistical machine learning through a statistical functional lens. We consider learning a function-valued parameter of interest under the constraint that one or several pre-specified real-valued functional parameters equal zero or are otherwise bounded. We characterize the constrained functional parameter as the minimizer of a penalized risk criterion using a Lagrange multiplier formulation. We show that closed-form solutions for the optimal constrained parameter are often available, providing insight into mechanisms that drive fairness in predictive models. Our results also suggest natural estimators of the constrained parameter that can be constructed by combining estimates of unconstrained parameters of the data generating distribution. Thus, our estimation procedure for constructing fair machine learning algorithms can be applied in conjunction with any statistical learning approach and off-the-shelf software. We demonstrate the generality of our method by explicitly considering a number of examples of statistical fairness constraints and implementing the approach using several popular learning approaches.
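To make the Lagrange-multiplier idea concrete, here is a minimal sketch for one special case. The setting is an assumption of this example and is not taken from the abstract: squared-error risk, a binary group label, and the constraint that mean predictions are equal in both groups. Under those assumptions the penalized criterion has a closed-form minimizer, namely the unconstrained predictions shifted by a group-specific constant. The function name `fair_shift` and the toy numbers are hypothetical.

```python
from statistics import fmean

def fair_shift(preds, groups):
    """Shift unconstrained predictions so both groups share one mean.

    Assumed setting (illustrative only): squared-error risk, binary
    group label, constraint mean(pred | group 1) == mean(pred | group 0).
    Minimizing risk + lambda * constraint pointwise gives
        pred - (m1 - m0) * p0   for group 1
        pred + (m1 - m0) * p1   for group 0
    where m_a is the group-a mean prediction and p_a the group share.
    """
    g1 = [p for p, g in zip(preds, groups) if g == 1]
    g0 = [p for p, g in zip(preds, groups) if g == 0]
    m1, m0 = fmean(g1), fmean(g0)
    p1 = len(g1) / len(preds)
    p0 = 1.0 - p1
    gap = m1 - m0
    return [p - gap * p0 if g == 1 else p + gap * p1
            for p, g in zip(preds, groups)]

# Toy unconstrained predictions (hypothetical numbers).
preds = [0.9, 0.7, 0.8, 0.2, 0.4]
groups = [1, 1, 1, 0, 0]
fair = fair_shift(preds, groups)
```

The shifted predictions have the same mean in both groups, and the overall mean is unchanged. The unconstrained predictions can come from any learner, which mirrors the abstract's point that the constrained estimate is built from unconstrained ones.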

constraint, realization, sample size

2404.09847

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Iowa (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)

Gloeckler, Manuel, Deistler, Michael, Weilbach, Christian, Wood, Frank, Macke, Jakob H.

All-in-one simulation-based inference

arXiv.org Machine Learning, Apr-15-2024

Amortized Bayesian inference trains neural networks to solve stochastic inference problems using model simulations, thereby making it possible to rapidly perform Bayesian inference for any newly observed data. However, current simulation-based amortized inference methods are simulation-hungry and inflexible: they require the specification of a fixed parametric prior, simulator, and inference tasks ahead of time. Here, we present a new amortized inference method, the Simformer, which overcomes these limitations. By training a probabilistic diffusion model with transformer architectures, the Simformer outperforms current state-of-the-art amortized inference approaches on benchmark tasks and is substantially more flexible: it can be applied to models with function-valued parameters, it can handle inference scenarios with missing or unstructured data, and it can sample arbitrary conditionals of the joint distribution of parameters and data, including both posterior and likelihood. We showcase the performance and flexibility of the Simformer on simulators from ecology, epidemiology, and neuroscience, and demonstrate that it opens up new possibilities and application domains for amortized Bayesian inference on simulation-based models.

inference, simformer, simulation-based inference

2404.09636

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Mirshekari, Shahin, Moradi, Mohammadreza, Jafari, Hossein, Jafari, Mehdi, Ensaf, Mohammad

Enhancing Predictive Accuracy in Pharmaceutical Sales Through An Ensemble Kernel Gaussian Process Regression Approach

arXiv.org Artificial Intelligence, Apr-14-2024

This research employs Gaussian Process Regression (GPR) with an ensemble kernel, integrating Exponential Squared, Revised Matérn, and Rational Quadratic kernels to analyze pharmaceutical sales data. Bayesian optimization was used to identify optimal kernel weights: 0.76 for Exponential Squared, 0.21 for Revised Matérn, and 0.13 for Rational Quadratic. The ensemble kernel demonstrated superior performance in predictive accuracy, achieving an R² score near 1.0, and significantly lower values in Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). These findings highlight the efficacy of ensemble kernels in GPR for predictive analytics in complex pharmaceutical sales datasets.
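A weighted-sum ensemble kernel of this kind can be sketched in a few lines. The weights below are the ones reported in the abstract. Everything else is an assumption made for illustration: the abstract does not define the "Revised Matérn" kernel, so the standard Matérn kernel with smoothness 3/2 stands in for it, and the length scales, the `alpha` parameter, and all function names are hypothetical.

```python
import math

# Kernel weights as reported in the abstract.
WEIGHTS = {"exp_squared": 0.76, "matern": 0.21, "rational_quadratic": 0.13}

def exp_squared(d, length=1.0):
    # Squared-exponential (RBF) kernel as a function of distance d.
    return math.exp(-d * d / (2.0 * length ** 2))

def matern32(d, length=1.0):
    # Standard Matérn kernel with nu = 3/2; an assumed stand-in for
    # the "Revised Matérn" kernel, which the abstract does not define.
    s = math.sqrt(3.0) * d / length
    return (1.0 + s) * math.exp(-s)

def rational_quadratic(d, length=1.0, alpha=1.0):
    return (1.0 + d * d / (2.0 * alpha * length ** 2)) ** (-alpha)

def ensemble_kernel(x, y):
    # Weighted sum of the three base kernels on scalar inputs.
    d = abs(x - y)
    return (WEIGHTS["exp_squared"] * exp_squared(d)
            + WEIGHTS["matern"] * matern32(d)
            + WEIGHTS["rational_quadratic"] * rational_quadratic(d))

def gram(xs):
    # Gram (covariance) matrix that a GP regressor would consume.
    return [[ensemble_kernel(a, b) for b in xs] for a in xs]

K = gram([0.0, 0.5, 2.0])
```

Because each base kernel equals 1 at zero distance, the diagonal of `K` equals the sum of the weights. A non-negative weighted sum of valid kernels is itself a valid kernel, which is what makes this construction usable inside GPR.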

category, ensemble kernel, kernel

2404.19669

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Asia > Singapore > Central Region > Singapore (0.04)
Asia > Middle East > Iran (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.47)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Apr-13-2024

Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification

Huseljic, Denis, Hahn, Paul, Herde, Marek, Rauch, Lukas, Sick, Bernhard

Deep active learning (AL) seeks to minimize the annotation costs for training deep neural networks. Bait, a recently proposed AL strategy based on the Fisher Information, has demonstrated impressive performance across various datasets. However, Bait's high computational and memory requirements hinder its applicability to large-scale classification tasks and, as a result, current research neglects Bait in its evaluations. This paper introduces two methods to enhance Bait's computational efficiency and scalability. Notably, we significantly reduce its time complexity by approximating the Fisher Information. In particular, we adapt the original formulation by i) taking the expectation over the most probable classes, and ii) constructing a binary classification task, leading to an alternative likelihood for gradient computations. Consequently, this allows the efficient use of Bait on large-scale datasets, including ImageNet. Our unified and comprehensive evaluation across a variety of datasets demonstrates that our approximations achieve strong performance with considerably reduced time complexity.
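The first approximation, taking the expectation over only the most probable classes, can be illustrated on a simplified quantity. The sketch below is an assumption-laden toy, not the paper's formulation: it computes the Fisher information of a softmax output with respect to the logits (rather than network parameters) and truncates the class sum to the `top_k` most probable classes. The function names and the example logits are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fisher_logits(probs, top_k=None):
    """Fisher information of a softmax output with respect to its logits.

    Full form: sum over all classes c of p_c * g_c g_c^T, where
    g_c = onehot(c) - p is the gradient of log p_c w.r.t. the logits.
    With top_k set, the sum runs only over the k most probable classes
    (an illustrative truncation; the paper's exact scheme may differ).
    """
    n = len(probs)
    order = sorted(range(n), key=lambda c: probs[c], reverse=True)
    classes = order if top_k is None else order[:top_k]
    fisher = [[0.0] * n for _ in range(n)]
    for c in classes:
        g = [(1.0 if i == c else 0.0) - probs[i] for i in range(n)]
        for i in range(n):
            for j in range(n):
                fisher[i][j] += probs[c] * g[i] * g[j]
    return fisher

probs = softmax([4.0, 1.0, 0.5, -1.0, -2.0])
full = fisher_logits(probs)              # sums over all 5 classes
approx = fisher_logits(probs, top_k=2)   # sums over the 2 most probable
```

The full sum reduces to diag(p) - p pᵀ, while the truncated version costs a number of outer products equal to `top_k` instead of the number of classes. The dropped terms are weighted by small probabilities, which is why the truncation can stay close to the full matrix when the prediction is confident.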

approximation, bait, complexity

2404.08981

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Anasashvili, Bachana, Jeleskovic, Vahidin

ALICE: Combining Feature Selection and Inter-Rater Agreeability for Machine Learning Insights

arXiv.org Machine Learning, Apr-13-2024

The use of Machine Learning models for decision-making has become the new norm not only in tech but in any business field imaginable, covering any possible task at hand, be it search engine recommendations, customer churn prediction, credit risk scoring, energy load forecasting, or the deployment of personalized AI assistants. This comes at a time when developing ML models has become increasingly easy with the rise of open-source, free and user-friendly Python libraries such as Keras, scikit-learn and PyTorch, and as generative AI-based conversational chatbots such as ChatGPT, Gemini and Claude, which can provide coding assistance (if not ready-made code for modeling), are evolving rapidly. Such developments yet again raise the question of interpretability in machine learning, which has been formulated in various ways in the literature and has been offered multiple proposed solutions, such as exploring causality (see Section 2.1), explainability (see Section 2.2) or abandoning black box ML models altogether. But to make a philosophical argument, it is hard to see the benefits of highly model- or domain-specific, post-hoc, or complex solutions for obtaining insights into the inner workings of machine learning models when the modeling task itself is growing ever more accessible to laypeople. Common thought on categorizing ML models in this regard would argue that parametric models descending from the fields of statistics and econometrics, such as Linear or Logistic Regression, are by nature more interpretable than their data-driven and non-parametric counterparts, such as tree-based models or neural networks.

deviceprotection, techsupport, tenuremonth

2404.09053

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre:

Research Report > Experimental Study (0.49)
Research Report > New Finding (0.48)

Industry:

Telecommunications (0.93)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning, Apr-13-2024

Concentration properties of fractional posterior in 1-bit matrix completion

Mai, The Tien

The problem of estimating a matrix based on a set of its observed entries is commonly referred to as the matrix completion problem. In this work, we specifically address the scenario of binary observations, often termed 1-bit matrix completion. While numerous studies have explored Bayesian and frequentist methods for real-valued matrix completion, there has been a lack of theoretical exploration regarding Bayesian approaches in 1-bit matrix completion. We address this gap by considering a general, non-uniform sampling scheme and providing theoretical assurances on the efficacy of the fractional posterior. Our contributions include obtaining concentration results for the fractional posterior and demonstrating its effectiveness in recovering the underlying parameter matrix. We accomplish this using two distinct types of prior distributions: low-rank factorization priors and a spectral scaled Student prior, with the latter requiring fewer assumptions. Importantly, our results are adaptive in that they do not require prior knowledge of the rank of the parameter matrix. Our findings are comparable to those found in the frequentist literature, yet demand fewer restrictive assumptions.
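For readers unfamiliar with the term, a fractional (or tempered) posterior raises the likelihood to a power between 0 and 1 before combining it with the prior. The notation below is assumed for illustration and does not come from the abstract: M is the parameter matrix, Ω the set of observed entries, Y_ij in {0, 1} the binary observations, f a link function such as the logistic function, and π the prior.

```latex
\[
  \mathbb{P}(Y_{ij} = 1 \mid M) = f(M_{ij}), \qquad (i,j) \in \Omega,
\]
\[
  L_n(M) = \prod_{(i,j) \in \Omega}
    f(M_{ij})^{Y_{ij}} \bigl(1 - f(M_{ij})\bigr)^{1 - Y_{ij}},
\]
\[
  \pi_{n,\alpha}(\mathrm{d}M \mid Y) \;\propto\;
    L_n(M)^{\alpha}\, \pi(\mathrm{d}M), \qquad \alpha \in (0, 1).
\]
```

Setting α = 1 would recover the usual Bayesian posterior; the fractional version downweights the likelihood relative to the prior.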

1-bit matrix completion, fractional posterior, matrix completion

2404.08969

Country: Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Zanna, Khadija, Sano, Akane

Enhancing Fairness and Performance in Machine Learning Models: A Multi-Task Learning Approach with Monte-Carlo Dropout and Pareto Optimality

arXiv.org Artificial Intelligence, Apr-12-2024

The term bias was first introduced in the machine learning domain by Tom Mitchell in his 1980 paper titled "The need for biases in learning generalizations" Mitchell [1980]. The concept of bias refers to giving importance to particular features to improve generalization. This general idea of bias in machine learning is positive and necessary for models to perform, eliminating the risk of hyper-focusing on specific samples over others. However, bias can also be negative in machine learning. Negative bias can be defined as an inaccurate assumption made by a machine learning algorithm that is systematically or historically prejudiced against certain groups of people Zanna et al. [2022]. Decisions made by these biased algorithms could cause adverse effects on particular social groups, for example, those defined by sex, race, age, marital status, handicaps, etc., when used to make autonomous decisions in life-changing cases such as health, hiring, education, criminal sentencing, etc. Negative bias can be introduced into the machine learning pipeline in two main ways: through the data or through the algorithm itself Blanzeisky and Cunningham [2021]. Bias due to data, also known as a negative legacy Cunningham and Delany [2021], Kamishima et al. [2012], can be caused by an imbalance in the representation of different population categories

enhancing fairness, machine learning algorithm, participant

2404.0823

Country:

North America > United States > Texas > Harris County > Houston (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Asia > Middle East > Israel (0.04)
Asia > Japan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Kitson, Neville K, Constantinou, Anthony C

The Impact of Variable Ordering on Bayesian Network Structure Learning

arXiv.org Artificial Intelligence, Apr-12-2024

Causal Bayesian Networks provide an important tool for reasoning under uncertainty with potential application to many complex causal systems. Structure learning algorithms that can tell us something about the causal structure of these systems are becoming increasingly important. In the literature, the validity of these algorithms is often tested for sensitivity over varying sample sizes, hyper-parameters, and occasionally objective functions. In this paper, we show that the order in which the variables are read from data can have much greater impact on the accuracy of the algorithm than these factors. Because the variable ordering is arbitrary, any significant effect it has on learnt graph accuracy is concerning, and this raises questions about the validity of the results produced by algorithms that are sensitive to, but have not been assessed against, different variable orderings.
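The abstract does not say why variable ordering matters. One mechanism that can produce this kind of sensitivity, offered here as an assumption rather than as the paper's finding, is arbitrary tie-breaking: on two variables, the graphs A -> B and B -> A have the same log-likelihood, so a greedy search that keeps the first best candidate returns whichever direction it happens to visit first. The functions and the toy dataset below are hypothetical.

```python
import math
from collections import Counter

def loglik_edge(data, parent, child):
    """Log-likelihood of the two-node graph parent -> child."""
    n = len(data)
    parent_counts = Counter(row[parent] for row in data)
    joint_counts = Counter((row[parent], row[child]) for row in data)
    # Marginal term for the parent, then conditional term for the child.
    ll = sum(c * math.log(c / n) for c in parent_counts.values())
    ll += sum(c * math.log(c / parent_counts[p])
              for (p, _), c in joint_counts.items())
    return ll

def greedy_first_edge(data, variables, tol=1e-9):
    """Pick the best single edge, keeping the first one found on ties."""
    best, best_score = None, -math.inf
    for parent in variables:
        for child in variables:
            if parent == child:
                continue
            score = loglik_edge(data, parent, child)
            if score > best_score + tol:  # strict improvement only
                best, best_score = (parent, child), score
    return best

# Toy binary dataset (hypothetical counts).
data = ([{"A": 0, "B": 0}] * 4 + [{"A": 0, "B": 1}] * 1
        + [{"A": 1, "B": 0}] * 2 + [{"A": 1, "B": 1}] * 3)

edge_ab = greedy_first_edge(data, ["A", "B"])  # variables read as A, B
edge_ba = greedy_first_edge(data, ["B", "A"])  # same data, order swapped
```

With identical data, the two calls return opposite edge directions, purely because of the order in which the variables were listed.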

bayesian network structure learning, variable ordering

2206.08952

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bhusal, Dipkamal, Rastogi, Nidhi

Adversarial Patterns: Building Robust Android Malware Classifiers

arXiv.org Artificial Intelligence, Apr-12-2024

Machine learning models are increasingly being adopted across various fields, such as medicine, business, autonomous vehicles, and cybersecurity, to analyze vast amounts of data, detect patterns, and make predictions or recommendations. In the field of cybersecurity, these models have made significant improvements in malware detection. However, despite their ability to understand complex patterns from unstructured data, these models are susceptible to adversarial attacks that perform slight modifications in malware samples, leading to misclassification from malignant to benign. Numerous defense approaches have been proposed to either detect such adversarial attacks or improve model robustness. These approaches have resulted in a multitude of attack and defense techniques and the emergence of a field known as 'adversarial machine learning.' In this survey paper, we provide a comprehensive review of adversarial machine learning in the context of Android malware classifiers. Android is the most widely used operating system globally and is an easy target for malicious agents. The paper first presents an extensive background on Android malware classifiers, followed by an examination of the latest advancements in adversarial attacks and defenses. Finally, the paper provides guidelines for designing robust malware classifiers and outlines research directions for the future.

adversarial sample, classifier, detection

2203.02121

Country:

North America > United States > New York > Monroe County > Rochester (0.04)
Asia > Nepal (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)