AITopics

2411.16366

Country:

Oceania > New Zealand (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)

A Theoretical Survey on Foundation Models

Fu, Shi, Chen, Yuzhu, Wang, Yingjie, Tao, Dacheng

Understanding the inner mechanisms of black-box foundation models (FMs) is essential yet challenging in artificial intelligence and its applications. Over the last decade, the long-running focus has been on their explainability, leading to the development of post-hoc explainable methods to rationalize the specific decisions already made by black-box FMs. However, these explainable methods have certain limitations in terms of faithfulness and resource requirements. Consequently, a new class of interpretable methods should be considered to unveil the underlying mechanisms of FMs in an accurate, comprehensive, heuristic, and resource-light way. This survey aims to review those interpretable methods that comply with the aforementioned principles and have been successfully applied to FMs. These methods are deeply rooted in machine learning theory, covering the analysis of generalization performance, expressive capability, and dynamic behavior. They provide a thorough interpretation of the entire workflow of FMs, ranging from the inference capability and training dynamics to their ethical implications. Ultimately, drawing upon these interpretations, this review identifies the next frontier research directions for FMs.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

2410.11444

Country:

Asia > Middle East > Jordan (0.04)
Asia > China (0.04)
North America > United States > New York (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.92)
Research Report > Promising Solution (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Ige, Tosin, Kiekintveld, Christopher, Piplai, Aritran, Waggler, Amy, Kolade, Olukunle, Matti, Bolanle Hafiz

An investigation into the performances of the Current state-of-the-art Naive Bayes, Non-Bayesian and Deep Learning Based Classifier for Phishing Detection: A Survey

arXiv.org Artificial Intelligence, Nov-24-2024

Phishing is one of the most effective ways in which cybercriminals obtain sensitive details such as credentials for online banking, digital wallets, state secrets, and many more from potential victims. They do this by spamming users with malicious URLs with the sole purpose of tricking them into divulging sensitive information, which is later used for various cybercrimes. In this research, we conducted a comprehensive review of current state-of-the-art machine learning and deep learning phishing detection techniques to expose their vulnerabilities and indicate future research directions. For better analysis and observation, we split machine learning techniques into Bayesian, non-Bayesian, and deep learning. We reviewed the most recent advances in Bayesian and non-Bayesian-based classifiers before exploiting their corresponding weaknesses to indicate future research directions. While exploiting weaknesses in both Bayesian and non-Bayesian classifiers, we also compared the performance of each with a deep learning classifier. For a proper review of deep learning-based classifiers, we looked at Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), and Long Short Term Memory Networks (LSTMs). We did an empirical analysis to evaluate the performance of each classifier along with many of the proposed state-of-the-art anti-phishing techniques to identify future research directions. We also made a series of proposals on how the performance of the under-performing algorithm can be improved, in addition to a two-stage prediction model.

artificial intelligence, detection, machine learning, (17 more...)

2411.16751

Country:

North America > United States > Texas > El Paso County > El Paso (0.14)
Europe > United Kingdom (0.14)
Asia > South Korea (0.14)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Du, Jin-Hong, Roeder, Kathryn, Wasserman, Larry

Assumption-Lean Post-Integrated Inference with Negative Control Outcomes

In the big data era, integrating information from multiple heterogeneous sources has become increasingly crucial for achieving larger sample sizes and more diverse study populations. Data integration is applied in a variety of fields, including but not limited to causal inference on heterogeneous populations (Shi et al., 2023), survey sampling (Yang et al., 2020), health policy (Paddock et al., 2024), retrospective psychometrics (Howe and Brown, 2023), and multi-omics biological science (Du et al., 2022). Data integration methods have been proposed to mitigate the unwanted effects of heterogeneous datasets and unmeasured covariates, recovering the common variation across datasets. However, a critical and often overlooked question is whether reliable statistical inference can be made from integrated data. Directly performing statistical inference on integrated outcomes and covariates of interest fails to account for the complex correlation structures introduced by the data integration process, often leading to improper analyses that incorrectly assume the corrected data points are independent (Li et al., 2023). While data integration is broadly utilized in various fields, our paper focuses on a challenging scenario with the presence of high-dimensional outcomes.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2410.04996

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > Albany County > Albany (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Albert, Joshua, Balazs, Csaba, Fowlie, Andrew, Handley, Will, Hunt-Smith, Nicholas, de Austri, Roberto Ruiz, White, Martin

A comparison of Bayesian sampling algorithms for high-dimensional particle physics and cosmology applications

For several decades now, Bayesian inference techniques have been applied to theories of particle physics, cosmology and astrophysics to obtain the probability density functions of their free parameters. In this study, we review and compare a wide range of Markov Chain Monte Carlo (MCMC) and nested sampling techniques to determine their relative efficacy on functions that resemble those encountered most frequently in the particle astrophysics literature. Our first series of tests explores a series of high-dimensional analytic test functions that exemplify particular challenges, for example highly multimodal posteriors or posteriors with curving degeneracies. We then investigate two real physics examples, the first being a global fit of the $\Lambda$CDM model using cosmic microwave background data from the Planck experiment, and the second being a global fit of the Minimal Supersymmetric Standard Model using a wide variety of collider and astrophysics data. We show that several examples widely thought to be most easily solved using nested sampling approaches can in fact be more efficiently solved using modern MCMC algorithms, but the details of the implementation matter. Furthermore, we also provide a series of useful insights for practitioners of particle astrophysics and cosmology.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2409.18464

Country:

Europe > Netherlands > South Holland > Leiden (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > South Australia > Adelaide (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Bockting, Florence, Radev, Stefan T., Bürkner, Paul-Christian

Expert-elicitation method for non-parametric joint priors using normalizing flows

The Bayesian paradigm offers the possibility to incorporate prior knowledge into a statistical model through the specification of prior distributions. This possibility is a central advantage of the Bayesian paradigm (Mikkola et al 2023), yet it also presents one of its most challenging aspects (Simpson et al 2017; Małgorzata Roos et al 2015; Van Dongen 2006). In the following, we define prior knowledge as the expertise provided by a domain expert -- an individual with extensive knowledge of a specific subject matter (Falconer et al 2022). This knowledge can be represented in various forms, but to integrate it into a Bayesian model, we need to translate it into a formal mathematical language that can be expressed as a prior distribution over the model parameters (Perepolkin et al 2023; O'Hagan 2019; Martin et al 2012; Garthwaite et al 2005). A whole field of research, commonly referred to as (expert) prior elicitation, has emerged around the question of how to gather expert knowledge and translate it into appropriate prior distributions (Stefan et al 2022; Mikkola et al 2023; Falconer et al 2022).

artificial intelligence, bayesian inference, machine learning, (19 more...)

2411.15826

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Germany > North Rhine-Westphalia (0.04)

Genre:

Research Report (1.00)
Overview (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Ziomek, Juliusz, Adachi, Masaki, Osborne, Michael A.

Bayesian Optimisation with Unknown Hyperparameters: Regret Bounds Logarithmically Closer to Optimal

arXiv.org Machine Learning, Nov-22-2024

Bayesian Optimization (BO) is widely used for optimising black-box functions but requires us to specify the length scale hyperparameter, which defines the smoothness of the functions the optimizer will consider. Most current BO algorithms choose this hyperparameter by maximizing the marginal likelihood of the observed data, albeit risking misspecification if the objective function is less smooth in regions we have not yet explored. The only prior solution addressing this problem with theoretical guarantees was A-GP-UCB, proposed by Berkenkamp et al. (2019). This algorithm progressively decreases the length scale, expanding the class of functions considered by the optimizer. However, A-GP-UCB lacks a stopping mechanism, leading to over-exploration and slow convergence. To overcome this, we introduce Length scale Balancing (LB) - a novel approach, aggregating multiple base surrogate models with varying length scales. LB intermittently adds smaller length scale candidate values while retaining longer scales, balancing exploration and exploitation. We formally derive a cumulative regret bound of LB and compare it with the regret of an oracle BO algorithm using the optimal length scale. Denoting the factor by which the regret bound of A-GP-UCB was away from oracle as $g(T)$, we show that LB is only $\log g(T)$ away from oracle regret. We also empirically evaluate our algorithm on synthetic and real-world benchmarks and show it outperforms A-GP-UCB, maximum likelihood estimation and MCMC.
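To illustrate why decreasing the length scale expands the class of functions the optimizer considers, here is a minimal sketch. It assumes a squared-exponential (RBF) kernel, which the abstract does not specify; the function name `rbf_kernel` and the chosen values are hypothetical and are not taken from LB or A-GP-UCB.

```python
import math


def rbf_kernel(x1: float, x2: float, length_scale: float) -> float:
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return math.exp(-((x1 - x2) ** 2) / (2.0 * length_scale ** 2))


# Correlation implied by the kernel between function values at two points
# that are a distance of 1.0 apart, for progressively smaller length scales.
for ell in (2.0, 1.0, 0.5, 0.25):
    print(f"length_scale={ell}: k(0, 1) = {rbf_kernel(0.0, 1.0, ell):.4f}")
```

As the length scale shrinks, the implied correlation between nearby points drops toward zero, so the surrogate model admits rougher functions and the optimizer must explore more. This is the trade-off that a fixed, possibly misspecified length scale cannot adapt to.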

artificial intelligence, data mining, machine learning, (20 more...)

2410.10384

Country:

North America > Canada > Alberta (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine (0.99)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Data Science > Data Mining > Big Data (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Artificial Intelligence, Nov-22-2024

Financial Fraud Detection using Jump-Attentive Graph Neural Networks

Kadam, Prashank

As the availability of financial services online continues to grow, the incidence of fraud has surged correspondingly. Fraudsters continually seek new and innovative ways to circumvent the detection algorithms in place. Traditionally, fraud detection relied on rule-based methods, where rules were manually created based on transaction data features. However, these techniques soon became ineffective due to their reliance on manual rule creation and their inability to detect complex data patterns. Today, a significant portion of the financial services sector employs various machine learning algorithms, such as XGBoost, Random Forest, and neural networks, to model transaction data. While these techniques have proven more efficient than rule-based methods, they still fail to capture interactions between different transactions and their interrelationships. Recently, graph-based techniques have been adopted for financial fraud detection, leveraging graph topology to aggregate neighborhood information of transaction data using Graph Neural Networks (GNNs). Despite showing improvements over previous methods, these techniques still struggle to keep pace with the evolving camouflaging tactics of fraudsters and suffer from information loss due to over-smoothing. In this paper, we propose a novel algorithm that employs an efficient neighborhood sampling method, effective for camouflage detection and preserving crucial feature information from non-similar nodes. Additionally, we introduce a novel GNN architecture that utilizes attention mechanisms and preserves holistic neighborhood information to prevent information loss. We test our algorithm on financial data to show that our method outperforms other state-of-the-art graph algorithms.

artificial intelligence, machine learning, node, (16 more...)

2411.05857

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Li, Ziyao, Hsu, Shang-Ling, Shahabi, Cyrus

Forecasting Unseen Points of Interest Visits Using Context and Proximity Priors

arXiv.org Artificial Intelligence, Nov-22-2024

Understanding human mobility behavior is crucial for numerous applications, including crowd management, location-based recommendations, and the estimation of pandemic spread. Machine learning models can predict the Points of Interest (POIs) that individuals are likely to visit in the future by analyzing their historical visit patterns. Previous studies address this problem by learning a POI classifier, where each class corresponds to a POI. However, this limits their applicability to predict a new POI that was not in the training data, such as the opening of new restaurants. To address this challenge, we propose a model designed to predict a new POI outside the training data as long as its context is aligned with the user's interests. Unlike existing approaches that directly predict specific POIs, our model first forecasts the semantic context of potential future POIs, then combines this with a proximity-based prior probability distribution to determine the exact POI. Experimental results on real-world visit data demonstrate that our model outperforms baseline methods that do not account for semantic contexts, achieving a 17% improvement in accuracy. Notably, as new POIs are introduced over time, our model remains robust, exhibiting a lower decline rate in prediction accuracy compared to existing methods.
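The two-step idea described above (forecast a semantic context, then combine it with a proximity-based prior) can be sketched roughly as follows. This is an illustrative assumption, not the authors' model: the exponential distance decay, the `decay` parameter, the function name `rank_pois`, and all sample data are hypothetical.

```python
import math


def rank_pois(context_probs, pois, user_xy, decay=1.0):
    """Rank POIs by normalized score = P(context) * proximity prior.

    context_probs: predicted probability of each semantic context
    pois: mapping of POI name -> (context label, (x, y) location)
    user_xy: the user's current (x, y) location
    decay: rate of the assumed exponential distance decay
    """
    scores = {}
    for name, (context, xy) in pois.items():
        distance = math.dist(user_xy, xy)
        prior = math.exp(-decay * distance)  # assumed proximity prior
        scores[name] = context_probs.get(context, 0.0) * prior
    total = sum(scores.values())
    if total == 0.0:
        return []
    ranked = [(name, score / total) for name, score in scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical forecast: the user is most likely to visit a restaurant next.
context_probs = {"restaurant": 0.7, "gym": 0.3}
pois = {
    "new_cafe": ("restaurant", (0.0, 1.0)),   # could be absent from training data
    "far_diner": ("restaurant", (0.0, 5.0)),
    "gym_a": ("gym", (0.0, 1.0)),
}
ranked = rank_pois(context_probs, pois, user_xy=(0.0, 0.0))
print(ranked)
```

Because a POI is scored only through its context label and its location, a newly opened POI can be ranked without retraining a per-POI classifier, which is the limitation of prior work that the abstract points to.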

poi, pois, sequence, (16 more...)

2411.15285

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (1.00)

Industry: Consumer Products & Services > Restaurants (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Banerjee, Saikat, Carbonetto, Peter, Stephens, Matthew

Gradient-based optimization for variational empirical Bayes multiple regression

arXiv.org Machine Learning, Nov-21-2024

Multiple linear regression provides a simple, but widely used, method to find associations between outcomes (responses) and a set of predictors (explanatory variables). It has been actively studied over more than a century, and there is a rich and vast literature on the subject [1]. In practical situations the number of predictor variables is often large, and it becomes desirable to induce sparsity in the regression coefficients to avoid overfitting [2, 3]. Sparse linear regression also serves as the foundation for non-linear techniques, such as trendfiltering [4, 5], which can estimate an underlying non-linear trend from time series data. Sparse multiple linear regression and trendfiltering are applied across a wide range of fields in modern science and engineering, including astronomy [6], atmospheric sciences [7], biology [8], economics [9, 10], genetics [11-15], geophysics [16], medical sciences [17, 18], social sciences [19] and text analysis [20]. Approaches to sparse linear regression can be broadly classified into two groups: (a) penalized linear regressions (PLR), which add a penalty term to the likelihood to penalize the magnitude of its parameters [21-23], and (b) Bayesian approaches [11-14, 24-29], which use a prior probability distribution on the model parameters to induce sparsity.

artificial intelligence, bayesian inference, machine learning, (15 more...)

2411.1457

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)