AITopics

doi: 10.32473/flairs.v35i.130692

2409.04102

Country:

Europe > Switzerland (0.04)
Asia (0.04)

Genre: Research Report (0.41)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Shen, Yuchen, Wen, Haomin, Akoglu, Leman

Zero-shot Outlier Detection via Prior-data Fitted Networks: Model Selection Bygone!

arXiv.org Artificial IntelligenceSep-9-2024

Outlier detection (OD) has a vast literature as it finds numerous applications in environmental monitoring, cybersecurity, finance, and medicine to name a few. Being an inherently unsupervised task, model selection is a key bottleneck for OD (both algorithm and hyperparameter selection) without label supervision. There is a long list of techniques to choose from -- both classical algorithms and deep neural architectures -- and while several studies report their hyperparameter sensitivity, the literature is quite slim on unsupervised model selection -- limiting the effective use of OD in practice. In this paper we present FoMo-0D, for zero/0-shot OD exploring a transformative new direction that bypasses the hurdle of model selection altogether (!), thus breaking new ground. The fundamental idea behind FoMo-0D is the Prior-data Fitted Networks, recently introduced by Muller et al.(2022), which trains a Transformer model on a large body of synthetically generated data from a prior data distribution. In essence, FoMo-0D is a pretrained Foundation Model for zero/0-shot OD on tabular data, which can directly predict the (outlier/inlier) label of any test data at inference time, by merely a single forward pass -- making obsolete the need for choosing an algorithm/architecture, tuning its associated hyperparameters, and even training any model parameters when given a new OD dataset. Extensive experiments on 57 public benchmark datasets against 26 baseline methods show that FoMo-0D performs statistically no different from the top 2nd baseline, while significantly outperforming the majority of the baselines, with an average inference time of 7.7 ms per test sample.

baseline, dataset, fomo-0d, (15 more...)

2409.05672

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.13)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (0.47)
Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Hofmeyr, David P., Kamper, Francois, Melonas, Michail M.

Optimal Projections for Classification with Naive Bayes

arXiv.org Machine LearningSep-9-2024

In the Naive Bayes classification model the class conditional densities are estimated as the products of their marginal densities along the cardinal basis directions. We study the problem of obtaining an alternative basis for this factorisation with the objective of enhancing the discriminatory power of the associated classification model. We formulate the problem as a projection pursuit to find the optimal linear projection on which to perform classification. Optimality is determined based on the multinomial likelihood within which probabilities are estimated using the Naive Bayes factorisation of the projected data. Projection pursuit offers the added benefits of dimension reduction and visualisation. We discuss an intuitive connection with class conditional independent components analysis, and show how this is realised visually in practical applications. The performance of the resulting classification models is investigated using a large collection of (162) publicly available benchmark data sets and in comparison with relevant alternatives. We find that the proposed approach substantially outperforms other popular probabilistic discriminant analysis models and is highly competitive with Support Vector Machines.

covariate, factorisation, projection, (13 more...)

arXiv.org Machine Learning

2409.05635

Country:

North America > United States > New Jersey > Somerset County > Somerset (0.04)
Europe > United Kingdom (0.04)
Europe > Switzerland (0.04)
Africa > South Africa (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Manthena, Harikha, Shajarian, Shaghayegh, Kimmell, Jeffrey, Abdelsalam, Mahmoud, Khorsandroo, Sajad, Gupta, Maanak

Explainable Malware Analysis: Concepts, Approaches and Challenges

arXiv.org Artificial IntelligenceSep-9-2024

Machine learning (ML) has seen exponential growth in recent years, finding applications in various domains such as finance, medicine, and cybersecurity. Malware remains a significant threat to modern computing, frequently used by attackers to compromise systems. While numerous machine learning-based approaches for malware detection achieve high performance, they often lack transparency and fail to explain their predictions. This is a critical drawback in malware analysis, where understanding the rationale behind detections is essential for security analysts to verify and disseminate information. Explainable AI (XAI) addresses this issue by maintaining high accuracy while producing models that provide clear, understandable explanations for their decisions. In this survey, we comprehensively review the current state-of-the-art ML-based malware detection techniques and popular XAI approaches. Additionally, we discuss research implementations and the challenges of explainable malware analysis. This theoretical survey serves as an entry point for researchers interested in XAI applications in malware detection. By analyzing recent advancements in explainable malware analysis, we offer a broad overview of the progress in this field, positioning our work as the first to extensively cover XAI methods for malware classification and detection.

detection, malware detection, prediction, (17 more...)

2409.13723

Country:

North America > United States > Tennessee (0.04)
North America > United States > North Carolina (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.34)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(5 more...)

arXiv.org Machine LearningSep-8-2024

Improving Tree Probability Estimation with Stochastic Optimization and Variance Reduction

Xie, Tianyu, Yuan, Musu, Deng, Minghua, Zhang, Cheng

Probability estimation of tree topologies is one of the fundamental tasks in phylogenetic inference. The recently proposed subsplit Bayesian networks (SBNs) provide a powerful probabilistic graphical model for tree topology probability estimation by properly leveraging the hierarchical structure of phylogenetic trees. However, the expectation maximization (EM) method currently used for learning SBN parameters does not scale up to large data sets. In this paper, we introduce several computationally efficient methods for training SBNs and show that variance reduction could be the key for better performance. Furthermore, we also introduce the variance reduction technique to improve the optimization of SBN parameters for variational Bayesian phylogenetic inference (VBPI). Extensive synthetic and real data experiments demonstrate that our methods outperform previous baseline methods on the tasks of tree topology probability estimation as well as Bayesian phylogenetic inference using SBNs.

estimator, kl divergence, tree topology, (10 more...)

arXiv.org Machine Learning

2409.05282

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Tyne and Wear > Sunderland (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Ghorbanian, Javad, Casaprima, Nicholas, Olivier, Audrey

Empowering Bayesian Neural Networks with Functional Priors through Anchored Ensembling for Mechanics Surrogate Modeling Applications

arXiv.org Machine LearningSep-8-2024

In recent years, neural networks (NNs) have become increasingly popular for surrogate modeling tasks in mechanics and materials modeling applications. While traditional NNs are deterministic functions that rely solely on data to learn the input--output mapping, casting NN training within a Bayesian framework allows to quantify uncertainties, in particular epistemic uncertainties that arise from lack of training data, and to integrate a priori knowledge via the Bayesian prior. However, the high dimensionality and non-physicality of the NN parameter space, and the complex relationship between parameters (NN weights) and predicted outputs, renders both prior design and posterior inference challenging. In this work we present a novel BNN training scheme based on anchored ensembling that can integrate a priori information available in the function space, from e.g. low-fidelity models. The anchoring scheme makes use of low-rank correlations between NN parameters, learnt from pre-training to realizations of the functional prior. We also perform a study to demonstrate how correlations between NN weights, which are often neglected in existing BNN implementations, is critical to appropriately transfer knowledge between the function-space and parameter-space priors. Performance of our novel BNN algorithm is first studied on a small 1D example to illustrate the algorithm's behavior in both interpolation and extrapolation settings. Then, a thorough assessment is performed on a multi--input--output materials surrogate modeling example, where we demonstrate the algorithm's capabilities both in terms of accuracy and quality of the uncertainty estimation, for both in-distribution and out-of-distribution data.

ensemble, neural network, prediction, (13 more...)

arXiv.org Machine Learning

2409.05234

Country:

North America > United States > California (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Navaneeth, N, Tushar, null, Chakraborty, Souvik

Harnessing physics-informed operators for high-dimensional reliability analysis problems

Reliability analysis is a formidable task, particularly in systems with a large number of stochastic parameters. Conventional methods for quantifying reliability often rely on extensive simulations or experimental data, which can be costly and time-consuming, especially when dealing with systems governed by complex physical laws which necessitates computationally intensive numerical methods such as finite element or finite volume techniques. On the other hand, surrogate-based methods offer an efficient alternative for computing reliability by approximating the underlying model from limited data. Neural operators have recently emerged as effective surrogates for modelling physical systems governed by partial differential equations. These operators can learn solutions to PDEs for varying inputs and parameters. Here, we investigate the efficacy of the recently developed physics-informed wavelet neural operator in solving reliability analysis problems. In particular, we investigate the possibility of using physics-informed operator for solving high-dimensional reliability analysis problems, while bypassing the need for any simulation. Through four numerical examples, we illustrate that physics-informed operator can seamlessly solve high-dimensional reliability analysis problems with reasonable accuracy, while eliminating the need for running expensive simulations.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

2409.04708

Country: Asia > India (0.47)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Modeling & Simulation (0.68)
(2 more...)

Unlocking Potential Binders: Multimodal Pretraining DEL-Fusion for Denoising DNA-Encoded Libraries

Gu, Chunbin, He, Mutian, Cao, Hanqun, Chen, Guangyong, Hsieh, Chang-yu, Heng, Pheng Ann

In the realm of drug discovery, DNA-encoded library (DEL) screening technology has emerged as an efficient method for identifying high-affinity compounds. However, DEL screening faces a significant challenge: noise arising from nonspecific interactions within complex biological systems. Neural networks trained on DEL libraries have been employed to extract compound features, aiming to denoise the data and uncover potential binders to the desired therapeutic target. Nevertheless, the inherent structure of DEL, constrained by the limited diversity of building blocks, impacts the performance of compound encoders. Moreover, existing methods only capture compound features at a single level, further limiting the effectiveness of the denoising strategy. To mitigate these issues, we propose a Multimodal Pretraining DEL-Fusion model (MPDF) that enhances encoder capabilities through pretraining and integrates compound features across various scales. We develop pretraining tasks applying contrastive objectives between different compound representations and their text descriptions, enhancing the compound encoders' ability to acquire generic features. Furthermore, we propose a novel DEL-fusion framework that amalgamates compound information at the atomic, submolecular, and molecular levels, as captured by various compound encoders. The synergy of these innovations equips MPDF with enriched, multi-scale features, enabling comprehensive downstream denoising. Evaluated on three DEL datasets, MPDF demonstrates superior performance in data processing and analysis for validation tasks. Notably, MPDF offers novel insights into identifying high-affinity molecules, paving the way for improved DEL utility in drug discovery.

compound, dataset, library, (16 more...)

2409.05916

Country:

Asia > China > Hong Kong (0.04)
Asia > Middle East > Israel (0.04)
Asia > China > Gansu Province > Lanzhou (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
(2 more...)

A Comprehensive Survey on Evidential Deep Learning and Its Applications

Gao, Junyu, Chen, Mengyuan, Xiang, Liangyu, Xu, Changsheng

Reliable uncertainty estimation has become a crucial requirement for the industrial deployment of deep learning algorithms, particularly in high-risk applications such as autonomous driving and medical diagnosis. However, mainstream uncertainty estimation methods, based on deep ensembling or Bayesian neural networks, generally impose substantial computational overhead. To address this challenge, a novel paradigm called Evidential Deep Learning (EDL) has emerged, providing reliable uncertainty estimation with minimal additional computation in a single forward pass. This survey provides a comprehensive overview of the current research on EDL, designed to offer readers a broad introduction to the field without assuming prior knowledge. Specifically, we first delve into the theoretical foundation of EDL, the subjective logic theory, and discuss its distinctions from other uncertainty estimation frameworks. We further present existing theoretical advancements in EDL from four perspectives: reformulating the evidence collection process, improving uncertainty estimation via OOD samples, delving into various training strategies, and evidential regression networks. Thereafter, we elaborate on its extensive applications across various machine learning paradigms and downstream tasks. In the end, an outlook on future directions for better performances and broader adoption of EDL is provided, highlighting potential research avenues.

deep learning, edl, learning, (16 more...)

2409.0472

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.45)

Industry: Health & Medicine > Diagnostic Medicine (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Varsos, Konstantinos, Papamichail, Merkouris, Flouris, Giorgos, Bitsaki, Marina

Adaptation Procedure in Misinformation Games

We study interactions between agents in multi-agent systems, in which the agents are misinformed with regards to the game that they play, essentially having a subjective and incorrect understanding of the setting, without being aware of it. For that, we introduce a new game-theoretic concept, called misinformation games, that provides the necessary toolkit to study this situation. Subsequently, we enhance this framework by developing a time-discrete procedure (called the Adaptation Procedure) that captures iterative interactions in the above context. During the Adaptation Procedure, the agents update their information and reassess their behaviour in each step. We demonstrate our ideas through an implementation, which is used to study the efficiency and characteristics of the Adaptation Procedure.

adaptation procedure, misinformation game, misinformation game mg, (8 more...)

2409.04854

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.63)
Workflow (0.46)

Industry:

Media > News (0.70)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)