AITopics

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)

Neural Information Processing Systems, Jan-22-2025, 00:32:34 GMT

Reviews: Learning Bayesian Networks with Low Rank Conditional Probability Tables

This paper presents a method for structure learning of a Bayesian network (BN) from observational data. The work is mainly theoretical, and the proposal relies on several assumptions. Considerable effort also goes into presenting and theoretically deriving the complexity of the algorithm. One of the key points of the proposed algorithm is the use of Fourier basis vectors (coefficients) and how they are applied in the compressed sensing step. I have not thoroughly checked all of the mathematics, which is the core of the paper.

algorithm, learning bayesian network, low rank conditional probability table

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Attention-Driven Hierarchical Reinforcement Learning with Particle Filtering for Source Localization in Dynamic Fields

Shi, Yiwei, Yang, Mengyue, Zhang, Qi, Zhang, Weinan, Liu, Cunjia, Liu, Weiru

In many real-world scenarios, such as gas leak detection or environmental pollutant tracking, solving the Inverse Source Localization and Characterization problem involves navigating complex, dynamic fields with sparse and noisy observations. Traditional methods face significant challenges, including partial observability, temporal and spatial dynamics, out-of-distribution generalization, and reward sparsity. To address these issues, we propose a hierarchical framework that integrates Bayesian inference and reinforcement learning. The framework leverages an attention-enhanced particle filtering mechanism for efficient and accurate belief updates, and incorporates two complementary execution strategies: Attention Particle Filtering Planning and Attention Particle Filtering Reinforcement Learning. These approaches optimize exploration and adaptation under uncertainty. Theoretical analysis proves the convergence of the attention-enhanced particle filter, while extensive experiments across diverse scenarios validate the framework's superior accuracy, adaptability, and computational efficiency. Our results highlight the framework's potential for broad applications in dynamic field estimation tasks.
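For readers unfamiliar with the belief update that particle filtering performs, the following is a minimal sketch of a plain bootstrap particle filter for a static 1-D source. It is not the attention-enhanced filter described in the abstract. The field model, the noise levels, and the function name `bootstrap_particle_filter` are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_particle_filter(observations, sensor_positions,
                              n_particles=2000, obs_noise=0.1,
                              jitter=0.02, seed=0):
    """Estimate a static 1-D source location from noisy field readings.

    Hypothetical field model (an assumption, not taken from the paper):
        reading = exp(-(sensor - source)^2) + Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    # Prior belief: source is somewhere in [-5, 5].
    particles = rng.uniform(-5.0, 5.0, size=n_particles)
    weights = np.full(n_particles, 1.0 / n_particles)

    for z, s in zip(observations, sensor_positions):
        # Small random walk keeps particle diversity after resampling.
        particles = particles + rng.normal(0.0, jitter, size=n_particles)

        # Reweight each particle by the Gaussian likelihood of the reading.
        predicted = np.exp(-(s - particles) ** 2)
        log_lik = -0.5 * ((z - predicted) / obs_noise) ** 2
        weights = weights * np.exp(log_lik - log_lik.max())
        weights /= weights.sum()

        # Resample when the effective sample size drops below half.
        ess = 1.0 / np.sum(weights ** 2)
        if ess < n_particles / 2:
            idx = rng.choice(n_particles, size=n_particles, p=weights)
            particles = particles[idx]
            weights = np.full(n_particles, 1.0 / n_particles)

    # Posterior mean of the source location.
    return float(np.sum(weights * particles))
```

Calling the function with readings taken at a sweep of sensor positions returns the posterior mean of the source location. Any mechanism that decides which observations or particles to emphasize, such as the attention component named in the abstract, would replace the uniform treatment used here.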

machine learning, particle, reinforcement learning, (12 more...)

2501.13084

Country:

North America > United States (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)

WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge

Chen, Jingyuan, Wu, Tao, Ji, Wei, Wu, Fei

Large language models (LLMs) have emerged as powerful tools in natural language processing (NLP), showing a promising future for artificial general intelligence (AGI). Despite their notable performance in the general domain, LLMs have remained suboptimal in the field of education, owing to the unique challenges presented by this domain, such as the need for more specialized knowledge, the requirement for personalized learning experiences, and the necessity for concise explanations of complex concepts. To address these issues, this paper presents a novel LLM for education named WisdomBot, which combines the power of LLMs with educational theories, enabling their seamless integration into educational contexts. Specifically, we harness self-instructed knowledge concepts and instructions under the guidance of Bloom's Taxonomy as training data. To further enhance the accuracy and professionalism of the model's responses to factual questions, we introduce two key enhancements during inference: local knowledge base retrieval augmentation and search engine retrieval augmentation. We substantiate the effectiveness of our approach by applying it to several Chinese LLMs, showing that the fine-tuned models can generate more reliable and professional responses.

large language model, machine learning, natural language, (18 more...)

2501.12877

Country:

Asia > Singapore (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > New York (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

REX: Causal Discovery based on Machine Learning and Explainability techniques

Renero, Jesus, Ochoa, Idoia, Maestre, Roberto

Causal discovery, the process of identifying cause-and-effect relationships from observational data, is a pivotal challenge in artificial intelligence (AI) and machine learning. Unveiling causal structures enables robust predictions, facilitates counterfactual reasoning, and enhances decision-making processes in complex systems [1]. Traditional methods for causal discovery often rely on statistical tests for independence and structural equation modeling, which may not scale efficiently with high-dimensional data or effectively capture intricate non-linear relationships [2, 3]. In recent years, machine learning models, particularly deep learning architectures, have achieved remarkable success in predictive tasks. However, these models are typically considered "black boxes" due to their lack of interpretability. This opacity has led to a growing interest in explainable AI (XAI) techniques, with Shapley values emerging as a prominent method for interpreting model predictions [4]. Shapley values, grounded in cooperative game theory, provide a principled approach to attributing the contribution of each feature to the output of a model by quantifying the average marginal contribution of a feature across all possible subsets of features [5]. While Shapley values offer valuable insights into feature importance within a model's predictive framework, the link between feature importance and causal influence is non-trivial.
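The definition of a Shapley value as an average marginal contribution over all feature subsets can be made concrete with a short sketch. The code below computes exact Shapley values by brute-force enumeration, which is only feasible for a handful of features. The function name `shapley_values` and the toy value function are illustrative assumptions and are unrelated to the REX implementation.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function over features 0..n_features-1.

    value_fn takes a frozenset of feature indices and returns a number.
    Cost grows as 2^n_features, so this is for small illustrations only.
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Probability that exactly `size` other features precede i
            # in a uniformly random ordering.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Toy value function (hypothetical): additive effects 1, 2, 3 plus an
# interaction bonus of 6 when features 0 and 1 are both present.
def toy_value(subset):
    base = sum({0: 1.0, 1: 2.0, 2: 3.0}[j] for j in subset)
    return base + (6.0 if {0, 1} <= subset else 0.0)

phi = shapley_values(toy_value, 3)
```

In this toy game the interaction bonus is split evenly between features 0 and 1, and the attributions sum to the value of the full feature set. The attributions describe contributions to the value function only, which is one way to see why they do not by themselves establish causal influence.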

artificial intelligence, causal relationship, machine learning, (17 more...)

2501.12706

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

A Probabilistic Model for Self-Supervised Learning

Fleissner, Maximilian, Esser, Pascal, Ghoshdastidar, Debarghya

Self-supervised learning (SSL) aims to find meaningful representations from unlabeled data by encoding semantic similarities through data augmentations. Despite its current popularity, theoretical insights about SSL are still scarce. For example, it is not yet known whether commonly used SSL loss functions can be related to a statistical model, in much the same way as OLS, generalized linear models, or PCA naturally emerge as maximum likelihood estimates of an underlying generative process. In this short paper, we consider a latent variable statistical model for SSL that exhibits an interesting property: depending on the informativeness of the data augmentations, the MLE of the model either reduces to PCA or approaches a simple non-contrastive loss. We analyze the model and also empirically illustrate our findings.

artificial intelligence, inductive learning, machine learning, (17 more...)

2501.13031

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.62)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Enhancing Robust Fairness via Confusional Spectral Regularization

Jin, Gaojie, Wu, Sihao, Liu, Jiaxu, Huang, Tianjin, Mu, Ronghui

Recent research has highlighted a critical issue known as "robust fairness", where robust accuracy varies significantly across different classes, undermining the reliability of deep neural networks (DNNs). A common approach to address this has been to dynamically reweight classes during training, giving more weight to those with lower empirical robust performance. However, we find that class-wise robust performance diverges between the training set and the testing set, which limits the effectiveness of these explicit reweighting methods and indicates the need for a principled alternative. In this work, we derive a robust generalization bound for the worst-class robust error within the PAC-Bayesian framework, accounting for unknown data distributions. Our analysis shows that the worst-class robust error is influenced by two main factors: the spectral norm of the empirical robust confusion matrix and the information embedded in the model and training set. While the latter has been extensively studied, we propose a novel regularization technique targeting the spectral norm of the robust confusion matrix to improve worst-class robust accuracy and enhance robust fairness.

Deep neural networks, spanning a diverse array of domains and applications, have shown impressive abilities to learn from training data and generalize effectively to new, unseen data. However, recent studies have uncovered a notable weakness in these DNNs: their vulnerability to subtle, often undetectable "adversarial attacks" (Biggio et al., 2013; Szegedy et al., 2013). It has been discovered that even slight perturbations to the input, typically imperceptible to humans, can drastically mislead the networks, resulting in significant prediction errors (Goodfellow et al., 2015; Wu et al., 2020a).
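As a rough illustration of the quantity the regularizer targets, the sketch below computes the spectral norm (largest singular value) of an empirical confusion matrix. The construction used here, row-normalized misclassification rates with the diagonal set to zero, is an assumption for illustration and may differ from the paper's exact definition. The function name `confusion_spectral_norm` is hypothetical.

```python
import numpy as np

def confusion_spectral_norm(y_true, y_pred, n_classes):
    """Spectral norm of a row-normalized confusion matrix with zero diagonal.

    Entry (i, j), i != j, is the fraction of class-i samples predicted as
    class j. This construction is an assumption made for illustration.
    """
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)

    # Count (true class, predicted class) pairs.
    counts = np.zeros((n_classes, n_classes))
    np.add.at(counts, (y_true, y_pred), 1.0)

    # Convert counts to per-class rates; guard against empty classes.
    row_totals = counts.sum(axis=1, keepdims=True)
    rates = counts / np.maximum(row_totals, 1.0)

    # Keep only the errors: correct predictions sit on the diagonal.
    np.fill_diagonal(rates, 0.0)

    # ord=2 on a 2-D array gives the largest singular value.
    return float(np.linalg.norm(rates, 2)), rates
```

To obtain a robust confusion matrix in the sense of the abstract, `y_pred` would hold predictions on adversarially perturbed inputs rather than clean ones. A classifier with no errors gives a norm of zero under this construction.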

artificial intelligence, bayesian inference, machine learning, (15 more...)

2501.13273

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Merseyside > Liverpool (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

arXiv.org Machine Learning, Jan-22-2025

Co-Learning Bayesian Optimization

Guo, Zhendong, Ong, Yew-Soon, He, Tiantian, Liu, Haitao

Bayesian optimization (BO) is well known to be sample-efficient for solving black-box problems. However, BO algorithms can sometimes get stuck in suboptimal solutions even with plenty of samples. Intrinsically, this suboptimality can be attributed to the poor surrogate accuracy of the trained Gaussian process (GP), particularly in the regions where the optimal solutions are located. Hence, we propose to build multiple GP models instead of a single GP surrogate, so that they complement each other and thus resolve the suboptimality problem of BO. Nevertheless, according to the bias-variance tradeoff equation, individual prediction errors can increase as the diversity of the models increases, which may lead to even worse overall surrogate accuracy. On the other hand, based on the theory of Rademacher complexity, it has been proved that exploiting the agreement of models on unlabeled information can help reduce the complexity of the hypothesis space and thereby achieve the required surrogate accuracy with fewer samples. The value of model agreement has been extensively demonstrated for co-training style algorithms, which boost model accuracy with a small portion of samples. Inspired by the above, we propose a novel BO algorithm, co-learning BO (CLBO), which exploits both model diversity and agreement on unlabeled information to improve the overall surrogate accuracy with limited samples and thereby achieve more efficient global optimization. Tests on five numerical toy problems and three engineering benchmarks demonstrate the effectiveness of the proposed CLBO.
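The abstract does not state the tradeoff equation it refers to. One standard identity that relates ensemble error, individual error, and diversity is the ambiguity decomposition for a uniformly averaged ensemble under squared error; whether this is the exact equation the authors use is an assumption. The sketch below checks the identity numerically. The function name `ambiguity_decomposition` is illustrative.

```python
import numpy as np

def ambiguity_decomposition(predictions, target):
    """Return (ensemble_error, avg_individual_error, diversity).

    predictions: array of shape (n_models, n_points)
    target:      array of shape (n_points,)

    For a uniform average under squared error the identity
        ensemble_error = avg_individual_error - diversity
    holds exactly.
    """
    predictions = np.asarray(predictions, dtype=float)
    target = np.asarray(target, dtype=float)

    # Ensemble prediction is the plain mean over models.
    ensemble = predictions.mean(axis=0)

    ensemble_error = np.mean((ensemble - target) ** 2)
    avg_individual_error = np.mean((predictions - target) ** 2)
    # Diversity: mean squared spread of the models around the ensemble.
    diversity = np.mean((predictions - ensemble) ** 2)
    return ensemble_error, avg_individual_error, diversity
```

Read this way, higher diversity lowers the ensemble error only if it does not raise the average individual error by as much, which matches the concern the abstract raises about increasing diversity.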

algorithm, artificial intelligence, machine learning, (18 more...)

doi: 10.1109/TCYB.2022.3168551

2501.13332

Country: Asia (0.28)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Neural Information Processing Systems, Jan-21-2025, 05:38:11 GMT

Review for NeurIPS paper: Deep Relational Topic Modeling via Graph Poisson Gamma Belief Network

Additional Feedback: The authors propose a Gibbs sampling algorithm that is said to be very efficient. I would expect the parameters to be highly correlated, especially in a three-layer model. Could the authors elaborate on this: efficient in what sense? I assume the Gibbs sampler is used as a stochastic optimization algorithm rather than as a way to explore the whole posterior? The link activation variable u_k is essentially a topic-level variable that gives strength to individual topics for the links.
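The reviewer's concern about correlated parameters can be illustrated on a toy target that has nothing to do with the paper's model. The sketch below runs a two-block Gibbs sampler on a standard bivariate normal with correlation `rho`. For this target the sampled `x` chain is an AR(1) process with coefficient `rho**2`, so strong correlation between the two blocks means slow mixing. Function names are illustrative.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_samples=20000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Alternates draws from the exact conditionals
        x | y ~ N(rho * y, 1 - rho^2)
        y | x ~ N(rho * x, 1 - rho^2)
    and returns the chain of x values.
    """
    rng = np.random.default_rng(seed)
    cond_std = np.sqrt(1.0 - rho ** 2)
    x, y = 0.0, 0.0
    xs = np.empty(n_samples)
    for t in range(n_samples):
        x = rng.normal(rho * y, cond_std)
        y = rng.normal(rho * x, cond_std)
        xs[t] = x
    return xs

def lag1_autocorr(chain):
    """Sample lag-1 autocorrelation of a 1-D chain."""
    c = chain - chain.mean()
    return float(np.dot(c[:-1], c[1:]) / np.dot(c, c))

# Weakly vs. strongly correlated targets.
slow = lag1_autocorr(gibbs_bivariate_normal(0.95))
fast = lag1_autocorr(gibbs_bivariate_normal(0.10))
```

With `rho = 0.95` successive draws are almost copies of each other, while with `rho = 0.10` they are nearly independent. A sampler can therefore be cheap per iteration and still explore the posterior slowly, which is the distinction the reviewer asks the authors to clarify.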

deep relational topic modeling, gibbs sampler, graph poisson gamma belief network, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.73)

Neural Information Processing Systems, Jan-21-2025, 05:38:03 GMT

Review for NeurIPS paper: Deep Relational Topic Modeling via Graph Poisson Gamma Belief Network

The paper, the reviews, the author response and the ensuing discussion were all taken into consideration. All reviewers considered the work marginally above the acceptance threshold. Novelty was a concern for some, but other reviewers appreciated it. The lack of comparisons to GCN and other methods, of an evaluation of the underlying topics, and of consideration of prior topic modeling work were also concerns. However, the paper was generally felt to represent good work, and the use of a deep model in this context, the design of the model, and the convincing experiments were appreciated.

deep relational topic modeling, graph poisson gamma belief network, neurips paper, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)