AITopics

doi: 10.1177/20552076241239274

2403.20124

Country:

Europe > Spain > Castile and León > León Province > León (0.05)
Europe > Spain > Castile and León > Valladolid Province > Valladolid (0.04)
North America > United States > Oklahoma > Payne County > Cushing (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Nutrition and Weight Loss (1.00)
Health & Medicine > Therapeutic Area > Internal Medicine (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Teisseyre, Paweł, Furmańczyk, Konrad, Mielniczuk, Jan

Verifying the Selected Completely at Random Assumption in Positive-Unlabeled Learning

arXiv.org Machine Learning Mar-29-2024

The goal of positive-unlabeled (PU) learning is to train a binary classifier on the basis of training data containing positive and unlabeled instances, where unlabeled observations can belong either to the positive class or to the negative class. Modeling PU data requires certain assumptions on the labeling mechanism that describes which positive observations are assigned a label. The simplest assumption, considered in early works, is SCAR (Selected Completely at Random), according to which the propensity score function, defined as the probability of assigning a label to a positive observation, is constant. On the other hand, a much more realistic assumption is SAR (Selected at Random), which states that the propensity function depends solely on the observed feature vector. SCAR-based algorithms are much simpler and computationally much faster than SAR-based algorithms, which usually require challenging estimation of the propensity score. In this work, we propose a relatively simple and computationally fast test that can be used to determine whether the observed data meet the SCAR assumption. Our test is based on generating artificial labels conforming to the SCAR case, which in turn allows us to mimic the distribution of the test statistic under the null hypothesis of SCAR. We justify our method theoretically. In experiments, we demonstrate that the test successfully detects various deviations from the SCAR scenario while effectively controlling the type I error. The proposed test can be recommended as a pre-processing step to decide which final PU algorithm to choose when the nature of the labeling mechanism is not known.
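To make the difference between the two labeling mechanisms concrete, here is a minimal sketch of how labels could be simulated under SCAR and under SAR. It is not the test proposed in the paper; the function names, the logistic form of the SAR propensity, and the `seed` parameter are assumptions introduced only for illustration.

```python
import math
import random


def scar_propensity(n, c):
    """SCAR: every instance has the same constant propensity score c."""
    return [c] * n


def sar_propensity(x, slope=1.0):
    """SAR: the propensity depends on the feature value.

    The logistic form used here is a hypothetical choice for illustration.
    """
    return [1.0 / (1.0 + math.exp(-slope * xi)) for xi in x]


def assign_labels(y, propensity, seed=0):
    """Return s_i = 1 if instance i is labeled, else 0.

    y          : true classes (1 = positive, 0 = negative)
    propensity : probability of labeling each instance, given it is positive
    """
    rng = random.Random(seed)
    # Negatives are never labeled; a positive is labeled with probability e(x_i).
    return [
        1 if yi == 1 and rng.random() < ei else 0
        for yi, ei in zip(y, propensity)
    ]


if __name__ == "__main__":
    x = [-2.0, -1.0, 0.0, 1.0, 2.0]
    y = [1, 0, 1, 1, 0]
    print(assign_labels(y, scar_propensity(len(y), 0.5)))
    print(assign_labels(y, sar_propensity(x)))
```

In both cases the unlabeled set mixes negatives with positives that were not selected; only the selection probability differs.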

assumption, probability, proceedings, (15 more...)

2404.00145

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Experimental Study (0.47)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

arXiv.org Machine Learning Mar-29-2024

Bayesian Nonparametrics: An Alternative to Deep Learning

Moraffah, Bahman

Bayesian nonparametric models offer a flexible and powerful framework for statistical model selection, enabling the adaptation of model complexity to the intricacies of diverse datasets. This survey intends to delve into the significance of Bayesian nonparametrics, particularly in addressing complex challenges across various domains such as statistics, computer science, and electrical engineering. By elucidating the basic properties and theoretical foundations of these nonparametric models, this survey aims to provide a comprehensive understanding of Bayesian nonparametrics and their relevance in addressing complex problems, particularly in the domain of multi-object tracking. Through this exploration, we uncover the versatility and efficacy of Bayesian nonparametric methodologies, paving the way for innovative solutions to intricate challenges across diverse disciplines.

application, dirichlet process, posterior distribution, (16 more...)

2404.00085

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Ohio (0.04)

Genre:

Research Report (0.83)
Overview (0.54)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial Intelligence Mar-28-2024

The State of Lithium-Ion Battery Health Prognostics in the CPS Era

Shinde, Gaurav, Mohapatra, Rohan, Krishan, Pooja, Garg, Harish, Prabhu, Srikanth, Das, Sanchari, Masum, Mohammad, Sengupta, Saptarshi

Lithium-ion batteries (Li-ion) have revolutionized energy storage technology, becoming integral to our daily lives by powering a diverse range of devices and applications. Their high energy density, fast power response, recyclability, and mobility advantages have made them the preferred choice for numerous sectors. This paper explores the seamless integration of Prognostics and Health Management (PHM) within batteries, presenting a multidisciplinary approach that enhances the reliability, safety, and performance of these powerhouses. Remaining useful life (RUL), a critical concept in prognostics, is examined in depth, emphasizing its role in predicting component failure before it occurs. The paper reviews various RUL prediction methods, from traditional models to cutting-edge data-driven techniques. Furthermore, it highlights the paradigm shift toward deep learning architectures within the field of Li-ion battery health prognostics, elucidating the pivotal role of deep learning in addressing battery system complexities. Practical applications of PHM across industries are also explored, offering readers insights into real-world implementations. This paper serves as a comprehensive guide, catering to both researchers and practitioners in the field of Li-ion battery PHM.

battery, estimation, ieeexplore, (12 more...)

2403.19816

Country:

North America > United States > California > Santa Clara County > San Jose (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

arXiv.org Artificial Intelligence Mar-28-2024

Checkpoint Merging via Bayesian Optimization in LLM Pretraining

Liu, Deyuan, Wang, Zecheng, Wang, Bingning, Chen, Weipeng, Li, Chunshan, Tu, Zhiying, Chu, Dianhui, Li, Bo, Sui, Dianbo

The rapid proliferation of large language models (LLMs) such as GPT-4 and Gemini underscores the intense demand for resources during their training processes, posing significant challenges due to substantial computational and environmental costs. To alleviate this issue, we propose checkpoint merging in LLM pretraining. This method utilizes LLM checkpoints with shared training trajectories, and is rooted in an extensive search space exploration for the best merging weight via Bayesian optimization. Through various experiments, we demonstrate that: (1) our proposed methodology exhibits the capacity to augment pretraining, presenting an opportunity akin to obtaining substantial benefits at minimal cost; (2) our proposed methodology, despite requiring a given held-out dataset, still demonstrates robust generalization capabilities across diverse domains, a pivotal aspect in pretraining.
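The idea of searching for a merging weight on held-out data can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes two checkpoints are merged by linear interpolation of their parameters, and it replaces Bayesian optimization with a plain grid search to stay dependency-free. The names `merge_checkpoints`, `search_merging_weight`, and `heldout_loss` are hypothetical.

```python
def merge_checkpoints(theta_a, theta_b, lam):
    """Linearly interpolate two checkpoints: lam * theta_a + (1 - lam) * theta_b.

    Each checkpoint is a dict mapping a parameter name to a list of floats
    (an assumed toy representation of model weights).
    """
    return {
        name: [lam * a + (1.0 - lam) * b
               for a, b in zip(theta_a[name], theta_b[name])]
        for name in theta_a
    }


def search_merging_weight(theta_a, theta_b, heldout_loss, candidates):
    """Pick the candidate weight with the lowest held-out loss.

    A grid search stands in here for the Bayesian optimization
    mentioned in the abstract.
    """
    best_lam, best_loss = None, float("inf")
    for lam in candidates:
        loss = heldout_loss(merge_checkpoints(theta_a, theta_b, lam))
        if loss < best_loss:
            best_lam, best_loss = lam, loss
    return best_lam, best_loss


if __name__ == "__main__":
    # Toy example: a single scalar weight and a quadratic held-out loss.
    ckpt_a = {"w": [0.0]}
    ckpt_b = {"w": [1.0]}
    toy_loss = lambda theta: (theta["w"][0] - 0.25) ** 2
    print(search_merging_weight(ckpt_a, ckpt_b, toy_loss,
                                [0.0, 0.25, 0.5, 0.75, 1.0]))
```

A Bayesian optimizer would replace the loop over `candidates` with a model-guided choice of the next weight to evaluate, which matters when each held-out evaluation is expensive.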

checkpoint, dataset, experiment, (16 more...)

2403.19390

Country:

Asia > Middle East > Jordan (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Monaco (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Hewapathirana, Jayathi, Sumanathilaka, Deshan

EmoScan: Automatic Screening of Depression Symptoms in Romanized Sinhala Tweets

arXiv.org Artificial Intelligence Mar-28-2024

This work explores the use of Romanized Sinhala social media data to identify individuals at risk of depression. A machine learning-based framework is presented for the automatic screening of depression symptoms by analyzing language patterns, sentiment, and behavioural cues within a comprehensive dataset of social media posts. The research compares the suitability of neural networks with that of classical machine learning techniques. The proposed neural network with an attention layer, which is capable of handling long sequence data, attains an accuracy of 93.25% in detecting depression symptoms, surpassing current state-of-the-art methods. These findings underscore the efficacy of this approach in pinpointing individuals in need of proactive interventions and support. Mental health professionals, policymakers, and social media companies can gain valuable insights through the proposed model. Leveraging natural language processing techniques and machine learning algorithms, this work offers a promising pathway for mental health screening in the digital era. By harnessing the potential of social media data, the framework introduces a proactive method for recognizing and assisting individuals at risk of depression. In conclusion, this research contributes to the advancement of proactive interventions and support systems for mental health, thereby influencing both research and practical applications in the field.

accuracy, depression, romanized sinhala, (13 more...)

2403.19728

Country: Asia > Sri Lanka > Western Province > Colombo > Colombo (0.05)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Cucuringu, Mihai, Dong, Xiaowen, Zhang, Ning

Maximum Likelihood Estimation on Stochastic Blockmodels for Directed Graph Clustering

arXiv.org Machine Learning Mar-28-2024

This paper studies the directed graph clustering problem through the lens of statistics, where we formulate clustering as estimating underlying communities in the directed stochastic block model (DSBM). We conduct the maximum likelihood estimation (MLE) on the DSBM and thereby ascertain the most probable community assignment given the observed graph structure. In addition to the statistical point of view, we further establish the equivalence between this MLE formulation and a novel flow optimization heuristic, which jointly considers two important directed graph statistics: edge density and edge orientation. Building on this new formulation of directed clustering, we introduce two efficient and interpretable directed clustering algorithms, a spectral clustering algorithm and a semidefinite programming based clustering algorithm. We provide a theoretical upper bound on the number of misclustered vertices of the spectral clustering algorithm using tools from matrix perturbation theory. We compare, both quantitatively and qualitatively, our proposed algorithms with existing directed clustering methods on both synthetic and real-world data, thus providing further ground to our theoretical contributions. Keywords: graph clustering, directed graphs, maximum likelihood estimation, spectral methods, matrix perturbation analysis, semidefinite programming. Authors are listed in alphabetical order.

algorithm, graph, matrix, (16 more...)

2403.19516

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Wang, Ziyu, Holmes, Chris

On Uncertainty Quantification for Near-Bayes Optimal Algorithms

arXiv.org Machine Learning Mar-28-2024

Bayesian modelling allows for the quantification of predictive uncertainty, which is crucial in safety-critical applications. Yet for many machine learning (ML) algorithms, it is difficult to construct or implement their Bayesian counterpart. In this work we present a promising approach to address this challenge, based on the hypothesis that commonly used ML algorithms are efficient across a wide variety of tasks and may thus be near Bayes-optimal w.r.t. an unknown task distribution. We prove that it is possible to recover the Bayesian posterior defined by the task distribution, which is unknown but optimal in this setting, by building a martingale posterior using the algorithm. We further propose a practical uncertainty quantification method that applies to general ML algorithms. Experiments based on a variety of non-NN and NN algorithms demonstrate the efficacy of our method.

algorithm, dataset, posterior, (14 more...)

2403.19381

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Emmert, Johannes, Mendez, Ronald, Dastjerdi, Houman Mirzaalian, Syben, Christopher, Maier, Andreas

The Artificial Neural Twin -- Process Optimization and Continual Learning in Distributed Process Chains

arXiv.org Artificial Intelligence Mar-27-2024

Industrial process optimization and control are crucial to increasing economic and ecological efficiency. However, data sovereignty, differing goals, or the required expert knowledge for implementation impede holistic implementation. Further, the increasing use of data-driven AI methods in process models and industrial sensory often requires regular fine-tuning to accommodate distribution drifts. We propose the Artificial Neural Twin, which combines concepts from model predictive control, deep learning, and sensor networks to address these issues. Our approach introduces differentiable data fusion to estimate the state of distributed process steps and their dependence on input data. By treating the interconnected process steps as a quasi neural network, we can backpropagate loss gradients for process optimization or model fine-tuning to process parameters or AI models, respectively. The concept is demonstrated on a virtual machine park simulated in Unity, consisting of bulk material processes in plastic recycling.

artificial intelligence, machine learning, time step, (18 more...)

2403.18343

Genre:

Instructional Material (0.67)
Workflow (0.54)

Industry:

Materials (0.66)
Energy > Oil & Gas > Upstream (0.34)
Water & Waste Management > Solid Waste Management (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Hengst, Floris den, Wolter, Ralf, Altmeyer, Patrick, Kaygan, Arda

Conformal Intent Classification and Clarification for Fast and Accurate Intent Recognition

arXiv.org Artificial Intelligence Mar-27-2024

We present Conformal Intent Classification and Clarification (CICC), a framework for fast and accurate intent classification for task-oriented dialogue systems. The framework turns heuristic uncertainty scores of any intent classifier into a clarification question that is guaranteed to contain the true intent at a pre-defined confidence level. By disambiguating between a small number of likely intents, the user query can be resolved quickly and accurately. Additionally, we propose to augment the framework for out-of-scope detection. In a comparative evaluation using seven intent recognition datasets we find that CICC generates small clarification questions and is capable of out-of-scope detection. CICC can help practitioners and researchers substantially in improving the user experience of dialogue agents with specific clarification questions.
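The step from classifier scores to a small set of candidate intents can be illustrated with standard split conformal prediction. This is a minimal sketch, not the CICC implementation: the nonconformity score (one minus the probability of the true intent), the function names, and the toy intents are assumptions made for illustration.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Compute the split-conformal score threshold at miscoverage level alpha.

    cal_probs  : list of dicts mapping intent -> predicted probability
    cal_labels : true intent for each calibration example
    """
    # Nonconformity score: 1 - probability assigned to the true intent.
    scores = sorted(1.0 - probs[label]
                    for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration examples for this alpha: include every intent.
        return 1.0
    return scores[k - 1]


def intent_set(probs, threshold):
    """Return all intents whose nonconformity score is within the threshold."""
    return [intent for intent, p in probs.items() if 1.0 - p <= threshold]


def respond(probs, threshold):
    """Answer directly if one intent remains, otherwise ask for clarification."""
    candidates = intent_set(probs, threshold)
    if len(candidates) == 1:
        return "intent: " + candidates[0]
    return "Did you mean: " + ", ".join(candidates) + "?"


if __name__ == "__main__":
    query = {"check_balance": 0.55, "transfer": 0.40, "other": 0.05}
    print(respond(query, threshold=0.7))
```

With exchangeable calibration and test data, sets built this way contain the true intent with probability at least 1 - alpha, which is the kind of guarantee the abstract refers to.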

cicc, prediction, query, (16 more...)

2403.18973

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)