AITopics

2412.11692

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Gou, Junyang, Salberg, Arnt-Børre, Shahvandi, Mostafa Kiani, Tourian, Mohammad J., Meyer, Ulrich, Boergens, Eva, Waldeland, Anders U., Velicogna, Isabella, Dahl, Fredrik, Jäggi, Adrian, Schindler, Konrad, Soja, Benedikt

Uncertainties of Satellite-based Essential Climate Variables from Deep Learning

Accurate uncertainty information associated with essential climate variables (ECVs) is crucial for reliable climate modeling and understanding the spatiotemporal evolution of the Earth system. In recent years, geoscience and climate scientists have benefited from rapid progress in deep learning to advance the estimation of ECV products with improved accuracy. However, the quantification of uncertainties associated with the output of such deep learning models has yet to be thoroughly adopted. This survey explores the types of uncertainties associated with ECVs estimated from deep learning and the techniques to quantify them. The focus is on highlighting the importance of quantifying uncertainties inherent in ECV estimates, considering the dynamic and multifaceted nature of climate data. The survey starts by clarifying the definition of aleatoric and epistemic uncertainties and their roles in a typical satellite observation processing workflow, followed by bridging the gap between conventional statistical and deep learning views on uncertainties. Then, we comprehensively review the existing techniques for quantifying uncertainties associated with deep learning algorithms, focusing on their application in ECV studies. The specific need for modification to fit the requirements from both the Earth observation side and the deep learning side in such interdisciplinary tasks is discussed. Finally, we demonstrate our findings with two ECV examples, snow cover and terrestrial water storage, and provide our perspectives for future research.
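
The split between aleatoric and epistemic uncertainty can be illustrated with the law of total variance applied to an ensemble of models. This is a minimal sketch of one common approach, not necessarily a technique the survey recommends; the function name and the example numbers are hypothetical.

```python
from statistics import fmean, pvariance

def decompose_uncertainty(means, variances):
    """Split the predictive variance for a single input (law of total variance).

    means[i] and variances[i] are the mean and variance predicted by
    ensemble member i (hypothetical inputs, for illustration only).
    """
    aleatoric = fmean(variances)   # average data-noise variance predicted by the members
    epistemic = pvariance(means)   # disagreement between the members' mean predictions
    return aleatoric, epistemic, aleatoric + epistemic

aleatoric, epistemic, total = decompose_uncertainty(
    means=[1.0, 2.0, 3.0], variances=[0.5, 0.5, 0.5]
)
```

Under this decomposition, collecting more training data would be expected to shrink the epistemic term, while the aleatoric term reflects noise in the observations themselves.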

artificial intelligence, epistemic uncertainty, machine learning, (16 more...)

2412.17506

Country:

Europe > Sweden (0.28)
Europe > Norway (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(19 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable (0.94)
Water & Waste Management > Water Management (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning, Dec-23-2024

Fast Causal Discovery by Approximate Kernel-based Generalized Score Functions with Linear Computational Complexity

Ren, Yixin, Zhang, Haocheng, Xia, Yewei, Zhang, Hao, Guan, Jihong, Zhou, Shuigeng

Score-based causal discovery methods can effectively identify causal relationships by evaluating candidate graphs and selecting the one with the highest score. One popular class of scores is kernel-based generalized score functions, which can adapt to a wide range of scenarios and work well in practice because they circumvent assumptions about causal mechanisms and data distributions. Despite these advantages, kernel-based generalized score functions pose serious computational challenges in time and space, with a time complexity of $\mathcal{O}(n^3)$ and a memory complexity of $\mathcal{O}(n^2)$, where $n$ is the sample size. In this paper, we propose an approximate kernel-based generalized score function with $\mathcal{O}(n)$ time and space complexities. We achieve this by using a low-rank technique, by designing a set of rules to handle the complex composite matrix operations required to calculate the score, and by developing sampling algorithms that handle diverse data types efficiently. Our extensive causal discovery experiments on both synthetic and real-world data demonstrate that, compared to the state-of-the-art method, our method not only significantly reduces computational costs but also achieves comparable accuracy, especially for large datasets.
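
To show why a low-rank factorization can bring kernel computations down to linear cost in $n$, the sketch below evaluates a log-determinant through an $m \times m$ matrix instead of an $n \times n$ one. It is an illustration under stated assumptions: random Fourier features stand in for the paper's sampling algorithms, and the determinant identity is only one example of the kind of rewriting rule the abstract alludes to. All names here are hypothetical.

```python
import numpy as np

def random_fourier_features(x, num_features=20, lengthscale=1.0, seed=0):
    """Map x of shape (n, d) to z of shape (n, m); z @ z.T approximates an RBF kernel."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=1.0 / lengthscale, size=(x.shape[1], num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(x @ w + b)

def logdet_low_rank(z, lam):
    """log det(lam * I_n + z @ z.T), computed from an m x m matrix only.

    Uses det(lam * I_n + Z Z^T) = lam^n * det(I_m + Z^T Z / lam),
    which costs O(n m^2) time and O(n m) memory.
    """
    n, m = z.shape
    small = np.eye(m) + (z.T @ z) / lam
    _, logdet_small = np.linalg.slogdet(small)
    return n * np.log(lam) + logdet_small

x = np.random.default_rng(1).normal(size=(200, 2))
z = random_fourier_features(x)
fast = logdet_low_rank(z, lam=0.1)

# Dense reference, O(n^3) time and O(n^2) memory; feasible only for small n.
_, dense = np.linalg.slogdet(0.1 * np.eye(200) + z @ z.T)
```

For a fixed rank $m$, the low-rank route scales linearly in $n$, whereas the dense reference scales cubically.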

artificial intelligence, machine learning, score function, (12 more...)

2412.17717

Country:

North America > Canada > Ontario > Toronto (0.05)
Asia > China > Shanghai > Shanghai (0.04)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Lagrange, Nikita, Isambert, Herve

An efficient search-and-score algorithm for ancestral graphs using multivariate information scores

arXiv.org Machine Learning, Dec-23-2024

We propose a greedy search-and-score algorithm for ancestral graphs, which include directed as well as bidirected edges, originating from unobserved latent variables. The normalized likelihood score of ancestral graphs is estimated in terms of multivariate information over relevant "ac-connected subsets" of vertices, C, that are connected through collider paths confined to the ancestor set of C. For computational efficiency, the proposed two-step algorithm relies on local information scores limited to the close surrounding vertices of each node (step 1) and edge (step 2). This computational strategy, although restricted to information contributions from ac-connected subsets containing up to two-collider paths, is shown to outperform state-of-the-art causal discovery methods on challenging benchmark datasets.

ancestral graph, artificial intelligence, machine learning, (17 more...)

2412.17508

Country:

Europe (0.67)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Leveraging Cardiovascular Simulations for In-Vivo Prediction of Cardiac Biomarkers

Manduchi, Laura, Wehenkel, Antoine, Behrmann, Jens, Pegolotti, Luca, Miller, Andy C., Sener, Ozan, Cuturi, Marco, Sapiro, Guillermo, Jacobsen, Jörn-Henrik

Whole-body hemodynamics simulators, which model blood flow and pressure waveforms as functions of physiological parameters, are now essential tools for studying cardiovascular systems. However, solving the corresponding inverse problem of mapping observations (e.g., arterial pressure waveforms at specific locations in the arterial network) back to plausible physiological parameters remains challenging. Leveraging recent advances in simulation-based inference, we cast this problem as statistical inference by training an amortized neural posterior estimator on a newly built large dataset of cardiac simulations that we publicly release. To better align simulated data with real-world measurements, we incorporate stochastic elements modeling exogenous effects. The proposed framework can further integrate in-vivo data sources to refine its predictive capabilities on real-world data. In silico, we demonstrate that the proposed framework enables finely quantifying uncertainty associated with individual measurements, allowing trustworthy prediction of four biomarkers of clinical interest (Heart Rate, Cardiac Output, Systemic Vascular Resistance, and Left Ventricular Ejection Time) from arterial pressure waveforms and photoplethysmograms. Furthermore, we validate the framework in vivo, where our method accurately captures temporal trends in Cardiac Output (CO) and Systemic Vascular Resistance (SVR) monitoring on the VitalDB dataset. Finally, the predictive error made by the model monotonically increases with the predicted uncertainty, thereby directly supporting the automatic rejection of unusable measurements.
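
If predictive error does grow with predicted uncertainty, measurements can be filtered by ranking them on that uncertainty. The sketch below shows the idea only; the function name, the `keep_fraction` parameter, and the example values are hypothetical and do not come from the paper.

```python
def reject_most_uncertain(predictions, uncertainties, keep_fraction=0.8):
    """Return indices of the predictions with the lowest predicted uncertainty."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Rank measurement indices from most to least confident.
    order = sorted(range(len(predictions)), key=lambda i: uncertainties[i])
    n_keep = max(1, int(round(keep_fraction * len(predictions))))
    # Report the retained indices in their original order.
    return sorted(order[:n_keep])

kept = reject_most_uncertain(
    predictions=[72.0, 65.0, 80.0, 90.0, 68.0],   # e.g. heart-rate estimates
    uncertainties=[1.0, 0.5, 4.0, 9.0, 0.8],      # e.g. posterior standard deviations
    keep_fraction=0.6,
)
```

Here the two measurements with the largest predicted uncertainty are discarded and the remaining three are kept.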

artificial intelligence, machine learning, posterior distribution, (17 more...)

2412.17542

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Rusli, Andre, Shishido, Makoto

An Experimental Evaluation of Japanese Tokenizers for Sentiment-Based Text Classification

This study investigates the performance of three popular tokenization tools: MeCab, Sudachi, and SentencePiece, when applied as a preprocessing step for sentiment-based text classification of Japanese texts. Using Term Frequency-Inverse Document Frequency (TF-IDF) vectorization, we evaluate two traditional machine learning classifiers: Multinomial Naive Bayes and Logistic Regression. The results reveal that Sudachi produces tokens closely aligned with dictionary definitions, while MeCab and SentencePiece demonstrate faster processing speeds. The combination of SentencePiece, TF-IDF, and Logistic Regression outperforms the other alternatives in terms of classification performance.

machine learning, natural language, text classification, (19 more...)

2412.17361

Country: Asia > Japan > Honshū (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)

Better Knowledge Enhancement for Privacy-Preserving Cross-Project Defect Prediction

Wang, Yuying, Li, Yichen, Wang, Haozhao, Zhao, Lei, Zhang, Xiaofang

Cross-Project Defect Prediction (CPDP) poses the non-trivial challenge of constructing a reliable defect predictor by leveraging data from other projects, particularly when data owners are concerned about data privacy. In recent years, Federated Learning (FL) has emerged as a paradigm for protecting private information by collaboratively training a global model among multiple parties without sharing raw data. While the direct application of FL to the CPDP task offers a promising solution to privacy concerns, the data heterogeneity arising from proprietary projects across different companies or organizations complicates model training. In this paper, we study privacy-preserving cross-project defect prediction with data heterogeneity under the federated learning framework. To address this problem, we propose a novel knowledge enhancement approach named FedDP with two simple but effective solutions: (1) Local Heterogeneity Awareness and (2) Global Knowledge Distillation. Specifically, we employ open-source project data as the distillation dataset and optimize the global model with the heterogeneity-aware local model ensemble via knowledge distillation. Experimental results on 19 projects from two datasets demonstrate that our method significantly outperforms baselines.

artificial intelligence, data mining, machine learning, (15 more...)

2412.17317

Country:

North America > United States (1.00)
Asia (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning, Dec-22-2024

Architecture-Aware Learning Curve Extrapolation via Graph Ordinary Differential Equation

Ding, Yanna, Huang, Zijie, Shou, Xiao, Guo, Yihang, Sun, Yizhou, Gao, Jianxi

Training neural architectures is a resource-intensive endeavor, often demanding considerable computational power and time. Researchers have developed various methodologies to predict the performance of neural networks early in the training process using learning curve data. Some methods (Domhan et al., 2015; Gargiani et al., 2019; Adriaensen et al., 2023) apply Bayesian inference to project these curves forward, while others employ time-series prediction techniques, such as LSTM networks. Despite their effectiveness, these approaches (Swersky et al., 2014; Baker et al., 2017) typically overlook the architectural features of networks, missing out on crucial insights that could be derived from the models' topology.

We utilize a seq2seq variational autoencoder framework to analyze the initial stages of a learning curve and predict its future progression. This predictive capability is further enhanced by an architecture-aware component that produces a graph-level embedding from the architecture's topology, employing techniques like Graph Convolutional Networks (GCN) (Kipf and Welling, 2016) and Differentiable Pooling (Ying et al., 2018). This integration not only improves the accuracy of learning curve extrapolations compared to existing methods but also significantly facilitates model ranking, potentially leading to more efficient use of computational resources, accelerated experimentation cycles, and faster progress in ...

artificial intelligence, deep learning, machine learning, (18 more...)

2412.15554

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > North Macedonia > Skopje Statistical Region > Skopje Municipality > Skopje (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial Intelligence, Dec-22-2024

Machine learning and natural language processing models to predict the extent of food processing

Arora, Nalin, Bhagat, Sumit, Dhama, Riya, Bagler, Ganesh

The dramatic increase in consumption of ultra-processed food has been associated with numerous adverse health effects. Given the public health consequences linked to ultra-processed food consumption, it is highly relevant to build computational models to predict the processing of food products. We created a range of machine learning, deep learning, and NLP models to predict the extent of food processing by integrating the FNDDS dataset of food products and their nutrient profiles with their reported NOVA processing level. Starting with the full nutritional panel of 102 features, we further implemented coarse-graining of features to 65 nutrients by dropping flavonoids, and then to 13 nutrients by considering the FDA's 13-nutrient panel. LGBM Classifier and Random Forest emerged as the best models for 102 and 65 nutrients, respectively, with F1-scores of 0.9411 and 0.9345 and MCCs of 0.8691 and 0.8543. For the 13-nutrient panel, Gradient Boost achieved the best F1-score of 0.9284 and MCC of 0.8425. We also implemented NLP-based models, which exhibited state-of-the-art performance.
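
For reference, the two reported metrics can be computed from confusion-matrix counts as sketched below. This covers the binary case only, as a simplifying assumption: the abstract does not say how scores were aggregated across NOVA levels, and the counts used here are made up for illustration.

```python
from math import sqrt

def f1_and_mcc(tp, fp, fn, tn):
    """F1-score and Matthews correlation coefficient for a binary confusion matrix."""
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    mcc_denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal count is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return f1, mcc

f1, mcc = f1_and_mcc(tp=40, fp=10, fn=5, tn=45)
```

Unlike F1, MCC also uses the true-negative count, which is one reason the two are often reported together.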

large language model, machine learning, nutrient, (17 more...)

2412.17217

Country:

North America > United States > North Carolina (0.04)
North America > Mexico (0.04)
North America > Guatemala (0.04)
(7 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Consumer Health (1.00)
Food & Agriculture (1.00)
Education > Health & Safety > School Nutrition (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

arXiv.org Artificial Intelligence, Dec-22-2024

Reduced Order Models and Conditional Expectation

Matthies, Hermann G.

Systems may depend on parameters which one may control, or which serve to optimise the system, or are imposed externally, or they could be uncertain. This last case is taken as the "Leitmotiv" for the following. A reduced order model is produced from the full order model by some kind of projection onto a relatively low-dimensional manifold or subspace. The parameter-dependent reduction process produces a function of the parameters into the manifold. One now wants to examine the relation between the full and the reduced state for all possible parameter values of interest. Similarly, in machine learning, a function from the parameter set into the image space of the model is learned on a training set of samples, typically by minimising the mean-square error. This set may be seen as a sample from some probability distribution, and thus the training is an approximate computation of the expectation, giving an approximation to the conditional expectation, a special case of Bayesian updating where the Bayesian loss function is the mean-square error. This offers the possibility of having a combined look at these methods, and also of introducing more general loss functions.
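
The link between mean-square training and conditional expectation can be written out as follows. The symbols are assumptions introduced for illustration and are not the paper's notation: $X$ is the random parameter, $Y$ the quantity to be predicted, $(x_i, y_i)$ the training samples, and $\mathcal{F}$ the model class.

```latex
% The conditional expectation is the mean-square optimal predictor:
\[
  \mathbb{E}[Y \mid X] \;=\; \operatorname*{arg\,min}_{f}\;
  \mathbb{E}\bigl[\, \lVert Y - f(X) \rVert^{2} \,\bigr].
\]
% This follows from the orthogonal decomposition
\[
  \mathbb{E}\bigl[\lVert Y - f(X) \rVert^{2}\bigr]
  \;=\; \mathbb{E}\bigl[\lVert Y - \mathbb{E}[Y \mid X] \rVert^{2}\bigr]
  \;+\; \mathbb{E}\bigl[\lVert \mathbb{E}[Y \mid X] - f(X) \rVert^{2}\bigr].
\]
% Training replaces the expectation by a sample average over N samples:
\[
  \hat{f} \;=\; \operatorname*{arg\,min}_{f \in \mathcal{F}}\;
  \frac{1}{N} \sum_{i=1}^{N} \lVert y_i - f(x_i) \rVert^{2}.
\]
```

The first term of the decomposition does not depend on $f$, so minimising the mean-square error amounts to approximating $\mathbb{E}[Y \mid X]$ within $\mathcal{F}$.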

artificial intelligence, conditional expectation, machine learning, (20 more...)

2412.19836

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)