AITopics

2311.02695

Country: Europe (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial IntelligenceNov-5-2023

Non-parametric Conditional Independence Testing for Mixed Continuous-Categorical Variables: A Novel Method and Numerical Evaluation

Popescu, Oana-Iuliana, Gerhardus, Andreas, Runge, Jakob

Conditional independence testing (CIT) is a common task in machine learning, e.g., for variable selection, and a main component of constraint-based causal discovery. While most current CIT approaches assume that all variables are numerical or all variables are categorical, many real-world applications involve mixed-type datasets that include numerical and categorical variables. Non-parametric CIT can be conducted using conditional mutual information (CMI) estimators combined with a local permutation scheme. Recently, two novel CMI estimators for mixed-type datasets based on k-nearest-neighbors (k-NN) have been proposed. As with any k-NN method, these estimators rely on the definition of a distance metric. One approach computes distances by a one-hot encoding of the categorical variables, essentially treating categorical variables as discrete-numerical, while the other expresses CMI by entropy terms where the categorical variables appear as conditions only. In this work, we study these estimators and propose a variation of the former approach that does not treat categorical variables as numeric. Our numerical experiments show that our variant detects dependencies more robustly across different data distributions and preprocessing types.

artificial intelligence, estimator, machine learning, (16 more...)

2310.11132

Country: North America > United States > Virginia (0.14)

Genre: Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)

arXiv.org Machine LearningOct-9-2023

Projecting infinite time series graphs to finite marginal graphs using number theory

Gerhardus, Andreas, Wahl, Jonas, Faltenbacher, Sofia, Ninad, Urmi, Runge, Jakob

In recent years, a growing number of method and application works have adapted and applied the causal-graphical-model framework to time series data. Many of these works employ time-resolved causal graphs that extend infinitely into the past and future and whose edges are repetitive in time, thereby reflecting the assumption of stationary causal relationships. However, most results and algorithms from the causal-graphical-model framework are not designed for infinite graphs. In this work, we develop a method for projecting infinite time series graphs with repetitive edges to marginal graphical models on a finite time window. These finite marginal graphs provide the answers to $m$-separation queries with respect to the infinite graph, a task that was previously unresolved. Moreover, we argue that these marginal graphs are useful for causal discovery and causal effect estimation in time series, effectively enabling to apply results developed for finite graphs to the infinite graphs. The projection procedure relies on finding common ancestors in the to-be-projected graph and is, by itself, not new. However, the projection procedure has not yet been algorithmically implemented for time series graphs since in these infinite graphs there can be infinite sets of paths that might give rise to common ancestors. We solve the search over these possibly infinite sets of paths by an intriguing combination of path-finding techniques for finite directed graphs and solution theory for linear Diophantine equations. By providing an algorithm that carries out the projection, our paper makes an important step towards a theoretically-grounded and method-agnostic generalization of a range of causal inference methods and results to time series.

artificial intelligence, graph, machine learning, (17 more...)

2310.05526

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.64)

arXiv.org Artificial IntelligenceJun-30-2023

Selecting Robust Features for Machine Learning Applications using Multidata Causal Discovery

S., Saranya Ganesh, Beucler, Tom, Tam, Frederick Iat-Hin, Gomez, Milton S., Runge, Jakob, Gerhardus, Andreas

Robust feature selection is vital for creating reliable and interpretable Machine Learning (ML) models. When designing statistical prediction models in cases where domain knowledge is limited and underlying interactions are unknown, choosing the optimal set of features is often difficult. To mitigate this issue, we introduce a Multidata (M) causal feature selection approach that simultaneously processes an ensemble of time series datasets and produces a single set of causal drivers. This approach uses the causal discovery algorithms PC1 or PCMCI that are implemented in the Tigramite Python package. These algorithms utilize conditional independence tests to infer parts of the causal graph. Our causal feature selection approach filters out causally-spurious links before passing the remaining causal features as inputs to ML models (Multiple linear regression, Random Forest) that predict the targets. We apply our framework to the statistical intensity prediction of Western Pacific Tropical Cyclones (TC), for which it is often difficult to accurately choose drivers and their dimensionality reduction (time lags, vertical levels, and area-averaging). Using more stringent significance thresholds in the conditional independence tests helps eliminate spurious causal relationships, thus helping the ML model generalize better to unseen TC cases. M-PC1 with a reduced number of features outperforms M-PCMCI, non-causal ML, and other feature selection methods (lagged correlation, random), even slightly outperforming feature selection based on eXplainable Artificial Intelligence. The optimal causal drivers obtained from our causal feature selection help improve our understanding of underlying relationships and suggest new potential drivers of TC intensification.

artificial intelligence, machine learning, selection, (16 more...)

2304.05294

Country:

Europe > Germany (0.29)
Europe > Switzerland > Vaud (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)

arXiv.org Artificial IntelligenceJun-23-2023

Necessary and sufficient graphical conditions for optimal adjustment sets in causal graphical models with hidden variables

Runge, Jakob

The problem of selecting optimal backdoor adjustment sets to estimate causal effects in graphical models with hidden and conditioned variables is addressed. Previous work has defined optimality as achieving the smallest asymptotic estimation variance and derived an optimal set for the case without hidden variables. For the case with hidden variables there can be settings where no optimal set exists and currently only a sufficient graphical optimality criterion of limited applicability has been derived. In the present work optimality is characterized as maximizing a certain adjustment information which allows to derive a necessary and sufficient graphical criterion for the existence of an optimal adjustment set and a definition and algorithm to construct it. Further, the optimal set is valid if and only if a valid adjustment set exists and has higher (or equal) adjustment information than the Adjust-set proposed in Perkovi{\'c} et al. [Journal of Machine Learning Research, 18: 1--62, 2018] for any graph. The results translate to minimal asymptotic estimation variance for a class of estimators whose asymptotic variance follows a certain information-theoretic relation. Numerical experiments indicate that the asymptotic results also hold for relatively small sample sizes and that the optimal adjustment set or minimized variants thereof often yield better variance also beyond that estimator class. Surprisingly, among the randomly created setups more than 90\% fulfill the optimality conditions indicating that also in many real-world scenarios graphical optimality may hold. Code is available as part of the python package \url{https://github.com/jakobrunge/tigramite}.

adjustment, artificial intelligence, machine learning, (18 more...)

2102.10324

Country: Europe > Germany (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Machine LearningJun-20-2023

Conditional Independence Testing with Heteroskedastic Data and Applications to Causal Discovery

Günther, Wiebke, Ninad, Urmi, Wahl, jonas, Runge, Jakob

Conditional independence (CI) testing is frequently used in data analysis and machine learning for various scientific fields and it forms the basis of constraint-based causal discovery. Oftentimes, CI testing relies on strong, rather unrealistic assumptions. One of these assumptions is homoskedasticity, in other words, a constant conditional variance is assumed. We frame heteroskedasticity in a structural causal model framework and present an adaptation of the partial correlation CI test that works well in the presence of heteroskedastic noise, given that expert knowledge about the heteroskedastic relationships is available. Further, we provide theoretical consistency results for the proposed CI test which carry over to causal discovery under certain assumptions. Numerical causal discovery experiments demonstrate that the adapted partial correlation CI test outperforms the standard test in the presence of heteroskedasticity and is on par for the homoskedastic case. Finally, we discuss the general challenges and limits as to how expert knowledge about heteroskedasticity can be accounted for in causal discovery.

artificial intelligence, heteroskedasticity, machine learning, (14 more...)

2306.11498

Country:

Europe > Germany (0.14)
North America > United States (0.14)
Europe > Spain (0.14)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

arXiv.org Machine LearningJun-15-2023

Bootstrap aggregation and confidence measures to improve time series causal discovery

Debeire, Kevin, Runge, Jakob, Gerhardus, Andreas, Eyring, Veronika

Causal discovery methods have demonstrated the ability to identify the time series graphs representing the causal temporal dependency structure of dynamical systems. However, they do not include a measure of the confidence of the estimated links. Here, we introduce a novel bootstrap aggregation (bagging) and confidence measure method that is combined with time series causal discovery. This new method allows measuring confidence for the links of the time series graphs calculated by causal discovery methods. This is done by bootstrapping the original times series data set while preserving temporal dependencies. Next to confidence measures, aggregating the bootstrapped graphs by majority voting yields a final aggregated output graph. In this work, we combine our approach with the state-of-the-art conditional-independence-based algorithm PCMCI+. With extensive numerical experiments we empirically demonstrate that, in addition to providing confidence measures for links, Bagged-PCMCI+ improves the precision and recall of its base algorithm PCMCI+. Specifically, Bagged-PCMCI+ has a higher detection power regarding adjacencies and a higher precision in orienting contemporaneous edges while at the same time showing a lower rate of false positives. These performance improvements are especially pronounced in the more challenging settings (short time sample size, large number of variables, high autocorrelation). Our bootstrap approach can also be combined with other time series causal discovery algorithms and can be of considerable use in many real-world applications, especially when confidence measures for the links are desired.

artificial intelligence, bagged, machine learning, (15 more...)

2306.08946

Country:

North America > United States > Virginia (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Germany > Bremen > Bremen (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Artificial IntelligenceMay-21-2023

Discovering Causal Relations and Equations from Data

Camps-Valls, Gustau, Gerhardus, Andreas, Ninad, Urmi, Varando, Gherardo, Martius, Georg, Balaguer-Ballester, Emili, Vinuesa, Ricardo, Diaz, Emiliano, Zanna, Laure, Runge, Jakob

Physics is a field of science that has traditionally used the scientific method to answer questions about why natural phenomena occur and to make testable models that explain the phenomena. Discovering equations, laws and principles that are invariant, robust and causal explanations of the world has been fundamental in physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventional studies in the system under study. With the advent of big data and the use of data-driven methods, causal and equation discovery fields have grown and made progress in computer science, physics, statistics, philosophy, and many applied fields. All these domains are intertwined and can be used to discover causal relations, physical laws, and equations from observational data. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of Physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for observational causal and equation discovery, point out connections, and showcase a complete set of case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is being revolutionised with the efficient exploitation of observational data, modern machine learning algorithms and the interaction with domain knowledge. Exciting times are ahead with many challenges and opportunities to improve our understanding of complex systems.

data mining, evolutionary algorithm, machine learning, (24 more...)

2305.13341

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.14)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > New Finding (0.92)
Instructional Material > Course Syllabus & Notes (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)
(3 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
(10 more...)

arXiv.org Artificial IntelligenceApr-16-2021

EarthNet2021: A large-scale dataset and challenge for Earth surface forecasting as a guided video prediction task

Requena-Mesa, Christian, Benson, Vitus, Reichstein, Markus, Runge, Jakob, Denzler, Joachim

Satellite images are snapshots of the Earth surface. We propose to forecast them. We frame Earth surface forecasting as the task of predicting satellite imagery conditioned on future weather. EarthNet2021 is a large dataset suitable for training deep neural networks on the task. It contains Sentinel 2 satellite imagery at 20m resolution, matching topography and mesoscale (1.28km) meteorological variables packaged into 32000 samples. Additionally we frame EarthNet2021 as a challenge allowing for model intercomparison. Resulting forecasts will greatly improve (>x50) over the spatial resolution found in numerical models. This allows localized impacts from extreme weather to be predicted, thus supporting downstream applications such as crop yield prediction, forest health assessments or biodiversity monitoring. Find data, code, and how to participate at www.earthnet.tech

deep learning, neural network, prediction, (17 more...)

2104.10066

Country:

Europe > Germany (0.14)
Oceania > Australia (0.14)

Genre: Research Report (0.64)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.58)
Food & Agriculture > Agriculture (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceMar-10-2021

Towards Learning an Unbiased Classifier from Biased Data via Conditional Adversarial Debiasing

Reimers, Christian, Bodesheim, Paul, Runge, Jakob, Denzler, Joachim

Bias in classifiers is a severe issue of modern deep learning methods, especially for their application in safety- and security-critical areas. Often, the bias of a classifier is a direct consequence of a bias in the training dataset, frequently caused by the co-occurrence of relevant features and irrelevant ones. To mitigate this issue, we require learning algorithms that prevent the propagation of bias from the dataset into the classifier. We present a novel adversarial debiasing method, which addresses a feature that is spuriously connected to the labels of training images but statistically independent of the labels for test images. Thus, the automatic identification of relevant features during training is perturbed by irrelevant features. This is the case in a wide range of bias-related problems for many computer vision tasks, such as automatic skin cancer detection or driver assistance. We argue by a mathematical proof that our approach is superior to existing techniques for the abovementioned bias. Our experiments show that our approach performs better than state-of-the-art techniques on a well-known benchmark dataset with real-world images of cats and dogs.

criterion, deep learning, neural network, (18 more...)

2103.06179

Country: Europe > Germany (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.68)
Health & Medicine > Therapeutic Area > Dermatology (0.49)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)