AITopics

2210.06201

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Collas, Antoine, Breloy, Arnaud, Ren, Chengfang, Ginolhac, Guillaume, Ovarlez, Jean-Philippe

Riemannian optimization for non-centered mixture of scaled Gaussian distributions

arXiv.org Artificial IntelligenceJun-25-2023

This paper studies the statistical model of the non-centered mixture of scaled Gaussian distributions (NC-MSG). Using the Fisher-Rao information geometry associated to this distribution, we derive a Riemannian gradient descent algorithm. This algorithm is leveraged for two minimization problems. The first one is the minimization of a regularized negative log-likelihood (NLL). The latter makes the trade-off between a white Gaussian distribution and the NC-MSG. Conditions on the regularization are given so that the existence of a minimum to this problem is guaranteed without assumptions on the samples. Then, the Kullback-Leibler (KL) divergence between two NC-MSG is derived. This divergence enables us to define a minimization problem to compute centers of mass of several NC-MSGs. The proposed Riemannian gradient descent algorithm is leveraged to solve this second minimization problem. Numerical experiments show the good performance and the speed of the Riemannian gradient descent on the two problems. Finally, a Nearest centroid classifier is implemented leveraging the KL divergence and its associated center of mass. Applied on the large scale dataset Breizhcrops, this classifier shows good accuracies as well as robustness to rigid transformations of the test set.

artificial intelligence, bayesian inference, machine learning, (20 more...)

doi: 10.1109/TSP.2023.3290354

2209.03315

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceJun-25-2023

TCE: A Test-Based Approach to Measuring Calibration Error

Matsubara, Takuo, Tax, Niek, Mudd, Richard, Guy, Ido

While a number of metrics--such as log-likelihood, userspecified This paper proposes a new metric to measure the scoring functions, and the area under the receiver calibration error of probabilistic binary classifiers, operating characteristic (ROC) curve--are used to assess the called test-based calibration error (TCE). TCE incorporates quality of probabilistic classifiers, it is usually hard or even a novel loss function based on a statistical impossible to gauge whether predictions are well-calibrated test to examine the extent to which model predictions from the values of these metrics. For assessment of calibration, differ from probabilities estimated from it is typically necessary to use a metric that measures data. It offers (i) a clear interpretation, (ii) a consistent calibration error, that is, a deviation between model predictions scale that is unaffected by class imbalance, and and probabilities of target occurrences estimated from (iii) an enhanced visual representation with repect data. The importance of assessing calibration error has been to the standard reliability diagram. In addition, we long emphasised in machine learning [Nixon et al., 2019, introduce an optimality criterion for the binning Minderer et al., 2021] and in probabilistic forecasting more procedure of calibration error metrics based on a broadly [Dawid, 1982, Degroot and Fienberg, 1983].

artificial intelligence, machine learning, tce, (13 more...)

2306.14343

Country:

North America > United States > California > Orange County > Irvine (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

arXiv.org Artificial IntelligenceJun-25-2023

Quantifying Aleatoric and Epistemic Uncertainty in Machine Learning: Are Conditional Entropy and Mutual Information Appropriate Measures?

Wimmer, Lisa, Sale, Yusuf, Hofman, Paul, Bischl, Bern, Hüllermeier, Eyke

The quantification of aleatoric and epistemic uncertainty in terms of conditional entropy and mutual information, respectively, has recently become quite common in machine learning. While the properties of these measures, which are rooted in information theory, seem appealing at first glance, we identify various incoherencies that call their appropriateness into question. In addition to the measures themselves, we critically discuss the idea of an additive decomposition of total uncertainty into its aleatoric and epistemic constituents. Experiments across different computer vision tasks support our theoretical findings and raise concerns about current practice in uncertainty quantification.

artificial intelligence, bayesian inference, machine learning, (20 more...)

2209.03302

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Chobtham, Kiattikun, Constantinou, Anthony C.

Tuning structure learning algorithms with out-of-sample and resampling strategies

One of the challenges practitioners face when applying structure learning algorithms to their data involves determining a set of hyperparameters; otherwise, a set of hyperparameter defaults is assumed. The optimal hyperparameter configuration often depends on multiple factors, including the size and density of the usually unknown underlying true graph, the sample size of the input data, and the structure learning algorithm. We propose a novel hyperparameter tuning method, called the Out-of-sample Tuning for Structure Learning (OTSL), that employs out-of-sample and resampling strategies to estimate the optimal hyperparameter configuration for structure learning, given the input data set and structure learning algorithm. Synthetic experiments show that employing OTSL as a means to tune the hyperparameters of hybrid and score-based structure learning algorithms leads to improvements in graphical accuracy compared to the state-of-the-art. We also illustrate the applicability of this approach to real datasets from different disciplines.

algorithm, dataset, hyperparameter, (14 more...)

2306.13932

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > Thailand (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Ahn, Surin, Grana, Justin, Tamene, Yafet, Holsheimer, Kristian

Uncertainty Quantification for Local Model Explanations Without Model Access

We present a model-agnostic algorithm for generating post-hoc explanations and uncertainty intervals for a machine learning model when only a static sample of inputs and outputs from the model is available, rather than direct access to the model itself. This situation may arise when model evaluations are expensive; when privacy, security and bandwidth constraints are imposed; or when there is a need for real-time, on-device explanations. Our algorithm uses a bootstrapping approach to quantify the uncertainty that inevitably arises when generating explanations from a finite sample of model queries. Through a simulation study, we show that the uncertainty intervals generated by our algorithm exhibit a favorable trade-off between interval width and coverage probability compared to the naive confidence intervals from classical regression analysis as well as current Bayesian approaches for quantifying explanation uncertainty. We further demonstrate the capabilities of our method by applying it to black-box models, including a deep neural network, trained on three real-world datasets.

artificial intelligence, explanation, machine learning, (17 more...)

2301.05761

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

FED-CD: Federated Causal Discovery from Interventional and Observational Data

Abyaneh, Amin, Scherrer, Nino, Schwab, Patrick, Bauer, Stefan, Schölkopf, Bernhard, Mehrjou, Arash

Existing causal discovery methods typically require the data to be available in a centralized location. However, many practical domains, such as healthcare, limit access to the data gathered by local entities, primarily for privacy and regulatory constraints. To address this, we propose FED-CD, a federated framework for inferring causal structures from distributed datasets containing observational and interventional data. By exchanging updates instead of data samples, FED-CD ensures privacy while enabling decentralized discovery of the underlying directed acyclic graph (DAG). We accommodate scenarios with shared or disjoint intervened covariates, and mitigate the adverse effects of interventional data heterogeneity. We provide empirical evidence for the performance and scalability of FED-CD for decentralized causal discovery using synthetic and real-world DAGs.

artificial intelligence, fed-cd, machine learning, (12 more...)

2211.03846

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > Canada > Quebec > Montreal (0.14)
(4 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Shang, Guokan, Tixier, Antoine Jean-Pierre, Vazirgiannis, Michalis, Lorré, Jean-Pierre

Speaker-change Aware CRF for Dialogue Act Classification

Recent work in Dialogue Act (DA) classification approaches the task as a sequence labeling problem, using neural network models coupled with a Conditional Random Field (CRF) as the last layer. CRF models the conditional probability of the target DA label sequence given the input utterance sequence. However, the task involves another important input sequence, that of speakers, which is ignored by previous work. To address this limitation, this paper proposes a simple modification of the CRF layer that takes speaker-change into account. Experiments on the SwDA corpus show that our modified CRF layer outperforms the original one, with very wide margins for some DA labels. Further, visualizations demonstrate that our CRF layer can learn meaningful, sophisticated transition patterns between DA label pairs conditioned on speaker-change in an end-to-end way. Code is publicly available.

artificial intelligence, machine learning, sequence, (18 more...)

2004.02913

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
(20 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)

arXiv.org Artificial IntelligenceJun-23-2023

Tensor Dirichlet Process Multinomial Mixture Model for Passenger Trajectory Clustering

Li, Ziyue, Yan, Hao, Zhang, Chen, Wang, Andi, Ketter, Wolfgang, Sun, Lijun, Tsung, Fugee

Passenger clustering based on travel records is essential for transportation operators. However, existing methods cannot easily cluster the passengers due to the hierarchical structure of the passenger trip information, namely: each passenger has multiple trips, and each trip contains multi-dimensional multi-mode information. Furthermore, existing approaches rely on an accurate specification of the clustering number to start, which is difficult when millions of commuters are using the transport systems on a daily basis. In this paper, we propose a novel Tensor Dirichlet Process Multinomial Mixture model (Tensor-DPMM), which is designed to preserve the multi-mode and hierarchical structure of the multi-dimensional trip information via tensor, and cluster them in a unified one-step manner. The model also has the ability to determine the number of clusters automatically by using the Dirichlet Process to decide the probabilities for a passenger to be either assigned in an existing cluster or to create a new cluster: This allows our model to grow the clusters as needed in a dynamic manner. Finally, existing methods do not consider spatial semantic graphs such as geographical proximity and functional similarity between the locations, which may cause inaccurate clustering. To this end, we further propose a variant of our model, namely the Tensor-DPMM with Graph. For the algorithm, we propose a tensor Collapsed Gibbs Sampling method, with an innovative step of "disband and relocating", which disbands clusters with too small amount of members and relocates them to the remaining clustering. This avoids uncontrollable growing amounts of clusters. A case study based on Hong Kong metro passenger data is conducted to demonstrate the automatic process of learning the number of clusters, and the learned clusters are better in within-cluster compactness and cross-cluster separateness.

data mining, machine learning, natural language, (22 more...)

2306.13794

Country:

North America > Canada > Quebec > Montreal (0.14)
Africa > Senegal > Kolda Region > Kolda (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry: Transportation > Passenger (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(3 more...)

Loo, Chu Kiong, Liew, Wei Shiung, Wermter, Stefan

Explainable Lifelong Stream Learning Based on "Glocal" Pairwise Fusion

arXiv.org Artificial IntelligenceJun-23-2023

Real-time on-device continual learning applications are used on mobile phones, consumer robots, and smart appliances. Such devices have limited processing and memory storage capabilities, whereas continual learning acquires data over a long period of time. By necessity, lifelong learning algorithms have to be able to operate under such constraints while delivering good performance. This study presents the Explainable Lifelong Learning (ExLL) model, which incorporates several important traits: 1) learning to learn, in a single pass, from streaming data with scarce examples and resources; 2) a self-organizing prototype-based architecture that expands as needed and clusters streaming data into separable groups by similarity and preserves data against catastrophic forgetting; 3) an interpretable architecture to convert the clusters into explainable IF-THEN rules as well as to justify model predictions in terms of what is similar and dissimilar to the inference; and 4) inferences at the global and local level using a pairwise decision fusion process to enhance the accuracy of the inference, hence ``Glocal Pairwise Fusion.'' We compare ExLL against contemporary online learning algorithms for image recognition, using OpenLoris, F-SIOL-310, and Places datasets to evaluate several continual learning scenarios for video streams, low-sample learning, ability to scale, and imbalanced data streams. The algorithms are evaluated for their performance in accuracy, number of parameters, and experiment runtime requirements. ExLL outperforms all algorithms for accuracy in the majority of the tested scenarios.

artificial intelligence, expert system, machine learning, (17 more...)

2306.1341

Country:

Asia > Malaysia > Kuala Lumpur > Kuala Lumpur (0.04)
Europe > Germany > Hamburg (0.04)
Asia > British Indian Ocean Territory > Diego Garcia (0.04)

Genre:

Research Report (1.00)
Instructional Material (0.87)

Industry:

Education > Educational Setting > Continuing Education (0.55)
Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
(3 more...)