AITopics

2107.05847

Country:

Europe > Germany (0.28)
Europe > France (0.14)
Oceania > Australia (0.14)
(4 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.92)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy > Oil & Gas (0.92)
Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(4 more...)

Derakhshani, Mohammad Mahdi, Zhen, Xiantong, Shao, Ling, Snoek, Cees G. M.

Kernel Continual Learning

arXiv.org Artificial IntelligenceJul-14-2021

This paper introduces kernel continual learning, a simple but effective variant of continual learning that leverages the non-parametric nature of kernel methods to tackle catastrophic forgetting. We deploy an episodic memory unit that stores a subset of samples for each task to learn task-specific classifiers based on kernel ridge regression. This does not require memory replay and systematically avoids task interference in the classifiers. We further introduce variational random features to learn a data-driven kernel for each task. To do so, we formulate kernel continual learning as a variational inference problem, where a random Fourier basis is incorporated as the latent variable. The variational posterior distribution over the random Fourier basis is inferred from the coreset of each task. In this way, we are able to generate more informative kernels specific to each task, and, more importantly, the coreset size can be reduced to achieve more compact memory, resulting in more efficient continual learning based on episodic memory. Extensive evaluation on four benchmarks demonstrates the effectiveness and promise of kernels for continual learning.

continual learning, kernel, learning, (12 more...)

2107.05757

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > UAE (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data

Yuan, Han, Xie, Feng, Ong, Marcus Eng Hock, Ning, Yilin, Chee, Marcel Lucas, Saffari, Seyed Ehsan, Abdullah, Hairil Rizal, Goldstein, Benjamin Alan, Chakraborty, Bibhas, Liu, Nan

Background: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among a wide variety of decision-making models for determining the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. Its current framework, however, still leaves room for improvement when addressing unbalanced data of rare events. Methods: Using machine intelligence approaches, we developed AutoScore-Imbalance, which comprises three components: training dataset optimization, sample weight optimization, and adjusted AutoScore. All scoring models were evaluated on the basis of their area under the curve (AUC) in the receiver operating characteristic analysis and balanced accuracy (i.e., mean value of sensitivity and specificity). By utilizing a publicly accessible dataset from Beth Israel Deaconess Medical Center, we assessed the proposed model and baseline approaches in the prediction of inpatient mortality. Results: AutoScore-Imbalance outperformed baselines in terms of AUC and balanced accuracy. The nine-variable AutoScore-Imbalance sub-model achieved the highest AUC of 0.786 (0.732-0.839) while the eleven-variable original AutoScore obtained an AUC of 0.723 (0.663-0.783), and the logistic regression with 21 variables obtained an AUC of 0.743 (0.685-0.800). The AutoScore-Imbalance sub-model (using down-sampling algorithm) yielded an AUC of 0. 0.771 (0.718-0.823) with only five variables, demonstrating a good balance between performance and variable sparsity. Conclusions: The AutoScore-Imbalance tool has the potential to be applied to highly unbalanced datasets to gain further insight into rare medical events and to facilitate real-world clinical decision-making.

autoscore, autoscore-imbalance, dataset, (15 more...)

doi: 10.1016/j.jbi.2022.104072

2107.06039

Country:

Asia > Middle East > Israel (0.24)
Asia > Singapore > Central Region > Singapore (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Adversarial Motorial Prototype Framework for Open Set Recognition

Xia, Ziheng, Wang, Penghui, Dong, Ganggang, Liu, Hongwei

Open set recognition is designed to identify known classes and to reject unknown classes simultaneously. Specifically, identifying known classes and rejecting unknown classes correspond to reducing the empirical risk and the open space risk, respectively. First, the motorial prototype framework (MPF) is proposed, which classifies known classes according to the prototype classification idea. Moreover, a motorial margin constraint term is added into the loss function of the MPF, which can further improve the clustering compactness of known classes in the feature space to reduce both risks. Second, this paper proposes the adversarial motorial prototype framework (AMPF) based on the MPF. On the one hand, this model can generate adversarial samples and add these samples into the training phase; on the other hand, it can further improve the differential mapping ability of the model to known and unknown classes with the adversarial motion of the margin constraint radius. Finally, this paper proposes an upgraded version of the AMPF, AMPF++, which adds much more generated unknown samples into the training phase. In this paper, a large number of experiments prove that the performance of the proposed models is superior to that of other current works.

optimization, recognition, unknown class, (11 more...)

2108.04225

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Hunan Province > Changsha (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

arXiv.org Machine LearningJul-13-2021

Can Less be More? When Increasing-to-Balancing Label Noise Rates Considered Beneficial

Liu, Yang, Wang, Jialu

In this paper, we answer the question when inserting label noise (less informative labels) can instead return us more accurate and fair models. We are primarily inspired by two observations that 1) increasing a certain class of instances' label noise to balance the noise rates (increasing-to-balancing) results in an easier learning problem; 2) Increasing-to-balancing improves fairness guarantees against label bias. In this paper, we will first quantify the trade-offs introduced by increasing a certain group of instances' label noise rate w.r.t. the learning difficulties and performance guarantees. We analytically demonstrate when such an increase proves to be beneficial, in terms of either improved generalization errors or the fairness guarantees. Then we present a method to leverage our idea of inserting label noise for the task of learning with noisy labels, either without or with a fairness constraint. The primary technical challenge we face is due to the fact that we would not know which data instances are suffering from higher noise, and we would not have the ground truth labels to verify any possible hypothesis. We propose a detection method that informs us which group of labels might suffer from higher noise, without using ground truth information. We formally establish the effectiveness of the proposed solution and demonstrate it with extensive experiments.

label noise, noise rate, noisy label, (13 more...)

2107.05913

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

arXiv.org Machine LearningJul-13-2021

Induced Domain Adaptation

Liu, Yang, Chen, Yatong, Wei, Jiaheng

We formulate the problem of induced domain adaptation (IDA) when the underlying distribution/domain shift is introduced by the model being deployed. Our formulation is motivated by applications where the deployed machine learning models interact with human agents, and will ultimately face responsive and interactive data distributions. We formalize the discussions of the transferability of learning in our IDA setting by studying how the model trained on the available source distribution (data) would translate to the performance on the induced domain. We provide both upper bounds for the performance gap due to the induced domain shift, as well as lower bound for the trade-offs a classifier has to suffer on either the source training distribution or the induced target distribution. We provide further instantiated analysis for two popular domain adaptation settings with covariate shift and label shift. We highlight some key properties of IDA, as well as computational and learning challenges.

adaptation, domain adaptation, err, (15 more...)

2107.05911

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
North America > United States > New York > New York County > New York City (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Zhang, Jingyi, Sun, Xiaoxiao

Oversampling Divide-and-conquer for Response-skewed Kernel Ridge Regression

arXiv.org Machine LearningJul-13-2021

The divide-and-conquer method has been widely used for estimating large-scale kernel ridge regression estimates. Unfortunately, when the response variable is highly skewed, the divide-and-conquer kernel ridge regression (dacKRR) may overlook the underrepresented region and result in unacceptable results. We develop a novel response-adaptive partition strategy to overcome the limitation. In particular, we propose to allocate the replicates of some carefully identified informative observations to multiple nodes (local processors). The idea is analogous to the popular oversampling technique. Although such a technique has been widely used for addressing discrete label skewness, extending it to the dacKRR setting is nontrivial. We provide both theoretical and practical guidance on how to effectively over-sample the observations under the dacKRR setting. Furthermore, we show the proposed estimate has a smaller asymptotic mean squared error (AMSE) than that of the classical dacKRR estimate under mild conditions. Our theoretical findings are supported by both simulated and real-data analyses.

divide-and-conquer approach, estimator, regression, (14 more...)

2107.05834

Country:

North America > United States > Arizona (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.34)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Yang, Lixuan, Rossi, Dario

Thinkback: Task-SpecificOut-of-Distribution Detection

The increased success of Deep Learning (DL) has recently sparked large-scale deployment of DL models in many diverse industry segments. Yet, a crucial weakness of supervised model is the inherent difficulty in handling out-of-distribution samples, i.e., samples belonging to classes that were not presented to the model at training time. We propose in this paper a novel way to formulate the out-of-distribution detection problem, tailored for DL models. Our method does not require fine tuning process on training data, yet is significantly more accurate than the state of the art for out-of-distribution detection.

detection, energy score, thinkback, (13 more...)

2107.06668

Country: Europe > France (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Bhatt, Umang, Chien, Isabel, Zafar, Muhammad Bilal, Weller, Adrian

DIVINE: Diverse Influential Training Points for Data Visualization and Model Refinement

As the complexity of machine learning (ML) models increases, resulting in a lack of prediction explainability, several methods have been developed to explain a model's behavior in terms of the training data points that most influence the model. However, these methods tend to mark outliers as highly influential points, limiting the insights that practitioners can draw from points that are not representative of the training data. In this work, we take a step towards finding influential training points that also represent the training data well. We first review methods for assigning importance scores to training points. Given importance scores, we propose a method to select a set of DIVerse INfluEntial (DIVINE) training points as a useful explanation of model behavior. As practitioners might not only be interested in finding data points influential with respect to model accuracy, but also with respect to other important metrics, we show how to evaluate training data points on the basis of group fairness. Our method can identify unfairness-inducing training points, which can be removed to improve fairness outcomes. Our quantitative experiments and user studies show that visualizing DIVINE points helps practitioners understand and explain model behavior better than earlier approaches.

divine point, importance score, training point, (15 more...)

2107.05978

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Oregon > Multnomah County > Portland (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Education (0.93)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

#artificialintelligenceJul-12-2021, 13:00:50 GMT

Comprehensive Guide on ROC Curve

The ROC (Receiver Operating Characteristic) curve is a way to visualise the performance of a binary classifier. Here, you can interpret 0 as negative and 1 as positive. In order to classify whether a data item is negative or positive, we need to first decide on the classification threshold. For instance, suppose we have trained a model like logistic regression, and this model predicted a $0.4$ probability that a particular observation is negative, and a $0.6$ probability that the observation is positive. If we set the classification threshold to be $0.5$,

classification threshold, classifier, roc curve, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)