AITopics | discriminant power

A Max-relevance-min-divergence Criterion for Data Discretization with Applications on Naive Bayes

Wang, Shihe, Ren, Jianfeng, Bai, Ruibin, Yao, Yuan, Jiang, Xudong

arXiv.org Artificial Intelligence, Apr-4-2023

In many classification models, data is discretized to better estimate its distribution. Existing discretization methods often aim to maximize the discriminant power of the discretized data, while overlooking the fact that the primary goal of data discretization in classification is to improve generalization performance. As a result, the data tend to be over-split into many small bins, since data without discretization retain the maximal discriminant information. Thus, we propose a Max-Dependency-Min-Divergence (MDmD) criterion that maximizes both the discriminant information and the generalization ability of the discretized data. More specifically, the Max-Dependency criterion maximizes the statistical dependency between the discretized data and the classification variable, while the Min-Divergence criterion explicitly minimizes the JS-divergence between the training data and the validation data for a given discretization scheme. The proposed MDmD criterion is technically appealing, but it is difficult to reliably estimate the high-order joint distributions of attributes and the classification variable. We hence further propose a more practical solution, the Max-Relevance-Min-Divergence (MRmD) discretization scheme, where each attribute is discretized separately by simultaneously maximizing the discriminant information and the generalization ability of the discretized data. The proposed MRmD is compared with state-of-the-art discretization algorithms under the naive Bayes classification framework on 45 machine-learning benchmark datasets. It significantly outperforms all the compared methods on most of the datasets.
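To make the two ingredients named in this abstract concrete, the following is a minimal sketch of scoring one candidate discretization of a single attribute: relevance is measured as the empirical mutual information between bin index and class label, and divergence as the JS-divergence between the training and validation bin histograms. The function names, the weight `lam`, and the choice to combine the two terms by simple subtraction are assumptions made for illustration; the abstract does not give the paper's exact objective or search procedure.

```python
import math
from bisect import bisect_right
from collections import Counter


def discretize(values, cut_points):
    """Map each value to a bin index given sorted cut points."""
    return [bisect_right(cut_points, v) for v in values]


def mutual_information(bins, labels):
    """Empirical mutual information I(bin; label) in nats."""
    n = len(bins)
    joint = Counter(zip(bins, labels))
    bin_counts = Counter(bins)
    label_counts = Counter(labels)
    mi = 0.0
    for (b, y), c in joint.items():
        mi += (c / n) * math.log(c * n / (bin_counts[b] * label_counts[y]))
    return mi


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (nats)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        # Terms with zero probability in u contribute nothing.
        return sum(a * math.log(a / b) for a, b in zip(u, v) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def histogram(bins, n_bins):
    """Relative frequency of each bin index in 0..n_bins-1."""
    counts = Counter(bins)
    return [counts.get(i, 0) / len(bins) for i in range(n_bins)]


def score(train_x, train_y, valid_x, cut_points, lam=1.0):
    """Hypothetical combined score: relevance minus lam times divergence."""
    n_bins = len(cut_points) + 1
    train_bins = discretize(train_x, cut_points)
    valid_bins = discretize(valid_x, cut_points)
    relevance = mutual_information(train_bins, train_y)
    divergence = js_divergence(histogram(train_bins, n_bins),
                               histogram(valid_bins, n_bins))
    return relevance - lam * divergence
```

Under these assumptions, adding more cut points can only raise the relevance term on the training data, while the divergence term tends to grow once bins become so small that the training and validation histograms disagree, which is the over-splitting effect the abstract describes.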

artificial intelligence, discretization, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2209.10095

Country:

North America > United States > Wisconsin (0.04)
Asia > China > Zhejiang Province > Ningbo (0.04)
Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.63)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Applications of Naive Bayes part 1 (Artificial Intelligence)

#artificialintelligence, Nov-9-2022, 20:00:37 GMT

Abstract: In many classification models, data is discretized to better estimate its distribution. Existing discretization methods often aim to maximize the discriminant power of the discretized data, while overlooking the fact that the primary goal of data discretization in classification is to improve generalization performance. As a result, the data tend to be over-split into many small bins, since data without discretization retain the maximal discriminant information. Thus, we propose a Max-Dependency-Min-Divergence (MDmD) criterion that maximizes both the discriminant information and the generalization ability of the discretized data. More specifically, the Max-Dependency criterion maximizes the statistical dependency between the discretized data and the classification variable, while the Min-Divergence criterion explicitly minimizes the JS-divergence between the training data and the validation data for a given discretization scheme.

application, artificial intelligence, discriminant information, (11 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.78)

Boosting the Discriminant Power of Naive Bayes

Wang, Shihe, Ren, Jianfeng, Lian, Xiaoyu, Bai, Ruibin, Jiang, Xudong

arXiv.org Artificial Intelligence, Sep-20-2022

Naive Bayes has been widely used in many applications because of its simplicity and its ability to handle both numerical and categorical data. However, its lack of modeling of correlations between features limits its performance. In addition, noise and outliers in real-world datasets also greatly degrade classification performance. In this paper, we propose a feature augmentation method employing a stack auto-encoder to reduce the noise in the data and boost the discriminant power of naive Bayes. The proposed stack auto-encoder consists of two auto-encoders for different purposes. The first encoder shrinks the initial features to derive a compact feature representation in order to remove noise and redundant information. The second encoder boosts the discriminant power of the features by expanding them into a higher-dimensional space, so that samples of different classes can be better separated there. By integrating the proposed feature augmentation method with the regularized naive Bayes, the discriminant power of the model is greatly enhanced. The proposed method is evaluated on a set of machine-learning benchmark datasets. The experimental results show that the proposed method significantly and consistently outperforms the state-of-the-art naive Bayes classifiers.

artificial intelligence, machine learning, naive Bayes, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICPR56361.2022.9956358

2209.09532

Country:

Asia > China > Zhejiang Province > Ningbo (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (0.95)
Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Few-shot Learning via Dependency Maximization and Instance Discriminant Analysis

Hou, Zejiang, Kung, Sun-Yuan

arXiv.org Artificial Intelligence, Sep-6-2021

We study the few-shot learning (FSL) problem, where a model learns to recognize new objects with extremely few labeled training data per category. Most previous FSL approaches resort to the meta-learning paradigm, where the model accumulates inductive bias through learning many training tasks so as to solve a new unseen few-shot task. In contrast, we propose a simple approach to exploit unlabeled data accompanying the few-shot task for improving few-shot performance. Firstly, we propose a Dependency Maximization method based on the Hilbert-Schmidt norm of the cross-covariance operator, which maximizes the statistical dependency between the embedded features of those unlabeled data and their label predictions, together with the supervised loss over the support set. We then use the obtained model to infer the pseudo-labels for those unlabeled data. Furthermore, we propose an Instance Discriminant Analysis to evaluate the credibility of each pseudo-labeled example and select the most faithful ones into an augmented support set to retrain the model as in the first step. We iterate the above process until the pseudo-labels for the unlabeled data become stable. Following the standard transductive and semi-supervised FSL setting, our experiments show that the proposed method outperforms previous state-of-the-art methods on four widely used benchmarks, including mini-ImageNet, tiered-ImageNet, CUB, and CIFAR-FS.
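The dependency measure mentioned in this abstract, the Hilbert-Schmidt norm of the cross-covariance operator, is commonly estimated from samples with the biased empirical HSIC estimator, tr(K H L H) / (n - 1)^2, where K and L are kernel matrices and H is the centering matrix. The sketch below computes that estimator in plain Python. The Gaussian kernel on features, the linear kernel on predictions, the bandwidth `sigma`, and all function names are assumptions for illustration; the abstract does not state which kernels or estimator the authors use.

```python
import math


def gaussian_kernel(xs, sigma=1.0):
    """Gaussian (RBF) kernel matrix for a list of feature vectors."""
    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [[math.exp(-sqdist(a, b) / (2 * sigma ** 2)) for b in xs]
            for a in xs]


def linear_kernel(ys):
    """Linear kernel matrix, e.g. for one-hot or soft label predictions."""
    return [[sum(u * v for u, v in zip(a, b)) for b in ys] for a in ys]


def center(K):
    """Return H K H, where H = I - (1/n) 11^T is the centering matrix."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + grand for j in range(n)]
            for i in range(n)]


def hsic(K, L):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2."""
    n = len(K)
    Kc = center(K)
    # tr(H K H L) equals the elementwise sum of Kc * L for symmetric L.
    total = sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n))
    return total / (n - 1) ** 2
```

As a usage example, features `[[0.0], [0.1], [5.0], [5.1]]` paired with predictions that follow the two clusters give a clearly larger HSIC value than the same features paired with predictions that alternate across clusters, and constant predictions give a value of zero up to rounding.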

classifier, few-shot learning, prediction, (15 more...)

arXiv.org Artificial Intelligence

2109.0282

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Introduction to logistic regression

Chung, Moo K.

arXiv.org Machine Learning, Oct-28-2020

July 29, 2020. For random field theory based multiple comparison corrections in brain imaging, it is often necessary to compute the distribution of the supremum of a random field. Unfortunately, computing this distribution is not easy and requires satisfying many distributional assumptions that may not hold in real data. Thus, there is a need for a different framework that does not rely on the traditional statistical hypothesis testing paradigm, which requires computing p-values. With this as motivation, we can use a different approach, logistic regression, which does not require computing p-values and is still able to localize the regions of brain network differences (Flury 1997, Hastie et al. 2003, Chung et al. 2008). Unlike other discriminant and classification techniques that try to classify preselected feature vectors, the method here does not require any preselected feature vectors and performs the classification at each edge level (Higdon et al. 2004, Shen et al. 2004, Thomaz et al. 2006).
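For readers unfamiliar with the model named in this entry, here is a minimal sketch of binary logistic regression fitted by batch gradient descent on the mean log-loss. It is a generic illustration only: the learning rate, epoch count, and function names are assumptions, and nothing here reproduces the edge-level brain network analysis the abstract describes.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Estimated probability that sample x belongs to class 1."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weights w and intercept b by batch gradient descent."""
    n, d = len(xs), len(xs[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(xs, ys):
            # Gradient of the log-loss with respect to the linear score.
            err = predict_proba(w, b, x) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b
```

Fitting this on a single feature with values 0, 1, 2, 3 and labels 0, 0, 1, 1 yields a positive weight and a decision boundary between 1 and 2, so `predict_proba` is below 0.5 at 0 and above 0.5 at 3.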

artificial intelligence, discriminant power, machine learning, (17 more...)

arXiv.org Machine Learning

2008.13567

Country:

North America > United States > Wisconsin > Dane County > Madison (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.74)

A Feature-map Discriminant Perspective for Pruning Deep Neural Networks

Hou, Zejiang, Kung, Sun-Yuan

arXiv.org Machine Learning, May-28-2020

Network pruning has become the de facto tool to accelerate deep neural networks for mobile and edge applications. Recently, feature-map discriminant based channel pruning has shown promising results, as it aligns well with the CNN objective of differentiating multiple classes and offers better interpretability of the pruning decision. However, existing discriminant-based methods are challenged by computational inefficiency, as there is a lack of theoretical guidance on quantifying the feature-map discriminant power. In this paper, we present a new mathematical formulation to accurately and efficiently quantify the feature-map discriminativeness, which gives rise to a novel criterion, Discriminant Information (DI). We analyze the theoretical properties of DI, specifically the non-decreasing property that makes DI a valid selection criterion. DI-based pruning removes channels with minimum influence on the DI value, as they contain little information regarding the discriminant power. The versatility of the DI criterion also enables an intra-layer mixed precision quantization to further compress the network. Moreover, we propose a DI-based greedy pruning algorithm and a structure distillation technique to automatically decide the pruned structure that satisfies a given resource budget, which is a common requirement in practice. Extensive experiments demonstrate the effectiveness of our method: our pruned ResNet50 on ImageNet achieves 44% FLOPs reduction without any Top-1 accuracy loss compared to the unpruned model.
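The abstract does not give the DI formula, so the sketch below does not implement it. As a stand-in that only conveys the general idea of discriminant-based channel selection, it ranks channels by a classical per-channel Fisher ratio (between-class scatter over within-class scatter) and returns the lowest-scoring channels as pruning candidates. The function names, the `eps` stabilizer, and the assumption that each channel is summarized by one scalar activation per sample are all hypothetical.

```python
def fisher_ratio(values, labels, eps=1e-12):
    """Between-class over within-class scatter for one channel."""
    overall = sum(values) / len(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    between = 0.0
    within = 0.0
    for group in groups.values():
        mu = sum(group) / len(group)
        between += len(group) * (mu - overall) ** 2
        within += sum((v - mu) ** 2 for v in group)
    return between / (within + eps)


def channels_to_prune(features, labels, n_prune):
    """Indices of the n_prune channels with the lowest Fisher ratio.

    features: list of samples, each a list of per-channel activations.
    """
    n_channels = len(features[0])
    scores = [fisher_ratio([sample[c] for sample in features], labels)
              for c in range(n_channels)]
    order = sorted(range(n_channels), key=lambda c: scores[c])
    return sorted(order[:n_prune])
```

Unlike the criterion described in the abstract, this stand-in scores each channel independently and therefore ignores redundancy between channels; it is meant only to show how a discriminant score can drive the choice of which channels to remove.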

artificial intelligence, machine learning, pruning, (16 more...)

arXiv.org Machine Learning

2005.13796

Country: North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
