AITopics | Accuracy

Accuracy

Weakly Supervised-Based Oversampling for High Imbalance and High Dimensionality Data Classification

arXiv.org Machine Learning, Oct-6-2020

With the abundance of industrial datasets, imbalanced classification has become a common problem in several application domains. Oversampling is an effective method to solve imbalanced classification. One of the main challenges of the existing oversampling methods is to accurately label the new synthetic samples. Inaccurate labels of the synthetic samples would distort the distribution of the dataset and possibly worsen the classification performance. This paper introduces the idea of weakly supervised learning to handle the inaccurate labeling of synthetic samples caused by traditional oversampling methods. Graph semi-supervised SMOTE is developed to improve the credibility of the synthetic samples' labels. In addition, we propose cost-sensitive neighborhood components analysis for high-dimensional datasets and a bootstrap-based ensemble framework for highly imbalanced datasets. The proposed method achieves good classification performance on 8 synthetic datasets and 3 real-world datasets, especially for high-imbalance and high-dimensionality problems. Its average performance and robustness are better than those of the benchmark methods.
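
For orientation, the interpolation step of classic SMOTE can be sketched as follows. This is the traditional baseline that the abstract criticizes, not the paper's graph semi-supervised variant; the function name smote_sample and its parameters are hypothetical. Note that the synthetic point is implicitly assumed to belong to the minority class, which is exactly the labeling step the paper calls into question.

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point by interpolating between a randomly
    chosen minority point and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    idx = rng.randrange(len(minority))
    base = minority[idx]
    # All other minority points, sorted by squared Euclidean distance to base.
    others = [p for i, p in enumerate(minority) if i != idx]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    t = rng.random()  # interpolation weight in [0, 1)
    # The new point lies on the segment between base and neighbour.
    return tuple(a + t * (b - a) for a, b in zip(base, neighbour))

# Hypothetical minority-class points in two dimensions.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synthetic = smote_sample(minority_points, k=2, rng=random.Random(1))
```

Because the new point is placed between two minority samples without consulting nearby majority samples, it can fall in a region where the minority label is doubtful.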

artificial intelligence, dataset, machine learning, (14 more...)

2009.14096

Country:

North America > United States > Colorado (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

GLOD: Gaussian Likelihood Out of Distribution Detector

Amit, Guy, Levy, Moshe, Rosenberg, Ishai, Shabtai, Asaf, Elovici, Yuval

arXiv.org Machine Learning, Oct-6-2020

Discriminative deep neural networks (DNNs) do well at classifying input associated with the classes they have been trained on. However, out-of-distribution (OOD) input poses a great challenge to such models and consequently represents a major risk when these models are used in safety-critical systems. In the last two years, extensive research has been performed in the domain of OOD detection. This research has relied mainly on training the model with OOD data or requiring additional computation for OOD detection. Such methods may not be applicable in many real world use cases. In this paper, we propose GLOD -- Gaussian likelihood out of distribution detector -- an extended DNN classifier capable of efficiently detecting OOD samples with no additional runtime overhead and without auxiliary training data. GLOD uses a layer that models the Gaussian density function of the trained classes. The layer outputs are used to estimate a Log-Likelihood Ratio which is employed to detect OOD samples. We evaluate GLOD's detection performance on SVHN, CIFAR-10 and CIFAR-100.

artificial intelligence, detection, machine learning, (19 more...)

2008.06856

Country: Europe > France (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

LOGAN: Local Group Bias Detection by Clustering

Zhao, Jieyu, Chang, Kai-Wei

arXiv.org Artificial Intelligence, Oct-6-2020

Machine learning techniques have been widely used in natural language processing (NLP). However, as revealed by many recent studies, machine learning models often inherit and amplify the societal biases in data. Various metrics have been proposed to quantify biases in model predictions. In particular, several of them evaluate disparity in model performance between protected groups and advantaged groups in the test corpus. However, we argue that evaluating bias at the corpus level is not enough for understanding how biases are embedded in a model. In fact, a model with similar aggregated performance between different groups on the entire data may behave differently on instances in a local region. To analyze and detect such local bias, we propose LOGAN, a new bias detection technique based on clustering. Experiments on toxicity classification and object classification tasks show that LOGAN identifies bias in a local region and allows us to better analyze the biases in model predictions.
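
The claim that a corpus-level comparison can hide local bias can be illustrated with a small worked example. This is only a sketch of the motivation, not the LOGAN algorithm itself: the cluster assignments, group names "a" and "b", and correctness flags below are all hypothetical, and the clusters are taken as given rather than learned.

```python
from collections import defaultdict

def accuracy_gap(records):
    """Accuracy of group 'a' minus accuracy of group 'b'.
    Each record is (group, correct) with correct in {0, 1}."""
    hits = defaultdict(list)
    for group, correct in records:
        hits[group].append(correct)
    acc = {g: sum(v) / len(v) for g, v in hits.items()}
    return acc["a"] - acc["b"]

def gaps_by_cluster(records_with_cluster):
    """Compute the same gap separately inside each cluster.
    Each record is (cluster, group, correct)."""
    by_cluster = defaultdict(list)
    for cluster, group, correct in records_with_cluster:
        by_cluster[cluster].append((group, correct))
    return {c: accuracy_gap(r) for c, r in by_cluster.items()}

# Hypothetical predictions: (cluster, group, correct).
data = (
    [(0, "a", 1)] * 4 + [(0, "b", 1)] * 2 + [(0, "b", 0)] * 2 +
    [(1, "a", 1)] * 2 + [(1, "a", 0)] * 2 + [(1, "b", 1)] * 4
)

global_gap = accuracy_gap([(g, c) for _, g, c in data])
local_gaps = gaps_by_cluster(data)
```

Here both groups reach 6 of 8 correct overall, so the global gap is zero, while cluster 0 favours group "a" by 0.5 and cluster 1 favours group "b" by 0.5.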

artificial intelligence, machine learning, natural language, (17 more...)

2010.02867

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Africa > Eswatini > Manzini > Manzini (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Semantic Evaluation for Text-to-SQL with Distilled Test Suites

Zhong, Ruiqi, Yu, Tao, Klein, Dan

arXiv.org Artificial Intelligence, Oct-6-2020

We propose test suite accuracy to approximate semantic accuracy for Text-to-SQL models. Our method distills a small test suite of databases that achieves high code coverage for the gold query from a large number of randomly generated databases. At evaluation time, it computes the denotation accuracy of the predicted queries on the distilled test suite, hence calculating a tight upper bound for semantic accuracy efficiently. We use our proposed method to evaluate 21 models submitted to the Spider leaderboard and manually verify that our method is always correct on 100 examples. In contrast, the current Spider metric leads to a 2.5% false negative rate on average and 8.1% in the worst case, indicating that test suite accuracy is needed. Our implementation, along with distilled test suites for eleven Text-to-SQL datasets, is publicly available.
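
The evaluation step can be sketched with Python's built-in sqlite3 module. This is a minimal illustration of comparing denotations over several databases, not the authors' released implementation; the table t, the two queries, and the helper names denotation and passes_test_suite are hypothetical, and the distillation step that selects the databases is not shown.

```python
import sqlite3
from collections import Counter

def denotation(setup_sql, query):
    """Run query on a fresh in-memory database built from setup_sql and
    return its result as a multiset of rows (row order is ignored)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(setup_sql)
        return Counter(conn.execute(query).fetchall())
    finally:
        conn.close()

def passes_test_suite(gold, predicted, suite):
    """True if both queries return the same rows on every database."""
    return all(denotation(db, gold) == denotation(db, predicted) for db in suite)

schema = "CREATE TABLE t (name TEXT, age INTEGER);"
# The first database does not exercise the boundary value 30; the second does.
db_plain = schema + "INSERT INTO t VALUES ('ann', 25), ('bob', 40);"
db_boundary = schema + "INSERT INTO t VALUES ('cat', 30), ('dan', 40);"

gold = "SELECT name FROM t WHERE age > 30"
predicted = "SELECT name FROM t WHERE age >= 30"

single_db_match = passes_test_suite(gold, predicted, [db_plain])
suite_match = passes_test_suite(gold, predicted, [db_plain, db_boundary])
```

On the first database alone the two queries agree, so a single-database check would wrongly accept the prediction; adding the database with a row at the boundary exposes the difference. Agreement on every database in a suite still does not prove equivalence, which is why the abstract describes the result as an upper bound on semantic accuracy.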

artificial intelligence, machine learning, query, (17 more...)

2010.0284

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Hong Kong (0.04)
(7 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

AI Can Detect COVID-19 in the Lungs Like a Virtual Physician, New Study Shows

#artificialintelligence, Oct-5-2020, 17:05:31 GMT

A University of Central Florida researcher is part of a new study showing that artificial intelligence can be nearly as accurate as a physician in diagnosing COVID-19 in the lungs. The study, recently published in Nature Communications, shows the new technique can also overcome some of the challenges of current testing. Researchers demonstrated that an AI algorithm could be trained to classify COVID-19 pneumonia in computed tomography (CT) scans with up to 90 percent accuracy, as well as correctly identify positive cases 84 percent of the time and negative cases 93 percent of the time. CT scans offer a deeper insight into COVID-19 diagnosis and progression as compared to the often-used reverse transcription-polymerase chain reaction, or RT-PCR, tests. These tests have high false negative rates, delays in processing and other challenges.
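
The three figures quoted above correspond to accuracy, sensitivity (positive cases correctly identified), and specificity (negative cases correctly identified). The sketch below shows how they relate to a confusion matrix. The counts are hypothetical, chosen only so that the rates match the reported percentages; they are not taken from the study.

```python
def diagnostic_rates(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),               # positives correctly identified
        "specificity": tn / (tn + fp),               # negatives correctly identified
        "accuracy": (tp + tn) / (tp + fn + tn + fp),  # all cases correctly identified
    }

# Hypothetical counts: 100 positive scans and 200 negative scans.
rates = diagnostic_rates(tp=84, fn=16, tn=186, fp=14)
```

Accuracy is the prevalence-weighted average of sensitivity and specificity, so the same sensitivity and specificity would give a different accuracy on a test set with a different share of positive cases.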

artificial intelligence, covid-19, machine learning, (14 more...)

Country:

North America > United States > Virginia (0.05)
Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.05)
Europe > Italy (0.05)
(2 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)

Automatic CAD-RADS Scoring Using Deep Learning

Denzinger, Felix, Wels, Michael, Breininger, Katharina, Gülsün, Mehmet A., Schöbinger, Max, André, Florian, Buß, Sebastian, Görich, Johannes, Sühling, Michael, Maier, Andreas

arXiv.org Artificial Intelligence, Oct-5-2020

Coronary CT angiography (CCTA) has established its role as a non-invasive modality for the diagnosis of coronary artery disease (CAD). The CAD-Reporting and Data System (CAD-RADS) has been developed to standardize communication and aid in decision making based on CCTA findings. The CAD-RADS score is determined by manual assessment of all coronary vessels and the grading of lesions within the coronary artery tree. We propose a bottom-up approach for fully-automated prediction of this score using deep-learning operating on a segment-wise representation of the coronary arteries. The method relies solely on a prior fully-automated centerline extraction and segment labeling and predicts the segment-wise stenosis degree and the overall calcification grade as auxiliary tasks in a multi-task learning setup. We evaluate our approach on a data collection consisting of 2,867 patients. On the task of identifying patients with a CAD-RADS score indicating the need for further invasive investigation our approach reaches an area under curve (AUC) of 0.923 and an AUC of 0.914 for determining whether the patient suffers from CAD. This level of performance enables our approach to be used in a fully-automated screening setup or to assist diagnostic CCTA reading, especially due to its neural architecture design -- which allows comprehensive predictions.
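
For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the scores are hypothetical and unrelated to the paper's data.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical model scores for three positive and three negative cases.
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

In this example 8 of the 9 positive-negative pairs are ordered correctly, giving an AUC of 8/9. The pairwise form is quadratic in the number of cases, so library implementations typically use a rank-based computation instead.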

artificial intelligence, cad-rad score, machine learning, (14 more...)

doi: 10.1007/978-3-030-59725-2

2010.01963

Country:

Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

Modeling Islamist Extremist Communications on Social Media using Contextual Dimensions: Religion, Ideology, and Hate

Kursuncu, Ugur, Gaur, Manas, Castillo, Carlos, Alambo, Amanuel, Thirunarayan, K., Shalin, Valerie, Achilov, Dilshod, Arpinar, I. Budak, Sheth, Amit

arXiv.org Artificial Intelligence, Oct-5-2020

Terror attacks have been linked in part to online extremist content. Although tens of thousands of Islamist extremism supporters consume such content, they are a small fraction relative to peaceful Muslims. The efforts to contain the ever-evolving extremism on social media platforms have remained inadequate and mostly ineffective. Divergent extremist and mainstream contexts challenge machine interpretation, with a particular threat to the precision of classification algorithms. Our context-aware computational approach to the analysis of extremist content on Twitter breaks down this persuasion process into building blocks that acknowledge inherent ambiguity and sparsity that likely challenge both manual and automated classification. We model this process using a combination of three contextual dimensions -- religion, ideology, and hate -- each elucidating a degree of radicalization and highlighting independent features to render them computationally accessible. We utilize domain-specific knowledge resources for each of these contextual dimensions, such as the Qur'an for religion, the books of extremist ideologues and preachers for political ideology, and a social media hate speech corpus for hate. Our study makes three contributions to reliable analysis: (i) Development of a computational approach rooted in the contextual dimensions of religion, ideology, and hate that reflects strategies employed by online Islamist extremist groups, (ii) An in-depth analysis of relevant tweet datasets with respect to these dimensions to exclude likely mislabeled users, and (iii) A framework for understanding online radicalization as a process to assist counter-programming. Given the potentially significant social impact, we evaluate the performance of our algorithms to minimize mislabeling, where our approach outperforms a competitive baseline by 10.2% in precision.

contextual dimension, dimension, representation, (14 more...)

doi: 10.1145/3359253

1908.0652

Country:

North America > United States > Massachusetts > Bristol County > Dartmouth (0.14)
North America > United States > South Carolina > Richland County > Columbia (0.14)
North America > United States > Georgia > Clarke County > Athens (0.14)
(23 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Law Enforcement & Public Safety > Terrorism (1.00)
Government > Military (1.00)
Media (0.93)
Government > Regional Government > Asia Government > Middle East Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Quantifying Statistical Significance of Neural Network Representation-Driven Hypotheses by Selective Inference

Duy, Vo Nguyen Le, Iwazaki, Shogo, Takeuchi, Ichiro

arXiv.org Machine Learning, Oct-5-2020

In the past few years, various approaches have been developed to explain and interpret deep neural network (DNN) representations, but it has been pointed out that these representations are sometimes unstable and not reproducible. In this paper, we interpret these representations as hypotheses driven by the DNN (called DNN-driven hypotheses) and propose a method to quantify the reliability of these hypotheses in a statistical hypothesis testing framework. To this end, we introduce the Selective Inference (SI) framework, which has received much attention in the past few years as a new statistical inference framework for data-driven hypotheses. The basic idea of SI is to make conditional inferences on the selected hypotheses under the condition that they are selected. In order to use the SI framework for DNN representations, we develop a new SI algorithm based on the homotopy method, which enables us to derive the exact (non-asymptotic) conditional sampling distribution of the DNN-driven hypotheses. We conduct experiments on both synthetic and real-world datasets, through which we offer evidence that our proposed method can successfully control the false positive rate, has decent performance in terms of computational efficiency, and provides good results in practical applications. The remarkable predictive performance of deep neural networks (DNNs) stems from their ability to learn appropriate representations from data.

artificial intelligence, hypothesis, machine learning, (15 more...)

2010.01823

Country: Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report > Experimental Study (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.91)

Evolving test instances of the Hamiltonian completion problem

Lechien, Thibault, Jooken, Jorik, De Causmaecker, Patrick

arXiv.org Artificial Intelligence, Oct-5-2020

Predicting and comparing algorithm performance on graph instances is challenging for multiple reasons. First, there is usually no standard set of instances to benchmark performance. Second, using existing graph generators results in a restricted spectrum of difficulty and the resulting graphs are usually not diverse enough to draw sound conclusions. That is why recent work proposes a new methodology to generate a diverse set of instances by using an evolutionary algorithm. We can then analyze the resulting graphs and get key insights into which attributes are most related to algorithm performance. We can also fill observed gaps in the instance space in order to generate graphs with previously unseen combinations of features. This methodology is applied to the instance space of the Hamiltonian completion problem using two different solvers, namely the Concorde TSP Solver and a multi-start local search algorithm.

artificial intelligence, evolutionary algorithm, machine learning, (14 more...)

2011.02291

Country: Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks

Csordás, Róbert, van Steenkiste, Sjoerd, Schmidhuber, Jürgen

arXiv.org Artificial Intelligence, Oct-5-2020

Neural networks (NNs) whose subnetworks implement reusable functions are expected to offer numerous advantages, including compositionality through efficient recombination of functional building blocks, interpretability, preventing catastrophic interference, etc. Understanding if and how NNs are modular could provide insights into how to improve them. Current inspection methods, however, fail to link modules to their functionality. In this paper, we present a novel method based on learning binary weight masks to identify individual weights and subnets responsible for specific functions. Using this powerful tool, we contribute an extensive study of emerging modularity in NNs that covers several standard architectures and datasets. We demonstrate how common NNs fail to reuse submodules and offer new insights into the related issue of systematic generalization on language tasks.

artificial intelligence, machine learning, module, (19 more...)

2010.02066

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)