AITopics | Accuracy

Collaborating Authors

Accuracy

News Overviews Instructional Materials AI-Alerts Classics

Learning Optimal Fair Classification Trees

Jo, Nathanael, Aghaei, Sina, Benson, Jack, Gómez, Andrés, Vayanos, Phebe

arXiv.org Artificial IntelligenceJan-24-2022

The increasing use of machine learning in high-stakes domains -- where people's livelihoods are impacted -- creates an urgent need for interpretable and fair algorithms. In these settings it is also critical for such algorithms to be accurate. With these needs in mind, we propose a mixed integer optimization (MIO) framework for learning optimal classification trees of fixed depth that can be conveniently augmented with arbitrary domain specific fairness constraints. We benchmark our method against the state-of-the-art approach for building fair trees on popular datasets; given a fixed discrimination threshold, our approach improves out-of-sample (OOS) accuracy by 2.3 percentage points on average and obtains a higher OOS accuracy on 88.9% of the experiments. We also incorporate various algorithmic fairness notions into our method, showcasing its versatile modeling power that allows decision makers to fine-tune the trade-off between accuracy and fairness.

constraint, fairness, statistical parity, (14 more...)

arXiv.org Artificial Intelligence

2201.09932

Country:

North America > United States > California (0.14)
Europe > Austria > Vienna (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
(13 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Add feedback

What You See is Not What the Network Infers: Detecting Adversarial Examples Based on Semantic Contradiction

Yang, Yijun, Gao, Ruiyuan, Li, Yu, Lai, Qiuxia, Xu, Qiang

arXiv.org Artificial IntelligenceJan-24-2022

Adversarial examples (AEs) pose severe threats to the applications of deep neural networks (DNNs) to safety-critical domains, e.g., autonomous driving. While there has been a vast body of AE defense solutions, to the best of our knowledge, they all suffer from some weaknesses, e.g., defending against only a subset of AEs or causing a relatively high accuracy loss for legitimate inputs. Moreover, most existing solutions cannot defend against adaptive attacks, wherein attackers are knowledgeable about the defense mechanisms and craft AEs accordingly. In this paper, we propose a novel AE detection framework based on the very nature of AEs, i.e., their semantic information is inconsistent with the discriminative features extracted by the target DNN model. To be specific, the proposed solution, namely ContraNet, models such contradiction by first taking both the input and the inference result to a generator to obtain a synthetic output and then comparing it against the original input. For legitimate inputs that are correctly inferred, the synthetic output tries to reconstruct the input. On the contrary, for AEs, instead of reconstructing the input, the synthetic output would be created to conform to the wrong label whenever possible. Consequently, by measuring the distance between the input and the synthetic output with metric learning, we can differentiate AEs from legitimate inputs. We perform comprehensive evaluations under various AE attack scenarios, and experimental results show that ContraNet outperforms existing solutions by a large margin, especially under adaptive attacks. Moreover, our analysis shows that successful AEs that can bypass ContraNet tend to have much-weakened adversarial semantics. We have also shown that ContraNet can be easily combined with adversarial training techniques to achieve further improved AE defense capabilities.

adaptive attack, classifier, contranet, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.14722/ndss.2022.24001

2201.0965

Country:

Asia > China > Hong Kong (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Belgium (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Knowledge Graph Embeddings based Approach for Author Name Disambiguation using Literals

Santini, Cristian, Gesese, Genet Asefa, Peroni, Silvio, Gangemi, Aldo, Sack, Harald, Alam, Mehwish

arXiv.org Artificial IntelligenceJan-24-2022

Data available in scholarly knowledge graphs (SKGs) - i.e., "a graph of data intended to accumulate and convey knowledge of the real world, whose nodes represent entities of interest and whose edges represent potentially different relations between these entities" [14] - is growing continuously every day, leading to a plethora of challenges concerning, for instance, article exploration and visualization [17], article recommendation [3], citation recommendation [11], and Author Name Disambiguation (AND) [24], which is relevant for the purposes of the present article. In particular, AND refers to a specific task of entity resolution which aims at resolving author mentions in bibliographic references to real-world people. Author persistent identifiers, such as ORCIDs and VIAFs, simplify the AND activity since such identifiers can be used for reconciling entities defined as different objects and representing the same real-world person. However, the availability of such persistent identifiers in SKGs - such as OpenCitations (OC) [22], AMiner [27] and Microsoft Academic Knowledge Graph (MAKG) [10] - is characterized by very low coverage and, as such, additional and computationally-oriented techniques must be adopted to identify different authors as the same person. In the past, many automatic approaches have been developed to automatically address AND by using publications metadata (e.g., title, abstract, keywords, venue, affiliation, etc.) to extract some features which can be used in the disambiguation task. These methods vary widely from supervised learning methods to unsupervised learning including recently developed deep neural network-based architectures [31]. However, the existing SKGs do not provide all the relevant contextual information necessary to reuse effectively and efficiently such approaches, that often rely on pure textual data. In contrast with the approaches mentioned above, this study focuses on performing AND for scholarly data represented as linked data or included in SKGs by considering the multi-modal information available in such collections, i.e., the structural information consisting of entities and relations between them as well as text or numeric values associated with the authors and publications defined in the form of literals (family name, given name, publication title, venue title, year of publication, etc.). The proposed framework to address this task is named Literally Author Name Disambiguation (LAND), which focuses on tackling the following research questions: - Can Knowledge Graph Embeddings (KGEs) - i.e. a technique that enables the creation of a "dense representation of the graph in a continuous, low-dimensional vector space that can then be used for machine learning tasks"[13] - be used effectively for the downstream task of clustering, more specifically for author name disambiguation?

author name disambiguation, dataset, information, (10 more...)

arXiv.org Artificial Intelligence

2201.09555

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.06)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.05)
Europe > Netherlands > South Holland > Leiden (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Perform Sliced (Tiled) Inference and Detailed Error Analysis using YOLOv5 Models

#artificialintelligenceJan-23-2022, 10:10:08 GMT

This post will walk you through installation, sliced inference, error analysis, and interactive visualization steps for your YOLOv5 models. Create error analysis plots using the created result.json:

inference and detailed error analysis, perform sliced, potential gain, (3 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.43)

Add feedback

Robust Wavelet-based Assessment of Scaling with Applications

Hamilton, Erin K., Jeon, Seonghye, Cobo, Pepa Ramirez, Lee, Kichun Sky, Vidakovic, Brani

arXiv.org Machine LearningJan-23-2022

A number of approaches have dealt with statistical assessment of self-similarity, and many of those are based on multiscale concepts. Most rely on certain distributional assumptions which are usually violated by real data traces, often characterized by large temporal or spatial mean level shifts, missing values or extreme observations. A novel, robust approach based on Theil-type weighted regression is proposed for estimating self-similarity in two-dimensional data (images). The method is compared to two traditional estimation techniques that use wavelet decompositions; ordinary least squares (OLS) and Abry-Veitch bias correcting estimator (AV). As an application, the suitability of the self-similarity estimate resulting from the the robust approach is illustrated as a predictive feature in the classification of digitized mammogram images as cancerous or non-cancerous. The diagnostic employed here is based on the properties of image backgrounds, which is typically an unused modality in breast cancer screening. Classification results show nearly 68% accuracy, varying slightly with the choice of wavelet basis, and the range of multiresolution levels used.

coefficient, estimator, regression, (14 more...)

arXiv.org Machine Learning

2201.0932

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > New York > New York County > New York City (0.04)
South America > Brazil > São Paulo (0.04)
(9 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.35)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Add feedback

Survival Prediction of Children Undergoing Hematopoietic Stem Cell Transplantation Using Different Machine Learning Classifiers by Performing Chi-squared Test and Hyper-parameter Optimization: A Retrospective Analysis

Ratul, Ishrak Jahan, Wani, Ummay Habiba, Nishat, Mirza Muntasir, Al-Monsur, Abdullah, Ar-Rafi, Abrar Mohammad, Faisal, Fahim, Kabir, Mohammad Ridwan

arXiv.org Artificial IntelligenceJan-22-2022

Bone Marrow Transplant, a gradational rescue for a wide range of disorders emanating from the bone marrow, is an efficacious surgical treatment. Several risk factors, such as post-transplant illnesses, new malignancies, and even organ damage, can impair long-term survival. Therefore, technologies like Machine Learning are deployed for investigating the survival prediction of BMT receivers along with the influences that limit their resilience. In this study, an efficient survival classification model is presented in a comprehensive manner, incorporating the Chi-squared feature selection method to address the dimensionality problem and Hyper Parameter Optimization (HPO) to increase accuracy. A synthetic dataset is generated by imputing the missing values, transforming the data using dummy variable encoding, and compressing the dataset from 59 features to the 11 most correlated features using Chi-squared feature selection. The dataset was split into train and test sets at a ratio of 80:20, and the hyperparameters were optimized using Grid Search Cross-Validation. Several supervised ML methods were trained in this regard, like Decision Tree, Random Forest, Logistic Regression, K-Nearest Neighbors, Gradient Boosting Classifier, Ada Boost, and XG Boost. The simulations have been performed for both the default and optimized hyperparameters by using the original and reduced synthetic dataset. After ranking the features using the Chi-squared test, it was observed that the top 11 features with HPO, resulted in the same accuracy of prediction (94.73%) as the entire dataset with default parameters. Moreover, this approach requires less time and resources for predicting the survivability of children undergoing BMT. Hence, the proposed approach may aid in the development of a computer-aided diagnostic system with satisfactory accuracy and minimal computation time by utilizing medical data records.

artificial intelligence, dataset, machine learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1155/2022/9391136

2201.08987

Country:

North America > United States > California > Orange County > Irvine (0.04)
Asia > Japan (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Hematology > Stem Cells (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

How can AI Prevent Fraud?

#artificialintelligenceJan-21-2022, 12:05:17 GMT

Multinational technology corporation IBM calculated that 72% of business leaders cited fraud as a growing concern in the last year, that $44 billion will be lost worldwide due to fraud by 2024, and that a quarter of e-commerce sales transactions that were declined by artificial intelligence (AI) were false positives. AI has become the leading tool for fighting fraud, but it can still be improved upon. In the past, rule-based engines and simple predictive models were used to computationally identify the majority of fraud attempts. But these methods have not kept up with the increasingly sophisticated nature of fraud attacks today. With a proliferation of digital technologies at criminals' disposal, fraud has grown in both scale and severity over the last few decades. Large criminal organizations and even state-sponsored groups use AI-like machine learning (ML) algorithms to defraud digital businesses for millions of dollars each year.

fraud, fraud prevention, prevention, (14 more...)

#artificialintelligence

Industry:

Banking & Finance (0.99)
Law Enforcement & Public Safety > Fraud (0.34)
Information Technology > Security & Privacy (0.30)

Technology:

Information Technology > e-Commerce > Financial Technology (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.37)

Add feedback

A phase transition for finding needles in nonlinear haystacks with LASSO artificial neural networks

Ma, Xiaoyu, Sardy, Sylvain, Hengartner, Nick, Bobenko, Nikolai, Lin, Yen Ting

arXiv.org Machine LearningJan-21-2022

To fit sparse linear associations, a LASSO sparsity inducing penalty with a single hyperparameter provably allows to recover the important features (needles) with high probability in certain regimes even if the sample size is smaller than the dimension of the input vector (haystack). More recently learners known as artificial neural networks (ANN) have shown great successes in many machine learning tasks, in particular fitting nonlinear associations. Small learning rate, stochastic gradient descent algorithm and large training set help to cope with the explosion in the number of parameters present in deep neural networks. Yet few ANN learners have been developed and studied to find needles in nonlinear haystacks. Driven by a single hyperparameter, our ANN learner, like for sparse linear associations, exhibits a phase transition in the probability of retrieving the needles, which we do not observe with other ANN learners. To select our penalty parameter, we generalize the universal threshold of Donoho and Johnstone (1994) which is a better rule than the conservative (too many false detections) and expensive cross-validation. In the spirit of simulated annealing, we propose a warm-start sparsity inducing algorithm to solve the high-dimensional, non-convex and non-differentiable optimization problem. We perform precise Monte Carlo simulations to show the effectiveness of our approach.

lasso ann, neural network, phase transition, (16 more...)

arXiv.org Machine Learning

2201.08652

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > Switzerland (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

Introduction to Logistic Regression: Predicting Diabetes

#artificialintelligenceJan-20-2022, 18:15:06 GMT

Data can be broadly divided into continuous data, those that can take an infinite number of points within a given range such as distance or time, and categorical/discrete data, which contain a finite number of points or categories within a given group of data such as payment methods or customer complaints. We have already seen examples of applying regression to continuous prediction problems in the form of linear regression where we predicted sales, but in order to predict categorical outputs we can use logistic regression. While we are still using regression to predict outcomes, the main aim of logistic regression is to be able to predict which category and observation belongs to rather than an exact value. Examples of questions which this method can be used for include: "How likely is a person to suffer from a disease (outcome) given their age, sex, smoking status, etc (variables/features)?" "How likely is this email to be spam?" "Will a student pass a test given some predictors of performance?".

diabetes, probability, regression, (12 more...)

#artificialintelligence

Genre:

Research Report > New Finding (0.84)
Research Report > Experimental Study (0.84)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Learning Norms via Natural Language Teachings

Olson, Taylor, Forbus, Ken

arXiv.org Artificial IntelligenceJan-20-2022

To interact with humans, artificial intelligence (AI) systems must understand our social world. Within this world norms play an important role in motivating and guiding agents. However, very few computational theories for learning social norms have been proposed. There also exists a long history of debate on the distinction between what is normal (is) and what is normative (ought). Many have argued that being capable of learning both concepts and recognizing the difference is necessary for all social agents. This paper introduces and demonstrates a computational approach to learning norms from natural language text that accounts for both what is normal and what is normative. It provides a foundation for everyday people to train AI systems about social norms.

narrative function, norm frame, representation, (15 more...)

arXiv.org Artificial Intelligence

2201.10556

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Texas (0.04)
(10 more...)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback