AITopics

2107.1096

Country:

North America > United States (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Habibpour, Maryam, Gharoun, Hassan, Mehdipour, Mohammadreza, Tajally, AmirReza, Asgharnezhad, Hamzeh, Shamsi, Afshar, Khosravi, Abbas, Shafie-Khah, Miadreza, Nahavandi, Saeid, Catalao, Joao P. S.

Uncertainty-Aware Credit Card Fraud Detection Using Deep Learning

arXiv.org Artificial IntelligenceJul-28-2021

Countless research works of deep neural networks (DNNs) in the task of credit card fraud detection have focused on improving the accuracy of point predictions and mitigating unwanted biases by building different network architectures or learning models. Quantifying uncertainty accompanied by point estimation is essential because it mitigates model unfairness and permits practitioners to develop trustworthy systems which abstain from suboptimal decisions due to low confidence. Explicitly, assessing uncertainties associated with DNNs predictions is critical in real-world card fraud detection settings for characteristic reasons, including (a) fraudsters constantly change their strategies, and accordingly, DNNs encounter observations that are not generated by the same process as the training distribution, (b) owing to the time-consuming process, very few transactions are timely checked by professional experts to update DNNs. Therefore, this study proposes three uncertainty quantification (UQ) techniques named Monte Carlo dropout, ensemble, and ensemble Monte Carlo dropout for card fraud detection applied on transaction data. Moreover, to evaluate the predictive uncertainty estimates, UQ confusion matrix and several performance metrics are utilized. Through experimental results, we show that the ensemble is more effective in capturing uncertainty corresponding to generated predictions. Additionally, we demonstrate that the proposed UQ methods provide extra insight to the point predictions, leading to elevate the fraud prevention process.

detection, fraud detection, prediction, (16 more...)

2107.13508

Country:

North America > United States (0.14)
Asia > Middle East > Iran > Tehran Province > Tehran (0.05)
Europe > Finland > Ostrobothnia > Vaasa (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Kruit, Benno, He, Hongyu, Urbani, Jacopo

Tab2Know: Building a Knowledge Base from Tables in Scientific Papers

arXiv.org Artificial IntelligenceJul-28-2021

Tables in scientific papers contain a wealth of valuable knowledge for the scientific enterprise. To help the many of us who frequently consult this type of knowledge, we present Tab2Know, a new end-to-end system to build a Knowledge Base (KB) from tables in scientific papers. Tab2Know addresses the challenge of automatically interpreting the tables in papers and of disambiguating the entities that they contain. To solve these problems, we propose a pipeline that employs both statistical-based classifiers and logic-based reasoning. First, our pipeline applies weakly supervised classifiers to recognize the type of tables and columns, with the help of a data labeling system and an ontology specifically designed for our purpose. Then, logic-based reasoning is used to link equivalent entities (via sameAs links) in different tables. An empirical evaluation of our approach using a corpus of papers in the Computer Science domain has returned satisfactory performance. This suggests that ours is a promising step to create a large-scale KB of scientific knowledge.

knowledge base, query, tab2know, (16 more...)

doi: 10.1007/978-3-030-62419-4_20

2107.13306

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

#artificialintelligenceJul-27-2021, 05:20:18 GMT

Top 70+ Data Science Interview Questions and Answers for 2021

We can see Pr value here, and there are three stars associated with this Pr value. This basically means that we can reject the null hypothesis which states that there is no relationship between the age and the target columns. But since we have three stars over here, this null hypothesis can be rejected. There is a strong relationship between the age column and the target column. Now, we have other parameters like null deviance and residual deviance.

algorithm, data science, dataset, (15 more...)

#artificialintelligence

Country:

North America > United States > New York (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.51)

Industry:

Leisure & Entertainment (0.68)
Media > Film (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

arXiv.org Machine LearningJul-27-2021

Subset selection for linear mixed models

Kowal, Daniel R.

Linear mixed models (LMMs) are instrumental for regression analysis with structured dependence, such as grouped, clustered, or multilevel data. However, selection among the covariates--while accounting for this structured dependence--remains a challenge. We introduce a Bayesian decision analysis for subset selection with LMMs. Using a Mahalanobis loss function that incorporates the structured dependence, we derive optimal linear actions for any subset of covariates and under any Bayesian LMM. Crucially, these actions inherit shrinkage or regularization and uncertainty quantification from the underlying Bayesian LMM. Rather than selecting a single "best" subset, which is often unstable and limited in its information content, we collect the acceptable family of subsets that nearly match the predictive ability of the "best" subset. The acceptable family is summarized by its smallest member and key variable importance metrics. Customized subset search and out-of-sample approximation algorithms are provided for more scalable computing. These tools are applied to simulated data and a longitudinal physical activity dataset, and in both cases demonstrate excellent prediction, estimation, and selection ability.

acceptable family, selection, subset, (17 more...)

2107.1289

Country: North America > United States > Texas > Harris County > Houston (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Industry:

Health & Medicine > Therapeutic Area (0.47)
Health & Medicine > Consumer Health (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Khurana, Drona, Ravichandran, Srinivasan, Jain, Sparsh, Edakunni, Narayanan Unny

Statistical Guarantees for Fairness Aware Plug-In Algorithms

arXiv.org Machine LearningJul-27-2021

A plug-in algorithm to estimate Bayes Optimal Classifiers for fairness-aware binary classification has been proposed in (Menon & Williamson, 2018). However, the statistical efficacy of their approach has not been established. We prove that the plug-in algorithm is statistically consistent. We also derive finite sample guarantees associated with learning the Bayes Optimal Classifiers via the plug-in algorithm. Finally, we propose a protocol that modifies the plug-in approach, so as to simultaneously guarantee fairness and differential privacy with respect to a binary feature deemed sensitive.

menon & williamson, probability, statistical guarantee, (14 more...)

2107.12783

Country: Asia > India (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Fleury, Nicolas, Dubrunquez, Theo, Alouani, Ihsen

PDF-Malware: An Overview on Threats, Detection and Evasion Attacks

arXiv.org Artificial IntelligenceJul-27-2021

In the recent years, Portable Document Format, commonly known as PDF, has become a democratized standard for document exchange and dissemination. This trend has been due to its characteristics such as its flexibility and portability across platforms. The widespread use of PDF has installed a false impression of inherent safety among benign users. However, the characteristics of PDF motivated hackers to exploit various types of vulnerabilities, overcome security safeguards, thereby making the PDF format one of the most efficient malicious code attack vectors. Therefore, efficiently detecting malicious PDF files is crucial for information security. Several analysis techniques has been proposed in the literature, be it static or dynamic, to extract the main features that allow the discrimination of malware files from benign ones. Since classical analysis techniques may be limited in case of zero-days, machine-learning based techniques have emerged recently as an automatic PDF-malware detection method that is able to generalize from a set of training samples. These techniques are themselves facing the challenge of evasion attacks where a malicious PDF is transformed to look benign. In this work, we give an overview on the PDF-malware detection problem. We give a perspective on the new challenges and emerging solutions.

classifier, detection, pdf file, (16 more...)

2107.12873

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe > France > Hauts-de-France (0.05)
Asia (0.05)
North America > United States > Hawaii (0.04)

Genre: Overview (0.89)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

arXiv.org Artificial IntelligenceJul-27-2021

Identify Apple Leaf Diseases Using Deep Learning Algorithm

Zhang, Daping, Yang, Hongyu, Cao, Jiayu

Agriculture is an essential industry in the both society and economy of a country. However, the pests and diseases cause a great amount of reduction in agricultural production while there is no sufficient guidance for farmers to avoid this disaster. To address this problem, we apply CNNs to plant disease recognition by building a classification model. Within the dataset of 3,642 images of apple leaves [1], We use a pre-trained image classification model Restnet34 based on Convolutional neural network (CNNs) with the Fastai framework in order to save the training time. Overall, the accuracy of classification is 93.765%.

accuracy, deep learning algorithm, identify apple leaf disease, (9 more...)

2107.12598

Country: North America > United States (0.05)

Genre: Research Report (0.83)

Industry: Food & Agriculture > Agriculture (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.97)

Gote, Christoph, Perri, Vincenzo, Scholtes, Ingo

Predicting Influential Higher-Order Patterns in Temporal Network Data

arXiv.org Machine LearningJul-26-2021

Networks are frequently used to model complex systems comprised of interacting elements. While links capture the topology of direct interactions, the true complexity of many systems originates from higher-order patterns in paths by which nodes can indirectly influence each other. Path data, representing ordered sequences of consecutive direct interactions, can be used to model these patterns. However, to avoid overfitting, such models should only consider those higher-order patterns for which the data provide sufficient statistical evidence. On the other hand, we hypothesise that network models, which capture only direct interactions, underfit higher-order patterns present in data. Consequently, both approaches are likely to misidentify influential nodes in complex networks. We contribute to this issue by proposing eight centrality measures based on MOGen, a multi-order generative model that accounts for all paths up to a maximum distance but disregards paths at higher distances. We compare MOGen-based centralities to equivalent measures for network models and path data in a prediction experiment where we aim to identify influential nodes in out-of-sample data. Our results show strong evidence supporting our hypothesis. MOGen consistently outperforms both the network model and path-based prediction. We further show that the performance difference between MOGen and the path-based approach disappears if we have sufficient observations, confirming that the error is due to overfitting.

centrality, influential higher-order pattern, node, (14 more...)

2107.121

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Germany (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Transportation > Infrastructure & Services (0.46)
Health & Medicine > Therapeutic Area (0.46)
Telecommunications > Networks (0.42)
Information Technology > Networks (0.42)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Toussaint, Wiebke, Ding, Aaron Yi

SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification

arXiv.org Artificial IntelligenceJul-26-2021

Despite the success of deep neural networks (DNNs) in enabling on-device voice assistants, increasing evidence of bias and discrimination in machine learning is raising the urgency of investigating the fairness of these systems. Speaker verification is a form of biometric identification that gives access to voice assistants. Due to a lack of fairness metrics and evaluation frameworks that are appropriate for testing the fairness of speaker verification components, little is known about how model performance varies across subgroups, and what factors influence performance variation. To tackle this emerging challenge, we design and develop SVEva Fair, an accessible, actionable and model-agnostic framework for evaluating the fairness of speaker verification components. The framework provides evaluation measures and visualisations to interrogate model performance across speaker subgroups and compare fairness between models. We demonstrate SVEva Fair in a case study with end-to-end DNNs trained on the VoxCeleb datasets to reveal potential bias in existing embedded speech recognition systems based on the demographic attributes of speakers. Our evaluation shows that publicly accessible benchmark models are not fair and consistently produce worse predictions for some nationalities, and for female speakers of most nationalities. To pave the way for fair and reliable embedded speaker verification, SVEva Fair has been implemented as an open-source python library and can be integrated into the embedded ML development pipeline to facilitate developers and researchers in troubleshooting unreliable speaker verification performance, and selecting high impact approaches for mitigating fairness challenges

fairness, speaker verification component, subgroup, (10 more...)

2107.12049

Country:

North America > Canada (0.05)
Oceania > Australia (0.05)
Europe > Norway (0.05)
(9 more...)

Genre:

Research Report (1.00)
Overview (0.68)

Industry: Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)