AITopics

2106.06243

Country:

Europe > Italy (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)

Genre: Research Report (1.00)

Industry: Banking & Finance (0.67)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Castelnovo, Alessandro, Crupi, Riccardo, Greco, Greta, Regoli, Daniele

The zoo of Fairness metrics in Machine Learning

arXiv.org Machine Learning, Jun-11-2021

In recent years, the problem of addressing fairness in Machine Learning (ML) and automatic decision-making has attracted considerable attention in the scientific communities dealing with Artificial Intelligence. Many definitions of fairness in ML have been proposed, each reflecting a different notion of what a "fair decision" is in situations impacting individuals in the population. The precise differences, implications and "orthogonality" between these notions have not yet been fully analyzed in the literature. In this work, we try to bring some order to this zoo of definitions.

criteria, fairness, information, (15 more...)

2106.00467

Country:

North America > United States > New York (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)

Genre: Research Report (0.64)

Industry:

Law (0.67)
Law Enforcement & Public Safety (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Yang, Junchen, Lindenbaum, Ofir, Kluger, Yuval

Locally Sparse Networks for Interpretable Predictions

arXiv.org Machine Learning, Jun-11-2021

Despite the enormous success of neural networks, they are still hard to interpret and often overfit when applied to low-sample-size (LSS) datasets. To tackle these obstacles, we propose a framework for training locally sparse neural networks where the local sparsity is learned via a sample-specific gating mechanism that identifies the subset of most relevant features for each measurement. The sample-specific sparsity is predicted via a gating network, which is trained in tandem with the prediction network. By learning these subsets and weights of a prediction model, we obtain an interpretable neural network that can handle LSS data and can remove nuisance variables, which are irrelevant for the supervised learning task. Using both synthetic and real-world datasets, we demonstrate that our method outperforms state-of-the-art models when predicting the target function with far fewer features per instance.

dataset, invase, neural network, (13 more...)

2106.06468

Country:

Oceania > New Zealand (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Analyzing Non-Textual Content Elements to Detect Academic Plagiarism

Meuschke, Norman

alignment-based similarity analysis, diabetes, presentational mathematical feature, (30 more...)

Identifying academic plagiarism is a pressing problem, among others, for research institutions, publishers, and funding organizations. Detection approaches proposed so far analyze lexical, syntactical, and semantic text similarity. These approaches find copied, moderately reworded, and literally translated text. However, reliably detecting disguised plagiarism, such as strong paraphrases, sense-for-sense translations, and the reuse of non-textual content and ideas, is an open research problem. The thesis addresses this problem by proposing plagiarism detection approaches that implement a different concept: analyzing non-textual content in academic documents, specifically citations, images, and mathematical content. To validate the effectiveness of the proposed detection approaches, the thesis presents five evaluations that use real cases of academic plagiarism and exploratory searches for unknown cases. The evaluation results show that non-textual content elements contain a high degree of semantic information, are language-independent, and largely immutable to the alterations that authors typically perform to conceal plagiarism. Analyzing non-textual content complements text-based detection approaches and increases the detection effectiveness, particularly for disguised forms of academic plagiarism. To demonstrate the benefit of combining non-textual and text-based detection methods, the thesis describes the first plagiarism detection system that integrates the analysis of citation-based, image-based, math-based, and text-based document similarity. The system's user interface employs visualizations that significantly reduce the effort and time users must invest in examining content similarity.

doi: 10.5281/zenodo.4913345

2106.05764

Country:

North America > United States > California (0.45)
Europe > Netherlands (0.13)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.13)
(5 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Overview (1.00)
(2 more...)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Wheelchair automation by a hybrid BCI system using SSVEP and eye blinks

Kanungo, Lizy, Garg, Nikhil, Bhobe, Anish, Rajguru, Smit, Baths, Veeky

This work proposes a hybrid Brain Computer Interface (BCI) system for the automation of a wheelchair for the disabled. Herein a working prototype of a BCI-based wheelchair is detailed that can navigate inside a typical home environment with minimum structural modification and without any visual obstruction and discomfort to the user. The prototype is based on a combined mechanism of steady-state visually evoked potential (SSVEP) and eye blinks. To elicit SSVEP, LEDs flickering at 13 Hz and 15 Hz were used to select the left and right direction, respectively, and EEG data was recorded. In addition, the occurrence of three continuous blinks was used as an indicator for stopping an ongoing action. The wavelet packet denoising method was applied, followed by feature extraction methods such as Wavelet Packet Decomposition and Canonical Correlation Analysis over narrowband reconstructed EEG signals. Bayesian optimization with 5-fold cross-validation was used to optimize the hyperparameters of the Support Vector Machine. The resulting model was tested, yielding an average cross-validation accuracy of 89.65% ± 6.6% (SD) and a testing accuracy of 83.53% ± 8.59% (SD). The wheelchair was controlled by a Raspberry Pi through WiFi. The developed prototype demonstrated an average success rate of 86.97% across all trials, with 4.015 s for each command execution. The prototype can be used efficiently in a home environment without causing any discomfort to the user.
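As a rough illustration of how Canonical Correlation Analysis can separate the 13 Hz and 15 Hz SSVEP responses mentioned above, the sketch below scores a multi-channel EEG window against sine/cosine references at each candidate frequency. This is not the authors' pipeline: the synthetic signal, the sampling rate, the number of harmonics, and all function names are assumptions made for the example. The wavelet packet denoising and the Support Vector Machine are omitted, and the frequency with the highest correlation is simply taken as the answer.

```python
import numpy as np

def max_canonical_correlation(x, y):
    """Largest canonical correlation between the columns of x and y."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    # Orthonormal bases of the two column spaces; the singular values of
    # qx.T @ qy are the canonical correlations (full column rank assumed).
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    return np.linalg.svd(qx.T @ qy, compute_uv=False)[0]

def reference_signals(freq, n_samples, fs, n_harmonics=2):
    """Sine/cosine references at freq and its harmonics (n_harmonics is an assumption)."""
    t = np.arange(n_samples) / fs
    columns = []
    for h in range(1, n_harmonics + 1):
        columns.append(np.sin(2 * np.pi * h * freq * t))
        columns.append(np.cos(2 * np.pi * h * freq * t))
    return np.column_stack(columns)

def detect_ssvep(eeg, fs, candidates=(13.0, 15.0)):
    """Return the candidate frequency whose references correlate best with eeg."""
    scores = {
        f: max_canonical_correlation(eeg, reference_signals(f, eeg.shape[0], fs))
        for f in candidates
    }
    return max(scores, key=scores.get), scores

# Synthetic 3-channel recording (hypothetical): a 13 Hz response plus noise.
rng = np.random.default_rng(0)
fs, seconds = 250, 2
t = np.arange(fs * seconds) / fs
eeg = np.column_stack([np.sin(2 * np.pi * 13 * t + phase) for phase in (0.0, 0.5, 1.0)])
eeg = eeg + rng.normal(scale=1.0, size=eeg.shape)

detected, scores = detect_ssvep(eeg, fs)
print(detected, scores)
```

In a classifier-based setup such as the one the abstract describes, the per-frequency correlation scores would more plausibly serve as input features than as a direct decision rule.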

engineering, prototype, wheelchair, (15 more...)

2106.11008

Country:

North America > United States > Oregon > Lane County > Eugene (0.04)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.35)

Deng, Jianyuan, Yang, Zhibo, Samaras, Dimitris, Wang, Fusheng

Artificial Intelligence in Drug Discovery: Applications and Techniques

Artificial intelligence (AI) has been transforming the practice of drug discovery in the past decade. Various AI techniques have been used in a wide range of applications, such as virtual screening and drug design. In this perspective, we first give an overview of drug discovery and discuss related applications, which can be reduced to two major tasks, i.e., molecular property prediction and molecule generation. We then discuss common data resources, molecule representations and benchmark platforms. Furthermore, to summarize the progress in AI-driven drug discovery, we present the relevant AI techniques including model architectures and learning paradigms in the surveyed papers. We expect that the perspective will serve as a guide for researchers who are interested in working at the intersection of artificial intelligence and drug discovery. We also provide a GitHub repository (https://github.com/dengjianyuan/Survey_AI_Drug_Discovery) with the collection of papers and code, where applicable, as a learning resource, which will be regularly updated.

arxiv preprint arxiv, drug discovery, molecule, (11 more...)

2106.05386

Country: North America > United States > New York > Suffolk County > Stony Brook (0.04)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.92)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.67)
Health & Medicine > Therapeutic Area > Immunology (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Amaral, Orlando, Abualhaija, Sallam, Torre, Damiano, Sabetzadeh, Mehrdad, Briand, Lionel C.

AI-enabled Automation for Completeness Checking of Privacy Policies

Technological advances in information sharing have raised concerns about data protection. Privacy policies contain privacy-related requirements about how the personal data of individuals will be handled by an organization or a software system (e.g., a web service or an app). In Europe, privacy policies are subject to compliance with the General Data Protection Regulation (GDPR). A prerequisite for GDPR compliance checking is to verify whether the content of a privacy policy is complete according to the provisions of GDPR. Incomplete privacy policies might result in large fines for violating organizations as well as incomplete privacy-related software specifications. Manual completeness checking is both time-consuming and error-prone. In this paper, we propose AI-based automation for the completeness checking of privacy policies. Through systematic qualitative methods, we first build two artifacts to characterize the privacy-related provisions of GDPR, namely a conceptual model and a set of completeness criteria. Then, we develop an automated solution on top of these artifacts by leveraging a combination of natural language processing and supervised machine learning. Specifically, we identify the GDPR-relevant information content in privacy policies and subsequently check them against the completeness criteria. To evaluate our approach, we collected 234 real privacy policies from the fund industry. Over a set of 48 unseen privacy policies, our approach detected 300 of the total of 334 violations of some completeness criteria correctly, while producing 23 false positives. The approach thus has a precision of 92.9% and recall of 89.8%. Compared to a baseline that applies keyword search only, our approach results in an improvement of 24.5% in precision and 38% in recall.
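The precision and recall quoted in the abstract follow directly from the reported counts (300 correctly detected violations, 23 false positives, 334 violations in total). The short check below reproduces that arithmetic; the function name is only illustrative.

```python
def precision_recall(true_positives, false_positives, total_actual):
    """Precision = TP / (TP + FP); recall = TP / (all actual violations)."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / total_actual
    return precision, recall

# Counts as reported in the abstract.
precision, recall = precision_recall(true_positives=300, false_positives=23, total_actual=334)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
# prints "precision = 92.9%, recall = 89.8%"
```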

metadata type, personal data, privacy policy, (15 more...)

2106.05688

Country:

Europe > Jersey (0.14)
South America > Argentina (0.04)
Oceania > New Zealand (0.04)
(15 more...)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

A comprehensive solution to retrieval-based chatbot construction

Moore, Kristen, Zhong, Shenjun, He, Zhen, Rudolf, Torsten, Fisher, Nils, Victor, Brandon, Jindal, Neha

In this paper we present the results of our experiments in training and deploying a self-supervised retrieval-based chatbot trained with contrastive learning for assisting customer support agents. In contrast to most existing research papers in this area where the focus is on solving just one component of a deployable chatbot, we present an end-to-end set of solutions to take the reader from unlabelled chatlogs to a deployed chatbot. This set of solutions includes creating a self-supervised dataset and a weakly labelled dataset from chatlogs, as well as a systematic approach to selecting a fixed list of canned responses. We present a hierarchical-based RNN architecture for the response selection model, chosen for its ability to cache intermediate utterance embeddings, which helped to meet deployment inference speed requirements. We compare the performance of this architecture across 3 different learning objectives: self-supervised contrastive learning, binary classification, and multi-class classification. We find that using a self-supervised contrastive learning model outperforms training the binary and multi-class classification models on a weakly labelled dataset. Our results validate that the self-supervised contrastive learning approach can be effectively used for a real-world chatbot scenario.

canned response, dataset, utterance, (13 more...)

2106.06139

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(5 more...)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation

Zhu, Wanrong, Wang, Xin Eric, Yan, An, Eckstein, Miguel, Wang, William Yang

Automatic evaluations for natural language generation (NLG) conventionally rely on token-level or embedding-level comparisons with the text references. This is different from human language processing, for which visual imaginations often improve comprehension. In this work, we propose ImaginE, an imagination-based automatic evaluation metric for natural language generation. With the help of CLIP and DALL-E, two cross-modal models pre-trained on large-scale image-text pairs, we automatically generate an image as the embodied imagination for the text snippet and compute the imagination similarity using contextual embeddings. Experiments spanning several text generation tasks demonstrate that adding imagination with our ImaginE displays great potential in introducing multi-modal information into NLG evaluation, and improves existing automatic metrics' correlations with human similarity judgments in many circumstances.

correlation, imagination, similarity, (10 more...)

2106.0597

Country:

Asia > Nepal (0.14)
Asia > Indonesia (0.05)
Asia > Singapore (0.04)
(13 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment > Sports (1.00)
Government (1.00)
Consumer Products & Services > Restaurants (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Celis, L. Elisa, Mehrotra, Anay, Vishnoi, Nisheeth K.

Fair Classification with Adversarial Perturbations

arXiv.org Machine Learning, Jun-10-2021

We study fair classification in the presence of an omniscient adversary that, given an $\eta$, is allowed to choose an arbitrary $\eta$-fraction of the training samples and arbitrarily perturb their protected attributes. The motivation comes from settings in which protected attributes can be incorrect due to strategic misreporting, malicious actors, or errors in imputation; and prior approaches that make stochastic or independence assumptions on errors may not satisfy their guarantees in this adversarial setting. Our main contribution is an optimization framework to learn fair classifiers in this adversarial setting that comes with provable guarantees on accuracy and fairness. Our framework works with multiple and non-binary protected attributes, is designed for the large class of linear-fractional fairness metrics, and can also handle perturbations besides protected attributes. We prove near-tightness of our framework's guarantees for natural hypothesis classes: no algorithm can have significantly better accuracy and any algorithm with better fairness must have lower accuracy. Empirically, we evaluate the classifiers produced by our framework for statistical rate on real-world and synthetic datasets for a family of adversaries.
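To make the threat model above concrete, the sketch below computes statistical rate on a tiny toy dataset and shows how relabelling the protected attribute of an η-fraction of samples can change the measured value. It assumes one common definition of statistical rate, namely the ratio of the lowest to the highest positive-prediction rate across groups. The data, the choice of η = 0.25, and the particular perturbation are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def statistical_rate(predictions, groups):
    """Ratio of the lowest to the highest positive-prediction rate across groups."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for y_hat, z in zip(predictions, groups):
        totals[z] += 1
        positives[z] += y_hat
    rates = [positives[z] / totals[z] for z in totals]
    highest = max(rates)
    # Convention assumed here: if no group receives positives, report 1.0.
    return min(rates) / highest if highest > 0 else 1.0

# Hypothetical classifier outputs and true protected attributes.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
true_rate = statistical_rate(predictions, groups)  # group a: 3/4, group b: 1/4

# With eta = 0.25, an adversary may alter the protected attribute of 2 of the 8 samples.
perturbed = list(groups)
perturbed[0] = "b"  # a positively classified sample now appears to belong to group b
perturbed[5] = "a"  # a negatively classified sample now appears to belong to group a
observed_rate = statistical_rate(predictions, perturbed)  # both groups: 2/4

print(f"true = {true_rate:.3f}, observed = {observed_rate:.3f}")
# prints "true = 0.333, observed = 1.000"
```

In this toy case the perturbed attributes make the classifier look perfectly balanced although its statistical rate on the true attributes is only 1/3, which is the kind of gap that guarantees stated under adversarial perturbations are meant to bound.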

classifier, equation, lemma 6, (15 more...)

2106.05964

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)