AITopics

1910.14356

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Government (0.67)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

#artificialintelligence Oct-30-2019, 14:34:18 GMT

Transforming AML with AI - Feedzai

Current anti-money laundering solutions rely on techniques that generate excessive false positive rates which require burdensome manual reviews. Legacy money laundering solutions cannot keep pace with the increasingly sophisticated layering schemes, as well as growing compliance requirements from regulators. Can your bank/financial institution successfully navigate the growing regulatory demands? Is your bank equipped with the newest digital and AI based tools to stay ahead of the new global trends in money laundering?

feedzai, transforming aml

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Calafiore, Giuseppe Carlo, Morales, Marisa Hillary, Tiozzo, Vittorio, Marquie, Serge

A Classifiers Voting Model for Exit Prediction of Privately Held Companies

The difficulty of the problem stems from the lack of reliable, quantitative and publicly available data. In this paper, we contribute to this endeavour by constructing an exit predictor model based on qualitative data, which blends the outcomes of three classifiers, namely, a Logistic Regression model, a Random Forest model, and a Support Vector Machine model. The output of the combined model is selected on the basis of the majority of the output classes of the component models. The models are trained using data extracted from the Thomson Reuters Eikon repository of 54697 US and European companies over the 1996-2011 time span. Experiments have been conducted for predicting whether the company eventually either gets acquired or goes public (IPO), against the complementary event that it remains private or goes bankrupt, in the considered time window. Our model achieves a 63% predictive accuracy, which is quite a valuable figure for Private Equity investors, who typically expect very high returns from successful investments.
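The majority-vote combination described in the abstract can be illustrated with a minimal sketch. The function names and the prediction lists below are hypothetical and are not taken from the paper; the three lists merely stand in for the outputs of the Logistic Regression, Random Forest, and Support Vector Machine models.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the class predicted by most component models for one sample."""
    # Counter.most_common(1) returns [(label, count)] for the top label.
    return Counter(predictions).most_common(1)[0][0]

def combine(model_outputs):
    """Combine per-model prediction lists into one list of voted labels."""
    # zip(*...) regroups the outputs so each tuple holds one sample's votes.
    return [majority_vote(votes) for votes in zip(*model_outputs)]

# Hypothetical outputs of three classifiers on four companies
# (1 = exit via acquisition or IPO, 0 = stays private or goes bankrupt).
logistic_regression = [1, 0, 1, 0]
random_forest = [1, 1, 0, 0]
svm = [0, 1, 1, 0]

voted = combine([logistic_regression, random_forest, svm])
print(voted)
```

With three binary voters a tie cannot occur, which is one practical reason to combine an odd number of classifiers.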

algorithm, investment, ipo, (14 more...)

1910.13969

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.36)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Sobolev Independence Criterion

Mroueh, Youssef, Sercu, Tom, Rigotti, Mattia, Padhi, Inkit, Santos, Cicero Dos

We propose the Sobolev Independence Criterion (SIC), an interpretable dependency measure between a high dimensional random variable X and a response variable Y. SIC decomposes to the sum of feature importance scores and hence can be used for nonlinear feature selection. SIC can be seen as a gradient regularized Integral Probability Metric (IPM) between the joint distribution of the two random variables and the product of their marginals. We use sparsity inducing gradient penalties to promote input sparsity of the critic of the IPM. In the kernel version we show that SIC can be cast as a convex optimization problem by introducing auxiliary variables that play an important role in feature selection as they are normalized feature importance scores. We then present a neural version of SIC where the critic is parameterized as a homogeneous neural network, improving its representation power as well as its interpretability. We conduct experiments validating SIC for feature selection in synthetic and real-world experiments. We show that SIC enables reliable and interpretable discoveries, when used in conjunction with the holdout randomization test and knockoffs to control the False Discovery Rate. Code is available at http://github.com/ibm/sic.
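For orientation, the generic (unregularized) Integral Probability Metric between a joint distribution and the product of its marginals can be written as below. The symbols are assumptions chosen for illustration: \mathcal{F} is the class of critics, f is a critic, and p_{xy}, p_x, p_y denote the joint and marginal distributions. The sparsity-inducing gradient penalty that turns this quantity into SIC is defined in the paper and is not reproduced here.

```latex
\mathrm{IPM}_{\mathcal{F}}\left(p_{xy},\, p_x p_y\right)
  = \sup_{f \in \mathcal{F}}
    \; \mathbb{E}_{(x,y) \sim p_{xy}}\left[ f(x,y) \right]
    - \mathbb{E}_{x \sim p_x,\; y \sim p_y}\left[ f(x,y) \right]
```

The quantity is zero when X and Y are independent, since the two expectations then coincide for every critic.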

dataset, feature 100, sic, (15 more...)

1910.14212

Country:

North America > United States > Massachusetts > Plymouth County > Hanover (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Balashankar, Ananth, Lees, Alyssa, Welty, Chris, Subramanian, Lakshminarayanan

What is Fair? Exploring Pareto-Efficiency for Fairness Constrained Classifiers

The potential for learned models to amplify existing societal biases has been broadly recognized. Fairness-aware classifier constraints, which apply equality metrics of performance across subgroups defined on sensitive attributes such as race and gender, seek to rectify inequity but can yield non-uniform degradation in performance for skewed datasets. In certain domains, imbalanced degradation of performance can yield another form of unintentional bias. In the spirit of constructing fairness-aware algorithms as a societal imperative, we explore an alternative: Pareto-Efficient Fairness (PEF). Theoretically, we prove that PEF identifies the operating point on the Pareto curve of subgroup performances closest to the fairness hyperplane, maximizing multiple subgroup accuracy. Empirically, we demonstrate that PEF outperforms strict fairness constraints on several UCI datasets by achieving Pareto levels of accuracy for all subgroups.
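The idea of picking the Pareto-optimal operating point closest to the fairness hyperplane can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the fairness condition is taken to be equal accuracy across subgroups, distance is Euclidean, and the candidate operating points and all function names are hypothetical.

```python
import math

def dominates(a, b):
    """True if a is at least as accurate as b for every subgroup and better for one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the operating points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def distance_to_equal_accuracy(p):
    """Euclidean distance from p to the line where all subgroup accuracies are equal."""
    mean = sum(p) / len(p)
    return math.sqrt(sum((x - mean) ** 2 for x in p))

def pareto_efficient_point(points):
    """Pick the Pareto-optimal point closest to the equal-accuracy line."""
    return min(pareto_front(points), key=distance_to_equal_accuracy)

# Hypothetical (subgroup A accuracy, subgroup B accuracy) per candidate classifier.
candidates = [(0.90, 0.60), (0.85, 0.75), (0.82, 0.80), (0.70, 0.82), (0.78, 0.78)]

best = pareto_efficient_point(candidates)
print(best)
```

In this toy data the strictly fair candidate (0.78, 0.78) is dominated by (0.82, 0.80), so enforcing exact equality would cost both subgroups accuracy, which is the situation the abstract describes.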

accuracy, operating point, subgroup, (16 more...)

1910.1412

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
Europe > Switzerland > Geneva > Geneva (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Ibáñez-Berganza, Miguel, Lancia, Gian Luca, Amico, Ambra, Monechi, Bernardo, Loreto, Vittorio

Unsupervised inference approach to facial attractiveness

The perception of facial beauty is a complex phenomenon depending on many detailed and global facial features influencing each other. In the machine learning community this problem is typically tackled as a problem of supervised inference. However, it has been conjectured that this approach does not capture the complexity of the phenomenon. A recent original experiment (Ibáñez-Berganza et al., Scientific Reports 9, 8364, 2019) allowed different human subjects to navigate the face-space and "sculpt" their preferred modification of a reference facial portrait. Here we present an unsupervised inference study of the set of sculpted facial vectors in that experiment. We first infer minimal, interpretable, and faithful probabilistic models (through Maximum Entropy and artificial neural networks) of the preferred facial variations, that capture the origin of the observed inter-subject diversity in the sculpted faces. The application of such generative models to the supervised classification of the gender of the sculpting subjects reveals an astonishingly high prediction accuracy. This result suggests that much relevant information regarding the subjects may influence (and be elicited from) her/his facial preference criteria, in agreement with the multiple motive theory of attractiveness proposed in previous works.

constraint, correlation, interaction, (17 more...)

1910.14072

Country:

Europe > Austria > Vienna (0.14)
Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.66)

Bakker, Michiel A., Tu, Duy Patrick, Valdés, Humberto Riverón, Gummadi, Krishna P., Varshney, Kush R., Weller, Adrian, Pentland, Alex

DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning

We introduce a framework for dynamic adversarial discovery of information (DADI), motivated by a scenario where information (a feature set) is used by third parties with unknown objectives. We train a reinforcement learning agent to sequentially acquire a subset of the information while balancing accuracy and fairness of predictors downstream. Based on the set of already acquired features, the agent decides dynamically to either collect more information from the set of available features or to stop and predict using the information that is currently available. Building on previous work exploring adversarial representation learning, we attain group fairness (demographic parity) by rewarding the agent with the adversary's loss, computed over the final feature set. Importantly, however, the framework provides a more general starting point for fair or private dynamic information discovery. Finally, we demonstrate empirically, using two real-world datasets, that we can trade off fairness and predictive performance.
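Demographic parity, the group fairness notion named in the abstract, asks that the rate of positive predictions be the same across sensitive groups. A minimal sketch of measuring the violation is given below; the function names and the toy predictions are hypothetical, and the sketch does not reproduce the adversarial reward used in DADI.

```python
def positive_rate(predictions, groups, group):
    """Fraction of positive predictions among members of one group."""
    members = [p for p, g in zip(predictions, groups) if g == group]
    return sum(members) / len(members)

def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = [positive_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)

# Hypothetical binary predictions and sensitive-group labels.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_gap(predictions, groups)
print(gap)
```

A gap of zero means the predictor satisfies demographic parity exactly on this sample; larger values indicate a stronger dependence of the prediction on group membership.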

information, learning, representation, (15 more...)

1910.13983

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > Mexico (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Zhao, Xilei, Zhou, Zhengze, Yan, Xiang, Van Hentenryck, Pascal

Distilling Black-Box Travel Mode Choice Model for Behavioral Interpretation

Machine learning has proved to be very successful for making predictions in travel behavior modeling. However, most machine-learning models have complex model structures and offer little or no explanation as to how they arrive at these predictions. Interpretations about travel behavior models are essential for decision makers to understand travelers' preferences and plan policy interventions accordingly. Therefore, this paper proposes to apply and extend the model distillation approach, a model-agnostic machine-learning interpretation method, to explain how a black-box travel mode choice model makes predictions for the entire population and subpopulations of interest. Model distillation aims at compressing knowledge from a complex model (teacher) into an understandable and interpretable model (student). In particular, the paper integrates model distillation with market segmentation to generate more insights by accounting for heterogeneity. Furthermore, the paper provides a comprehensive comparison of student models with the benchmark model (decision tree) and the teacher model (gradient boosting trees) to quantify the fidelity and accuracy of the students' interpretations.
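The two quantities used to assess a distilled student model can be sketched in a few lines, assuming fidelity means agreement with the teacher's predictions and accuracy means agreement with the true labels. The label lists and the function name are hypothetical and serve only to make the distinction concrete.

```python
def agreement(a, b):
    """Fraction of positions where two label sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Hypothetical travel-mode labels for six trips.
truth = ["car", "bus", "car", "walk", "bus", "car"]
teacher = ["car", "bus", "car", "walk", "car", "car"]  # black-box model
student = ["car", "bus", "car", "bus", "car", "car"]   # interpretable model

fidelity = agreement(student, teacher)  # how well the student mimics the teacher
accuracy = agreement(student, truth)    # how well the student matches reality
print(fidelity, accuracy)
```

A student can have high fidelity but lower accuracy when it faithfully copies the teacher's mistakes, which is why both numbers are worth reporting.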

decision tree, student model, teacher model, (13 more...)

1910.1393

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.28)
North America > United States > Florida > Alachua County > Gainesville (0.14)
North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Air (0.64)
Transportation > Infrastructure & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

arXiv.org Machine Learning Oct-29-2019

Learning Without Loss

Elser, Veit

We explore a new approach for training neural networks where all loss functions are replaced by hard constraints. The same approach is very successful in phase retrieval, where signals are reconstructed from magnitude constraints and general characteristics (sparsity, support, etc.). Instead of taking gradient steps, the optimizer in the constraint based approach, called relaxed-reflect-reflect (RRR), derives its steps from projections to local constraints. In neural networks one such projection makes the minimal modification to the inputs x, the associated weights w, and the pre-activation value y at each neuron, to satisfy the equation x · w = y. These projections, along with a host of other local projections (constraining pre- and post-activations, etc.) can be partitioned into two sets such that all the projections in each set can be applied concurrently -- across the network and across all data in the training batch. This partitioning into two sets is analogous to the situation in phase retrieval and the setting for which the general purpose RRR optimizer was designed. Owing to the novelty of the method, this paper also serves as a self-contained tutorial. Starting with a single-layer network that performs nonnegative matrix factorization, and concluding with a generative model comprising an autoencoder and classifier, all applications and their implementations by projections are described in complete detail. Although the new approach has the potential to extend the scope of neural networks (e.g. by defining activation not through functions but constraint sets), most of the featured models are standard to allow comparison with stochastic gradient descent.
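To give a feel for a projection-based update, the sketch below applies the relaxed-reflect-reflect iteration, in the form it is commonly written, p + beta * (P_B(2 P_A(p) - p) - P_A(p)), to two toy constraint sets. Everything here is an assumption made for illustration: the sets A and B are two lines in the plane rather than the neural-network constraints of the paper, and the function names, starting point, and beta value are hypothetical.

```python
def project_hyperplane(p, n, c):
    """Project point p onto the set {q : n . q = c}."""
    dot = sum(ni * pi for ni, pi in zip(n, p))
    norm2 = sum(ni * ni for ni in n)
    scale = (dot - c) / norm2
    return tuple(pi - scale * ni for pi, ni in zip(p, n))

def project_a(p):
    # Toy constraint set A: the line y = 0.
    return project_hyperplane(p, (0.0, 1.0), 0.0)

def project_b(p):
    # Toy constraint set B: the line x - y = 1.
    return project_hyperplane(p, (1.0, -1.0), 1.0)

def rrr_step(p, beta):
    """One update: p + beta * (P_B(2 P_A(p) - p) - P_A(p))."""
    pa = project_a(p)
    reflected = tuple(2 * a - x for a, x in zip(pa, p))
    pb = project_b(reflected)
    return tuple(x + beta * (b - a) for x, a, b in zip(p, pa, pb))

point = (5.0, 3.0)  # arbitrary starting point
for _ in range(500):
    point = rrr_step(point, beta=0.5)

# At a fixed point, projecting onto A yields a point that satisfies both constraints.
solution = project_a(point)
print(solution)
```

The two lines intersect at (1, 0), so the printed point should lie very close to it. No gradient is computed anywhere: each step is driven only by the two projections.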

algorithm, constraint, projection, (17 more...)

1911.00493

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)

Pang, Guansong, Hengel, Anton van den, Shen, Chunhua

Weakly-supervised Deep Anomaly Detection with Pairwise Relation Learning

arXiv.org Machine Learning Oct-29-2019

This paper studies a rarely explored but critical anomaly detection problem: weakly-supervised anomaly detection with limited labeled anomalies and a large unlabeled data set. This problem is very important because it (i) enables anomaly-informed modeling, which helps identify anomalies of interest and address the notoriously high false positives in unsupervised anomaly detection, and (ii) eliminates the reliance on large-scale and complete labeled anomaly data in fully-supervised settings. However, the problem is especially challenging since we have only limited labeled data for a single class, and moreover, the seen anomalies often cannot cover all types of anomalies (i.e., unseen anomalies). We address this problem by formulating it as a pairwise relation learning task. In particular, our approach defines a two-stream ordinal regression network to learn the relation of randomly selected instance pairs, i.e., whether the instance pair contains labeled anomalies or just unlabeled data instances. The resulting model leverages both the labeled and unlabeled data to effectively augment the data and learn generalized representations of both normality and abnormality. Extensive empirical results show that our approach (i) significantly outperforms state-of-the-art competing methods in detecting both seen and unseen anomalies and (ii) is substantially more data-efficient. Introduction: Anomaly detection aims at identifying exceptional data instances that have a significant deviation from the majority of data instances, which can offer important insights into broad applications, such as identifying fraudulent transactions or insider trading, detecting network intrusions, and early detection of diseases.
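The pair construction behind the pairwise relation learning task can be sketched as follows. This is an illustration under loudly stated assumptions: the three pair types, the ordinal target values, the equal-probability choice of pair type, and all names are hypothetical, since the abstract does not specify them. Integers stand in for data instances.

```python
import random

# Hypothetical ordinal targets; the abstract does not give the actual values.
PAIR_TARGETS = {
    "anomaly-anomaly": 2.0,
    "anomaly-unlabeled": 1.0,
    "unlabeled-unlabeled": 0.0,
}

def sample_pair(anomalies, unlabeled, rng):
    """Draw one random instance pair and its ordinal target."""
    # Assumption: each pair type is chosen with equal probability.
    kind = rng.choice(sorted(PAIR_TARGETS))
    if kind == "anomaly-anomaly":
        a, b = rng.sample(anomalies, 2)
    elif kind == "anomaly-unlabeled":
        a, b = rng.choice(anomalies), rng.choice(unlabeled)
    else:
        a, b = rng.sample(unlabeled, 2)
    return (a, b), PAIR_TARGETS[kind]

anomalies = [100, 101, 102]   # a few labeled anomalies
unlabeled = list(range(20))   # a larger unlabeled set

rng = random.Random(0)        # fixed seed for reproducibility
pairs = [sample_pair(anomalies, unlabeled, rng) for _ in range(50)]
print(pairs[:3])
```

Because pairs are drawn combinatorially, even a handful of labeled anomalies yields many distinct training examples, which is one way to read the abstract's remark that the approach augments the data.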

anomaly, anomaly score, detection, (16 more...)

1910.13601

Country:

North America > United States (0.14)
Oceania > Australia > South Australia > Adelaide (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Law Enforcement & Public Safety (0.88)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)