 Accuracy


Putting Fairness Principles into Practice: Challenges, Metrics, and Improvements

arXiv.org Machine Learning

As more researchers have become aware of and passionate about algorithmic fairness, there has been an explosion in papers laying out new metrics, suggesting algorithms to address issues, and calling attention to issues in existing applications of machine learning. This research has greatly expanded our understanding of the concerns and challenges in deploying machine learning, but there has been much less work in seeing how the rubber meets the road. In this paper we provide a case study on the application of fairness in machine learning research to a production classification system, and offer new insights into how to measure and address algorithmic fairness issues. We discuss open questions in implementing equality of opportunity and describe our fairness metric, conditional equality, that takes into account distributional differences. Further, we provide a new approach to improve on the fairness metric during model training and demonstrate its efficacy in improving performance for a real-world product.
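To make "equality of opportunity" concrete, the sketch below measures the gap in true positive rate between groups, which is the standard reading of that criterion. It is an illustration only: it is not the paper's conditional equality metric, and the function names and toy labels are assumptions introduced here.

```python
from collections import defaultdict


def true_positive_rate_by_group(y_true, y_pred, groups):
    """Return {group: TPR}, computed only over examples whose true label is 1."""
    positives = defaultdict(int)  # examples with true label 1, per group
    hits = defaultdict(int)       # of those, how many were predicted 1
    for label, pred, group in zip(y_true, y_pred, groups):
        if label == 1:
            positives[group] += 1
            hits[group] += int(pred == 1)
    return {g: hits[g] / positives[g] for g in positives}


def equal_opportunity_gap(y_true, y_pred, groups):
    """Largest difference in TPR between any two groups (0 means parity)."""
    rates = true_positive_rate_by_group(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Toy data (made up for illustration): group "a" has TPR 3/4, group "b" has TPR 1/2.
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]
groups = ["a", "a", "a", "a", "a", "b", "b", "b"]
gap = equal_opportunity_gap(y_true, y_pred, groups)
print(gap)  # 0.25
```

A metric that accounts for distributional differences, as the abstract describes, would additionally condition on some attribute before comparing groups; that step is not shown here.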


Personalized Colorectal Cancer Survivability Prediction with Machine Learning Methods

arXiv.org Machine Learning

In this work, we investigate the importance of ethnicity in colorectal cancer survivability prediction using machine learning techniques and the SEER cancer incidence database. We compare model performance for 2-year survivability prediction and feature importance rankings between Hispanic, White, and mixed patient populations. Our models consistently perform better on single-ethnicity populations and provide different feature importance rankings when trained on different populations. Additionally, we show that our models achieve a higher Area Under Curve (AUC) score than the best reported in the literature. We also apply imbalanced classification techniques to improve classification performance when the number of patients who have survived colorectal cancer is much larger than the number who have not. These results provide evidence in favor of increased consideration of patient ethnicity in cancer survivability prediction, and of more personalized medicine in general.
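The AUC reported above can be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, which is why it stays meaningful when one class is much larger than the other. The following is a minimal sketch of that pairwise definition; the function name and the toy scores are assumptions for illustration, not values from the study.

```python
def auc_score(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Imbalanced toy data (made up): four positives, two negatives.
labels = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.5, 0.2]
print(auc_score(labels, scores))  # 0.875
```

The double loop is quadratic in the number of examples; it is written this way for clarity, not for use on a dataset of SEER's size.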


Enhancing Explainability of Neural Networks through Architecture Constraints

arXiv.org Machine Learning

Prediction accuracy and model explainability are the two most important objectives when developing machine learning algorithms to solve real-world problems. Neural networks are known to possess good prediction performance, but they lack sufficient model explainability. In this paper, we propose to enhance the explainability of neural networks through the following architecture constraints: a) sparse additive subnetworks; b) orthogonal projection pursuit; and c) smooth function approximation. This leads to a sparse, orthogonal and smooth explainable neural network (SOSxNN). The multiple parameters in the SOSxNN model are simultaneously estimated by a modified mini-batch gradient descent algorithm based on the backpropagation technique for calculating the derivatives and the Cayley transform for preserving the projection orthogonality. The hyperparameters controlling the sparse and smooth constraints are optimized by grid search. Through simulation studies, we compare the SOSxNN method to several benchmark methods including the least absolute shrinkage and selection operator, support vector machine, random forest, and multi-layer perceptron. It is shown that the proposed model keeps the flexibility of pursuing prediction accuracy while attaining improved interpretability, and can therefore be used as a promising surrogate model for complex model approximation. Finally, a real data example from the Lending Club is employed as a showcase of the SOSxNN application.
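The property that makes the Cayley transform useful for preserving orthogonality is that it maps any skew-symmetric matrix A to an orthogonal matrix Q = (I - A)^{-1}(I + A). The sketch below checks this in the 2x2 case using only the standard library. It illustrates the underlying property and is not the paper's training update; the helper names are assumptions introduced here.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(x):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*x)]


def cayley_2x2(a):
    """Cayley transform of the skew-symmetric matrix A = [[0, a], [-a, 0]].

    Returns Q = (I - A)^{-1} (I + A), which is orthogonal for every real a.
    """
    i_plus_a = [[1.0, a], [-a, 1.0]]
    i_minus_a = [[1.0, -a], [a, 1.0]]
    det = 1.0 + a * a  # determinant of (I - A); always positive
    # Closed-form inverse of a 2x2 matrix [[p, q], [r, s]]: (1/det) [[s, -q], [-r, p]].
    inv = [[i_minus_a[1][1] / det, -i_minus_a[0][1] / det],
           [-i_minus_a[1][0] / det, i_minus_a[0][0] / det]]
    return matmul(inv, i_plus_a)


q = cayley_2x2(0.5)
qtq = matmul(transpose(q), q)  # should be the identity up to rounding
print(q)
print(qtq)
```

For a = 0.5 the result is approximately the rotation [[0.6, 0.8], [-0.8, 0.6]], and Q^T Q is the identity up to floating-point rounding.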


It's AI versus the hackers - Enterprise & Hybrid Cloud Services

#artificialintelligence

Google now checks for security breaches even after a user has logged in. Last year, Microsoft Corp's Azure security team detected suspicious activity in the cloud computing usage of a large retailer: One of the company's administrators, who usually logs on from New York, was trying to gain entry from Romania. A hacker had broken in. Microsoft quickly alerted its customer, and the attack was foiled before the intruder got too far. Microsoft, Google, Amazon and various start-ups are moving away from solely using older "rules-based" technology designed to respond to specific kinds of intrusion and deploying machine-learning algorithms that crunch massive amounts of data on logins, behaviour and previous attacks to ferret out and stop hackers.


The Deeper, the Better: Analysis of Person Attributes Recognition

arXiv.org Artificial Intelligence

In person attributes recognition, we describe a person in terms of their appearance. Typically, this includes a wide range of traits including age, gender, clothing, and footwear. Although this could be used in a wide variety of scenarios, it generally is applied to video surveillance, where attribute recognition is impacted by low resolution, and other issues such as variable pose, occlusion and shadow. Recent approaches have used deep convolutional neural networks (CNNs) to improve the accuracy in person attribute recognition. However, many of these networks are relatively shallow and it is unclear to what extent they use contextual cues to improve classification accuracy. In this paper, we propose deeper methods for person attribute recognition. Interpreting the reasons behind the classification is highly important, as it can provide insight into how the classifier is making decisions. Interpretation suggests that deeper networks generally take more contextual information into consideration, which helps improve classification accuracy and generalizability. We present experimental analysis and results for whole body attributes using the PA-100K and PETA datasets and facial attributes using the CelebA dataset.


A mixed model approach to drought prediction using artificial neural networks: Case of an operational drought monitoring environment

arXiv.org Machine Learning

Droughts, with their increasing frequency of occurrence, continue to negatively affect livelihoods and elements at risk. For example, the 2011 drought in East Africa caused massive losses, documented to have cost the Kenyan economy over $12bn. Given this, the demand for ex-ante drought monitoring systems is ever-increasing. The study uses 10 precipitation and vegetation variables that are lagged over 1, 2 and 3-month time-steps to predict drought situations. In the model space search for the most predictive artificial neural network (ANN) model, as opposed to the traditional greedy search for the most predictive variables, we use the Generalized Additive Model (GAM) approach. Together with a set of assumptions, we thereby reduce the cardinality of the space of models. Even though we build a total of 102 GAM models, only 21 have R² greater than 0.7 and are thus subjected to the ANN process. The ANN process itself uses a brute-force approach that automatically partitions the training data into 10 sub-samples, builds the ANN models in these samples and evaluates their performance using multiple metrics. The results show the superiority of the 1-month lag of the variables as compared to longer time lags of 2 and 3 months. The champion ANN model recorded an R² of 0.78 in model testing using the out-of-sample data. This illustrates its ability to be a good predictor of drought situations 1 month ahead. Investigated as a classifier, the champion has a modest accuracy of 66% and a multi-class area under the ROC curve (AUROC) of 89.99%.
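Lagging a variable over 1, 2 and 3 months means pairing each month's target with the values observed one, two and three months earlier. A minimal sketch of that bookkeeping follows; the function name and the index values are made up for illustration and are not data from the study.

```python
def lagged_rows(series, lags=(1, 2, 3)):
    """Pair each value with the values observed `lag` steps earlier.

    Returns a list of (features, target) tuples, where features[i] is the
    value at time t - lags[i]. The first max(lags) points are dropped
    because they have no complete history.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        features = [series[t - lag] for lag in lags]
        rows.append((features, series[t]))
    return rows


# Hypothetical monthly vegetation index values.
monthly_index = [0.31, 0.28, 0.22, 0.35, 0.41, 0.38]
rows = lagged_rows(monthly_index)
for features, target in rows:
    print(features, "->", target)
```

With six months of data and a maximum lag of 3, only the last three months yield complete rows, which is the usual cost of adding longer lags.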


Performance Analysis of Machine Learning Techniques to Predict Diabetes Mellitus

arXiv.org Machine Learning

Diabetes mellitus is a common disease caused by a group of metabolic disorders in which blood sugar levels remain very high over a prolonged period. It affects different organs of the human body and thus harms a large number of the body's systems, in particular the blood vessels and nerves. Early prediction of the disease allows it to be controlled and can save lives. To achieve this goal, this research work mainly explores various risk factors related to the disease using machine learning techniques. Machine learning techniques provide an efficient way to extract knowledge by constructing predictive models from diagnostic medical datasets collected from diabetic patients. Extracting knowledge from such data can be useful for predicting diabetes in patients. In this work, we employ four popular machine learning algorithms, namely Support Vector Machine (SVM), Naive Bayes (NB), K-Nearest Neighbor (KNN) and C4.5 Decision Tree, on adult population data to predict diabetes mellitus. Our experimental results show that the C4.5 decision tree achieved higher accuracy than the other machine learning techniques.


Dirichlet Variational Autoencoder

arXiv.org Machine Learning

This paper proposes the Dirichlet Variational Autoencoder (DirVAE), which uses a Dirichlet prior for a continuous latent variable that exhibits the characteristics of categorical probabilities. To infer the parameters of DirVAE, we utilize the stochastic gradient method by approximating the Gamma distribution, which is a component of the Dirichlet distribution, with the inverse Gamma CDF approximation. Additionally, we reshape the component collapsing issue by investigating two problem sources, decoder weight collapsing and latent value collapsing, and we show that DirVAE has no component collapsing, while the Gaussian VAE exhibits decoder weight collapsing and the Stick-Breaking VAE shows latent value collapsing. The experimental results show that 1) DirVAE models the latent representation with the best log-likelihood compared to the baselines; and 2) DirVAE produces more interpretable latent values with none of the collapsing issues that the baseline models suffer from. Also, we show that the learned latent representation from DirVAE achieves the best classification accuracy in the semi-supervised and supervised classification tasks on MNIST, OMNIGLOT, and SVHN compared to the baseline VAEs. Finally, we demonstrate that DirVAE-augmented topic models perform better in most cases.
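A Dirichlet sample can be built by drawing independent Gamma variables and normalizing them to sum to one, and for small shape α the Gamma inverse CDF is approximately F^{-1}(u; α, β) ≈ (u α Γ(α))^{1/α} / β. The sketch below combines the two using the standard library. It illustrates the sampling idea only; the function names are assumptions, and the paper's actual encoder and training procedure are not shown.

```python
import math
import random


def approx_gamma_sample(alpha, u, beta=1.0):
    """Approximate inverse CDF of Gamma(alpha, rate=beta) evaluated at u.

    Uses (u * alpha * Gamma(alpha)) ** (1 / alpha) / beta, which is most
    accurate when alpha is small.
    """
    return (u * alpha * math.gamma(alpha)) ** (1.0 / alpha) / beta


def approx_dirichlet_sample(alphas, rng):
    """Draw one approximate Dirichlet(alphas) sample by normalizing Gamma draws."""
    # 1.0 - rng.random() lies in (0, 1], which avoids a draw of exactly zero.
    gammas = [approx_gamma_sample(a, 1.0 - rng.random()) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


rng = random.Random(0)  # fixed seed so the run is repeatable
sample = approx_dirichlet_sample([0.5, 0.5, 0.5], rng)
print(sample, sum(sample))
```

Because the sample is a deterministic, differentiable function of the uniform draw u, gradients can flow through it to the shape parameters, which is what makes this form attractive for stochastic gradient training.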


A new generation of artificial intelligence is taking on hackers

#artificialintelligence

Last year, Microsoft's Azure security team detected suspicious activity in the cloud-computing usage of a large retailer: One of the company's administrators, who usually logs on from New York, was trying to gain entry from Romania. A hacker had broken in. Microsoft quickly alerted its customer, and the attack was foiled before the intruder got too far. Microsoft, Google, Amazon and various startups are moving away from solely using older "rules-based" technology designed to respond to specific kinds of intrusion and deploying machine-learning algorithms that crunch massive amounts of data on logins, behavior and previous attacks to ferret out and stop hackers. "Machine learning is a very powerful technique for security -- it's dynamic, while rules-based systems are very rigid," says Dawn Song, a professor at the University of California at Berkeley's Artificial Intelligence Research Lab. "It's a very manual-intensive process to change them, whereas machine learning is automated, dynamic and you can retrain it easily."


Computational Register Analysis and Synthesis

arXiv.org Artificial Intelligence

The study of register in computational language research has historically been divided into register analysis, seeking to determine the registerial character of a text or corpus, and register synthesis, seeking to generate a text in a desired register. This article surveys the different approaches to these disparate tasks. Register synthesis has tended to use more theoretically articulated notions of register and genre than analysis work, which often seeks to categorize on the basis of intuitive and somewhat incoherent notions of prelabeled 'text types'. I argue that an integration of computational register analysis and synthesis will benefit register studies as a whole, by enabling a new large-scale research program in register studies. It will enable comprehensive global mapping of functional language varieties in multiple languages, including the relationships between them. Furthermore, computational methods, together with high-coverage, systematically collected and analyzed data, will enable rigorous empirical validation and refinement of different theories of register, which will also have implications for our understanding of linguistic variation in general.