AITopics

Country: North America > United States (0.28)

Industry:

Education > Educational Setting > K-12 Education > Secondary School (0.62)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.42)

arXiv.org Machine Learning Mar-5-2019

Copying Machine Learning Classifiers

Unceta, Irene, Nin, Jordi, Pujol, Oriol

We study model-agnostic copies of machine learning classifiers. We develop the theory behind the problem of copying, highlighting its differences with that of learning, and propose a framework to copy the functionality of any classifier using no prior knowledge of its parameters or training data distribution. We identify the different sources of loss and provide guidelines on how best to generate synthetic sets for the copying process. We further introduce a set of metrics to evaluate copies in practice. We validate our framework through extensive experiments using data from a series of well-known problems. We demonstrate the value of copies in use cases where desiderata such as interpretability, fairness or productivization constraints need to be addressed. Results show that copies can be exploited to enhance existing solutions and improve them by adding new features and characteristics.
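
The copying idea in this abstract can be illustrated with a rough sketch, under loud assumptions: `black_box`, `make_synthetic_set`, `fit_threshold_copy` and `agreement` are hypothetical names, and the uniform sampling, the one-dimensional threshold copy and the agreement score are illustrative choices, not the authors' framework or metrics.

```python
import random


def black_box(x):
    """Hypothetical stand-in for a classifier we can only query, not inspect."""
    return 1 if x > 0.3 else 0


def make_synthetic_set(oracle, n_points, seed=0):
    """Sample synthetic inputs uniformly in [0, 1) and label them with the oracle."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_points)]
    return xs, [oracle(x) for x in xs]


def fit_threshold_copy(xs, labels):
    """Fit the copy 'predict 1 if x > t' by choosing t with fewest disagreements."""
    best_threshold, best_errors = None, len(xs) + 1
    for t in sorted(xs):
        errors = sum((1 if x > t else 0) != y for x, y in zip(xs, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def agreement(oracle, threshold, points):
    """Fraction of points on which the copy and the oracle give the same label."""
    same = sum((1 if x > threshold else 0) == oracle(x) for x in points)
    return same / len(points)


# The copy never sees training data or parameters, only oracle answers.
xs, labels = make_synthetic_set(black_box, n_points=200)
copy_threshold = fit_threshold_copy(xs, labels)
grid = [i / 100 for i in range(101)]
fidelity = agreement(black_box, copy_threshold, grid)
```

In this toy setting, `fidelity` measures how closely the copy reproduces the oracle's decisions on a held-out grid; the copy could be any model family, which is what makes the approach model-agnostic.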

artificial intelligence, dataset, machine learning, (17 more...)

1903.01879

Country:

Europe (0.46)
North America (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Government (0.92)
Law (0.92)
Information Technology > Security & Privacy (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)

arXiv.org Machine Learning Feb-27-2019

Robust Decision Trees Against Adversarial Examples

Chen, Hongge, Zhang, Huan, Boning, Duane, Hsieh, Cho-Jui

Although adversarial examples and model robustness have been extensively studied in the context of linear models and neural networks, research on this issue in tree-based models and how to make tree-based models robust against adversarial examples is still limited. In this paper, we show that tree-based models are also vulnerable to adversarial examples and develop a novel algorithm to learn robust trees. At its core, our method aims to optimize the performance under the worst-case perturbation of input features, which leads to a max-min saddle point problem. Incorporating this saddle point objective into the decision tree building procedure is non-trivial due to the discrete nature of trees: a naive approach to finding the best split according to this saddle point objective will take exponential time. To make our approach practical and scalable, we propose efficient tree building algorithms by approximating the inner minimizer in this saddle point problem, and present efficient implementations for classical information gain based trees as well as state-of-the-art tree boosting models such as XGBoost. Experimental results on real-world datasets demonstrate that the proposed algorithms can substantially improve the robustness of tree-based models against adversarial examples.
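
To make the worst-case objective concrete, here is a minimal sketch for a single one-dimensional split. It assumes a 0-1 loss, a perturbation of at most `eps` per point, and brute-force evaluation of midpoint thresholds; the function names are hypothetical, and this is not the paper's algorithm, which targets information gain and XGBoost and approximates the inner minimizer.

```python
def worst_case_errors(xs, ys, threshold, eps):
    """Errors of the rule 'predict 1 if x > threshold' when an adversary
    may shift every x by at most eps in either direction."""
    errors = 0
    for x, y in zip(xs, ys):
        can_land_right = x + eps > threshold   # some shifted x is predicted 1
        can_land_left = x - eps <= threshold   # some shifted x is predicted 0
        if (y == 0 and can_land_right) or (y == 1 and can_land_left):
            errors += 1
    return errors


def candidate_thresholds(xs):
    """Midpoints between consecutive distinct feature values."""
    values = sorted(set(xs))
    return [(a + b) / 2 for a, b in zip(values, values[1:])]


def best_threshold(xs, ys, eps):
    """Pick the split that minimises the worst-case error count."""
    return min(candidate_thresholds(xs),
               key=lambda t: worst_case_errors(xs, ys, t, eps))


xs = [1.0, 2.0, 2.75, 3.0, 3.25, 6.0, 7.0]
ys = [0, 0, 0, 0, 1, 1, 1]
natural_split = best_threshold(xs, ys, eps=0.0)   # 3.125: perfect on clean data
robust_split = best_threshold(xs, ys, eps=0.5)    # 4.625: gives up one point
```

On this toy data the clean-data split sits in the narrow gap between the classes, where three points can be pushed across it; the robust split accepts one clean error in exchange for a boundary that small perturbations cannot exploit.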

adversarial example, artificial intelligence, machine learning, (18 more...)

1902.1066

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

#artificialintelligence Feb-26-2019, 15:15:58 GMT

Demystifying Maths of Gradient Boosting – Towards Data Science

Boosting is an ensemble learning technique. Conceptually, these techniques involve: 1. learning base learners; 2. using all of the models to come to a final prediction. Ensemble learning techniques come in different types, which differ in how they implement the learning process for the base learners and how they use the base learners' output to produce the final result. Techniques used in ensemble learning include Bootstrap Aggregation (a.k.a. Bagging). In this article, we briefly discuss Bagging and then move on to Gradient Boosting, which is the focus of this article.
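
The two steps named above (learn base learners, then combine them) can be sketched for gradient boosting as follows. This is a minimal illustration under stated assumptions: squared-error loss, a single numeric feature, and one-split regression stumps as base learners. All function names are hypothetical and are not taken from the article.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump that minimises squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not right:
            continue  # a split must leave points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all x identical: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the residuals, i.e. the negative gradient
    of the squared-error loss, and adds a shrunken copy to the ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    """Final prediction: the base value plus all shrunken stump outputs."""
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, 1, 5, 5, 5]
model = fit_gradient_boosting(xs, ys)
```

Unlike bagging, where base learners are trained independently on resampled data, each stump here depends on the errors left by the ones before it.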

artificial intelligence, base learner, machine learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.95)

arXiv.org Artificial Intelligence Feb-24-2019

Entity Personalized Talent Search Models with Tree Interaction Features

Ozcaglar, Cagri, Geyik, Sahin, Schmitz, Brian, Sharma, Prakhar, Shelkovnykov, Alex, Ma, Yiming, Buchanan, Erik

Talent Search systems aim to recommend potential candidates who are a good match to the hiring needs of a recruiter expressed in terms of the recruiter's search query or job posting. Past work in this domain has focused on linear and nonlinear models which lack preference personalization at the user level because they are trained only with globally collected recruiter activity data. In this paper, we propose an entity-personalized Talent Search model which utilizes a combination of generalized linear mixed (GLMix) models and gradient boosted decision tree (GBDT) models, and provides personalized talent recommendations using nonlinear tree interaction features generated by the GBDT. We also present the offline and online system architecture for the productionization of this hybrid model approach in our Talent Search systems. Finally, we provide offline and online experiment results benchmarking our entity-personalized model with tree interaction features, which demonstrate significant improvements in our precision metrics compared to globally trained non-personalized models.

artificial intelligence, glmix model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3308558.3313672

1902.09041

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)
(2 more...)

#artificialintelligence Feb-21-2019, 10:58:09 GMT

Random Forest Algorithm in Machine Learning

The random forest algorithm is one of the most popular and powerful supervised machine learning algorithms, capable of performing both regression and classification tasks. As the name suggests, this algorithm creates a forest from a number of decision trees. Random Forest Algorithm in Machine Learning: Machine learning is a scientific discipline that explores the construction and study of algorithms that can learn from data. Such algorithms operate by building a model from example inputs and using that to make predictions or decisions, rather than following strictly static program instructions. Machine learning is closely related to and often overlaps with computational statistics, a discipline that also specializes in prediction-making.
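
As a rough illustration of how a forest is built from many trees, the sketch below trains depth-one trees (stumps) on bootstrap samples, gives each a randomly chosen feature, and combines them by majority vote. It is a simplification under stated assumptions: real random forests grow deeper trees and draw a random feature subset at every split. The function names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature):
    """Best single threshold on one feature, scored by misclassification count."""
    fallback = majority(labels)
    best = (len(labels) + 1, None, fallback, fallback)
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        if not right:
            continue  # a split must leave points on both sides
        left_label, right_label = majority(left), majority(right)
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return (feature,) + best[1:]


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    if threshold is None:  # no valid split was found: constant prediction
        return left_label
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Classification by majority vote over all trees."""
    return majority([predict_stump(stump, row) for stump in forest])


rows = ([(i / 10, i / 10) for i in range(10)]
        + [(9 + i / 10, 9 + i / 10) for i in range(10)])
labels = [0] * 10 + [1] * 10
forest = fit_forest(rows, labels)
```

For regression, the same structure applies with the vote replaced by an average of the trees' numeric predictions.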

algorithm, forest algorithm, random forest algorithm, (11 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.56)

Industry: Education > Educational Setting > Online (0.77)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

arXiv.org Machine Learning Feb-20-2019

Stacking with Neural network for Cryptocurrency investment

Barnwal, Avinash, Bharti, Haripad, Ali, Aasim, Singh, Vishal

Predicting the direction of assets has been an active area of study and a difficult task. Machine learning models have been used to build robust models for this task. Ensemble methods are among them, showing better results than a single supervised method. In this paper, we have used generative and discriminative classifiers to create the stack, specifically 3 generative and 9 discriminative classifiers, optimized over a one-layer neural network to model the direction of cryptocurrency prices. The features used are technical indicators, including but not limited to trend, momentum, volume, and volatility indicators; sentiment analysis has also been used to gain useful insight in combination with these features. For cross-validation, purged walk-forward cross-validation has been used. In terms of accuracy, we have done a comparative analysis of the performance of the ensemble method with stacking and the ensemble method with blending. We have also developed a methodology for combined feature importance for the stacked model. Important indicators are also identified based on feature importance.
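
A minimal sketch of the stacking step, assuming the base classifiers have already produced probabilities for the "up" direction: a one-layer network (logistic regression) is trained by gradient descent on those outputs. The three base-model columns, the numbers, and the function names are hypothetical, and the paper's 12 classifiers and purged walk-forward validation are not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(base_outputs, labels, n_steps=2000, learning_rate=0.5):
    """Train a one-layer network on base-model outputs with log loss."""
    n_models = len(base_outputs[0])
    weights = [0.0] * n_models
    bias = 0.0
    n = len(labels)
    for _ in range(n_steps):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_outputs, labels):
            p = sigmoid(bias + sum(w * z for w, z in zip(weights, row)))
            for j, z in enumerate(row):
                grad_w[j] += (p - y) * z / n  # gradient of the mean log loss
            grad_b += (p - y) / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def predict_stack(meta, row):
    """Predict 1 (price up) when the stacked probability is at least 0.5."""
    weights, bias = meta
    score = bias + sum(w * z for w, z in zip(weights, row))
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical predictions from three base classifiers (one column each).
base_outputs = [
    [0.9, 0.8, 0.4],
    [0.8, 0.7, 0.5],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.5],
]
labels = [1, 1, 0, 0]
meta = fit_meta_learner(base_outputs, labels)
stacked = [predict_stack(meta, row) for row in base_outputs]
```

In practice the rows fed to the meta-learner would be out-of-fold predictions, so that the stack is not fitted on outputs the base models produced for their own training data.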

feature importance, indicator, parameter apr-may 2018, (15 more...)

1902.07855

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New Jersey > Essex County > Newark (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.95)
(2 more...)

arXiv.org Machine Learning Feb-20-2019

Inference of a Multi-Domain Machine Learning Model to Predict Mortality in Hospital Stays for Patients with Cancer upon Febrile Neutropenia Onset

Du, Xinsong, Min, Jae, Lemas, Dominick J., Prosperi, Mattia

Febrile neutropenia (FN) has been associated with high mortality, especially among adults with cancer. Understanding the patient- and provider-level heterogeneity in FN hospital admissions has the potential to inform personalized interventions focused on increasing survival of individuals with FN. We leverage machine learning techniques to disentangle the complex interactions among multi-domain risk factors in a population with FN. Data from the Healthcare Cost and Utilization Project (HCUP) National Inpatient Sample and Nationwide Inpatient Sample (NIS) were used to build machine learning based models of mortality for adult cancer patients who were diagnosed with FN during a hospital admission. In particular, the importance of risk factors from different domains (including demographic, clinical, and hospital-associated information) was studied. A set of more interpretable (decision tree, logistic regression) as well as more black-box (random forest, gradient boosting, neural networks) models were analyzed and compared via multiple cross-validation. Our results demonstrate that a linear prediction score of FN mortality among adults with cancer, based on admission information, is effective in classifying high-risk patients; clinical diagnoses are the domain with the highest predictive power. A number of the risk variables (e.g. sepsis, kidney failure, etc.) identified in this study are clinically actionable and may inform future studies; studies looking at the patients' prior medical history are warranted.

diagnosis, mortality, neutropenia, (13 more...)

1902.07839

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Florida > Alachua County > Gainesville (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.88)

arXiv.org Machine Learning Feb-19-2019

Gradient Boosting to Boost the Efficiency of Hydraulic Fracturing

Makhotin, Ivan, Koroteev, Dmitry, Burnaev, Evgeny

Journal of Petroleum Exploration and Production Technology manuscript. In this paper we present a data-driven model for forecasting the production increase after hydraulic fracturing (HF). We use data from fracturing jobs performed at one of the Siberian oilfields. The data includes features characterizing the jobs, as well as geological information. To predict the oil rate after fracturing, a machine learning (ML) technique was applied. The ML-based prediction is compared to a prediction based on the experience of reservoir and production engineers responsible for the HF job planning.

artificial intelligence, gradient boosting, upstream oil & gas, (18 more...)

1902.02223

Country: Europe > Spain (0.15)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.46)

#artificialintelligence Feb-14-2019, 22:13:23 GMT

Improve Machine Learning Results with Ensemble Learning

NOTE: This article assumes that you have a basic understanding of Machine Learning algorithms. Suppose you want to buy a new mobile phone. Will you walk directly to the first shop and purchase a phone based on the shopkeeper's advice? You would visit some of the online mobile seller sites where you can see a variety of mobile phones, their specifications, features, and prices. You may also consider the reviews that people posted on the site. You would probably also ask your friends and colleagues for their opinions.

base learner, ensemble, learner, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.39)