 Decision Tree Learning


Intelligent and Affectively Aligned Evaluation of Online Health Information for Older Adults

AAAI Conferences

Online health resources aimed at older adults can have a significant impact on patient-physician relationships and on health outcomes. High-quality online resources that are delivered in an ethical, emotionally aligned way can increase trust and reduce negative health outcomes such as anxiety. In contrast, low-quality or misaligned resources can lead to harmful consequences such as inappropriate use of health care services and poor health decision-making. This paper investigates mechanisms for ensuring both quality and alignment of online health resources and interventions. First, the recently proposed QUEST evaluation instrument is examined. QUEST assesses the quality of online health information along six validated dimensions (authorship, attribution, conflict of interest, currency, complementarity, tone). A decision tree classifier is learned that is able to predict one criterion of the QUEST tool, complementarity, with an F1-score of 0.9 on a manually annotated dataset of 50 articles giving advice about Alzheimer disease. A social-psychological theory of affective (emotional) alignment is then presented, and demonstrated to gauge older adults' emotional interpretations of eight examples of health recommendation systems related to Alzheimer disease (online memory tests). The paper concludes with a synthesizing view and a vision for the future of this important societal challenge.
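For readers unfamiliar with the metric, the F1-score quoted in this abstract is the harmonic mean of precision and recall. The sketch below shows one way to compute it in plain Python. The function name `f1_score` and the toy labels are illustrative assumptions; they are not taken from the paper or its dataset.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = criterion met, 0 = not met), invented for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
score = f1_score(y_true, y_pred)
```

With these toy labels there are three true positives, one false positive and one false negative, so precision and recall are both 0.75 and so is the F1-score.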


"Why Did You Do That?" Explainable Intelligent Robots

AAAI Conferences

As autonomous intelligent systems become more widespread, society is beginning to ask: "What are the machines up to?" Various forms of artificial intelligence control our latest cars, load balance components of our power grids, dictate much of the movement in our stock markets and help doctors diagnose and treat our ailments. As these systems become increasingly able to learn and model more complex phenomena, the ability of human users to understand the reasoning behind their decisions often decreases. It becomes very difficult to ensure that the robot will perform properly and that it is possible to correct errors. In this paper, we outline a variety of techniques for generating the underlying knowledge required for explainable artificial intelligence, ranging from early work in expert systems through to systems based on Behavioural Cloning. These are techniques that may be used to build intelligent robots that explain their decisions and justify their actions. We will then illustrate how decision trees are particularly well suited to generating these kinds of explanations. We will also discuss how additional explanations can be obtained, beyond simply the structure of the tree, based on knowledge of how the training data was generated. Finally, we will illustrate these capabilities in the context of a robot learning to drive over rough terrain in both simulation and in reality.
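One reason decision trees lend themselves to explanation is that every prediction corresponds to a single root-to-leaf path of readable tests. The sketch below illustrates that idea only; it is not the authors' system. The tree, the feature names (`slope_deg`, `roughness`), the thresholds and the actions are all hypothetical.

```python
# Hypothetical hand-written tree; features, thresholds and actions are invented.
TREE = {
    "feature": "slope_deg", "threshold": 15.0,
    "left": {"action": "drive_forward"},
    "right": {
        "feature": "roughness", "threshold": 0.5,
        "left": {"action": "slow_down"},
        "right": {"action": "stop_and_replan"},
    },
}


def decide_and_explain(tree, observation):
    """Walk the tree; return the chosen action and the tests passed on the way."""
    reasons = []
    node = tree
    while "action" not in node:
        name, threshold = node["feature"], node["threshold"]
        value = observation[name]
        if value <= threshold:
            reasons.append(f"{name} = {value} <= {threshold}")
            node = node["left"]
        else:
            reasons.append(f"{name} = {value} > {threshold}")
            node = node["right"]
    return node["action"], reasons


action, reasons = decide_and_explain(TREE, {"slope_deg": 20.0, "roughness": 0.3})
explanation = f"I chose '{action}' because " + " and ".join(reasons)
```

The explanation string is assembled directly from the tests along the path, so it is always faithful to what the tree actually computed.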


Homelessness Service Provision: A Data Science Perspective

AAAI Conferences

We study homeless service provision in the United States from a data science perspective, with the goal of informing homelessness prevention efforts. We use machine learning techniques to predict household reentry into a homeless system using an administrative dataset containing both demographic and service information. This data recorded all publicly funded services provided in a Midwestern US community from 2007 through 2014. We find that several techniques can provide useful lift in the prediction task, with random forests achieving an AUC around 0.7. Prediction improves significantly when conducted within calendar years, compared to across years, suggesting that changing dynamics drive repeated need for homeless services. We also analyze key service usage patterns that are associated with lower probabilities for reentry. Counterintuitively, individuals receiving the least intensive services provided through the homelessness system exhibit significantly lower likelihoods for further system involvement compared to individuals who received more intensive services, even after accounting for initial differences through propensity score and nearest neighbor matching. These results provide intriguing insights into homelessness service delivery that need to be further probed. In particular, it is unclear whether these less intensive services sustainably address housing needs, or whether, in contrast, frustration with inadequate services drives clients away from the homelessness system. Our results provide a proof-of-concept for how data science approaches can drive interesting, socially important research in the provision of public services.
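The AUC reported in this abstract can be read as the probability that a randomly chosen positive case (here, a household that reenters) receives a higher predicted score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison, which is fine for small samples. The function name `roc_auc` and the toy data are assumptions made for illustration, not values from the study.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = reentry) and model scores, invented for illustration.
y_true = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(y_true, scores)
```

Here six of the nine positive-negative pairs are ranked correctly, giving an AUC of about 0.67. A value of 0.5 corresponds to random ranking and 1.0 to a perfect one.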


Integration of Machine Learning Techniques to Evaluate Dynamic Customer Segmentation Analysis for Mobile Customers

arXiv.org Machine Learning

The telecommunications industry is highly competitive, which means that mobile providers need a business intelligence model that can be used to achieve an optimal level of churners, as well as a minimal level of cost in marketing activities. Machine learning applications can be used to provide guidance on marketing strategies. Furthermore, data mining techniques can be used in the process of customer segmentation. The purpose of this paper is to provide a detailed analysis of the C.5 algorithm within naive Bayesian modelling for the task of segmenting telecommunication customers by behavioural profile, according to their billing and socio-demographic aspects. Results have been experimentally implemented.


The Fourth Industrial Revolution: How Big Data and Machine Learning Can Boost Inclusive Fintech

#artificialintelligence

The lending and credit scoring sectors have more data than ever before at their disposal. How they leverage this data to create value for their clients and social impact determines the outcomes they can achieve in the financial services space. In 1959, Arthur Samuel, a pioneer in the field of machine learning (ML) and artificial intelligence during an era when computers filled an entire building, defined machine learning as "a field of study that gives computers the ability to learn without being explicitly programmed." During a recent keynote, Microsoft CEO Satya Nadella referred to data used in this context as "the new electricity," calling our current era a "fourth industrial revolution" following steam, electricity and digital technology. Scott Guthrie, Microsoft executive vice president, also acknowledged that data is "enabling every business to be the disrupters of their industry by harnessing the power to drive insight from this data."


Comparison of ML Classifiers Using Sparklyr

#artificialintelligence

You can use sparklyr to run a variety of classifiers in Apache Spark. For the Titanic data, the best performing models were tree-based models. Gradient boosted trees was one of the best models, but also had a much longer average run time than the other models. Random forests and decision trees both had good performance and fast run times. While these models were run on a tiny data set in a local Spark cluster, these methods will scale for analysis on data in a distributed Apache Spark cluster.


Machine-learning to inspire Singapore metro buildout

#artificialintelligence

Researchers are trying to distill smart transit philosophy into a machine-learning algorithm. Scientists hope their smart transit model will reveal a recipe for a smarter city, organized in a way that relieves the congestion common on the mass transit systems of major cities. "Singapore needs an efficient transport system to support people's activities given the existing and planned infrastructure," project leader Christopher Monterola, a researcher at the Agency for Science, Technology and Research's Institute of High Performance Computing, explained in a news release. "To guide planners, we needed a model that could predict ridership under the regional centers plan." Like many cities, Singapore consists of a large central downtown, or an inner central business district, surrounded by less dense residential and industrial zones. With so many commuting in and out of the central business district at rush hour, the setup promotes congestion.


How machine-learning models can help banks capture more value (Digital McKinsey)

#artificialintelligence

Machine learning (ML) methods have been around for ages, but the big-data revolution and the plummeting cost of computing power are now making them truly excellent and practical analytical tools in banking across a variety of use cases, including credit risk. ML algorithms may sound complex and futuristic, but the way they work is quite simple. Essentially they combine a massive set of decision trees (i.e., a decision-making model that breaks out individual decisions and possible consequences, also known as "learners") to create an accurate model. By churning through these learners at high speeds, ML models are able to find "hidden" patterns, particularly in unstructured data that common statistical tools miss. Overfitting (the analytical description of random errors rather than underlying relationships) of the model is a typical concern about ML.
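The idea of combining many simple "learners" into one model can be illustrated with a majority vote over one-level decision trees (stumps). This is only a minimal sketch of ensemble voting under stated assumptions, not the method used in the article. The feature names (`debt_ratio`, `missed_payments`, `utilization`), the thresholds and the applicant record are all hypothetical.

```python
from collections import Counter


def make_stump(feature, threshold):
    """Build a one-split decision tree: predict 1 (risky) above the threshold."""
    def stump(x):
        return 1 if x[feature] > threshold else 0
    return stump


def majority_vote(learners, x):
    """Return the label predicted by the largest number of learners."""
    votes = Counter(learner(x) for learner in learners)
    return votes.most_common(1)[0][0]


# Hypothetical learners; an odd count avoids ties between the two labels.
learners = [
    make_stump("debt_ratio", 0.4),
    make_stump("missed_payments", 2),
    make_stump("utilization", 0.8),
]

# Hypothetical applicant record, invented for illustration.
applicant = {"debt_ratio": 0.55, "missed_payments": 1, "utilization": 0.9}
prediction = majority_vote(learners, applicant)
```

Two of the three stumps flag this applicant as risky, so the ensemble predicts 1. Real ensembles learn their trees from data and typically use far more of them.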



ŷhat Random Forests in Python

#artificialintelligence

Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. It can be used to model the impact of marketing on customer acquisition, retention, and churn or to predict disease risk and susceptibility in patients. Random forest is capable of regression and classification. It can handle a large number of features, and it's helpful for estimating which of your variables are important in the underlying data being modeled. Random forest is a solid choice for nearly any prediction problem (even non-linear ones).