AITopics | Diagnosis

Diagnosis

Introduction to Outlier Detection Methods

#artificialintelligence | Mar-24-2016, 20:00:50 GMT

This post summarizes three posts about outlier detection methods. One of the challenges in data analysis in general, and predictive modeling in particular, is dealing with outliers. Many modeling techniques are resistant to outliers or reduce their impact, but detecting and understanding outliers can still lead to interesting findings. We generally define outliers as samples that are exceptionally far from the mainstream of the data. There is no rigid mathematical definition of what constitutes an outlier; determining whether an observation is an outlier is ultimately a subjective exercise. There are several approaches for detecting outliers.
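
As one illustration of the detection approaches mentioned above, the sketch below flags outliers with Tukey's interquartile-range fences. The choice of this rule, the function name `iqr_outliers`, the multiplier `k=1.5`, and the sample data are all assumptions made for illustration; the summarized posts may cover different methods.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles() sorts the data and returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: nine ordinary readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 14, 13, 12, 95]
print(iqr_outliers(data))  # [95]
```

Because there is no rigid definition of an outlier, `k` is a judgment call: larger values flag fewer points.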

artificial intelligence, data mining, outlier, (11 more...)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.63)

How to Bin or Convert Numerical Variables to Categorical Variables with Decision Trees

@machinelearnbot | Mar-23-2016, 19:55:17 GMT

Why would you want to convert a numerical variable into a categorical one? Depending on the situation, creating bins for a numerical variable can make it easier to interpret, enable quick segmentation, or simply provide an additional feature for your predictive model. Binning is a popular feature engineering technique. Suppose your hypothesis is that the age of a customer is correlated with their tendency to interact with a mobile app. The age of the user is plotted on the x-axis and user interaction with the app is plotted on the y-axis.
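
A minimal sketch of the idea, assuming a binary "interacted" label and a tiny hand-written decision tree that splits on Gini impurity: the thresholds the tree learns become the bin edges for age. The function names, the depth limit, and the data values are hypothetical and are not taken from the original post.

```python
from bisect import bisect_left


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(xs, ys):
    """Return the threshold that most reduces weighted Gini, or None."""
    pairs = sorted(zip(xs, ys))
    best_t, best_score = None, gini(ys)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        t = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_t, best_score = t, score
    return best_t


def tree_bin_edges(xs, ys, depth=2):
    """Split recursively on x; the learned thresholds are the bin edges."""
    if depth == 0:
        return []
    t = best_split(xs, ys)
    if t is None:
        return []
    edges = [t]
    for keep in (lambda x: x <= t, lambda x: x > t):
        part = [(x, y) for x, y in zip(xs, ys) if keep(x)]
        edges += tree_bin_edges([x for x, _ in part], [y for _, y in part], depth - 1)
    return sorted(edges)


# Hypothetical data: customer age and whether the customer interacted with the app.
ages = [18, 21, 24, 27, 33, 38, 45, 52, 60, 67]
interacted = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]

edges = tree_bin_edges(ages, interacted)
print(edges)  # [30.0, 56.0]
# Map each age to a categorical bin index (0, 1, or 2 here).
print([bisect_left(edges, a) for a in (25, 40, 65)])  # [0, 1, 2]
```

Unlike equal-width binning, the edges here are placed where the target variable changes, which is the main reason to let a tree choose them.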

artificial intelligence, decision tree learning, machine learning, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.48)

Debugging Machine Learning Tasks

Chakarov, Aleksandar, Nori, Aditya, Rajamani, Sriram, Sen, Shayak, Vijaykeerthy, Deepak

arXiv.org Machine Learning | Mar-23-2016

Unlike traditional programs (such as operating systems or word processors), which have large amounts of code, machine learning tasks use programs with relatively small amounts of code (written in machine learning libraries) but voluminous amounts of data. Just as developers of traditional programs debug errors in their code, developers of machine learning tasks debug and fix errors in their data. However, algorithms and tools for debugging and fixing errors in data are less common than their counterparts for detecting and fixing errors in code. In this paper, we consider classification tasks where errors in training data lead to misclassifications in test points, and propose an automated method to find the root causes of such misclassifications. Our root cause analysis is based on Pearl's theory of causation, and uses Pearl's PS (Probability of Sufficiency) as a scoring metric. Our implementation, Psi, encodes the computation of PS as a probabilistic program, and uses recent work on probabilistic programs and transformations on probabilistic programs (along with gray-box models of machine learning algorithms) to efficiently compute PS. Psi is able to identify root causes of data errors in interesting data sets.
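
For reference, Pearl's probability of sufficiency for a binary cause X and binary effect Y is usually written as below. The notation is an assumption here, since the abstract does not spell out how Psi instantiates X and Y for training-data errors and misclassifications.

```latex
\mathrm{PS} \;=\; P\bigl(Y_{x} = y \,\big|\, X = x',\; Y = y'\bigr)
```

Here $Y_{x}$ is the value Y would take had X been set to $x$, and $x'$, $y'$ denote the situations in which the cause and the effect are absent. PS is thus the probability that introducing the cause would have produced the effect in a case where neither actually occurred.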

algorithm, artificial intelligence, machine learning, (17 more...)

1603.07292

Country:

North America > United States > Wyoming (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report > New Finding (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Annotated Decision Trees for Simple Moral Machines

Bendel, Oliver (Northwestern Switzerland School of Business)

AAAI Conferences | Mar-16-2016

Autonomization often follows the automation on which it is based. More and more machines have to make decisions with moral implications. Machine ethics, which can be seen as an equivalent of human ethics, analyses the opportunities and limits of moral machines. So far, decision trees have not been commonly used for modelling moral machines. This article proposes an approach for creating annotated decision trees, and specifies their central components. The focus is on simple moral machines. The opportunities offered by such models are illustrated with the example of a self-driving car that is friendly to humans and animals. Finally, the advantages and disadvantages are discussed and conclusions are drawn.

assumption, decision tree, moral machine, (13 more...)

2016 AAAI Spring Symposium Series

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(4 more...)

Industry:

Transportation > Passenger (1.00)
Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.89)
Information Technology (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.86)

The Devil’s Triangle: Ethical Considerations on Developing Bot Detection Methods

Thieltges, Andree (Universität Siegen) | Schmidt, Florian (Universität Siegen) | Hegelich, Simon (Universität Siegen)

AAAI Conferences | Mar-16-2016

Social media is increasingly populated with bots. To protect the authenticity of the user experience, machine learning algorithms are used to detect these bots. The ethical dimensions of these methods have not yet been thoroughly considered. Taking histogram analysis of Twitter users' profile images as an example, the paper demonstrates the trade-offs between accuracy, transparency, and robustness. Because there is no general optimum in ethical considerations, these dimensions form a "devil's triangle".

artificial intelligence, machine learning, social media, (16 more...)

2016 AAAI Spring Symposium Series

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > District of Columbia > Washington (0.04)
(4 more...)

Industry:

Information Technology > Services (0.71)
Information Technology > Security & Privacy (0.43)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.52)

Data-Augmented Software Diagnosis

Elmishali, Amir (Ben Gurion University of the Negev) | Stern, Roni (Ben Gurion University of the Negev) | Kalech, Meir (Ben Gurion University of the Negev)

AAAI Conferences | Feb-10-2016

Software fault prediction algorithms use machine learning techniques to predict which software components are likely to contain faults. Software diagnosis algorithms identify the faulty software components that caused a failure, using model-based or spectrum-based approaches. We show how software fault prediction algorithms can be used to improve software diagnosis. The resulting data-augmented diagnosis algorithm overcomes key problems in software diagnosis algorithms: ranking diagnoses and distinguishing between diagnoses with high probability and low probability. We demonstrate the efficiency of the proposed approach empirically on three open-source domains, showing a significant increase in the accuracy of diagnosis and the efficiency of troubleshooting. These encouraging results suggest broader use of data-driven methods to complement and improve existing model-based methods.
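
To make "spectrum-based" concrete, the sketch below ranks components with the Ochiai coefficient, one common spectrum-based suspiciousness score computed from per-test coverage and pass/fail outcomes. It is an illustrative assumption, not the algorithm evaluated in the paper, and the function name and coverage matrix are hypothetical.

```python
from math import sqrt


def ochiai_scores(coverage, failed):
    """Score each component with the Ochiai suspiciousness coefficient.

    coverage[t][c] is True if test t executed component c;
    failed[t] is True if test t failed.
    """
    total_failed = sum(failed)
    scores = []
    for c in range(len(coverage[0])):
        # ef / ep: failing / passing tests that executed component c.
        ef = sum(1 for row, f in zip(coverage, failed) if row[c] and f)
        ep = sum(1 for row, f in zip(coverage, failed) if row[c] and not f)
        denom = sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores


# Hypothetical spectrum: four tests over three components.
coverage = [
    [True, True, False],   # test 0 (failed)
    [True, False, True],   # test 1 (passed)
    [False, True, True],   # test 2 (failed)
    [True, False, False],  # test 3 (passed)
]
failed = [True, False, True, False]

scores = ochiai_scores(coverage, failed)
print([round(s, 3) for s in scores])  # [0.408, 1.0, 0.5]
```

Component 1 is executed by every failing test and by no passing test, so it ranks first. A fault-prediction model, as the abstract proposes, could supply prior probabilities to refine such a ranking.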

artificial intelligence, diagnosis, machine learning, (17 more...)

Twenty-Eighth IAAI Conference

Country:

Oceania > New Zealand > North Island > Waikato (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Optimally Pruning Decision Tree Ensembles With Feature Cost

Nan, Feng, Wang, Joseph, Saligrama, Venkatesh

arXiv.org Machine Learning | Jan-5-2016

We consider the problem of learning decision rules for prediction under a feature budget constraint. In particular, we are interested in pruning an ensemble of decision trees to reduce expected feature cost while maintaining high prediction accuracy for any test example. We propose a novel 0-1 integer program formulation for ensemble pruning. Our pruning formulation is general: it takes any ensemble of decision trees as input. By explicitly accounting for feature-sharing across trees together with the accuracy/cost trade-off, our method is able to significantly reduce feature cost by pruning subtrees that introduce more loss in terms of feature cost than benefit in terms of prediction accuracy gain. Theoretically, we prove that a linear programming relaxation produces the exact solution of the original integer program. This allows us to use efficient convex optimization tools to obtain an optimally pruned ensemble for any given budget. Empirically, we see that our pruning algorithm significantly improves the performance of the state-of-the-art ensemble method BudgetRF.

artificial intelligence, constraint, machine learning, (15 more...)

1601.00955

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.85)

Query-Answer Causality in Databases: Abductive Diagnosis and View-Updates

Salimi, Babak, Bertossi, Leopoldo

arXiv.org Artificial Intelligence | Sep-19-2015

Causality has recently been introduced in databases to model, characterize, and possibly compute causes for query results (answers). Connections between query causality, consistency-based diagnosis, and database repairs (with respect to integrity constraint violations) have been established in the literature. In this work we establish connections between query causality, abductive diagnosis, and the view-update problem. The unveiled relationships allow us to obtain new complexity results for query causality (the main focus of our work) and also for the two other areas.

artificial intelligence, logic & formal reasoning, natural language, (21 more...)

1506.04299

Country: North America > Canada > Ontario > National Capital Region > Ottawa (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Abductive Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)

Causal Decision Trees

Li, Jiuyong, Ma, Saisai, Le, Thuc Duy, Liu, Lin, Liu, Jixue

arXiv.org Artificial Intelligence | Aug-16-2015

Uncovering causal relationships in data is a major objective of data analytics. Causal relationships are normally discovered with designed experiments, e.g. randomised controlled trials, which, however, are expensive or infeasible to conduct in many cases. Causal relationships can also be found using well-designed observational studies, but these require domain experts' knowledge and the process is normally time-consuming. Hence there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and could be practical substitutes for finding causal signals in data. However, classification methods are not designed for causal discovery, and a classification method may find false causal signals and miss the true ones. In this paper, we develop a causal decision tree where nodes have causal interpretations. Our method follows a well-established causal inference framework and makes use of a classic statistical test. The method is practical for finding causal signals in large data sets.

artificial intelligence, causal relationship, machine learning, (17 more...)

doi: 10.1109/TKDE.2016.2619350

1508.03812

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia > South Australia (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.75)

Appropriate Causal Models and the Stability of Causation

Halpern, Joseph Y.

arXiv.org Artificial Intelligence | Aug-3-2015

Causal models defined in terms of structural equations have proved to be quite a powerful way of representing knowledge regarding causality. However, a number of authors have given examples that seem to show that the Halpern-Pearl (HP) definition of causality gives intuitively unreasonable answers. Here it is shown that, for each of these examples, we can give two stories consistent with the description in the example, such that intuitions regarding causality are quite different for each story. By adding additional variables, we can disambiguate the stories. Moreover, in the resulting causal models, the HP definition of causality gives the intuitively correct answer. It is also shown that, by adding extra variables, a modification to the original HP definition made to deal with an example of Hopkins and Pearl may not be necessary. Given how much can be done by adding extra variables, there might be a concern that the notion of causality is somewhat unstable. Can adding extra variables in a "conservative" way (i.e., maintaining all the relations between the variables in the original model) cause the answer to the question "Is X=x a cause of Y=y" to alternate between "yes" and "no"? It is shown that we can have such alternation infinitely often, but if we take normality into consideration, we cannot. Indeed, under appropriate normality assumptions, adding an extra variable can change the answer from "yes" to "no", but after that, it cannot change back to "yes".

ac2, artificial intelligence, equation, (16 more...)

doi: 10.1017/S1755020315000246

1412.3518

Genre: Research Report (0.84)

Industry: Government > Voting & Elections (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.82)
