AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.65)

#artificialintelligence Feb-11-2019, 10:10:21 GMT

10 Machine Learning Algorithms You need to Know – Towards Data Science

We are living at the start of a revolutionary era, driven by advances in data analytics, computing power, and cloud computing. Machine learning will play a major role in it, and algorithms are the brains behind machine learning. This article covers the 10 most popular machine learning algorithms currently in use. These algorithms fall into 3 main categories. The following algorithms are covered in this article.

algorithm, artificial intelligence, machine learning, (16 more...)

Genre: Research Report (0.99)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.37)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

#artificialintelligence Feb-11-2019, 07:31:13 GMT

How It Feels to Learn Data Science in 2019 – Towards Data Science

So I just have to buy a Tableau license and I'm now a data scientist? Okay, let's just take that sales pitch with a grain of salt. I may be clueless, but I know there is more to data science than making pretty visualizations. I can do that in Excel. You got to admit it is slick marketing though. Charting data is the fun stage, and they leave out the painful and time-consuming parts of working with data: cleaning, wrangling, transforming, and loading it. God help you if you need your own custom domain logic when using closed tools. Yes, and that is why I suspect there is value in learning to code. Maybe you can learn Alteryx.

artificial intelligence, machine learning, natural language, (14 more...)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
(2 more...)

de Franca, Fabricio Olivetti, Aldeia, Guilherme Seidyo Imai

Interaction-Transformation Evolutionary Algorithm for Symbolic Regression

arXiv.org Machine Learning Feb-11-2019

Abstract: The Interaction-Transformation (IT) is a new representation for Symbolic Regression that restricts the search space into simpler, but expressive, function forms. This representation has the advantage of creating a smoother search space unlike the space generated by Expression Trees, the common representation used in Genetic Programming. This paper introduces an Evolutionary Algorithm capable of evolving a population of IT expressions supported only by the mutation operator. The results show that this representation is capable of finding better approximations to real-world data sets when compared to traditional approaches and a state-of-the-art Genetic Programming algorithm. I. INTRODUCTION Regression analysis has the objective of describing the relationship between measurable variables [1]. This analysis can be used to make predictions of not yet observed samples, to study a system's behavior or to calculate the statistical properties of such system. F. O. de Franca is with Federal University of ABC, Center for Mathematics, Computation and Cognition, Heuristics, Analysis and Learning Laboratory, São Paulo, Brazil, email: folivetti@ufabc.edu.br,

algorithm, expression, representation, (13 more...)

1902.03983

Country: South America > Brazil > São Paulo (0.25)

Genre:

Research Report > New Finding (0.54)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)

#artificialintelligence Feb-10-2019, 14:52:49 GMT

TOP 10 Machine Learning Algorithms – garvitanand2 – Medium

Linear regression is perhaps one of the most well-known and well-understood algorithms in statistics and machine learning. Predictive modeling is primarily concerned with minimizing the error of a model or making the most accurate predictions possible, at the expense of explainability. We will borrow, reuse and steal algorithms from many different fields, including statistics and use them towards these ends. The representation of linear regression is an equation that describes a line that best fits the relationship between the input variables (x) and the output variables (y), by finding specific weightings for the input variables called coefficients (B). We will predict y given the input x and the goal of the linear regression learning algorithm is to find the values for the coefficients B0 and B1.
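The line described above, y = B0 + B1 * x, can be fitted in a few lines. The following is a minimal sketch, assuming ordinary least squares as the fitting criterion (the excerpt does not name one); the function names and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Return (b0, b1) minimizing the sum of squared errors of y ~ b0 + b1 * x.

    Assumption: ordinary least squares; the source text does not specify
    how the coefficients are found.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x.
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    b0 = y_mean - b1 * x_mean
    return b0, b1

def predict(x, b0, b1):
    """Predict y for new inputs x using the fitted coefficients."""
    return b0 + b1 * np.asarray(x, dtype=float)

# Hypothetical toy data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
b0, b1 = fit_simple_linear_regression(x, y)
y_new = predict([4.0], b0, b1)
```

Because the toy data lie exactly on a line, the fitted coefficients coincide with the generating ones; with noisy data they would only approximate them.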

algorithm, artificial intelligence, machine learning, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.95)

Shen, Yanyao, Sanghavi, Sujay

Iterative Least Trimmed Squares for Mixed Linear Regression

arXiv.org Machine Learning Feb-10-2019

In vanilla linear regression, one (implicitly) assumes that each sample is a linear measurement of a single unknown vector, which needs to be recovered from these measurements. Statistically, it is typically studied in the setting where the samples come from such a ground truth unknown vector, and we are interested in the (computational/statistical complexity of) recovery of this ground truth vector. Mixed linear regression (MLR for brevity) is the problem where there are multiple unknown vectors, and each sample can come from any one of them (and we do not know which one, a priori). Our objective is again to recover all (or some, or one) of them from the samples. In this paper we consider MLR with the additional presence of corruptions - i.e. adversarial additive errors in the responses - for some unknown subset of the samples. There is now a healthy and quickly growing body of work on algorithms, and corresponding theoretical guarantees, for MLR with and without additive noise and corruptions; we review these in detail in the related work section. In our paper we start from a classical (but hard to compute) approach from robust statistics: least trimmed squares [Rou84]. This advocates fitting a model so as to minimize the loss on only a fraction τ of the samples, instead of all of them - but crucially, the subset S of samples chosen and the model to fit them are to be estimated jointly.
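The joint estimation of the subset S and the model can be approximated by alternating between the two. The sketch below is a generic alternating heuristic for least trimmed squares, not the algorithm or the guarantees of the paper itself: it fits least squares on the current subset, then keeps the fraction τ of samples with the smallest squared residuals, and repeats. The function name, the iteration cap, and the toy data are assumptions made for illustration, and the sketch recovers a single model rather than all components of a mixture.

```python
import numpy as np

def iterative_trimmed_least_squares(X, y, tau, n_iters=20):
    """Alternate between fitting a linear model and re-selecting samples.

    Illustrative sketch only (assumption: plain alternating minimization).
    Returns the fitted coefficient vector and the indices of the kept samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape
    # Number of samples to keep; at least d so the fit stays determined.
    k = max(d, int(np.floor(tau * n)))
    keep = np.arange(n)  # start from all samples
    theta = None
    for _ in range(n_iters):
        # Step 1: least-squares fit on the currently selected subset.
        theta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        # Step 2: keep the k samples with the smallest squared residuals.
        residuals = (y - X @ theta) ** 2
        new_keep = np.sort(np.argsort(residuals)[:k])
        if np.array_equal(new_keep, keep):
            break  # the subset stopped changing
        keep = new_keep
    return theta, keep

# Hypothetical toy data: y = 1 + 2x with two corrupted responses.
x = np.arange(10, dtype=float)
X = np.column_stack([np.ones_like(x), x])  # intercept column plus feature
y = 1.0 + 2.0 * x
y[[3, 7]] += 50.0  # additive corruptions on two samples
theta, keep = iterative_trimmed_least_squares(X, y, tau=0.8)
```

Like any alternating scheme, this can stall in a poor local solution when the initial fit is badly skewed by the corruptions, which is one reason the starting point and the choice of τ matter in practice.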

algorithm, corruption, regression, (16 more...)

1902.03653

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.81)

Friedler, Sorelle A., Roy, Chitradeep Dutta, Scheidegger, Carlos, Slack, Dylan

Assessing the Local Interpretability of Machine Learning Models

arXiv.org Machine Learning Feb-9-2019

The increasing adoption of machine learning tools has led to calls for accountability via model interpretability. But what does it mean for a machine learning model to be interpretable by humans, and how can this be assessed? We focus on two definitions of interpretability that have been introduced in the machine learning literature: simulatability (a user's ability to run a model on a given input) and "what if" local explainability (a user's ability to correctly indicate the outcome to a model under local changes to the input). Through a user study with 1000 participants, we test whether humans perform well on tasks that mimic the definitions of simulatability and "what if" local explainability on models that are typically considered locally interpretable. We find evidence consistent with the common intuition that decision trees and logistic regression models are interpretable and are more interpretable than neural networks. We propose a metric - the runtime operation count on the simulatability task - to indicate the relative interpretability of models and show that as the number of operations increases the users' accuracy on the local interpretability tasks decreases.

decision tree, interpretability, neural network, (11 more...)

1902.03501

Country:

Europe (0.28)
North America > United States > Arizona > Pima County > Tucson (0.14)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.72)

Li, Alexander Hanbo, Bradic, Jelena

Censored Quantile Regression Forests

arXiv.org Machine Learning Feb-8-2019

In many applications, we want to predict and estimate the effect of a covariate on the survival time of interest. Examples include the effect of a treatment, surgical procedure, or immunization on the survival time of patients, who could be, for example, individuals with metastatic breast cancer, military casualties suffering from various injuries, or patients with infectious diseases. Classically, most datasets have been too small to meaningfully examine the heterogeneity of the data beyond dividing them into a few subpopulations. In the past few years, however, there has been an explosion of experimental settings where it is potentially feasible to explore heterogeneity to its full extent. An impediment to exploring heterogeneous effects is the fear that scientists with two opposite agendas could hypothetically string together two opposite but coherent results by searching through many different possible models and then reporting only the very extreme ones - highlighting solely spurious results (Olken, 2015). Thus, protocols for clinical trials must specify in advance the pre-analysis plans and then learn from the data.

oracle, quantile loss, tau 0, (11 more...)

1902.03327

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Law > Civil Rights & Constitutional Law (0.57)
Health & Medicine > Therapeutic Area > Oncology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Data Science (0.67)

Bertsimas, Dimitris, Li, Michael Lingzhi

Accounting for Significance and Multicollinearity in Building Linear Regression Models

arXiv.org Machine Learning Feb-8-2019

We derive explicit Mixed Integer Optimization (MIO) constraints, as opposed to iteratively imposing them in a cutting plane framework, that impose significance and avoid multicollinearity for building linear regression models. In this way we extend and improve the research program initiated in Bertsimas and King (2016) that imposes sparsity, robustness, pairwise collinearity and group sparsity explicitly and significance and avoiding multicollinearity iteratively. We present a variety of computational results on real and synthetic datasets that suggest that the proposed MIO has a significant computational edge compared to Bertsimas and King (2016) in accuracy, false detection rate and computational time in accounting for significance and multicollinearity as well as providing a holistic framework to produce regression models with desirable properties a priori.

inform journal, multicollinear relationship, optimization 00, (10 more...)

1902.03272

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)