AITopics | Regression

Regression

7 Machine Learning Algorithms Every Engineer Should Know

#artificialintelligence | Aug-4-2017, 15:51:47 GMT

Machine learning, a branch of artificial intelligence, is based on the idea that machines should be able to learn and adapt through experience. It has been gaining popularity over the last couple of years. Machine learning is one approach to achieving artificial intelligence through algorithms. It is predicted that machine learning algorithms may replace many jobs in the coming years. Logistic regression is a powerful statistical method for estimating discrete values (usually binary values) from a set of independent variables.

algorithm, artificial intelligence, machine learning, (7 more...)

#artificialintelligence

Genre: Research Report (0.85)

Industry: Banking & Finance (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.57)

Regularization in Logistic Regression: Better Fit and Better Generalization?

@machinelearnbot | Aug-4-2017, 15:35:38 GMT

Regularization does NOT improve the performance on the data set that the algorithm used to learn the model parameters (feature weights). However, it can improve the generalization performance, i.e., the performance on new, unseen data, which is exactly what we want. In intuitive terms, we can think of regularization as a penalty against complexity. Increasing the regularization strength penalizes "large" weight coefficients; our goal is to prevent the model from picking up "peculiarities" or "noise," or from "imagining a pattern where there is none." Again, we don't want the model to memorize the training dataset; we want a model that generalizes well to new, unseen data. In more specific terms, we can think of regularization as adding (or increasing) bias when the model suffers from high variance (i.e., it overfits the training data).
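
As a rough illustration of the penalty described above, here is a minimal sketch of L2-regularized logistic regression fitted by batch gradient descent in plain Python. The toy data, the function names (`fit_logreg`, `train_log_loss`, `l2_norm`), the learning rate, and the choice to leave the bias unpenalized are all assumptions made for this example, not details from the original post.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logreg(X, y, lam, lr=0.1, steps=2000):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The bias b is left unpenalized (a common convention, assumed here).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # The penalty contributes lam * w to the gradient, pulling weights toward zero.
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def train_log_loss(X, y, w, b, eps=1e-12):
    # Unpenalized mean log-loss on the given data.
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        p = min(max(p, eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)

def l2_norm(w):
    return math.sqrt(sum(wj * wj for wj in w))

# Hypothetical, linearly separable toy data.
X = [[0.5, 1.0], [1.0, 1.5], [1.5, 0.5], [2.0, 2.0],
     [-0.5, -1.0], [-1.0, -0.5], [-1.5, -2.0], [-2.0, -1.0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w_plain, b_plain = fit_logreg(X, y, lam=0.0)
w_reg, b_reg = fit_logreg(X, y, lam=1.0)

print("weight norm:", l2_norm(w_plain), "vs", l2_norm(w_reg))
print("training loss:", train_log_loss(X, y, w_plain, b_plain),
      "vs", train_log_loss(X, y, w_reg, b_reg))
```

On this toy set the regularized fit ends with smaller weights and a training loss that is no better than the unregularized one, in line with the excerpt; whether it generalizes better would have to be checked on held-out data, which this sketch does not do.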

artificial intelligence, cost function, machine learning, (8 more...)

@machinelearnbot

Country: North America > United States > Michigan (0.07)

Genre:

Research Report > New Finding (0.40)
Research Report > Experimental Study (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

A comprehensive beginners guide for Linear, Ridge and Lasso Regression

#artificialintelligence | Aug-3-2017, 22:27:00 GMT

I was talking to one of my friends, who happens to be an operations manager at one of the supermarket chains in India. During our discussion, we started talking about the amount of preparation the store chain needs to do before the Indian festive season (Diwali) kicks in. He told me how critical it is for them to estimate, before purchasing, which products will sell like hot cakes and which will not. A bad decision can leave your customers looking for offers and products in competitors' stores. The challenge does not end there: you need to estimate the sales of products across a range of categories, for stores in varied locations, and for consumers with different consumption patterns. While my friend was describing the challenge, the data scientist in me started smiling! I had just figured out a potential topic for my next article. In today's article, I will tell you everything you need to know about regression models and how they can be used to solve prediction problems like the one mentioned above. Take a moment to list all the factors you can think of on which the sales of a store depend. For each factor, create a hypothesis about why and how that factor would influence the sales of various products. For example, I expect the sales of products to depend on the location of the store, because the local residents in each area have different lifestyles. The amount of bread a store sells in Ahmedabad would be a fraction of what a similar store sells in Mumbai. Similarly, list all the other factors you can think of. The location of your shop, the availability of the products, the size of the shop, offers on the product, advertising done for a product, and placement in the store could be some of the features on which your sales depend.

artificial intelligence, machine learning, regression, (17 more...)

#artificialintelligence

Country: Asia > India > Maharashtra > Mumbai (0.24)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Ensemble representation learning: an analysis of fitness and survival for wrapper-based genetic programming methods

La Cava, William, Moore, Jason H.

arXiv.org Machine Learning | Aug-3-2017

University of Pennsylvania, 3700 Hamilton Walk, Philadelphia, PA 19104, lacava@upenn.edu

Recently we proposed a general, ensemble-based feature engineering wrapper (FEW) that was paired with a number of machine learning methods to solve regression problems. Here, we adapt FEW for supervised classification and perform a thorough analysis of fitness and survival methods within this framework. Our tests demonstrate that two fitness metrics, one introduced as an adaptation of the silhouette score, outperform the more commonly used Fisher criterion. We analyze survival methods and demonstrate that ϵ-lexicase survival works best across our test problems, followed by random survival which outperforms both tournament and deterministic crowding. We conduct a benchmark comparison to several classification methods using a large set of problems and show that FEW can improve the best classifier performance in several cases. We show that FEW generates consistent, meaningful features for a biomedical problem with different ML pairings.

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

arXiv.org Machine Learning

doi: 10.1145/3071178.3071215

1703.06934

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.24)

Genre: Research Report > New Finding (0.94)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.68)

Streaming kernel regression with provably adaptive mean, variance, and regularization

Durand, Audrey, Maillard, Odalric-Ambrym, Pineau, Joelle

arXiv.org Machine Learning | Aug-2-2017

We consider the problem of streaming kernel regression, when the observations arrive sequentially and the goal is to recover the underlying mean function, assumed to belong to an RKHS. The variance of the noise is not assumed to be known. In this context, we tackle the problem of tuning the regularization parameter adaptively at each time step, while maintaining tight confidence bound estimates on the value of the mean function at each point. To this end, we first generalize existing results for finite-dimensional linear regression with fixed regularization and known variance to the kernel setup with a regularization parameter allowed to be a measurable function of past observations. Then, using appropriate self-normalized inequalities, we build upper and lower bound estimates for the variance, leading to Bernstein-like concentration bounds. The latter is used to define the adaptive regularization. The bounds resulting from our technique are valid uniformly over all observation points and all time steps, and are compared against the literature with numerical experiments. Finally, the potential of these tools is illustrated by an application to kernelized bandits, where we revisit the Kernel UCB and Kernel Thompson Sampling procedures, and show the benefits of the novel adaptive kernel tuning strategy.

artificial intelligence, kernel regression, machine learning, (18 more...)

arXiv.org Machine Learning

1708.00768

Country: North America > Canada (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Fairness-aware machine learning: a perspective

Zliobaite, Indre

arXiv.org Machine Learning | Aug-2-2017

Algorithms learned from data are increasingly used for deciding many aspects of our lives: from the movies we see, to the prices we pay, to the medicine we get. Yet there is growing evidence that decision making by inappropriately trained algorithms may unintentionally discriminate against people. For example, in automated matching of candidate CVs with job descriptions, algorithms may capture and propagate ethnicity-related biases. Several repairs for selected algorithms have already been proposed, but the underlying mechanisms by which such discrimination happens are not yet scientifically understood from the computational perspective. We need to develop a theoretical understanding of how algorithms may become discriminatory, and establish fundamental machine learning principles for prevention. We need to analyze the machine learning process as a whole to systematically explain the roots of discrimination, which will make it possible to devise global machine learning optimization criteria for guaranteed prevention, as opposed to pushing empirical constraints into existing algorithms case by case. As a result, the state of the art will advance from heuristic repairing to proactive and theoretically supported prevention. This is needed not only because the law requires the protection of vulnerable people. Penetration of big data initiatives will only increase, and computer science needs to provide solid explanations and accountability to the public, before public concerns lead to unnecessarily restrictive regulations against machine learning.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

1708.00754

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.30)

Recursive Partitioning for Personalization using Observational Data

Kallus, Nathan

arXiv.org Machine Learning | Aug-1-2017

We study the problem of learning to choose from m discrete treatment options (e.g., news item or medical drug) the one with best causal effect for a particular instance (e.g., user or patient) where the training data consists of passive observations of covariates, treatment, and the outcome of the treatment. The standard approach to this problem is regress and compare: split the training data by treatment, fit a regression model in each split, and, for a new instance, predict all m outcomes and pick the best. By reformulating the problem as a single learning task rather than m separate ones, we propose a new approach based on recursively partitioning the data into regimes where different treatments are optimal. We extend this approach to an optimal partitioning approach that finds a globally optimal partition, achieving a compact, interpretable, and impactful personalization model. We develop new tools for validating and evaluating personalization models on observational data and use these to demonstrate the power of our novel approaches in a personalized medicine and a job training application.
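
The "regress and compare" baseline described in this abstract can be sketched in a few lines. The version below is only an illustration under simplifying assumptions: a single numeric covariate, an ordinary least-squares line per treatment, and made-up data. The names `fit_line` and `regress_and_compare` are hypothetical and do not come from the paper, and the paper's own recursive-partitioning method is not shown.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one covariate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b

def regress_and_compare(data):
    """data: iterable of (covariate, treatment, outcome) observations.

    Splits the data by treatment, fits one regression per split, and
    returns a policy that picks the treatment with the best prediction.
    """
    split = defaultdict(lambda: ([], []))
    for x, t, y in data:
        split[t][0].append(x)
        split[t][1].append(y)
    models = {t: fit_line(xs, ys) for t, (xs, ys) in split.items()}

    def policy(x):
        # Predict the outcome under every treatment and keep the largest.
        return max(models, key=lambda t: models[t][0] + models[t][1] * x)

    return policy

# Hypothetical observations: outcome rises with x under "A", falls under "B".
data = [(0.0, "A", 0.0), (1.0, "A", 1.0), (2.0, "A", 2.0),
        (0.0, "B", 2.0), (1.0, "B", 1.0), (2.0, "B", 0.0)]
policy = regress_and_compare(data)
print(policy(0.2), policy(1.8))
```

Note that this sketch treats the observed outcomes as if treatments had been assigned at random; handling confounding in observational data is exactly the part it leaves out.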

artificial intelligence, machine learning, personalization, (13 more...)

arXiv.org Machine Learning

1608.08925

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Scalable MCMC for Large Data Problems using Data Subsampling and the Difference Estimator

Quiroz, Matias, Villani, Mattias, Kohn, Robert

arXiv.org Machine Learning | Aug-1-2017

We propose a generic Markov Chain Monte Carlo (MCMC) algorithm to speed up computations for datasets with many observations. A key feature of our approach is the use of the highly efficient difference estimator from the survey sampling literature to estimate the log-likelihood accurately using only a small fraction of the data. Our algorithm improves on the $O(n)$ complexity of regular MCMC by operating over local data clusters instead of the full sample when computing the likelihood. The likelihood estimate is used in a Pseudo-marginal framework to sample from a perturbed posterior which is within $O(m^{-1/2})$ of the true posterior, where $m$ is the subsample size. The method is applied to a logistic regression model to predict firm bankruptcy for a large data set. We document a significant speed up in comparison to the standard MCMC on the full dataset.
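
For readers unfamiliar with the difference estimator from survey sampling, the sketch below shows the generic idea on a sum of per-observation terms: use a cheap proxy for every term, then correct the proxy total with a scaled sum of (true term minus proxy) over a small random subsample. The toy terms, the artificially perturbed proxies, and the function name `difference_estimate` are assumptions for illustration. The paper's actual proxies and its pseudo-marginal MCMC machinery are not reproduced here.

```python
import random

def difference_estimate(term, proxies, m, rng):
    """Estimate sum(term(i) for i in range(n)) using only m evaluations of term.

    proxies[i] is a cheap approximation of term(i), available for every i.
    """
    n = len(proxies)
    # Simple random sampling with replacement (an assumption of this sketch).
    sample = [rng.randrange(n) for _ in range(m)]
    # Scale the sampled differences up to the full population size.
    correction = (n / m) * sum(term(i) - proxies[i] for i in sample)
    return sum(proxies) + correction

# Toy stand-ins (hypothetical): "expensive" terms and slightly-off proxies.
n = 1000
terms = [-0.5 * ((i % 50) / 10.0 - 2.0) ** 2 for i in range(n)]
proxies = [t + 0.01 * ((i % 3) - 1) for i, t in enumerate(terms)]

rng = random.Random(0)
estimate = difference_estimate(lambda i: terms[i], proxies, m=50, rng=rng)
exact = sum(terms)
print("exact:", exact, "estimate from 50 of 1000 terms:", estimate)
```

The estimator is unbiased for the full sum, and its variance depends on how far the proxies are from the true terms rather than on the spread of the terms themselves, which is why good proxies allow a small subsample. In this sketch `sum(proxies)` still touches all n proxies; the abstract indicates the paper avoids that cost by working over local data clusters.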

artificial intelligence, difference estimator, machine learning, (3 more...)

arXiv.org Machine Learning

1507.02971

Genre: Research Report (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.53)

Cut off point in logistic regression

@machinelearnbot | Jul-31-2017, 17:25:50 GMT

If your event rate is around 17% and you say that at a 50% cutoff you're getting a very good classification, there's something fishy! How can a logistic model trained to fit only 17% be better than the information the dataset has? Unless your measure of accuracy of fit is different from misclassification! Remember, the model usually fits the remaining 83% well, so the misclassification there would be low compared to the 17%. But I'm unsure how you're getting a 50% cutoff to be more accurate in terms of misclassification, since a decrease here is going to increase it there. The best way to find the cutoff is by plotting for different values, as already suggested, but it usually has to be around the event rate!
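
The cutoff sweep suggested in this answer can be sketched as follows. The predicted probabilities and labels are invented for illustration (2 events out of 12, close to the 17% event rate mentioned), and `error_rates` is a hypothetical helper name. In practice the values would come from a fitted model and the resulting rates would be plotted against the cutoff.

```python
def error_rates(probs, labels, cutoff):
    """Return (misclassification rate, false-negative rate, false-positive rate)."""
    preds = [p >= cutoff for p in probs]
    fn = sum(1 for pred, y in zip(preds, labels) if y == 1 and not pred)
    fp = sum(1 for pred, y in zip(preds, labels) if y == 0 and pred)
    positives = sum(labels)
    negatives = len(labels) - positives
    return (fn + fp) / len(labels), fn / positives, fp / negatives

# Hypothetical model outputs, sorted by predicted probability.
probs = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.25, 0.30, 0.35, 0.45, 0.60]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1]

sweep = {c: error_rates(probs, labels, c) for c in (0.17, 0.30, 0.50)}
for cutoff, (miss, fnr, fpr) in sweep.items():
    print(f"cutoff={cutoff:.2f}  misclassified={miss:.3f}  FNR={fnr:.2f}  FPR={fpr:.2f}")
```

On this made-up data, moving the cutoff from 0.17 to 0.50 lowers the false-positive rate and raises the false-negative rate, which is the trade-off the answer describes. Which cutoff is "best" depends on whether overall misclassification or the error rate on the rare class matters more.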

artificial intelligence, logistic regression, machine learning, (4 more...)

@machinelearnbot

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

Machine Learning - Predict Stock Prices using Regression

#artificialintelligence | Jul-31-2017, 01:36:35 GMT

The other day I was reading an article on how AI has progressed so far and where it is going. I was awestruck and had a hard time digesting the picture the author drew of possibilities in the future. Here is how I reacted. "A surgeon could control a machine scalpel with her motor cortex instead of holding one in her hand, and she could receive sensory input from that scalpel so that it would feel like an 11th finger to her. So it would be as if one of her fingers was a scalpel and she could do the surgery without holding any tools, giving her much finer control over her incisions. An inexperienced surgeon performing a tough operation could bring a couple of her mentors into the scene as she operates to watch her work through her eyes and think instructions or advice to her. And if something goes really wrong, one of them could 'take the wheel' and connect their motor cortex to her outputs to take control of her hands."

algorithm, artificial intelligence, machine learning, (15 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.70)

Industry:

Banking & Finance > Trading (1.00)
Materials > Metals & Mining > Steel (0.33)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)
