AITopics | Regression

Collaborating Authors

Regression

News Overviews Instructional Materials AI-Alerts Classics

Predicting In-game Actions From the Language of NBA Players

arXiv.org Artificial IntelligenceOct-24-2019

Sports competitions are widely researched in computer and social science, with the goal of understanding how players act under uncertainty. While there is an abundance of computational work on player metrics prediction based on past performance, very few attempts to incorporate out-of-game signals have been made. Specifically, it was previously unclear whether linguistic signals gathered from players' interviews can add information which does not appear in performance metrics. To bridge that gap, we define text classification tasks of predicting deviations from mean in NBA players' in-game actions, which are associated with strategic choices, player behavior and risk, using their choice of language prior to the game. We collected a dataset of transcripts from key NBA players' pre-game interviews and their in-game performance metrics, totaling in 5,226 interview-metric pairs. We design neural models for players' action prediction based on increasingly more complex aspects of the language signals in their open-ended interviews. Our models can make their predictions based on the textual signal alone, or on a combination with signals from past-performance metrics. Our text-based models outperform strong baselines trained on performance metrics only, demonstrating the importance of language usage for action prediction. Moreover, the models that employ both textual input and past-performance metrics produced the best results. Finally, as neural networks are notoriously difficult to interpret, we propose a method for gaining further insight into what our models have learned. Particularly, we present an LDA-based analysis, where we interpret model predictions in terms of correlated topics. We find that our best performing textual model is most associated with topics that are intuitively related to each prediction task and that better models yield higher correlation with more informative topics.

interview, performance metric, prediction, (16 more...)

arXiv.org Artificial Intelligence

1910.11292

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Sports > Basketball (1.00)
Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Add feedback

#005A Logistic Regression from scratch Master Data Science

#artificialintelligenceOct-23-2019, 09:18:31 GMT

In this post we will talk about applying gradient descent on $m$ training examples. Now the question is how we can define what gradient descent is? A gradient descent is an efficient optimization algorithm that attempts to find a global minimum of a function. It also enables a model to calculate the gradient or direction that the model should take to reduce errors (differences between actual $y$ and predicted $\hat{y}$). Now let's remind ourselves what the cost function is?

gradient descent, logistic regression, mathrm, (11 more...)

#artificialintelligence

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Making Bayesian Predictive Models Interpretable: A Decision Theoretic Approach

#artificialintelligenceOct-23-2019, 05:54:05 GMT

A salient approach to interpretable machine learning is to restrict modeling to simple and hence understandable models. In the Bayesian framework, this can be pursued by restricting the model structure and prior to favor interpretable models. Fundamentally, however, interpretability is about users' preferences, not the data generation mechanism: it is more natural to formulate interpretability as a utility function. In this work, we propose an interpretability utility, which explicates the trade-off between explanation fidelity and interpretability in the Bayesian framework. The method consists of two steps.

bayesian predictive model interpretable, interpretability, reference model, (5 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.39)

Add feedback

Inference in High-Dimensional Linear Regression via Lattice Basis Reduction and Integer Relation Detection

Gamarnik, David, Kızıldağ, Eren C., Zadik, Ilias

arXiv.org Machine LearningOct-23-2019

We focus on the high-dimensional linear regression problem, where the algorithmic goal is to efficiently infer an unknown feature vector $\beta^*\in\mathbb{R}^p$ from its linear measurements, using a small number $n$ of samples. Unlike most of the literature, we make no sparsity assumption on $\beta^*$, but instead adopt a different regularization: In the noiseless setting, we assume $\beta^*$ consists of entries, which are either rational numbers with a common denominator $Q\in\mathbb{Z}^+$ (referred to as $Q$-rationality); or irrational numbers supported on a rationally independent set of bounded cardinality, known to learner; collectively called as the mixed-support assumption. Using a novel combination of the PSLQ integer relation detection, and LLL lattice basis reduction algorithms, we propose a polynomial-time algorithm which provably recovers a $\beta^*\in\mathbb{R}^p$ enjoying the mixed-support assumption, from its linear measurements $Y=X\beta^*\in\mathbb{R}^n$ for a large class of distributions for the random entries of $X$, even with one measurement $(n=1)$. In the noisy setting, we propose a polynomial-time, lattice-based algorithm, which recovers a $\beta^*\in\mathbb{R}^p$ enjoying $Q$-rationality, from its noisy measurements $Y=X\beta^*+W\in\mathbb{R}^n$, even with a single sample $(n=1)$. We further establish for large $Q$, and normal noise, this algorithm tolerates information-theoretically optimal level of noise. We then apply these ideas to develop a polynomial-time, single-sample algorithm for the phase retrieval problem. Our methods address the single-sample $(n=1)$ regime, where the sparsity-based methods such as LASSO and Basis Pursuit are known to fail. Furthermore, our results also reveal an algorithmic connection between the high-dimensional linear regression problem, and the integer relation detection, randomized subset-sum, and shortest vector problems.

algorithm, assumption, theorem 2, (17 more...)

arXiv.org Machine Learning

1910.1089

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre:

Workflow (0.93)
Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Structure Learning of Gaussian Markov Random Fields with False Discovery Rate Control

Lee, Sangkyun, Sobczyk, Piotr, Bogdan, Malgorzata

arXiv.org Machine LearningOct-23-2019

In this paper, we propose a new estimation procedure for discovering the structure of Gaussian Markov random fields (MRFs) with false discovery rate (FDR) control, making use of the sorted l1-norm (SL1) regularization. A Gaussian MRF is an acyclic graph representing a multivariate Gaussian distribution, where nodes are random variables and edges represent the conditional dependence between the connected nodes. Since it is possible to learn the edge structure of Gaussian MRFs directly from data, Gaussian MRFs provide an excellent way to understand complex data by revealing the dependence structure among many inputs features, such as genes, sensors, users, documents, etc. In learning the graphical structure of Gaussian MRFs, it is desired to discover the actual edges of the underlying but unknown probabilistic graphical model-it becomes more complicated when the number of random variables (features) p increases, compared to the number of data points n. In particular, when p >> n, it is statistically unavoidable for any estimation procedure to include false edges. Therefore, there have been many trials to reduce the false detection of edges, in particular, using different types of regularization on the learning parameters. Our method makes use of the SL1 regularization, introduced recently for model selection in linear regression. We focus on the benefit of SL1 regularization that it can be used to control the FDR of detecting important random variables. Adapting SL1 for probabilistic graphical models, we show that SL1 can be used for the structure learning of Gaussian MRFs using our suggested procedure nsSLOPE (neighborhood selection Sorted L-One Penalized Estimation), controlling the FDR of detecting edges.

estimation, matrix, symmetry 2019, (14 more...)

arXiv.org Machine Learning

doi: 10.3390/symxx010005

1910.1086

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
Europe > Spain > Andalusia > Cádiz Province > Cadiz (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Dropping forward-backward algorithms for feature selection

Nguyen, Thu

arXiv.org Machine LearningOct-23-2019

In this era of big data, feature selection techniques, which have long been proven to simplify the model, makes the model more comprehensible, speed up the process of learning, have become more and more important. Among many developed methods, forward, backward and stepwise feature selection regression remained widely used due to their simplicity and efficiency. However, they are not sufficient enough when it comes to large datasets. In this paper, we analyze the issues associated with those approaches and introduce a novel algorithm that may boost the speed up to 65.77% compared to stepwise while maintaining good performance compared to stepwise selection in terms of the number of selected features and error rates.

algorithm, criterion, selection, (17 more...)

arXiv.org Machine Learning

1910.08007

Country: North America > United States > Louisiana (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Look Out Zillow Here Comes Jestimate!

#artificialintelligenceOct-22-2019, 17:29:25 GMT

As someone with expertise in both real estate and data science, I've always been fascinated by Zillow's Zestimate. In the spirit of competition, I've developed Jim's estimate or Jestimate! The following interactive map contains 2018 home sales in San Francisco by neighborhood. Click on the neighborhood and then click on a home in the data table to see the Jestimate results versus the actual sales price. Zestimate uses a proprietary machine learning formula to estimate the current market value of a home.

jestimate, sale price, san francisco, (13 more...)

#artificialintelligence

Country: North America > United States > California > San Francisco County > San Francisco (0.32)

Industry: Banking & Finance > Real Estate (0.74)

Technology:

Information Technology > Data Science (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Minimax Rate Optimal Adaptive Nearest Neighbor Classification and Regression

Zhao, Puning, Lai, Lifeng

arXiv.org Machine LearningOct-22-2019

For both classification and regression problems, existing works have shown that, if the distribution of the feature vector has bounded support and the probability density function is bounded away from zero in its support, the convergence rate of the standard kNN method, in which k is the same for all test samples, is minimax optimal. On the contrary, if the distribution has unbounded support, we show that there is a gap between the convergence rate achieved by the standard kNN method and the minimax bound. To close this gap, we propose an adaptive kNN method, in which different k is selected for different samples. Our selection rule does not require precise knowledge of the underlying distribution of features. The new proposed method significantly outperforms the standard one. We characterize the convergence rate of the proposed adaptive method, and show that it matches the minimax lower bound.

assumption 1, convergence rate, regression, (14 more...)

arXiv.org Machine Learning

1910.10513

Country:

North America > United States > California > Yolo County > Davis (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Orthogonal variance decomposition based feature selection

Kamalov, Firuz

arXiv.org Machine LearningOct-22-2019

Existing feature selection methods fail to properly account for interactions between features when evaluating feature subsets. In this paper, we attempt to remedy this issue by using orthogonal variance decomposition to evaluate features. The orthogonality of the decomposition allows us to directly calculate the total contribution of a feature to the output variance. Thus we obtain an efficient algorithm for feature evaluation which takes into account interactions among features. Numerical experiments demonstrate that our method accurately identifies relevant features and improves the accuracy of numerical models.

relevant feature, selection, sensitivity index, (12 more...)

arXiv.org Machine Learning

1910.09851

Country:

North America > United States > California > Orange County > Irvine (0.04)
Asia > Middle East > UAE (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Targeted Estimation of Heterogeneous Treatment Effect in Observational Survival Analysis

Zhu, Jie, Gallego, Blanca

arXiv.org Machine LearningOct-22-2019

The aim of clinical effectiveness research using repositories of electronic health records is to identify what health interventions 'work best' in real-world settings. Since there are several reasons why the net benefit of intervention may differ across patients, current comparative effectiveness literature focuses on investigating heterogeneous treatment effect and predicting whether an individual might benefit from an intervention. The majority of this literature has concentrated on the estimation of the effect of treatment on binary outcomes. However, many medical interventions are evaluated in terms of their effect on future events, which are subject to loss to follow-up. In this study, we describe a framework for the estimation of heterogeneous treatment effect in terms of differences in time-to-event (survival) probabilities. We divide the problem into three phases: (1) estimation of treatment effect conditioned on unique sets of the covariate vector; (2) identification of features important for heterogeneity using an ensemble of non-parametric variable importance methods; and (3) estimation of treatment effect on the reference classes defined by the previously selected features, using one-step Targeted Maximum Likelihood Estimation. We conducted a series of simulation studies and found that this method performs well when either sample size or event rate is high enough and the number of covariates contributing to the effect heterogeneity is moderate. An application of this method to a clinical case study was conducted by estimating the effect of oral anticoagulants on newly diagnosed non-valvular atrial fibrillation patients using data from the UK Clinical Practice Research Datalink.

estimation, survival probability, treatment effect, (14 more...)

arXiv.org Machine Learning

1910.08877

Country:

Europe > United Kingdom (0.14)
Oceania > Australia > New South Wales (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength Medium (0.94)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.88)
Health & Medicine > Health Care Technology > Medical Record (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

Add feedback