AITopics | Regression

Regression

Privacy-Preserving Boosting in the Local Setting

arXiv.org Machine Learning | Feb-5-2020

In machine learning, boosting is one of the most popular methods designed to combine multiple base learners into a superior one. The well-known Boosted Decision Tree classifier has been widely adopted in many areas. In the big data era, the data held by individuals and entities, such as personal images, browsing history and census information, are more likely to contain sensitive information. Privacy concerns arise when such data leave the hands of their owners and are further explored or mined. Such privacy issues demand that machine learning algorithms be privacy-aware. Recently, Local Differential Privacy has been proposed as an effective privacy protection approach, which offers a strong guarantee to the data owners, as the data are perturbed before any further usage, and the true values never leave the hands of the owners. Thus, machine learning algorithms that work on private data instances are of great value and importance. In this paper, we are interested in developing a privacy-preserving boosting algorithm that allows a data user to build a classifier without knowing or deriving the exact value of each data sample. Our experiments demonstrate the effectiveness of the proposed boosting algorithm and the high utility of the learned classifiers.
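The abstract does not say which perturbation mechanism the paper uses. As a minimal sketch of the general idea of local differential privacy, the example below uses classic randomized response on a single binary value: each owner flips their bit with a probability controlled by `epsilon` before sharing it, and the data user debiases the aggregate. The function names, the value of `epsilon`, and the synthetic data are assumptions made for illustration only.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), otherwise flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_mean(reports, epsilon):
    """Debias the mean of the perturbed bits to estimate the true mean."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    # E[report] = (1 - p) + true_mean * (2p - 1), solved for true_mean.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical population: 30% of owners hold the value 1.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
estimate = estimate_mean(reports, 1.0)
```

The data user only ever sees `reports`, never `true_bits`, yet the debiased estimate stays close to the true proportion when enough owners contribute.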

algorithm, classifier, data owner, (16 more...)

arXiv.org Machine Learning

2002.02096

Country:

North America > United States > Florida > Hillsborough County > Tampa (0.14)
South America > Brazil (0.05)
North America > United States > North Carolina (0.04)
(8 more...)

Genre: Research Report (0.83)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.90)
(2 more...)

Robust Boosting for Regression Problems

Ju, Xiaomeng, Salibián-Barrera, Matías

arXiv.org Machine Learning | Feb-5-2020

The gradient boosting algorithm constructs a regression estimator using a linear combination of simple "base learners". In order to obtain a robust non-parametric regression estimator that is scalable to high-dimensional problems, we propose a robust boosting algorithm based on a two-stage approach, similar to what is done for robust linear regression: we first minimize a robust residual scale estimator, and then improve its efficiency by optimizing a bounded loss function. Unlike previous proposals, our algorithm does not need to compute an ad-hoc residual scale estimator in each step. Since our loss functions are typically non-convex, we propose initializing our algorithm with an $L_1$ regression tree, which is fast to compute. We also introduce a robust variable importance metric for variable selection that is calculated via a permutation procedure. Through simulated and real data experiments, we compare our method against gradient boosting with squared loss and other robust boosting methods in the literature. With clean data, our method works as well as gradient boosting with the squared loss. With symmetrically and asymmetrically contaminated data, we show that our proposed method outperforms these alternatives in terms of prediction error and variable selection accuracy.
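The abstract does not name its bounded loss. One common choice in robust regression is Tukey's bisquare, sketched below under that assumption, together with its derivative (the quantity a gradient boosting step would fit as pseudo-residuals). The tuning constant `c = 4.685` is the conventional default for this loss and is not taken from the paper.

```python
def tukey_bisquare(r, c=4.685):
    """Tukey's bisquare loss: about r^2 / 2 near zero, constant c^2 / 6 once |r| >= c."""
    if abs(r) >= c:
        return c * c / 6.0
    u = 1.0 - (r / c) ** 2
    return (c * c / 6.0) * (1.0 - u ** 3)


def tukey_psi(r, c=4.685):
    """Derivative of the loss; it is zero for |r| >= c, so gross outliers exert no pull."""
    if abs(r) >= c:
        return 0.0
    return r * (1.0 - (r / c) ** 2) ** 2


# Standardized residuals: two ordinary points and one gross outlier.
residuals = [0.3, -1.2, 50.0]
pseudo_residuals = [tukey_psi(r) for r in residuals]
```

Because the loss is bounded, it is also non-convex, which is why a robust starting point such as the $L_1$ regression tree mentioned in the abstract matters.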

estimator, gradient, loss function, (17 more...)

arXiv.org Machine Learning

2002.02054

Country:

Europe > Austria > Vienna (0.14)
Oceania > Australia > Tasmania (0.04)
North America > Canada > British Columbia (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Sharpe Ratio in High Dimensions: Cases of Maximum Out of Sample, Constrained Maximum, and Optimal Portfolio Choice

Caner, Mehmet, Medeiros, Marcelo, Vasconcelos, Gabriel

arXiv.org Machine Learning | Feb-5-2020

In this paper, we analyze the maximum Sharpe ratio when the number of assets in a portfolio is larger than its time span. One obstacle in this large-dimensional setup is the singularity of the sample covariance matrix of the excess asset returns. To solve this issue, we benefit from a technique called nodewise regression, which was developed by Meinshausen and Bühlmann (2006). It provides a sparse/weakly sparse and consistent estimate of the precision matrix, using the Lasso method. We analyze three issues. One of the key results in our paper is that mean-variance efficiency for the portfolios in large dimensions is established. Tied to that result, we also show that the maximum out-of-sample Sharpe ratio can be consistently estimated in this large portfolio of assets. Furthermore, we provide convergence rates and see that the number of assets slows down the convergence up to a logarithmic factor. Then, we provide consistency of the maximum Sharpe ratio when the portfolio weights add up to one, and also provide a new formula and an estimate for the constrained maximum Sharpe ratio. Finally, we provide consistent estimates of the Sharpe ratios of the global minimum variance portfolio and Markowitz's (1952) mean-variance portfolio. In terms of assumptions, we allow for time series data. Simulation and out-of-sample forecasting exercises show that our new method performs well compared to factor and shrinkage based techniques.
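To make the structure of nodewise regression concrete, the sketch below builds a precision matrix row by row from regressions of each asset on all the others, then plugs it into the textbook maximum Sharpe ratio formula, the square root of mu' Theta mu. It deliberately substitutes ordinary least squares for the Lasso, so it only works when there are more observations than assets, which is the opposite of the paper's setting. The function names and the simulated returns are assumptions for illustration.

```python
import numpy as np


def nodewise_precision(X):
    """Estimate the precision matrix via one regression per column.

    Uses ordinary least squares in place of the Lasso, so it needs n > p.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    theta = np.zeros((p, p))
    for j in range(p):
        others = [k for k in range(p) if k != j]
        # Regress column j on all remaining columns.
        gamma, *_ = np.linalg.lstsq(Xc[:, others], Xc[:, j], rcond=None)
        resid = Xc[:, j] - Xc[:, others] @ gamma
        tau2 = resid @ resid / n  # residual variance of node j
        theta[j, j] = 1.0 / tau2
        theta[j, others] = -gamma / tau2
    return theta


def max_sharpe_ratio(mu, theta):
    """Textbook maximum Sharpe ratio for mean excess returns mu and precision theta."""
    return float(np.sqrt(mu @ theta @ mu))


# Simulated excess returns: 200 periods, 4 assets (hypothetical data).
rng = np.random.default_rng(0)
returns = rng.normal(0.01, 0.05, size=(200, 4))
theta = nodewise_precision(returns)
sharpe = max_sharpe_ratio(returns.mean(axis=0), theta)
```

With least squares, the result coincides with the inverse of the sample covariance matrix. Replacing each regression with a Lasso fit is what lets the approach continue to produce an estimate when that inverse no longer exists.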

matrix, portfolio, sharpe ratio, (14 more...)

arXiv.org Machine Learning

2002.018

Country:

North America > United States > California > Orange County > Irvine (0.14)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > North Carolina (0.04)

Genre: Research Report (1.00)

Industry: Banking & Finance (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

A Regression Tsetlin Machine with Integer Weighted Clauses for Compact Pattern Representation

Abeyrathna, K. Darshana, Granmo, Ole-Christoffer, Goodwin, Morten

arXiv.org Machine Learning | Feb-4-2020

The Regression Tsetlin Machine (RTM) addresses the lack of interpretability impeding state-of-the-art nonlinear regression models. It does this by using conjunctive clauses in propositional logic to capture the underlying non-linear frequent patterns in the data. These, in turn, are combined into a continuous output through summation, akin to a linear regression function, but with non-linear components and unity weights. Although the RTM has solved non-linear regression problems with competitive accuracy, the resolution of the output is proportional to the number of clauses employed. This means that computation cost increases with resolution. To reduce this problem, we here introduce integer weighted RTM clauses. Our integer weighted clause is a compact representation of multiple clauses that capture the same sub-pattern: N repeating clauses are turned into one, with an integer weight N. This reduces computation cost N times, and increases interpretability through a sparser representation. We further introduce a novel learning scheme that allows us to simultaneously learn both the clauses and their weights, taking advantage of so-called stochastic searching on the line. We evaluate the potential of the integer weighted RTM empirically using six artificial datasets. The results show that the integer weighted RTM is able to achieve on par or better accuracy using significantly fewer computational resources compared to regular RTMs. We further show that integer weights yield improved accuracy over real-valued ones.
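The compaction idea can be illustrated without any learning: a sum over N identical unit-weight clauses equals one clause with integer weight N. The sketch below checks that equivalence on binary inputs. The clause encoding and all names are hypothetical and do not reflect the authors' implementation or their learning scheme.

```python
from itertools import product


def clause(literals, x):
    """Conjunction of literals; each literal is (feature_index, required_value)."""
    return int(all(x[i] == v for i, v in literals))


def rtm_output(clauses, weights, x):
    """Weighted sum of clause outputs."""
    return sum(w * clause(c, x) for c, w in zip(clauses, weights))


pattern_a = [(0, 1), (1, 0)]  # x0 AND NOT x1
pattern_b = [(2, 1)]          # x2

# Five unit-weight clauses, with repeats ...
unit_clauses = [pattern_a] * 3 + [pattern_b] * 2
unit_weights = [1] * 5

# ... versus two integer-weighted clauses.
compact_clauses = [pattern_a, pattern_b]
compact_weights = [3, 2]

same_everywhere = all(
    rtm_output(unit_clauses, unit_weights, x)
    == rtm_output(compact_clauses, compact_weights, x)
    for x in product([0, 1], repeat=3)
)
```

The compact form evaluates two clauses instead of five per input, which is the source of the cost reduction the abstract describes.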

regression tsetlin machine, rtm, tsetlin machine, (13 more...)

arXiv.org Machine Learning

2002.01245

Country: Europe > Norway (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Profit-oriented sales forecasting: a comparison of forecasting techniques from a business perspective

Van Calster, Tine, Bossche, Filip Van den, Baesens, Bart, Lemahieu, Wilfried

arXiv.org Machine Learning | Feb-3-2020

Choosing the technique that is best at forecasting your data is a problem that arises in any forecasting application. Decades of research have resulted in an enormous number of forecasting methods that stem from statistics, econometrics and machine learning (ML), which leads to a very difficult and elaborate choice in any forecasting exercise. This paper aims to facilitate this process for high-level tactical sales forecasts by comparing a large array of techniques for 35 time series that consist of both industry data from the Coca-Cola Company and publicly available datasets. However, instead of solely focusing on the accuracy of the resulting forecasts, this paper introduces a novel and completely automated profit-driven approach that takes into account the expected profit that a technique can create during both the model building and evaluation process. The expected profit function that is used for this purpose is easy to understand and adaptable to any situation by combining forecasting accuracy with business expertise. Furthermore, we examine the added value of ML techniques, the inclusion of external factors and the use of seasonal models in order to ascertain which type of model works best in tactical sales forecasting. Our findings show that simple seasonal time series models consistently outperform other methodologies and that the profit-driven approach can lead to selecting a different forecasting model.

forecast, forecasting, forecasting technique, (16 more...)

arXiv.org Machine Learning

2002.00949

Country:

Europe > United Kingdom (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > France (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.69)
Banking & Finance > Economy (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Stochastic geometry to generalize the Mondrian Process

O'Reilly, Eliza, Tran, Ngoc

arXiv.org Machine Learning | Feb-3-2020

The Mondrian process is a stochastic process that produces a recursive partition of space with random axis-aligned cuts. Random forests and Laplace kernel approximations built from the Mondrian process have led to efficient online learning methods and Bayesian optimization. By viewing the Mondrian process as a special case of the stable under iterated tessellation (STIT) process, we utilize tools from stochastic geometry to resolve three fundamental questions concerning the generalizability of the Mondrian process in machine learning. First, we show that the Mondrian process with general cut directions can be efficiently simulated, but it is unlikely to give rise to better classification or regression algorithms. Second, we characterize all possible kernels that generalizations of the Mondrian process can approximate. This includes, for instance, various forms of the weighted Laplace kernel and the exponential kernel. Third, we give an explicit formula for the density estimator arising from a Mondrian forest. This allows for precise comparisons between the Mondrian forest, the Mondrian kernel and the Laplace kernel in density estimation. Our paper calls for further developments at the novel intersection of stochastic geometry and machine learning.
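The recursive, axis-aligned construction in the first sentence can be sketched in a few lines. The version below follows the standard sampling scheme for the axis-aligned Mondrian process (not the generalized cut directions studied in the paper): the waiting time to the next cut is exponential with rate equal to the sum of the box's side lengths, the cut dimension is chosen proportionally to side length, and the cut position is uniform. The function name, the lifetime value, and the seed are assumptions.

```python
import random


def sample_mondrian(box, lifetime, rng, t=0.0):
    """Return the leaf boxes of a Mondrian partition of `box`.

    `box` is a list of (low, high) intervals, one per dimension.
    """
    lengths = [hi - lo for lo, hi in box]
    # Time of the next cut: exponential with rate = sum of side lengths.
    t_cut = t + rng.expovariate(sum(lengths))
    if t_cut > lifetime:
        return [box]  # budget exhausted, this box is a leaf
    # Pick the dimension proportionally to its length, then a uniform cut point.
    dim = rng.choices(range(len(box)), weights=lengths)[0]
    lo, hi = box[dim]
    cut = rng.uniform(lo, hi)
    left = box[:dim] + [(lo, cut)] + box[dim + 1:]
    right = box[:dim] + [(cut, hi)] + box[dim + 1:]
    return (sample_mondrian(left, lifetime, rng, t_cut)
            + sample_mondrian(right, lifetime, rng, t_cut))


# Partition the unit square with a lifetime budget of 3.0.
leaves = sample_mondrian([(0.0, 1.0), (0.0, 1.0)], 3.0, random.Random(1))
```

A larger lifetime yields a finer partition. Mondrian forests average predictions over many such independently sampled partitions.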

estimator, mondrian process, tessellation, (15 more...)

arXiv.org Machine Learning

2002.00797

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Overfitting Can Be Harmless for Basis Pursuit: Only to a Degree

Ju, Peizhong, Lin, Xiaojun, Liu, Jia

arXiv.org Machine Learning | Feb-2-2020

Recently, there has been significant interest in studying the generalization power of linear regression models in the overparameterized regime, with the hope that such analysis may provide the first step towards understanding why overparameterized deep neural networks generalize well even when they overfit the training data. Studies on min $\ell_2$-norm solutions that overfit the training data have suggested that such solutions exhibit the "double-descent" behavior, i.e., the test error decreases with the number of features $p$ in the overparameterized regime when $p$ is larger than the number of samples $n$. However, for linear models with i.i.d. Gaussian features, for large $p$ the model errors of such min $\ell_2$-norm solutions approach the "null risk," i.e., the error of a trivial estimator that always outputs zero, even when the noise is very low. In contrast, we study the overfitting solution of min $\ell_1$-norm, which is known as Basis Pursuit (BP) in the compressed sensing literature. Under a sparse true linear model with i.i.d. Gaussian features, we show that for a large range of $p$ up to a limit that grows exponentially with $n$, with high probability the model error of BP is upper bounded by a value that decreases with $p$ and is proportional to the noise level. To the best of our knowledge, this is the first result in the literature showing that, without any explicit regularization in such settings where both $p$ and the dimension of data are much larger than $n$, the test errors of a practical-to-compute overfitting solution can exhibit double-descent and approach the order of the noise level independently of the null risk. Our upper bound also reveals a descent floor for BP that is proportional to the noise level. Further, this descent floor is independent of $n$ and the null risk, but increases with the sparsity level of the true model.
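For reference, the two interpolating estimators contrasted in this abstract can be written as follows. The symbols $X \in \mathbb{R}^{n \times p}$ (feature matrix), $y \in \mathbb{R}^n$ (responses) and $w \in \mathbb{R}^p$ (weights) are notation assumed here; the abstract itself only names $n$ and $p$. The closed form for the $\ell_2$ case assumes $X$ has full row rank, and the last line is the standard linear-programming reformulation that makes Basis Pursuit practical to compute.

```latex
\begin{align*}
\hat{w}^{\ell_2} &= \arg\min_{w \in \mathbb{R}^p} \|w\|_2
  \quad \text{s.t. } Xw = y,
  \qquad \hat{w}^{\ell_2} = X^\top (X X^\top)^{-1} y, \\
\hat{w}^{\mathrm{BP}} &= \arg\min_{w \in \mathbb{R}^p} \|w\|_1
  \quad \text{s.t. } Xw = y, \\
\text{LP form:}\quad
  &\min_{u, v \ge 0} \; \mathbf{1}^\top (u + v)
  \quad \text{s.t. } X(u - v) = y,
  \qquad w = u - v.
\end{align*}
```

Both estimators fit the training data exactly when $p > n$; they differ only in which interpolating solution they select.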

deep learning, neural network, null 2, (20 more...)

arXiv.org Machine Learning

2002.00492

Country: North America > United States (0.27)

Genre: Research Report > New Finding (0.67)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (0.67)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.67)
Energy > Oil & Gas > Midstream (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

The Sylvester Graphical Lasso (SyGlasso)

Wang, Yu, Jang, Byoungwook, Hero, Alfred

arXiv.org Machine Learning | Feb-1-2020

This paper introduces the Sylvester graphical lasso (SyGlasso) that captures multiway dependencies present in tensor-valued data. The model is based on the Sylvester equation that defines a generative model. The proposed model complements the tensor graphical lasso (Greenewald et al., 2019) that imposes a Kronecker sum model for the inverse covariance matrix by providing an alternative Kronecker sum model that is generative and interpretable. A nodewise regression approach is adopted for estimating the conditional independence relationships among variables. The statistical convergence of the method is established, and empirical studies are provided to demonstrate the recovery of meaningful conditional dependency graphs. We apply the SyGlasso to an electroencephalography (EEG) study to compare the brain connectivity of alcoholic and nonalcoholic subjects. We demonstrate that our model can simultaneously estimate both the brain connectivity and its temporal dependencies.
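The Kronecker sum structure mentioned above can be made concrete with a small numerical sketch. It only shows how a Kronecker sum of two per-mode matrices is formed and one of its basic properties; it is not the SyGlasso estimator, and the matrices `psi_1` and `psi_2` are made-up examples.

```python
import numpy as np


def kronecker_sum(a, b):
    """Kronecker sum of square matrices: kron(A, I_m) + kron(I_n, B)."""
    n, m = a.shape[0], b.shape[0]
    return np.kron(a, np.eye(m)) + np.kron(np.eye(n), b)


# Hypothetical per-mode matrices for a 2 x 3 matrix-valued observation.
psi_1 = np.array([[2.0, -1.0],
                  [-1.0, 2.0]])
psi_2 = np.array([[3.0, 0.5, 0.0],
                  [0.5, 3.0, 0.5],
                  [0.0, 0.5, 3.0]])

omega = kronecker_sum(psi_1, psi_2)  # 6 x 6 matrix over the vectorized data
```

The eigenvalues of the Kronecker sum are all pairwise sums of the eigenvalues of the two factors, so positive definite factors give a positive definite result, while the number of free parameters grows with the sum rather than the product of the mode dimensions.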

matrix, precision matrix, syglasso, (14 more...)

arXiv.org Machine Learning

2002.00288

Country:

North America > United States > Michigan (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)
Europe > Italy > Sicily > Palermo (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Estimation of Z-Thickness and XY-Anisotropy of Electron Microscopy Images using Gaussian Processes

Ambegoda, Thanuja D., Martel, Julien N. P., Adamcik, Jozef, Cook, Matthew, Hahnloser, Richard H. R.

arXiv.org Machine Learning | Feb-1-2020

Serial section electron microscopy (ssEM) is a widely used technique for obtaining volumetric information of biological tissues at nanometer scale. However, accurate 3D reconstructions of identified cellular structures and volumetric quantifications require precise estimates of section thickness and anisotropy (or stretching) along the XY imaging plane. In fact, many image processing algorithms simply assume isotropy within the imaging plane. To ameliorate this problem, we present a method for estimating thickness and stretching of electron microscopy sections using nonparametric Bayesian regression of image statistics. We verify our thickness and stretching estimates using direct measurements obtained by atomic force microscopy (AFM) and show that our method has a lower estimation error compared to a recent indirect thickness estimation method as well as a relative Z coordinate estimation method. Furthermore, we have made the first dataset of ssSEM images with directly measured section thickness values publicly available for the evaluation of indirect thickness estimation methods.

Introduction: Electron microscopy (EM) has enabled imaging of nano-scale neuroanatomical structures such as synapses. Serial section Scanning Electron Microscopy (ssSEM) and serial section Transmission Electron Microscopy (ssTEM) are used to inspect tissue volumes on the scale of tens to hundreds of micrometers in each dimension. Tissue sections suitable for ssEM typically have a thickness that ranges from 30 nm to 70 nm.

section thickness, thickness, thickness estimate, (16 more...)

arXiv.org Machine Learning

2002.00228

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Switzerland > Zürich > Zürich (0.05)
Oceania > Fiji (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.89)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Linear regression with gradient descent in R.

#artificialintelligence | Jan-31-2020, 21:59:13 GMT

This demonstrates a basic machine learning linear regression. In the outputs, compare the values for intercept and slope from the built-in R lm() method with those that we calculate manually with gradient descent. The plots show how closely the red and blue lines overlap.
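The linked post works in R and its code is not reproduced on this page. As a rough Python analogue of the same comparison, the sketch below fits a line once with the closed-form least-squares solution (standing in for `lm()`) and once with plain gradient descent on the mean squared error. The data, learning rate, and step count are made-up values chosen so that the iteration converges.

```python
def fit_closed_form(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_gradient_descent(xs, ys, lr=0.05, steps=5000):
    """Minimize mean squared error by batch gradient descent; returns (intercept, slope)."""
    n = len(xs)
    intercept, slope = 0.0, 0.0
    for _ in range(steps):
        errors = [intercept + slope * x - y for x, y in zip(xs, ys)]
        grad_intercept = 2.0 * sum(errors) / n
        grad_slope = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        intercept -= lr * grad_intercept
        slope -= lr * grad_slope
    return intercept, slope


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
exact = fit_closed_form(xs, ys)
approx = fit_gradient_descent(xs, ys)
```

If the learning rate is too large for the scale of `xs`, the updates diverge instead of converging, so rescaling the predictor or lowering `lr` is the usual first remedy.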

gradient descent, linear regression, regression

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.92)
