AITopics

1511.08327

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report (1.00)

Industry: Education (0.34)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Grzegorczyk, Marco, Aderhold, Andrej, Husmeier, Dirk

Targeting Bayes factors with direct-path non-equilibrium thermodynamic integration

arXiv.org Machine Learning, Mar-21-2017

Thermodynamic integration (TI) for computing marginal likelihoods is based on an inverse annealing path from the prior to the posterior distribution. In many cases, the resulting estimator suffers from high variability, which particularly stems from the prior regime. When comparing complex models with differences in a comparatively small number of parameters, intrinsic errors from sampling fluctuations may outweigh the differences in the log marginal likelihood estimates. In the present article, we propose a thermodynamic integration scheme that directly targets the log Bayes factor. The method is based on a modified annealing path between the posterior distributions of the two models compared, which systematically avoids the high variance prior regime. We combine this scheme with the concept of non-equilibrium TI to minimise discretisation errors from numerical integration. Results obtained on Bayesian regression models applied to standard benchmark data, and a complex hierarchical model applied to biopathway inference, demonstrate a significant reduction in estimator variance over state-of-the-art TI methods.
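For orientation, the standard TI identity that the abstract builds on can be written as below. This is the textbook prior-to-posterior path, not the modified direct path the authors propose; the symbols (data D, parameters θ, inverse temperature τ, models M_1 and M_2) are notation assumed here, not taken from the listing.

```latex
% Power posterior at inverse temperature \tau \in [0, 1]
% (\tau = 0 gives the prior, \tau = 1 gives the posterior):
p_\tau(\theta \mid D) \;\propto\; p(D \mid \theta)^{\tau}\, p(\theta)

% Standard TI identity for the log marginal likelihood:
\log p(D) \;=\; \int_0^1
  \mathbb{E}_{\theta \sim p_\tau(\cdot \mid D)}
  \big[ \log p(D \mid \theta) \big] \, d\tau

% The log Bayef factor is the difference of two such integrals,
% one per model, each estimated along its own path:
\log B_{12} \;=\; \log p(D \mid M_1) \;-\; \log p(D \mid M_2)
```

Estimating the two integrals separately means the sampling error of each, including the contribution near τ = 0, enters the difference, which is the variability the abstract describes.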

artificial intelligence, bayesian inference, machine learning

doi: 10.1007/s00180-017-0721-7

1703.07305

Country: North America > United States (0.27)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

#artificialintelligence, Mar-19-2017, 23:36:56 GMT

Loss function for Logistic Regression

If we are doing binary classification using logistic regression, we often use the cross-entropy function as our loss function. However, if we are doing linear regression, we often use squared error as our loss function. Question: Are there any specific reasons for using the cross-entropy function instead of squared error or the classification error in logistic regression? I read somewhere that, if we use squared error for binary classification, the resulting loss function would be non-convex. Is this the only reason, or is there a deeper reason that I am missing?
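The convexity point raised in the question can be checked numerically. The following is a minimal sketch, assuming a one-parameter logistic model with a single training point (x = 1, y = 1); the function names are illustrative only. It compares the discrete curvature (second differences) of the two losses over a grid of weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(w, x, y):
    # Mean negative log-likelihood of a Bernoulli model with p = sigmoid(w * x).
    p = sigmoid(w * x)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def squared_error(w, x, y):
    # Mean squared difference between the label and the predicted probability.
    p = sigmoid(w * x)
    return np.mean((y - p) ** 2)

def second_differences(loss, ws, x, y):
    # f(w - h) - 2 f(w) + f(w + h) on an evenly spaced grid;
    # a convex function never produces a negative value here.
    vals = np.array([loss(w, x, y) for w in ws])
    return vals[:-2] - 2 * vals[1:-1] + vals[2:]

# Assumed toy data: one positive example.
x = np.array([1.0])
y = np.array([1.0])
ws = np.linspace(-6.0, 6.0, 121)

ce_curv = second_differences(cross_entropy, ws, x, y)
se_curv = second_differences(squared_error, ws, x, y)

print("cross-entropy min curvature:", ce_curv.min())
print("squared-error min curvature:", se_curv.min())
```

On this toy problem the cross-entropy curvature stays positive across the grid, while the squared-error curvature turns negative where the model is confidently wrong (large negative w). That flat, concave region is also where the squared-error gradient becomes very small.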

artificial intelligence, loss function, machine learning

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence, Mar-17-2017, 04:40:18 GMT

Intuitive Machine Learning : Gradient Descent Simplified – YOU CANalytics

Machines learn the same way humans do. Humans learn from experience, and so do machines. For machines, experience takes the form of data. Machines use powerful algorithms to make sense of the data. They identify underlying patterns within the data to learn things about the world.

artificial intelligence, loss function, machine learning

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.38)

#artificialintelligence, Mar-16-2017, 20:05:20 GMT

Ideas on interpreting machine learning

You've probably heard by now that machine learning algorithms can use big data to predict whether a donor will give to a charity, whether an infant in a NICU will develop sepsis, whether a customer will respond to an ad, and on and on. Machine learning can even drive cars and predict elections. I believe it can, but these recent high-profile hiccups should leave everyone who works with data (big or not) and machine learning algorithms asking themselves some very hard questions: do I understand my data? Do I understand the model and answers my machine learning algorithm is giving me? And do I trust these answers? Unfortunately, the complexity that bestows the extraordinary predictive abilities on machine learning algorithms also makes the answers the algorithms produce hard to ...

artificial intelligence, data mining, machine learning

#artificialintelligence

AI-Alerts: 2017 > 2017-03 > AAAI AI-Alert for Mar 21, 2017 (1.00)

Genre: Research Report (0.49)

Industry: Banking & Finance (0.93)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Selective Harvesting over Networks

Murai, Fabricio, Rennó, Diogo, Ribeiro, Bruno, Pappa, Gisele L., Towsley, Don, Gile, Krista

Active search (AS) on graphs focuses on collecting certain labeled nodes (targets) given global knowledge of the network topology and its edge weights under a query budget. However, in most networks, nodes, topology and edge weights are all initially unknown. We introduce selective harvesting, a variant of AS where the next node to be queried must be chosen among the neighbors of the current queried node set; the available training data for deciding which node to query is restricted to the subgraph induced by the queried set (and their node attributes) and their neighbors (without any node or edge attributes). Therefore, selective harvesting is a sequential decision problem, where we must decide which node to query at each step. A classifier trained in this scenario suffers from a tunnel vision effect: without recourse to independent sampling, the urge to query promising nodes forces classifiers to gather increasingly biased training data, which we show significantly hurts the performance of AS methods and standard classifiers. We find that it is possible to collect a much larger set of targets by using multiple classifiers, not by combining their predictions as an ensemble, but switching between classifiers used at each step, as a way to ease the tunnel vision effect. We discover that switching classifiers collects more targets by (a) diversifying the training data and (b) broadening the choices of nodes that can be queried next. This highlights an exploration, exploitation, and diversification trade-off in our problem that goes beyond the exploration and exploitation duality found in classic sequential decision problems. From these observations we propose D3TS, a method based on multi-armed bandits for non-stationary stochastic processes that enforces classifier diversity, matching or exceeding the performance of competing methods on seven real network datasets in our evaluation.

bayesian inference, classifier, upstream oil & gas

1703.05082

Country:

South America > Brazil (0.14)
North America > United States > Wisconsin (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Government (0.67)
Energy > Oil & Gas > Upstream (0.54)
Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Kallummil, Sreejith, Kalyani, Sheetal

Tuning Free Orthogonal Matching Pursuit

Orthogonal matching pursuit (OMP) is a widely used compressive sensing (CS) algorithm for recovering sparse signals in noisy linear regression models. The performance of OMP depends on its stopping criteria (SC). The SC for OMP discussed in the literature typically assume knowledge of either the sparsity of the signal to be estimated, $k_0$, or the noise variance $\sigma^2$, both of which are unavailable in many practical applications. In this article we develop a modified version of OMP called tuning free OMP or TF-OMP which does not require an SC. TF-OMP is proved to accomplish successful sparse recovery under the usual assumptions on restricted isometry constants (RIC) and mutual coherence of the design matrix. TF-OMP is numerically shown to deliver a highly competitive performance in comparison with OMP having a priori knowledge of $k_0$ or $\sigma^2$. Greedy algorithm for robust de-noising (GARD) is an OMP-like algorithm proposed for efficient estimation in classical overdetermined linear regression models corrupted by sparse outliers. However, GARD requires knowledge of the inlier noise variance, which is difficult to estimate. We also produce a tuning free algorithm (TF-GARD) for efficient estimation in the presence of sparse outliers by extending the operating principle of TF-OMP to GARD. TF-GARD is numerically shown to achieve a performance comparable to that of the existing implementation of GARD.
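To make the role of the stopping criterion concrete, here is a minimal sketch of standard OMP that stops after a known number of iterations $k_0$. This is the baseline the abstract contrasts against, not TF-OMP; the function name `omp`, the problem sizes, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def omp(A, y, k):
    """Standard OMP: greedily select k columns of A to explain y."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):  # stopping criterion: known sparsity k
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never reselect a column
        support.append(int(np.argmax(correlations)))
        # Refit by least squares on all selected columns, then update residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat, sorted(support)

# Assumed synthetic problem: 80 measurements, 120 unknowns, 3 nonzeros.
rng = np.random.default_rng(0)
A = rng.normal(size=(80, 120))
A /= np.linalg.norm(A, axis=0)  # unit-norm columns
true_support = sorted(rng.choice(120, size=3, replace=False).tolist())
x_true = np.zeros(120)
x_true[true_support] = [3.0, -2.5, 2.0]
y = A @ x_true + 0.01 * rng.normal(size=80)

x_hat, support = omp(A, y, k=3)
print("true support:     ", true_support)
print("recovered support:", support)
```

The `for _ in range(k)` line is where the prior knowledge enters: without $k_0$ (or a residual threshold derived from $\sigma^2$), the loop has no principled place to stop, which is the gap a tuning free variant aims to close.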

artificial intelligence, machine learning, tf-omp

1703.0508

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Zachariah, Dave, Stoica, Petre, Schön, Thomas B.

Online Learning for Distribution-Free Prediction

We develop an online learning method for prediction, which is important in problems with large and/or streaming data sets. We formulate the learning approach using a covariance-fitting methodology, and show that the resulting predictor has desirable computational and distribution-free properties: It is implemented online with a runtime that scales linearly in the number of samples; has a constant memory requirement; avoids local minima problems; and prunes away redundant feature dimensions without relying on restrictive assumptions on the data distribution. In conjunction with the split conformal approach, it also produces distribution-free prediction confidence intervals in a computationally efficient manner. The method is demonstrated on both real and synthetic datasets.
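The split conformal step mentioned in the abstract can be sketched in a few lines. This example assumes an ordinary least-squares line as a stand-in predictor (the paper's covariance-fitting predictor is not reproduced here), and the helper names, sample sizes, and heavy-tailed synthetic data are all illustrative assumptions.

```python
import numpy as np

def fit_line(x, y):
    # Stand-in predictor: least-squares fit of y = b0 + b1 * x.
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def conformal_quantile(residuals, alpha):
    # The ceil((n + 1)(1 - alpha))-th smallest absolute calibration residual.
    scores = np.sort(np.abs(residuals))
    n = len(scores)
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    if rank > n:
        raise ValueError("calibration set too small for this alpha")
    return scores[rank - 1]

rng = np.random.default_rng(1)

def sample(n):
    # Assumed data: linear signal with heavy-tailed (Student-t) noise.
    x = rng.uniform(-2, 2, size=n)
    y = 1.0 + 2.0 * x + rng.standard_t(df=5, size=n)
    return x, y

x_tr, y_tr = sample(500)    # split 1: fit the predictor
x_cal, y_cal = sample(500)  # split 2: calibrate the interval width
x_te, y_te = sample(2000)   # fresh data to check coverage

coef = fit_line(x_tr, y_tr)
predict = lambda x: coef[0] + coef[1] * x

q = conformal_quantile(y_cal - predict(x_cal), alpha=0.1)
lower, upper = predict(x_te) - q, predict(x_te) + q
coverage = np.mean((y_te >= lower) & (y_te <= upper))
print("interval half-width:", q)
print("empirical coverage: ", coverage)
```

The interval width comes only from ranks of held-out residuals, so no noise distribution is assumed; the noise here is deliberately non-Gaussian to illustrate that.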

artificial intelligence, machine learning, predictor

1703.0506

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.62)

mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions

Bischl, Bernd, Richter, Jakob, Bossek, Jakob, Horn, Daniel, Thomas, Janek, Lang, Michel

We present mlrMBO, a flexible and comprehensive R toolbox for model-based optimization (MBO), also known as Bayesian optimization, which addresses the problem of expensive black-box optimization by approximating the given objective function through a surrogate regression model. It is designed for both single- and multi-objective optimization with mixed continuous, categorical and conditional parameters. Additional features include multi-point batch proposal, parallelization, visualization, logging and error-handling. mlrMBO is implemented in a modular fashion, such that single components can be easily replaced or adapted by the user for specific use cases, e.g., any regression learner from the mlr toolbox for machine learning can be used, and infill criteria and infill optimizers are easily exchangeable. We empirically demonstrate that mlrMBO provides state-of-the-art performance by comparing it on different benchmark scenarios against a wide range of other optimizers, including DiceOptim, rBayesianOptimization, SPOT, SMAC, Spearmint, and Hyperopt.

evolutionary algorithm, machine learning, optimization

1703.03373

Country: Europe > Germany (0.28)

Genre: Research Report (0.82)

Industry: Transportation > Air (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

@machinelearnbot, Mar-10-2017, 20:30:07 GMT

"Multicollinearity" a Problem or an Opportunity?

Multicollinearity (collinearity) is not a new term, especially when dealing with multiple regression models. It can arise wherever one response variable is modelled from a set of predictor variables, which also includes models such as classification and regression trees and neural networks. Collinearity is notorious for inflating the variance of at least one estimated regression coefficient, which can cause the model to predict erroneously, and in a business setting this can have irreparable consequences. So, the next logical question is how to identify collinearity. In this article we will only discuss the Variance Inflation Factor (VIF), a technique that is very useful for identifying high multicollinearity among the predictor variables when working with multiple linear regression (MLR) models.
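As a rough illustration of the VIF idea, the sketch below regresses each predictor on the remaining ones and reports 1 / (1 - R²), which equals the total sum of squares divided by the residual sum of squares. The function name and the synthetic predictors are assumptions for this example, not part of the linked article.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the rest."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # include an intercept
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = residuals @ residuals
        ss_tot = np.sum((target - target.mean()) ** 2)
        vifs[j] = ss_tot / ss_res  # same as 1 / (1 - R^2)
    return vifs

# Assumed synthetic predictors: x3 is almost a copy of x1, x2 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + 0.1 * rng.normal(size=500)

vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(dict(zip(["x1", "x2", "x3"], np.round(vifs, 2))))
```

A common rule of thumb flags predictors with a VIF above roughly 5 or 10; in this toy data x1 and x3 land far above that range while x2 stays near 1.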

artificial intelligence, machine learning, multicollinearity

@machinelearnbot

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)