AITopics

1908.09874

Country: North America > United States (0.93)

Genre: Research Report (0.65)

Industry: Education (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)

Pichler, Maximilian, Boreux, Virginie, Klein, Alexandra-Maria, Schleuning, Matthias, Hartig, Florian

Machine learning algorithms to infer trait matching and predict species interactions in ecological networks

arXiv.org Machine Learning Aug-26-2019

Ecologists have long suspected that species are more likely to interact if their traits match in a particular way. For example, a pollination interaction may be particularly likely if the proportions of a bee's tongue match flower shape in a beneficial way. Empirical evidence for trait matching, however, varies significantly in strength among different types of ecological networks. Here, we show that ambiguity among empirical trait matching studies may have arisen at least in part from using overly simple statistical models. Using simulated and real data, we contrast conventional regression models with Machine Learning (ML) models (Random Forest, Boosted Regression Trees, Deep Neural Networks, Convolutional Neural Networks, Support Vector Machines, naive Bayes, and k-Nearest-Neighbor), testing their ability to predict species interactions based on traits, and infer trait combinations causally responsible for species interactions. We find that the best ML models can successfully predict species interactions in plant-pollinator networks (up to 0.93 AUC) and outperform conventional regression models. Our results also demonstrate that ML models can better identify the causally responsible trait matching combinations than GLMs. In two case studies, the best ML models could successfully predict species interactions in a global plant-pollinator database and infer ecologically plausible trait matching rules for a plant-hummingbird network from Costa Rica, without any prior assumptions about the system. We conclude that flexible ML models offer many advantages over traditional regression models for understanding interaction networks. We anticipate that these results extrapolate to other network types, such as trophic or competitive networks. More generally, our results highlight the potential of ML and artificial intelligence for inference beyond standard tasks such as pattern recognition.
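For readers unfamiliar with the AUC figure quoted in the abstract, the following is a minimal sketch of how AUC can be computed for a binary interaction predictor. It uses the pairwise-ranking definition (the fraction of positive/negative pairs that the model orders correctly). The function name and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (pos.size * neg.size)

# Hypothetical data: 1 = the plant-pollinator pair interacts, 0 = it does not.
observed = [1, 1, 0, 1, 0, 0]
predicted = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
example_auc = auc_score(observed, predicted)
print(f"AUC = {example_auc:.3f}")
```

An AUC of 1.0 means every interacting pair is scored above every non-interacting pair, while 0.5 corresponds to random ordering.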

artificial intelligence, deep learning, machine learning

1908.09853

Country:

Europe > Germany (0.28)
Europe > Austria (0.28)
North America > Costa Rica (0.25)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

#artificialintelligence Aug-25-2019, 23:50:04 GMT

Regression Analysis in One Picture

The basic idea behind regression analysis is to take a set of data and use that data to make predictions. A useful first step is to make a scatter plot to see the rough shape of your data. Then, choose a regression method to find the line of best fit. Which method you choose depends upon the shape the scatter plot reveals (is it a line, a curve, or a parabola?). The following image shows an overview of regression; see below for links to more detail.
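As a rough illustration of matching the regression method to the shape of the data, the sketch below fits both a straight line and a quadratic to the same points and compares their residual sums of squares. The data are synthetic and the helper name `fit_and_score` is hypothetical; it only assumes NumPy's `polyfit` and `polyval`.

```python
import numpy as np

# Hypothetical data: a noisy parabola, standing in for what a scatter plot might reveal.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 1.5 * x**2 - 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

def fit_and_score(x, y, degree):
    """Least-squares polynomial fit; returns coefficients and residual sum of squares."""
    coeffs = np.polyfit(x, y, deg=degree)
    residuals = y - np.polyval(coeffs, x)
    return coeffs, float(np.sum(residuals**2))

line_coeffs, line_rss = fit_and_score(x, y, degree=1)
quad_coeffs, quad_rss = fit_and_score(x, y, degree=2)
print(f"straight line RSS: {line_rss:.1f}")
print(f"quadratic RSS:     {quad_rss:.1f}")
```

When the scatter plot is curved, the straight-line fit leaves much larger residuals, which is the signal to move to a different regression method.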

artificial intelligence, machine learning, regression analysis

Genre:

Research Report > New Finding (0.70)
Research Report > Experimental Study (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence Aug-25-2019, 10:07:38 GMT

Understanding Neural Networks within Data Science

Moving forward, let's start with our basic imports: Let's say you want to make a model that is either classification- or regression-based. How would you know which model is best and which one you should apply to your data set? To answer this, you need to fully understand the data you're trying to apply data science concepts to. My cybersecurity data science project was an unbalanced classification problem, so I decided to use a classification neural network model on the data.

artificial intelligence, machine learning, neural network

Industry: Education > Curriculum > Subject-Specific Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

#artificialintelligence Aug-25-2019, 04:52:27 GMT

How Concerned Should You be About Predictor Collinearity? It Depends…

This past Northern Hemisphere summer, I gave several talks (some in the Southern Hemisphere) in which one of the Q&A topics was the problem of collinearity between predictor variables (also known as multicollinearity). My stock response to a question on this topic was (and is) to reply with the clarifying question, "How many rows do you have to develop the model?" If the follow-up response was in the tens of thousands, my counter-response was "Don't worry about collinearity." In contrast, if the audience member's response was a few hundred rows or less, my response was "Very!" While these two different responses may seem contradictory, they actually are not.
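One common way to see why the row count matters (not necessarily the author's own argument) is to simulate it: collinearity inflates the sampling variability of regression coefficients, and more rows shrink that variability again. The sketch below is a small simulation under assumed settings (two predictors, true coefficients of 1, unit noise); the function name `coef_spread` is hypothetical.

```python
import numpy as np

def coef_spread(n_rows, corr, n_repeats=100, seed=0):
    """Std. dev. of the first OLS coefficient across repeated simulated data sets."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, corr], [corr, 1.0]])
    estimates = []
    for _ in range(n_repeats):
        # Two predictors with the requested correlation (assumed setup).
        X = rng.multivariate_normal(np.zeros(2), cov, size=n_rows)
        y = X @ np.array([1.0, 1.0]) + rng.normal(size=n_rows)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        estimates.append(beta[0])
    return float(np.std(estimates))

for n_rows in (200, 20000):
    for corr in (0.0, 0.95):
        spread = coef_spread(n_rows, corr)
        print(f"rows={n_rows:>6}  corr={corr:.2f}  coefficient spread={spread:.3f}")
```

With a few hundred rows, strongly correlated predictors give noticeably unstable coefficients; with tens of thousands of rows, the same correlation leaves the estimates far more stable.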

artificial intelligence, data mining, machine learning

Technology:

Information Technology > Data Science > Data Mining (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)

Bechberger, Lucas, Kühnberger, Kai-Uwe

Generalizing Psychological Similarity Spaces to Unseen Stimuli

arXiv.org Machine Learning Aug-25-2019

Generalizing Psychological Similarity Spaces to Unseen Stimuli: Combining Multidimensional Scaling with Artificial Neural Networks

Lucas Bechberger and Kai-Uwe Kühnberger

Abstract

The cognitive framework of conceptual spaces proposes to represent concepts as regions in psychological similarity spaces. These similarity spaces are typically obtained through multidimensional scaling (MDS), which converts human dissimilarity ratings for a fixed set of stimuli into a spatial representation. One can distinguish metric MDS (which assumes that the dissimilarity ratings are interval or ratio scaled) from nonmetric MDS (which only assumes an ordinal scale). In our first study, we show that despite its additional assumptions, metric MDS does not necessarily yield better solutions than nonmetric MDS. In this chapter, we furthermore propose to learn a mapping from raw stimuli into the similarity space using artificial neural networks (ANNs) in order to generalize the similarity space to unseen inputs. In our second study, we show that a linear regression from the activation vectors of a convolutional ANN to similarity spaces obtained by MDS can be successful and that the results are sensitive to the number of dimensions of the similarity space.

1 Introduction

The cognitive framework of conceptual spaces [Gärdenfors, 2000] proposes a geometric representation of conceptual structures: Instances are represented as points and concepts are represented as regions in psychological similarity spaces. Based on this representation, one can explain a range of cognitive phenomena from one-shot ...

Lucas Bechberger, Institute of Cognitive Science, Osnabrück University, email: lucas.bechberger@

The research presented in this paper is an updated, corrected, and significantly extended version of research reported in [Bechberger and Kypridemou, 2018].

arXiv:1908.09260v1

In principle, there are three ways of obtaining the dimensions of a conceptual space: If the domain of interest is well understood, one can manually define the dimensions and thus the overall similarity space. A second approach is based on machine learning algorithms for dimensionality reduction. For instance, unsupervised artificial neural networks (ANNs) such as autoencoders or self-organizing maps can be used to find a compressed representation for a given set of input stimuli. This task is typically solved by optimizing a mathematical error function, which may not be satisfactory from a psychological point of view. A third way of obtaining the dimensions of a conceptual space is based on dissimilarity ratings obtained from human subjects. The technique of "multidimensional scaling" (MDS) takes as input these pairwise dissimilarities as well as the desired number t of dimensions. It then represents each stimulus as a point in a t-dimensional space in such a way that the distances between points in this space reflect the dissimilarities of their corresponding stimuli.
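To make the MDS step concrete, here is a minimal sketch of classical (Torgerson) metric MDS, which takes a matrix of pairwise dissimilarities and a target dimensionality t and returns one point per stimulus. This closed-form variant is swapped in for illustration only; the excerpt does not say which MDS algorithm the chapter uses, and the function name and toy "ratings" are assumptions.

```python
import numpy as np

def classical_mds(dissimilarities, t):
    """Embed n stimuli in t dimensions from an (n, n) dissimilarity matrix (Torgerson scaling)."""
    d = np.asarray(dissimilarities, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared dissimilarities to get an inner-product (Gram) matrix.
    gram = -0.5 * centering @ (d**2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # eigh returns eigenvalues in ascending order; keep the t largest.
    top = np.argsort(eigvals)[::-1][:t]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Hypothetical dissimilarities: distances between the corners of a 3-by-4 rectangle.
points = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
ratings = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

embedding = classical_mds(ratings, t=2)
recovered = np.linalg.norm(embedding[:, None, :] - embedding[None, :, :], axis=-1)
print(np.round(recovered, 3))
```

The recovered coordinates are only determined up to rotation and reflection, so the check that matters is that inter-point distances reproduce the input dissimilarities.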

artificial intelligence, machine learning, similarity space

1908.09260

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

#artificialintelligence Aug-24-2019, 20:22:33 GMT

Bias, Variance, and Regularization in Linear Regression: Lasso, Ridge, and Elastic Net -- Differences and uses

Regression is an incredibly popular and common machine learning technique. Often the starting point in learning machine learning, linear regression is an intuitive algorithm for easy-to-understand problems. Whenever you're trying to predict a continuous variable (a variable that can take any value in some numeric range), linear regression and its relatives are often strong options, and are almost always the best place to start. This blog assumes a functional knowledge of ordinary least squares (OLS) linear regression.
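As a small illustration of how regularization modifies OLS, the sketch below implements ridge regression in closed form, where the penalty strength `alpha` is added to the diagonal of X^T X. Setting `alpha` to zero recovers plain OLS. The function name, the synthetic data, and the choice to omit an intercept are assumptions made for brevity; lasso and elastic net include an L1 penalty, have no closed form like this, and are normally fit with iterative solvers.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge coefficients: solve (X^T X + alpha * I) beta = X^T y. No intercept."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Hypothetical data with three predictors and known coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

ols_beta = ridge_fit(X, y, alpha=0.0)     # alpha = 0 reduces to OLS
ridge_beta = ridge_fit(X, y, alpha=10.0)  # larger alpha shrinks the coefficients
print("OLS coefficients:  ", np.round(ols_beta, 3))
print("Ridge coefficients:", np.round(ridge_beta, 3))
```

The shrinkage toward zero is the bias that ridge accepts in exchange for lower variance in the coefficient estimates.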

coefficient, lasso, predictor

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Akbari, Mohammad, Chunara, Rumi

Using Contextual Information to Improve Blood Glucose Prediction

arXiv.org Machine Learning Aug-24-2019

Blood glucose value prediction is an important task in diabetes management. While it is reported that glucose concentration is sensitive to social context such as mood, physical activity, stress, and diet, alongside the influence of diabetes pathologies, we need more research on data and methodologies to incorporate and evaluate signals about such temporal context in prediction models. Person-generated data sources, such as actively contributed surveys and passively mined social media data, offer an opportunity to capture such context; however, the self-reported nature and sparsity of such data mean that they are noisier and less specific than physiological measures such as blood glucose values themselves. Therefore, we propose a Gaussian Process model that both addresses these data challenges and combines blood glucose and latent feature representations of contextual data for a novel multi-signal blood glucose prediction task. We find this approach outperforms common methods for multi-variate data, as well as using the blood glucose values in isolation. Given a robust evaluation across two blood glucose datasets with different forms of contextual information, we conclude that multi-signal Gaussian Processes can improve blood glucose prediction by using contextual information and may provide a significant shift in blood glucose prediction research and practice.

blood glucose value, contextual information, information

1909.01735

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Nagasaki Prefecture > Nagasaki (0.04)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)

arXiv.org Machine Learning Aug-23-2019

Consistent Classification with Generalized Metrics

Wang, Xiaoyan, Li, Ran, Yan, Bowei, Koyejo, Oluwasanmi

We propose a framework for constructing and analyzing multiclass and multioutput classification metrics, i.e., metrics involving multiple, possibly correlated multiclass labels. Our analysis reveals novel insights on the geometry of feasible confusion tensors, including necessary and sufficient conditions for the equivalence between optimizing an arbitrary non-decomposable metric and learning a weighted classifier. Further, we analyze averaging methodologies commonly used to compute multioutput metrics and characterize the corresponding Bayes optimal classifiers. We show that the plug-in estimator based on this characterization is consistent and is easily implemented as a post-processing rule. Empirical results on synthetic and benchmark datasets support the theoretical findings.

classifier, machine learning, natural language

1908.09057

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Kuhn, Daniel, Esfahani, Peyman Mohajerin, Nguyen, Viet Anh, Shafieezadeh-Abadeh, Soroosh

Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning

arXiv.org Machine Learning Aug-23-2019

Many decision problems in science, engineering and economics are affected by uncertain parameters whose distribution is only indirectly observable through samples. The goal of data-driven decision-making is to learn a decision from finitely many training samples that will perform well on unseen test samples. This learning task is difficult even if all training and test samples are drawn from the same distribution, especially if the dimension of the uncertainty is large relative to the training sample size. Wasserstein distributionally robust optimization seeks data-driven decisions that perform well under the most adverse distribution within a certain Wasserstein distance from a nominal distribution constructed from the training samples. In this tutorial we will argue that this approach has many conceptual and computational benefits. Most prominently, the optimal decisions can often be computed by solving tractable convex optimization problems, and they enjoy rigorous out-of-sample and asymptotic consistency guarantees. We will also show that Wasserstein distributionally robust optimization has interesting ramifications for statistical learning and motivates new approaches for fundamental learning tasks such as classification, regression, maximum likelihood estimation or minimum mean square error estimation, among others.

artificial intelligence, bayesian inference, machine learning

1908.08729

Country: Europe (0.92)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)