 Education


Trading-off variance and complexity in stochastic gradient descent

arXiv.org Machine Learning

Stochastic gradient descent is the method of choice for large-scale machine learning problems, by virtue of its low per-iteration complexity. However, it lags behind its non-stochastic counterparts in convergence rate, due to the high variance introduced by the stochastic updates. The popular Stochastic Variance-Reduced Gradient (SVRG) method mitigates this shortcoming by introducing a new update rule which requires infrequent passes over the entire input dataset to compute the full gradient. In this work, we propose CheapSVRG, a stochastic variance-reduction optimization scheme. Our algorithm is similar to SVRG but, instead of the full gradient, it uses a surrogate which can be efficiently computed on a small subset of the input data. It achieves a linear convergence rate (up to some error level, depending on the nature of the optimization problem) and features a trade-off between the computational complexity and the convergence rate. Empirical evaluation shows that CheapSVRG performs at least competitively compared to the state of the art.
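The idea in this abstract can be illustrated with a minimal sketch: an SVRG-style loop in which the snapshot gradient is estimated on a small random subset instead of the full dataset. This is not the paper's exact CheapSVRG algorithm. The least-squares objective, the function name `subset_svrg`, and every parameter value below are assumptions chosen for illustration.

```python
import numpy as np


def subset_svrg(A, b, epochs=30, inner_steps=300, subset_size=50,
                step=0.01, seed=0):
    """SVRG-style solver for min_w (1/2n) * ||A w - b||^2.

    Assumption: the snapshot gradient is replaced by a surrogate
    computed on `subset_size` randomly chosen rows (hypothetical
    parameter), rather than on all n rows as in plain SVRG.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        # Surrogate for the full gradient at the snapshot point.
        S = rng.choice(n, size=subset_size, replace=False)
        mu = A[S].T @ (A[S] @ w_snap - b[S]) / subset_size
        for _ in range(inner_steps):
            i = rng.integers(n)
            grad_w = A[i] * (A[i] @ w - b[i])            # sample gradient at w
            grad_snap = A[i] * (A[i] @ w_snap - b[i])    # same sample at snapshot
            # Variance-reduced update using the surrogate mu.
            w = w - step * (grad_w - grad_snap + mu)
    return w


# Usage on synthetic, noiseless data (an assumption made for the demo).
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 5))
w_true = rng.standard_normal(5)
b = A @ w_true
w_hat = subset_svrg(A, b)
print("distance to w_true:", np.linalg.norm(w_hat - w_true))
```

Each outer iteration here touches only `subset_size` rows for the snapshot gradient, which is where the trade-off between per-epoch cost and accuracy of the surrogate comes from in this sketch.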


Patterns of Scalable Bayesian Inference

arXiv.org Machine Learning

Datasets are growing not just in size but in complexity, creating a demand for rich models and quantification of uncertainty. Bayesian methods are an excellent fit for this demand, but scaling Bayesian inference is a challenge. In response to this challenge, there has been considerable recent work based on varying assumptions about model structure, underlying computational resources, and the importance of asymptotic correctness. As a result, there is a zoo of ideas with few clear overarching principles. In this paper, we seek to identify unifying principles, patterns, and intuitions for scaling Bayesian inference. We review existing work on utilizing modern computing resources with both MCMC and variational approximation techniques. From this taxonomy of ideas, we characterize the general principles that have proven successful for designing scalable inference procedures and comment on the path forward.


Comparing Human and Automated Evaluation of Open-Ended Student Responses to Questions of Evolution

arXiv.org Artificial Intelligence

Written responses can provide a wealth of data in understanding student reasoning on a topic. Yet they are time- and labor-intensive to score, leading many instructors to forgo them except as limited parts of summative assessments at the end of a unit or course. Recent developments in Machine Learning (ML) have produced computational methods of scoring written responses for the presence or absence of specific concepts. Here, we compare the scores from one particular ML program, EvoGrader, to human scoring of responses to structurally- and content-similar questions that are distinct from the ones the program was trained on. We find that there is substantial inter-rater reliability between the human and ML scoring. However, sufficient systematic differences remain between the human and ML scoring that we advise only using the ML scoring for formative, rather than summative, assessment of student reasoning.
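The abstract does not say which agreement statistic was used. As a hedged illustration, one common measure of inter-rater reliability for presence/absence scores is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The function name and the sample scores below are made up for the example.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected if the raters labeled independently
    # with their own marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[label] * count_b[label] for label in labels) / n ** 2
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical scores: 1 = concept present, 0 = concept absent.
human = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
ml = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(cohen_kappa(human, ml))
```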


Free Webinar: Building A Scalable Data Science Platform with R and Hadoop

#artificialintelligence

Cloud computing is famously scalable. But what if we seamlessly combined Hadoop with the cloud and R to create a scalable data science platform? Imagine exploring, transforming, modeling, and scoring data at any scale from the comfort of your favorite R environment. Now, imagine calling a simple R function to operationalize your predictive model as a scalable, cloud-based web service. Learn how to leverage the magic of Hadoop on-premises or in the cloud to run your R code, with thousands of open source R extension packages, and distributed implementations of the most popular machine learning algorithms, at scale.


The Best AI Still Flunks 8th Grade Science

#artificialintelligence

In 2012, IBM Watson went to medical school. So said The New York Times, announcing that the tech giant's artificially intelligent question-and-answer machine had begun a "stint as a medical student" at the Cleveland Clinic Lerner College of Medicine. This was just a metaphor. Clinicians were helping IBM train Watson for use in medical research. But as metaphors go, it wasn't a very good one.


Concept as Abstraction. A hindrance in developing intelligence? (Addenda 2/7/16)

#artificialintelligence

I used to teach the calculus in a private high school where I was headmaster and would occasionally play a game called "WFF'N Proof" with my students. They had to roll a set of six dice that had sides marked with the following: p, q, r, C, A, K, E or N. After a throw they had to see if they could construct a WFF, a well-formed-formula, by arranging the outcomes in various combinations. A WFF was defined as any of the following: p, q or r. (Addendum 1, 2/7/16: Note that there is no indication of p, q, or r having common features.) WFF's were also formed by any of the letters C, A, K or E followed by two WFF's, or by N followed by a single WFF. So beginning with p, q, or r, one could construct the new WFF's Np, Nq or Nr. Continuing in this vein, one might construct Apq, or ANpp, or CAqKqrr, … (Addendum 2, 2/7/16: Note also that no two complex WFF's need have common features, e.g.
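The formation rules described above are recursive, so they can be checked mechanically. The following is a minimal sketch of such a checker. The function names are invented for illustration, and it tests only whether a string is well formed, not whether it could be built from a particular throw of the dice.

```python
def parse_wff(s, i=0):
    """Return the index just past the WFF that starts at s[i], or -1 if none does."""
    if i >= len(s):
        return -1
    symbol = s[i]
    if symbol in "pqr":       # p, q and r are WFFs on their own
        return i + 1
    if symbol == "N":         # N must be followed by a single WFF
        return parse_wff(s, i + 1)
    if symbol in "CAKE":      # C, A, K, E must be followed by two WFFs
        j = parse_wff(s, i + 1)
        return parse_wff(s, j) if j != -1 else -1
    return -1


def is_wff(s):
    """True when the whole string is exactly one well-formed formula."""
    return parse_wff(s) == len(s)


for candidate in ["p", "Np", "Apq", "ANpp", "CAqKqrr", "Ap", "pq"]:
    print(candidate, is_wff(candidate))
```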


Why Machine Learning Beginners Shouldn't Avoid the Math

#artificialintelligence

In this post I consider three learning approaches and argue that it could be a bad idea to avoid the mathematics and theory when starting out with machine learning. There are three approaches to starting out in machine learning that I have seen practiced. One is a bottom-up approach, in which the student starts with the mathematics and theory and then puts it into practice in either a high-level programming language -- such as Matlab, Python, R or Octave -- or by coding from scratch in a 3GL like Java, C# or C++. The second is the top-down approach, in which machine learning tools and/or libraries are used to shelter the student from the coding, mathematics and theory. S/he is instructed to worry about how it all works later and to instead practice working with datasets.


Why haven't we met aliens yet? Because they've evolved into AI. - RBLS.

#artificialintelligence

While traveling in Western Samoa many years ago, I met a young Harvard University graduate student researching ants. He invited me on a hike into the jungles to assist with his search for the tiny insect. He told me his goal was to discover a new species of ant, in hopes it might be named after him one day. Whenever I look up at the stars at night pondering the cosmos, I think of my ant collector friend, kneeling in the jungle with a magnifying glass, scouring the earth. I think of him, because I believe in aliens--and I've often wondered if aliens are doing the same to us.


AI & The Future Of Civilization

#artificialintelligence

There is no meaningful sense in which there is an abstract notion of purpose. That is, purpose is something that comes from history. One of the things that might be true about computation, might be true about our world, that would be disappointing, is maybe we go through all this history and biology and civilization and so on, and at the end of the day, the answer is 42 or something. That's the end, so to speak. We got to the answer.


Free Resources to Learn Machine Learning for Trading

#artificialintelligence

Machine learning is a vibrant subfield of computer science that draws models and methods from statistics, algorithms, computational complexity, control theory and artificial intelligence. It focuses on efficient algorithms for inferring good predictive models from large data sets and is a natural candidate for problems arising in HFT, in both trade execution and alpha generation. In quantitative finance, inferring predictive models from historical data is obviously not new. Some examples include coefficient estimation for CAPM and the Fama and French factors. The granularity of data arising in HFT poses special challenges for machine learning. Often the data has microstructure at the resolution of individual orders, executions, hidden liquidity and cancellations, and there is a lack of understanding of how such granular data relates to actionable circumstances, namely profitably buying or selling shares, optimally executing a large order, etc.