Education
Securing safe water through Cortana Intelligence Suite
Jacob Katuva used to get up at dawn to cycle 12 miles from his village to collect water with his uncles and cousins when he was growing up in Kenya. Now he is part of a research team at the University of Oxford using cloud computing and mobile sensors to monitor water wells and help ensure that thousands of villages in rural Africa and Asia have a safe, secure supply of water. The time spent finding and carrying water, if local wells are not reliable, steals precious time from farming, making a living or going to school. It can even force people to revert to unsanitary water sources shared with animals. Water issues are tied to a cycle of poverty.
The Data Structures and Algorithms Learning Problem - DZone Big Data
There was more about Foundations of Multidimensional and Metric Data Structures by Hanan Samet being too detailed, Stack Overflow being too high-level, and more hand-wringing after that, too. The email was pleading for some book or series of blog posts that would somehow educate data science folks on more fundamental issues of data structures and algorithms. Perhaps getting them to drop some dimensions when doing k-NN problems or perhaps exploit some other data structure that didn't involve hundreds of columns. I'm guessing because -- like a lot of hand-waving emails -- it didn't involve code. If there is a lack of awareness of appropriate data structures, the real place to start is The Algorithm Design Manual by Steven Skiena.
Data, not algorithms, is key to machine learning success
There has been an explosion in machine learning activity, and Shivon Zilis recently mapped out the current machine intelligence ecosystem as we enter 2016. This is one of the key areas that we'll be following this year. While the opportunities here are tremendous, the exuberance surrounding machine learning distracts startups from a key hurdle: it's data, not algorithms, that will dictate who wins in this space. Algorithms have largely been commoditized by now, so a machine learning company built around publicly accessible data isn't defensible. But, startups face a serious chicken and egg problem: they have to convince people to give them data, but the machine intelligence service won't be useful until people (and a lot of people) are actually using the service and sharing their data.
Singularity University: meet the people who are building our future
It's day one at the Singularity University: the opening address has just been delivered by a hologram. Craig Venter, who was one of the first scientists to sequence the human genome and created the first synthetic life form, is up next. And later, we will see two people, paralysed from the waist down, use robotic exoskeletons to rise up and walk. But first, the co-founder of the Singularity University, Peter Diamandis, gives us our instructions for the day. Your task, he says, is to pick one of the "grand challenges of humanity", the lack of clean drinking water, say. And then come up with an idea that "can positively impact the lives of a billion people". Some of us haven't even had coffee yet. There's about 50 of us present and the room has been divided up into tables, one for education, another for poverty, another for water, and I'm not sure where I should sit. Diane Murphy, the university's PR executive, hesitates for a moment and then directs me over to the table marked "food". "Tell you what," she says.
Root Is a Little Robot on a Mission to Teach Kids to Code
Computing jobs are growing at twice the national rate of other types of employment. By 2020, the Bureau of Labor Statistics says, the US will have 1 million more computer science-related jobs than graduates qualified to fill them. In December, President Obama announced the Computer Science for All Initiative, pledging $4 billion in funding for computer science education in the nation's schools. Yet all kinds of dysfunction keep the country from closing the deficit in computer science talent, according to a survey by Google and Gallup. Yes, school budgets are a problem, and teachers have a limited time to devote to additional classes.
Be kind to artificial intelligence
Mike Finley is a co-founder of AnswerRocket in charge of natural language processing and machine learning. Big innovations come in unexpected bursts. We grow accustomed to life and work as we know it, until something apparently simple brings about bold change. For example, we used phones for 100 years, but making them mobile transformed the world; we had the Internet for decades before the Web browser put digital education, entertainment and shopping in the hands of billions; and we documented our lives with physical pictures, paper records, CD-ROMs and thumb drives until Jeff Bezos brought us "the cloud." When individual creativity is enhanced by technical ingenuity, new behaviors and capabilities emerge.
'Exam factory' schools urged to shift emphasis to online learning
High-quality, low-cost online courses could be used to shift schools away from being "exam factories" and help students keep pace with the threat of automation, according to a new report by the Institute of Directors. The report argues that the internet allows schools to be more flexible and adapt learning towards "a future in which more and more work is taken over by robots or computers". "The cost savings, convenience and flexibility that online learning offers has the potential to revolutionise education provision, but only if businesses and the education sector work together to capitalise on the potential of computer-based teaching applications to support employees in their pursuit of lifelong learning," the report said. Last year the CBI's director general also called for GCSEs to be scrapped and A-levels to be augmented by vocational courses. The report also calls for new tax incentives to encourage people to return to education, and to make it easier for employers to invest in their staff.
Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions
Simon-Gabriel, Carl-Johann, Schölkopf, Bernhard
Kernel mean embeddings have recently attracted the attention of the machine learning community. They map measures $\mu$ from some set $M$ to functions in a reproducing kernel Hilbert space (RKHS) with kernel $k$. The RKHS distance of two mapped measures is a semi-metric $d_k$ over $M$. We study three questions. (I) For a given kernel, what sets $M$ can be embedded? (II) When is the embedding injective over $M$ (in which case $d_k$ is a metric)? (III) How does the $d_k$-induced topology compare to other topologies on $M$? The existing machine learning literature has addressed these questions in cases where $M$ is (a subset of) the finite regular Borel measures. We unify, improve and generalise those results. Our approach naturally leads to continuous and possibly even injective embeddings of (Schwartz-) distributions, i.e., generalised measures, but the reader is free to focus on measures only. In particular, we systemise and extend various (partly known) equivalences between different notions of universal, characteristic and strictly positive definite kernels, and show that on an underlying locally compact Hausdorff space, $d_k$ metrises the weak convergence of probability measures if and only if $k$ is continuous and characteristic.
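As a rough illustration of the distance $d_k$ described above, the sketch below computes it for two empirical measures (finite samples with equal weights), where the squared RKHS distance reduces to averages of kernel evaluations. The Gaussian kernel, the bandwidth value and all function names are assumptions chosen for the example; they are not taken from the paper.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    # Gaussian (RBF) kernel on tuples of floats; the bandwidth is an assumed default.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mean_kernel(xs, ys, kernel):
    # Average of k(x, y) over all pairs: the RKHS inner product of the
    # embeddings of the two empirical measures.
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))

def embedding_distance(xs, ys, kernel=gaussian_kernel):
    # d_k between the empirical measures of xs and ys:
    # ||mu_x - mu_y||^2 = <mu_x, mu_x> - 2 <mu_x, mu_y> + <mu_y, mu_y>.
    sq = (mean_kernel(xs, xs, kernel)
          - 2.0 * mean_kernel(xs, ys, kernel)
          + mean_kernel(ys, ys, kernel))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(sq, 0.0))

sample_a = [(0.0,), (1.0,)]
sample_b = [(3.0,), (4.0,)]
print(embedding_distance(sample_a, sample_a))
print(embedding_distance(sample_a, sample_b))
```

The first printed distance is zero because a sample is compared with itself; the second is strictly positive because the two samples differ.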
Learning Sparse Low-Threshold Linear Classifiers
Sabato, Sivan, Shalev-Shwartz, Shai, Srebro, Nathan, Hsu, Daniel, Zhang, Tong
We consider the problem of learning a non-negative linear classifier with a $1$-norm of at most $k$, and a fixed threshold, under the hinge-loss. This problem generalizes the problem of learning a $k$-monotone disjunction. We prove that we can learn efficiently in this setting, at a rate which is linear in both $k$ and the size of the threshold, and that this is the best possible rate. We provide an efficient online learning algorithm that achieves the optimal rate, and show that in the batch case, empirical risk minimization achieves this rate as well. The rates we show are tighter than the uniform convergence rate, which grows with $k^2$.
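To make the setting concrete, here is a minimal sketch of the objects involved: a non-negative weight vector with bounded $1$-norm, a fixed threshold, and the hinge loss. The margin form of the loss, the scaling of the weights and all names below are assumptions made for illustration and need not match the paper's normalisation. The example encodes the monotone disjunction "feature 0 OR feature 1" over Boolean inputs.

```python
def hinge_loss(w, theta, x, y):
    # Hinge loss of the thresholded linear predictor <w, x> - theta
    # on one example with label y in {-1, +1} (assumed margin of 1).
    score = sum(wi * xi for wi, xi in zip(w, x)) - theta
    return max(0.0, 1.0 - y * score)

def is_feasible(w, k):
    # Non-negative weights whose 1-norm (here simply their sum) is at most k.
    return all(wi >= 0 for wi in w) and sum(w) <= k

def average_hinge_loss(w, theta, examples):
    # Empirical risk under the hinge loss.
    return sum(hinge_loss(w, theta, x, y) for x, y in examples) / len(examples)

# Hypothetical encoding of the disjunction x0 OR x1 with threshold 1.
w = [2.0, 2.0, 0.0]
theta = 1.0
examples = [
    ([1, 0, 0], +1),
    ([0, 1, 1], +1),
    ([0, 0, 1], -1),
    ([0, 0, 0], -1),
]
print(is_feasible(w, 4))
print(average_hinge_loss(w, theta, examples))
```

Under this assumed scaling, every example consistent with the disjunction is classified with a margin of at least 1, so the average hinge loss is zero.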
Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis
Damianou, Andreas, Lawrence, Neil D., Ek, Carl Henrik
Factor analysis aims to determine latent factors, or traits, which summarize a given data set. Inter-battery factor analysis extends this notion to multiple views of the data. In this paper we show how a nonlinear, nonparametric version of these models can be recovered through the Gaussian process latent variable model. This gives us a flexible formalism for multi-view learning where the latent variables can be used both for exploratory purposes and for learning representations that enable efficient inference for ambiguous estimation tasks. Learning is performed in a Bayesian manner through the formulation of a variational compression scheme which gives a rigorous lower bound on the log likelihood. Our Bayesian framework provides strong regularization during training, allowing the structure of the latent space to be determined efficiently and automatically. We demonstrate this by producing the first (to our knowledge) published results of learning from dozens of views, even when data is scarce.