Earlier this year, artificial intelligence yielded a practical insight: people like to drink coffee in the morning, so workplaces should find efficient ways to serve coffee. That raised a question that is surprisingly deep, and that can cost serious money to ignore: is AI actually necessary for this problem? The question remains largely unasked in Silicon Valley today. We think it's worth asking. To be sure, modern data products owe a lot of their success to artificial intelligence. Well-considered AI unlocks entirely new types of data-driven insights and cuts the time and money needed for manual data analysis.
There is a famous scene in the movie "Harry Potter and the Half-Blood Prince": A student has been cursed, and investigations are under way. All at once, Harry shouts, "It was Malfoy." McGonagall replies, "This is a very serious accusation, Potter." "Indeed," agrees Snape, and continues, "Your evidence?" Harry immediately responds, "I just know."
Recent developments in high-throughput profiling of individual neurons have spurred data-driven exploration of the idea that there exist natural groupings of neurons referred to as cell types. The promise of this idea is that the immense complexity of brain circuits can be reduced and effectively studied by means of interactions between cell types. While clustering of neuron populations based on a particular data modality can be used to define cell types, such definitions are often inconsistent across different characterization modalities. We pose this issue of cross-modal alignment as an optimization problem and develop an approach based on coupled training of autoencoders as a framework for such analyses. We apply this framework to a Patch-seq dataset consisting of transcriptomic and electrophysiological profiles for the same set of neurons to study the consistency of representations across modalities and to evaluate cross-modal data prediction ability.
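As a rough illustration of what coupled training of two autoencoders could look like, the sketch below builds one linear autoencoder per modality and adds a penalty that pulls the two latent codes of the same neuron together. Everything here is an assumption made for illustration: the linear encoders, the squared-error coupling term, the weight `lam`, and the random stand-in data are hypothetical and are not taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_autoencoder(dim_in, dim_latent):
    """Hypothetical linear autoencoder: one encoder and one decoder matrix."""
    return {
        "enc": rng.normal(scale=0.1, size=(dim_in, dim_latent)),
        "dec": rng.normal(scale=0.1, size=(dim_latent, dim_in)),
    }

def coupled_loss(ae_t, ae_e, x_t, x_e, lam=1.0):
    """Reconstruction loss per modality plus a latent-coupling penalty.

    x_t and x_e hold paired profiles (same neurons, same row order).
    Returns the total loss and a cross-modal prediction of x_e from x_t.
    """
    z_t = x_t @ ae_t["enc"]                      # latent code, modality T
    z_e = x_e @ ae_e["enc"]                      # latent code, modality E
    recon_t = np.mean((z_t @ ae_t["dec"] - x_t) ** 2)
    recon_e = np.mean((z_e @ ae_e["dec"] - x_e) ** 2)
    coupling = np.mean((z_t - z_e) ** 2)         # assumed coupling term
    cross_e = z_t @ ae_e["dec"]                  # decode T's code with E's decoder
    return recon_t + recon_e + lam * coupling, cross_e

# Random stand-in data: 5 neurons, 8 "transcriptomic" and 4 "electrophysiological" features.
x_t = rng.normal(size=(5, 8))
x_e = rng.normal(size=(5, 4))
ae_t = linear_autoencoder(8, 2)
ae_e = linear_autoencoder(4, 2)
loss, cross_e = coupled_loss(ae_t, ae_e, x_t, x_e)
```

A naive squared-error coupling of this kind can be minimized trivially by shrinking both latent codes toward zero, so a practical objective would need some safeguard against that; the sketch only shows where the coupling enters the loss.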
We review the current state of automatic differentiation (AD) for array programming in machine learning (ML), including the different approaches such as operator overloading (OO) and source transformation (ST) used for AD, graph-based intermediate representations for programs, and source languages. Based on these insights, we introduce a new graph-based intermediate representation (IR) which specifically aims to efficiently support fully-general AD for array programming. Unlike existing dataflow programming representations in ML frameworks, our IR naturally supports function calls, higher-order functions and recursion, making ML models easier to implement. The ability to represent closures allows us to perform AD using ST without a tape, making the resulting derivative (adjoint) program amenable to ahead-of-time optimization using tools from functional language compilers, and enabling higher-order derivatives. Lastly, we introduce a proof of concept compiler toolchain called Myia which uses a subset of Python as a front end.
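To make the idea of tape-free AD with closures concrete, here is a hand-written sketch of the kind of code a source transformation might produce: each primitive returns its value together with a closure (a backpropagator) that maps the output gradient to input gradients. This is not Myia's actual output or API; the function names `mul`, `sin`, and `f` are hypothetical and chosen only for illustration.

```python
import math

def mul(x, y):
    """Return x * y and a closure that backpropagates a gradient through it."""
    def backprop(dz):
        return dz * y, dz * x
    return x * y, backprop

def sin(x):
    """Return sin(x) and its backpropagator."""
    def backprop(dz):
        return (dz * math.cos(x),)
    return math.sin(x), backprop

def f(x, y):
    """f(x, y) = sin(x * y), written the way transformed code might look."""
    a, bp_mul = mul(x, y)
    b, bp_sin = sin(a)

    def backprop(db):
        # Run the captured backpropagators in reverse order of the forward pass.
        (da,) = bp_sin(db)
        dx, dy = bp_mul(da)
        return dx, dy

    return b, backprop

value, bp = f(2.0, 3.0)
dx, dy = bp(1.0)  # gradients of f with respect to x and y
```

The intermediate values live in the closures' captured variables rather than on a runtime tape, which is what leaves the adjoint program as ordinary code that a compiler can optimize ahead of time.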
The design of neural network architectures is an important component for achieving state-of-the-art performance with machine learning systems across a broad array of tasks. Much work has endeavored to design and build architectures automatically through clever construction of a search space paired with simple learning algorithms. Recent progress has demonstrated that such meta-learning methods may exceed scalable human-invented architectures on image classification tasks. An open question is the degree to which such methods may generalize to new domains. In this work we explore the construction of meta-learning techniques for dense image prediction focused on the tasks of scene parsing, person-part segmentation, and semantic image segmentation.
We have all heard about image style transfer: extracting the style from a famous painting and applying it to another image is a task that has been achieved with a number of different methods. Generative Adversarial Networks (GANs for short) are also being used on images for generation, image-to-image translation and more. On the surface, you might think that audio is completely different from images, and that the techniques that have been explored for image-related tasks can't also be applied to sounds. But what if we could find a way to convert audio signals to image-like 2-dimensional representations? This kind of sound representation is what we call a "spectrogram", and it is the key that will allow us to make use of algorithms specifically designed to work with images for our audio-related task.
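To see how a 1-dimensional audio signal becomes an image-like array, here is a minimal sketch that computes a magnitude spectrogram with a short-time Fourier transform in NumPy. The frame length, hop size, Hann window, and the synthetic 1 kHz test tone are assumptions picked for illustration, not settings from the text.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: rows are frequency bins, columns are time frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of each real-valued frame.
    return np.abs(np.fft.rfft(frames, axis=1)).T

# One second of a 1 kHz sine tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
print(spec.shape)  # (frequency bins, time frames)
```

The result is a 2-dimensional array that can be treated like a single-channel image, with time along one axis and frequency along the other. In practice the magnitudes are often log-scaled before being fed to an image model.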
Learning and memory in the brain are implemented by complex, time-varying changes in neural circuitry. The computational rules according to which synaptic weights change over time are the subject of much research, and are not precisely understood. Until recently, limitations in experimental methods have made it challenging to test hypotheses about synaptic plasticity on a large scale. However, as such data become available and these barriers are lifted, it becomes necessary to develop analysis techniques to validate plasticity models. Here, we present a highly extensible framework for modeling arbitrary synaptic plasticity rules on spike train data in populations of interconnected neurons.
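As one example of a rule such a framework might be asked to model, the sketch below computes the weight change of a single synapse under a pair-based spike-timing-dependent plasticity (STDP) rule. STDP is used here only as a familiar stand-in; the abstract does not name a specific rule, and the function name, amplitudes, and time constant are hypothetical.

```python
import math

def stdp_weight_change(pre_spikes, post_spikes,
                       a_plus=0.01, a_minus=0.012, tau=0.02):
    """Total weight change for one synapse under a pair-based STDP rule.

    pre_spikes, post_spikes: spike times in seconds.
    Pre-before-post pairs potentiate; post-before-pre pairs depress.
    """
    dw = 0.0
    for t_pre in pre_spikes:
        for t_post in post_spikes:
            dt = t_post - t_pre
            if dt > 0:
                dw += a_plus * math.exp(-dt / tau)
            elif dt < 0:
                dw -= a_minus * math.exp(dt / tau)
    return dw

# Presynaptic spikes lead postsynaptic spikes by 5 ms, so the synapse strengthens.
potentiation = stdp_weight_change([0.100, 0.200], [0.105, 0.205])
# Reversing the roles makes the postsynaptic spikes lead, so the synapse weakens.
depression = stdp_weight_change([0.105, 0.205], [0.100, 0.200])
```

Because the rule is just a function from spike times to a weight change, swapping in a different plasticity rule only means replacing this function.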
We propose an alternative framework to existing setups for controlling false alarms when multiple A/B tests are run over time. This setup arises in many practical applications, e.g. when pharmaceutical companies test new treatment options against control pills for different diseases, or when internet companies test their default webpages versus various alternatives over time. Our framework proposes to replace a sequence of A/B tests by a sequence of best-arm multi-armed bandit (MAB) instances, which can be continuously monitored by the data scientist. When interleaving the MAB tests with an online false discovery rate (FDR) algorithm, we can obtain the best of both worlds: low sample complexity and anytime online FDR control. Our main contributions are: (i) to propose reasonable definitions of a null hypothesis for MAB instances; (ii) to demonstrate how one can derive an always-valid sequential p-value that allows continuous monitoring of each MAB test; and (iii) to show that using rejection thresholds of online-FDR algorithms as the confidence levels for the MAB algorithms results in sample optimality, high power, and low FDR at any point in time.
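The interplay between the two pieces can be sketched as follows: an online-FDR rule hands out a rejection threshold for each experiment in turn, and that threshold is what would serve as the confidence level for the corresponding MAB instance. The abstract does not name a specific online-FDR algorithm, so the sketch uses LOND (Javanmard and Montanari) as a stand-in. The p-values are made-up inputs, and the MAB step that would produce them is not shown.

```python
import math

def lond_levels(p_values, alpha=0.05):
    """Run the LOND online-FDR rule over a stream of p-values.

    Test t gets level alpha_t = beta_t * (discoveries so far + 1),
    where the beta_t are non-negative and sum to alpha.
    Returns a list of (alpha_t, rejected) pairs.
    """
    discoveries = 0
    results = []
    for t, p in enumerate(p_values, start=1):
        beta_t = alpha * 6.0 / (math.pi ** 2 * t ** 2)  # sums to alpha over all t
        alpha_t = beta_t * (discoveries + 1)
        reject = p <= alpha_t
        results.append((alpha_t, reject))
        if reject:
            discoveries += 1
    return results

# Hypothetical final p-values from four consecutive experiments.
results = lond_levels([0.001, 0.5, 0.004, 0.9])
rejections = [rejected for _, rejected in results]
print(rejections)  # [True, False, True, False]
```

Each threshold depends only on past decisions, so it is known before the next experiment starts and can be passed to that experiment as its confidence level. Earlier discoveries raise later thresholds, which is how the rule earns back testing budget over time.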
Technology has upended one business after another across the United States. To cite only the most recent developments: Lyft and others have utterly changed personal transportation, and Airbnb has done the same for hospitality. And in January 2018, the first Amazon Go store opened, sans checkout clerks, promising similar upheaval for grocers. What is happening is fairly well understood, if initially underestimated. Digitization and other technological advances are exposing the vulnerabilities in every industry, particularly retail. And now, logistics companies are starting to feel the heat. Our new research has turned up five trends that offer startling indicators of impending change for the trucking, rail, warehousing, and logistics companies that move America's merchandise. Start with autonomous trucks (ATs), which will change the cost structure and utilization of trucking--and with that, the cost of consumer goods. Sixty-five percent of the nation's consumable goods are trucked to market.