Overview
Open Source Stories: The People Behind OpenAI
You might think, based on the type of research they're doing, that the OpenAI office would be full of gadgets, full of wonder, full of weird experiments. There are no Faraday cages. Well, okay, there is a robot. And it's tucked away in a side room. It's surrounded by cobbled-together protective material so that it doesn't smash into itself if it starts flailing about due to a programming error.
How Artificial Intelligence Is Disrupting Finance
General purpose technology is a term economists reserve for technologies that spur protracted economic growth and societal advancements, revolutionizing the operations of households and corporations alike. One example of a general purpose technology is electricity. Electricity spawned a multitude of products and sectors, including refrigerators, washing machines, trains and, of course, computers. The advent of electricity radically transformed the world. A recent Harvard Business Review article designates artificial intelligence (AI) as the most important general purpose technology of our era. A car that can parallel park itself. Devices that respond with tomorrow's weather when we ask.
Data Science
While data science has emerged as an ambitious new scientific field, related debates and discussions have sought to address why science in general needs data science and what even makes data science a science. Following a comprehensive literature review [5,6,10,11,12,15,18], I offer a number of observations concerning big data and the data science debate. For example, discussion has covered not only data-related disciplines and domains like statistics, computing, and informatics but traditionally less data-related fields and areas like social science and business management as well. Data science has thus emerged as a new inter- and cross-disciplinary field. Although many publications are available, most (likely over 95%) concern existing concepts and topics in statistics, data mining, machine learning, and broad data analytics. This limited view demonstrates how data science has emerged from existing core disciplines, particularly statistics, computing, and informatics. The abuse, misuse, and overuse of the term "data science" are ubiquitous, contributing to the hype, and myths and pitfalls are common [4]. While specific challenges have been covered [13,16], few scholars have addressed the low-level complexities and problematic nature of data science or contributed deep insight about the intrinsic challenges, directions, and opportunities of data science as an emerging field. Data science promises new opportunities for scientific research, addressing, say, "What can I do now but could not do before, as when processing large-scale data?"; "What did I do before that does not work now, as in methods that view data objects as independent and identically distributed variables (IID)?"; "What problems not solved well previously are becoming even more complex, as when quantifying complex behavioral data?"; and "What could I not do better before, as in deep analytics and learning?"
Probabilistic Graphical Models for Credibility Analysis in Evolving Online Communities
One of the major hurdles preventing the full exploitation of information from online communities is the widespread concern regarding the quality and credibility of user-contributed content. Prior works in this domain operate on a static snapshot of the community, making strong assumptions about the structure of the data (e.g., relational tables), or consider only shallow features for text classification. To address the above limitations, we propose probabilistic graphical models that can leverage the joint interplay between multiple factors in online communities --- like user interactions, community dynamics, and textual content --- to automatically assess the credibility of user-contributed online content, and the expertise of users and their evolution, with user-interpretable explanations. To this end, we devise new models based on Conditional Random Fields for different settings like incorporating partial expert knowledge for semi-supervised learning, and handling discrete labels as well as numeric ratings for fine-grained analysis. This enables applications such as extracting reliable side-effects of drugs from user-contributed posts in health forums, and identifying credible content in news communities. Online communities are dynamic, as users join and leave, adapt to evolving trends, and mature over time. To capture these dynamics, we propose generative models based on Hidden Markov Models, Latent Dirichlet Allocation, and Brownian Motion to trace the continuous evolution of user expertise and their language model over time. This allows us to identify expert users and credible content jointly over time, improving state-of-the-art recommender systems by explicitly considering the maturity of users. This also enables applications such as identifying helpful product reviews, and detecting fake and anomalous reviews with limited information.
Convolutional Experts Constrained Local Model for Facial Landmark Detection
Zadeh, Amir, Baltrušaitis, Tadas, Morency, Louis-Philippe
Constrained Local Models (CLMs) are a well-established family of methods for facial landmark detection. However, they have recently fallen out of favor compared with cascaded regression-based approaches. This is in part due to the inability of existing CLM local detectors to model the very complex individual landmark appearance that is affected by expression, illumination, facial hair, makeup, and accessories. In our work, we present a novel local detector -- Convolutional Experts Network (CEN) -- that brings together the advantages of neural architectures and mixtures of experts in an end-to-end framework. We further propose a Convolutional Experts Constrained Local Model (CE-CLM) algorithm that uses CEN as local detectors. We demonstrate that our proposed CE-CLM algorithm outperforms competitive state-of-the-art baselines for facial landmark detection by a large margin on four publicly-available datasets. Our approach is especially accurate and robust on challenging profile images.
The Challenge of Non-Technical Loss Detection using Artificial Intelligence: A Survey
Glauner, Patrick, Meira, Jorge Augusto, Valtchev, Petko, State, Radu, Bettinger, Franck
Detection of non-technical losses (NTL), which include electricity theft, faulty meters or billing errors, has attracted increasing attention from researchers in electrical engineering and computer science. NTLs cause significant harm to the economy, as in some countries they may range up to 40% of the total electricity distributed. The predominant research direction is employing artificial intelligence to predict whether a customer causes NTL. This paper first provides an overview of how NTLs are defined and their impact on economies, which includes loss of revenue and profit of electricity providers and decrease of the stability and reliability of electrical power grids. It then surveys the state-of-the-art research efforts in an up-to-date and comprehensive review of algorithms, features and data sets used. It finally identifies the key scientific and engineering challenges in NTL detection and suggests how they could be addressed in the future.
AI-Driven Personal Assistant Apps Shaping Digital Consumer Habits
Over the past year, Verto Analytics has published critical research identifying and tracking rapidly-emerging trends in consumer behavior, particularly on mobile devices. Hannu Verkasalo, CEO, Verto Analytics, introduces the conditions under which AI-driven apps are influential, from the rise of multitasking to the continued influence of cross-device behavior on digital usage. One thing is clear, says the report: "consumer habits are changing faster than before, aided by increasingly novel technologies and shorter device innovation cycles. The prevalence of Internet access, the rise of social media, e-commerce, and most recently, mobile apps, have all shaped how consumers behave with digital devices. Within the past few years, we've witnessed the rise of AI-powered apps, which harness cloud-based natural language processing (NLP) and machine learning to power a more sophisticated wave of apps and services," continues the introduction.
Health Analytics: a systematic review of approaches to detect phenotype cohorts using electronic health records
Hiob, Norman, Lessmann, Stefan
The paper presents a systematic review of state-of-the-art approaches to identify patient cohorts using electronic health records. It gives a comprehensive overview of the most commonly detected phenotypes and their underlying data sets. Special attention is given to preprocessing of input data and the different modeling approaches. The literature review confirms natural language processing to be a promising approach for electronic phenotyping. However, accessibility and lack of natural language processing standards for medical texts remain a challenge. Future research should develop such standards and further investigate which machine learning approaches are best suited to which type of medical data.
Off the Map: The Rough Road Ahead for Self-Driving Cars in China
China is creating roadblocks for U.S. auto makers and tech companies seeking to bring self-driving cars to the world's largest auto market. Citing national security concerns, China is limiting the amount of mapping that can be done by foreign companies, as General Motors Co., Ford Motor Co., Alphabet Inc. and Apple Inc. rush to develop self-driving cars or the software behind them. High-definition maps are crucial for autonomous cars to help them discern their exact location, navigate tricky intersections and avoid fixed objects such as buildings. Global car makers already need to form a partnership with a local company to open factories in China, but some are skeptical they will be able to find a way to operate their autonomous-car software in China because of the mapping restrictions. Brian McClendon, an industry pioneer who helped create Google Maps and later led Uber Technologies Inc.'s self-driving effort, said he doubted U.S. software would ever be adopted for self-driving cars in China.
Intel Democratizes Deep Learning Application Development with Launch of Movidius Neural Compute Stick | Intel Newsroom
Today, Intel launched the Movidius Neural Compute Stick, the world's first USB-based deep learning inference kit and self-contained artificial intelligence (AI) accelerator that delivers dedicated deep neural network processing capabilities to a wide range of host devices at the edge. Designed for product developers, researchers and makers, the Movidius Neural Compute Stick aims to reduce barriers to developing, tuning and deploying AI applications by delivering dedicated high-performance deep-neural network processing in a small form factor. As more developers adopt advanced machine learning approaches to build innovative applications and solutions, Intel is committed to providing the most comprehensive set of development tools and resources to ensure developers are retooling for an AI-centric digital economy. Whether it is training artificial neural networks on the Intel Nervana cloud, optimizing for emerging workloads such as artificial intelligence, virtual and augmented reality, and automated driving with Intel Xeon Scalable processors, or taking AI to the edge with Movidius vision processing unit (VPU) technology, Intel offers a comprehensive AI portfolio of tools, training and deployment options for the next generation of AI-powered products and services. "The Myriad 2 VPU housed inside the Movidius Neural Compute Stick provides powerful, yet efficient performance – more than 100 gigaflops of performance within a 1W power envelope – to run real-time deep neural networks directly from the device," said Remi El-Ouazzane, vice president and general manager of Movidius, an Intel company.