Overview
Geometric Foundations of Data Reduction
The purpose of this paper is to write a complete survey of the (spectral) manifold learning methods and nonlinear dimensionality reduction (NLDR) in data reduction. The first two NLDR methods in history were respectively published in Science in 2000 in which they solve the similar reduction problem of high-dimensional data endowed with the intrinsic nonlinear structure. The intrinsic nonlinear structure is always interpreted as a concept in manifolds from geometry and topology in theoretical mathematics by computer scientists and theoretical physicists. In 2001, the concept of Manifold Learning first appears as an NLDR method called Laplacian Eigenmaps purposed by Belkin and Niyogi. In the typical manifold learning setup, the data set, also called the observation set, is distributed on or near a low dimensional manifold $M$ embedded in $\mathbb{R}^D$, which yields that each observation has a $D$-dimensional representation. The goal of (spectral) manifold learning is to reduce these observations as a compact lower-dimensional representation based on the geometric information. The reduction procedure is called the (spectral) manifold learning method. In this paper, we derive each (spectral) manifold learning method with the matrix and operator representation, and we then discuss the convergence behavior of each method in a geometric uniform language. Hence, we name the survey Geometric Foundations of Data Reduction.
AI and Machine Learning: A Primer
We've been hearing about the potential impacts Artificial Intelligence (AI) could have on the legal profession for several years now. Naysayers warn of waves of legal administration job losses and that IBM's Watson could supplant lawyers. Proponents laud AI's ability to transform the legal industry to a model of effectiveness and profitability never seen before. Neither side provides clear and direct evidence of how these things might actually come to pass. The AI discussion outside of legal operates on the same basic premises, but at a macro level.
Compositional Generalization via Neural-Symbolic Stack Machines
Chen, Xinyun, Liang, Chen, Yu, Adams Wei, Song, Dawn, Zhou, Denny
Despite achieving tremendous success, existing deep learning models have exposed limitations in compositional generalization, the capability to learn compositional rules and apply them to unseen cases in a systematic manner. To tackle this issue, we propose the Neural-Symbolic Stack Machine (NeSS). It contains a neural network to generate traces, which are then executed by a symbolic stack machine enhanced with sequence manipulation operations. NeSS combines the expressive power of neural sequence models with the recursion supported by the symbolic stack machine. Without training supervision on execution traces, NeSS achieves 100% generalization performance in three domains: the SCAN benchmark of language-driven navigation tasks, the compositional machine translation benchmark, and context-free grammar parsing tasks.
Model Patching: Closing the Subgroup Performance Gap with Data Augmentation
Goel, Karan, Gu, Albert, Li, Yixuan, Rรฉ, Christopher
Classifiers in machine learning are often brittle when deployed. Particularly concerning are models with inconsistent performance on specific subgroups of a class, e.g., exhibiting disparities in skin cancer classification in the presence or absence of a spurious bandage. To mitigate these performance differences, we introduce model patching, a two-stage framework for improving robustness that encourages the model to be invariant to subgroup differences, and focus on class information shared by subgroups. Model patching first models subgroup features within a class and learns semantic transformations between them, and then trains a classifier with data augmentations that deliberately manipulate subgroup features. We instantiate model patching with CAMEL, which (1) uses a CycleGAN to learn the intra-class, inter-subgroup augmentations, and (2) balances subgroup performance using a theoretically-motivated subgroup consistency regularizer, accompanied by a new robust objective. We demonstrate CAMEL's effectiveness on 3 benchmark datasets, with reductions in robust error of up to 33% relative to the best baseline. Lastly, CAMEL successfully patches a model that fails due to spurious features on a real-world skin cancer dataset.
Personality in Healthcare Human Robot Interaction (H-HRI): A Literature Review and Brief Critique
Esterwood, Connor, Robert, Lionel P.
Robots are becoming an important way to deliver health care, and personality is vital to understanding their effectiveness. Despite this, there is a lack of a systematic overarching understanding of personality in health care human robot interaction (H-HRI). To address this, the authors conducted a review that identified 18 studies on personality in H-HRI. This paper presents the results of that systematic literature review. Insights are derived from this review regarding the methodologies, outcomes, and samples utilized. The authors of this review discuss findings across this literature while identifying several gaps worthy of attention. Overall, this paper is an important starting point in understanding personality in H-HRI.
OR-Gym: A Reinforcement Learning Library for Operations Research Problem
Hubbs, Christian D., Perez, Hector D., Sarwar, Owais, Sahinidis, Nikolaos V., Grossmann, Ignacio E., Wassick, John M.
Reinforcement learning (RL) has been widely applied to game-playing and surpassed the best human-level performance in many domains, yet there are few use-cases in industrial or commercial settings. We introduce OR-Gym, an open-source library for developing reinforcement learning algorithms to address operations research problems. In this paper, we apply reinforcement learning to the knapsack, multi-dimensional bin packing, multi-echelon supply chain, and multi-period asset allocation model problems, as well as benchmark the RL solutions against MILP and heuristic models. These problems are used in logistics, finance, engineering, and are common in many business operation settings. We develop environments based on prototypical models in the literature and implement various optimization and heuristic models in order to benchmark the RL results. By re-framing a series of classic optimization problems as RL tasks, we seek to provide a new tool for the operations research community, while also opening those in the RL community to many of the problems and challenges in the OR field.
Stanford's 2020 AIMI Symposium: A Brief Summary
Session 1 was titled Democratizing Healthcare with AI. This was by far the most intriguing and interesting session, as it had lots of great insights on how to make high-level research available to all. Session 1 began by highlighting the gap between high-level research and the people who actually can benefit from the research. All speakers stressed the importance of developing products from the research that can truly help the field. Some advancements have been made, and some presented include smartphone-powered ultrasounds, smartwatch-based diagnosis of Afib, and even AI-powered dieting apps.
Creativity in the era of artificial intelligence
Esling, Philippe, Devis, Ninon
Creativity is a deeply debated topic, as this concept is arguably quintessential to our humanity. Across different epochs, it has been infused with an extensive variety of meanings relevant to that era. Along these, the evolution of technology have provided a plurality of novel tools for creative purposes. Recently, the advent of Artificial Intelligence (AI), through deep learning approaches, have seen proficient successes across various applications. The use of such technologies for creativity appear in a natural continuity to the artistic trend of this century. However, the aura of a technological artefact labeled as intelligent has unleashed passionate and somewhat unhinged debates on its implication for creative endeavors. In this paper, we aim to provide a new perspective on the question of creativity at the era of AI, by blurring the frontier between social and computational sciences. To do so, we rely on reflections from social science studies of creativity to view how current AI would be considered through this lens. As creativity is a highly context-prone concept, we underline the limits and deficiencies of current AI, requiring to move towards artificial creativity. We argue that the objective of trying to purely mimic human creative traits towards a self-contained ex-nihilo generative machine would be highly counterproductive, putting us at risk of not harnessing the almost unlimited possibilities offered by the sheer computational power of artificial agents.
Considerations, Good Practices, Risks and Pitfalls in Developing AI Solutions Against COVID-19
Luccioni, Alexandra, Bullock, Joseph, Pham, Katherine Hoffmann, Lam, Cynthia Sin Nga, Luengo-Oroz, Miguel
The COVID-19 pandemic has been a major challenge to humanity, with 12.7 million confirmed cases as of July 13th, 2020 [1]. In previous work, we described how Artificial Intelligence can be used to tackle the pandemic with applications at the molecular, clinical, and societal scales [2]. In the present follow-up article, we review these three research directions, and assess the level of maturity and feasibility of the approaches used, as well as their potential for operationalization. We also summarize some commonly encountered risks and practical pitfalls, as well as guidelines and best practices for formulating and deploying AI applications at different scales.
Compression of Deep Learning Models for Text: A Survey
Gupta, Manish, Agrawal, Puneet
In recent years, the fields of natural language processing (NLP) and information retrieval (IR) have made tremendous progress thanks to deep learning models like Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTMs) networks, and Transformer based models like Bidirectional Encoder Representations from Transformers (BERT). But these models are humongous in size. On the other hand, real world applications demand small model size, low response times and low computational power wattage. In this survey, we discuss six different types of methods (Pruning, Quantization, Knowledge Distillation, Parameter Sharing, Tensor Decomposition, and Linear Transformer based methods) for compression of such models to enable their deployment in real industry NLP projects. Given the critical need of building applications with efficient and small models, and the large amount of recently published work in this area, we believe that this survey organizes the plethora of work done by the 'deep learning for NLP' community in the past few years and presents it as a coherent story.