Overview
Dynamic Neural Network Architectural and Topological Adaptation and Related Methods -- A Survey
Training and inference in deep neural networks (DNNs) has, due to a steady increase in architectural complexity and data set size, lead to the development of strategies for reducing time and space requirements of DNN training and inference, which is of particular importance in scenarios where training takes place in resource constrained computation environments or inference is part of a time critical application. In this survey, we aim to provide a general overview and categorization of state-of-the-art (SOTA) of techniques to reduced DNN training and inference time and space complexities with a particular focus on architectural adaptions.
Learning User-Interpretable Descriptions of Black-Box AI System Capabilities
Verma, Pulkit, Marpally, Shashank Rao, Srivastava, Siddharth
Several approaches have been developed to answer specific questions that a user may have about an AI system that can plan and act. However, the problems of identifying which questions to ask and that of computing a user-interpretable symbolic description of the overall capabilities of the system have remained largely unaddressed. This paper presents an approach for addressing these problems by learning user-interpretable symbolic descriptions of the limits and capabilities of a black-box AI system using low-level simulators. It uses a hierarchical active querying paradigm to generate questions and to learn a user-interpretable model of the AI system based on its responses. In contrast to prior work, we consider settings where imprecision of the user's conceptual vocabulary precludes a direct expression of the agent's capabilities. Furthermore, our approach does not require assumptions about the internal design of the target AI system or about the methods that it may use to compute or learn task solutions. Empirical evaluation on several game-based simulator domains shows that this approach can efficiently learn symbolic models of AI systems that use a deterministic black-box policy in fully observable scenarios.
Generalizing Fairness: Discovery and Mitigation of Unknown Sensitive Attributes
Paul, William, Burlina, Philippe
When deploying artificial intelligence (AI) in the real world, being able to trust the operation of the AI by characterizing how it performs is an ever-present and important topic. An important and still largely unexplored task in this characterization is determining major factors within the real world that affect the AI's behavior, such as weather conditions or lighting, and either a) being able to give justification for why it may have failed or b) eliminating the influence the factor has. Determining these sensitive factors heavily relies on collected data that is diverse enough to cover numerous combinations of these factors, which becomes more onerous when having many potential sensitive factors or operating in complex environments. This paper investigates methods that discover and separate out individual semantic sensitive factors from a given dataset to conduct this characterization as well as addressing mitigation of these factors' sensitivity. We also broaden remediation of fairness, which normally only addresses socially relevant factors, and widen it to deal with the desensitization of AI with regard to all possible aspects of variation in the domain. The proposed methods which discover these major factors reduce the potentially onerous demands of collecting a sufficiently diverse dataset. In experiments using the road sign (GTSRB) and facial imagery (CelebA) datasets, we show the promise of using this scheme to perform this characterization and remediation and demonstrate that our approach outperforms state of the art approaches.
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Liu, Pengfei, Yuan, Weizhe, Fu, Jinlan, Jiang, Zhengbao, Hayashi, Hiroaki, Neubig, Graham
This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning". Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly. To use these models to perform prediction tasks, the original input x is modified using a template into a textual string prompt x' that has some unfilled slots, and then the language model is used to probabilistically fill the unfilled information to obtain a final string x, from which the final output y can be derived. This framework is powerful and attractive for a number of reasons: it allows the language model to be pre-trained on massive amounts of raw text, and by defining a new prompting function the model is able to perform few-shot or even zero-shot learning, adapting to new scenarios with few or no labeled data. In this paper we introduce the basics of this promising paradigm, describe a unified set of mathematical notations that can cover a wide variety of existing work, and organize existing work along several dimensions, e.g.the choice of pre-trained models, prompts, and tuning strategies. To make the field more accessible to interested beginners, we not only make a systematic review of existing works and a highly structured typology of prompt-based concepts, but also release other resources, e.g., a website http://pretrain.nlpedia.ai/ including constantly-updated survey, and paperlist.
Towards Neural Schema Alignment for OpenStreetMap and Knowledge Graphs
Dsouza, Alishiba, Tempelmeier, Nicolas, Demidova, Elena
OpenStreetMap (OSM) is one of the richest openly available sources of volunteered geographic information. Although OSM includes various geographical entities, their descriptions are highly heterogeneous, incomplete, and do not follow any well-defined ontology. Knowledge graphs can potentially provide valuable semantic information to enrich OSM entities. However, interlinking OSM entities with knowledge graphs is inherently difficult due to the large, heterogeneous, ambiguous, and flat OSM schema and the annotation sparsity. This paper tackles the alignment of OSM tags with the corresponding knowledge graph classes holistically by jointly considering the schema and instance layers. We propose a novel neural architecture that capitalizes upon a shared latent space for tag-to-class alignment created using linked entities in OSM and knowledge graphs. Our experiments performed to align OSM datasets for several countries with two of the most prominent openly available knowledge graphs, namely, Wikidata and DBpedia, demonstrate that the proposed approach outperforms the state-of-the-art schema alignment baselines by up to 53 percentage points in terms of F1-score. The resulting alignment facilitates new semantic annotations for over 10 million OSM entities worldwide, which is more than a 400% increase compared to the existing semantic annotations in OSM.
Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent
Jing, Gangshan, Bai, He, George, Jemin, Chakrabortty, Aranya, Sharma, Piyush K.
Recently introduced distributed zeroth-order optimization (ZOO) algorithms have shown their utility in distributed reinforcement learning (RL). Unfortunately, in the gradient estimation process, almost all of them require random samples with the same dimension as the global variable and/or require evaluation of the global cost function, which may induce high estimation variance for large-scale networks. In this paper, we propose a novel distributed zeroth-order algorithm by leveraging the network structure inherent in the optimization objective, which allows each agent to estimate its local gradient by local cost evaluation independently, without use of any consensus protocol. The proposed algorithm exhibits an asynchronous update scheme, and is designed for stochastic non-convex optimization with a possibly non-convex feasible domain based on the block coordinate descent method. The algorithm is later employed as a distributed model-free RL algorithm for distributed linear quadratic regulator design, where a learning graph is designed to describe the required interaction relationship among agents in distributed learning. We provide an empirical validation of the proposed algorithm to benchmark its performance on convergence rate and variance against a centralized ZOO algorithm.
Council Post: An AI Road Map Starts With Data
CTO of Infostretch, a digital engineering services company that helps enterprises prosper in the digital age. Several years into what many people expected to be an AI revolution, there is a nagging sense that we are at a crossroads. Artificial intelligence is an evolutionary step forward for business optimization strategies -- and rightly so -- but the companies that saw AI as the path to the promised land could be forgiven for thinking that the hype has outweighed successful implementation. Granted, there are numerous organizations that have integrated AI into their business processes, and it is already a routine part of software development, cybersecurity, natural language processing and robotic process automation (RPA). And yes, making AI a priority in terms of scalability and an accelerated time to market has shown a modicum of success.
Real Estate pricing with Machine Learning & non-traditional data sources
This blog post has been written with the collaboration of Guillermo Etchebarne. Real estate is the world's largest asset class, worthing $277 trillion (that's 277 followed by 12 zeros, in case you were wondering), three times the total value of all publicly traded companies. And Machine Learning applications have been accompanying its sector's growth. The main technology trend disrupting real estate today is ML. The publication "Emerging Trends in Real Estate 2021" by PwC reached a similar conclusion: Artificial Intelligence is among the main industry disrupters. With that said, one of the most popular AI applications in the industry is intelligent investing. Traditionally, experts answer these questions by reviewing at different data sources . The problem arises when there are thousands or even hundreds of thousands of data points to analyze.
What's the future of mobile app development?
The mobile industry is constantly evolving -- what was a groundbreaking technology yesterday may be a forgotten thing of the past today. That's why you need to continuously observe the mobile market so that you don't miss the opportunity for business growth. But you can't just analyse what's trending right now, that would be too easy. The mobile app market is changing at break-neck speed, so if you want to reach a much wider audience, improve user retention and get more app downloads, you need to predict future trends. That's what I'm here to help with.
Core Challenges in Embodied Vision-Language Planning
Francis, Jonathan, Kitamura, Nariaki, Labelle, Felix, Lu, Xiaopeng, Navarro, Ingrid, Oh, Jean
Recent advances in the areas of multimodal machine learning and artificial intelligence (AI) have led to the development of challenging tasks at the intersection of Computer Vision, Natural Language Processing, and Embodied AI. Whereas many approaches and previous survey pursuits have characterised one or two of these dimensions, there has not been a holistic analysis at the center of all three. Moreover, even when combinations of these topics are considered, more focus is placed on describing, e.g., current architectural methods, as opposed to also illustrating high-level challenges and opportunities for the field. In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language. We propose a taxonomy to unify these tasks and provide an in-depth analysis and comparison of the new and current algorithmic approaches, metrics, simulated environments, as well as the datasets used for EVLP tasks. Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment.