 Overview


Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond

arXiv.org Artificial Intelligence

Machine reading comprehension (MRC) aims to teach machines to read and comprehend human languages, which is a long-standing goal of natural language processing (NLP). With the rise of deep neural networks and the evolution of contextualized language models (CLMs), MRC research has experienced two significant breakthroughs. MRC and CLM, as a phenomenon, have had a great impact on the NLP community. In this survey, we provide a comprehensive and comparative review of MRC, covering 1) the origin and development of MRC and CLM, with a particular focus on the role of CLMs; 2) the impact of MRC and CLM on the NLP community; 3) the definition, datasets, and evaluation of MRC; 4) general MRC architecture and technical methods, viewed as a two-stage Encoder-Decoder solving architecture inspired by the human cognitive process; 5) previous highlights, emerging topics, and our empirical analysis, in which we especially focus on what works in different periods of MRC research. We propose a full-view categorization and new taxonomies on these topics. The primary views we have arrived at are that 1) MRC boosts the progress from language processing to understanding; 2) the rapid improvement of MRC systems greatly benefits from the development of CLMs; 3) the theme of MRC is gradually moving from shallow text matching to cognitive reasoning.


A computational model implementing subjectivity with the 'Room Theory'. The case of detecting Emotion from Text

arXiv.org Machine Learning

This work introduces a new method to consider subjectivity and general context dependency in text analysis and uses as an example the detection of emotions conveyed in text. The proposed method takes subjectivity into account using a computational version of the Framework Theory by Marvin Minsky (1974), leveraging the Word2Vec approach to text vectorization by Mikolov et al. (2013), which generates distributed representations of words based on the context where they appear. Our approach is based on three components: 1. a framework/"room" representing the point of view; 2. a benchmark representing the criteria for the analysis - in this case the emotion classification, from a study of human emotions by Robert Plutchik (1980); and 3. the document to be analyzed. By using a similarity measure between words, we are able to extract the relative relevance of the elements in the benchmark - intensities of emotions in our case study - for the document to be analyzed. Our method provides a measure that takes into account the point of view of the entity reading the document. This method could be applied to all cases where evaluating subjectivity is relevant to understanding the relative value or meaning of a text. Subjectivity need not be limited to human reactions; it could also be used to give a text an interpretation related to a given domain ("room"). To evaluate our method, we used a test case in the political domain.
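
The three components described in this abstract (room, benchmark, document) can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors in `ROOM` are hypothetical stand-ins for Word2Vec embeddings trained on a corpus that represents a point of view, and averaging cosine similarities is an assumed scoring rule chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def emotion_scores(room, benchmark, document):
    """Score each benchmark emotion by its mean similarity to the document's words.

    room      -- dict mapping word -> embedding vector (the point of view)
    benchmark -- list of emotion words (the criteria for the analysis)
    document  -- text to analyze; words missing from the room are ignored
    """
    tokens = [t for t in document.lower().split() if t in room]
    scores = {}
    for emotion in benchmark:
        if emotion not in room or not tokens:
            scores[emotion] = 0.0
            continue
        sims = [cosine(room[t], room[emotion]) for t in tokens]
        scores[emotion] = sum(sims) / len(sims)
    return scores


# Hypothetical toy "room": real embeddings would have hundreds of dimensions.
ROOM = {
    "joy": (1.0, 0.0),
    "fear": (0.0, 1.0),
    "victory": (0.9, 0.1),
    "threat": (0.1, 0.9),
}
BENCHMARK = ["joy", "fear"]  # two of Plutchik's basic emotions

scores = emotion_scores(ROOM, BENCHMARK, "victory victory threat")
print(scores)
```

Swapping in a different `ROOM` (embeddings trained on another corpus) changes the scores for the same document, which is the sense in which the measure depends on the reader's point of view.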


Deep Learning Techniques for Inverse Problems in Imaging

arXiv.org Machine Learning

Recent work in machine learning shows that deep neural networks can be used to solve a wide variety of inverse problems arising in computational imaging. We explore the central prevailing themes of this emerging area and present a taxonomy that can be used to categorize different problems and reconstruction methods. Our taxonomy is organized along two central axes: (1) whether or not a forward model is known and to what extent it is used in training and testing, and (2) whether or not the learning is supervised or unsupervised, i.e., whether or not the training relies on access to matched ground truth image and measurement pairs. We also discuss the trade-offs associated with these different reconstruction approaches, caveats and common failure modes, plus open problems and avenues for future work.


Goal Recognition over Imperfect Domain Models

arXiv.org Artificial Intelligence

Goal recognition is the problem of recognizing the intended goal of autonomous agents or humans by observing their behavior in an environment. In recent years, most existing approaches to goal and plan recognition have ignored the need to deal with imperfections in the domain model that formalizes the environment where autonomous agents behave. In this thesis, we introduce the problem of goal recognition over imperfect domain models, and develop solution approaches that explicitly deal with two distinct types of imperfect domain models: (1) incomplete discrete domain models that have possible, rather than known, preconditions and effects in action descriptions; and (2) approximate continuous domain models, where the transition function is approximated from past observations and not well-defined. We develop novel goal recognition approaches over imperfect domain models by leveraging and adapting existing recognition approaches from the literature. Experiments and evaluation over these two types of imperfect domain models show that our novel goal recognition approaches are accurate in comparison to baseline approaches from the literature, at several levels of observability and imperfection.


Announcing Confetti: A Vision for the Future of Artificial Intelligence in the Real World

#artificialintelligence

Nowadays the term artificial intelligence (AI) has become synonymous with "technology of the future." Since 2012, when neural networks trounced the ImageNet image classification challenge, machine learning has enabled extraordinary advances across diverse domains such as vision, translation, and speech recognition. We have seen a widespread democratization of the knowledge needed to get started in AI. Cheap consumer hardware, easy access to datasets, and the prevalence of powerful open-source frameworks such as PyTorch and TensorFlow have significantly lowered the barrier to entry. It has become clear that AI is going to transform the fabric of society in ways never seen before.


System-Level Predictive Maintenance: Review of Research Literature and Gap Analysis

arXiv.org Artificial Intelligence

This paper reviews current literature in the field of predictive maintenance from the system point of view. We differentiate the existing capabilities of condition estimation and failure risk forecasting, as currently applied to simple components, from the capabilities needed to solve the same tasks for complex assets. System-level analysis faces more complex latent degradation states. It has to comprehensively account for active maintenance programs at each component level and consider coupling between different maintenance actions, while reflecting increased monetary and safety costs for system failures. As a result, methods that are effective for forecasting risk and informing maintenance decisions regarding individual components do not readily scale to provide reliable sub-system or system-level insights. A novel holistic modeling approach is needed to incorporate available structural and physical knowledge and naturally handle the complexities of actively fielded and maintained assets.


ML Engineer, Data Scientist, Research Scientist: What's the Difference?

#artificialintelligence

If you have to write an artificial intelligence (AI) or machine learning (ML) job description, it can be difficult to convey precisely what kind of new employee you want to hire. Doing so requires using the right language, plus understanding what type of role is most appropriate for what you want to achieve. To guide you through the challenging process of recruiting top AI talent, we'll start by looking at the differences between different AI & ML roles. Then, we'll discuss who should be your first hires depending on the approach you choose for your ML projects. We also recommend you make sure that you don't do these seven things to scare off the AI talent you're trying to hire.


Pruning Algorithms to Accelerate Convolutional Neural Networks for Edge Applications: A Survey

arXiv.org Machine Learning

With the general trend of increasing Convolutional Neural Network (CNN) model sizes, model compression and acceleration techniques have become critical for the deployment of these models on edge devices. In this paper, we provide a comprehensive survey on Pruning, a major compression strategy that removes non-critical or redundant neurons from a CNN model. The survey covers the overarching motivation for pruning, different strategies and criteria, their advantages and drawbacks, along with a compilation of major pruning techniques. We conclude the survey with a discussion on alternatives to pruning and current challenges for the model compression community.
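
As a rough illustration of what removing non-critical units can look like, the sketch below ranks convolutional filters by the L1 norm of their weights and keeps only the largest ones. The L1-norm criterion is an assumption chosen for illustration, since the abstract does not name a specific criterion, and `prune_filters` and `keep_ratio` are hypothetical names. Filters are represented as flat lists of weights to keep the example dependency-free.

```python
def prune_filters(filters, keep_ratio):
    """Keep the filters with the largest L1 norm.

    filters    -- list of filters, each a flat list of weights
    keep_ratio -- fraction of filters to keep, in (0, 1]
    Returns (kept_indices, kept_filters), preserving the original order.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(filters) * keep_ratio))
    # L1 norm as a proxy for importance (an assumed criterion).
    norms = [sum(abs(w) for w in f) for f in filters]
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return kept, [filters[i] for i in kept]


# Toy layer with four filters of two weights each.
layer = [[0.5, -0.5], [0.01, 0.02], [1.0, 2.0], [0.0, 0.1]]
kept_indices, pruned_layer = prune_filters(layer, keep_ratio=0.5)
print(kept_indices, pruned_layer)
```

In a real network, removing a filter also requires removing the matching input channel in the next layer, and accuracy is usually recovered by fine-tuning afterwards; the sketch omits both steps.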


Towards Knowledgeable Supervised Lifelong Learning Systems

Journal of Artificial Intelligence Research

Learning a sequence of tasks is a long-standing challenge in machine learning. This setting applies to learning systems that observe examples of a range of tasks at different points in time. A learning system should become more knowledgeable as more related tasks are learned. Although the problem of learning sequentially was first acknowledged decades ago, research in this area has been rather limited. Research in transfer learning, multitask learning, metalearning and deep learning has studied some challenges of these kinds of systems. Recent research in lifelong machine learning and continual learning has revived interest in this problem. We propose Proficiente, a full framework for long-term learning systems. Proficiente relies on knowledge transferred between hypotheses learned with Support Vector Machines. The first component of the framework focuses on transferring forward selectively from a set of existing hypotheses or functions, representing knowledge acquired during previous tasks, to a new target task. A second component of Proficiente focuses on transferring backward, a novel ability of long-term learning systems that aims to exploit knowledge derived from recent tasks to encourage refinement of existing knowledge. We propose a method that transfers selectively from a recently learned task to existing hypotheses representing previous tasks. The method encourages retention of existing knowledge whilst refining it. We analyse the theoretical properties of the proposed framework. Proficiente is accompanied by an agnostic metric that can be used to determine if a long-term learning system is becoming more knowledgeable. We evaluate Proficiente on both synthetic and real-world datasets, and demonstrate scenarios where knowledgeable supervised learning systems can be achieved by means of transfer.


Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions

arXiv.org Machine Learning

Generative Adversarial Networks (GANs) are a novel class of deep generative models that have recently gained significant attention. GANs learn complex and high-dimensional distributions implicitly over images, audio, and data. However, there exist major challenges in training GANs, i.e., mode collapse, non-convergence and instability, due to inappropriate design of the network architecture, use of the objective function and selection of the optimization algorithm. Recently, to address these challenges, several solutions for better design and optimization of GANs have been investigated, based on re-engineered network architectures, new objective functions and alternative optimization algorithms. To the best of our knowledge, there is no existing survey that has particularly focused on the broad and systematic development of these solutions. In this study, we perform a comprehensive survey of the advancements in GAN design and optimization solutions proposed to handle GAN challenges. We first identify key research issues within each design and optimization technique and then propose a new taxonomy to structure solutions by key research issues. In accordance with the taxonomy, we provide a detailed discussion of the different GAN variants proposed within each solution and their relationships. Finally, based on the insights gained, we present promising research directions in this rapidly growing field.
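
For context on the objective functions this abstract refers to, the standard minimax objective of the original GAN formulation is reproduced below. It is background added here rather than content of the abstract. The notation is the usual one: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the data distribution, and $p_z$ the noise prior.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The training problems named in the abstract (mode collapse, non-convergence, instability) arise when optimizing objectives of this kind, which is why alternative objective functions are one of the solution families such surveys cover.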