Goto

Collaborating Authors

 Overview


Monotonic Gaussian process for physics-constrained machine learning with materials science applications

arXiv.org Artificial Intelligence

Physics-constrained machine learning is emerging as an important topic in the field of machine learning for physics. One of the most significant advantages of incorporating physics constraints into machine learning methods is that the resulting model requires significantly less data to train. By incorporating physical rules into the machine learning formulation itself, the predictions are expected to be physically plausible. Gaussian process (GP) is perhaps one of the most common methods in machine learning for small datasets. In this paper, we investigate the possibility of constraining a GP formulation with monotonicity on three different material datasets, where one experimental and two computational datasets are used. The monotonic GP is compared against the regular GP, where a significant reduction in the posterior variance is observed. The monotonic GP is strictly monotonic in the interpolation regime, but in the extrapolation regime, the monotonic effect starts fading away as one goes beyond the training dataset. Imposing monotonicity on the GP comes at a small accuracy cost, compared to the regular GP. The monotonic GP is perhaps most useful in applications where data is scarce and noisy, and monotonicity is supported by strong physical evidence.


Predicting Customer Lifetime Value in Free-to-Play Games

arXiv.org Artificial Intelligence

Customer lifetime value (CLV or LTV) refers broadly to the revenue that a company can attribute to one or more customer over the length of their relationship with the company [55]. The process of predicting the lifetime value consists in producing one or more monetary values that correspond to the sum of all the different types of revenues that a specific customer, or a specific cohort, will generate in the future. The purposes of this prediction are manifold: for example, having an early estimation of a customer's potential value allows more accurate budgeting for future investment; moreover, monitoring the remaining potential revenue from an established customer could permit preemptive actions in case of decreased engagement. Predicting customer lifetime value is a complex challenge and, to date, there is no single established practice. Furthermore, due to its wide potential impact in different business aspects, the problem is being researched in different communities using a plethora of different techniques, varying from parametric statistical models to deep learning [28, 70].


Monolingual alignment of word senses and definitions in lexicographical resources

arXiv.org Artificial Intelligence

The focus of this thesis is broadly on the alignment of lexicographical data, particularly dictionaries. In order to tackle some of the challenges in this field, two main tasks of word sense alignment and translation inference are addressed. The first task aims to find an optimal alignment given the sense definitions of a headword in two different monolingual dictionaries. This is a challenging task, especially due to differences in sense granularity, coverage and description in two resources. After describing the characteristics of various lexical semantic resources, we introduce a benchmark containing 17 datasets of 15 languages where monolingual word senses and definitions are manually annotated across different resources by experts. In the creation of the benchmark, lexicographers' knowledge is incorporated through the annotations where a semantic relation, namely exact, narrower, broader, related or none, is selected for each sense pair. This benchmark can be used for evaluation purposes of word-sense alignment systems. The performance of a few alignment techniques based on textual and non-textual semantic similarity detection and semantic relation induction is evaluated using the benchmark. Finally, we extend this work to translation inference where translation pairs are induced to generate bilingual lexicons in an unsupervised way using various approaches based on graph analysis. This task is of particular interest for the creation of lexicographical resources for less-resourced and under-represented languages and also, assists in increasing coverage of the existing resources. From a practical point of view, the techniques and methods that are developed in this thesis are implemented within a tool that can facilitate the alignment task.


DC-MRTA: Decentralized Multi-Robot Task Allocation and Navigation in Complex Environments

arXiv.org Artificial Intelligence

We present a novel reinforcement learning (RL) based task allocation and decentralized navigation algorithm for mobile robots in warehouse environments. Our approach is designed for scenarios in which multiple robots are used to perform various pick up and delivery tasks. We consider the problem of joint decentralized task allocation and navigation and present a two level approach to solve it. At the higher level, we solve the task allocation by formulating it in terms of Markov Decision Processes and choosing the appropriate rewards to minimize the Total Travel Delay (TTD). At the lower level, we use a decentralized navigation scheme based on ORCA that enables each robot to perform these tasks in an independent manner, and avoid collisions with other robots and dynamic obstacles. We combine these lower and upper levels by defining rewards for the higher level as the feedback from the lower level navigation algorithm. We perform extensive evaluation in complex warehouse layouts with large number of agents and highlight the benefits over state-of-the-art algorithms based on myopic pickup distance minimization and regret-based task selection. We observe improvement up to 14% in terms of task completion time and up-to 40% improvement in terms of computing collision-free trajectories for the robots.


Use and Misuse of Machine Learning in Anthropology

arXiv.org Artificial Intelligence

Machine learning (ML), being now widely accessible to the research community at large, has fostered a proliferation of new and striking applications of these emergent mathematical techniques across a wide range of disciplines. In this paper, we will focus on a particular case study: the field of paleoanthropology, which seeks to understand the evolution of the human species based on biological and cultural evidence. As we will show, the easy availability of ML algorithms and lack of expertise on their proper use among the anthropological research community has led to foundational misapplications that have appeared throughout the literature. The resulting unreliable results not only undermine efforts to legitimately incorporate ML into anthropological research, but produce potentially faulty understandings about our human evolutionary and behavioral past. The aim of this paper is to provide a brief introduction to some of the ways in which ML has been applied within paleoanthropology; we also include a survey of some basic ML algorithms for those who are not fully conversant with the field, which remains under active development. We discuss a series of missteps, errors, and violations of correct protocols of ML methods that appear disconcertingly often within the accumulating body of anthropological literature. These mistakes include use of outdated algorithms and practices; inappropriate train/test splits, sample composition, and textual explanations; as well as an absence of transparency due to the lack of data/code sharing, and the subsequent limitations imposed on independent replication. We assert that expanding samples, sharing data and code, re-evaluating approaches to peer review, and, most importantly, developing interdisciplinary teams that include experts in ML are all necessary for progress in future research incorporating ML within anthropology.


Handcrafted Feature Selection Techniques for Pattern Recognition: A Survey

arXiv.org Artificial Intelligence

The accuracy of a classifier, when performing Pattern recognition, is mostly tied to the quality and representativeness of the input feature vector. Feature Selection is a process that allows for representing information properly and may increase the accuracy of a classifier. This process is responsible for finding the best possible features, thus allowing us to identify to which class a pattern belongs. Feature selection methods can be categorized as Filters, Wrappers, and Embed. This paper presents a survey on some Filters and Wrapper methods for handcrafted feature selection. Some discussions, with regard to the data structure, processing time, and ability to well represent a feature vector, are also provided in order to explicitly show how appropriate some methods are in order to perform feature selection. Therefore, the presented feature selection methods can be accurate and efficient if applied considering their positives and negatives, finding which one fits best the problem's domain may be the hardest task.


Comparing Methods for Extractive Summarization of Call Centre Dialogue

arXiv.org Artificial Intelligence

This paper provides results of evaluating some text summarisation techniques for the purpose of producing call summaries for contact centre solutions. We specifically focus on extractive summarisation methods, as they do not require any labelled data and are fairly quick and easy to implement for production use. We experimentally compare several such methods by using them to produce summaries of calls, and evaluating these summaries objectively (using ROUGE-L) and subjectively (by aggregating the judgements of several annotators). We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations. The results demonstrate that even such simple heuristics-based methods like Lead-N ca n produce meaningful and useful summaries of call centre dialogues.


Understanding the Behavior of Belief Propagation

arXiv.org Artificial Intelligence

Probabilistic graphical models are a powerful concept for modeling high-dimensional distributions. Besides modeling distributions, probabilistic graphical models also provide an elegant framework for performing statistical inference; because of the high-dimensional nature, however, one must often use approximate methods for this purpose. Belief propagation performs approximate inference, is efficient, and looks back on a long success-story. Yet, in most cases, belief propagation lacks any performance and convergence guarantees. Many realistic problems are presented by graphical models with loops, however, in which case belief propagation is neither guaranteed to provide accurate estimates nor that it converges at all. This thesis investigates how the model parameters influence the performance of belief propagation. We are particularly interested in their influence on (i) the number of fixed points, (ii) the convergence properties, and (iii) the approximation quality.


A Comprehensive Review of Deep Learning-based Single Image Super-resolution

arXiv.org Artificial Intelligence

Image super-resolution (SR) is one of the vital image processing methods that improve the resolution of an image in the field of computer vision. In the last two decades, significant progress has been made in the field of super-resolution, especially by utilizing deep learning methods. This survey is an effort to provide a detailed survey of recent progress in single-image super-resolution in the perspective of deep learning while also informing about the initial classical methods used for image super-resolution. The survey classifies the image SR methods into four categories, i.e., classical methods, supervised learning-based methods, unsupervised learning-based methods, and domain-specific SR methods. We also introduce the problem of SR to provide intuition about image quality metrics, available reference datasets, and SR challenges. Deep learning-based approaches of SR are evaluated using a reference dataset. Some of the reviewed state-of-the-art image SR methods include the enhanced deep SR network (EDSR), cycle-in-cycle GAN (CinCGAN), multiscale residual network (MSRN), meta residual dense network (Meta-RDN), recurrent back-projection network (RBPN), second-order attention network (SAN), SR feedback network (SRFBN) and the wavelet-based residual attention network (WRAN). Finally, this survey is concluded with future directions and trends in SR and open problems in SR to be addressed by the researchers.


Cognitive Architecture for Co-Evolutionary Hybrid Intelligence

arXiv.org Artificial Intelligence

This paper questions the feasibility of a strong (general) data-centric artificial intelligence (AI). The disadvantages of this type of intelligence are discussed. As an alternative, the concept of co-evolutionary hybrid intelligence is proposed. It is based on the cognitive interoperability of man and machine. An analysis of existing approaches to the construction of cognitive architectures is given. An architecture seamlessly incorporates a human into the loop of intelligent problem solving is considered. The article is organized as follows. The first part contains a critique of data-centric intelligent systems. The reasons why it is impossible to create a strong artificial intelligence based on this type of intelligence are indicated. The second part briefly presents the concept of co-evolutionary hybrid intelligence and shows its advantages. The third part gives an overview and analysis of existing cognitive architectures. It is concluded that many do not consider humans part of the intelligent data processing process. The next part discusses the cognitive architecture for co-evolutionary hybrid intelligence, providing integration with humans. It finishes with general conclusions about the feasibility of developing intelligent systems with humans in the problem-solving loop.