Goto

Collaborating Authors

 Edmonton


Explainable Artificial Intelligence: a Systematic Review

arXiv.org Artificial Intelligence

This has led to the development of a plethora of domain-dependent and context-specific methods for dealing with the interpretation of machine learning (ML) models and the formation of explanations for humans. Unfortunately, this trend is far from being over, with an abundance of knowledge in the field which is scattered and needs organisation. The goal of this article is to systematically review research works in the field of XAI and to try to define some boundaries in the field. From several hundreds of research articles focused on the concept of explainability, about 350 have been considered for review by using the following search methodology. In a first phase, Google Scholar was queried to find papers related to "explainable artificial intelligence", "explainable machine learning" and "interpretable machine learning". Subsequently, the bibliographic section of these articles was thoroughly examined to retrieve further relevant scientific studies. The first noticeable thing, as shown in figure 2 (a), is the distribution of the publication dates of selected research articles: sporadic in the 70s and 80s, receiving preliminary attention in the 90s, showing raising interest in 2000 and becoming a recognised body of knowledge after 2010. The first research concerned the development of an explanation-based system and its integration in a computer program designed to help doctors make diagnoses [3]. Some of the more recent papers focus on work devoted to the clustering of methods for explainability, motivating the need for organising the XAI literature [4, 5, 6].


Single-Agent Optimization Through Policy Iteration Using Monte-Carlo Tree Search

arXiv.org Artificial Intelligence

The combination of Monte-Carlo Tree Search (MCTS) and deep reinforcement learning is state-of-the-art in two-player perfect-information games. In this paper, we describe a search algorithm that uses a variant of MCTS which we enhanced by 1) a novel action value normalization mechanism for games with potentially unbounded rewards (which is the case in many optimization problems), 2) defining a virtual loss function that enables effective search parallelization, and 3) a policy network, trained by generations of self-play, to guide the search. We gauge the effectiveness of our method in "SameGame"---a popular single-player test domain. Our experimental results indicate that our method outperforms baseline algorithms on several board sizes. Additionally, it is competitive with state-of-the-art search algorithms on a public set of positions.


Competing in a Complex Hidden Role Game with Information Set Monte Carlo Tree Search

arXiv.org Artificial Intelligence

Advances in intelligent game playing agents have led to successes in perfect information games like Go and imperfect information games like Poker. The Information Set Monte Carlo Tree Search (ISMCTS) family of algorithms outperforms previous algorithms using Monte Carlo methods in imperfect information games. In this paper, Single Observer Information Set Monte Carlo Tree Search (SO-ISMCTS) is applied to Secret Hitler, a popular social deduction board game that combines traditional hidden role mechanics with the randomness of a card deck. This combination leads to a more complex information model than the hidden role and card deck mechanics alone. It is shown in 10108 simulated games that SO-ISMCTS plays as well as simpler rule based agents, and demonstrates the potential of ISMCTS algorithms in complicated information set domains.


A Novel Granular-Based Bi-Clustering Method of Deep Mining the Co-Expressed Genes

arXiv.org Artificial Intelligence

Traditional clustering methods are limited when dealing with huge and heterogeneous groups of gene expression data, which motivates the development of bi-clustering methods. Bi-clustering methods are used to mine bi-clusters whose subsets of samples (genes) are co-regulated under their test conditions. Studies show that mining bi-clusters of consistent trends and trends with similar degrees of fluctuations from the gene expression data is essential in bioinformatics research. Unfortunately, traditional bi-clustering methods are not fully effective in discovering such bi-clusters. Therefore, we propose a novel bi-clustering method by involving here the theory of Granular Computing. In the proposed scheme, the gene data matrix, considered as a group of time series, is transformed into a series of ordered information granules. With the information granules we build a characteristic matrix of the gene data to capture the fluctuation trend of the expression value between consecutive conditions to mine the ideal bi-clusters. The experimental results are in agreement with the theoretical analysis, and show the excellent performance of the proposed method.


A multi-component framework for the analysis and design of explainable artificial intelligence

arXiv.org Artificial Intelligence

The rapid growth of research in explainable artificial intelligence (XAI) follows on two substantial developments. First, the enormous application success of modern machine learning methods, especially deep and reinforcement learning, which have created high expectations for industrial, commercial and social value. Second, the emergence of concern for creating trusted AI systems, including the creation of regulatory principles to ensure transparency and trust of AI systems.These two threads have created a kind of "perfect storm" of research activity, all eager to create and deliver it any set of tools and techniques to address the XAI demand. As some surveys of current XAI suggest, there is yet to appear a principled framework that respects the literature of explainability in the history of science, and which provides a basis for the development of a framework for transparent XAI. Here we intend to provide a strategic inventory of XAI requirements, demonstrate their connection to a history of XAI ideas, and synthesize those ideas into a simple framework to calibrate five successive levels of XAI.


Multivariate Confidence Calibration for Object Detection

arXiv.org Machine Learning

Unbiased confidence estimates of neural networks are crucial especially for safety-critical applications. Many methods have been developed to calibrate biased confidence estimates. Though there is a variety of methods for classification, the field of object detection has not been addressed yet. Therefore, we present a novel framework to measure and calibrate biased (or miscalibrated) confidence estimates of object detection methods. The main difference to related work in the field of classifier calibration is that we also use additional information of the regression output of an object detector for calibration. Our approach allows, for the first time, to obtain calibrated confidence estimates with respect to image location and box scale. In addition, we propose a new measure to evaluate miscalibration of object detectors. Finally, we show that our developed methods outperform state-of-the-art calibration models for the task of object detection and provides reliable confidence estimates across different locations and scales.


Augmentation of the Reconstruction Performance of Fuzzy C-Means with an Optimized Fuzzification Factor Vector

arXiv.org Artificial Intelligence

Information granules have been considered to be the fundamental constructs of Granular Computing (GrC). As a useful unsupervised learning technique, Fuzzy C-Means (FCM) is one of the most frequently used methods to construct information granules. The FCM-based granulation-degranulation mechanism plays a pivotal role in GrC. In this paper, to enhance the quality of the degranulation (reconstruction) process, we augment the FCM-based degranulation mechanism by introducing a vector of fuzzification factors (fuzzification factor vector) and setting up an adjustment mechanism to modify the prototypes and the partition matrix. The design is regarded as an optimization problem, which is guided by a reconstruction criterion. In the proposed scheme, the initial partition matrix and prototypes are generated by the FCM. Then a fuzzification factor vector is introduced to form an appropriate fuzzification factor for each cluster to build up an adjustment scheme of modifying the prototypes and the partition matrix. With the supervised learning mode of the granulation-degranulation process, we construct a composite objective function of the fuzzification factor vector, the prototypes and the partition matrix. Subsequently, the particle swarm optimization (PSO) is employed to optimize the fuzzification factor vector to refine the prototypes and develop the optimal partition matrix. Finally, the reconstruction performance of the FCM algorithm is enhanced. We offer a thorough analysis of the developed scheme. In particular, we show that the classical FCM algorithm forms a special case of the proposed scheme. Experiments completed for both synthetic and publicly available datasets show that the proposed approach outperforms the generic data reconstruction approach.


Towards Designer Modeling through Design Style Clustering

arXiv.org Artificial Intelligence

We propose modeling designer style in mixed-initiative game content creation tools as archetypical design traces. These design traces are formulated as transitions between design styles; these design styles are in turn found through clustering all intermediate designs along the way to making a complete design. This method is implemented in the Evolutionary Dungeon Designer, a prototype mixed-initiative system for roguelike games. We present results both in the form of design styles for rooms, which can be analyzed to better understand the kind of rooms designed by users, and in the form of archetypical sequences between these rooms. We further discuss how the results here can be used to create style-sensitive suggestions. Such suggestions would allow the system to be one step ahead of the designer, offering suggestions for the next phase, assuming that the designer will follow one of the archetypical design traces.


On the Integration of LinguisticFeatures into Statistical and Neural Machine Translation

arXiv.org Artificial Intelligence

New machine translations (MT) technologies are emerging rapidly and with them, bold claims of achieving human parity such as: (i) the results produced approach "accuracy achieved by average bilingual human translators" (Wu et al., 2017b) or (ii) the "translation quality is at human parity when compared to professional human translators" (Hassan et al., 2018) have seen the light of day (Laubli et al., 2018). Aside from the fact that many of these papers craft their own definition of human parity, these sensational claims are often not supported by a complete analysis of all aspects involved in translation. Establishing the discrepancies between the strengths of statistical approaches to MT and the way humans translate has been the starting point of our research. By looking at MT output and linguistic theory, we were able to identify some remaining issues. The problems range from simple number and gender agreement errors to more complex phenomena such as the correct translation of aspectual values and tenses. Our experiments confirm, along with other studies (Bentivogli et al., 2016), that neural MT has surpassed statistical MT in many aspects. However, some problems remain and others have emerged. We cover a series of problems related to the integration of specific linguistic features into statistical and neural MT, aiming to analyse and provide a solution to some of them. Our work focuses on addressing three main research questions that revolve around the complex relationship between linguistics and MT in general. We identify linguistic information that is lacking in order for automatic translation systems to produce more accurate translations and integrate additional features into the existing pipelines. We identify overgeneralization or 'algorithmic bias' as a potential drawback of neural MT and link it to many of the remaining linguistic issues.


Validation Set Evaluation can be Wrong: An Evaluator-Generator Approach for Maximizing Online Performance of Ranking in E-commerce

arXiv.org Artificial Intelligence

Learning-to-rank (LTR) has become a key technology in E-commerce applications. Previous LTR approaches followed the supervised learning paradigm so that learned models should match the labeled data point-wisely or pair-wisely. However, we have noticed that global context information, including the total order of items in the displayed webpage, can play an important role in interactions with the customers. Therefore, to approach the best global ordering, the exploration in a large combinatorial space of items is necessary, which requires evaluating orders that may not appear in the labeled data. In this scenario, we first show that the classical data-based metrics can be inconsistent with online performance, or even misleading. We then propose to learn an evaluator and search the best model guided by the evaluator, which forms the evaluator-generator framework for training the group-wise LTR model. The evaluator is learned from the labeled data, and is enhanced by incorporating the order context information. The generator is trained with the supervision of the evaluator by reinforcement learning to generate the best order in the combinatorial space. Our experiments in one of the world's largest retail platforms disclose that the learned evaluator is a much better indicator than classical data-based metrics. Moreover, our LTR model achieves a significant improvement ($\textgreater2\%$) from the current industrial-level pair-wise models in terms of both Conversion Rate (CR) and Gross Merchandise Volume (GMV) in online A/B tests.