Goto

Collaborating Authors

 Chen, Ke


Learning-Based Video Game Development in MLP@UoM: An Overview

arXiv.org Artificial Intelligence

Learning-Based Video Game Development in MLP@UoM: An Overview * Ke Chen, Senior Member, IEEE Department of Computer Science, The University of Manchester, Manchester M13 9PL, U.K. Email: Ke.Chen@manchester.ac.uk Abstract --In general, video games not only prevail in entertainment but also have become an alternative methodology for knowledge learning, skill acquisition and assistance for medical treatment as well as health care in education, vocational/military training and medicine. On the other hand, video games also provide an ideal test bed for AI researches. T o a large extent, however, video game development is still a laborious yet costly process, and there are many technical challenges ranging from game generation to intelligent agent creation. Unlike traditional methodologies, in Machine Learning and Perception Lab at the University of Manchester (MLP@UoM), we advocate applying machine learning to different tasks in video game development to address several challenges systematically. In this paper, we overview the main progress made in MLP@UoM recently and have an outlook on the future research directions in learning-based video game development arising from our works. I NTRODUCTION The video games industry has drastically grown since its inception and even surpassed the size of the film industry in 2004. Nowadays, the global revenue of the video industry still rises and increases, and the widespread availability of high-end graphics hardware have resulted in a demand for more complex video games. This in turn has increased the complexity of game development. In general, video games not only prevail in entertainment but also have become an alternative methodology for knowledge learning, skill acquisition and assistance for medical treatment as well as health care in education, vocational/military training and medicine. From an academic perspective, video games also provide an ideal test bed, which allows for researching into automatic video game development and testing new AI algorithms in such a complex yet well-structured environment with ground-truth.


Semi-Supervised Few-Shot Learning for Dual Question-Answer Extraction

arXiv.org Artificial Intelligence

This paper addresses the problem of key phrase extraction from sentences. Existing state-of-the-art supervised methods require large amounts of annotated data to achieve good performance and generalization. Collecting labeled data is, however, often expensive. In this paper, we redefine the problem as question-answer extraction, and present SAMIE: Self-Asking Model for Information Ixtraction, a semi-supervised model which dually learns to ask and to answer questions by itself. Briefly, given a sentence $s$ and an answer $a$, the model needs to choose the most appropriate question $\hat q$; meanwhile, for the given sentence $s$ and same question $\hat q$ selected in the previous step, the model will predict an answer $\hat a$. The model can support few-shot learning with very limited supervision. It can also be used to perform clustering analysis when no supervision is provided. Experimental results show that the proposed method outperforms typical supervised methods especially when given little labeled data.


The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation

arXiv.org Artificial Intelligence

With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity. Despite very promising progress on image and short sequence generation, symbolic music generation remains a challenging problem since the structure of compositions are usually complicated. In this study, we attempt to solve the melody generation problem constrained by the given chord progression. This music meta-creation problem can also be incorporated into a plan recognition system with user inputs and predictive structural outputs. In particular, we explore the effect of explicit architectural encoding of musical structure via comparing two sequential generative models: LSTM (a type of RNN) and WaveNet (dilated temporal-CNN). As far as we know, this is the first study of applying WaveNet to symbolic music generation, as well as the first systematic comparison between temporal-CNN and RNN for music generation. We conduct a survey for evaluation in our generations and implemented Variable Markov Oracle in music pattern discovery. Experimental results show that to encode structure more explicitly using a stack of dilated convolution layers improved the performance significantly, and a global encoding of underlying chord progression into the generation procedure gains even more.


Intervention Aided Reinforcement Learning for Safe and Practical Policy Optimization in Navigation

arXiv.org Artificial Intelligence

In contrast to the intense studies of deep Reinforcement Learning(RL) in games and simulations [1], employing deep RL to real world robots remains challenging, especially in high risk scenarios. Though there has been some progresses in RL based control in realistic robotics [2, 3, 4, 5], most of those previous works does not specifically deal with the safety concerns in the RL training process. For majority of high risk scenarios in real world, deep RL still suffer from bottlenecks both in cost and safety. As an example, collisions are extremely dangerous for UAV, while RL training requires thousands of times of collisions. Other works contributes to building simulation environments and bridging the gap between reality and simulation [4, 5]. However, building such simulation environment is arduous, not to mention that the gap can not be totally made up. To address the safety issue in real-world RL training, we present the Intervention Aided Reinforcement Learning (IARL) framework. Intervention is commonly used in many automatic control systems in real world for safety insurance. It is also regarded as an important evaluation criteria for autonomous navigation systems, e.g. the disengagement ratio in autonomous driving


Transferable Positive/Negative Speech Emotion Recognition via Class-wise Adversarial Domain Adaptation

arXiv.org Machine Learning

TRANSFERABLE POSITIVE/NEGATIVE SPEECH EMOTION RECOGNITION VIA CLASS-WISE ADVERSARIAL DOMAIN ADAPTATION Hao Zhou, Ke Chen School of Computer Science, The University of Manchester, Manchester, M13 9PL, U.K. ABSTRACT Speech emotion recognition plays an important role in building more intelligent and humanlike agents. Due to the difficulty of collecting speech emotional data, an increasingly popular solution is leveraging a related and rich source corpus to help address the target corpus. However, domain shift between the corpora poses a serious challenge, making domain shift adaptation difficult to function even on the recognition of positive/negative emotions. In this work, we propose class-wise adversarial domain adaptation to address this challenge by reducing the shift for all classes between different corpora. Experiments on the well-known corpora EMODB and Aibo demonstrate that our method is effective even when only a very limited number of target labeled examples are provided.


Multi-Label Zero-Shot Human Action Recognition via Joint Latent Embedding

arXiv.org Artificial Intelligence

Human action recognition refers to automatic recognizing human actions from a video clip, which is one of the most challenging tasks in computer vision. In reality, a video stream is often weakly-annotated with a set of relevant human action labels at a global level rather than assigning each label to a specific video episode corresponding to a single action, which leads to a multi-label learning problem. Furthermore, there are a great number of meaningful human actions in reality but it would be extremely difficult, if not impossible, to collect/annotate video clips regarding all of various human actions, which leads to a zero-shot learning scenario. To the best of our knowledge, there is no work that has addressed all the above issues together in human action recognition. In this paper, we formulate a real-world human action recognition task as a multi-label zero-shot learning problem and propose a framework to tackle this problem. Our framework simultaneously tackles the issue of unknown temporal boundaries between different actions for multi-label learning and exploits the side information regarding the semantic relationship between different human actions for zero-shot learning. As a result, our framework leads to a joint latent embedding representation for multi-label zero-shot human action recognition. The joint latent embedding is learned with two component models by exploring temporal coherence underlying video data and the intrinsic relationship between visual and semantic domain. We evaluate our framework with different settings, including a novel data split scheme designed especially for evaluating multi-label zero-shot learning, on two weakly annotated multi-label human action datasets: Breakfast and Charades. The experimental results demonstrate the effectiveness of our framework in multi-label zero-shot human action recognition.


Learning to Play General Video-Games via an Object Embedding Network

arXiv.org Artificial Intelligence

Deep reinforcement learning (DRL) has proven to be an effective tool for creating general video-game AI. However most current DRL video-game agents learn end-to-end from the video-output of the game, which is superfluous for many applications and creates a number of additional problems. More importantly, directly working on pixel-based raw video data is substantially distinct from what a human player does.In this paper, we present a novel method which enables DRL agents to learn directly from object information. This is obtained via use of an object embedding network (OEN) that compresses a set of object feature vectors of different lengths into a single fixed-length unified feature vector representing the current game-state and fulfills the DRL simultaneously. We evaluate our OEN-based DRL agent by comparing to several state-of-the-art approaches on a selection of games from the GVG-AI Competition. Experimental results suggest that our object-based DRL agent yields performance comparable to that of those approaches used in our comparative study.


On the Use of Sparse Filtering for Covariate Shift Adaptation

arXiv.org Machine Learning

In this paper we formally analyse the use of sparse filtering algorithms to perform covariate shift adaptation. We provide a theoretical analysis of sparse filtering by evaluating the conditions required to perform covariate shift adaptation. We prove that sparse filtering can perform adaptation only if the conditional distribution of the labels has a structure explained by a cosine metric. To overcome this limitation, we propose a new algorithm, named periodic sparse filtering, and carry out the same theoretical analysis regarding covariate shift adaptation. We show that periodic sparse filtering can perform adaptation under the looser and more realistic requirement that the conditional distribution of the labels has a periodic structure, which may be satisfied, for instance, by user-dependent data sets. We experimentally validate our theoretical results on synthetic data. Moreover, we apply periodic sparse filtering to real-world data sets to demonstrate that this simple and computationally efficient algorithm is able to achieve competitive performances.


Deploying learning materials to game content for serious education game development: A case study

arXiv.org Artificial Intelligence

The ultimate goals of serious education games (SEG) are to facilitate learning and maximizing enjoyment during playing SEGs. In SEG development, there are normally two spaces to be taken into account: knowledge space regarding learning materials and content space regarding games to be used to convey learning materials. How to deploy the learning materials seamlessly and effectively into game content becomes one of the most challenging problems in SEG development. Unlike previous work where experts in education have to be used heavily, we proposed a novel approach that works toward minimizing the efforts of education experts in mapping learning materials to content space. For a proof-of-concept, we apply the proposed approach in developing an SEG game, named \emph{Chem Dungeon}, as a case study in order to demonstrate the effectiveness of our proposed approach. This SEG game has been tested with a number of users, and the user survey suggests our method works reasonably well.


Learning-Based Procedural Content Generation

arXiv.org Artificial Intelligence

Procedural content generation (PCG) has recently become one of the hottest topics in computational intelligence and AI game researches. Among a variety of PCG techniques, search-based approaches overwhelmingly dominate PCG development at present. While SBPCG leads to promising results and successful applications, it poses a number of challenges ranging from representation to evaluation of the content being generated. In this paper, we present an alternative yet generic PCG framework, named learning-based procedure content generation (LBPCG), to provide potential solutions to several challenging problems in existing PCG techniques. By exploring and exploiting information gained in game development and public beta test via data-driven learning, our framework can generate robust content adaptable to end-user or target players on-line with minimal interruption to their experience. Furthermore, we develop enabling techniques to implement the various models required in our framework. For a proof of concept, we have developed a prototype based on the classic open source first-person shooter game, Quake. Simulation results suggest that our framework is promising in generating quality content.