Recurrent Attention Walk for Semi-supervised Classification Machine Learning

In this paper, we study graph-based semi-supervised learning for classifying nodes in attributed networks, where the nodes and edges possess content information. Recent approaches such as graph convolutional networks and attention mechanisms have been proposed to aggregate the first-order neighbors and incorporate the relevant ones. However, it is costly (especially in memory) to consider all neighbors without prior differentiation. We propose to explore the neighborhood in a reinforcement learning setting and find a walk path well-tuned for classifying the unlabelled target nodes. We let an agent (for the node classification task) walk over the graph and decide where to move next to maximize classification accuracy. We define the graph walk as a partially observable Markov decision process (POMDP). The proposed method is flexible enough to work in both transductive and inductive settings. Extensive experiments on four datasets demonstrate that our proposed method outperforms several state-of-the-art methods. Several case studies also illustrate the meaningful movement trajectories made by the agent.
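The walk-then-classify loop described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy graph, the scalar node features, the random placeholder policy, the thresholding classifier, and the 0/1 reward are all hypothetical stand-ins. The point is only to show the shape of an episode in which the agent observes its local neighborhood, moves, and is rewarded for a correct classification at the end.

```python
import random

# Hypothetical toy attributed graph: adjacency lists and one scalar feature per node.
GRAPH = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
FEATURES = {0: 0.1, 1: 0.9, 2: 0.2, 3: 0.8}


def observe(node):
    """Partial observation: only the current node's feature and its neighbors are visible."""
    return FEATURES[node], GRAPH[node]


def random_policy(feature, neighbors, rng):
    """Placeholder policy (assumption); a learned policy would replace this."""
    return rng.choice(neighbors)


def walk(start, steps, policy, rng):
    """Run one episode: start at the target node and take `steps` moves chosen by `policy`."""
    path = [start]
    node = start
    for _ in range(steps):
        feature, neighbors = observe(node)
        node = policy(feature, neighbors, rng)
        path.append(node)
    return path


def classify(path, threshold=0.5):
    """Stand-in classifier (assumption): threshold the mean feature along the walk."""
    mean = sum(FEATURES[n] for n in path) / len(path)
    return int(mean > threshold)


def reward(path, true_label):
    """Reward is 1.0 when the classification made after the walk is correct, else 0.0."""
    return 1.0 if classify(path) == true_label else 0.0


rng = random.Random(0)
path = walk(start=0, steps=3, policy=random_policy, rng=rng)
print(path, reward(path, true_label=0))
```

In a reinforcement learning setting, the reward returned at the end of each episode would be used to update the policy so that later walks visit more informative neighbors.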

Multi-label Classification for Automatic Tag Prediction in the Context of Programming Challenges Machine Learning

Programming challenges, offered by a plethora of websites, are one of the best ways for developers to test and improve their skills in a fun and challenging way. For inexperienced developers, some of the problems might appear too challenging, and hints are needed to implement a solution. On the other hand, tagging problems can be a tedious task for problem creators. In this paper, we focus on automating the task of tagging a programming challenge description using machine learning and deep learning methods. We observe that the deep learning methods implemented outperform well-known IR approaches such as tf-idf, thus providing a starting point for further research on the task.
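To make the tf-idf baseline mentioned above concrete, here is a minimal sketch of multi-label tag prediction using only the Python standard library. It is not the authors' implementation: the toy corpus, the tag names, the smoothed idf formula, the nearest-example scoring rule, and the `threshold` value are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (problem description, set of tags).
TRAIN = [
    ("find shortest path between two nodes in a weighted graph", {"graphs", "shortest-paths"}),
    ("count ways to fill knapsack with given weights", {"dp"}),
    ("find longest increasing subsequence using dynamic programming", {"dp"}),
    ("detect cycle in a directed graph", {"graphs"}),
]


def tokenize(text):
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency, so every known token gets a positive weight."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def tfidf(text, idf):
    """L2-normalized sparse tf-idf vector; tokens unseen during fitting are ignored."""
    tf = Counter(tok for tok in tokenize(text) if tok in idf)
    vec = {tok: count * idf[tok] for tok, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two already-normalized sparse vectors."""
    return sum(weight * b.get(tok, 0.0) for tok, weight in a.items())


def predict_tags(text, train, idf, threshold=0.2):
    """Score each tag by the best cosine similarity to a training problem carrying it."""
    query = tfidf(text, idf)
    scores = {}
    for doc, tags in train:
        sim = cosine(query, tfidf(doc, idf))
        for tag in tags:
            scores[tag] = max(scores.get(tag, 0.0), sim)
    return {tag for tag, score in scores.items() if score >= threshold}


idf = fit_idf([doc for doc, _ in TRAIN])
predicted = predict_tags("shortest path in a graph", TRAIN, idf)
print(sorted(predicted))
```

Because each tag is thresholded independently, a description can receive zero, one, or several tags, which is what distinguishes the multi-label setting from ordinary single-label classification.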

Multi-task Self-Supervised Learning for Human Activity Detection Machine Learning

Deep learning methods are successfully used in applications pertaining to ubiquitous computing, health, and well-being. Specifically, the area of human activity recognition (HAR) has been transformed primarily by convolutional and recurrent neural networks, thanks to their ability to learn semantic representations from raw input. However, extracting generalizable features requires massive amounts of well-curated data, and collecting such data is notoriously challenging, hindered by privacy issues and annotation costs. Therefore, unsupervised representation learning is of prime importance to leverage the vast amount of unlabeled data produced by smart devices. In this work, we propose a novel self-supervised technique for feature learning from sensory data that does not require access to any form of semantic labels. We train a multi-task temporal convolutional network to recognize transformations applied to an input signal. By exploiting these transformations, we demonstrate that simple auxiliary binary classification tasks result in a strong supervisory signal for extracting useful features for the downstream task. We extensively evaluate the proposed approach on several publicly available datasets for smartphone-based HAR in unsupervised, semi-supervised, and transfer learning settings. Our method achieves performance levels superior to or comparable with fully-supervised networks, and it performs significantly better than autoencoders. Notably, for the semi-supervised case, the self-supervised features substantially boost the detection rate by attaining a kappa score between 0.7 and 0.8 with only 10 labeled examples per class. We get similarly impressive performance even if the features are transferred from a different data source. While this paper focuses on HAR as the application domain, the proposed technique is general and could be applied to a wide variety of problems in other areas.
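The idea of turning unlabeled sensor windows into binary "was this transformation applied?" tasks can be sketched as follows. This is an illustrative data-generation sketch only, not the paper's pipeline: the four transformations (noise, scaling, negation, time flip), their parameters, and the synthetic windows are assumptions, and the multi-task network that would consume these examples is omitted.

```python
import random


def add_noise(signal, rng, sigma=0.05):
    """Add small Gaussian noise to every sample."""
    return [x + rng.gauss(0.0, sigma) for x in signal]


def scale(signal, rng, low=0.7, high=1.3):
    """Multiply the whole window by one random factor."""
    factor = rng.uniform(low, high)
    return [x * factor for x in signal]


def negate(signal, rng):
    """Flip the sign of every sample."""
    return [-x for x in signal]


def time_flip(signal, rng):
    """Reverse the window in time."""
    return signal[::-1]


# Hypothetical set of pretext tasks: one binary task per transformation.
TRANSFORMS = {
    "noise": add_noise,
    "scale": scale,
    "negate": negate,
    "time_flip": time_flip,
}


def make_pretext_examples(signals, rng):
    """For each window and task, emit (window, task_name, label).

    Label 0 marks the original window, label 1 marks the transformed one,
    so every task is a balanced binary classification problem.
    """
    examples = []
    for signal in signals:
        for name, transform in TRANSFORMS.items():
            examples.append((list(signal), name, 0))
            examples.append((transform(signal, rng), name, 1))
    return examples


rng = random.Random(0)
# Synthetic stand-ins for unlabeled sensor windows (3 windows of 8 samples each).
windows = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(3)]
examples = make_pretext_examples(windows, rng)
print(len(examples))
```

No activity labels are needed at this stage: the labels come for free from knowing which transformation was applied, which is what makes the signal self-supervised.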

c-TextGen: Conditional Text Generation for Harmonious Human-Machine Interaction Artificial Intelligence

In recent years, with the development of deep learning, text generation technology has undergone great changes and now provides many kinds of services, such as restaurant reservation and daily communication. Automatically generated text is becoming more and more fluent, so researchers have begun to consider more anthropomorphic text generation, that is, conditional text generation, including emotional text generation, personalized text generation, and so on. Conditional text generation (c-TextGen) has thus become a research hotspot, and much effort has been devoted to it. We therefore aim to give a comprehensive review of the new research trends in c-TextGen. We first give a brief literature review of text generation technology, based on which we formalize the conceptual model of c-TextGen. We then investigate several different c-TextGen techniques and illustrate the advantages and disadvantages of commonly used neural network models. Finally, we discuss the open issues and promising research directions of c-TextGen.

Delineating Knowledge Domains in the Scientific Literature Using Visual Information Machine Learning

Figures are an important channel for scientific communication, used to express complex ideas, models, and data in ways that words cannot. However, this visual information is mostly ignored in analyses of the scientific literature. In this paper, we demonstrate the utility of using scientific figures as markers of knowledge domains in science, which can be used for classification, recommender systems, and studies of scientific information exchange. We encode sets of images into a visual signature, then use distances between these signatures to understand how patterns of visual communication compare with patterns of jargon and citation structures. We find that figures can be as effective for differentiating communities of practice as text or citation patterns. We then consider where these metrics disagree to understand how different disciplines use visualization to express ideas. Finally, we consider how specific figure types propagate through the literature, suggesting a new mechanism for understanding the flow of ideas apart from conventional channels of text and citations. Our ultimate aim is to better leverage these information-dense objects to improve scientific communication across disciplinary boundaries.