Wuhan University of Technology
Generating Random SAT Instances: Multiple Solutions could be Predefined and Deeply Hidden
Zhao, Dongdong (Wuhan University of Technology) | Liao, Lei | Luo, Wenjian | Xiang, Jianwen | Jiang, Hao | Hu, Xiaoyi
The generation of SAT instances is an important issue in computer science, and it is useful for researchers to verify the effectiveness of SAT solvers. Addressing this issue could inspire researchers to propose new search strategies. SAT problems exist in various real-world applications, some of which have more than one solution. However, although several algorithms for generating random SAT instances have been proposed, few can be used to generate hard instances that have multiple predefined solutions. In this paper, we propose the KHidden-M algorithm to generate SAT instances with multiple predefined solutions that could be hard to solve by the local search strategy when the number of predefined solutions is small enough and the Hamming distance between them is not less than half of the solution length. Specifically, first, we generate an SAT instance that is satisfied by all of the predefined solutions. Next, if the generated SAT instance does not satisfy the hardness condition, then a strategy will be conducted to adjust clauses through multiple iterations to improve the hardness of the whole instance. We propose three strategies to generate the SAT instance in the first part. The first strategy is called the random strategy, which randomly generates clauses that are satisfied by all of the predefined solutions. The other two strategies are called the estimating strategy and greedy strategy, and using them, we attempt to generate an instance that directly satisfies or is closer to the hardness condition for the local search strategy. We employ two SAT solvers (i.e., WalkSAT and Kissat) to investigate the hardness of the SAT instances generated by our algorithm in the experiments. The experimental results show the effectiveness of the random, estimating and greedy strategies. Compared to the state-of-the-art algorithm for generating SAT instances with predefined solutions, namely, M-hidden, our algorithm could be more effective in generating hard SAT instances.
Enhancing RNN Based OCR by Transductive Transfer Learning From Text to Images
He, Yang (Wuhan University of Technology) | Yuan, Jingling (Wuhan University of Technology) | Li, Lin (Wuhan University of Technology)
This paper presents a novel approach for optical character recognition (OCR) on acceleration and to avoid underfitting by text. Previously proposed OCR models typically take much time in the training phase and require large amount of labelled data to avoid underfitting. In contrast, our method does not require such condition. This is a challenging task related to transferring the character sequential relationship from text to OCR. We build a model based on transductive transfer learning to achieve domain adaptation from text to image. We thoroughly evaluate our approach on different datasets, including a general one and a relatively small one. We also compare the performance of our model with the general OCR model on different circumstances. We show that (1) our approach accelerates the training phase 20-30% on time cost; and (2) our approach can avoid underfitting while model is trained on a small dataset.
Online Cross-Modal Hashing for Web Image Retrieval
Xie, Liang (Wuhan University of Technology) | Shen, Jialie (Singapore Management University) | Zhu, Lei (Singapore Management University)
Cross-modal hashing (CMH) is an efficient technique for the fast retrieval of web image data, and it has gained a lot of attentions recently. However, traditional CMH methods usually apply batch learning for generating hash functions and codes. They are inefficient for the retrieval of web images which usually have streaming fashion. Online learning can be exploited for CMH. But existing online hashing methods still cannot solve two essential problems: efficient updating of hash codes and analysis of cross-modal correlation. In this paper, we propose Online Cross-modal Hashing (OCMH) which can effectively address the above two problems by learning the shared latent codes (SLC). In OCMH, hash codes can be represented by the permanent SLC and dynamic transfer matrix. Therefore, inefficient updating of hash codes is transformed to the efficient updating of SLC and transfer matrix, and the time complexity is irrelevant to the database size. Moreover, SLC is shared by all the modalities, and thus it can encode the latent cross-modal correlation, which further improves the overall cross-modal correlation between heterogeneous data. Experimental results on two real-world multi-modal web image datasets: MIR Flickr and NUS-WIDE, demonstrate the effectiveness and efficiency of OCMH for online cross-modal web image retrieval.
Identifying Domain-Dependent Influential Microblog Users: A Post-Feature Based Approach
Liu, Nian (Wuhan University of Technology) | Li, Lin (Wuhan University of Technology) | Xu, Guandong (University of Technology, Sydney) | Yang, Zhenglu (Nankai University)
Users of a social network like to follow the posts published by influential users. Such posts usually are delivered quickly and thus will produce a strong influence on public opinions. In this paper, we focus on the problem of identifying domain-dependent influential users(or topic experts). Some of traditional approaches are based on the post contents of users user’s to identify influential users, which may be biased by spammers who try to make posts related to some topics through a simple copy and paste. Others make use of user authentication information given by a service platform or user self description (introduction or label) in finding influential users. However, what users have published is not necessarily related to what they have registed and described. In addition, if there is no comments from other users, it’s less objective to assess a user’s post quality. To improve effectiveness of recognizing influential users in a topic of microblogs, we propose a post-feature based approach which is supplementary to post-content based approaches. Our experimental results show that the post-feature based approach produces relatively higher precision than that of the content based approach.
Recommending Related Microblogs: A Comparison Between Topic and WordNet based Approaches
Chen, Xing (Wuhan University of Technology) | Li, Lin (Wuhan University of Technology) | Xu, Guandong (Victoria University) | Yang, Zhenglu (The University of Tokyo) | Kitsuregawa, Masaru (The University of Tokyo)
Computing similarity between short microblogs is an important step in microblog recommendation. In this paper, we investigate a topic based approach and a WordNet based approach to estimate similarity scores between microblogs and recommend top related ones to users. Empirical study is conducted to compare their recommendation effectiveness using two evaluation measures. The results show that the WordNet based approach has relatively higher precision than that of the topic based approach using 548 tweets as dataset. In addition, the Kendall tau distance between two lists recommended by WordNet and topic approaches is calculated. Its average of all the 548 pair lists tells us the two approaches have the relative high disaccord in the ranking of related tweets.