Country
Reading the Videos: Temporal Labeling for Crowdsourced Time-Sync Videos Based on Semantic Embedding
Lv, Guangyi (University of Science and Technology of China) | Xu, Tong (University of Science and Technology of China) | Chen, Enhong (University of Science and Technology of China) | Liu, Qi (University of Science and Technology of China) | Zheng, Yi (Ant Financial Services Group)
Recent years have witnessed the boom of online sharing media contents, which raise significant challenges in effective management and retrieval. Though a large amount of efforts have been made, precise retrieval on video shots with certain topics has been largely ignored. At the same time, due to the popularity of novel time-sync comments, or so-called "bullet-screen comments", video semantics could be now combined with timestamps to support further research on temporal video labeling. In this paper, we propose a novel video understanding framework to assign temporal labels on highlighted video shots. To be specific, due to the informal expression of bullet-screen comments, we first propose a temporal deep structured semantic model (T-DSSM) to represent comments into semantic vectors by taking advantage of their temporal correlation. Then, video highlights are recognized and labeled via semantic vectors in a supervised way. Extensive experiments on a real-world dataset prove that our framework could effectively label video highlights with a significant margin compared with baselines, which clearly validates the potential of our framework on video understanding, as well as bullet-screen comments interpretation.
Solving Goal Recognition Design Using ASP
Son, Tran Cao (New Mexico State University) | Sabuncu, Orkunt (University of Potsdam) | Schulz-Hanke, Christian (University of Potsdam) | Schaub, Torsten (University of Potsdam) | Yeoh, William (New Mexico State University)
Goal Recognition Design involves identifying the best ways to modify an underlying environment that agents operate in, typically by making asubset of feasible actions infeasible, so that agents are forced to reveal their goals as early as possible. Thus far, existing work has focused exclusively on imperative classical planning. In this paper, we address the same problem with a different paradigm, namely, declarative approaches based on Answer Set Programming (ASP). Our experimental results show that one of our ASP encodings is more scalable and is significantly faster by up to three orders of magnitude than thecurrent state of the art.
Online ARIMA Algorithms for Time Series Prediction
Liu, Chenghao (Zhejiang University) | Hoi, Steven C.H. (Singapore Management University) | Zhao, Peilin (A*STAR) | Sun, Jianling (Zhejiang University)
Autoregressive integrated moving average (ARIMA) is one of the most popular linear models for time series forecasting due to its nice statistical properties and great flexibility. However, its parameters are estimated in a batch manner and its noise terms are often assumed to be strictly bounded, which restricts its applications and makes it inefficient for handling large-scale real data. In this paper, we propose online learning algorithms for estimating ARIMA models under relaxed assumptions on the noise terms, which is suitable to a wider range of applications and enjoys high computational efficiency. The idea of our ARIMA method is to reformulate the ARIMA model into a task of full information online optimization (without random noise terms). As a consequence, we can online estimation of the parameters in an efficient and scalable way. Furthermore, we analyze regret bounds of the proposed algorithms, which guarantee that our online ARIMA model is provably as good as the best ARIMA model in hindsight. Finally, our encouraging experimental results further validate the effectiveness and robustness of our method.
Text Classification with Heterogeneous Information Network Kernels
Wang, Chenguang (Peking University) | Song, Yangqiu (West Virginia University) | Li, Haoran (Peking University) | Zhang, Ming (Peking University) | Han, Jiawei (University of Illinois at Urbana-Champaign)
Text classification is an important problem with many applications. Traditional approaches represent text as a bag-of-words and build classifiers based on this representation. Rather than words, entity phrases, the relations between the entities, as well as the types of the entities and relations carry much more information to represent the texts. This paper presents a novel text as network classification framework, which introduces 1) a structured and typed heterogeneous information networks (HINs) representation of texts, and 2) a meta-path based approach to link texts. We show that with the new representation and links of texts, the structured and typed information of entities and relations can be incorporated into kernels. Particularly, we develop both simple linear kernel and indefinite kernel based on meta-paths in the HIN representation of texts, where we call them HIN-kernels. Using Freebase, a well-known world knowledge base, to construct HIN for texts, our experiments on two benchmark datasets show that the indefinite HIN kernel based on weighted meta-paths outperforms the state-of-the-art methods and other HIN-kernels.
The Complexity Landscape of Decompositional Parameters for ILP
Ganian, Robert (Technische Universitรคt Wien) | Ordyniak, Sebastian (Technische Universitรคt Wien)
Integer Linear Programming (ILP) can be seen as the archetypical problem for NP-complete optimization problems, and a wide range of problems in artificial intelligence are solved in practice via a translation to ILP. Despite its huge range of applications, only few tractable fragments of ILP are known, probably the most prominent of which is based on the notion of total unimodularity. Using entirely different techniques, we identify new tractable fragments of ILP by studying structural parameterizations of the constraint matrix within the framework of parameterized complexity. In particular, we show that ILP is fixed-parameter tractable when parameterized by the treedepth of the constraint matrix and the maximum absolute value of any coefficient occurring in the ILP instance. Together with matching hardness results for the more general parameter treewidth, we draw a detailed complexity landscape of ILP w.r.t. decompositional parameters defined on the constraint matrix.
Iterative Project Quasi-Newton Algorithm for Training RBM
Mi, Shuai (Tianjin University) | Zhao, Xiaozhao (Tianjin University) | Hou, Yuexian (Tianjin University) | Zhang, Peng (Tianjin University) | Li, Wenjie (The Hong Kong Polytechnic University) | Song, Dawei (Tianjin University)
The restricted Boltzmann machine (RBM) has been used as building blocks for many successful deep learning models, e.g., deep belief networks (DBN) and deep Boltzmann machine (DBM) etc. The training of RBM can be extremely slow in pathological regions. The second order optimization methods, such as quasi-Newton methods, were proposed to deal with this problem. However, the non-convexity results in many obstructions for training RBM, including the infeasibility of applying second order optimization methods. In order to overcome this obstruction, we introduce an em-like iterative project quasi-Newton (IPQN) algorithm. Specifically, we iteratively perform the sampling procedure where it is not necessary to update parameters, and the sub-training procedure that is convex. In sub-training procedures, we apply quasi-Newton methods to deal with the pathological problem. We further show that Newton's method turns out to be a good approximation of the natural gradient (NG) method in RBM training. We evaluate IPQN in a series of density estimation experiments on the artificial dataset and the MNIST digit dataset. Experimental results indicate that IPQN achieves an improved convergent performance over the traditional CD method.
Conservativeness of Untied Auto-Encoders
Im, Daniel Jiwoong (University of Montreal) | Belghazi, Mohamed Ishmael (University of Montreal) | Memisevic, Roland (University of Montreal)
We discuss necessary and sufficient conditions for an auto-encoder to define a conservative vector field, in which case it is associated with anenergy function akin to the unnormalized log-probability of the data.We show that the conditions for conservativeness are more general than for encoder and decoder weights to be the same ("tied weights''), and thatthey also depend on the form of the hidden unit activation functions.Moreover, we show that contractive training criteria, such as denoising, enforces these conditions locally.Based on these observations, we show how we can use auto-encoders to extract the conservative component of a vector field.
Semi-Supervised Dictionary Learning via Structural Sparse Preserving
Wang, Di (Wenzhou University) | Zhang, Xiaoqin (Wenzhou University) | Fan, Mingyu (Wenzhou University) | Ye, Xiuzi (Wenzhou University)
While recent techniques for discriminative dictionary learning have attained promising results on the classification tasks, their performance is highly dependent on the number of labeled samples available for training. However, labeling samples is expensive and time consuming due to the significant human effort involved. In this paper, we present a novel semi- supervised dictionary learning method which utilizes the structural sparse relationships between the labeled and unlabeled samples. Specifically, by connecting the sparse reconstruction coefficients on both the original samples and dictionary, the unlabeled samples can be automatically grouped to the different labeled samples, and the grouped samples share a small number of atoms in the dictionary via mixed l2p- norm regularization. This makes the learned dictionary more representative and discriminative since the shared atoms are learned by using the labeled and unlabeled samples potentially from the same class. Minimizing the derived objective function is a challenging task because it is non-convex and highly non-smooth. We propose an efficient optimization algorithm to solve the problem based on the block coordinate descent method. Moreover, we have a rigorous proof of the convergence of the algorithm. Extensive experiments are presented to show the superior performance of our method in classification applications.
Direct Discriminative Bag Mapping for Multi-Instance Learning
Wu, Jia (University of Technology Sydney) | Pan, Shirui (University of Technology Sydney) | Zhang, Peng (University of Technology Sydney) | Zhu, Xingquan (Florida Atlantic University)
Multi-instance learning (MIL) is useful for tackling labeling ambiguity in learning tasks, by allowing a bag of instances to share one label. Recently, bag mapping methods, which transform a bag to a single instance in a new space via instance selection, have drawn significant attentions. To date, most existing works are developed based on the original space, i.e., utilizing all instances for bag mapping, and instance selection is indirectly tied to the MIL objective. As a result, it is hard to guarantee the distinguish capacity of the selected instances in the new bag mapping space for MIL. In this paper, we propose a direct discriminative mapping approach for multi-instance learning (MILDM), which identifies instances to directly distinguish bags in the new mapping space. Experiments and comparisons on real-world learning tasks demonstrate the algorithm performance.
Pragmatic Querying in Heterogeneous Knowledge Graphs
Viswanathan, Amar (Rensselaer Polytechnic Institute)
Knowledge Graphs with rich schemas can allow for complex querying. My thesis focuses on providing accessible Knowledge using Gricean notions of Cooperative Answering as a motivation. More specifically, using Query Reformulations, Data Awareness, and a Pragmatic Context, along with the results they can become more responsive to user requirements and user context.