Goto

Collaborating Authors

 Optimization


Bridging Video Content and Comments: Synchronized Video Description with Temporal Summarization of Crowdsourced Time-Sync Comments

AAAI Conferences

With the rapid growth of online sharing media, we are facing a huge collection of videos. In the meantime, due to the volume and complexity of video data, it can be tedious and time consuming to index or annotate videos. In this paper, we propose to generate temporal descriptions of videos by exploiting the information of crowdsourced time-sync comments which are receiving increasing popularity on many video sharing websites. In this framework, representative and interesting comments of a video are selected and highlighted along the timeline, which provide an informative description of the video in a time-sync manner. The challenge of the proposed application comes from the extremely informal and noisy nature of the comments, which are usually short sentences and on very different topics. To resolve these issues, we propose a novel temporal summarization model based on the data reconstruction principle, where representative comments are selected in order to best reconstruct the original corpus at the text level as well as the topic level while incorporating the temporal correlations of the comments. Experimental results on real-world data demonstrate the effectiveness of the proposed framework and justify the idea of exploiting crowdsourced time-sync comments as a bridge to describe videos.


Finding Cut from the Same Cloth: Cross Network Link Recommendation via Joint Matrix Factorization

AAAI Conferences

With the emergence of online forums associated with major diseases, such as diabetes mellitus, many patients are increasingly dependent on such disease-specific social networks to gain access to additional resources. Among these patients, it is common for them to stick to one disease-specific social network, although their desired resources might be spread over multiple social networks, such as patients with similar questions and concerns. Motivated by this application, in this paper, we focus on cross network link recommendation, which aims to identify similar users across multiple heterogeneous social networks. The problem setting is different from existing work on cross network link prediction, which either tries to link accounts of the same user from different social networks, or aims to match users with complementary expertise or interest. To approach the problem of cross network link recommendation, we propose to jointly decompose the user-keyword matrices from multiple social networks, while requiring them to share the same topics and user group-topic association matrices. This constraint comes from the fact that social networks dedicated to the same disease tend to share the same topics as well as the interests of users groups in certain topics. Based on this intuition, we construct a generic optimization framework, provide four instantiations and an iterative optimization algorithm with performance analysis. In the experiments, we demonstrate the superiority of the proposed algorithm over state-of-the-art techniques on various real-world data sets.


Collaborative Company Profiling: Insights from an Employee's Perspective

AAAI Conferences

Company profiling is an analytical process to build an in-depth understanding of company's fundamental characteristics. It serves as an effective way to gain vital information of the target company and acquire business intelligence. Traditional approaches for company profiling rely heavily on the availability of rich finance information about the company, such as finance reports and SEC filings, which may not be readily available for many private companies. However, the rapid prevalence of online employment services enables a new paradigm โ€” to obtain the variety of company's information from their employees' online ratings and comments. This, in turn, raises the challenge to develop company profiles from an employee's perspective. To this end, in this paper, we propose a method named Company Profiling based Collaborative Topic Regression (CPCTR), for learning the latent structural patterns of companies. By formulating a joint optimization framework, CPCTR has the ability in collaboratively modeling both textual (e.g., reviews) and numerical information (e.g., salaries and ratings). Indeed, with the identified patterns, including the positive/negative opinions and the latent variable that influences salary, we can effectively carry out opinion analysis and salary prediction. Extensive experiments were conducted on a real-world data set to validate the effectiveness of CPCTR. The results show that our method provides a comprehensive understanding of company characteristics and delivers a more effective prediction of salaries than other baselines.


A Fast Algorithm to Compute Maximum k -Plexes in Social Network Analysis

AAAI Conferences

A clique model is one of the most important techniques on the cohesive subgraph detection; however, its applications are rather limited due to restrictive conditions of the model. Hence much research resorts to k -plex โ€” a graph in which any vertex is adjacent to all but at most k vertices โ€” which is a relaxation model of the clique. In this paper, we study the maximum k -plex problem and propose a fast algorithm to compute maximum k -plexes by exploiting structural properties of the problem. In an n -vertex graph, the algorithm computes optimal solutions in c n n O(1) time for a constant c < 2 depending only on k . To the best of our knowledge, this is the first algorithm that breaks the trivial theoretical bound of 2 n for each k โ‰ฅ 3. We also provide experimental results over multiple real-world social network instances in support.


Solving High-Dimensional Multi-Objective Optimization Problems with Low Effective Dimensions

AAAI Conferences

Multi-objective (MO) optimization problems require simultaneously optimizing two or more objective functions. An MO algorithm needs to find solutions that reach different optimal balances of the objective functions, i.e., optimal Pareto front, therefore, high dimensionality of the solution space can hurt MO optimization much severer than single-objective optimization, which was little addressed in previous studies. This paper proposes a general, theoretically-grounded yet simple approach ReMO, which can scale current derivative-free MO algorithms to the high-dimensional non-convex MO functions with low effective dimensions, using random embedding. We prove the conditions under which an MO function has a low effective dimension, and for such functions, we prove that ReMO possesses the desirable properties of optimal Pareto front preservation, time complexity reduction, and rotation perturbation invariance. Experimental results indicate that ReMO is effective for optimizing the high-dimensional MO functions with low effective dimensions, and is even effective for the high-dimensional MO functions where all dimensions are effective but most only have a small and bounded effect on the function value.


Going Beyond Primal Treewidth for (M)ILP

AAAI Conferences

Integer Linear Programming (ILP) and its mixed variant (MILP) are archetypical examples of NP-complete optimization problems which have a wide range of applications in various areas of artificial intelligence. However, we still lack a thorough understanding of which structural restrictions make these problems tractable. Here we focus on structure captured via so-called decompositional parameters, which have been highly successful in fields such as boolean satisfiability and constraint satisfaction but have not yet reached their full potential in the ILP setting. In particular, primal treewidth (an established decompositional parameter) can only be algorithmically exploited to solve ILP under restricted circumstances. Our main contribution is the introduction and algorithmic exploitation of two new decompositional parameters for ILP and MILP. The first, torso-width, is specifically tailored to the linear programming setting and is the first decompositional parameter which can also be used for MILP. The latter, incidence treewidth, is a concept which originates from boolean satisfiability but has not yet been used in the ILP setting; here we obtain a full complexity landscape mapping the precise conditions under which incidence treewidth can be used to obtain efficient algorithms. Both of these parameters overcome previous shortcomings of primal treewidth for ILP in unique ways, and consequently push the frontiers of tractability for these important problems.


A Generic Bet-and-Run Strategy for Speeding Up Stochastic Local Search

AAAI Conferences

A common strategy for improving optimization algorithms is to restart the algorithm when it is believed to be trapped in an inferior part of the search space. However, while specific restart strategies have been developed for specific problems (and specific algorithms), restarts are typically not regarded as a general tool to speed up an optimization algorithm. In fact, many optimization algorithms do not employ restarts at all. Recently, "bet-and-run" was introduced in the context of mixed-integer programming, where first a number of short runs with randomized initial conditions is made, and then the most promising run of these is continued. In this article, we consider two classical NP-complete combinatorial optimization problems, traveling salesperson and minimum vertex cover, and study the effectiveness of different bet-and-run strategies. In particular, our restart strategies do not take any problem knowledge into account, nor are tailored to the optimization algorithm. Therefore, they can be used off-the-shelf. We observe that state-of-the-art solvers for these problems can benefit significantly from restarts on standard benchmark instances.


Embedded Bandits for Large-Scale Black-Box Optimization

AAAI Conferences

Random embedding has been applied with empirical success to large-scale black-box optimization problems with low effective dimensions. This paper proposes the EmbeddedHunter algorithm, which incorporates the technique in a hierarchical stochastic bandit setting, following the optimism in the face of uncertainty principle and breaking away from the multiple-run framework in which random embedding has been conventionally applied similar to stochastic black-box optimization solvers. Our proposition is motivated by the bounded mean variation in the objective value for a low-dimensional point projected randomly into the decision space of Lipschitz-continuous problems. In essence, the EmbeddedHunter algorithm expands optimistically a partitioning tree over a low-dimensional โ€” equal to the effective dimension of the problem โ€”search space based on a bounded number of random embeddings of sampled points from the low-dimensional space. In contrast to the probabilistic theoretical guarantees of multiple-run random-embedding algorithms, the finite-time analysis of the proposed algorithm presents a theoretical upper bound on the regret as a function of the algorithm's number of iterations. Furthermore, numerical experiments were conducted to validate its performance. The results show a clear performance gain over recently proposed random embedding methods for large-scale problems, provided the intrinsic dimensionality is low.


Non-Additive Security Games

AAAI Conferences

Security agencies have found security games to be useful models to understand how to better protect their assets. The key practical elements in this work are: (i) the attacker can simultaneously attack multiple targets, and (ii) different targets exhibit different types of dependencies based on the assets being protected (e.g., protection of critical infrastructure, network security, etc.). However, little is known about the computational complexity of these problems, especially when there exist dependencies among the targets. Moreover, previous security game models do not in general scale well. In this paper, we investigate a general security game where the utility function is defined on a collection of subsets of all targets, and provide a novel theoretical framework to show how to compactly represent such a game, efficiently compute the optimal (minimax) strategies, and characterize the complexity of this problem. We apply our theoretical framework to the network security game. We characterize settings under which we find a polynomial time algorithm for computing optimal strategies. In other settings we prove the problem is NP-hard and provide an approximation algorithm.


The Positronic Economist: A Computational System for Analyzing Economic Mechanisms

AAAI Conferences

Computational mechanism analysis is a recent approach to economic analysis in which a mechanism design setting is analyzed entirely by a computer. For games with non-trivial numbers of players and actions, the approach is only feasible when these games can be encoded compactly, e.g., as Action-Graph Games. Such encoding is currently a manual process requiring expert knowledge; our aim is to simplify and automate it. Our contribution, the Positronic Economist is a software system having two parts: (1) a Python-based language for succinctly describing mechanisms; and (2) a system that takes such descriptions as input, automatically identifies computationally useful structure, and produces a compact Action-Graph Game.