Goto

Collaborating Authors

 Africa


Detecting Events and Patterns in Large-Scale User Generated Textual Streams with Statistical Learning Methods

arXiv.org Machine Learning

A vast amount of textual web streams is influenced by events or phenomena emerging in the real world. The social web forms an excellent modern paradigm, where unstructured user generated content is published on a regular basis and in most occasions is freely distributed. The present Ph.D. Thesis deals with the problem of inferring information - or patterns in general - about events emerging in real life based on the contents of this textual stream. We show that it is possible to extract valuable information about social phenomena, such as an epidemic or even rainfall rates, by automatic analysis of the content published in Social Media, and in particular Twitter, using Statistical Machine Learning methods. An important intermediate task regards the formation and identification of features which characterise a target event; we select and use those textual features in several linear, non-linear and hybrid inference approaches achieving a significantly good performance in terms of the applied loss function. By examining further this rich data set, we also propose methods for extracting various types of mood signals revealing how affective norms - at least within the social web's population - evolve during the day and how significant events emerging in the real world are influencing them. Lastly, we present some preliminary findings showing several spatiotemporal characteristics of this textual information as well as the potential of using it to tackle tasks such as the prediction of voting intentions.


Domain and Function: A Dual-Space Model of Semantic Relations and Compositions

Journal of Artificial Intelligence Research

Given appropriate representations of the semantic relations between carpenter and wood and between mason and stone (for example, vectors in a vector space model), a suitable algorithm should be able to recognize that these relations are highly similar (carpenter is to wood as mason is to stone; the relations are analogous). Likewise, with representations of dog, house, and kennel, an algorithm should be able to recognize that the semantic composition of dog and house, dog house, is highly similar to kennel (dog house and kennel are synonymous). It seems that these two tasks, recognizing relations and compositions, are closely connected. However, up to now, the best models for relations are significantly different from the best models for compositions. In this paper, we introduce a dual-space model that unifies these two tasks. This model matches the performance of the best previous models for relations and compositions. The dual-space model consists of a space for measuring domain similarity and a space for measuring function similarity. Carpenter and wood share the same domain, the domain of carpentry. Mason and stone share the same domain, the domain of masonry. Carpenter and mason share the same function, the function of artisans. Wood and stone share the same function, the function of materials. In the composition dog house, kennel has some domain overlap with both dog and house (the domains of pets and buildings). The function of kennel is similar to the function of house (the function of shelters). By combining domain and function similarities in various ways, we can model relations, compositions, and other aspects of semantics.


Tractable Triangles and Cross-Free Convexity in Discrete Optimisation

Journal of Artificial Intelligence Research

The minimisation problem of a sum of unary and pairwise functions of discrete variables is a general NP-hard problem with wide applications such as computing MAP configurations in Markov Random Fields (MRF), minimising Gibbs energy, or solving binary Valued Constraint Satisfaction Problems (VCSPs). We study the computational complexity of classes of discrete optimisation problems given by allowing only certain types of costs in every triangle of variable-value assignments to three distinct variables. We show that for several computational problems, the only non- trivial tractable classes are the well known maximum matching problem and the recently discovered joint-winner property. Our results, apart from giving complete classifications in the studied cases, provide guidance in the search for hybrid tractable classes; that is, classes of problems that are not captured by restrictions on the functions (such as submodularity) or the structure of the problem graph (such as bounded treewidth). Furthermore, we introduce a class of problems with convex cardinality functions on cross-free sets of assignments. We prove that while imposing only one of the two conditions renders the problem NP-hard, the conjunction of the two gives rise to a novel tractable class satisfying the cross-free convexity property, which generalises the joint-winner property to problems of unbounded arity.


Hierarchical Modeling with Tensor Inputs

AAAI Conferences

In many real applications, the input data are naturally expressed as tensors, such as virtual metrology in semiconductor manufacturing, face recognition and gait recognition in computer vision, etc. In this paper, we propose a general optimization framework for dealing with tensor inputs. Most existing methods for supervised tensor learning use only rank-one weight tensors in the linear model and cannot readily incorporate domain knowledge. In our framework, we obtain the weight tensor in a hierarchical way — we first approximate it by a low-rank tensor, and then estimate the low-rank approximation using the prior knowledge from various sources, e.g., different domain experts. This is motivated by wafer quality prediction in semiconductor manufacturing. Furthermore, we propose an effective algorithm named H-MOTE for solving this framework, which is guaranteed to converge. The time complexity of H-MOTE is linear with respect to the number of examples as well as the size of the weight tensor. Experimental results show the superiority of H-MOTE over state-of-the-art techniques on both synthetic and real data sets.


Towards Using Discrete Multiagent Pathfinding to Address Continuous Problems

AAAI Conferences

Motivated by efficient algorithms for solving combina- torial and discrete instances of the multi-agent pathfinding problem, this report investigates ways to utilize such solutions to solve similar problems in the continuous domain. While a simple discretization of the space which allows the direct application of combinatorial algorithms seems like a straightforward solution, there are additional constraints that such a discretization needs to satisfy in order to be able to provide some form of completeness guarantees in general configuration spaces. This report reviews ideas on how to utilize combinatorial algorithms to solve continuous multi-agent pathfinding problems. It aims to collect feedback from the community regarding the importance and the complexity of this challenge, as well as the appropriateness of the solutions considered here.


DEC-A*: A Decentralized A* Algorithm

AAAI Conferences

A* is the algorithm of finding the shortest path between two nodes in a graph. When the searching problem is constituted of a set of linked graphs, A* searches solution like if it is face of one graph formed by linked graphs. While researchers have developed solutions to reduce the execution time of A* in multiple cases by multiples techniques, we develop a new algorithm: DEC-A* which is a decentralized version of A* composing a solution through a collection of graph. A* uses a distance-plus-cost heuristic function to determine the order in which the search visits nodes in the tree. Our algorithm DEC-A* extends the evaluation of the distance-plus-cost heuristic to be the sum of two functions : local distance, which evaluates the cost to reach the nearest neighbor node s to the goal, and global distance which evaluates the cost from s to the goal through other graphs. DEC-A* reduces the time of finding the shortest path and reduces the complexity, while ensuring the privacy of graphs.


Repeated Sequential Auctions with Dynamic Task Clusters

AAAI Conferences

Sequential auctions can be used to provide solutions to the multi-robot task-allocation problem. In this paper we extend previous work on sequential auctions and propose an algorithm that clusters and auctions uninitiated task clusters repeatedly upon the completion of individual tasks. We demonstrate empirically that our algorithm results in lower overall team costs than other sequential auction algorithms that only assign tasks once.


Visibility Induction for Discretized Pursuit-Evasion Games

AAAI Conferences

We study a two-player pursuit-evasion game, in which an agent moving amongst obstacles is to be maintained within ``sight" of a pursuing robot. Using a discretization of the environment, our main contribution is to design an efficient algorithm that decides, given initial positions of both pursuer and evader, if the evader can take any moving strategy to go out of sight of the pursuer at any time instant. If that happens, we say that the evader wins the game. We analyze the algorithm, present several optimizations and show results for different environments. For situations where the evader cannot win, we compute, in addition, a pursuit strategy that keeps the evader within sight, for every strategy the evader can take. Finally, if it is determined that the evader wins, we compute its optimal escape trajectory and the corresponding optimal pursuit trajectory.


Symbolic Dynamic Programming for Continuous State and Action MDPs

AAAI Conferences

Many real-world decision-theoretic planning problemsare naturally modeled using both continuous state andaction (CSA) spaces, yet little work has provided ex-act solutions for the case of continuous actions. Inthis work, we propose a symbolic dynamic program-ming (SDP) solution to obtain the optimal closed-formvalue function and policy for CSA-MDPs with mul-tivariate continuous state and actions, discrete noise,piecewise linear dynamics, and piecewise linear (or re-stricted piecewise quadratic) reward. Our key contribu-tion over previous SDP work is to show how the contin-uous action maximization step in the dynamic program-ming backup can be evaluated optimally and symboli-cally — a task which amounts to symbolic constrainedoptimization subject to unknown state parameters; wefurther integrate this technique to work with an efficientand compact data structure for SDP — the extendedalgebraic decision diagram (XADD). We demonstrateempirical results on a didactic nonlinear planning exam-ple and two domains from operations research to showthe first automated exact solution to these problems.


Stochastic Safest and Shortest Path Problems

AAAI Conferences

Optimal solutions to Stochastic Shortest Path Problems (SSPs) usually require that there exists at least one policy that reaches the goal with probability 1 from the initial state. This condition is very strong and prevents from solving many interesting problems, for instance where all possible policies reach some dead-end states with a positive probability. We introduce a more general and richer dual optimization criterion, which minimizes the average (undiscounted) cost of only paths leading to the goal among all policies that maximize the probability to reach the goal. We present policy update equations in the form of dynamic programming for this new dual criterion, which are different from the standard Bellman equations, but produce the same solution if there exists one policy leading to the goal with probability 1 from the initial state. We demonstrate that our equations converge in infinite horizon without any condition on the structure of the problem or on its policies, which actually extends the class of SSPs that can be solved. We experimentally show that our dual criterion provides well-founded solutions to SSPs that can not be solved by the standard criterion, and that using a discount factor with the latter certainly provides solution policies but which are not optimal considering our well-founded criterion.