Graph Construction for Semi-Supervised Learning

AAAI Conferences

Semi-Supervised Learning (SSL) techniques have become very relevant because they require only a small set of labeled data. In this scenario, graph-based SSL algorithms provide a powerful framework for modeling manifold structures in high-dimensional spaces and are effective at propagating the few initial labels present in the training data through the graph. An important step in graph-based SSL methods is the conversion of tabular data into a weighted graph. Graph construction plays a key role in the quality of the classification in graph-based methods. Nevertheless, most of the SSL literature focuses on developing label inference algorithms without studying graph construction methods and their effect on the performance of the base algorithm. This PhD project aims to study this issue and to propose new methods for graph construction from flat data that improve the performance of graph-based algorithms.
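
To make the two steps named in this abstract concrete (turning tabular data into a weighted graph, then propagating a few labels through it), here is a minimal sketch in plain Python. It is not the method proposed in the thesis: the k-nearest-neighbour construction, the Gaussian edge weights, the clamped averaging scheme, and all names (`knn_graph`, `propagate_labels`, `sigma`) are assumptions chosen for illustration.

```python
import math


def knn_graph(points, k=2, sigma=1.0):
    """Build a symmetric k-nearest-neighbour graph with Gaussian edge weights.

    Illustrative construction only; k and sigma are assumed hyperparameters.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Squared Euclidean distance from point i to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n)
            if j != i
        )
        for d2, j in dists[:k]:
            w = math.exp(-d2 / (2 * sigma ** 2))
            weights[i][j] = w
            weights[j][i] = w  # keep the edge if either endpoint selects it
    return weights


def propagate_labels(weights, labels, n_iter=50):
    """Spread binary labels (+1 / -1) over the graph; None marks unlabeled nodes."""
    scores = [float(y) if y is not None else 0.0 for y in labels]
    for _ in range(n_iter):
        new_scores = []
        for i, row in enumerate(weights):
            if labels[i] is not None:
                new_scores.append(float(labels[i]))  # clamp the known labels
                continue
            total = sum(row)
            # Unlabeled node: weighted average of its neighbours' scores.
            avg = sum(w * s for w, s in zip(row, scores)) / total if total else 0.0
            new_scores.append(avg)
        scores = new_scores
    return [1 if s >= 0 else -1 for s in scores]


# Two well-separated groups with one labeled point each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
labels = [1, None, None, -1, None, None]
graph = knn_graph(points, k=2)
predicted = propagate_labels(graph, labels)
```

In this toy setting the graph splits into two components, so each unlabeled point inherits the label of the single labeled point in its group. Changing `k` or `sigma` changes which edges exist and how strongly labels flow along them, which is the sensitivity to graph construction that the abstract points to.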


Expressive Rule-Based Stream Reasoning

AAAI Conferences

Stream reasoning is the task of continuously deriving conclusions on streaming data. As a research theme, it is targeted by different communities which emphasize different aspects, e.g., throughput vs. expressiveness. This thesis aims to advance the theoretical foundations underlying diverse stream reasoning approaches and to convert obtained insights into a prototypical expressive rule-based reasoning system that is lacking to date.


Heuristics for Cost-Optimal Classical Planning Based on Linear Programming

AAAI Conferences

Many heuristics for cost-optimal planning are based on linear programming. We cover several interesting heuristics of this type by a common framework that fixes the objective function of the linear program. Within the framework, constraints from different heuristics can be combined in one heuristic estimate which dominates the maximum ...

This model is used to automatically synthesise a controller that maps executions to the next action to perform. The problem is thus cast as a synthesis problem from a given specification. Two obstacles for this approach are that a suitable model for the task is needed, and that the synthesis problem is intractable in general. But this intractability does not preclude the approach from being effective in meaningful cases. Planning is the model-based approach to autonomous behaviour.


Using Social Media to Enhance Emergency Situation Awareness: Extended Abstract

AAAI Conferences

Social media platforms, such as Twitter, offer a rich source of real-time information about real-world events, particularly during mass emergencies. Sifting valuable information from social media provides useful insight into time-critical situations for emergency officers to understand the impact of hazards and act on emergency responses in a timely manner. This work focuses on analyzing Twitter messages generated during natural disasters, and shows how natural language processing and data mining techniques can be utilized to extract situation awareness information from Twitter. We present key relevant approaches that we have investigated including burst detection, tweet filtering and classification, online clustering, and geotagging.
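
As a rough illustration of one of the approaches listed above, burst detection, the sketch below flags time bins whose message count is far above the recent average. The abstract does not describe the authors' detector, so the sliding-window rule, the function name `detect_bursts`, and the parameters `window` and `threshold` are all assumptions.

```python
from statistics import mean, pstdev


def detect_bursts(counts, window=5, threshold=3.0):
    """Return indices of bins whose count is unusually high.

    A bin is flagged when it exceeds the mean of the preceding `window`
    bins by more than `threshold` standard deviations. This is a simple
    assumed rule, not the detector used in the paper.
    """
    bursts = []
    for t in range(window, len(counts)):
        history = counts[t - window:t]
        mu = mean(history)
        # Floor the deviation at 1.0 so a perfectly flat history
        # does not flag every tiny increase.
        sd = max(pstdev(history), 1.0)
        if counts[t] > mu + threshold * sd:
            bursts.append(t)
    return bursts


# Hypothetical per-minute counts of tweets mentioning a keyword.
counts = [10, 12, 11, 9, 10, 11, 60, 55, 12, 10]
bursts = detect_bursts(counts)
```

With these made-up counts only the onset of the spike (index 6) is flagged: once the spike enters the history window it inflates the baseline, so the following high bin is not reported. A production detector would need to handle that, for example by excluding flagged bins from the baseline.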


Capturing a Musician's Groove: Generation of Realistic Accompaniments from Single Song Recordings

AAAI Conferences

This demonstration presents a concatenative synthesis engine for the generation of musical accompaniments, based on chord progressions. The system takes a player's song recording as input, and generates the accompaniment for any other song, based on the input content. We show that working on accompaniment requires special care with temporal deviations at the borders of the sliced chunks, because they account for most of the rhythmic groove. We address this by distinguishing accidental deviations from intentional ones, in order to correct the former while keeping the latter. We will provide a full demonstration of the system, from the recording process to the generation, in various conditions, inviting the audience to participate.


A Direct Boosting Approach for Semi-supervised Classification

AAAI Conferences

We introduce a semi-supervised boosting approach (SSDBoost), which directly minimizes the classification errors and maximizes the margins on both labeled and unlabeled samples, without resorting to any upper bounds or approximations. A two-step algorithm based on coordinate descent/ascent is proposed to implement SSDBoost. Experiments on a number of UCI datasets and synthetic data show that SSDBoost gives competitive or superior results compared with state-of-the-art supervised and semi-supervised boosting algorithms when labeled data is limited, and that it is very robust in noisy cases.


Discriminative Unsupervised Dimensionality Reduction

AAAI Conferences

As an important machine learning topic, dimensionality reduction has been widely studied and used in many areas. A multitude of dimensionality reduction methods have been developed, among which unsupervised dimensionality reduction is more desirable when obtaining label information requires onerous work. However, most previous unsupervised dimensionality reduction methods call for an affinity graph constructed beforehand, with which the subsequent dimensionality reduction steps can then be performed. Separating graph construction from dimensionality reduction makes the dimensionality reduction process highly dependent on the quality of the input graph. In this paper, we propose a novel graph embedding method for unsupervised dimensionality reduction. We conduct dimensionality reduction simultaneously with graph construction by assigning adaptive and optimal neighbors according to the projected local distances. Our method doesn't need an affinity graph constructed in advance, but instead learns the graph concurrently with dimensionality reduction. Thus, the learned graph is optimal for dimensionality reduction. Meanwhile, our learned graph has an explicit block diagonal structure, from which the clustering results can be directly revealed without any postprocessing steps. Extensive empirical results on dimensionality reduction as well as clustering are presented to corroborate the performance of our method.


An Efficient Classifier Based on Hierarchical Mixing Linear Support Vector Machines

AAAI Conferences

Support vector machines (SVMs) play a very dominant role in data classification due to their good generalization performance. However, they suffer from high computational complexity in the classification phase when there are a considerable number of support vectors (SVs). It is therefore desirable to design efficient algorithms for the classification phase to deal with the datasets of realtime pattern recognition systems. To this end, we propose a novel classifier called HMLSVMs (Hierarchical Mixing Linear Support Vector Machines) in this paper, which has a hierarchical structure with a mixing linear SVMs classifier at each node and predicts the label of a sample using only a few hyperplanes. We also give a generalization error bound for the class of locally linear SVMs (LLSVMs) based on the Rademacher theory, which ensures that overfitting can be effectively avoided.

... SVM in advance, and this limits their applications to large-scale problems. To address this issue, several methods for selecting a set of basis vectors are proposed. They include sampling from the training set in the Nyström method [Williams and Seeger, 2001] and variants of the Incomplete Cholesky factorization [Bach and Jordan, 2005], core vector machine (CVM) [Tsang et al., 2005], relevance vector machine (RVM) [Tipping, 2001], and relevance units machine (RUM) [Gao and Zhang, 2009]. Wu et al. [Wu et al., 2006] add one constraint on the number of basis vectors to the standard SVM optimization problem, and then solve this modified nonconvex problem to build sparse kernel learning algorithms (SKLA). Joachims and Yu [Joachims and Yu, 2009] explore a new sparse kernel SVM via cutting plane training, called cutting-plane subspace pursuit (CPSP). Although the above methods prune the SVs and reduce computational complexity in the classification phase, when a new test sample is introduced, they still need to compare it with these pruned SVs via kernel calculations to predict the label of the test ...


Feature Selection from Microarray Data via an Ordered Search with Projected Margin

AAAI Conferences

Microarray experiments are capable of measuring the expression level of thousands of genes simultaneously. Dealing with this enormous amount of information requires complex computation. Support Vector Machines (SVM) have been widely used with great efficiency to solve high-dimensional classification problems. It is therefore plausible to develop new feature selection strategies for microarray data that are associated with this type of classifier. In this paper, we propose a new method for feature selection based on an ordered search process to explore the space of possible subsets. The algorithm, called Admissible Ordered Search (AOS), uses as its evaluation function the margin values estimated for each hypothesis by an SVM classifier. An important theoretical contribution of this paper is the development of the projected margin concept. This value is computed as the projection of the margin vector onto a lower-dimensional subspace and is used as an upper bound for the current value of the hypothesis in the search process. This yields large savings in runtime and consequently makes the search process as a whole more efficient. The algorithm was tested using five different microarray data sets, yielding superior results when compared to three representative feature selection methods.


Sketch the Storyline with CHARCOAL: A Non-Parametric Approach

AAAI Conferences

Generating a coherent synopsis and revealing the development threads for news stories from the increasing amounts of news content remains a formidable challenge. In this paper, we propose a hddCRP (hybrid distance-dependent Chinese Restaurant Process) based HierARChical tOpic model for news Article cLustering, abbreviated as CHARCOAL. Given a collection of news articles, the outcome of CHARCOAL is threefold: 1) it aggregates relevant news articles into clusters (i.e., stories); 2) it disentangles the chain links (i.e., storyline) between articles within the story they describe; 3) it discerns the topic to which each story is assigned (e.g., the Malaysia Airlines Flight 370 story belongs to the aircraft accident topic and U.S. presidential election stories belong to the politics topic). CHARCOAL completes this task by utilizing a hddCRP as prior, and the entities (e.g., names of persons, organizations, or locations) that appear in news articles as clues. Moreover, the nonparametric nature of CHARCOAL allows our model to adaptively learn the appropriate number of stories and topics from the news corpus. The experimental analysis and results demonstrate both the interpretability and the superiority of the proposed approach.
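
To illustrate how a distance-dependent Chinese Restaurant Process can link articles into chains and leave the number of clusters open, here is a minimal sketch of sampling from a plain ddCRP prior. It is not the hybrid hddCRP model of the paper and performs no inference: the exponential decay, the use of publication times as distances, and the names `sample_ddcrp_links`, `clusters_from_links`, `alpha`, and `decay` are assumptions for illustration.

```python
import math
import random


def sample_ddcrp_links(distances, alpha=1.0, decay=1.0, rng=None):
    """Sample one link per item from a distance-dependent CRP prior.

    Item i links to item j with weight exp(-d_ij / decay), or to itself
    with weight alpha. The exponential decay is an assumed choice.
    """
    rng = rng or random.Random(0)
    n = len(distances)
    links = []
    for i in range(n):
        weights = [
            alpha if j == i else math.exp(-distances[i][j] / decay)
            for j in range(n)
        ]
        links.append(rng.choices(range(n), weights=weights, k=1)[0])
    return links


def clusters_from_links(links):
    """Label the connected components of the link graph (union-find)."""
    parent = list(range(len(links)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in enumerate(links):
        parent[find(i)] = find(j)
    ids = {}
    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    return [ids.setdefault(find(i), len(ids)) for i in range(len(links))]


# Hypothetical publication times (in days) of five articles.
times = [0, 1, 2, 50, 51]
distances = [[abs(a - b) for b in times] for a in times]
links = sample_ddcrp_links(distances, alpha=1.0, decay=1.0)
story_ids = clusters_from_links(links)
```

Each article either links to another article (extending a chain) or to itself, and the stories are the connected components of those links, so the number of stories is not fixed in advance. In this toy example a link between the early and the late articles has negligible weight, so they fall into different stories.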