Goto

Collaborating Authors

 South America


Learning the Kernel Matrix with Low-Rank Multiplicative Shaping

AAAI Conferences

Selecting the optimal kernel is an important and difficult challenge in applying kernel methods to pattern recognition. To address this challenge, multiple kernel learning (MKL) aims to learn a kernel from a combination of base kernel functions that perform optimally on the task. In this paper, we propose a novel MKL-themed approach to combine base kernels that are multiplicatively shaped with low-rank positive semidefinitve matrices. The proposed approach generalizes several popular MKL methods and thus provides more flexibility in modeling data. Computationally, we show how these low-rank matrices can be learned efficiently from data using convex quadratic programming. Empirical studies on several standard benchmark datasets for MKL show that the new approach often improves prediction accuracy statistically significantly over very competitive single kernel and other MKL methods.


A Bayesian Approach to the Data Description Problem

AAAI Conferences

In this paper, we address the problem of data description using a Bayesian framework. The goal of data description is to draw a boundary around objects of a certain class of interest to discriminate that class from the rest of the feature space. Data description is also known as one-class learning and has a wide range of applications. The proposed approach uses a Bayesian framework to precisely compute the class boundary and therefore can utilize domain information in form of prior knowledge in the framework. It can also operate in the kernel space and therefore recognize arbitrary boundary shapes. Moreover, the proposed method can utilize unlabeled data in order to improve accuracy of discrimination. We evaluate our method using various real-world datasets and compare it with other state of the art approaches of data description. Experiments show promising results and improved performance over other data description and one-class learning algorithms.


Query Rewriting for Horn-SHIQ Plus Rules

AAAI Conferences

Query answering over Description Logic (DL) ontologies has become a vibrant field of research. Efficient realizations often exploit database technology and rewrite a given query to an equivalent SQL or Datalog query over a database associated with the ontology. This approach has been intensively studied for conjunctive query answering in the DL-Lite and EL families, but is much less explored for more expressive DLs and queries. We present a rewriting-based algorithm for conjunctive query answering over Horn-SHIQ ontologies, possibly extended with recursive rules under limited recursion as in DL+log. This setting not only subsumes both DL-Lite and EL, but also yields an algorithm for answering (limited) recursive queries over Horn-SHIQ ontologies (an undecidable problem for full recursive queries). A prototype implementation shows its potential for applications, as experiments exhibit efficient query answering over full Horn-SHIQ ontologies and benign downscaling to DL-Lite, where it is competitive with comparable state of the art systems.


Choosing Linguistics over Vision to Describe Images

AAAI Conferences

In this paper, we address the problem of automatically generating human-like descriptions for unseen images, given a collection of images and their corresponding human-generated descriptions. Previous attempts for this task mostly rely on visual clues and corpus statistics, but do not take much advantage of the semantic information inherent in the available image descriptions. Here, we present a generic method which benefits from all these three sources (i.e. visual clues, corpus statistics and available descriptions) simultaneously, and is capable of constructing novel descriptions. Our approach works on syntactically and linguistically motivated phrases extracted from the human descriptions. Experimental evaluations demonstrate that our formulation mostly generates lucid and semantically correct descriptions, and significantly outperforms the previous methods on automatic evaluation metrics. One of the significant advantages of our approach is that we can generate multiple interesting descriptions for an image. Unlike any previous work, we also test the applicability of our method on a large dataset containing complex images with rich descriptions.


Trap Avoidance in Local Search Using Pseudo-Conflict Learning

AAAI Conferences

A key challenge in developing efficient local search solvers is to effectively minimise search stagnation (i.e. avoiding traps or local minima). A majority of the state-of-the-art local search solvers perform random and/or Novelty-based walks to overcome search stagnation. Although such strategies are effective in diversifying a search from its current local minimum, they do not actively prevent the search from visiting previously encountered local minima. In this paper, we propose a new preventative strategy to effectively minimise search stagnation using pseudo-conflict learning. We define a pseudo-conflict as a derived path from the search trajectory that leads to a local minimum. We then introduce a new variable selection scheme that penalises variables causing those pseudo-conflicts. Our experimental results show that the new preventative approach significantly improves the performance of local search solvers on a wide range of structured and random benchmarks.


A Novel and Scalable Spatio-Temporal Technique for Ocean Eddy Monitoring

AAAI Conferences

Swirls of ocean currents known as ocean eddies are a crucial component of the ocean's dynamics. In addition to dominating the ocean's kinetic energy, eddies play a significant role in the transport of water, salt, heat, and nutrients. Therefore, understanding current and future eddy patterns is a central climate challenge to address future sustainability of marine ecosystems. The emergence of sea surface height observations from satellite radar altimeter has recently enabled researchers to track eddies at a global scale. The majority of studies that identify eddies from observational data employ highly parametrized connected component algorithms using expert filtered data, effectively making reproducibility and scalability challenging. In this paper, we frame the challenge of monitoring ocean eddies as an unsupervised learning problem. We present a novel change detection algorithm that automatically identifies and monitors eddies in sea surface height data based on heuristics derived from basic eddy properties. Our method is accurate, efficient, and scalable. To demonstrate its performance we analyze eddy activity in the Nordic Sea (60-80N and 20W-20E), an area that has received limited attention and has proven to be difficult to analyze using other methods.


A Convex Formulation for Learning from Crowds

AAAI Conferences

Recently crowdsourcing services are often used to collect a large amount of labeled data for machine learning, since they provide us an easy way to get labels at very low cost and in a short period. The use of crowdsourcing has introduced a new challenge in machine learning, that is, coping with the variable quality of crowd-generated data. Although there have been many recent attempts to address the quality problem of multiple workers, only a few of the existing methods consider the problem of learning classifiers directly from such noisy data. All these methods modeled the true labels as latent variables, which resulted in non-convex optimization problems. In this paper, we propose a convex optimization formulation for learning from crowds without estimating the true labels by introducing personal models of the individual crowd workers. We also devise an efficient iterative method for solving the convex optimization problems by exploiting conditional independence structures in multiple classifiers. We evaluate the proposed method against three competing methods on synthetic data sets and a real crowdsourced data set and demonstrate that the proposed method outperforms the other three methods.


Polarimetric SAR Image Segmentation with B-Splines and a New Statistical Model

arXiv.org Machine Learning

SAR sensors work on the microwaves spectrum, so they are almost immune to adverse weather conditions and they are able to penetrate, to some extent, the surface of certain targets. The first civilian SAR satellite was launched in 1978, and it was followed by a constellation of other similar sensors, mostly devoted to specific applications and in all cases operated at a single frequency and polarization. The Shuttle Imaging Radar-C/X-band SAR (SIRC/XSAR), launched in 1994, could be operated simultaneously at three frequencies, with two of them able to transmit and receive at both horizontal and vertical polarization. This polarimetric capability provides a more complete description of the target [46]. Polarimetric images are multiple complex-valued data sets requiring, thus, specialized models and algorithms.


Hypothesis Testing in Speckled Data with Stochastic Distances

arXiv.org Machine Learning

Images obtained with coherent illumination, as is the case of sonar, ultrasound-B, laser and Synthetic Aperture Radar -- SAR, are affected by speckle noise which reduces the ability to extract information from the data. Specialized techniques are required to deal with such imagery, which has been modeled by the G0 distribution and under which regions with different degrees of roughness and mean brightness can be characterized by two parameters; a third parameter, the number of looks, is related to the overall signal-to-noise ratio. Assessing distances between samples is an important step in image analysis; they provide grounds of the separability and, therefore, of the performance of classification procedures. This work derives and compares eight stochastic distances and assesses the performance of hypothesis tests that employ them and maximum likelihood estimation. We conclude that tests based on the triangular distance have the closest empirical size to the theoretical one, while those based on the arithmetic-geometric distances have the best power. Since the power of tests based on the triangular distance is close to optimum, we conclude that the safest choice is using this distance for hypothesis testing, even when compared with classical distances as Kullback-Leibler and Bhattacharyya.


Nonparametric Edge Detection in Speckled Imagery

arXiv.org Machine Learning

We address the issue of edge detection in Synthetic Aperture Radar imagery. In particular, we propose nonparametric methods for edge detection, and numerically compare them to an alternative method that has been recently proposed in the literature. Our results show that some of the proposed methods display superior results and are computationally simpler than the existing method. An application to real (not simulated) data is presented and discussed.