The CQC Algorithm: Cycling in Graphs to Semantically Enrich and Enhance a Bilingual Dictionary

Journal of Artificial Intelligence Research

Bilingual machine-readable dictionaries are knowledge resources useful in many automatic tasks. However, compared to monolingual computational lexicons like WordNet, bilingual dictionaries typically provide a lower amount of structured information such as lexical and semantic relations, and often do not cover the entire range of possible translations for a word of interest. In this paper we present Cycles and Quasi-Cycles (CQC), a novel algorithm for the automated disambiguation of ambiguous translations in the lexical entries of a bilingual machine-readable dictionary. The dictionary is represented as a graph, and cyclic patterns are sought in this graph to assign an appropriate sense tag to each translation in a lexical entry. Further, we use the algorithm's output to improve the quality of the dictionary itself, by suggesting accurate solutions to structural problems such as misalignments, partial alignments and missing entries. Finally, we successfully apply CQC to the task of synonym extraction.
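
As a rough illustration of the graph view described in this abstract, the sketch below treats each word sense as a node and each translation as a directed edge, then keeps the candidate senses of an ambiguous translation from which a short path leads back to the source sense (so that the translation edge would close a cycle). The toy graph, the sense labels, and the names `has_path` and `senses_closing_cycle` are hypothetical. This is a plain bounded depth-first search, not the CQC algorithm itself, which also handles quasi-cycles.

```python
from typing import Dict, List

# Hypothetical toy data: node = word sense, edge = "is translated by".
GRAPH: Dict[str, List[str]] = {
    "it:banco#1": ["en:bank#1"],
    "it:banco#2": ["en:bench#1"],
    "en:bench#1": ["it:banco#2"],
}


def has_path(graph: Dict[str, List[str]], start: str, goal: str, max_len: int) -> bool:
    """Return True if a directed path of at most max_len edges leads from start to goal."""
    stack = [(start, 0, frozenset([start]))]
    while stack:
        node, depth, seen = stack.pop()
        if depth >= max_len:
            continue  # path budget exhausted on this branch
        for nxt in graph.get(node, []):
            if nxt == goal:
                return True
            if nxt not in seen:  # never revisit a node on the current path
                stack.append((nxt, depth + 1, seen | {nxt}))
    return False


def senses_closing_cycle(
    graph: Dict[str, List[str]], source: str, candidates: List[str], max_len: int = 4
) -> List[str]:
    """Keep the candidate senses that would close a cycle of at most max_len edges.

    One edge is the translation source -> candidate itself, so the return
    path candidate -> source may use at most max_len - 1 edges.
    """
    return [c for c in candidates if has_path(graph, c, source, max_len - 1)]


if __name__ == "__main__":
    print(senses_closing_cycle(GRAPH, "en:bank#1", ["it:banco#1", "it:banco#2"]))
```

In this toy graph only the first candidate sense has an edge back to the source sense, so it is the only one retained.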


Vector-valued Reproducing Kernel Banach Spaces with Applications to Multi-task Learning

arXiv.org Machine Learning

The purpose of this paper is to establish the notion of vector-valued reproducing kernel Banach spaces and demonstrate its applications to multi-task machine learning. Built on the theory of scalar-valued reproducing kernel Hilbert spaces (RKHS) [3], kernel methods have been proven successful in single task machine learning [10, 14, 29, 30, 33]. Multi-task learning where the unknown target function to be learned from finite sample data is vector-valued appears more often in practice. References [13, 25] proposed the development of kernel methods for learning multiple related tasks simultaneously. The mathematical foundation used there was the theory of vector-valued RKHS [5, 27].


Extended Mixture of MLP Experts by Hybrid of Conjugate Gradient Method and Modified Cuckoo Search

arXiv.org Artificial Intelligence

This paper investigates a new method for improving the learning algorithm of the Mixture of Experts (ME) model using a hybrid of Modified Cuckoo Search (MCS) and Conjugate Gradient (CG) as a second-order optimization technique. The CG technique is combined with the Back-Propagation (BP) algorithm to yield a much more efficient learning algorithm for the ME structure. In addition, the experts and gating networks in the enhanced model are replaced by CG-based Multi-Layer Perceptrons (MLPs) to provide faster and more accurate learning. Because CG depends considerably on the initial connection weights of the Artificial Neural Network (ANN), a metaheuristic algorithm, Modified Cuckoo Search, is applied to select the optimal weights. The performance of the proposed method is compared with Gradient Descent Based ME (GDME) and Conjugate Gradient Based ME (CGME) on classification and regression problems. The experimental results show that the hybrid MCS and CG based ME (MCS-CGME) converges faster and performs better on the benchmark data sets used.


The Future of Search and Discovery in Big Data Analytics: Ultrametric Information Spaces

arXiv.org Machine Learning

Under the heading of "Addressing the big data challenge", the European 7th Framework Programme sees the issue thus (see INFSO, 2012): "Recent industry reports detail how data volumes are growing at a faster rate than our ability to interpret and exploit them for innovative ICT applications, for decision support, planning, monitoring, control and interaction. This includes unstructured data types such as video, audio, images and free text as well as structured data types such as database records, sensor readings and 3D. While each of these types requires some specific form of processing and analytics, many of the general principles for managing and storing them at extreme scales are common across all of them." Analytics tool capability is called for, to address these burgeoning issues in the data intensive industries, to support "effective policy making and implementation" of public bodies resulting in "significant annual savings from Big Data applications", and also to exploit open, linked data - "foster the reuse of public sector information and strengthen other open data activities linked to commercial exploitation." The "big data" marketplace is stated to be potentially worth approximately USD 600 billion. To address the challenges of search and discovery in massive and complex data sets and data flows, it is our contention in this work that we must move to an appropriate topology - to an appropriate framework such that computation is greatly facilitated. Our work is all about empowering those who are involved in data analytics, through clustering and related algorithms, to face these new challenges. Scalability and interactivity are two of the performance issues that follow directly from clustering algorithms, for search, retrieval and discovery, that are of linear computational complexity or better (logarithmic, or constant).


What Cannot be Learned with Bethe Approximations

arXiv.org Machine Learning

We address the problem of learning the parameters in graphical models when inference is intractable. A common strategy in this case is to replace the partition function with its Bethe approximation. We show that there exists a regime of empirical marginals where such Bethe learning will fail. By failure we mean that the empirical marginals cannot be recovered from the approximated maximum likelihood parameters (i.e., moment matching is not achieved). We provide several conditions on empirical marginals that yield outer and inner bounds on the set of Bethe learnable marginals. An interesting implication of our results is that there exists a large class of marginals that cannot be obtained as stable fixed points of belief propagation. Taken together, our results provide a novel approach to analyzing learning with Bethe approximations and highlight when it can be expected to work or fail.


New Probabilistic Bounds on Eigenvalues and Eigenvectors of Random Kernel Matrices

arXiv.org Machine Learning

Kernel methods are successful approaches for different machine learning problems. This success is mainly rooted in using feature maps and kernel matrices. Some methods rely on the eigenvalues/eigenvectors of the kernel matrix, while for other methods the spectral information can be used to estimate the excess risk. An important question remains on how close the sample eigenvalues/eigenvectors are to the population values. In this paper, we improve earlier results on concentration bounds for eigenvalues of general kernel matrices. Meanwhile, the obstacles for sharper bounds are accounted for and partially addressed. As a case study, we derive a concentration inequality for sample kernel target-alignment.

1 INTRODUCTION

Kernel methods such as Spectral Clustering, Kernel Principal Component Analysis (KPCA), and Support Vector Machines are successful approaches in many practical machine learning and data analysis problems (Steinwart & Christmann, 2008). The main ingredient of these methods is the kernel matrix, which is built using the kernel function, evaluated at given sample points.
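
To make the two objects named in this entry concrete, the sketch below builds a kernel matrix by evaluating a kernel function at sample points and then computes a sample kernel target-alignment. It assumes the commonly used definition, the Frobenius inner product of K and yy^T divided by the product of their Frobenius norms; the excerpt does not state which definition the paper adopts. The Gaussian kernel, its `gamma` parameter, and all function names are illustrative choices.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def gaussian_kernel(x: Vector, z: Vector, gamma: float = 1.0) -> float:
    """Gaussian (RBF) kernel exp(-gamma * ||x - z||^2); an illustrative choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def kernel_matrix(points: Sequence[Vector], kernel: Callable[[Vector, Vector], float]) -> List[List[float]]:
    """Kernel matrix K with K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(x, z) for z in points] for x in points]


def target_alignment(K: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Sample alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F) (assumed definition)."""
    n = len(y)
    inner = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    yy_norm = sum(v * v for v in y)  # ||yy^T||_F equals ||y||_2^2
    return inner / (k_norm * yy_norm)


if __name__ == "__main__":
    xs = [(0.0,), (0.1,), (2.0,), (2.1,)]   # two well-separated groups
    ys = [1.0, 1.0, -1.0, -1.0]
    K = kernel_matrix(xs, gaussian_kernel)
    print(target_alignment(K, ys))
```

The alignment equals 1 when K is exactly yy^T and shrinks as the kernel matrix carries less of the label structure.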


Discovering causal structures in binary exclusive-or skew acyclic models

arXiv.org Machine Learning

Discovering causal relations among observed variables in a given data set is a main topic in studies of statistics and artificial intelligence. Recently, some techniques to discover an identifiable causal structure have been explored based on non-Gaussianity of the observed data distribution. However, most of these are limited to continuous data. In this paper, we present a novel causal model for binary data and propose a new approach to derive an identifiable causal structure governing the data based on skew Bernoulli distributions of external noise. Experimental evaluation shows excellent performance for both artificial and real world data sets.


Segmentation of Offline Handwritten Bengali Script

arXiv.org Artificial Intelligence

Character segmentation is one of the most important decision processes for optical character recognition (OCR). Isolating individual alphabetic characters in the script image is often significant enough to make a decisive contribution towards the success rate of the overall system. An OCR system may be designed for either online or off-line use. Online OCR systems collect input data by recording the order of strokes made by the writer on an electronic bit-pad, and off-line OCR systems do the same by recording the pixel-by-pixel digital image of the entire writing with a digital scanner. OCR has a wide field of application covering handwritten document transcription, automatic mail address recognition, machine processing of bank checks, faxes etc. Off-line OCR of handwritten words has long been an active area of research. Some important contributions so far made in this field involve analysis of English texts [1], [2], [3], [5], Chinese script [6] and Arabic characters [9]. With this background of research, the present work considers Bengali script for developing suitable techniques for off-line OCR with it.


Iterated risk measures for risk-sensitive Markov decision processes with discounted cost

arXiv.org Artificial Intelligence

We demonstrate a limitation of discounted expected utility, a standard approach for representing risk preferences when future cost is discounted. Specifically, we provide an example of a decision maker's preference that appears to be rational but cannot be represented with any discounted expected utility. A straightforward modification to discounted expected utility leads to inconsistent decision making over time. We show that an iterated risk measure can represent the preference that cannot be represented by any discounted expected utility and that the decisions based on the iterated risk measure are consistent over time.
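
The idea of iterating a risk measure over discounted costs can be sketched as a backward recursion on a small Markov chain: at each stage, the value of a state is its immediate cost plus the discounted one-step risk of the next-stage values. The abstract does not say which one-step risk measure is iterated, so the entropic risk measure used here, along with the function names and the parameters `theta`, `gamma`, and `horizon`, are assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def entropic_risk(values: Sequence[float], probs: Sequence[float], theta: float) -> float:
    """Entropic risk (1/theta) * log E[exp(theta * X)] of a discrete cost X, theta > 0."""
    return math.log(sum(p * math.exp(theta * v) for v, p in zip(values, probs))) / theta


def iterated_risk(
    costs: Sequence[float],
    transitions: Sequence[Sequence[float]],
    gamma: float,
    theta: float,
    horizon: int,
) -> List[float]:
    """Backward recursion V_t(s) = c(s) + gamma * rho(V_{t+1}(S') | s).

    costs[s] is the per-stage cost in state s, transitions[s][s2] the
    probability of moving from s to s2, and rho the entropic risk above
    (an assumed choice of one-step risk measure).
    """
    value = list(costs)  # last stage: only the immediate cost remains
    for _ in range(horizon):
        value = [
            costs[s] + gamma * entropic_risk(value, transitions[s], theta)
            for s in range(len(costs))
        ]
    return value


if __name__ == "__main__":
    costs = [1.0, 0.0]
    transitions = [[0.5, 0.5], [0.0, 1.0]]
    print(iterated_risk(costs, transitions, gamma=0.9, theta=1.0, horizon=3))
```

Because the risk measure is applied stage by stage to the continuation value, the evaluation at a later stage is a sub-problem of the evaluation at an earlier one, which is the structural property usually associated with consistency over time.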


Dynamic Mechanism Design for Markets with Strategic Resources

arXiv.org Artificial Intelligence

The assignment of tasks to multiple resources becomes an interesting game-theoretic problem when both the task owner and the resources are strategic. In the classical, nonstrategic setting, where the states of the tasks and resources are observable by the controller, this problem is that of finding an optimal policy for a Markov decision process (MDP). When the states are held by strategic agents, the problem of an efficient task allocation extends beyond that of solving an MDP and becomes that of designing a mechanism. Motivated by this fact, we propose a general mechanism which decides on an allocation rule for the tasks and resources and a payment rule to incentivize agents' participation and truthful reports. In contrast to related dynamic strategic control problems studied in recent literature, the problem studied here has interdependent values: the benefit of an allocation to the task owner is not simply a function of the characteristics of the task itself and the allocation, but also of the state of the resources. We introduce a dynamic extension of Mezzetti's two-phase mechanism for interdependent valuations. In this changed setting, the proposed dynamic mechanism is efficient, within-period ex-post incentive compatible, and within-period ex-post individually rational.