Goto

Collaborating Authors

 Genre


2D Path Solutions from a Single Layer Excitable CNN Model

arXiv.org Artificial Intelligence

An easily implementable path solution algorithm for 2D spatial problems, based on excitable/programmable characteristics of a specific cellular nonlinear network (CNN) model is presented and numerically investigated. The network is a single layer bioinspired model which was also implemented in CMOS technology. It exhibits excitable characteristics with regionally bistable cells. The related response realizes propagations of trigger autowaves, where the excitable mode can be globally preset and reset. It is shown that, obstacle distributions in 2D space can also be directly mapped onto the coupled cell array in the network. Combining these two features, the network model can serve as the main block in a 2D path computing processor. The related algorithm and configurations are numerically experimented with circuit level parameters and performance estimations are also presented. The simplicity of the model also allows alternative technology and device level implementation, which may become critical in autonomous processor design of related micro or nanoscale robotic applications.


Overcoming Hierarchical Difficulty by Hill-Climbing the Building Block Structure

arXiv.org Artificial Intelligence

The Building Block Hypothesis suggests that Genetic Algorithms (GAs) are well-suited for hierarchical problems, where efficient solving requires proper problem decomposition and assembly of solution from sub-solution with strong non-linear interdependencies. The paper proposes a hill-climber operating over the building block (BB) space that can efficiently address hierarchical problems. The new Building Block Hill-Climber (BBHC) uses past hill-climb experience to extract BB information and adapts its neighborhood structure accordingly. The perpetual adaptation of the neighborhood structure allows the method to climb the hierarchical structure solving successively the hierarchical levels. It is expected that for fully non deceptive hierarchical BB structures the BBHC can solve hierarchical problems in linearithmic time. Empirical results confirm that the proposed method scales almost linearly with the problem size thus clearly outperforms population based recombinative methods.


Random Sentences from a Generalized Phrase-Structure Grammar Interpreter

arXiv.org Artificial Intelligence

In numerous domains in cognitive science it is often useful to have a source for randomly generated corpora. These corpora may serve as a foundation for artificial stimuli in a learning experiment (e.g., Ellefson & Christiansen, 2000), or as input into computational models (e.g., Christiansen & Dale, 2001). The following compact and general C program interprets a phrasestructure grammar specified in a text file. It follows parameters set at a Unix or Unix-based command-line and generates a corpus of random sentences from that grammar. The first and required input into the program is a file that contains a phrase-structure grammar description (see below).


How to Beat the Adaptive Multi-Armed Bandit

arXiv.org Artificial Intelligence

The multi-armed bandit is a concise model for the problem of iterated decision-making under uncertainty. In each round, a gambler must pull one of $K$ arms of a slot machine, without any foreknowledge of their payouts, except that they are uniformly bounded. A standard objective is to minimize the gambler's regret, defined as the gambler's total payout minus the largest payout which would have been achieved by any fixed arm, in hindsight. Note that the gambler is only told the payout for the arm actually chosen, not for the unchosen arms. Almost all previous work on this problem assumed the payouts to be non-adaptive, in the sense that the distribution of the payout of arm $j$ in round $i$ is completely independent of the choices made by the gambler on rounds $1, \dots, i-1$. In the more general model of adaptive payouts, the payouts in round $i$ may depend arbitrarily on the history of past choices made by the algorithm. We present a new algorithm for this problem, and prove nearly optimal guarantees for the regret against both non-adaptive and adaptive adversaries. After $T$ rounds, our algorithm has regret $O(\sqrt{T})$ with high probability (the tail probability decays exponentially). This dependence on $T$ is best possible, and matches that of the full-information version of the problem, in which the gambler is told the payouts for all $K$ arms after each round. Previously, even for non-adaptive payouts, the best high-probability bounds known were $O(T^{2/3})$, due to Auer, Cesa-Bianchi, Freund and Schapire. The expected regret of their algorithm is $O(T^{1/2}) for non-adaptive payouts, but as we show, $Ω(T^{2/3})$ for adaptive payouts.


Minimum Cost Homomorphisms to Proper Interval Graphs and Bigraphs

arXiv.org Artificial Intelligence

For graphs $G$ and $H$, a mapping $f: V(G)\dom V(H)$ is a homomorphism of $G$ to $H$ if $uv\in E(G)$ implies $f(u)f(v)\in E(H).$ If, moreover, each vertex $u \in V(G)$ is associated with costs $c_i(u), i \in V(H)$, then the cost of the homomorphism $f$ is $\sum_{u\in V(G)}c_{f(u)}(u)$. For each fixed graph $H$, we have the {\em minimum cost homomorphism problem}, written as MinHOM($H)$. The problem is to decide, for an input graph $G$ with costs $c_i(u),$ $u \in V(G), i\in V(H)$, whether there exists a homomorphism of $G$ to $H$ and, if one exists, to find one of minimum cost. Minimum cost homomorphism problems encompass (or are related to) many well studied optimization problems. We describe a dichotomy of the minimum cost homomorphism problems for graphs $H$, with loops allowed. When each connected component of $H$ is either a reflexive proper interval graph or an irreflexive proper interval bigraph, the problem MinHOM($H)$ is polynomial time solvable. In all other cases the problem MinHOM($H)$ is NP-hard. This solves an open problem from an earlier paper. Along the way, we prove a new characterization of the class of proper interval bigraphs.


Cross-lingual keyword assignment

arXiv.org Artificial Intelligence

Introduction In the last years, many useful NLP tools have been developed and many of them are now even available commercially. Most of these tools are monolingual or multi-monolingual, meaning that the software can deal with more than one language, but that the re sults will al ways be displayed in the same language as the text. We therefore distinguish these applications from cross-lingual software, which is software that helps to transgress the language bound ary. Examples for such applications are machine translation and cross-lingual document retrieval, i.e. retrieval using search engines which allow to en ter a search term in one language and which also yield results in other languages, usually because the query is translated in one way or another. In our eyes, cross-lingual applications are currently the bottleneck of available NLP tools. To our knowledge, there are no applications that allow comparing documents written in dif ferent languages with each other and there are very few which give users a quick overview of the ap proximate contents of documents written in different languages.


Automatic Identification of Document Translations in Large Multilingual Document Collections

arXiv.org Artificial Intelligence

Texts and their translations are a rich linguistic resource that can be used to train and test statistics-based Machine Translation systems and many other applications. In this paper, we present a working system that can identify translations and other very similar documents among a large number of candidates, by representing the document contents with a vector of thesaurus terms from a multilingual thesaurus, and by then measuring the semantic similarity between the vectors. Tests on different text types have shown that the system can detect translations with over 96% precision in a large search space of 820 documents or more. The system was tuned to ignore language-specific similarities and to give similar documents in a second language the same similarity score as equivalent documents in the same language. The application can also be used to detect cross-lingual document plagiarism.


Automatic annotation of multilingual text collections with a conceptual thesaurus

arXiv.org Artificial Intelligence

Automatic annotation of documents with controlled vocabulary terms (descriptors) from a conceptual thesaurus is not only useful for document indexing and retrieval. The mapping of texts onto the same thesaurus furthermore allows to establish links between similar documents. This is also a substantial requirement of the Semantic Web. This paper presents an almost language-independent system that maps documents written in different languages onto the same multilingual conceptual thesaurus, EUROVOC. Conceptual thesauri differ from Natural Language Thesauri in that they consist of relatively small controlled lists of words or phrases with a rather abstract meaning. To automatically identify which thesaurus descriptors describe the contents of a document best, we developed a statistical, associative system that is trained on texts that have previously been indexed manually. In addition to describing the large number of empirically optimised parameters of the fully functional application, we present the performance of the software according to a human evaluation by professional indexers.


The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages

arXiv.org Artificial Intelligence

We present a new, unique and freely available parallel corpus containing European Union (EU) documents of mostly legal nature. It is available in all 20 official EUanguages, with additional documents being available in the languages of the EU candidate countries. The corpus consists of almost 8,000 documents per language, with an average size of nearly 9 million words per language. Pair-wise paragraph alignment information produced by two different aligners (Vanilla and HunAlign) is available for all 190+ language pair combinations. Most texts have been manually classified according to the EUROVOC subject domains so that the collection can also be used to train and test multi-label classification algorithms and keyword-assignment software. The corpus is encoded in XML, according to the Text Encoding Initiative Guidelines. Due to the large number of parallel texts in many languages, the JRC-Acquis is particularly suitable to carry out all types of cross-language research, as well as to test and benchmark text analysis software across different languages (for instance for alignment, sentence splitting and term extraction).


Interactive Configuration by Regular String Constraints

arXiv.org Artificial Intelligence

A product configurator which is complete, backtrack free and able to compute the valid domains at any state of the configuration can be constructed by building a Binary Decision Diagram (BDD). Despite the fact that the size of the BDD is exponential in the number of variables in the worst case, BDDs have proved to work very well in practice. Current BDD-based techniques can only handle interactive configuration with small finite domains. In this paper we extend the approach to handle string variables constrained by regular expressions. The user is allowed to change the strings by adding letters at the end of the string. We show how to make a data structure that can perform fast valid domain computations given some assignment on the set of string variables. We first show how to do this by using one large DFA. Since this approach is too space consuming to be of practical use, we construct a data structure that simulates the large DFA and in most practical cases are much more space efficient. As an example a configuration problem on $n$ string variables with only one solution in which each string variable is assigned to a value of length of $k$ the former structure will use $Ω(k^n)$ space whereas the latter only need $O(kn)$. We also show how this framework easily can be combined with the recent BDD techniques to allow both boolean, integer and string variables in the configuration problem.