AITopics

doi: 10.1613/jair.2088

10500

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

Bouffier, Amanda, Poibeau, Thierry

Automatically Restructuring Practice Guidelines using the GEM DTD

arXiv.org Artificial IntelligenceJun-8-2007

This paper describes a system capable of semi-automatically filling an XML template from free texts in the clinical domain (practice guidelines). The XML template includes semantic information not explicitly encoded in the text (pairs of conditions and actions/recommendations). Therefore, there is a need to compute the exact scope of conditions over text sequences expressing the required actions. We present a system developed for this task. We show that it yields good performance when applied to the analysis of French practice guidelines.

artificial intelligence, natural language, text processing, (17 more...)

0706.1137

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Diagnostic Medicine (0.70)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Information Management (0.89)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)

Journal of Artificial Intelligence ResearchJun-4-2007

NP Animacy Identification for Anaphora Resolution

Orasan, C., Evans, R. J.

In anaphora resolution for English, animacy identification can play an integral role in the application of agreement restrictions between pronouns and candidates, and as a result, can improve the accuracy of anaphora resolution systems. In this paper, two methods for animacy identification are proposed and evaluated using intrinsic and extrinsic measures. The first method is a rule-based one which uses information about the unique beginners in WordNet to classify NPs on the basis of their animacy. The second method relies on a machine learning algorithm which exploits a WordNet enriched with animacy information for each sense. The effect of word sense disambiguation on the two methods is also assessed. The intrinsic evaluation reveals that the machine learning method reaches human levels of performance. The extrinsic evaluation demonstrates that animacy identification can be beneficial in anaphora resolution, especially in the cases where animate entities are identified with high precision.

animacy, corpus, pronoun, (10 more...)

doi: 10.1613/jair.2179

10499

Country:

Europe > United Kingdom > England > Lancashire > Lancaster (0.04)
Europe > France (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(12 more...)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Wemmenhove, Bastian, Kappen, Bert

Loop corrections for message passing algorithms in continuous variable models

arXiv.org Artificial IntelligenceMay-31-2007

In this paper we derive the equations for Loop Corrected Belief Propagation on a continuous variable Gaussian model. Using the exactness of the averages for belief propagation for Gaussian models, a different way of obtaining the covariances is found, based on Belief Propagation on cavity graphs. We discuss the relation of this loop correction algorithm to Expectation Propagation algorithms for the case in which the model is no longer Gaussian, but slightly perturbed by nonlinear terms.

artificial intelligence, cavity distribution, equation, (14 more...)

0705.4566

Country: Europe > Netherlands (0.14)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.79)

Journal of Artificial Intelligence ResearchMay-31-2007

Computationally Feasible VCG Mechanisms

Nisan, N., Ronen, A.

A major achievement of mechanism design theory is a general method for the construction of truthful mechanisms called VCG (Vickrey, Clarke, Groves). When applying this method to complex problems such as combinatorial auctions, a difficulty arises: VCG mechanisms are required to compute optimal outcomes and are, therefore, computationally infeasible. However, if the optimal outcome is replaced by the results of a sub-optimal algorithm, the resulting mechanism (termed VCG-based) is no longer necessarily truthful. The first part of this paper studies this phenomenon in depth and shows that it is near universal. Specifically, we prove that essentially all reasonable approximations or heuristics for combinatorial auctions as well as a wide class of cost minimization problems yield non-truthful VCG-based mechanisms. We generalize these results for affine maximizers. The second part of this paper proposes a general method for circumventing the above problem. We introduce a modification of VCG-based mechanisms in which the agents are given a chance to improve the output of the underlying algorithm. When the agents behave truthfully, the welfare obtained by the mechanism is at least as good as the one obtained by the algorithm's output. We provide a strong rationale for truth-telling behavior. Our method satisfies individual rationality as well.

agent, algorithm, mechanism, (15 more...)

doi: 10.1613/jair.2046

10498

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre: Overview (0.87)

Industry: Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Journal of Artificial Intelligence ResearchMay-31-2007

Solution-Guided Multi-Point Constructive Search for Job Shop Scheduling

Beck, J. C.

Solution-Guided Multi-Point Constructive Search (SGMPCS) is a novel constructive search technique that performs a series of resource-limited tree searches where each search begins either from an empty solution (as in randomized restart) or from a solution that has been encountered during the search. A small number of these "elite'' solutions is maintained during the search. We introduce the technique and perform three sets of experiments on the job shop scheduling problem. First, a systematic, fully crossed study of SGMPCS is carried out to evaluate the performance impact of various parameter settings. Second, we inquire into the diversity of the elite solution set, showing, contrary to expectations, that a less diverse set leads to stronger performance. Finally, we compare the best parameter setting of SGMPCS from the first two experiments to chronological backtracking, limited discrepancy search, randomized restart, and a sophisticated tabu search algorithm on a set of well-known benchmark problems. Results demonstrate that SGMPCS is significantly better than the other constructive techniques tested, though lags behind the tabu search.

algorithm, diversity, elite solution, (11 more...)

doi: 10.1613/jair.2169

10497

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Cilibrasi, Rudi, Vitanyi, Paul M. B.

The Google Similarity Distance

arXiv.org Artificial IntelligenceMay-30-2007

Words and phrases acquire meaning from the way they are used in society, from their relative semantics to other words and phrases. For computers the equivalent of `society' is `database,' and the equivalent of `use' is `way to search the database.' We present a new theory of similarity between words and phrases based on information distance and Kolmogorov complexity. To fix thoughts we use the world-wide-web as database, and Google as search engine. The method is also applicable to other search engines and databases. This theory is then applied to construct a method to automatically extract similarity, the Google similarity distance, of words and phrases from the world-wide-web using Google page counts. The world-wide-web is the largest database on earth, and the context information entered by millions of independent users averages out to provide automatic semantics of useful quality. We give applications in hierarchical clustering, classification, and language translation. We give examples to distinguish between colors and numbers, cluster names of paintings by 17th century Dutch masters and names of books by English novelists, the ability to understand emergencies, and primes, and we demonstrate the ability to do a simple automatic English-Spanish translation. Finally, we use the WordNet database as an objective baseline against which to judge the performance of our method. We conduct a massive randomized trial in binary classification using support vector machines to learn categories based on our Google distance, resulting in an a mean agreement of 87% with the expert crafted WordNet categories.

machine learning, natural language, search term, (20 more...)

cs/0412098

Country:

North America > United States (0.46)
Europe > Netherlands (0.28)
North America > Canada > Alberta (0.28)

Genre:

Research Report > Experimental Study (0.48)
Research Report > Strength High (0.34)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)

arXiv.org Artificial IntelligenceMay-29-2007

Truecluster matching

Oehlschlägel, Jens

Cluster matching by permuting cluster labels is important in many clustering contexts such as cluster validation and cluster ensemble techniques. The classic approach is to minimize the euclidean distance between two cluster solutions which induces inappropriate stability in certain settings. Therefore, we present the truematch algorithm that introduces two improvements best explained in the crisp case. First, instead of maximizing the trace of the cluster crosstable, we propose to maximize a chi-square transformation of this crosstable. Thus, the trace will not be dominated by the cells with the largest counts but by the cells with the most non-random observations, taking into account the marginals. Second, we suggest a probabilistic component in order to break ties and to make the matching algorithm truly random on random data. The truematch algorithm is designed as a building block of the truecluster framework and scales in polynomial time. First simulation results confirm that the truematch algorithm gives more consistent truecluster results for unequal cluster sizes. Free R software is available.

algorithm, artificial intelligence, machine learning, (15 more...)

0705.4302

Country: Europe (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

arXiv.org Artificial IntelligenceMay-28-2007

Truecluster: robust scalable clustering with model selection

Oehlschlägel, Jens

Data-based classification is fundamental to most branches of science. While recent years have brought enormous progress in various areas of statistical computing and clustering, some general challenges in clustering remain: model selection, robustness, and scalability to large datasets. We consider the important problem of deciding on the optimal number of clusters, given an arbitrary definition of space and clusteriness. We show how to construct a cluster information criterion that allows objective model selection. Differing from other approaches, our truecluster method does not require specific assumptions about underlying distributions, dissimilarity definitions or cluster models. Truecluster puts arbitrary clustering algorithms into a generic unified (sampling-based) statistical framework. It is scalable to big datasets and provides robust cluster assignments and case-wise diagnostics. Truecluster will make clustering more objective, allows for automation, and will save time and costs. Free R software is available.

artificial intelligence, base cluster algorithm, machine learning, (15 more...)

cs/0601001

Country:

North America > United States (0.28)
Europe > Austria (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Szabo, Zoltan, Poczos, Barnabas, Lorincz, Andras

Undercomplete Blind Subspace Deconvolution

arXiv.org Machine LearningMay-20-2007

We introduce the blind subspace deconvolution (BSSD) problem, which is the extension of both the blind source deconvolution (BSD) and the independent subspace analysis (ISA) tasks. We examine the case of the undercomplete BSSD (uBSSD). Applying temporal concatenation we reduce this problem to ISA. The associated `high dimensional' ISA problem can be handled by a recent technique called joint f-decorrelation (JFD). Similar decorrelation methods have been used previously for kernel independent component analysis (kernel-ICA). More precisely, the kernel canonical correlation (KCCA) technique is a member of this family, and, as is shown in this paper, the kernel generalized variance (KGV) method can also be seen as a decorrelation method in the feature space. These kernel based algorithms will be adapted to the ISA task. In the numerical examples, we (i) examine how efficiently the emerging higher dimensional ISA tasks can be tackled, and (ii) explore the working and advantages of the derived kernel-ISA methods.

artificial intelligence, isa task, machine learning, (17 more...)

arXiv.org Machine Learning

math/0701210

Country:

Europe (1.00)
North America > United States (0.67)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Communications > Networks (0.93)