AITopics

In this paper we present a graph-based approach aimed at learning a lexical taxonomy automatically starting from a domain corpus and the Web. Unlike many taxonomy learning approaches in the literature, our novel algorithm learns both concepts and relations entirely from scratch via the automated extraction of terms, definitions and hypernyms. This results in a very dense, cyclic and possibly disconnected hypernym graph. The algorithm then induces a taxonomy from the graph. Our experiments show that we obtain high-quality results, both when building brand-new taxonomies and when reconstructing WordNet sub-hierarchies.

algorithm, graph, taxonomy

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.46)

Improving Topic Evaluation Using Conceptual Knowledge

Musat, Claudiu Cristian ("Politehnica") | Velcin, Julien (University of Bucharest) | Trausan-Matu, Stefan (Université) | Rizoiu, Marian-Andrei (Lumière)

The growing number of statistical topic models has led to the need to better evaluate their output. Traditional evaluation methods estimate the model’s fitness to unseen data. It has recently been proven that human judgment can differ greatly from these measures. Thus there is a pressing need for methods that better emulate human judgment. In this paper we present a system that computes the usefulness of individual topics from a given model on the basis of information drawn from a given ontology, in this case WordNet. The notion of utility is regarded as the ability to attribute a concept to each topic and to separate words related to the topic from the unrelated ones based on that concept. In multiple experiments we prove the correlation between the automatic evaluation method and the answers received from human evaluators, for various corpora and difficulty levels. By changing the evaluation focus from a statistical one to a conceptual one we were able to detect which topics are conceptually meaningful and rank them accordingly.

evaluator, experiment, spurious word

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Muresan, Smaranda (Rutgers University)

Learning for Deep Language Understanding

Lexicalized Well-Founded Grammar (LWFG) is a recently developed syntactic-semantic grammar formalism for deep language understanding, which balances expressiveness with provable learnability results. The learnability result for LWFGs assumes that the semantic composition constraints are learnable. In this paper, we show which properties and principles the semantic representation and grammar formalism require in order to learn these constraints from examples, and we give a learning algorithm. We also introduce an LWFG parser as a deductive system, used as an inference engine during LWFG induction. An example of learning a grammar for noun compounds is given.

algorithm, grammar, representative example

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)

Mendes, Ana Cristina (Instituto Superior Técnico, Technical University of Lisbon and Spoken Language Systems Lab/INESC-ID Lisboa) | Coheur, Luísa (Instituto Superior Técnico, Technical University of Lisbon and Spoken Language Systems Lab/INESC-ID Lisboa)

An Approach to Answer Selection in Question-Answering Based on Semantic Relations

A usual strategy to select the final answer in factoid Question-Answering (QA) relies on redundancy. A score is given to each candidate answer as a function of its frequency of occurrence, and the final answer is selected from the set of candidates sorted in decreasing order of score. For that purpose, systems often try to group together semantically equivalent answers. However, candidate answers hold several other semantic relations, such as inclusion, which are not considered, and candidates are mostly seen independently, as competitors. Our hypothesis is that not just equivalence, but other relations between candidate answers have an impact on the performance of a redundancy-based QA system. In this paper, we describe experimental studies to back up this hypothesis. Our findings show that, with relatively simple techniques to recognize relations, systems' accuracy can be improved for answers in the categories Number, Date and Entity.
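The redundancy baseline described in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' system: the function names are hypothetical, and the equivalence test (case- and whitespace-insensitive string match) is an assumed stand-in for whatever grouping a real QA system would use. Relations other than equivalence, such as inclusion, are not modeled here.

```python
from collections import Counter


def normalize(answer: str) -> str:
    # Assumed equivalence test: two candidates are "equivalent" when they
    # match after lowercasing and collapsing whitespace.
    return " ".join(answer.lower().split())


def rank_answers(candidates):
    """Group equivalent candidates and rank the groups by frequency."""
    scores = Counter(normalize(c) for c in candidates)
    # Decreasing order of score; ties are broken alphabetically so the
    # ranking is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


ranked = rank_answers(["Lisbon", "lisbon ", "Porto", "Lisbon", "Porto"])
best_answer, best_score = ranked[0]
```

The first element of `ranked` is the selected answer under this frequency-only scoring.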

candidate answer, frequency, relation

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > Portugal > Lisbon > Lisbon (0.14)
Europe > Italy (0.05)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.86)

Lo, Chi-kiu (Hong Kong University of Science and Technology) | Wu, Dekai (Hong Kong University of Science and Technology)

SMT Versus AI Redux: How Semantic Frames Evaluate MT More Accurately

We argue for an alternative paradigm in evaluating machine translation quality that is strongly empirical but more accurately reflects the utility of translations, by returning to a representational foundation based on AI-oriented lexical semantics, rather than the superficial flat n-gram and string representations recently dominating the field. Driven by such metrics as BLEU and WER, current SMT frequently produces unusable translations where the semantic event structure is mistranslated: who did what to whom, when, where, why, and how? We argue that it is time for a new generation of more “intelligent” automatic and semi-automatic metrics, based clearly on getting the structure right at the lexical semantics level. We show empirically that it is possible to use simple PropBank-style semantic frame representations to surpass the correlation of all currently widespread metrics, including even HTER, with human adequacy judgments. We also show that replacing human annotators with automatic semantic role labeling still yields much of the advantage of the approach. We combine the best of both worlds: from an SMT perspective, we provide superior yet low-cost quantitative objective functions for translation quality; and yet from an AI perspective, we regain the representational transparency and clear reflection of semantic utility of structural frame-based knowledge representations.

correlation, evaluation, translation

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > Czechia > Prague (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Unsupervised Modeling of Dialog Acts in Asynchronous Conversations

Joty, Shafiq Rayhan (University of British Columbia) | Carenini, Giuseppe (University of British Columbia) | Lin, Chin-Yew (Microsoft Research Asia)

We present unsupervised approaches to the problem of modeling dialog acts in asynchronous conversations; i.e., conversations where participants collaborate with each other at different times. In particular, we investigate a graph-theoretic deterministic framework and two probabilistic conversation models (i.e., HMM and HMM+Mix) for modeling dialog acts in emails and forums. We train and test our conversation models on (a) temporal order and (b) graph-structural order of the datasets. Empirical evaluation suggests (i) the graph-theoretic framework that relies on lexical and structural similarity metrics is not the right model for this task, (ii) conversation models perform better on the graph-structural order than the temporal order of the datasets and (iii) HMM+Mix is a better conversation model than the simple HMM model.

conversation model, fragment, similarity

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Asia > India > Karnataka > Bengaluru (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

Automatic Discovery of Fuzzy Synsets from Dictionary Definitions

Oliveira, Hugo Gonçalo (University of Coimbra) | Gomes, Paulo (University of Coimbra)

In order to deal with ambiguity in natural language, it is common to organise words, according to their senses, in synsets, which are groups of synonymous words that can be seen as concepts. The manual creation of a broad-coverage synset base is a time-consuming task, so we take advantage of dictionary definitions for extracting synonymy pairs and of clustering for identifying synsets. Since word senses are not discrete, we create fuzzy synsets, where each word has a membership degree. We report on the results of the creation of a fuzzy synset base for Portuguese, from three electronic dictionaries. The resulting resource is larger than existing handcrafted Portuguese thesauri.

fuzzy synset, graph, synset

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > Middle East > Malta (0.04)
South America > Brazil (0.04)
North America > United States > Massachusetts > Plymouth County > Norwell (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Denis, Pascal (Alpage, INRIA and University of Paris Diderot) | Muller, Philippe (Alpage, INRIA and IRIT and University of Toulouse)

Predicting Globally-Coherent Temporal Structures from Texts Via Endpoint Inference and Graph Decomposition

An elegant approach to learning temporal orderings from texts is to formulate this problem as a constraint optimization problem, which can then be given an exact solution using Integer Linear Programming. This works well for cases where the number of possible relations between temporal entities is restricted to the mere precedence relation [Bramsen et al., 2006; Chambers and Jurafsky, 2008], but becomes impractical when considering all possible interval relations. This paper proposes two innovations, inspired by work on temporal reasoning, that control this combinatorial blow-up, thereby rendering exact ILP inference viable in the general case. First, we translate our network of constraints from temporal intervals to their endpoints, to handle a drastically smaller set of constraints, while preserving the same temporal information. Second, we show that additional efficiency is gained by enforcing coherence on particular subsets of the entire temporal graphs. We evaluate these innovations through various experiments on TimeBank 1.2, and compare our ILP formulations with various baselines and oracle systems.
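To illustrate the interval-to-endpoint translation mentioned in this abstract, here is a minimal sketch that assumes Allen-style interval relations. The relation set, the table, and the function name are illustrative assumptions; the abstract does not specify the paper's actual relation inventory or its ILP encoding. Each interval is reduced to a start point and an end point, so every interval relation becomes a small set of point comparisons using only `<`, `=` and `>`.

```python
# Assumed mapping from a few Allen-style relations "A rel B" to constraints
# of the form (endpoint of A, comparison, endpoint of B).
ALLEN_TO_ENDPOINTS = {
    "before":   [("end", "<", "start")],
    "meets":    [("end", "=", "start")],
    "overlaps": [("start", "<", "start"), ("end", ">", "start"), ("end", "<", "end")],
    "starts":   [("start", "=", "start"), ("end", "<", "end")],
    "during":   [("start", ">", "start"), ("end", "<", "end")],
    "finishes": [("start", ">", "start"), ("end", "=", "end")],
    "equals":   [("start", "=", "start"), ("end", "=", "end")],
}


def to_endpoint_constraints(a: str, relation: str, b: str):
    """Rewrite one interval relation as a list of endpoint constraints."""
    # Every interval must start strictly before it ends.
    constraints = [f"{a}.start < {a}.end", f"{b}.start < {b}.end"]
    for point_a, op, point_b in ALLEN_TO_ENDPOINTS[relation]:
        constraints.append(f"{a}.{point_a} {op} {b}.{point_b}")
    return constraints


before = to_endpoint_constraints("e1", "before", "e2")
during = to_endpoint_constraints("e1", "during", "e2")
```

The point of the translation is that the vocabulary of constraints shrinks: the solver reasons over three point relations instead of a much larger set of interval relations.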

chamber and jurafsky, constraint, relation

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Utah (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

Online Latent Structure Training for Language Acquisition

Connor, Michael James (University of Illinois) | Fisher, Cynthia (University of Illinois) | Roth, Dan (University of Illinois)

A fundamental step in sentence comprehension involves assigning semantic roles to sentence constituents. To accomplish this, the listener must parse the sentence, find constituents that are candidate arguments, and assign semantic roles to those constituents. Where do children learning their first languages begin in solving this problem? Even assuming children can derive a rough meaning for the sentence from the situation, how do they begin to map this meaning to the structure and the structure to the form of the sentence? In this paper we use feedback from a semantic role labeling (SRL) task to improve the intermediate syntactic representations that feed the SRL. We accomplish this by training an intermediate classifier using signals derived from latent structure optimization techniques. By using a separate classifier to predict internal structure we see benefits due to knowledge embedded in the classifier's feature representation. This extra structure allows the system to begin to learn using weaker, more plausible semantic feedback.

argument, classifier, predicate

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Illinois (0.05)
North America > United States > New Jersey > Bergen County > Mahwah (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.91)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)

Semantic Relationship Discovery with Wikipedia Structure

Bu, Fan (Tsinghua University) | Hao, Yu (Tsinghua University) | Zhu, Xiaoyan (Tsinghua University)

Thanks to the idea of social collaboration, Wikipedia has accumulated a vast amount of semi-structured knowledge in which the link structure reflects, to some extent, human cognition of semantic relationships. In this paper, we propose a novel method, RCRank, to jointly compute concept-concept relatedness and concept-category relatedness, based on the assumption that information carried in concept-concept links and concept-category links can mutually reinforce each other. Unlike previous work, RCRank can not only find semantically related concepts but also interpret their relations by categories. Experimental results on concept recommendation and relation interpretation show that our method substantially outperforms classical methods.

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

Europe > Germany (0.05)
Europe > France (0.04)
Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)

Genre: Research Report (0.34)

Industry: Government (0.31)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)