AITopics

Negation is present in all human languages and is used to reverse the polarity of parts of a statement. It is a complex phenomenon that interacts with many other aspects of language. Besides their direct meaning, negated statements often carry a latent positive meaning. Negation can be interpreted in terms of its scope and focus. This paper explores the importance of both scope and focus in capturing the meaning of negated statements. Some issues in detecting negation in text are outlined, the forms in which negation occurs are described, and heuristics to detect its scope and focus are proposed.

negation, representation, semantic representation, (15 more...)

Country:

North America > Mexico > Quintana Roo > Cancún (0.05)
South America > Brazil (0.04)
North America > United States > Texas > Dallas County > Richardson (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.96)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.94)

Galitsky, Boris (University of Girona) | Rose, Josep Lluis de la (University of Girona) | Dobrocsi, Gabor (University of Miskolc)

Mapping Syntactic to Semantic Generalizations of Linguistic Parse Trees

We define sentence generalization and generalization diagrams as a special case of least general generalization (LGG) applied to linguistic parse trees. A similarity measure between linguistic parse trees is developed as an LGG operation on the lists of sub-trees of these trees. The diagrams introduced represent the mapping between the syntactic generalization level and the semantic generalization level. Generalization diagrams are intended as a framework for computing semantic similarity between texts from linguistic parse tree data. Such a structured approach significantly improves text relevance assessment in a horizontal domain, where ontologies are not available.

customer, expression, generalization, (16 more...)

Country:

Europe > Hungary > Borsod-Abaúj-Zemplén County > Miskolc (0.04)
Oceania > Australia (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(6 more...)

Industry: Banking & Finance (0.95)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Building Integrated Opinion Delivery Environment

Galitsky, Boris (University of Girona) | Rose, Josep Lluis de la (Universitat de Girona) | Dobrocsi, Gabor (University of Miskolc)

We introduce a search engine and information retrieval system for providing access to opinion data. A natural language technique, generalization of syntactic parse trees, is introduced as a similarity measure between the subjects of textual opinions so that they can be linked on the fly. An information extraction algorithm for automatic summarization of web pages in the format of Google sponsored links is presented. We outline the usability of the implemented system, the integrated opinion delivery environment (IODE).

advertisement, expression, generalization, (12 more...)

Country:

Europe > Spain > Catalonia > Girona Province > Girona (0.05)
Europe > Hungary > Borsod-Abaúj-Zemplén County > Miskolc (0.05)
Oceania > Australia (0.04)
(2 more...)

Industry:

Banking & Finance (0.69)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.90)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.55)

Mulkar-Mehta, Rutu (University of Southern California Information Sciences Institute (USC-ISI)) | Welty, Christopher (IBM Watson Research Center) | Hobbs, Jerry (University of Southern California Information Sciences Institute (USC-ISI)) | Hovy, Eduard (University of Southern California Information Sciences Institute (USC-ISI))

Using Part-Of Relations for Discovering Causality

Historically, causal markers, syntactic structures and connectives have been the sole identifying features for automatically extracting causal relations in natural language discourse. However, various connectives such as “and,” prepositions such as “as,” and other syntactic structures are highly ambiguous, and it is clear that one cannot rely solely on lexico-syntactic markers to detect causal phenomena in discourse. This paper introduces the theory of granularity and describes different approaches to identifying granularity in natural language. As causality is often granular in nature, we use granularity relations to discover and infer the presence of causal relations in text. We compare this with causal relations identified using just causal markers. We achieve a precision of 0.91 and a recall of 0.79 using granularity for causal relation detection, as compared to a precision of 0.79 and a recall of 0.44 using pure causal markers for causality detection.
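
For a single-number comparison, the two precision/recall pairs quoted above can be combined into an F1 score (the harmonic mean of precision and recall). F1 is not reported in the abstract; the sketch below is only an illustration, and the name `f1_score` is a placeholder rather than anything from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall figures as quoted in the abstract.
granularity_f1 = f1_score(0.91, 0.79)   # granularity-based detection
markers_f1 = f1_score(0.79, 0.44)       # causal markers only

print(f"granularity: {granularity_f1:.2f}, causal markers: {markers_f1:.2f}")
```

Under this derived measure the granularity-based approach comes out at roughly 0.85 against roughly 0.57 for causal markers alone, which is consistent with the gap the abstract reports in recall.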

causal relation, granularity, relation, (14 more...)

Country:

North America > United States > New York (0.14)
North America > United States > California > San Francisco County > San Francisco (0.06)
North America > United States > Virginia (0.04)
(3 more...)

Industry: Leisure & Entertainment > Sports > Football (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

arXiv.org Machine Learning Apr-28-2011

Notes on a New Philosophy of Empirical Science

Burfoot, Daniel

This book presents a methodology and philosophy of empirical science based on large scale lossless data compression. In this view a theory is scientific if it can be used to build a data compression program, and it is valuable if it can compress a standard benchmark database to a small size, taking into account the length of the compressor itself. This methodology therefore includes an Occam principle as well as a solution to the problem of demarcation. Because of the fundamental difficulty of lossless compression, this type of research must be empirical in nature: compression can only be achieved by discovering and characterizing empirical regularities in the data. Because of this, the philosophy provides a way to reformulate fields such as computer vision and computational linguistics as empirical sciences: the former by attempting to compress databases of natural images, the latter by attempting to compress large text databases. The book argues that the rigor and objectivity of the compression principle should set the stage for systematic progress in these fields. The argument is especially strong in the context of computer vision, which is plagued by chronic problems of evaluation. The book also considers the field of machine learning. Here the traditional approach requires that the models proposed to solve learning problems be extremely simple, in order to avoid overfitting. However, the world may contain intrinsically complex phenomena, which would require complex models to understand. The compression philosophy can justify complex models because of the large quantity of data being modeled (if the target database is 100 Gb, it is easy to justify a 10 Mb model). The complex models and abstractions learned on the basis of the raw data (images, language, etc) can then be reused to solve any specific learning problem, such as face recognition or machine translation.
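
The scoring rule described above, compressed size plus the length of the compressor itself, can be sketched in a few lines. This is a minimal illustration rather than the book's procedure: `zlib` stands in for a theory-specific compressor, and the function name `total_codelength`, the sample text, and the 2,000-byte compressor size are all assumptions made for the example.

```python
import zlib


def total_codelength(data: bytes, compressor_size: int, level: int = 9) -> int:
    """Compressed size of `data` plus the size of the compressor, in bytes."""
    return len(zlib.compress(data, level)) + compressor_size


# Toy "benchmark database": highly regular text, so it compresses well.
sample = b"the cat sat on the mat. " * 1000
raw = len(sample)

# 2_000 bytes is a made-up compressor size used only for illustration.
cost = total_codelength(sample, compressor_size=2_000)

print(f"raw: {raw} bytes, compressed + compressor: {cost} bytes")
```

A lower total means a better theory under this criterion. The same accounting explains why a large model can be justified on a large database: in the abstract's example, a 10 Mb model adds only about 0.01% to a 100 Gb target.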

artificial intelligence, machine learning, natural language, (25 more...)

arXiv.org Machine Learning

1104.5466

Country:

Europe (0.67)
Asia (0.67)
North America > United States > New York (0.45)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Media (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
(11 more...)

arXiv.org Artificial Intelligence Apr-19-2011

Understanding Exhaustive Pattern Learning

Shen, Libin

Pattern learning is an important problem in Natural Language Processing (NLP). Some exhaustive pattern learning (EPL) methods (Bod, 1992) were proved to be flawed (Johnson, 2002), while similar algorithms (Och and Ney, 2004) showed great advantages on other tasks, such as machine translation. In this article, we first formalize EPL, and then show that the probability given by an EPL model is a constant-factor approximation of the probability given by an ensemble method that integrates an exponential number of models obtained with various segmentations of the training data. This work provides, for the first time, a theoretical justification for the widely used EPL algorithm in NLP, which was previously viewed as a flawed heuristic method. A better understanding of EPL may lead to improved pattern learning algorithms in the future.

machine learning, natural language, segmentation, (18 more...)

arXiv.org Artificial Intelligence

1104.3929

Country: North America > United States (0.68)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)

Sahay, Saurav (Georgia Institute of Technology) | Ram, Ashwin (Georgia Institute of Technology)

Socio-Semantic Health Information Access

AAAI Conferences Mar-19-2011

We describe Cobot, a mixed-initiative socio-semantic conversational search and recommendation system for finding health information. With Cobot, users can start a real-time conversation about their health concerns. Cobot then connects relevant users in the conversation and provides contextual recommendations relevant to it. Conventional search engines and content portals provide a solitary search experience, inundating the health information seeker with a hoard of information that often confuses and frustrates them. Cobot brings relevant healthcare information to users, directly or through other users, via natural language conversation and without any search.

bioinformatics, information, machine learning, (19 more...)

2011 AAAI Spring Symposium Series

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.14)
South America > Brazil > Ceará > Fortaleza (0.04)
North America > United States > New York > New York County > New York City (0.04)

Industry: Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Biomedical Informatics > Clinical Informatics (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(5 more...)

Journal of Artificial Intelligence Research Feb-25-2011

Narrowing the Modeling Gap: A Cluster-Ranking Approach to Coreference Resolution

Rahman, A., Ng, V.

Traditional learning-based coreference resolvers operate by training the mention-pair model for determining whether two mentions are coreferent or not. Though conceptually simple and easy to understand, the mention-pair model is linguistically rather unappealing and lags far behind the heuristic-based coreference models proposed in the pre-statistical NLP era in terms of sophistication. Two independent lines of recent research have attempted to improve the mention-pair model, one by acquiring the mention-ranking model to rank preceding mentions for a given anaphor, and the other by training the entity-mention model to determine whether a preceding cluster is coreferent with a given mention. We propose a cluster-ranking approach to coreference resolution, which combines the strengths of the mention-ranking model and the entity-mention model, and is therefore theoretically more appealing than both of these models. In addition, we seek to improve cluster rankers via two extensions: (1) lexicalization and (2) incorporating knowledge of anaphoricity by jointly modeling anaphoricity determination and coreference resolution. Experimental results on the ACE data sets demonstrate the superior performance of cluster rankers to competing approaches as well as the effectiveness of our two extensions.

coreference model, coreference resolution, mention-pair model, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3120

AI Access Foundation

10694

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Education (0.47)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(4 more...)

arXiv.org Artificial Intelligence Feb-25-2011

Universal Higher Order Grammar

Gluzberg, Victor

We examine the class of languages that can be defined entirely in terms of provability in an extension of the sorted type theory (Ty_n) by embedding the logic of phonologies, without introducing special types for syntactic entities. This class is proven to coincide precisely with the class of logically closed languages that may be thought of as functions from expressions to sets of logically equivalent Ty_n terms. For a specific sub-class of logically closed languages that are described by finite sets of rules or rule schemata, we find effective procedures for building a compact Ty_n representation involving a finite number of axioms or axiom schemata. The proposed formalism is characterized by some useful features unavailable in a two-component architecture of a language model. A further specialization and extension of the formalism with a context type enable an effective account of intensional and dynamic semantics.

artificial intelligence, logic & formal reasoning, natural language, (19 more...)

arXiv.org Artificial Intelligence

1102.5185

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Johnson, Mark, Demuth, Katherine, Jones, Bevan, Black, Michael J.

Synergies in learning words and their referents

Neural Information Processing Systems Dec-31-2010

This paper presents Bayesian non-parametric models that simultaneously learn to segment words from phoneme strings and learn the referents of some of those words, and shows that there is a synergistic interaction in the acquisition of these two kinds of linguistic information. The models themselves are novel kinds of Adaptor Grammars that are an extension of an embedding of topic models into PCFGs. These models simultaneously segment phoneme sequences into words and learn the relationship between non-linguistic objects and the words that refer to them. We show (i) that modelling inter-word dependencies improves the accuracy not only of the word segmentation but also of the word-object relationships, and (ii) that a model that simultaneously learns word-object relationships and word segmentation segments more accurately than one that learns word segmentation on its own. We argue that these results support an interactive view of language acquisition that can take advantage of synergies such as these.

machine learning, natural language, word segmentation, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)