AITopics

0709.1167

Country: North America > United States > Massachusetts (0.14)

Genre: Overview (0.69)

Industry:

Telecommunications > Networks (0.34)
Information Technology > Networks (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

McCallum, A., Wang, X., Corrada-Emmanuel, A.

Topic and Role Discovery in Social Networks with Experiments on Enron and Academic Email

Journal of Artificial Intelligence Research, Oct-13-2007

Previous work in social network analysis (SNA) has modeled the existence of links from one entity to another, but not the attributes such as language content or topics on those links. We present the Author-Recipient-Topic (ART) model for social network analysis, which learns topic distributions based on the direction-sensitive messages sent between entities. The model builds on Latent Dirichlet Allocation (LDA) and the Author-Topic (AT) model, adding the key attribute that the distribution over topics is conditioned distinctly on both the sender and recipient, steering the discovery of topics according to the relationships between people. We give results on both the Enron email corpus and a researcher's email archive, providing evidence not only that clearly relevant topics are discovered, but that the ART model better predicts people's roles and gives lower perplexity on previously unseen messages. We also present the Role-Author-Recipient-Topic (RART) model, an extension to ART that explicitly represents people's roles.
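The conditioning on sender and recipient described above can be illustrated with a small generative sketch. It assumes the per-word process usually attributed to ART: pick one of the message's recipients uniformly, draw a topic from the distribution for that (author, recipient) pair, then draw a word from that topic. The function name `generate_message`, the parameters `theta` and `phi`, and all toy values are hypothetical and not taken from the paper; no inference or learning is shown.

```python
import random
from typing import Dict, List, Sequence, Tuple


def generate_message(author: str,
                     recipients: Sequence[str],
                     theta: Dict[Tuple[str, str], List[float]],
                     phi: List[List[float]],
                     vocab: Sequence[str],
                     n_words: int,
                     rng: random.Random) -> List[str]:
    """Sample the words of one message (illustrative sketch only)."""
    words = []
    for _ in range(n_words):
        # Assumption: each word is tied to one recipient chosen uniformly.
        recipient = rng.choice(recipients)
        # The topic distribution depends on the (author, recipient) pair.
        topic_weights = theta[(author, recipient)]
        topic = rng.choices(range(len(topic_weights)), weights=topic_weights)[0]
        # The word is drawn from the chosen topic's word distribution.
        words.append(rng.choices(vocab, weights=phi[topic])[0])
    return words


# Toy parameters (hypothetical): two topics over a four-word vocabulary.
vocab = ["meeting", "contract", "paper", "review"]
phi = [[0.5, 0.5, 0.0, 0.0],   # topic 0: business words
       [0.0, 0.0, 0.5, 0.5]]   # topic 1: academic words
theta = {("alice", "bob"): [1.0, 0.0],
         ("alice", "carol"): [0.0, 1.0]}

message = generate_message("alice", ["bob"], theta, phi, vocab, 5, random.Random(0))
```

With these toy values, a message from alice to bob can only contain business words, while the same author writing to carol would produce academic words.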

art model, mccallum, topic and role discovery, (12 more...)

doi: 10.1613/jair.2229

AI Access Foundation

10515

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Information Technology > Services (0.92)
Energy > Power Industry (0.72)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)
(3 more...)

Journal of Artificial Intelligence Research, Oct-12-2007

Compressed Pattern Databases

Felner, A., Korf, R. E., Meshulam, R., Holte, R. C.

A pattern database (PDB) is a heuristic function implemented as a lookup table that stores the lengths of optimal solutions for subproblem instances. Standard PDBs have a distinct entry in the table for each subproblem instance. In this paper we investigate compressing PDBs by merging several entries into one, thereby allowing the use of PDBs that exceed available memory in their uncompressed form. We introduce a number of methods for determining which entries to merge and discuss their relative merits. These vary from domain-independent approaches that allow any set of entries in the PDB to be merged, to more intelligent methods that take into account the structure of the problem. The choice of the best compression method is based on domain-dependent attributes. We present experimental results on a number of combinatorial problems, including the four-peg Towers of Hanoi problem, the sliding-tile puzzles, and the Top-Spin puzzle. For the Towers of Hanoi, we show that the search time can be reduced by up to three orders of magnitude by using compressed PDBs compared to uncompressed PDBs of the same size. More modest improvements were observed for the other domains.
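The idea of merging several entries into one can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a domain-independent rule that merges each run of `k` adjacent table indices and stores the minimum of the merged values, which is the usual way to keep the compressed heuristic from overestimating. The names `compress_pdb` and `lookup` and the toy table are hypothetical.

```python
from typing import List


def compress_pdb(pdb: List[int], k: int) -> List[int]:
    """Merge each run of k adjacent entries into one entry.

    Assumption: the merged entry keeps the minimum of the originals, so
    the compressed lookup never exceeds the uncompressed value.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return [min(pdb[i:i + k]) for i in range(0, len(pdb), k)]


def lookup(compressed: List[int], index: int, k: int) -> int:
    """Return the heuristic value for an original (uncompressed) index."""
    return compressed[index // k]


# Toy table (hypothetical values): eight entries merged in pairs.
pdb = [0, 1, 1, 2, 3, 3, 4, 2]
compressed = compress_pdb(pdb, 2)
```

The compressed table uses a fraction `1/k` of the memory. The cost is that some lookups return a weaker (smaller) value than the original entry, which is why the choice of entries to merge matters.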

compression, mapping, pdb, (17 more...)

doi: 10.1613/jair.2241

AI Access Foundation

10514

Country:

Asia > Vietnam > Hanoi > Hanoi (0.46)
North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Recanati, Catherine, Rogovschi, Nicoleta, Bennani, Younès

The structure of verbal sequences analyzed with unsupervised learning techniques

arXiv.org Artificial Intelligence, Oct-12-2007

Data mining allows the exploration of sequences of phenomena, whereas one usually tends to focus on isolated phenomena or on the relation between two phenomena. It offers invaluable tools for theoretical analyses and exploration of the structure of sentences, texts, dialogues, and speech. We report here the results of an attempt at using it for inspecting sequences of verbs from French accounts of road accidents. This analysis comes from an original approach to unsupervised training that allows the discovery of the structure of sequential data. The input to the analyzer consisted only of the verbs appearing in the sentences. It provided a classification of the links between two successive verbs into four distinct clusters, thus allowing text segmentation. We give here an interpretation of these clusters by comparing the statistical distribution of independent semantic annotations.

artificial intelligence, machine learning, verb, (17 more...)

0710.2446

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Thampi, Sabu M., Sekaran, K. Chandra

A System for Predicting Subcellular Localization of Yeast Genome Using Neural Network

arXiv.org Artificial Intelligence, Oct-11-2007

The subcellular location of a protein can provide valuable information about its function. With the rapid increase of sequenced genomic data, the need for an automated and accurate tool to predict subcellular localization becomes increasingly important. Many efforts have been made to predict protein subcellular localization. This paper aims to combine artificial neural networks and bioinformatics to predict the location of proteins in the yeast genome. We introduce a new subcellular prediction method based on a backpropagation neural network. The results show that prediction within an error limit of 5 to 10 percent can be achieved with the system.

artificial intelligence, machine learning, protein, (15 more...)

0710.2227

Genre: Research Report (0.71)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.91)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Ponzetto, S. P., Strube, M.

Knowledge Derived From Wikipedia For Computing Semantic Relatedness

Journal of Artificial Intelligence Research, Oct-10-2007

Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet on some datasets. We also address the question of whether and how Wikipedia can be integrated into NLP applications as a knowledge base. Including Wikipedia improves the performance of a machine-learning-based coreference resolution system, indicating that it represents a valuable resource for NLP applications. Finally, we show that our method can be easily used for languages other than English by computing semantic relatedness for a German dataset.

doi: 10.1613/jair.2308

AI Access Foundation

10513

Country:

Asia > South Korea (0.28)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(34 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Ravanbakhsh, Mohsen, Abbasi-Yadkori, Yasin, Abbaspour, Maghsoud, Sarbazi-Azad, Hamid

A Heuristic Routing Mechanism Using a New Addressing Scheme

arXiv.org Artificial Intelligence, Oct-10-2007

Current methods of routing are based on network information in the form of routing tables, in which routing protocols determine how to update the tables according to network changes. Despite the variability of data in routing tables, node addresses are constant. In this paper, we first introduce the new concept of variable addresses, which results in a novel framework to cope with routing problems using heuristic solutions. We then propose a heuristic routing mechanism based on the application of genes for determining network addresses in a variable address network, and describe how this method flexibly solves different problems and suggests new ideas for providing integral solutions to a variety of problems. Simulation results are more supportive in the case of ad-hoc networks, for which original solutions have been proposed for issues like mobility.

neighbor, node, similarity, (16 more...)

0710.1924

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Michigan (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Telecommunications > Networks (0.91)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.46)

Konstantopoulos, Stasinos

What's in a Name?

arXiv.org Artificial Intelligence, Oct-8-2007

This paper describes experiments on identifying the language of a single name in isolation or in a document written in a different language. A new corpus has been compiled and made available, matching names against languages. This corpus is used in a series of experiments measuring the performance of general language models and names-only language models on the language identification task. Conclusions are drawn from the comparison between using general language models and names-only language models and between identifying the language of isolated names and the language of very short document fragments. Future research directions are outlined.

artificial intelligence, natural language, text processing, (20 more...)

0710.1481

Country:

Europe (1.00)
North America > United States (0.28)
North America > Canada (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.49)

Sierocinski, Thomas, Béchec, Anthony Le, Théret, Nathalie, Petritis, Dimitri

Semantic distillation: a method for clustering objects by their contextual specificity

arXiv.org Machine Learning, Oct-6-2007

Techniques for data mining, latent semantic analysis, contextual search of databases, etc. were developed long ago by computer scientists working on information retrieval (IR). Experimental scientists from all disciplines, who must analyse large collections of raw experimental data (astronomical, physical, biological, etc.), have developed powerful methods for their statistical analysis and for clustering, categorising, and classifying objects. Finally, physicists have developed a theory of quantum measurement, unifying the logical, algebraic, and probabilistic aspects of queries into a single formalism. The purpose of this paper is twofold: first, to show that when formulated at an abstract level, problems from IR, from statistical data analysis, and from physical measurement theories are very similar and hence can profitably be cross-fertilised; and second, to propose a novel method of fuzzy hierarchical clustering, termed semantic distillation and strongly inspired by the theory of quantum measurement, which we developed to analyse raw data coming from various types of experiments on DNA arrays. We illustrate the method by analysing DNA array experiments and clustering the genes of the array according to their specificity.

graph, information, representation, (15 more...)

0710.1203

Country:

North America > United States > New York (0.05)
Europe > Netherlands > South Holland > Dordrecht (0.05)
Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
(5 more...)

Genre: Research Report (0.70)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.48)

Jung, Kyomin, Shah, Devavrat

Local approximate inference algorithms

arXiv.org Artificial Intelligence, Oct-2-2007

We present a new local approximation algorithm for computing the Maximum a Posteriori (MAP) assignment and the log-partition function for an arbitrary exponential family distribution represented by a finite-valued pair-wise Markov random field (MRF), say $G$. Our algorithm is based on decomposing $G$ into appropriately chosen small components, computing estimates locally in each of these components, and then producing a good global solution. We show that if the underlying graph $G$ either excludes some finite-sized graph as its minor (e.g., a planar graph) or has low doubling dimension (e.g., any graph with geometry), then our algorithm will produce a solution for both questions within arbitrary accuracy. We present a message-passing implementation of our algorithm for MAP computation using the self-avoiding walk of the graph. To evaluate the computational cost of this implementation, we derive novel tight bounds on the size of the self-avoiding walk tree for an arbitrary graph. As a consequence of our algorithmic result, we show that the normalized log-partition function (also known as free energy) for a class of regular MRFs converges to a limit that is computable to arbitrary accuracy.

algorithm, artificial intelligence, machine learning, (16 more...)

cs/0610111

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)