AITopics

Scientists typically need to take a large volume of information into account to deal with recurring tasks such as inspecting proceedings, finding related work, or reviewing papers. Our work aims to fill the gap between text documents and structured representations of their content in the domain of resilience computing by combining computational linguistics and ontological methods. The results of our research include: a thesaurus of the domain, automatic clustering of the domain documents, a domain ontology, and a tool for constructing ontologies with the aid of domain thesauri.

mapping, ontology, thesaurus, (13 more...)

Country:

Europe > Romania (0.04)
Europe > Lithuania > Kaunas County > Kaunas (0.04)
Europe > Germany > Saarland (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Extracting Meaning from Cell Phone Improvement Ideas

Turner, Jenine (Athenahealth) | Lencevicius, Raimondas (Qwobl) | Adler, Mark (Nokia Research Center)

Numerous companies nowadays gather product improvement ideas. Reviewing all of the resulting thousands of ideas without tools would require a great deal of time and resources. Automatic tools can help these reviewers in a number of ways. The questions we address here are categorization, finding common ideas, and finding idea trends over time. We explore techniques to answer these questions using ...

There are two additional modifications we use to adjust our feature set that provide improvements over the original feature counts. The first is based upon our assumption that words in the title are more important than words in the other text fields. We simply weight unigrams and bigrams that appear in the title ten times as heavily as those that appear in the rest of the text.
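The title weighting described in this excerpt can be illustrated with a minimal sketch. It assumes a naive whitespace tokenizer and plain n-gram counts as features; the function names and the sample idea text are hypothetical and are not taken from the paper.

```python
from collections import Counter

TITLE_WEIGHT = 10  # from the excerpt: title n-grams count ten times as heavily


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def weighted_features(title, body, title_weight=TITLE_WEIGHT):
    """Count unigrams and bigrams, weighting title occurrences more heavily."""
    features = Counter()
    for text, weight in ((title, title_weight), (body, 1)):
        tokens = text.lower().split()  # assumption: whitespace tokenization
        for n in (1, 2):
            for gram in ngrams(tokens, n):
                features[gram] += weight
    return features


# Hypothetical improvement idea with a title field and a body field.
feats = weighted_features("better battery", "the battery drains fast")
```

In this sketch the unigram `("battery",)` appears once in the title and once in the body, so it receives a combined weight of 11, while body-only n-grams keep a weight of 1.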

category, classification, probability, (12 more...)

Country: North America > United States > Massachusetts (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.74)

Hidden Markov Random Fields Based LSI Text Semi-supervised Clustering

Min, Kerui (Fudan University) | Liu, Gang (Fudan University) | Chen, Xin (Nanjing University) | Lu, Shengqi (Fudan University)

Semi-supervised learning is an active research field. Previous results have shown that incorporating background information into the original unsupervised clustering problem can achieve higher accuracy. In this paper, we explore the cooperation between the pairwise constraints given by the user and the semantic information in natural language. In addition, we reduce the time complexity to make the algorithm feasible for large quantities of data. Experiments on corpora of different scales show the robustness and effectiveness of the proposed algorithm, whose F-measure is 20% higher than that of previous algorithms.

algorithm, constraint, hidden markov random field, (11 more...)

Country:

Asia > China > Shanghai > Shanghai (0.06)
Asia > China > Jiangsu Province > Nanjing (0.05)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.71)

Hierarchical Soft Clustering and Automatic Text Summarization for Accessing the Web on Mobile Devices for Visually Impaired People

Dias, Gaël Harry (University of Beira Interior) | Pais, Sebastião (University of Beira Interior) | Cunha, Fernando (University of Beira Interior) | Costa, Hugo (University of Beira Interior) | Machado, David (University of Beira Interior) | Barbosa, Tiago (University of Beira Interior) | Martins, Bruno (University of Beira Interior)

In this paper, we propose a universal solution to web search and web browsing on handheld devices for visually impaired people. For this purpose, we propose (1) to automatically cluster web page results and (2) to summarize all the information in web pages so that speech-to-speech interaction is used efficiently to access information.

algorithm, information, snippet, (13 more...)

Country:

North America > United States > Florida > Monroe County > Key West (0.04)
Europe > United Kingdom > England > South Yorkshire > Sheffield (0.04)
Europe > Portugal (0.04)
(3 more...)

Industry: Health & Medicine (0.71)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science (0.95)
(2 more...)

Millar, Jeremy R. (Air Force Institute of Technology) | Peterson, Gilbert L. (Air Force Institute of Technology) | Mendenhall, Michael J. (Air Force Institute of Technology)

Document Clustering and Visualization with Latent Dirichlet Allocation and Self-Organizing Maps

Clustering and visualization of large text document collections aid in browsing, navigation, and information retrieval. We present a document clustering and visualization method based on Latent Dirichlet Allocation and self-organizing maps (LDA-SOM). LDA-SOM clusters documents based on topical content and renders clusters in an intuitive two-dimensional format. Document topics are inferred using a probabilistic topic model. Then, due to the topology-preserving properties of self-organizing maps, document clusters with similar topic distributions are placed near one another in the visualization. This gives the user an intuitive means of browsing from one cluster to another based on topics held in common. The effectiveness of LDA-SOM is evaluated on the 20 Newsgroups and NIPS data sets.

document collection, topic distribution, vector, (15 more...)

Country:

Asia > Middle East > Jordan (0.06)
North America > United States > New York (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
(3 more...)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.89)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Griffith, Obi L., Gao, Byron J., Bilenky, Mikhail, Prichyna, Yuliya, Ester, Martin, Jones, Steven J. M.

KiWi: A Scalable Subspace Clustering Algorithm for Gene Expression Analysis

arXiv.org Artificial Intelligence, Apr-13-2009

Numerous studies have used coexpression of large expression datasets to infer functional associations between genes [1], to identify groups of related genes that are important in specific cancers or represent common tumour progression mechanisms [2], to study evolutionary change [3], for integration with other large-scale datasets [4][5][6], and for the generation of high-quality biological interaction networks [7][8][9][10]. A number of studies have also attempted to use coexpression to identify coregulation, on the hypothesis that if two or more genes are expressed at the same time and location and at similar levels, then they may be regulated by the same transcription factors and regulatory elements. This approach has shown promise, particularly in simpler model organisms such as A. thaliana and S. cerevisiae [11][12][13][14], and many groups are currently working on implementing this idea in mammalian systems. However, traditional clustering methods have not worked particularly well on large datasets for this problem. Most methods assign each gene to only one cluster, while in reality many genes likely take part in multiple processes. Also, global coexpression is measured across all conditions, whereas it is probable that most genes are only tightly coregulated under certain conditions or locations. In recent years, a new field of clustering analysis termed subspace clustering (or biclustering) has gained increasing popularity in the analysis of gene expression data and other biological data [15][16][17][18][19]. In contrast to traditional clustering methods such as hierarchical clustering, subspace clustering methods do not require expression to be correlated across all conditions for genes to be assigned to the same cluster. This has several advantages for data in which biologically relevant subsets exist (e.g.
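The contrast between global coexpression and coexpression restricted to a subset of conditions can be seen in a small sketch. This is not the KiWi algorithm; it only compares Pearson correlation over all conditions with correlation over a chosen subset. The expression values and the condition subset are made-up illustrative numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Hypothetical expression levels of two genes across eight conditions.
gene_a = [1.0, 2.0, 3.0, 4.0, 2.0, 5.0, 1.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0, 5.0, 1.0, 4.0, 2.0]

# Assumption: the genes are coregulated only under the first four conditions.
subset = [0, 1, 2, 3]

global_r = pearson(gene_a, gene_b)
subset_r = pearson([gene_a[i] for i in subset], [gene_b[i] for i in subset])
```

Here the correlation across all eight conditions is close to zero, while the correlation restricted to the subset is essentially 1. A method that requires correlation across all conditions would miss this pair; a subspace method searches for such condition subsets rather than being handed one.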

artificial intelligence, dataset, machine learning, (16 more...)

arXiv:0904.1931

Country:

North America > Canada (0.04)
North America > United States > Texas (0.04)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.30)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Journal of Artificial Intelligence Research, Mar-24-2009

Unsupervised Methods for Determining Object and Relation Synonyms on the Web

Yates, A., Etzioni, O.

The task of identifying synonymous relations and objects, or synonym resolution, is critical for high-quality information extraction. This paper investigates synonym resolution in the context of unsupervised information extraction, where neither hand-tagged training examples nor domain knowledge is available. The paper presents a scalable, fully implemented system that runs in O(KN log N) time in the number of extractions, N, and the maximum number of synonyms per word, K. The system, called Resolver, introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them. On a set of two million assertions extracted from the Web, Resolver resolves objects with 78% precision and 68% recall, and resolves relations with 90% precision and 35% recall. Several variations of Resolver's probabilistic model are explored, and experiments demonstrate that under appropriate conditions these variations can improve F1 by 5%. An extension to the basic Resolver system allows it to handle polysemous names with 97% precision and 95% recall on a data set from the TREC corpus.
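Since the abstract reports precision and recall but states improvements in terms of F1, it can help to see how the two relate. The sketch below applies the standard F1 formula (the harmonic mean of precision and recall) to the figures quoted above. The resulting F1 values are derived here for illustration and are not reported in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall pairs as quoted in the abstract.
reported = {
    "objects": (0.78, 0.68),
    "relations": (0.90, 0.35),
    "polysemous names": (0.97, 0.95),
}

f1_by_task = {task: f1_score(p, r) for task, (p, r) in reported.items()}

for task, value in f1_by_task.items():
    print(f"{task}: F1 = {value:.3f}")
```

Under this formula the derived values are roughly 0.73 for objects, 0.50 for relations, and 0.96 for polysemous names. The low recall on relations pulls that F1 well below its 90% precision.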

extraction, relation, resolver, (17 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2772

AI Access Foundation

10591

Country:

North America > United States > Virginia (0.04)
North America > United States > West Virginia (0.04)
North America > United States > District of Columbia > Washington (0.04)
(5 more...)

Genre:

Overview (0.85)
Research Report > New Finding (0.67)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(4 more...)

Niennattrakul, Vit, Ratanamahatana, Chotirat Ann

Learning DTW Global Constraint for Time Series Classification

arXiv.org Artificial Intelligence, Feb-28-2009

1-Nearest Neighbor with the Dynamic Time Warping (DTW) distance is one of the most effective classifiers in the time series domain. Since the global constraint was introduced in the speech community, many global constraint models have been proposed, including the Sakoe-Chiba (S-C) band, the Itakura Parallelogram, and the Ratanamahatana-Keogh (R-K) band. The R-K band is a general global constraint model that can effectively represent global constraints of arbitrary shape and size. However, we need a good learning algorithm to discover the most suitable set of R-K bands, and the current R-K band learning algorithm still suffers from an 'overfitting' phenomenon. In this paper, we propose two new learning algorithms: a band boundary extraction algorithm and an iterative learning algorithm. The band boundary extraction is calculated from the bound of all possible warping paths in each class, and the iterative learning is adjusted from the original R-K band learning. We also use a Silhouette index, a well-known clustering validation technique, as a heuristic function, and the lower bound function, LB_Keogh, to enhance the prediction speed. Twenty datasets from the Workshop and Challenge on Time Series Classification, held in conjunction with SIGKDD 2007, are used to evaluate our approach.
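To make the role of a global constraint concrete, the following minimal sketch computes DTW under a fixed-width Sakoe-Chiba band, together with an LB_Keogh-style lower bound. It does not implement the R-K band or either learning algorithm from the paper: an R-K band would let the width vary per index, whereas here `band` is a single assumed constant. The function names, the squared-difference cost, and the sample series are illustrative choices.

```python
import math


def dtw_distance(a, b, band):
    """DTW distance between two equal-length series under a Sakoe-Chiba band.

    `band` is the largest allowed |i - j| between aligned indices.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band around the diagonal are filled in.
        for j in range(max(1, i - band), min(m, i + band) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def lb_keogh(query, candidate, band):
    """Lower bound on the banded DTW distance using an envelope around `candidate`."""
    total = 0.0
    for i, q in enumerate(query):
        window = candidate[max(0, i - band): i + band + 1]
        upper, lower = max(window), min(window)
        if q > upper:
            total += (q - upper) ** 2
        elif q < lower:
            total += (q - lower) ** 2
    return math.sqrt(total)


# Two illustrative series: `b` is roughly `a` delayed by one step.
a = [0, 1, 2, 3, 2, 1]
b = [0, 0, 1, 2, 3, 2]

euclidean_like = dtw_distance(a, b, band=0)  # band 0 forces a one-to-one alignment
warped = dtw_distance(a, b, band=1)          # band 1 allows a one-step shift
bound = lb_keogh(a, b, band=1)
```

With a band of 0 the distance reduces to the Euclidean distance, while a band of 1 lets the shifted peaks align and gives a smaller distance. Because the lower bound is cheap to compute and never exceeds the banded DTW distance, a 1-NN classifier can skip the full DTW computation for any candidate whose bound already exceeds the best distance found so far.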

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv:0903.0041

Country:

North America > United States > District of Columbia > Washington (0.04)
North America > United States > California > Orange County > Newport Beach (0.04)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Fournier-Viger, P., Nkambou, R., Nguifo, E. Mephu

A Knowledge Discovery Framework for Learning Task Models from User Interactions in Intelligent Tutoring Systems

arXiv.org Artificial Intelligence, Jan-29-2009

Domain experts should provide relevant domain knowledge to an Intelligent Tutoring System (ITS) so that it can guide a learner during problem-solving learning activities. However, for many ill-defined domains, the domain knowledge is hard to define explicitly. In previous work, we showed how sequential pattern mining can be used to extract a partial problem space from logged user interactions, and how it can support tutoring services during problem-solving exercises. This article describes an extension of this approach to extract a problem space that is richer and better suited to supporting tutoring services. We combined sequential pattern mining with (1) dimensional pattern mining, (2) time intervals, (3) the automatic clustering of valued actions, and (4) closed sequences mining. Some tutoring services have been implemented and an experiment has been conducted in a tutoring system.
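The core quantity behind sequential pattern mining is the support of a pattern, that is, the fraction of logged sequences that contain it as an ordered subsequence. The sketch below shows only that building block under simple assumptions; it does not cover the dimensional, time-interval, clustering, or closed-sequence extensions listed in the abstract. The action names and logs are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order, gaps allowed."""
    remaining = iter(sequence)
    # Each `in` test consumes the iterator up to the match, which enforces order.
    return all(action in remaining for action in pattern)


def support(pattern, sequences):
    """Fraction of sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences) / len(sequences)


# Hypothetical logged interactions from three learners.
logs = [
    ["open", "select", "rotate", "submit"],
    ["open", "rotate", "submit"],
    ["open", "select", "submit"],
]

rotate_support = support(["open", "rotate", "submit"], logs)
submit_support = support(["open", "submit"], logs)
```

A miner would keep patterns whose support meets a chosen threshold. In this toy log, `open` followed by `submit` appears in every sequence, while the pattern that includes `rotate` appears in two of the three.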

artificial intelligence, machine learning, pattern recognition, (17 more...)

doi: 10.1007/978-3-540-88636-5

arXiv:0901.4761

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > France (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Genre: Research Report (0.64)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Chen, Guangliang, Lerman, Gilad

Foundations of a Multi-way Spectral Clustering Framework for Hybrid Linear Modeling

arXiv.org Machine Learning, Jan-14-2009

The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem; however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem, and provides careful analysis to justify it. The TSCC algorithm is essentially a combination of Govindu's multi-way spectral clustering framework (CVPR 2005) and Ng et al.'s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments the different underlying clusters well. The goodness of clustering depends on the within-cluster errors, the between-clusters interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001).

artificial intelligence, equation, machine learning, (18 more...)

doi: 10.1007/s10208-009-9043-7

arXiv:0810.3724

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)