Learning Hash Functions for Cross-View Similarity Search

AAAI Conferences

Many applications in Multilingual and Multimodal Information Access involve searching large databases of high-dimensional data objects with multiple (conditionally independent) views. In this work we consider the problem of learning hash functions for similarity search across the views for such applications. We propose a principled method for learning a hash function for each view given a set of multiview training data objects. The hash functions map similar objects to similar codes across the views, thus enabling cross-view similarity search. We present results from an extensive empirical study of the proposed approach, demonstrating its effectiveness on Japanese-language People Search and Multilingual People Search problems.
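
To make the setup concrete, the sketch below learns one linear projection per view with a CCA-style objective and sign-thresholds the projections into binary codes; the CCA choice, the dimensions, and the toy data are assumptions for illustration, not the authors' exact method:

```python
# A minimal sketch of cross-view hashing, assuming paired training data and a
# CCA-style objective; an illustration, not the paper's exact formulation.
import numpy as np

def cca_projections(X, Y, n_bits, reg=1e-3):
    """Learn one linear projection per view so that paired objects align."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    # Whiten both views, then take top singular directions of the cross-covariance.
    Lx_inv_T = np.linalg.inv(np.linalg.cholesky(Cxx)).T
    Ly_inv_T = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    U, _, Vt = np.linalg.svd(Lx_inv_T.T @ Cxy @ Ly_inv_T)
    return Lx_inv_T @ U[:, :n_bits], Ly_inv_T @ Vt.T[:, :n_bits]

def hash_codes(Z, W):
    """Sign-threshold the projections to get binary codes."""
    return (Z @ W > 0).astype(np.uint8)

# Toy usage: two conditionally independent views of the same 200 objects.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 8))
X = latent @ rng.normal(size=(8, 30)) + 0.1 * rng.normal(size=(200, 30))
Y = latent @ rng.normal(size=(8, 20)) + 0.1 * rng.normal(size=(200, 20))
Wx, Wy = cca_projections(X, Y, n_bits=16)
hx = hash_codes(X - X.mean(0), Wx)
hy = hash_codes(Y - Y.mean(0), Wy)
# Cross-view search: Hamming distance from a query code in view X
# to all database codes in view Y.
hamming = (hx[0] != hy).sum(axis=1)
print("nearest cross-view match for object 0:", hamming.argmin())
```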


Astroinformatics of galaxies and quasars: a new general method for photometric redshifts estimation

arXiv.org Machine Learning

With the availability of the huge amounts of data produced by current and future large multi-band photometric surveys, photometric redshifts have become a crucial tool for extragalactic astronomy and cosmology. In this paper we present a novel method, called Weak Gated Experts (WGE), which derives photometric redshifts through a combination of data mining techniques. The WGE, like many other machine learning techniques, relies on a spectroscopic knowledge base composed of sources for which a spectroscopic value of the redshift is available. The method achieves a variance \sigma^2(\Delta z) = 2.3 \times 10^{-4} for the reconstruction of the photometric redshifts of optical galaxies from the SDSS, and \sigma^2(\Delta z) = 0.08 for optical quasars, where \Delta z = z_{phot} - z_{spec}; the Root Mean Square (RMS) of the \Delta z distributions for the two experiments is 0.021 and 0.35, respectively. The WGE also provides a mechanism for estimating the accuracy of each photometric redshift. We also present and discuss the catalogs obtained for the optical SDSS galaxies, for the optical candidate quasars extracted from the DR7 SDSS photometric dataset (the sample of SDSS sources on which the accuracy of the reconstruction has been assessed is composed of bright sources, for a subset of which spectroscopic redshifts have been measured), and for optical SDSS candidate quasars observed by GALEX in the UV range. The WGE method exploits the new technological paradigm provided by the Virtual Observatory and the emerging field of Astroinformatics.
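
The quoted figures follow from the definition \Delta z = z_{phot} - z_{spec}; the short sketch below computes both statistics on simulated predictions (the z_phot values here are synthetic, standing in for the trained ensemble's output):

```python
# A minimal sketch of the accuracy metrics quoted above, computed on toy data.
import numpy as np

rng = np.random.default_rng(1)
z_spec = rng.uniform(0.0, 0.7, size=10_000)           # spectroscopic "truth"
z_phot = z_spec + rng.normal(0.0, 0.02, size=10_000)  # hypothetical predictions

delta_z = z_phot - z_spec
variance = np.var(delta_z)           # sigma^2(Delta z)
rms = np.sqrt(np.mean(delta_z**2))   # RMS of the Delta z distribution
print(f"sigma^2(Delta z) = {variance:.2e}, RMS = {rms:.3f}")
```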


The Party Is Over Here: Structure and Content in the 2010 Election

AAAI Conferences

In this work, we study the use of Twitter by House, Senate, and gubernatorial candidates during the 2010 midterm elections in the U.S. Our data includes almost 700 candidates and over 690k documents that they produced and cited in the 3.5 years leading up to the elections. We utilize graph and text mining techniques to analyze differences between Democrat, Republican, and Tea Party candidates, and suggest a novel use of language modeling for estimating content cohesiveness. Our findings show significant differences in the usage patterns of social media, and suggest that conservative candidates used this medium more effectively, conveying a coherent message and maintaining a dense graph of connections. Despite the lack of party leadership, we find that Tea Party members display both structural and language-based cohesiveness. Finally, we investigate the relation between network structure, content, and election results by creating a proof-of-concept model that predicts candidate victory with an accuracy of 88.0%.
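
One plausible reading of "language modeling for estimating content cohesiveness" is sketched below: score a group by the average cross-entropy of each member's text under the group's smoothed unigram model, lower meaning more cohesive. This is an illustration, not necessarily the paper's exact formulation:

```python
# Hedged sketch: cohesiveness as mean per-word cross-entropy under a
# smoothed unigram model of the whole group (lower = more cohesive).
from collections import Counter
import math

def unigram_model(texts, alpha=1.0):
    """Add-alpha smoothed unigram model over the group's combined text."""
    counts = Counter(w for t in texts for w in t.split())
    total, vocab = sum(counts.values()), len(counts)
    return lambda w: (counts[w] + alpha) / (total + alpha * (vocab + 1))

def cohesiveness(texts):
    group = unigram_model(texts)
    scores = []
    for t in texts:
        words = t.split()
        scores.append(-sum(math.log2(group(w)) for w in words) / len(words))
    return sum(scores) / len(scores)

# Toy usage with invented snippets: a group repeating one message scores
# lower (more cohesive) than a group with divergent messages.
on_message = ["cut spending lower taxes", "lower taxes cut spending now"]
mixed = ["cut spending lower taxes", "healthcare reform public option"]
print(cohesiveness(on_message), "<", cohesiveness(mixed))
```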


Improving Text Clustering with Social Tagging

AAAI Conferences

Lately several web-based tagging systems such as Technorati, Flickr or Delicious have become very popular. In this paper we will exploit the information created by the community in Delicious: a social bookmarking service where the users can save the URLs of their favourite webpages, offering also the possibility of associating tags to them. On the other hand, clustering methods are a very important data mining tool in order to exploit the knowledge present in data collections. Another important question is the absoluteness of the constraints. Even if we use this approach to turn tags into constraints, a fair amount of them are bound to be inaccurate (i.e., linking documents which should not be in the same cluster) until a high value of the parameter t, due to the polysemy of the terms used as tags or to differences in the criteria of the taggers. Consequently, we have used soft positive constraints, meaning that the documents affected by one of them are likely to be in the same cluster, without forcing the clustering algorithm to actually put them so.
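
A minimal sketch of soft positive constraints in a k-means-style clusterer is given below: violating a must-link adds a penalty to the assignment cost instead of being forbidden. The penalty weight and the toy data are invented for the example:

```python
# Hedged sketch of soft must-link constraints: a violated constraint costs
# `penalty` rather than being disallowed outright.
import numpy as np

def soft_constrained_kmeans(X, k, must_links, penalty=1.0, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        for i, x in enumerate(X):
            cost = ((centers - x) ** 2).sum(axis=1)
            # Soft constraint: penalize every cluster that separates i from
            # a must-linked partner (pairs listed in both directions).
            for j in (b for a, b in must_links if a == i):
                cost += penalty * (np.arange(k) != labels[j])
            labels[i] = cost.argmin()
        for c in range(k):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# Toy usage: two documents share a tag, so they get a soft must-link that
# nudges (but does not force) them into the same cluster.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0], [0.5, 0.5]])
print(soft_constrained_kmeans(X, k=2, must_links=[(4, 0), (0, 4)]))
```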


Using Hierarchical Community Structure to Improve Community-Based Message Routing

AAAI Conferences

Information about community structure can be useful in a variety of mobile web applications. For instance, it has been shown that community-based methods can be more effective than alternatives for routing messages in delay-tolerant networks. In this paper we present initial research showing that information about hierarchical community structure can further improve the effectiveness of message routing. This is interesting because, despite much previous work on the topic, there have been few concrete applications that exploit hierarchical community structure.
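
As a toy illustration of the idea, the sketch below represents each node's community membership as a path in a hierarchy and picks as relay the encountered node that shares the deepest community with the destination; the hierarchy and the relay rule are assumptions for the example, not the paper's algorithm:

```python
# Hedged sketch: prefer relays that share the deepest community with the
# destination. The community hierarchy here is hand-made for illustration.
communities = {
    "a": ["campus", "cs-dept", "lab-1"],
    "b": ["campus", "cs-dept", "lab-2"],
    "c": ["campus", "math-dept"],
    "d": ["campus", "cs-dept", "lab-2"],
}

def shared_depth(u, v):
    """Depth of the deepest community both nodes belong to."""
    depth = 0
    for cu, cv in zip(communities[u], communities[v]):
        if cu != cv:
            break
        depth += 1
    return depth

def choose_relay(encountered, dest):
    """Among currently encountered nodes, pick the one closest to dest in the hierarchy."""
    return max(encountered, key=lambda n: shared_depth(n, dest))

# Node "a" carries a message for "d" and meets "b" and "c": "b" shares the
# lab-2 community with "d", so it is the better relay.
print(choose_relay(["b", "c"], dest="d"))  # -> "b"
```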


Sentiment Flow Through Hyperlink Networks

AAAI Conferences

How does sentiment flow through hyperlink networks? Earlier work on hyperlink networks has focused on the structure of the network, often modeling posts as nodes in a directed graph in which edges represent hyperlinks. At the same time, sentiment analysis has largely focused on classifying texts in isolation. Here we analyze a large hyperlinked network of mass media and weblog posts to determine how sentiment features of a post affect the sentiment of connected posts and the structure of the network itself. We explore the phenomena of sentiment flow through experiments on a graph containing nearly 8 million nodes and 15 million edges. Our analysis indicates that (1) nodes are strongly influenced by their immediate neighbors, (2) deep cascades lead complex but predictable lives, (3) shallow cascades tend to be objective, and (4) sentiment becomes more polarized as depth increases.
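
As a toy illustration of finding (1), the sketch below simulates posts that partially copy the sentiment of a post they link to, then measures the correlation between a post and its linked neighbor; the 0.7/0.3 mixing weights and the link-to-earlier-post structure are invented for the example:

```python
# Hedged sketch: measure how strongly a post's sentiment tracks the
# sentiment of the post it links to, on synthetic data.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
sentiment = np.empty(n)
target = np.empty(n, dtype=int)
sentiment[0], target[0] = rng.normal(), 0
for i in range(1, n):
    target[i] = rng.integers(0, i)  # each post links to an earlier post
    # Simulated influence: partially copy the linked post's sentiment.
    sentiment[i] = 0.7 * sentiment[target[i]] + 0.3 * rng.normal()

r = np.corrcoef(sentiment[1:], sentiment[target[1:]])[0, 1]
print(f"post vs. linked-neighbor sentiment correlation: {r:.2f}")
```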


Latent Set Models for Two-Mode Network Data

AAAI Conferences

Two-mode networks are a natural representation for many kinds of relational data. These networks are bipartite graphs consisting of two distinct sets ("modes") of entities. For example, one can model multiple-recipient email data as a two-mode network of (a) individuals and (b) the emails that they send or receive. In this work we present a statistical model for two-mode network data which posits that individuals belong to latent sets and that the members of a particular set tend to co-appear. We show how to infer these latent sets from observed data using a Markov chain Monte Carlo inference algorithm. We apply the model to the Enron email corpus, using it to discover interpretable latent structure and evaluating its predictive accuracy on a missing-data task. We discuss extensions to the model that incorporate additional side information, such as the email's sender or text content, further improving the accuracy of the model.
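
To make the generative story concrete, here is a minimal simulation sketch with hand-picked latent sets and inclusion probabilities; the paper infers such sets from data with MCMC, which is not shown here:

```python
# Hedged sketch of the generative story: an email picks a latent set, whose
# members co-appear with high probability; everyone else appears rarely.
# The set assignments and probabilities are illustrative choices.
import random

random.seed(0)
people = list(range(12))
latent_sets = [set(range(0, 4)), set(range(4, 8)), set(range(8, 12))]

def sample_email(p_member=0.8, p_noise=0.05):
    s = random.choice(latent_sets)
    return {p for p in people
            if random.random() < (p_member if p in s else p_noise)}

emails = [sample_email() for _ in range(200)]
# Members of the same latent set should co-appear far more often than others.
within = sum(1 for e in emails if {0, 1} <= e)
across = sum(1 for e in emails if {0, 4} <= e)
print(f"co-appearances within a set: {within}, across sets: {across}")
```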


Large-Scale Community Detection on YouTube for Topic Discovery and Exploration

AAAI Conferences

Detecting coherent, well-connected communities in large graphs provides insight into the graph structure and can serve as the basis for content discovery. Clustering is a popular technique for community detection, but global algorithms that examine the entire graph do not scale. Local algorithms are highly parallelizable but perform sub-optimally, especially in applications where we need to optimize multiple metrics. We present a multi-stage algorithm based on local clustering that is highly scalable, combining a pre-processing stage, a local clustering stage, and a post-processing stage. We apply it to the YouTube video graph to generate named clusters of videos with coherent content. We formalize coverage, coherence, and connectivity metrics and evaluate the quality of the algorithm for large YouTube graphs. Our use of local algorithms for global clustering, and their implementation and practical evaluation at such a large scale, is a first of its kind.
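
A hedged sketch of a local-clustering stage follows, assuming a simple greedy growth that minimizes conductance around a seed node; the actual multi-stage pipeline, its metrics, and its parallel execution are richer than this:

```python
# Hedged sketch: grow a cluster around a seed, greedily adding the neighbor
# that most improves conductance, and stop when no neighbor helps.
def conductance(adj, cluster):
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    vol = sum(len(adj[u]) for u in cluster)
    total = sum(len(vs) for vs in adj.values())
    denom = min(vol, total - vol)
    return cut / denom if denom > 0 else 1.0

def grow_cluster(adj, seed, max_size=10):
    cluster = {seed}
    while len(cluster) < max_size:
        frontier = {v for u in cluster for v in adj[u]} - cluster
        if not frontier:
            break
        best = min(frontier, key=lambda v: conductance(adj, cluster | {v}))
        if conductance(adj, cluster | {best}) >= conductance(adj, cluster):
            break  # purely local stopping rule
        cluster.add(best)
    return cluster

# Toy video graph: two dense triangles joined by a single edge.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(grow_cluster(adj, seed=0))  # -> {0, 1, 2}
```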


Scalable Event-Based Clustering of Social Media Via Record Linkage Techniques

AAAI Conferences

We tackle the problem of grouping content available in social media applications such as Flickr, YouTube, Panoramio, etc. into clusters of documents describing the same event. This task has been referred to as event identification before. We present a new formalization of the event identification task as a record linkage problem and show that this formulation leads to a principled and highly efficient solution. We present results on two datasets derived from Flickr (last.fm and upcoming), comparing the results in terms of Normalized Mutual Information and F-Measure against several baselines, and showing that a record linkage approach outperforms all baselines as well as a state-of-the-art system. We demonstrate that our approach can scale to large amounts of data, reducing the processing time considerably compared to a state-of-the-art approach. The scalability is achieved by applying an appropriate blocking strategy and relying on a Single Linkage clustering algorithm, which avoids the exhaustive computation of pairwise similarities.
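
As an illustration of the record-linkage formulation, the sketch below blocks documents by a cheap key, compares pairs only within blocks, and merges matches with union-find, which realizes single-linkage clustering without computing all pairwise similarities; the field names, the day-based blocking key, and the Jaccard threshold are all assumptions for the example:

```python
# Hedged sketch: blocking + within-block matching + union-find merging.
from itertools import combinations
from collections import defaultdict

docs = [
    {"id": 0, "day": "2010-05-01", "title": "jazz festival main stage"},
    {"id": 1, "day": "2010-05-01", "title": "jazz festival crowd"},
    {"id": 2, "day": "2010-05-01", "title": "my cat sleeping"},
    {"id": 3, "day": "2010-07-12", "title": "jazz festival main stage"},
]

def similar(a, b, threshold=0.4):
    """Jaccard similarity on title words, as a stand-in matcher."""
    wa, wb = set(a["title"].split()), set(b["title"].split())
    return len(wa & wb) / len(wa | wb) >= threshold

parent = list(range(len(docs)))
def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

blocks = defaultdict(list)
for d in docs:
    blocks[d["day"]].append(d)  # blocking: compare same-day documents only
for block in blocks.values():
    for a, b in combinations(block, 2):
        if similar(a, b):
            parent[find(a["id"])] = find(b["id"])  # single-linkage merge

print([find(d["id"]) for d in docs])  # docs 0 and 1 share a cluster; 3 does not
```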


Information Propagation on the Web: Data Extraction, Modeling and Simulation

AAAI Conferences

This paper proposes a model of information propagation mechanisms on the Web, describing all steps of its design and use in simulation. First, the characteristics of a real network are studied, in particular its citation policies: from a network extracted from the Web by a crawling tool, distinct publishing behaviours are identified and characterised. The Zero Crossing model of information diffusion is then extended to increase its expressive power and allow it to reproduce this variety of behaviours. Experimental results based on a simulation validate the proposed extension.
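
The following sketch shows the general shape of such a simulation, with invented behaviour classes and relay probabilities; it is not the Zero Crossing model itself, whose definition and extension are given in the paper:

```python
# Hedged sketch: sites with distinct publishing behaviours relay a piece of
# information over citation links. Behaviour classes and probabilities are
# illustrative assumptions.
import random

random.seed(3)
# behaviour class -> probability of republishing information seen on a cited site
relay_prob = {"hub": 0.9, "blog": 0.4, "archive": 0.05}
sites = {i: random.choice(list(relay_prob)) for i in range(100)}
links = {i: random.sample(range(100), 5) for i in range(100)}  # who each site reads

def simulate(seed_site, steps=10):
    informed = {seed_site}
    for _ in range(steps):
        newly = {s for s, reads in links.items() if s not in informed
                 and any(r in informed for r in reads)
                 and random.random() < relay_prob[sites[s]]}
        if not newly:
            break
        informed |= newly
    return informed

print(f"{len(simulate(seed_site=0))} of 100 sites informed")
```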