AITopics

Twenty-Second International Joint Conference on Artificial Intelligence

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Industry:

Leisure & Entertainment > Games (1.00)
Leisure & Entertainment > Sports > Soccer (0.71)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Jalali, Ali, Johnson, Chris, Ravikumar, Pradeep

On Learning Discrete Graphical Models Using Greedy Methods

arXiv.org Machine LearningJul-16-2011

In this paper, we address the problem of learning the structure of a pairwise graphical model from samples in a high-dimensional setting. Our first main result studies the sparsistency, or consistency in sparsity pattern recovery, properties of a forward-backward greedy algorithm as applied to general statistical models. As a special case, we then apply this algorithm to learn the structure of a discrete graphical model via neighborhood estimation. As a corollary of our general result, we derive sufficient conditions on the number of samples n, the maximum node-degree d and the problem size p, as well as other conditions on the model parameters, so that the algorithm recovers all the edges with high probability. Our result guarantees graph selection for samples scaling as n = Omega(d^2 log(p)), in contrast to existing convex-optimization based algorithms that require a sample complexity of \Omega(d^3 log(p)). Further, the greedy algorithm only requires a restricted strong convexity condition which is typically milder than irrepresentability assumptions. We corroborate these results using numerical simulations at the end.

algorithm, artificial intelligence, machine learning, (19 more...)

1107.3258

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Laurino, Omar, D'Abrusco, Raffaele, Longo, Giuseppe, Riccio, Giuseppe

Astroinformatics of galaxies and quasars: a new general method for photometric redshifts estimation

arXiv.org Machine LearningJul-15-2011

With the availability of the huge amounts of data produced by current and future large multi-band photometric surveys, photometric redshifts have become a crucial tool for extragalactic astronomy and cosmology. In this paper we present a novel method, called Weak Gated Experts (WGE), which allows to derive photometric redshifts through a combination of data mining techniques. \noindent The WGE, like many other machine learning techniques, is based on the exploitation of a spectroscopic knowledge base composed by sources for which a spectroscopic value of the redshift is available. This method achieves a variance \sigma^2(\Delta z)=2.3x10^{-4} (\sigma^2(\Delta z) =0.08), where \Delta z = z_{phot} - z_{spec}) for the reconstruction of the photometric redshifts for the optical galaxies from the SDSS and for the optical quasars respectively, while the Root Mean Square (RMS) of the \Delta z variable distributions for the two experiments is respectively equal to 0.021 and 0.35. The WGE provides also a mechanism for the estimation of the accuracy of each photometric redshift. We also present and discuss the catalogs obtained for the optical SDSS galaxies, for the optical candidate quasars extracted from the DR7 SDSS photometric dataset {The sample of SDSS sources on which the accuracy of the reconstruction has been assessed is composed of bright sources, for a subset of which spectroscopic redshifts have been measured.}, and for optical SDSS candidate quasars observed by GALEX in the UV range. The WGE method exploits the new technological paradigm provided by the Virtual Observatory and the emerging field of Astroinformatics.

artificial intelligence, data mining, machine learning, (16 more...)

doi: 10.1111/j.1365-2966.2011.19416.x

1107.316

Country:

North America > United States (0.45)
Europe > Italy (0.28)

Genre: Research Report > Promising Solution (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Richtárik, Peter, Takáč, Martin

Iteration Complexity of Randomized Block-Coordinate Descent Methods for Minimizing a Composite Function

arXiv.org Machine LearningJul-14-2011

In this paper we develop a randomized block-coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth block-separable convex function and prove that it obtains an $\epsilon$-accurate solution with probability at least $1-\rho$ in at most $O(\tfrac{n}{\epsilon} \log \tfrac{1}{\rho})$ iterations, where $n$ is the number of blocks. For strongly convex functions the method converges linearly. This extends recent results of Nesterov [Efficiency of coordinate descent methods on huge-scale optimization problems, CORE Discussion Paper #2010/2], which cover the smooth case, to composite minimization, while at the same time improving the complexity by the factor of 4 and removing $\epsilon$ from the logarithmic term. More importantly, in contrast with the aforementioned work in which the author achieves the results by applying the method to a regularized version of the objective function with an unknown scaling factor, we show that this is not necessary, thus achieving true iteration complexity bounds. In the smooth case we also allow for arbitrary probability vectors and non-Euclidean norms. Finally, we demonstrate numerically that the algorithm is able to solve huge-scale $\ell_1$-regularized least squares and support vector machine problems with a billion variables.

artificial intelligence, iteration, machine learning, (19 more...)

1107.2848

Country: Europe > United Kingdom (0.28)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)

Álvarez, Mauricio A., Luengo, David, Lawrence, Neil D.

Linear Latent Force Models using Gaussian Processes

arXiv.org Artificial IntelligenceJul-13-2011

Purely data driven approaches for machine learning present difficulties when data is scarce relative to the complexity of the model or when the model is forced to extrapolate. On the other hand, purely mechanistic approaches need to identify and specify all the interactions in the problem at hand (which may not be feasible) and still leave the issue of how to parameterize the system. In this paper, we present a hybrid approach using Gaussian processes and differential equations to combine data driven modelling with a physical model of the system. We show how different, physically-inspired, kernel functions can be developed through sensible, simple, mechanistic assumptions about the underlying system. The versatility of our approach is illustrated with three case studies from motion capture, computational biology and geostatistics.

equation, survey article, upstream oil & gas, (17 more...)

arXiv.org Artificial Intelligence

1107.2699

Country:

North America > United States > New York (0.28)
Europe > Spain (0.28)
North America > United States > Massachusetts (0.14)
(4 more...)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.48)
Energy > Oil & Gas > Upstream (0.48)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Machine LearningJul-13-2011

Fast Learning Rate of lp-MKL and its Minimax Optimality

Suzuki, Taiji

In this paper, we give a new sharp generalization bound of lp-MKL which is a generalized framework of multiple kernel learning (MKL) and imposes lp-mixed-norm regularization instead of l1-mixed-norm regularization. We utilize localization techniques to obtain the sharp learning rate. The bound is characterized by the decay rate of the eigenvalues of the associated kernels. A larger decay rate gives a faster convergence rate. Furthermore, we give the minimax learning rate on the ball characterized by lp-mixed-norm in the product space. Then we show that our derived learning rate of lp-MKL achieves the minimax optimal rate on the lp-mixed-norm ball.

artificial intelligence, assumption, machine learning, (16 more...)

1103.5202

Country: North America > Canada (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Suzuki, Taiji, Tomioka, Ryota, Sugiyama, Masashi

Fast Convergence Rate of Multiple Kernel Learning with Elastic-net Regularization

arXiv.org Machine LearningJul-13-2011

We investigate the learning rate of multiple kernel leaning (MKL) with elastic-net regularization, which consists of an $\ell_1$-regularizer for inducing the sparsity and an $\ell_2$-regularizer for controlling the smoothness. We focus on a sparse setting where the total number of kernels is large but the number of non-zero components of the ground truth is relatively small, and prove that elastic-net MKL achieves the minimax learning rate on the $\ell_2$-mixed-norm ball. Our bound is sharper than the convergence rates ever shown, and has a property that the smoother the truth is, the faster the convergence rate is.

artificial intelligence, assumption, machine learning, (16 more...)

1103.0431

Country:

North America > United States (0.28)
North America > Canada (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

AAAI ConferencesJul-12-2011

Exploiting User Interest on Social Media for Aggregating Diverse Data and Predicting Interest

Nori, Nozomi (The University of Tokyo) | Bollegala, Danushka (The University of Tokyo) | Ishizuka, Mitsuru (The University of Tokyo)

More and more users have been taking various actions to diverse resources referred to by URLs such as news, web pages, images, products, movies as a result of the growth of social media. They are annotating, tweeting in Twitter, reblogging in Tumblr, and Liking in Facebook, etc. Analyses about these diverse actions will be useful for aggregating or integrating diverse resources. In this paper, we view users’ actions to resources as expressing their some interests, and by investigating how their interests are expressed in social media, we get suggestions for aggregations. Our results show that a certain kind of action (such as tagging on Delicious) can be used to make predictions on a different kind of action (such as favorite on Twitter). These analyses will be useful for aggregating or integrating diverse contents on multiple sources. In addition to some experimental analyses, we propose a novel method to predict users’ interests in social media, using time-evolving, multinomial relational data. Our experimental results show that the proposed method significantly outperforms standard tensor analysis and an existing state-of-the-art method (LDA) in prediction tasks.

action type, delicious, node, (17 more...)

Fifth International AAAI Conference on Weblogs and Social Media

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Asia > Middle East > Jordan (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.68)

Industry: Information Technology > Services (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

AAAI ConferencesJul-12-2011

Seven Months with the Devils: A Long-Term Study of Content Polluters on Twitter

Lee, Kyumin (Texas A&M University) | Eoff, Brian David (Texas A&M University) | Caverlee, James (Texas A&M University)

The rise in popularity of social networking sites such as Twitter and Facebook has been paralleled by the rise of unwanted, disruptive entities on these networks- — including spammers, malware disseminators, and other content polluters. Inspired by sociologists working to ensure the success of commons and criminologists focused on deterring vandalism and preventing crime, we present the first long-term study of social honeypots for tempting, profiling, and filtering content polluters in social media. Concretely, we report on our experiences via a seven-month deployment of 60 honeypots on Twitter that resulted in the harvesting of 36,000 candidate content polluters. As part of our study, we (1) examine the harvested Twitter users, including an analysis of link payloads, user behavior over time, and followers/following network dynamics and (2) evaluate a wide range of features to investigate the effectiveness of automatic content polluter identification.

content polluter, legitimate user, polluter, (14 more...)

Fifth International AAAI Conference on Weblogs and Social Media

Country: North America > United States > Texas > Brazos County > College Station (0.14)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Services (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

AAAI ConferencesJul-12-2011

Find Me the Right Content! Diversity-Based Sampling of Social Media Spaces for Topic-Centric Search

Choudhury, Munmun De (Rutgers, The State University of New Jersey) | Counts, Scott (Microsoft Research) | Czerwinski, Mary (Microsoft Research)

Social media and networking websites, such as Twitter and Facebook, generate large quantities of information and have become mechanisms for real-time content dissipation to users. An important question that arises is: how do we sample such social media information spaces in order to deliver relevant content on a topic to end users? Notice that these large-scale information spaces are inherently diverse, featuring a wide array of attributes such as location, recency, degree of diffusion effects in the network and so on. Naturally, for the end user, different levels of diversity in social media content can significantly impact the information consumption experience: low diversity can provide focused content that may be simpler to understand, while high diversity can increase breadth in the exposure to multiple opinions and perspectives. Hence to address our research question, we turn to diversity as a core concept in our proposed sampling methodology. Here we are motivated by ideas in the "compressive sensing" literature and utilize the notion of sparsity in social media information to represent such large spaces via a small number of basis components. Thereafter we use a greedy iterative clustering technique on this transformed space to construct samples matching a desired level of diversity. Based on Twitter Firehose data, we demonstrate quantitatively that our method is robust, and performs better than other baseline techniques over a variety of trending topics. In a user study, we further show that users find samples generated by our method to be more interesting and subjectively engaging compared to techniques inspired by state-of-the-art systems, with improvements in the range of 15--45%.

diversity, information space, tweet, (15 more...)

Fifth International AAAI Conference on Weblogs and Social Media

Country:

North America > Haiti (0.14)
North America > United States > Washington > King County > Redmond (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report > Experimental Study (0.88)

Industry:

Information Technology (0.68)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)