Yahoo! Research


HARP: Hierarchical Representation Learning for Networks

AAAI Conferences

We present HARP, a novel method for learning low-dimensional embeddings of a graph's nodes which preserves higher-order structural features. Our proposed method achieves this by compressing the input graph prior to embedding it, effectively avoiding troublesome embedding configurations (i.e., local minima) which can pose problems for non-convex optimization. HARP works by finding a smaller graph which approximates the global structure of its input. This simplified graph is used to learn a set of initial representations, which serve as good initializations for learning representations in the original, detailed graph. We inductively extend this idea by decomposing a graph into a series of levels, and then embed the hierarchy of graphs from the coarsest one to the original graph. HARP is a general meta-strategy to improve all of the state-of-the-art neural algorithms for embedding graphs, including DeepWalk, LINE, and Node2vec. Indeed, we demonstrate that applying HARP's hierarchical paradigm yields improved implementations for all three of these methods, as evaluated on classification tasks on real-world graphs such as DBLP, BlogCatalog, and CiteSeer, where we achieve a performance gain over the original implementations of up to 14% Macro F1.
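
Below is a minimal, hedged sketch of the coarsen-then-embed idea described above. It is not the authors' implementation: the edge-collapsing heuristic, the toy power-iteration embedding (standing in for DeepWalk/LINE/Node2vec), and all dimensions are illustrative assumptions.

# Sketch of a HARP-style hierarchy: coarsen, embed coarsest, prolong downward.
# Assumptions: networkx graphs, a placeholder embedding instead of DeepWalk/LINE/Node2vec.
import networkx as nx
import numpy as np

def coarsen(G):
    """Collapse a maximal matching of edges into super-nodes (simplified stand-in)."""
    mapping, used = {}, set()
    for u, v in G.edges():
        if u not in used and v not in used:
            mapping[u] = mapping[v] = u          # merge v into u
            used.update((u, v))
    for n in G.nodes():
        mapping.setdefault(n, n)                 # unmatched nodes map to themselves
    H = nx.Graph()
    H.add_nodes_from(set(mapping.values()))
    H.add_edges_from((mapping[u], mapping[v]) for u, v in G.edges()
                     if mapping[u] != mapping[v])
    return H, mapping

def toy_embed(G, dim, init=None):
    """Placeholder embedding: a few power-iteration smoothing steps over the adjacency."""
    nodes = list(G.nodes())
    idx = {n: i for i, n in enumerate(nodes)}
    A = nx.to_numpy_array(G, nodelist=nodes)
    X = np.random.randn(len(nodes), dim) if init is None else init
    for _ in range(10):
        X = A @ X
        X /= np.linalg.norm(X, axis=1, keepdims=True) + 1e-9
    return {n: X[idx[n]] for n in nodes}

def harp_style_embed(G, dim=16, levels=3):
    # 1) Build a hierarchy of progressively coarser graphs.
    hierarchy, maps = [G], []
    for _ in range(levels):
        H, m = coarsen(hierarchy[-1])
        hierarchy.append(H)
        maps.append(m)
    # 2) Embed the coarsest graph, then prolong the embeddings level by level
    #    as initializations for the next finer graph.
    emb = toy_embed(hierarchy[-1], dim)
    for G_fine, m in zip(reversed(hierarchy[:-1]), reversed(maps)):
        init = np.array([emb[m[n]] for n in G_fine.nodes()])
        emb = toy_embed(G_fine, dim, init=init)
    return emb

emb = harp_style_embed(nx.karate_club_graph())   # toy end-to-end run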


RIPML: A Restricted Isometry Property-Based Approach to Multilabel Learning

AAAI Conferences

The multilabel learning problem with a large number of labels, features, and data points has generated tremendous interest recently. A recurring theme of these problems is that only a few labels are active in any given data point as compared to the total number of labels. However, only a small number of existing works take direct advantage of this inherent extreme sparsity in the label space. By virtue of the Restricted Isometry Property (RIP), satisfied by many random ensembles, we propose a novel procedure for multilabel learning known as RIPML. During the training phase of RIPML, labels are projected onto a random low-dimensional subspace, followed by solving a least-squares problem in this subspace. Inference is done by a k-nearest neighbor (kNN) based approach. We demonstrate the effectiveness of RIPML by conducting extensive simulations and comparing results with state-of-the-art linear dimensionality-reduction-based approaches.
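
The following is a small illustrative sketch of the recipe the abstract describes: project labels onto a random low-dimensional subspace, solve a least-squares problem there, and infer via kNN. The Gaussian projection, plain least-squares solver, and all dimensions are assumptions, not the authors' exact procedure.

# Toy RIPML-style pipeline: random label projection + least squares + kNN inference.
import numpy as np

rng = np.random.default_rng(0)
n, d, L, m, k = 500, 50, 200, 20, 5           # samples, features, labels, projected dim, kNN size

X = rng.normal(size=(n, d))                    # feature matrix
Y = (rng.random((n, L)) < 0.02).astype(float)  # sparse binary label matrix

# Training: project labels to a random low-dimensional subspace (RIP-style ensemble),
# then solve a least-squares problem from features to projected labels.
Phi = rng.normal(size=(m, L)) / np.sqrt(m)     # random projection matrix
Z = Y @ Phi.T                                  # projected labels, shape (n, m)
W, *_ = np.linalg.lstsq(X, Z, rcond=None)      # regression weights, shape (d, m)

# Inference: map a query into the subspace, then vote over the k nearest
# training points' original labels.
x_query = rng.normal(size=(1, d))
z_pred = x_query @ W                           # predicted point in the subspace
dists = np.linalg.norm(Z - z_pred, axis=1)
neighbors = np.argsort(dists)[:k]
scores = Y[neighbors].mean(axis=0)             # label scores from the kNN vote
predicted_labels = np.flatnonzero(scores >= 0.5)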


Leveraging Browsing Patterns for Topic Discovery and Photostream Recommendation

AAAI Conferences

In photo-sharing websites and in social networks, photographs are most often browsed as a sequence: users who view a photo are likely to click on those that follow. The sequences of photos (which we call photostreams), as opposed to individual images, can therefore be considered to be very important content units in their own right. In spite of their importance, those sequences have received little attention even though they are at the core of how people consume image content. In this paper, we focus on photostreams. First, we perform an analysis of a large dataset of user logs containing over 100 million pageviews, examining navigation patterns between photostreams. Based on observations from the analysis, we build a stream transition graph to analyze common stream topic transitions (e.g., users often view "train" photostreams followed by "firetruck" photostreams). We then implement two stream recommendation algorithms, based on collaborative filtering and on photo tags, and report the results of a user study involving 40 participants. Our analysis yields interesting insights into how people navigate between photostreams, while the results of the user study provide useful feedback for evaluating the performance and characteristics of the recommendation algorithms.
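
As a small illustration of the stream transition graph mentioned above, the sketch below counts topic-to-topic transitions in toy browsing sessions; the log format and topic labels are assumptions, not the paper's data.

# Build a tiny topic-transition count graph from per-user photostream viewing sessions.
from collections import defaultdict

# (user, ordered list of topics of the photostreams they viewed) -- illustrative data
sessions = [
    ("u1", ["train", "train", "firetruck", "airplane"]),
    ("u2", ["sunset", "beach", "sunset"]),
    ("u3", ["train", "firetruck"]),
]

transitions = defaultdict(int)
for _, topics in sessions:
    for src, dst in zip(topics, topics[1:]):
        if src != dst:                          # keep only cross-topic moves
            transitions[(src, dst)] += 1

# Most common topic transitions, e.g. ("train", "firetruck") -> 2
for (src, dst), count in sorted(transitions.items(), key=lambda kv: -kv[1]):
    print(f"{src} -> {dst}: {count}")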


Fairness and Welfare Through Redistribution When Utility Is Transferable

AAAI Conferences

We join the goals of two large and related fields of research in group decision-making that have historically had little contact: fair division and efficient mechanism design with monetary payments. To do this we adopt the standard mechanism design paradigm where utility is assumed to be quasilinear and thus transferable across agents. We generalize the traditional binary criteria of envy-freeness, proportionality, and efficiency (welfare) to measures of degree that range between 0 and 1. We demonstrate that in canonical fair division settings, under any allocatively efficient mechanism, the worst-case welfare rate is 0 and the disproportionality rate is 1; in other words, the worst-case results are as bad as possible. This strongly motivates an average-case analysis. We then set as our goal the identification of a mechanism that achieves high welfare, low envy, and low disproportionality in expectation across a spectrum of fair division settings. We establish that the VCG mechanism is not a satisfactory candidate, but the redistribution mechanism of [Bailey, 1997; Cavallo, 2006] is.
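
For readers unfamiliar with the redistribution mechanism cited above, the sketch below illustrates the Bailey/Cavallo idea in the simplest single-item setting: run VCG, then rebate to each agent 1/n of the VCG revenue that would arise if that agent were absent. The code and values are illustrative, not the paper's construction.

# Single-item VCG plus Bailey/Cavallo rebates (a hedged, illustrative sketch).
def vcg_single_item(values):
    """Return (winner index, payment = second-highest value)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    winner = order[0]
    payment = values[order[1]] if len(values) > 1 else 0.0
    return winner, payment

def bailey_cavallo_rebates(values):
    """Each agent receives 1/n of the VCG revenue in the economy without them."""
    n = len(values)
    rebates = []
    for i in range(n):
        others = values[:i] + values[i + 1:]
        _, revenue_without_i = vcg_single_item(others)
        rebates.append(revenue_without_i / n)
    return rebates

values = [10.0, 8.0, 5.0, 1.0]                 # arbitrary example valuations
winner, payment = vcg_single_item(values)      # winner pays 8.0
rebates = bailey_cavallo_rebates(values)       # most of that 8.0 flows back to agents
print(winner, payment, rebates, payment - sum(rebates))  # small residual surplus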


Efficient Crowdsourcing With Stochastic Production

AAAI Conferences

A principal seeks production of a good within a limited time-frame with a hard deadline, after which any good procured has no value. There is inherent uncertainty in the production process, which in light of the deadline may warrant simultaneous production of multiple goods by multiple producers, despite there being no marginal value for extra goods beyond the maximum-quality good produced. This motivates a crowdsourcing model of procurement. We address efficient execution of such procurement from a social planner's perspective, taking into account and optimally balancing the value to the principal with the costs to producers (modeled as effort expenditure) while, crucially, contending with self-interest on the part of all players. A solution to this problem involves both an algorithmic aspect that determines an optimal effort level for each producer given the principal's value, and an incentive mechanism that achieves equilibrium implementation of the socially optimal policy despite the principal privately observing his value, producers privately observing their skill levels and effort expenditure, and all acting only to maximize their own individual welfare. In contrast to popular "winner-take-all" contests, the efficient mechanism we propose involves a payment to every producer that expends non-zero effort in the efficient policy.


Using Crowdsourcing to Improve Profanity Detection

AAAI Conferences

Profanity detection is often thought to be an easy task. However, past work has shown that current, list-based systems perform poorly. They fail to adapt to evolving profane slang and to identify profane terms that have been disguised or only partially censored (e.g., @ss, f$#%) or intentionally or unintentionally misspelled (e.g., biatch, shiiiit). For these reasons, they are easy to circumvent and have very poor recall. Second, they are a one-size-fits-all solution, assuming that the definition, use, and perception of what is profane or inappropriate hold across all contexts. In this article, we present work that attempts to move beyond list-based profanity detection systems by identifying the context in which profanity occurs. The proposed system uses a set of comments from a social news site labeled by Amazon Mechanical Turk workers for the presence of profanity. This system far surpasses the performance of list-based profanity detection techniques. The use of crowdsourcing in this task suggests an opportunity to build profanity detection systems tailored to sites and communities.
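
To make the contrast with list-based detection concrete, here is a hedged toy sketch (not the authors' system): a word-list matcher next to a character n-gram classifier trained on crowdsourced labels. The tiny in-line dataset, feature choice, and model are placeholders.

# Compare a naive word-list matcher with a classifier trained on labeled comments.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

comments = ["you are an @ss", "great article, thanks", "what a load of sh1t",
            "totally agree with this", "nice analysis", "this is biatch-level dumb"]
labels = [1, 0, 1, 0, 0, 1]                  # 1 = profane (e.g., labels from MTurk workers)

# Baseline: exact word-list lookup, which misses disguised spellings.
PROFANITY_LIST = {"ass", "shit"}
list_based = [int(any(tok in PROFANITY_LIST for tok in c.lower().split()))
              for c in comments]             # misses "@ss", "sh1t", "biatch-level"

# Learned model: character n-grams are more tolerant of obfuscated spellings.
vectorizer = CountVectorizer(analyzer="char_wb", ngram_range=(2, 4))
X = vectorizer.fit_transform(comments)
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(vectorizer.transform(["f$#% this", "have a nice day"])))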


Who Does What on the Web: A Large-Scale Study of Browsing Behavior

AAAI Conferences

As the Web has become integrated into daily life, understanding how individuals spend their time online impacts domains ranging from public policy to marketing. It is difficult, however, to measure even simple aspects of browsing behavior via conventional methods---including surveys and site-level analytics---due to limitations of scale and scope. In part addressing these limitations, large-scale Web panel data are a relatively novel means for investigating patterns of Internet usage. In one of the largest studies of browsing behavior to date, we pair Web histories for 250,000 anonymized individuals with user-level demographics---including age, sex, race, education, and income---to investigate three topics. First, we examine how behavior changes as individuals spend more time online, showing that the heaviest users devote nearly twice as much of their time to social media relative to typical individuals. Second, we revisit the digital divide, finding that the frequency with which individuals turn to the Web for research, news, and healthcare is strongly related to educational background, but not as closely tied to gender and ethnicity. Finally, we demonstrate that browsing histories are a strong signal for inferring user attributes, including ethnicity and household income, a result that may be leveraged to improve ad targeting.


Social Media Is NOT that Bad! The Lexical Quality of Social Media

AAAI Conferences

There is a strong correlation between spelling errors and web text content quality. Using our lexical quality measure, based on a small corpus of spelling errors, we present an estimation of the lexical quality of the main Social Media sites. This paper presents an updated and complete analysis of the lexical quality of Social Media written in English and Spanish, including how lexical quality changes over time.
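
As a rough illustration of a spelling-error-based measure, the toy sketch below scores text by the fraction of tokens found in a small corpus of known misspellings; this is an assumption inspired by the abstract, not the authors' actual lexical quality measure.

# Toy lexical-quality proxy: share of tokens that are known misspellings.
import re

KNOWN_MISSPELLINGS = {"teh", "recieve", "definately", "alot"}   # tiny illustrative corpus

def lexical_quality(text: str) -> float:
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 1.0
    errors = sum(tok in KNOWN_MISSPELLINGS for tok in tokens)
    return 1.0 - errors / len(tokens)        # 1.0 = no known misspellings

print(lexical_quality("I definately think teh article is great"))  # ~0.71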


A Kernel-Based Iterative Combinatorial Auction

AAAI Conferences

This paper describes an iterative combinatorial auction for single-minded bidders that offers modularity in the choice of price structure, drawing on ideas from kernel methods and the primal-dual paradigm of auction design. In our implementation, the auction is able to automatically detect, as the rounds progress, whether price expressiveness must be increased to clear the market. The auction also features a configurable step size which can be tuned to trade off between monotonicity in prices and the number of bidding rounds, with no impact on efficiency. An empirical evaluation against a state-of-the-art ascending-price auction demonstrates the performance gains that can be obtained in efficiency, revenue, and rounds to convergence through various configurations of our design.


Event Summarization Using Tweets

AAAI Conferences

Twitter has become exceedingly popular, with hundreds of millions of tweets being posted every day on a wide variety of topics. This has helped make real-time search applications possible, with leading search engines routinely displaying relevant tweets in response to user queries. Recent research has shown that a considerable fraction of these tweets are about "events," and the detection of novel events in the tweet stream has attracted a lot of research interest. However, very little research has focused on properly displaying this real-time information about events. For instance, the leading search engines simply display all tweets matching the queries in reverse chronological order. In this paper, we argue that for some highly structured and recurring events, such as sports, it is better to use more sophisticated techniques to summarize the relevant tweets. We formalize the problem of summarizing event tweets and give a solution based on learning the underlying hidden state representation of the event via Hidden Markov Models. In addition, through extensive experiments on real-world data we show that our model significantly outperforms some intuitive and competitive baselines.
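
The sketch below illustrates the general approach the abstract describes: model an event's tweet stream with an HMM over time windows, then pick a representative tweet per decoded segment. The windowed volume feature, the number of hidden states, and the use of hmmlearn's GaussianHMM are assumptions, not the paper's model.

# HMM over per-minute tweet volume, then one representative tweet per state segment.
import numpy as np
from hmmlearn import hmm

# (minute offset, tweet text) for a toy sports event
tweets = [(0, "kickoff!"), (1, "slow start"), (2, "midfield battle"),
          (10, "GOAL!!!"), (10, "what a strike"), (11, "replay of the goal"),
          (30, "halftime"), (31, "grabbing a snack"),
          (55, "red card!"), (55, "down to ten men"), (56, "huge decision")]

# Feature per one-minute window: tweet volume (a crude proxy for sub-events).
minutes = np.arange(0, max(t for t, _ in tweets) + 1)
counts = np.array([[sum(t == m for t, _ in tweets)] for m in minutes], dtype=float)

model = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=50,
                        random_state=0)
model.fit(counts)
states = model.predict(counts)               # decoded hidden state per minute

# Summarize: take the first tweet from each contiguous run of a hidden state
# (a crude heuristic standing in for the paper's summarization step).
summary, prev_state = [], None
for minute, state in zip(minutes, states):
    if state != prev_state:
        first = next((txt for t, txt in tweets if t >= minute), None)
        if first:
            summary.append(first)
        prev_state = state
print(summary)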