Plotting

 Information Technology


Role-Dynamics: Fast Mining of Large Dynamic Networks

arXiv.org Artificial Intelligence

To understand the structural dynamics of a large-scale social, biological or technological network, it may be useful to discover behavioral roles representing the main connectivity patterns present over time. In this paper, we propose a scalable non-parametric approach to automatically learn the structural dynamics of the network and individual nodes. Roles may represent structural or behavioral patterns such as the center of a star, peripheral nodes, or bridge nodes that connect different communities. Our novel approach learns the appropriate structural role dynamics for any arbitrary network and tracks the changes over time. In particular, we uncover the specific global network dynamics and the local node dynamics of a technological, communication, and social network. We identify interesting node and network patterns such as stationary and non-stationary roles, spikes/steps in role-memberships (perhaps indicating anomalies), increasing/decreasing role trends, among many others. Our results indicate that the nodes in each of these networks have distinct connectivity patterns that are non-stationary and evolve considerably over time. Overall, the experiments demonstrate the effectiveness of our approach for fast mining and tracking of the dynamics in large networks. Furthermore, the dynamic structural representation provides a basis for building more sophisticated models and tools that are fast for exploring large dynamic networks.


In-network Sparsity-regularized Rank Minimization: Algorithms and Applications

arXiv.org Machine Learning

Given a limited number of entries from the superposition of a low-rank matrix plus the product of a known fat compression matrix times a sparse matrix, recovery of the low-rank and sparse components is a fundamental task subsuming compressed sensing, matrix completion, and principal components pursuit. This paper develops algorithms for distributed sparsity-regularized rank minimization over networks, when the nuclear- and $\ell_1$-norm are used as surrogates to the rank and nonzero entry counts of the sought matrices, respectively. While nuclear-norm minimization has well-documented merits when centralized processing is viable, non-separability of the singular-value sum challenges its distributed minimization. To overcome this limitation, an alternative characterization of the nuclear norm is adopted which leads to a separable, yet non-convex cost minimized via the alternating-direction method of multipliers. The novel distributed iterations entail reduced-complexity per-node tasks, and affordable message passing among single-hop neighbors. Interestingly, upon convergence the distributed (non-convex) estimator provably attains the global optimum of its centralized counterpart, regardless of initialization. Several application domains are outlined to highlight the generality and impact of the proposed framework. These include unveiling traffic anomalies in backbone networks, predicting networkwide path latencies, and mapping the RF ambiance using wireless cognitive radios. Simulations with synthetic and real network data corroborate the convergence of the novel distributed algorithm, and its centralized performance guarantees.


Towards Electronic Shopping of Composite Product

arXiv.org Artificial Intelligence

In the paper, frameworks for electronic shopping of composite (modular) products are described: (a) multicriteria selection (product is considered as a whole system, it is a traditional approach), (b) combinatorial synthesis (composition) of the product from its components, (c) aggregation of the product from several selected products/prototypes. The following product model is examined: (i) general tree-like structure, (ii) set of system parts/components (leaf nodes), (iii) design alternatives (DAs) for each component, (iv) ordinal priorities for DAs, and (v) estimates of compatibility between DAs for different components. The combinatorial synthesis is realized as morphological design of a composite (modular) product or an extended composite product (e.g., product and support services as financial instruments). Here the solving process is based on Hierarchical Morphological Multicriteria Design (HMMD): (i) multicriteria selection of alternatives for system parts, (ii) composing the selected alternatives into a resultant combination (while taking into account ordinal quality of the alternatives above and their compatibility). The aggregation framework is based on consideration of aggregation procedures, for example: (i) addition procedure: design of a products substructure or an extended substructure ('kernel') and addition of elements, and (ii) design procedure: design of the composite solution based on all elements of product superstructure. Applied numerical examples (e.g., composite product, extended composite product, product repair plan, and product trajectory) illustrate the proposed approaches.


A Contextual-Bandit Approach to Personalized News Article Recommendation

arXiv.org Artificial Intelligence

Personalized web services strive to adapt their services (advertisements, news articles, etc) to individual users by making use of both content and user information. Despite a few recent advances, this problem remains challenging for at least two reasons. First, web service is featured with dynamically changing pools of content, rendering traditional collaborative filtering methods inapplicable. Second, the scale of most web services of practical interest calls for solutions that are both fast in learning and computation. In this work, we model personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total user clicks. The contributions of this work are three-fold. First, we propose a new, general contextual bandit algorithm that is computationally efficient and well motivated from learning theory. Second, we argue that any bandit algorithm can be reliably evaluated offline using previously recorded random traffic. Finally, using this offline evaluation method, we successfully applied our new algorithm to a Yahoo! Front Page Today Module dataset containing over 33 million events. Results showed a 12.5% click lift compared to a standard context-free bandit algorithm, and the advantage becomes even greater when data gets more scarce.


SearchBuddies: Bringing Search Engines into the Conversation

AAAI Conferences

Although people receive trusted, personalized recommendations and auxiliary social benefits when they ask questions of their friends, using a search engine is often a more effective way to find an answer. Attempts to integrate social and algorithmic search have thus far focused on bringing social content into algorithmic search results. However, more of the benefits of social search can be preserved by reversing this approach and bringing algorithmic content into natural question-based conversations. To do this successfully, it is necessary to adapt search engine interaction to a social context. In this paper, we present SearchBuddies, a system that responds to Facebook status message questions with algorithmic search results. Via a three-month deployment of the system to 122 social network users, we explore how people responded to search content in a highly social environment. Our experience deploying SearchBuddies shows that a socially embedded search engine can successfully provide users with unique and highly relevant information in a social context and can be integrated into conversations around an information need. The deployment also illuminates specific challenges of embedding a search engine in a social environment and provides guidance toward solutions.


Crossing Media Streams with Sentiment: Domain Adaptation in Blogs, Reviews and Twitter

AAAI Conferences

Most sentiment analysis studies address classification of a single source of data such as reviews or blog posts. However, the multitude of social media sources available for text analysis lends itself naturally to domain adaptation. In this study, we create a dataset spanning three social media sources -- blogs, reviews, and Twitter -- and a set of 37 common topics. We first examine sentiments expressed in these three sources while controlling for the change in topic. Then using this multi-dimensional data we show that when classifying documents in one source (a target source), models trained on other sources of data can be as good as or even better than those trained on the target data. That is, we show that models trained on some social media sources are generalizable to others. All source adaptation models we implement show reviews and Twitter to be the best sources of training data. It is especially useful to know that models trained on Twitter data are generalizable, since, unlike reviews, Twitter is more topically diverse.


Network Sampling Designs for Relational Classification

AAAI Conferences

Relational classification has been extensively studied recently due to its applications in social, biological, technological, and information networks. Much of the work in relational learning has focused on analyzing input data that comprise a single network. Although machine learning researchers have considered the issue of how to sample training and test sets from the input network (for evaluation), the mechanisms which are used to construct the input networks have largely been ignored. In most cases, the input network has itself been sampled from a larger target network (e.g., Facebook) and often the researcher is unaware of how the input network was constructed or what impact that may have on evaluation of the relational models. Since the goal in evaluating relational classification algorithms is to accurately assess their performance on the larger target network, it is critical to understand what impact the initial sampling method may have on our estimates of classification accuracy.In this paper, we present different sampling methods and systematically study their impact on evaluation of relational classification. Our results indicate that the choice of sampling method can impact classification performance, and thus consequently affects the accuracy of evaluation.


Do You Feel What I Feel? Social Aspects of Emotions in Twitter Conversations

AAAI Conferences

We present a computational framework for understanding the social aspects of emotions in Twitter conversations. Using unannotated data and semisupervised machine learning, we look at emotional transitions, emotional influences among the conversation partners, and patterns in the overall emotional exchanges. We find that conversational partners usually express the same emotion, which we name Emotion accommodation, but when they do not, one of the conversational partners tends to respond with a positive emotion. We also show that tweets containing sympathy, apology, and complaint are significant emotion influencers. We verify the emotion classification part of our framework by a human-annotated corpus.


Where Is This Tweet From? Inferring Home Locations of Twitter Users

AAAI Conferences

We present a new algorithm for inferring the home locations of Twitter users at different granularities, such as city, state, or time zone, using the content of their tweets and their tweeting behavior. Unlike existing approaches, our algorithm uses an ensemble of statistical and heuristic classifiers to predict locations. We find that a hierarchical classification approach can improve prediction accuracy. Experimental evidence suggests that our algorithm works well in practice and outperforms the best existing algorithms for predicting the location of Twitter users.


Where Online Friends Meet: Social Communities in Location-Based Networks

AAAI Conferences

Recent research suggests that, as in offline scenarios, spatial proximity increases the likelihood that two individuals establish an online social connection, and geographic closeness could therefore influence the formation of online communities. In this work we present a study of communities in two online social networks with location-sharing features and analyze their social and spatial properties. We study the places users visit to understand whether communities revolve around places or whether they exist independently. Our results suggest that community structure in social networks may arise from both social and spatial factors, so that exploiting information about the places where people go could benefit the definition of new community detection methods and community evolution models.