AITopics

1504.04114

Country:

Asia > Middle East > Israel (0.15)
North America > United States (0.14)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology > Services (0.69)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)

arXiv.org Artificial Intelligence Apr-14-2015

Inferring Social Status and Rich Club Effects in Enterprise Communication Networks

Dong, Yuxiao, Tang, Jie, Chawla, Nitesh, Lou, Tiancheng, Yang, Yang, Wang, Bai

Social status, defined as the relative rank or position that an individual holds in a social hierarchy, is known to be among the most important motivating forces in social behaviors. In this paper, we consider the notion of status from the perspective of a position or title held by a person in an enterprise. We study the intersection of social status and social networks in an enterprise. We study whether enterprise communication logs can help reveal how social interactions and individual status manifest themselves in social networks. To that end, we use two enterprise datasets with three communication channels --- voice call, short message, and email --- to demonstrate the social-behavioral differences among individuals with different status. We have several interesting findings and based on these findings we also develop a model to predict social status. On the individual level, high-status individuals are more likely to be spanned as structural holes by linking to people in parts of the enterprise networks that are otherwise not well connected to one another. On the community level, the principle of homophily, social balance and clique theory generally indicate a "rich club" maintained by high-status individuals, in the sense that this community is much more connected, balanced and dense. Our model can predict social status of individuals with 93% accuracy.
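The "rich club" observation can be read as a density comparison between the subgraph of high-status people and the rest of the network. The following is a minimal sketch of that comparison, not the authors' method: the edge list, the `status` mapping and the function name `subgraph_density` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def subgraph_density(edges, nodes):
    """Fraction of possible undirected pairs among `nodes` that are linked."""
    nodes = sorted(set(nodes))
    if len(nodes) < 2:
        return 0.0
    edge_set = {frozenset(e) for e in edges}
    possible = len(nodes) * (len(nodes) - 1) // 2
    present = sum(
        1 for u, v in combinations(nodes, 2) if frozenset((u, v)) in edge_set
    )
    return present / possible

# Hypothetical toy communication graph and status labels (not from the paper).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
status = {"a": "high", "b": "high", "c": "high",
          "d": "low", "e": "low", "f": "low"}

high = [n for n, s in status.items() if s == "high"]
low = [n for n, s in status.items() if s == "low"]

high_density = subgraph_density(edges, high)
low_density = subgraph_density(edges, low)
print(high_density, low_density)
```

In this toy graph the high-status group is fully connected while the low-status group is sparse, which is the kind of gap the abstract describes as a rich club.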

social media, social status, telecommunications


doi: 10.1371/journal.pone.0119446

1404.3708

Country: North America > United States > New York (0.14)

Industry:

Telecommunications (1.00)
Government > Military (0.68)
Information Technology > Services (0.57)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kailkhura, Bhavya, Brahma, Swastik, Varshney, Pramod K.

Consensus based Detection in the Presence of Data Falsification Attacks

arXiv.org Machine Learning Apr-13-2015

This paper considers the problem of detection in distributed networks in the presence of data falsification (Byzantine) attacks. Detection approaches considered in the paper are based on fully distributed consensus algorithms, where all of the nodes exchange information only with their neighbors in the absence of a fusion center. In such networks, we characterize the negative effect of Byzantines on the steady-state and transient detection performance of the conventional consensus based detection algorithms. To address this issue, we study the problem from the network designer's perspective. More specifically, we first propose a distributed weighted average consensus algorithm that is robust to Byzantine attacks. We show that, under reasonable assumptions, the global test statistic for detection can be computed locally at each node using our proposed consensus algorithm. We exploit the statistical distribution of the nodes' data to devise techniques for mitigating the influence of data falsifying Byzantines on the distributed detection system. Since some parameters of the statistical distribution of the nodes' data might not be known a priori, we propose learning based techniques to enable an adaptive design of the local fusion or update rules.
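To make the setting concrete, here is a minimal sketch of conventional (unweighted) average consensus without a fusion center, and of how a single falsified value shifts every node's decision. It is not the robust weighted algorithm proposed in the paper. The ring topology, step size, threshold and the simple "compare the average to a threshold" test are all assumptions made for illustration.

```python
def average_consensus(values, neighbors, step, iters):
    """Each node repeatedly moves toward its neighbors' current states."""
    x = list(values)
    for _ in range(iters):
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x

# Hypothetical ring of five nodes; each node talks only to two neighbors.
neighbors = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
threshold = 1.5  # assumed detection threshold on the network-wide average

# Honest local measurements (illustrative numbers).
honest = [1.0, 1.2, 0.8, 1.1, 0.9]
states = average_consensus(honest, neighbors, step=0.3, iters=200)
decisions = [s > threshold for s in states]

# One Byzantine node inflates its initial value before consensus starts.
falsified = list(honest)
falsified[0] += 5.0
attacked_states = average_consensus(falsified, neighbors, step=0.3, iters=200)
attacked_decisions = [s > threshold for s in attacked_states]

print(decisions, attacked_decisions)
```

Because the update preserves the sum of the states on a symmetric graph, every node converges to the average of the initial values, so one falsified input is enough to move all local decisions. That is the vulnerability a robust design has to address.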

artificial intelligence, machine learning, node

doi: 10.1109/TSIPN.2016.2607119

1504.03413

Country: North America > United States > New York (0.14)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kim, Do-kyum, Voelker, Geoffrey M., Saul, Lawrence K.

Topic Modeling of Hierarchical Corpora

arXiv.org Machine Learning Apr-13-2015

We study the problem of topic modeling in corpora whose documents are organized in a multi-level hierarchy. We explore a parametric approach to this problem, assuming that the number of topics is known or can be estimated by cross-validation. The models we consider can be viewed as special (finite-dimensional) instances of hierarchical Dirichlet processes (HDPs). For these models we show that there exists a simple variational approximation for probabilistic inference. The approximation relies on a previously unexploited inequality that handles the conditional dependence between Dirichlet latent variables in adjacent levels of the model's hierarchy. We compare our approach to existing implementations of nonparametric HDPs. On several benchmarks we find that our approach is faster than Gibbs sampling and able to learn more predictive models than existing variational methods. Finally, we demonstrate the large-scale viability of our approach on two newly available corpora from researchers in computer security---one with 350,000 documents and over 6,000 internal subcategories, the other with a five-level deep hierarchy.

bayesian inference, corpora, social media

1409.3518

Country: North America > United States > California > San Diego County (0.14)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)

Wang, Yu-Xiang, Fienberg, Stephen E., Smola, Alex

Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

arXiv.org Machine Learning Apr-11-2015

We consider the problem of Bayesian learning on sensitive datasets and present two simple but somewhat surprising results that connect Bayesian learning to "differential privacy", a cryptographic approach to protect individual-level privacy while permitting database-level utility. Specifically, we show that, under standard assumptions, getting one single sample from a posterior distribution is differentially private "for free". We will see that this estimator is statistically consistent, near optimal and computationally tractable whenever the Bayesian model of interest is consistent, optimal and tractable. Similarly but separately, we show that a recent line of works that use stochastic gradient for Hybrid Monte Carlo (HMC) sampling also preserve differential privacy with minor or no modifications of the algorithmic procedure at all. These observations lead to an "anytime" algorithm for Bayesian learning under privacy constraint. We demonstrate that it performs much better than the state-of-the-art differentially private methods on synthetic and real datasets.
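The "one posterior sample" idea can be illustrated with a conjugate model, where the posterior is available in closed form. The sketch below assumes a Beta-Bernoulli model with a uniform prior; the function name, prior parameters and data are hypothetical. It only shows the release mechanism (publish a single draw rather than a point estimate) and does not check the assumptions under which the paper's privacy guarantee holds.

```python
import random

def one_posterior_sample(data, alpha=1.0, beta=1.0, rng=random):
    """Release a single draw from the Beta posterior of a Bernoulli rate.

    `data` is a list of 0/1 observations; `alpha` and `beta` are the
    (assumed) Beta prior parameters.
    """
    successes = sum(data)
    failures = len(data) - successes
    return rng.betavariate(alpha + successes, beta + failures)

# Hypothetical sensitive records: one bit per individual.
rng = random.Random(0)  # fixed seed only to make the example repeatable
data = [1, 0, 1, 1, 0, 1, 1, 1]

released = one_posterior_sample(data, rng=rng)
print(released)
```

The released value is random by construction, so changing one individual's record changes the output distribution only slightly. Whether that change is small enough to meet a given privacy budget depends on conditions stated in the paper, not on this sketch.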

artificial intelligence, bayesian inference, machine learning

1502.07645

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning Apr-4-2015

Sync-Rank: Robust Ranking, Constrained Ranking and Rank Aggregation via Eigenvector and Semidefinite Programming Synchronization

Cucuringu, Mihai

We consider the classic problem of establishing a statistical ranking of a set of n items given a set of inconsistent and incomplete pairwise comparisons between such items. Instantiations of this problem occur in numerous applications in data analysis (e.g., ranking teams in sports data), computer vision, and machine learning. We formulate the above problem of ranking with incomplete noisy information as an instance of the group synchronization problem over the group SO(2) of planar rotations, whose usefulness has been demonstrated in numerous applications in recent years. Its least squares solution can be approximated by either a spectral or a semidefinite programming (SDP) relaxation, followed by a rounding procedure. We perform extensive numerical simulations on both synthetic and real-world data sets, showing that our proposed method compares favorably to other algorithms from the recent literature. Existing theoretical guarantees on the group synchronization problem imply lower bounds on the largest amount of noise permissible in the ranking data while still achieving exact recovery. We propose a similar synchronization-based algorithm for the rank-aggregation problem, which integrates in a globally consistent ranking pairwise comparisons given by different rating systems on the same set of items. We also discuss the problem of semi-supervised ranking when there is available information on the ground truth rank of a subset of players, and propose an algorithm based on SDP which recovers the ranks of the remaining players. Finally, synchronization-based ranking, combined with a spectral technique for the densest subgraph problem, allows one to extract locally-consistent partial rankings, in other words, to identify the rank of a small subset of players whose pairwise comparisons are less noisy than the rest of the data, which other methods are not able to identify.
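The spectral route described above can be sketched in a few lines: map each pairwise offset to a planar rotation, take the top eigenvector of the resulting Hermitian matrix, and read an ordering off the recovered angles. This is a simplified illustration under stated assumptions, not the paper's implementation. The angle scaling, the power iteration, and the "cut the circle at its largest gap" rounding step are choices made here for brevity, and `sync_rank_sketch` is a hypothetical name.

```python
import cmath
import math
import random

def sync_rank_sketch(offsets, iters=300, seed=0):
    """Order items (lowest score first) from pairwise score offsets.

    offsets[i][j] is the measured difference score_i - score_j, or None
    when the pair was not compared. Offsets are assumed skew-symmetric.
    """
    n = len(offsets)
    # Map offsets to rotations; dividing by (n - 1) keeps item angles
    # within half a circle when scores span a range of at most n - 1.
    H = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and offsets[i][j] is not None:
                H[i][j] = cmath.exp(1j * math.pi * offsets[i][j] / (n - 1))
    # Power iteration on H + n*I; the shift makes the matrix positive
    # definite, so the iteration converges to the top eigenvector of H.
    rng = random.Random(seed)
    v = [complex(rng.random(), rng.random()) for _ in range(n)]
    for _ in range(iters):
        w = [n * v[i] + sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in w))
        v = [x / norm for x in w]
    # Rounding: sort by angle, then cut the circle at its largest gap to
    # remove the unknown global rotation.
    two_pi = 2 * math.pi
    angles = sorted((cmath.phase(v[i]) % two_pi, i) for i in range(n))
    gaps = [(angles[(k + 1) % n][0] - angles[k][0]) % two_pi for k in range(n)]
    start = (gaps.index(max(gaps)) + 1) % n
    return [angles[(start + k) % n][1] for k in range(n)]

# Noiseless toy example with hypothetical scores for four items.
scores = [2, 0, 3, 1]
offsets = [[scores[i] - scores[j] for j in range(4)] for i in range(4)]
order = sync_rank_sketch(offsets)
print(order)
```

With noisy or incomplete comparisons the largest-gap cut is only a heuristic; the abstract mentions a rounding procedure and an SDP relaxation as alternatives that this sketch does not cover.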

constraint-based reasoning, matrix, soccer

1504.0107

Country:

North America > United States (0.67)
Europe > United Kingdom > England (0.46)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Leisure & Entertainment > Games > Computer Games (0.92)
Banking & Finance (0.67)
Information Technology > Services (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Information Management > Search (0.92)

Maity, Suman Kalyan (Indian Institute of Technology Kharagpur) | Gupta, Abhishek (Indian Institute of Technology Kharagpur) | Goyal, Pawan (Indian Institute of Technology Kharagpur) | Mukherjee, Animesh (Indian Institute of Technology Kharagpur)

A Stratified Learning Approach for Predicting the Popularity of Twitter Idioms

Twitter Idioms are one of the important types of hashtags that spread in Twitter. In this paper, we propose a classifier that can stratify the Idioms from other kinds of hashtags with 86.93% accuracy and high precision and recall. We then learn regression models on the stratified samples (Idioms and non-Idioms) separately to predict the popularity of the Idioms. This stratification not only itself allows us to make more accurate predictions but also makes it possible to include Idiom-specific features to separately improve the accuracy for the Idioms. Experimental results show that such stratification during the training phase followed by inclusion of Idiom-specific features leads to an overall improvement of 11.13% and 19.56% in correlation coefficient over the baseline method after the 7th and the 11th month respectively.
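The two-stage idea (first assign each hashtag to a stratum, then fit one regressor per stratum) can be sketched as follows. This is a minimal illustration, assuming the stratum label is already available from a classifier and using a one-feature least-squares line as the regressor. The feature, the numbers and all function names are hypothetical and do not come from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_stratified(samples):
    """Fit one line per stratum; samples are (stratum, x, y) triples."""
    grouped = {}
    for stratum, x, y in samples:
        xs, ys = grouped.setdefault(stratum, ([], []))
        xs.append(x)
        ys.append(y)
    return {s: fit_line(xs, ys) for s, (xs, ys) in grouped.items()}

def predict(models, stratum, x):
    slope, intercept = models[stratum]
    return slope * x + intercept

# Hypothetical training data: (predicted stratum, early usage, later popularity).
samples = [
    ("idiom", 1, 2.0), ("idiom", 2, 4.0), ("idiom", 3, 6.0),
    ("other", 1, 1.0), ("other", 2, 1.5), ("other", 3, 2.0),
]
models = fit_stratified(samples)
print(predict(models, "idiom", 4), predict(models, "other", 4))
```

Keeping the models separate lets each stratum have its own growth pattern, and it leaves room for stratum-specific features, which is the motivation the abstract gives for stratifying.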

idiom, stratified learning approach

Ninth International AAAI Conference on Web and Social Media

Industry: Information Technology > Services (0.40)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

On the k-Anonymization of Time-Varying and Multi-Layer Social Graphs

Rossi, Luca (University of Birmingham) | Musolesi, Mirco (University of Birmingham) | Torsello, Andrea (Università Ca')

time-varying and multi-layer social graph

The popularity of online social media platforms provides an unprecedented opportunity to study real-world complex networks of interactions. However, releasing this data to researchers and the public comes at the cost of potentially exposing private and sensitive user information. It has been shown that a naive anonymization of a network by removing the identity of the nodes is not sufficient to preserve users' privacy. In order to deal with malicious attacks, k-anonymity solutions have been proposed to partially obfuscate topological information that can be used to infer nodes' identity. In this paper, we study the problem of ensuring k-anonymity in time-varying graphs, i.e., graphs with a structure that changes over time, and multi-layer graphs, i.e., graphs with multiple types of links. More specifically, we examine the case in which the attacker has access to the degree of the nodes. The goal is to generate a new graph where, given the degree of a node in each (temporal) layer of the graph, such a node remains indistinguishable from other k-1 nodes in the graph. In order to achieve this, we find the optimal partitioning of the graph nodes such that the cost of anonymizing the degree information within each group is minimum. We show that this reduces to a special case of a Generalized Assignment Problem, and we propose a simple yet effective algorithm to solve it. Finally, we introduce an iterated linear programming approach to enforce the realizability of the anonymized degree sequences. The efficacy of the method is assessed through an extensive set of experiments on synthetic and real-world graphs.
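The anonymity condition itself is easy to state in code: a node is protected when at least k nodes, itself included, share its per-layer degree vector. The sketch below only checks that condition and reports the nodes that violate it. It does not implement the partitioning, the Generalized Assignment formulation or the realizability step. Node names, degree vectors and function names are hypothetical.

```python
from collections import Counter

def violating_nodes(degree_vectors, k):
    """Nodes whose per-layer degree vector is shared by fewer than k nodes."""
    counts = Counter(degree_vectors.values())
    return sorted(n for n, v in degree_vectors.items() if counts[v] < k)

def is_k_degree_anonymous(degree_vectors, k):
    return not violating_nodes(degree_vectors, k)

# Hypothetical two-layer graph: each tuple is (degree in layer 1, layer 2).
degree_vectors = {
    "a": (2, 1), "b": (2, 1),
    "c": (3, 0), "d": (3, 0),
    "e": (1, 1),
}
exposed = violating_nodes(degree_vectors, k=2)
print(exposed, is_k_degree_anonymous(degree_vectors, k=2))

# After (hypothetically) editing the graph so that "e" matches "a" and "b".
anonymized = dict(degree_vectors, e=(2, 1))
print(is_k_degree_anonymous(anonymized, k=2))
```

An attacker who knows the degree of "e" in both layers can single it out in the first graph but not in the second, where it is one of three candidates.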

Ninth International AAAI Conference on Web and Social Media

Industry: Information Technology > Security & Privacy (0.53)

Technology:

Information Technology > Security & Privacy (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)

An Image Is Worth More than a Thousand Favorites: Surfacing the Hidden Beauty of Flickr Pictures

Schifanella, Rossano (University of Turin) | Redi, Miriam (Yahoo Labs) | Aiello, Luca Maria (Yahoo Labs)

The dynamics of attention in social media tend to obey power laws. Attention concentrates on a relatively small number of popular items and neglects the vast majority of content produced by the crowd. Although popularity can be an indication of the perceived value of an item within its community, previous research has hinted at the fact that popularity is distinct from intrinsic quality. As a result, content with low visibility but high quality lurks in the tail of the popularity distribution. This phenomenon can be particularly evident in the case of photo-sharing communities, where valuable photographers who are not highly engaged in online social interactions contribute with high-quality pictures that remain unseen. We propose to use a computer vision method to surface beautiful pictures from the immense pool of near-zero-popularity items, and we test it on a large dataset of creative-commons photos on Flickr. By gathering a large crowdsourced ground truth of aesthetics scores for Flickr images, we show that our method retrieves photos whose median perceived beauty score is equal to the most popular ones, and whose average is lower by only 1.5%.

flickr picture, hidden beauty, surfacing

Ninth International AAAI Conference on Web and Social Media

Industry: Information Technology > Services (0.80)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (0.87)

Analyzing and Modeling Special Offer Campaigns in Location-Based Social Networks

Zhang, Ke (University of Pittsburgh) | Pelechrinis, Konstantinos (University of Pittsburgh) | Lappas, Theodoros (Stevens Institute of Technology)

analyzing and modeling, location-based social network, modeling special offer campaign

The proliferation of mobile handheld devices in combination with the technological advancements in mobile computing has led to a number of innovative services that make use of the location information available on such devices. Traditional yellow pages websites have now moved to mobile platforms, giving local businesses and potential nearby customers the opportunity to connect. These platforms can offer an affordable advertisement channel to local businesses. One of the mechanisms offered by location-based social networks (LBSNs) allows businesses to provide special offers to their customers that connect through the platform. We collect a large time-series dataset from approximately 14 million venues on Foursquare and analyze the performance of such campaigns using randomization techniques and (non-parametric) hypothesis testing with statistical bootstrapping. Our main finding indicates that this type of promotion is not as effective as anecdotal success stories might suggest. Finally, we design classifiers by extracting three different types of features that are able to provide an educated decision on whether a special offer campaign for a local business will succeed or not both in short and long term.
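A bootstrap test of the kind mentioned above can be sketched as follows. This is a generic one-sided test for a difference in means, assuming daily check-in counts before and during an offer. It is not the authors' exact procedure, and the data, the resample count and the function name are hypothetical.

```python
import random
from statistics import mean

def bootstrap_p_value(before, during, n_resamples=2000, seed=0):
    """One-sided bootstrap p-value for mean(during) > mean(before).

    Resamples both groups from the pooled data, which simulates the null
    hypothesis that the offer made no difference.
    """
    rng = random.Random(seed)
    observed = mean(during) - mean(before)
    pooled = list(before) + list(during)
    at_least_as_large = 0
    for _ in range(n_resamples):
        b = [rng.choice(pooled) for _ in before]
        d = [rng.choice(pooled) for _ in during]
        if mean(d) - mean(b) >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_large + 1) / (n_resamples + 1)

# Hypothetical daily check-ins for one venue.
before = [10, 12, 11, 9, 10, 11, 12, 10]
during_flat = [11, 10, 12, 9, 11, 10, 10, 12]   # same mean as `before`
during_boost = [x + 10 for x in before]          # a clear increase

p_flat = bootstrap_p_value(before, during_flat)
p_boost = bootstrap_p_value(before, during_boost)
print(p_flat, p_boost)
```

A large p-value for a campaign means the observed change is within what random variation alone would produce, which is the sense in which an offer can look successful anecdotally yet fail a formal test.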

Ninth International AAAI Conference on Web and Social Media

Industry:

Retail > Catalog (0.80)
Information Technology > Services (0.60)

Technology:

Information Technology > Communications > Social Media (0.60)
Information Technology > Communications > Mobile (0.53)
Information Technology > Artificial Intelligence > Machine Learning (0.53)