Stony Brook University
HARP: Hierarchical Representation Learning for Networks
Chen, Haochen (Stony Brook University) | Perozzi, Bryan (Google Research) | Hu, Yifan (Yahoo! Research) | Skiena, Steven (Stony Brook University)
We present HARP, a novel method for learning low dimensional embeddings of a graphโs nodes which preserves higher-order structural features. Our proposed method achieves this by compressing the input graph prior to embedding it, effectively avoiding troublesome embedding configurations (i.e. local minima) which can pose problems to non-convex optimization. HARP works by finding a smaller graph which approximates the global structure of its input. This simplified graph is used to learn a set of initial representations, which serve as good initializations for learning representations in the original, detailed graph. We inductively extend this idea, by decomposing a graph in a series of levels, and then embed the hierarchy of graphs from the coarsest one to the original graph. HARP is a general meta-strategy to improve all of the state-of-the-art neural algorithms for embedding graphs, including DeepWalk, LINE, and Node2vec. Indeed, we demonstrate that applying HARPโs hierarchical paradigm yields improved implementations for all three of these methods, as evaluated on classification tasks on real-world graphs such as DBLP, BlogCatalog, and CiteSeer, where we achieve a performance gain over the original implementations by up to 14% Macro F1.
Event Representations With Tensor-Based Compositions
Weber, Noah (Stony Brook University) | Balasubramanian, Niranjan (Stony Brook University) | Chambers, Nathanael (United States Naval Academy)
Robust and flexible event representations are important to many core areas in language understanding. Scripts were proposed early on as a way of representing sequences of events for such understanding, and has recently attracted renewed attention. However, obtaining effective representations for modeling script-like event sequences is challenging. It requires representations that can capture event-level and scenario-level semantics. We propose a new tensor-based composition method for creating event representations. The method captures more subtle semantic interactions between an event and its entities and yields representations that are effective at multiple event-related tasks. With the continuous representations, we also devise a simple schema generation method which produces better schemas compared to a prior discrete representation based method. Our analysis shows that the tensors capture distinct usages of a predicate even when there are only subtle differences in their surface realizations.
RuleML (Web Rule Symposium) 2016 Report
Foder, Paul (Stony Brook University) | Governatori, Guido (data61) | Alfers, Josรฉ Jรบlio (Universidade Nova de Lisboa) | Bertossi, Leopoldo (Carleton University)
Moreover, 2 keynote and 2 tutorial papers were invited. Most regular papers were presented in one of these tracks: Smart Contracts, Blockchain, and Rules, Constraint Handling Rules, Event Driven Architectures and Active Database Systems, Legal Rules and Reasoning, Rule-and Ontology-Based Data Access and Transformation, Rule Induction, and Learning. Following up on previous years, RuleML also hosted the 6th RuleML Doctoral Consortium and the 10th International Rule Challenge, which this year was dedicated to applications of rule-based reasoning, such as Rules in Retail, Rules in Tourism, Rules in Transportation, Rules in Geography, Rules in Location-Based Search, Rules in Insurance Regulation, Rules in Medicine, and Rules in Ecosystem Research. The 10th International Rule Challenge Awards went to Ingmar Dasseville, Laurent Janssens, Gerda Janssens, Jan Vanthienen, and Marc Denecker, for their paper Combining DMN and the Knowledge Base Paradigm for Flexible Decision Enactment, and Jacob Feldman for his paper What-If Analyzer for DMN-based Decision Models. As in previous years, RuleML 2016 was also a place for presentations and face-to-face meetings about rule technology standardizations, which this year Mark Your Calendars!
Engineering Agreement: The Naming Game with Asymmetric and Heterogeneous Agents
Gao, Jie (Stony Brook University) | Li, Bo (University of Michigan) | Schoenebeck, Grant (University of Michgan) | Yu, Fang-Yi (University of Michigan)
Being popular in language evolution, cognitive science, and culture dynamics, the Naming Game has been widely used to analyze how agents reach global consensus via communications in multi-agent systems. Most prior work considered networks that are symmetric and homogeneous (e.g., vertex transitive). In this paper we consider asymmetric or heterogeneous settings that complement the current literature: 1) we show that increasing asymmetry in network topology can improve convergence rates. The star graph empirically converges faster than all previously studied graphs; 2) we consider graph topologies that are particularly challenging for naming game such as disjoint cliques or multi-level trees and ask how much extra homogeneity (random edges) is required to allow convergence or fast convergence. We provided theoretical analysis which was confirmed by simulations; 3) we analyze how consensus can be manipulated when stubborn nodes are introduced at different points of the process. Early introduction of stubborn nodes can easily influence the outcome in certain family of networks while late introduction of stubborn nodes has much less power.
Freshman or Fresher? Quantifying the Geographic Variation of Language in Online Social Media
Kulkarni, Vivek (Stony Brook University) | Perozzi, Bryan (Stony Brook University) | Skiena, Steven (Stony Brook University)
In this paper we present a new computational technique to detect and analyze statistically significant geographic variation in language. While previous approaches have primarily focused on lexical variation between regions, our method identifies words that demonstrate semantic and syntactic variation as well. We extend recently developed techniques for neural language models to learn word representations which capture differing semantics across geographical regions. In order to quantify this variation and ensure robust detection of true regional differences, we formulate a null model to determine whether observed changes are statistically significant. Our method is the first such approach to explicitly account for random variation due to chance while detecting regional variation in word meaning. To validate our model, we study and analyze two different massive online data sets: millions of tweets from Twitter as well as millions of phrases contained in the Google Book Ngrams. Our analysis reveals interesting facets of language change across countries.
Computing Nash Equilibrium in Interdependent Defense Games
Chan, Hau (Stony Brook University) | Ortiz, Luis (Stony Brook University)
Roughly speaking, Interdependent Defense (IDD) games, previously proposed, model the situation where an attacker wants to cause as much damage as possible to a network by attacking one of the sites in the network. Each site must make an investment decision regarding security to protect itself against a direct or indirect attack, the latter due to potential transfer-risk from an unprotected neighboring site. The work introducing IDD games discusses potential applications to model the essence of real-world scenarios such as the 2006 transatlantic aircraft plot. In this paper, our focus is the study of the problem of computing a Nash Equilibrium (NE) in IDD games. We show that an efficient algorithm to determine whether some attackerโs strategy can be a part of a NE in an instance of IDD games is unlikely to exist. Yet, we provide a dynamic programming algorithm to compute an approximate NE when the graph/network structure of the game is a directed tree with a single source, and show that it is an FPTAS. We also introduce an improved heuristic to compute an approximate NE on arbitrary graph structures. Our experiments show that our heuristic is more efficient, and provides better approximations, than best-response-gradient dynamics for the case of Internet games, a class of games introduced and studied in the original work on IDD games.
Refer-to-as Relations as Semantic Knowledge
Feng, Song (IBM T.J. Watson Research Center / Stony Brook University) | Ravi, Sujith (Google) | Kumar, Ravi (Google) | Kuznetsova, Polina (Stony Brook University) | Liu, Wei (University of North Carolina at Chapel Hill) | Berg, Alexander C. (University of North Carolina at Chapel Hill) | Berg, Tamara L. (University of North Carolina at Chapel Hill) | Choi, Yejin (University of Washington)
We study Refer-to-as relations as a new type of semanticknowledge. Compared to the much studied Is-a relation,which concerns factual taxonomy knowledge, Refer-to-as relationsaim to address pragmatic semantic knowledge. Forexample, a โpenguinโ is a โbirdโ from a taxonomy point ofview, but people rarely refer to a โpenguinโ as a โbirdโ invernacular use. This observation closely relates to the entrylevelcategorization studied in Prototype Theory in Psychology.We posit that Refer-to-as relations can be learned fromdata, and that both textual and visual information would behelpful in inferring the relations. By integrating existing lexicalstructure knowledge with language statistics and visualsimilarities, we formulate a collective inference approach tomap all object names in an encyclopedia to commonly usednames for each object. Our contributions include a new labeleddata set, the inference and optimization approach, andthe computed mappings and similarities.
Distributional Footprints of Deceptive Product Reviews
Feng, Song (Stony Brook University) | Xing, Longfei (Stony Brook University) | Gogar, Anupam (Stony Brook University) | Choi, Yejin (Stony Brook University)
This paper postulates that there are natural distributions of opinions in product reviews. In particular, we hypothesize that for a given domain, there is a set of representative distributions of review rating scores. A deceptive business entity that hires people to write fake reviews will necessarily distort its distribution of review scores, leaving distributional footprints behind. In order to validate this hypothesis, we introduce strategies to create dataset with pseudo-gold standard that is labeled automatically based on different types of distributional footprints. A range of experiments confirm the hypothesized connection between the distributional anomaly and deceptive reviews. This study also provides novel quantitative insights into the characteristics of natural distributions of opinions in the TripAdvisor hotel review and the Amazon product review domains.
A Game-Theoretic Approach to Influence in Networks
Irfan, Mohammad Tanvir (Stony Brook University) | Ortiz, Luis Enrique (Stony Brook University)
We propose influence games, a new class of graphical games, as a model of the behavior of large but finite networked populations. Grounded in non-cooperative game theory, we introduce a new approach to the study of influence in networks that captures the strategic aspects of complex interactions in the network. We study computational problems on influence games, including the identification of the most influential nodes. We characterize the computational complexity of various problems in influence games, propose several heuristics for the hard cases, and design approximation algorithms, with provable guarantees, for the most influential nodes problem.
Robustness Across the Structure of Sub-Networks: The Contrast Between Infection and Information Dynamics
Grim, Patrick (Stony Brook University) | Reade, Christopher (University of Michigan) | Singer, Daniel J. (University of Michigan) | Fisher, Steven (University of Michigan) | Majewicz, Stephen (Kingsborough Community College)
In this paper we make a simple theoretical point using a practical issue as an example. The simple theoretical point is that robustness is not 'all or nothing': in asking whether a system is robust one has to ask 'robust with respect to what property?' and 'robust over what set of changes in the system?' The practical issue used to illustrate the point is an examination of degrees of linkage between sub-networks and a pointed contrast in robustness and fragility between the dynamics of (1) contact infection and (2) information transfer or belief change. Time to infection across linked sub-networks, it turns out, is fairly robust with regard to the degree of linkage between them. Time to infection is fragile and sensitive, however, with regard to the type of sub-network involved: total, ring, small world, random, or scale-free. Aspects of robustness and fragility are reversed where it is belief updating with reinforcement rather than infection that is at issue. In information dynamics, the pattern of time to consensus is robust across changes in network type but remarkably fragile with respect to degree of linkage between sub-networks. These results have important implications for public health interventions in realistic social networks, particularly with an eye to ethnic and socio-economic sub-communities, and in social networks with sub-communities changing in structure or linkage.