Goto

Collaborating Authors

 Oceania


Towards Discovery of Influence and Personality Traits through Social Link Prediction

AAAI Conferences

Estimation of a person's influence and personality traits from social media data has many applications. We use social linkage criteria, such as number of followers and friends, as proxies to form corpora, from popular blogging site Livejournal, for examining two two-class classification problems: influential vs. non-influential, and extraversion vs. introversion. Classification is performed using automatically-derived psycholinguistic and mood-based features of a user's textual messages. We experiment with three sub-corpora of 10000 users each, and present the most effective predictors for each category. The best classification result, at 80%, is achieved using psycholinguistic features; e.g., influentials are found to use more complex language, than non-influentials, and use more leisure-related terms.


Dimensions of Self-Expression in Facebook Status Updates

AAAI Conferences

We describe the dimensions along which Facebook users tend to express themselves via status updates using the semi-automated text analysis approach, the Meaning Extraction Method (MEM). First, we examined dimensions of self-expression in all status updates from a sample of four million Facebook users from four English-speaking countries (the United States, Canada, the United Kingdom, and Australia) in order to examine how these countries vary in their self-expressions. All four countries showed a basic three-component structure, indicating that the medium is a stronger influence than country characteristics or demographics on how people use Facebook status updates. In each country, people vary in terms of the extent to which they use Informal Speech, share Positive Events, and discuss School in their Facebook status updates. Together, these factors tell us how users differ in their self-expression, and thus illustrate meaningful use cases for the product: Talking about whatโ€™s going on tends to be positive, and people vary in terms of the extent to which their status updates are short, slangy emotional expressions and topics regarding school. The specific words that define these factors showed subtle differences across countries: The use of profanity indicates fewer school words (but only in Australia), whereas the UK shows greater use of slang terms (rather than profanity) when speaking informally. The MEM also identified English-language dialects as a meaningful dimension along which the countries varied. In sum, beyond simply indicating topicality of posts, this study provides insight into how status updates are used for self-expression. We discuss several theoretical frameworks that could produce these results, and more broadly discuss the generation of theoretical frameworks from wholly empirical data (such as naturalistic Internet speech) using the MEM.


Reconstruction of Threaded Conversations in Online Discussion Forums

AAAI Conferences

Online discussion boards, or Internet forums, are a signi๏ฌcant part of the Internet. People use Internet forums to post questions, provide advice and participate in discussions. These online conversations are represented as threads, and the conversation trees within these threads are important in understanding the behaviour of online users. Unfortunately, the reply structures of these threads are generally not publicly accessible or not maintained. Hence, in this paper, we introduce an ef๏ฌcient and simple approach to reconstruct the reply structure in threaded conversations. We contrast its accuracy against three baseline algorithms, and show that our algorithm can accurately recreate the in and out degree distributions of forum reply graphs built from the reconstructed reply structures.


Trust Amongst Rogues? A Hypergraph Approach for Comparing Clandestine Trust Networks in MMOGs

AAAI Conferences

Gold farming and real money trade refer to a set of illicit practices in massively multiplayer online games (MMOGs) whereby players accumulate virtual resources to sell for โ€œreal worldโ€ money. Prior work has examined trade relationships formed by gold farmers but not the trust relationships which exist between members of these organizations. We adopt a hypergraph approach to model the multi-modal relationships of gold farmers granting other players permission to use and modify objects they own. We argue these permissions reflect underlying trust relationships which can be analyzed using network analysis methods. We compare farmersโ€™ trust networks to the trust networks of both unidentified farmers and typical players. Our results demonstrate that gold farmersโ€™ networks are different from trust networks of normal players whereby farmers trust highly-central non-farmer players but not each other. These findings have implications for augmenting detection methods and re-evaluating theories of clandestine behavior.


NPCEditor: Creating Virtual Human Dialogue Using Information Retrieval Techniques

AI Magazine

See Leuski et al. (2006) and to the same question -- for example, "What Leuski and Traum (2008) for more details. is your name?" -- depending on who the interactor The final parameter is the classification threshold is looking at. NPCEditor's user interface allows the on the KL-divergence value: only answers that designer to define arbitrary annotation classes or score above the threshold value are returned from categories and specify which of these annotation the classifier. The threshold is determined by tuning categories should be used in classification.


On Improving the Quality of Solutions in Large-Scale Cooperative Multi-Agent Pathfinding

AAAI Conferences

Scaling up the number of simultaneously moving units in pathfinding problems to hundreds, or even thousands, is well beyond the capability of theoretically optimal algorithms in practice, which is consistent with existing intractability results. Significant scalability can be achieved by trading off solution optimality, which motivates evaluating the quality of suboptimal solutions, especially in instances much larger than can be handled by optimal algorithms. We consider pathfinding in uniform cost grid maps, and we study the solution quality using the three most common quality criteria, total travel distance , sum of actions , and makespan . We focus on MAPP, which has been shown as state-of-the-art in terms of scalability and success ratio (i.e., percentage of solved units) on realistic game grid maps. Until now, the quality of MAPP's solutions had not been as extensively analyzed. We introduced enhancements that significantly improve MAPP's solution quality. For example, its sum of actions is cut to half on average. MAPP becomes competitive in terms of solution quality with FAR and WHCA*, two successful algorithms from the literature. To evaluate the quality of suboptimal solutions in instances beyond the capability of optimal algorithms, we use lower bounds of optimal values to show our solutions have a reasonable quality. For example, MAPP's average total travel distance is 19 percent longer than the lower bound.


Sequential Diagnosis by Abstraction

Journal of Artificial Intelligence Research

When a system behaves abnormally, sequential diagnosis takes a sequence of measurements of the system until the faults causing the abnormality are identified, and the goal is to reduce the diagnostic cost, defined here as the number of measurements. To propose measurement points, previous work employs a heuristic based on reducing the entropy over a computed set of diagnoses. This approach generally has good performance in terms of diagnostic cost, but can fail to diagnose large systems when the set of diagnoses is too large. Focusing on a smaller set of probable diagnoses scales the approach but generally leads to increased average diagnostic costs. In this paper, we propose a new diagnostic framework employing four new techniques, which scales to much larger systems with good performance in terms of diagnostic cost. First, we propose a new heuristic for measurement point selection that can be computed efficiently, without requiring the set of diagnoses, once the system is modeled as a Bayesian network and compiled into a logical form known as d-DNNF. Second, we extend hierarchical diagnosis, a technique based on system abstraction from our previous work, to handle probabilities so that it can be applied to sequential diagnosis to allow larger systems to be diagnosed. Third, for the largest systems where even hierarchical diagnosis fails, we propose a novel method that converts the system into one that has a smaller abstraction and whose diagnoses form a superset of those of the original system; the new system can then be diagnosed and the result mapped back to the original system. Finally, we propose a novel cost estimation function which can be used to choose an abstraction of the system that is more likely to provide optimal average cost. Experiments with ISCAS-85 benchmark circuits indicate that our approach scales to all circuits in the suite except one that has a flat structure not susceptible to useful abstraction.


A Comparison of Lex Bounds for Multiset Variables in Constraint Programming

arXiv.org Artificial Intelligence

Set and multiset variables in constraint programming have typically been represented using subset bounds. However, this is a weak representation that neglects potentially useful information about a set such as its cardinality. For set variables, the length-lex (LL) representation successfully provides information about the length (cardinality) and position in the lexicographic ordering. For multiset variables, where elements can be repeated, we consider richer representations that take into account additional information. We study eight different representations in which we maintain bounds according to one of the eight different orderings: length- (co)lex (LL/LC), variety-(co)lex (VL/VC), length-variety- (co)lex (LVL/LVC), and variety-length-(co)lex (VLL/VLC) orderings. These representations integrate together information about the cardinality, variety (number of distinct elements in the multiset), and position in some total ordering. Theoretical and empirical comparisons of expressiveness and compactness of the eight representations suggest that length-variety-(co)lex (LVL/LVC) and variety-length-(co)lex (VLL/VLC) usually give tighter bounds after constraint propagation. We implement the eight representations and evaluate them against the subset bounds representation with cardinality and variety reasoning. Results demonstrate that they offer significantly better pruning and runtime.


Dominating Manipulations in Voting with Partial Information

arXiv.org Artificial Intelligence

We consider manipulation problems when the manipulator only has partial information about the votes of the nonmanipulators. Such partial information is described by an information set, which is the set of profiles of the nonmanipulators that are indistinguishable to the manipulator. Given such an information set, a dominating manipulation is a non-truthful vote that the manipulator can cast which makes the winner at least as preferable (and sometimes more preferable) as the winner when the manipulator votes truthfully. When the manipulator has full information, computing whether or not there exists a dominating manipulation is in P for many common voting rules (by known results). We show that when the manipulator has no information, there is no dominating manipulation for many common voting rules. When the manipulator's information is represented by partial orders and only a small portion of the preferences are unknown, computing a dominating manipulation is NP-hard for many common voting rules. Our results thus throw light on whether we can prevent strategic behavior by limiting information about the votes of other voters.


Manipulation of Nanson's and Baldwin's Rules

arXiv.org Artificial Intelligence

Nanson's and Baldwin's voting rules select a winner by successively eliminating candidates with low Borda scores. We show that these rules have a number of desirable computational properties. In particular, with unweighted votes, it is NP-hard to manipulate either rule with one manipulator, whilst with weighted votes, it is NP-hard to manipulate either rule with a small number of candidates and a coalition of manipulators. As only a couple of other voting rules are known to be NP-hard to manipulate with a single manipulator, Nanson's and Baldwin's rules appear to be particularly resistant to manipulation from a theoretical perspective. We also propose a number of approximation methods for manipulating these two rules. Experiments demonstrate that both rules are often difficult to manipulate in practice. These results suggest that elimination style voting rules deserve further study.