Materializing and Persisting Inferred and Uncertain Knowledge in RDF Datasets

AAAI Conferences

As the semantic web grows in popularity and enters the mainstream of computer technology, RDF (Resource Description Framework) datasets are becoming larger and more complex. Advanced semantic web ontologies are being developed, especially in medicine and science, and with them comes a growing need for efficient queries that handle inference. In areas such as research, it is vital to be able to perform queries that retrieve not just facts but also inferred knowledge and uncertain information. OWL (Web Ontology Language) defines rules that govern provable inference in semantic web datasets. In this paper, we detail a database schema using bit vectors that is designed specifically for RDF datasets. We introduce a framework for materializing and storing inferred triples. Our bit vector schema enables storage of inferred knowledge without a query performance penalty. Inference queries are simplified and performance is improved. Our evaluation results demonstrate that our inference solution is more scalable and efficient than the current state-of-the-art. Standards are also being developed for representing probabilistic reasoning within OWL ontologies. We specify a framework for materializing uncertain information and probabilities using these ontologies. We define a multiple vector schema for representing probabilities and classifying uncertain knowledge using thresholds. This solution increases the breadth of information that can be efficiently retrieved.
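The abstract does not spell out the schema, so the following is only a minimal sketch of how materializing inferred triples into bit vectors might look. It assumes one bit position per subject and one bit vector per (property, object) pair; the class `BitVectorStore`, its method names, and the `ex:` identifiers are hypothetical and are not taken from the paper.

```python
class BitVectorStore:
    """Hypothetical layout: one bit per subject, one vector per (property, object)."""

    def __init__(self):
        self.subject_ids = {}  # subject -> bit position
        self.vectors = {}      # (property, object) -> Python int used as a bit vector

    def _bit(self, subject):
        # Assign the next free bit position to a subject seen for the first time.
        return self.subject_ids.setdefault(subject, len(self.subject_ids))

    def add(self, subject, prop, obj):
        # Store an asserted triple by setting the subject's bit.
        key = (prop, obj)
        self.vectors[key] = self.vectors.get(key, 0) | (1 << self._bit(subject))

    def materialize_subclass(self, subclass, superclass):
        # rdfs:subClassOf inference: OR the subclass members into the superclass
        # vector, so inferred triples are stored exactly like asserted ones.
        members = self.vectors.get(("rdf:type", subclass), 0)
        key = ("rdf:type", superclass)
        self.vectors[key] = self.vectors.get(key, 0) | members

    def subjects(self, prop, obj):
        # Answer a query by reading one vector; no inference at query time.
        vec = self.vectors.get((prop, obj), 0)
        return sorted(s for s, i in self.subject_ids.items() if (vec >> i) & 1)


store = BitVectorStore()
store.add("ex:alice", "rdf:type", "ex:Professor")
store.add("ex:bob", "rdf:type", "ex:Student")
store.materialize_subclass("ex:Professor", "ex:Person")
store.materialize_subclass("ex:Student", "ex:Person")
print(store.subjects("rdf:type", "ex:Person"))
```

In this sketch an inference query such as "all persons" reads a single precomputed vector, which is one way the stated goal of avoiding a query-time penalty could be met.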


Diversifying Query Suggestion Results

AAAI Conferences

To improve the user search experience, Query Suggestion, a technique for generating alternative queries for Web users, has become an indispensable feature of commercial search engines. However, previous work mainly focuses on suggesting queries relevant to the original query while ignoring the diversity of the suggestions, and may therefore fail to satisfy Web users' information needs. In this paper, we present a novel unified method to suggest both semantically relevant and diverse queries to Web users. The proposed approach is based on Markov random walk and hitting time analysis on the query-URL bipartite graph. It can effectively prevent semantically redundant queries from receiving a high rank, hence encouraging diversity in the results. We evaluate our method on a large commercial clickthrough dataset in terms of relevance measurement and diversity measurement. The experimental results show that our method is very effective in generating both relevant and diverse query suggestions.
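The diversity step can be sketched as follows. This is a simplified illustration, not the paper's exact algorithm: it assumes click counts as edge weights, a truncated hitting time computed by fixed-point iteration, and a candidate set that has already been filtered for relevance. All function names and the example data are hypothetical.

```python
def transition_probs(clicks):
    """Row-normalized random-walk transitions on the undirected query-URL graph."""
    weights = {}
    for (query, url), count in clicks.items():
        weights.setdefault(query, {})[url] = count
        weights.setdefault(url, {})[query] = count
    return {
        node: {nbr: w / sum(nbrs.values()) for nbr, w in nbrs.items()}
        for node, nbrs in weights.items()
    }


def truncated_hitting_time(probs, targets, steps=20):
    """Expected number of steps (capped at `steps`) to reach any target node."""
    h = {node: 0.0 for node in probs}
    for _ in range(steps):
        h = {
            node: 0.0 if node in targets
            else 1.0 + sum(p * h[nbr] for nbr, p in probs[node].items())
            for node in probs
        }
    return h


def diverse_suggestions(clicks, seed_query, k, steps=20):
    """Greedily add the query that is farthest (in hitting time) from those chosen."""
    probs = transition_probs(clicks)
    queries = {q for q, _ in clicks}
    chosen = [seed_query]
    while len(chosen) < k + 1 and len(chosen) < len(queries):
        h = truncated_hitting_time(probs, set(chosen), steps)
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(queries - set(chosen)), key=lambda q: h[q])
        chosen.append(best)
    return chosen[1:]


clicks = {
    ("jaguar", "cars.example"): 3,
    ("jaguar", "zoo.example"): 3,
    ("jaguar car", "cars.example"): 4,
    ("jaguar price", "cars.example"): 4,
    ("jaguar cat", "zoo.example"): 4,
}
print(diverse_suggestions(clicks, "jaguar", 2))
```

Once "jaguar car" is selected, "jaguar price" shares its only URL with an already chosen query, so its hitting time drops and "jaguar cat" is preferred, which is the redundancy-suppressing behavior the abstract describes.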


Optimal Social Trust Path Selection in Complex Social Networks

AAAI Conferences

Online social networks are becoming increasingly popular and are being used as the means for a variety of rich activities. This demands the evaluation of the trustworthiness between two unknown participants along a certain social trust path between them in the social network. However, there are usually many social trust paths between participants. Thus, a challenging problem is finding the optimal social trust path, the one that yields the most trustworthy evaluation result. In this paper, we first present a new complex social network structure and a new concept of Quality of Trust (QoT) to illustrate the ability to guarantee a certain level of trustworthiness in trust evaluation. We then model the optimal social trust path selection as a Multi-Constrained Optimal Path (MCOP) selection problem, which is NP-Complete. To solve this problem, we propose an efficient approximation algorithm MONTE K based on the Monte Carlo method. The results of our experiments conducted on a real dataset of social networks illustrate that our proposed algorithm significantly outperforms existing approaches in both efficiency and the quality of selected social trust paths.


Temporal Information Extraction

AAAI Conferences

Research on information extraction (IE) seeks to distill relational tuples from natural language text, such as the contents of the WWW. Most IE work has focussed on identifying static facts, encoding them as binary relations. This is unfortunate, because the vast majority of facts are fluents, only holding true during an interval of time. It is less helpful to extract PresidentOf(Bill-Clinton, USA) without the temporal scope 1/20/93 to 1/20/01. This paper presents TIE, a novel information extraction system, which distills facts from text while inducing as much temporal information as possible. In addition to recognizing temporal relations between times and events, TIE performs global inference, enforcing transitivity to bound the start and end times of each event. We introduce the notion of temporal entropy as a way to evaluate the performance of temporal IE systems and present experiments showing that TIE outperforms three alternative approaches.
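To illustrate how transitivity can bound an event's start and end times, here is a minimal sketch. It is not TIE's actual inference procedure: it assumes time points with "no later than" constraints and a few points anchored to known numeric times, then propagates bounds until nothing changes. The function name and point labels are hypothetical.

```python
import math


def bound_time_points(before, known):
    """Propagate bounds through ordering constraints.

    before: list of (a, b) pairs meaning point a occurs no later than point b.
    known:  dict mapping some points to a numeric time (for example a year).
    Returns a dict mapping each point to its (lower, upper) bound.
    """
    points = {p for pair in before for p in pair} | set(known)
    lower = {p: known.get(p, -math.inf) for p in points}
    upper = {p: known.get(p, math.inf) for p in points}
    changed = True
    while changed:
        changed = False
        for a, b in before:
            if lower[a] > lower[b]:  # b cannot precede anything a follows
                lower[b] = lower[a]
                changed = True
            if upper[b] < upper[a]:  # a cannot follow anything b precedes
                upper[a] = upper[b]
                changed = True
    return {p: (lower[p], upper[p]) for p in points}


bounds = bound_time_points(
    before=[
        ("term.start", "visit.start"),
        ("visit.start", "visit.end"),
        ("visit.end", "term.end"),
    ],
    known={"term.start": 1993, "term.end": 2001},
)
print(bounds["visit.start"], bounds["visit.end"])
```

Neither endpoint of the hypothetical "visit" event is dated in the input, yet both end up bounded by the interval of the enclosing term through chained constraints.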


Subjective Trust Inference in Composite Services

AAAI Conferences

In Service-Oriented Computing (SOC) environments, the trustworthiness of each service is critical for a service client when selecting one from a large pool of services. The trust value of a service is usually in the range of [0,1] and is evaluated from the ratings given by service clients, which represent the subjective belief of these service clients in their satisfaction with delivered services. A trust value can therefore be taken as the subjective probability with which one party believes that another party can perform an action in a certain situation. Hence, subjective probability theory should be adopted in trust evaluation. In addition, in SOC environments, a service usually invokes other services offered by different service providers, forming a composite service. Thus, the global trust of a composite service should be evaluated based on complex invocation structures. In this paper, firstly, based on Bayesian inference, we propose a novel method to evaluate the subjective trustworthiness of a service component from a series of ratings given by service clients. Secondly, we interpret the trust dependency caused by service invocations as conditional probability, which is evaluated based on the subjective trust values of service components. Furthermore, we propose a joint subjective probability method to evaluate the subjective global trust of a composite service on the basis of trust dependency. Finally, we present experimental results that illustrate the properties of our proposed subjective global trust inference method.
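As a rough illustration of the two ideas, the sketch below estimates a component's trust as the mean of a Beta posterior and combines two components in a sequential invocation with the chain rule. These are assumptions made for illustration: the abstract does not give the paper's estimator, and the rating threshold, the uniform prior, and both function names are hypothetical.

```python
def beta_trust(ratings, threshold=0.5, prior_a=1.0, prior_b=1.0):
    """Posterior mean of a Beta(prior_a, prior_b) prior after binarized ratings.

    Assumption: a rating in [0, 1] counts as a positive outcome when it is at
    least `threshold`; the paper may treat ratings differently.
    """
    positive = sum(1 for r in ratings if r >= threshold)
    negative = len(ratings) - positive
    return (positive + prior_a) / (positive + negative + prior_a + prior_b)


def sequential_trust(trust_first, trust_second_given_first):
    """Chain rule for a service that invokes another: P(A and B) = P(A) * P(B | A)."""
    return trust_first * trust_second_given_first


booking = beta_trust([0.9, 0.8, 0.4, 0.7])  # 3 positive, 1 negative
payment_given_booking = beta_trust([1.0, 0.9, 0.95])  # 3 positive, 0 negative
print(booking, sequential_trust(booking, payment_given_booking))
```

With no ratings at all, the uniform prior yields a neutral trust value of 0.5, and each additional rating moves the estimate toward the observed fraction of positive outcomes.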


Sentiment Analysis with Global Topics and Local Dependency

AAAI Conferences

With the development of Web 2.0, sentiment analysis has become a popular research problem. Recently, topic models have been introduced for the simultaneous analysis of topics and sentiment in a document. These studies, which jointly model topic and sentiment, take advantage of the relationship between topics and sentiment, and are shown to be superior to traditional sentiment analysis tools. However, most of them assume that, given the parameters, the sentiments of the words in the document are all independent. In our observation, in contrast, sentiments are expressed in a coherent way. Local conjunctive words, such as “and” or “but”, are often indicative of sentiment transitions. In this paper, we propose a major departure from the previous approaches by making two linked contributions. First, we assume that the sentiments are related to the topic in the document, and put forward a joint sentiment and topic model, i.e. Sentiment-LDA. Second, we observe that sentiments are dependent on local context. Thus, we further extend the Sentiment-LDA model to the Dependency-Sentiment-LDA model by relaxing the sentiment independence assumption in Sentiment-LDA. The sentiments of words are viewed as a Markov chain in Dependency-Sentiment-LDA. Through experiments, we show that exploiting the sentiment dependency is clearly advantageous, and that Dependency-Sentiment-LDA is an effective approach for sentiment analysis.


Learning to Predict Opinion Share in Social Networks

AAAI Conferences

The blogosphere and sites for social networking, knowledge-sharing and media-sharing on the World Wide Web have enabled the formation of various kinds of large social networks, through which behaviors, ideas and opinions can spread. Thus, substantial attention has been directed to investigating the spread of influence in these networks (Leskovec, Adamic, and Huberman 2007; Crandall et al.

There has been a variety of work on the voter model. Dynamical properties of the basic model, including how the degree distribution and the network size affect the mean time to reach consensus, have been extensively studied (Liggett 1999; Sood and Redner 2005) from a mathematical point of view. Several variants of the voter model are also investigated
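For readers unfamiliar with the basic voter model mentioned here, the following is a minimal simulation sketch: at each step a randomly chosen node adopts the opinion of a randomly chosen neighbor, and the run stops at consensus. The function name, the step cap, and the six-node complete graph are assumptions for illustration and do not reproduce the paper's learning method.

```python
import random


def voter_model(neighbors, opinions, max_steps=100_000, seed=0):
    """Simulate the basic voter model until consensus or until max_steps.

    neighbors: dict mapping each node to a list of its neighbors.
    opinions:  dict mapping each node to its initial opinion.
    Returns (final_state, number_of_steps_taken).
    """
    rng = random.Random(seed)
    state = dict(opinions)
    nodes = sorted(state)
    for step in range(max_steps):
        if len(set(state.values())) == 1:
            return state, step
        node = rng.choice(nodes)
        # The chosen node copies the current opinion of one random neighbor.
        state[node] = state[rng.choice(neighbors[node])]
    return state, max_steps


# Complete graph on six nodes with two opinions split evenly.
nodes = list(range(6))
neighbors = {n: [m for m in nodes if m != n] for n in nodes}
opinions = {n: "A" if n % 2 == 0 else "B" for n in nodes}
final_state, steps = voter_model(neighbors, opinions)
print(set(final_state.values()), steps)
```

The number of steps returned is a direct, if crude, way to observe the consensus time whose dependence on network size and degree distribution the cited work analyzes.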


Towards an Intelligent Code Search Engine

AAAI Conferences

Software developers increasingly rely on information from the Web, such as documents or code examples on Application Programming Interfaces (APIs), to facilitate their development processes. However, API documents often do not include enough information for developers to fully understand API usage, while searching for good code examples requires non-trivial effort. To address this problem, we propose a novel code search engine that combines the strengths of browsing documents and searching for code examples by returning documents embedded with high-quality code example summaries mined from the Web. Our evaluation results show that our approach provides code examples with high precision and boosts programmer productivity.


On the Reputation of Agent-Based Web Services

AAAI Conferences

Maintaining a sound reputation mechanism requires robust control and investigation. In this paper, we propose a game-theoretic analysis of a reputation mechanism that objectively maintains accurate reputation evaluation of selfish agent-based web services. In this framework, web services are ranked by their reputation, which results from feedback reflecting consumers' satisfaction with the offered services. However, selfish web services may alter their public reputation level by managing to get fake feedback. The game-theoretic analysis investigates the payoffs of different situations and elaborates on the factors that discourage web services from acting maliciously.


PR + RQ ≈ PQ: Transliteration Mining Using Bridge Language

AAAI Conferences

We address the problem of mining name transliterations from comparable corpora in languages P and Q in the following resource-poor scenario: parallel names in PQ are not available for training, but parallel names in PR and RQ are. We propose a novel solution for the problem by computing a common geometric feature space for P, Q and R where name transliterations are mapped to similar vectors. We employ Canonical Correlation Analysis (CCA) to compute the common geometric feature space using only parallel names in PR and RQ and without requiring parallel names in PQ. We test our algorithm on data sets in several languages and show that it gives results comparable to the state-of-the-art transliteration mining algorithms that use parallel names in PQ for training.