AITopics | Sun, Jiankai

Collaborating Authors

Sun, Jiankai


Vertical Federated Learning without Revealing Intersection Membership

Sun, Jiankai, Yang, Xin, Yao, Yuanshun, Zhang, Aonan, Gao, Weihao, Xie, Junyuan, Wang, Chong

arXiv.org Artificial Intelligence, Jun-10-2021

Vertical Federated Learning (vFL) allows multiple parties that own different attributes (e.g. features and labels) of the same data entity (e.g. a person) to jointly train a model. To prepare the training data, vFL needs to identify the common data entities shared by all parties. This is usually achieved by Private Set Intersection (PSI), which identifies the intersection of training samples from all parties by using personally identifiable information (e.g. email) as sample IDs to align data instances. As a result, PSI makes the sample IDs of the intersection visible to all parties, and therefore each party learns that the data entities in the intersection also appear in the other parties' data, i.e. intersection membership. However, in many real-world privacy-sensitive organizations, e.g. banks and hospitals, revealing membership of their data entities is prohibited. In this paper, we propose a vFL framework based on Private Set Union (PSU) that allows each party to keep sensitive membership information to itself. Instead of identifying the intersection of all training samples, our PSU protocol generates the union of samples as training instances. In addition, we propose strategies to generate synthetic features and labels to handle samples that belong to the union but not the intersection. Through extensive experiments on two real-world datasets, we show that our framework can protect the privacy of the intersection membership while maintaining the model utility.
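The union-based alignment described above can be illustrated with a minimal plaintext sketch. It is not the paper's protocol: a real PSU protocol relies on cryptography so that no party learns which IDs fall in the intersection, whereas this sketch only shows how training rows might be assembled once a union of IDs exists. The function name, the dictionary layout, and the constant synthetic generators are all assumptions made for illustration.

```python
def align_by_union(features_a, labels_b, synth_feature, synth_label):
    """Build one training row per sample ID in the union of both parties.

    features_a: dict mapping sample ID -> feature vector (hypothetical party A)
    labels_b:   dict mapping sample ID -> label (hypothetical party B)
    synth_feature, synth_label: callables returning stand-in values for IDs
    that a party does not actually hold.
    """
    union_ids = sorted(set(features_a) | set(labels_b))
    rows = []
    for sample_id in union_ids:
        x = features_a[sample_id] if sample_id in features_a else synth_feature()
        y = labels_b[sample_id] if sample_id in labels_b else synth_label()
        rows.append((sample_id, x, y))
    return rows


# Toy data: only "u2" is held by both parties.
party_a = {"u1": [0.2, 0.7], "u2": [0.9, 0.1]}
party_b = {"u2": 1, "u3": 0}
rows = align_by_union(party_a, party_b, lambda: [0.0, 0.0], lambda: 0)
for row in rows:
    print(row)
```

Every ID in the union yields a row, so the row set alone does not single out the intersection. The constant synthetic values used here are placeholders; the abstract mentions dedicated strategies for generating them but does not describe them.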

compute, health & medicine, neural network, (20 more...)

arXiv.org Artificial Intelligence

2106.05508

Country:

North America > United States (0.14)
Asia (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


Deep Retrieval: An End-to-End Learnable Structure Model for Large-Scale Recommendations

Gao, Weihao, Fan, Xiangjun, Sun, Jiankai, Jia, Kai, Xiao, Wenzhi, Wang, Chong, Liu, Xiaobing

arXiv.org Machine Learning, Jul-12-2020

One of the core problems in large-scale recommendations is to retrieve top relevant candidates accurately and efficiently, preferably in sub-linear time. Previous approaches are mostly based on a two-step procedure: first learn an inner-product model and then use maximum inner product search (MIPS) algorithms to search top candidates, leading to potential loss of retrieval accuracy. In this paper, we present Deep Retrieval (DR), an end-to-end learnable structure model for large-scale recommendations. DR encodes all candidates into a discrete latent space. The latent codes for the candidates are model parameters and are learnt together with the other neural network parameters to maximize the same objective function. With the model learnt, a beam search over the latent codes is performed to retrieve the top candidates. Empirically, we show that DR, with sub-linear computational complexity, can achieve almost the same accuracy as the brute-force baseline.
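The retrieval step can be sketched as a generic beam search over sequences of discrete codes. This is a minimal illustration, not the DR implementation: `step_log_probs` is a hypothetical stand-in for the learnt model, assumed to return log-probabilities over the next code given the codes chosen so far, and `depth` and `beam_width` are illustrative parameters.

```python
import math
from typing import Callable, List, Sequence, Tuple

Path = Tuple[int, ...]


def beam_search(step_log_probs: Callable[[Path], Sequence[float]],
                depth: int, beam_width: int) -> List[Tuple[Path, float]]:
    """Return up to `beam_width` code paths of length `depth`, best first."""
    beams: List[Tuple[Path, float]] = [((), 0.0)]
    for _ in range(depth):
        expanded = []
        for path, score in beams:
            # Extend every surviving path by every possible next code.
            for code, log_p in enumerate(step_log_probs(path)):
                expanded.append((path + (code,), score + log_p))
        # Keep only the highest-scoring partial paths.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]
    return beams


def toy_model(prefix: Path) -> List[float]:
    # Hypothetical model: the same distribution over 3 codes at every step.
    return [math.log(p) for p in (0.7, 0.2, 0.1)]


top_paths = beam_search(toy_model, depth=2, beam_width=2)
best_path, best_score = top_paths[0]
print(best_path, math.exp(best_score))
```

Each step scores only `beam_width` times the number of codes, rather than every full path, which is where the cost saving over exhaustive scoring comes from.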

algorithm, deep learning, neural network, (19 more...)

arXiv.org Machine Learning

2007.07203

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


ATP: Directed Graph Embedding with Asymmetric Transitivity Preservation

Sun, Jiankai, Bandyopadhyay, Bortik, Bashizade, Armin, Liang, Jiongqian, Sadayappan, P., Parthasarathy, Srinivasan

arXiv.org Artificial Intelligence, Nov-6-2018

Directed graphs have been widely used in Community Question Answering services (CQAs) to model asymmetric relationships among different types of nodes in CQA graphs, e.g., question, answer, user. Asymmetric transitivity is an essential property of directed graphs, since it can play an important role in downstream graph inference and analysis. Question difficulty and user expertise follow the characteristic of asymmetric transitivity. Maintaining such properties, while reducing the graph to a lower-dimensional vector embedding space, has been the focus of much recent research. In this paper, we tackle the challenge of directed graph embedding with asymmetric transitivity preservation and then leverage the proposed embedding method to solve a fundamental task in CQAs: how to appropriately route and assign newly posted questions to users with suitable expertise and interest. The technique incorporates graph hierarchy and reachability information naturally by relying on a non-linear transformation that operates on the core reachability and implicit hierarchy within such graphs. Subsequently, the methodology leverages a factorization-based approach to generate two embedding vectors for each node within the graph, to capture the asymmetric transitivity. Extensive experiments show that our framework consistently and significantly outperforms the state-of-the-art baselines on two diverse real-world tasks: link prediction, and question difficulty estimation and expert finding in online forums like Stack Exchange. In particular, our framework can support inductive embedding learning for newly posted questions (unseen nodes during training), and therefore can properly route and assign these kinds of questions to experts in CQAs.

artificial intelligence, graph, survey article, (19 more...)

arXiv.org Artificial Intelligence

1811.00839

Country: North America > United States (0.28)

Genre:

Workflow (0.93)
Research Report (0.82)

Technology:

Information Technology > Communications > Social Media (0.68)
Information Technology > Data Science > Data Mining (0.67)
Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)


Symmetrization for Embedding Directed Graphs

Sun, Jiankai, Parthasarathy, Srinivasan

arXiv.org Artificial Intelligence, Nov-6-2018

Recently, one has seen a surge of interest in developing such methods including ones for learning such representations for (undirected) graphs (while preserving important properties) (Liang et al. 2018). However, most of the work to date on embedding graphs has targeted undirected networks and very little has focused on the thorny issue of embedding directed networks. In this paper, we instead propose to solve the directed graph embedding problem via a two-stage approach: in the first stage, the graph is symmetrized in one of several possible ways, and in the second stage, the so-obtained symmetrized graph is embedded using any state-of-the-art (undirected) graph embedding algorithm. Note that it is not the objective of this paper to propose a new (undirected) graph embedding algorithm or discuss the strengths and weaknesses of existing ones; all we are saying is that whichever be the suitable graph embedding algorithm, it will fit in the above proposed symmetrization framework. Satuluri et al. proposed various ways (such as Bibliometric and Degree-discounted symmetrization) of symmetrizing a directed graph into an undirected graph, while information about directionality is incorporated via weights on the edges of the transformed graph (or applying a re-weighting scheme in case of already weighted graphs) (Satuluri and Parthasarathy 2011).
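As an illustration of the first stage, the sketch below applies Bibliometric symmetrization in the form it is commonly stated, U = A·Aᵀ + Aᵀ·A, where A is the directed adjacency matrix. The abstract names the method but does not give the formula, so treat this form and the helper names as assumptions; plain Python lists are used to keep the example dependency-free.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    y_t = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_t] for row in x]


def bibliometric_symmetrize(a):
    """Return A*A^T + A^T*A for a square adjacency matrix A."""
    a_t = transpose(a)
    coupling = matmul(a, a_t)    # counts shared out-neighbours
    cocitation = matmul(a_t, a)  # counts shared in-neighbours
    return [[c + d for c, d in zip(r1, r2)]
            for r1, r2 in zip(coupling, cocitation)]


# Toy directed graph with edges 0 -> 2 and 1 -> 2.
adjacency = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 0],
]
undirected = bibliometric_symmetrize(adjacency)
for row in undirected:
    print(row)
```

Nodes 0 and 1 have no edge between them in the directed graph, yet they receive weight 1 in the symmetrized graph because both point to node 2. The result can then be passed to any undirected embedding algorithm; diagonal entries are typically ignored.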

artificial intelligence, graph, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1811.12164

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)


ColdRoute: Effective Routing of Cold Questions in Stack Exchange Sites

Sun, Jiankai, Vishnu, Abhinav, Chakrabarti, Aniket, Siegel, Charles, Parthasarathy, Srinivasan

arXiv.org Artificial Intelligence, Jul-2-2018

Routing questions in Community Question Answer services (CQAs) such as Stack Exchange sites is a well-studied problem. Yet cold-start, a phenomenon observed when a new question is posted, is not well addressed by existing approaches. Additionally, cold questions posted by new askers present significant challenges to state-of-the-art approaches. We propose ColdRoute to address these challenges. ColdRoute is able to handle the task of routing cold questions posted by new or existing askers to matching experts. Specifically, we use Factorization Machines on the one-hot encoding of critical features such as question tags and compare our approach to well-studied techniques such as CQARank and semantic matching (LDA, BoW, and Doc2Vec). Using data from eight Stack Exchange sites, we are able to improve upon the routing metrics (Precision@1, Accuracy, MRR) over the state-of-the-art models such as semantic matching by 159.5%, 31.84%, and 40.36% for cold questions posted by existing askers, and 123.1%, 27.03%, and 34.81% for cold questions posted by new askers, respectively.
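To make the Factorization Machine step concrete, here is a minimal sketch of the standard second-order FM prediction on a one-hot feature vector. It is not ColdRoute's code: the feature layout (two tag slots followed by two answerer slots) and all parameter values are made up for illustration, and training is omitted.

```python
def fm_predict(x, w0, w, v):
    """Second-order Factorization Machine score for one feature vector.

    x:  feature values (here a one-hot style 0/1 vector)
    w0: global bias
    w:  one linear weight per feature
    v:  one latent vector (length k) per feature
    """
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    k = len(v[0])
    pairwise = 0.0
    for f in range(k):
        # Identity: sum over i<j of v_if*v_jf*x_i*x_j
        #         = 0.5 * ((sum_i v_if*x_i)^2 - sum_i (v_if*x_i)^2)
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


# Hypothetical layout: [tag=python, tag=java, answerer=alice, answerer=bob]
x = [1, 0, 1, 0]
w0 = 0.1
w = [0.2, 0.0, 0.3, 0.0]
v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0]]
score = fm_predict(x, w0, w, v)
print(round(score, 6))
```

Because the interaction weight between a tag and an answerer is the inner product of their latent vectors, the model can score tag and answerer pairs that never co-occurred in training, which is one reason this family of models suits cold-start routing.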

answerer, deep learning, neural network, (23 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s10618-018-0577-7

1807.00462

Country: North America > United States (0.28)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)


QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites

Sun, Jiankai (The Ohio State University) | Moosavi, Sobhan (The Ohio State University) | Ramnath, Rajiv (The Ohio State University) | Parthasarathy, Srinivasan (The Ohio State University)

AAAI Conferences, Jun-20-2018

In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with suitable expertise. This problem domain has been the subject of much research and includes both language-agnostic and language-conscious solutions. We bring to bear a key language-agnostic insight: that users gain expertise and therefore tend to ask as well as answer more difficult questions over time. We use this insight within the popular competition (directed) graph model to estimate question difficulty and user expertise by identifying key hierarchical structure within said model. An important and novel contribution here is the application of "social agony" to this problem domain. Difficulty levels of newly posted questions (the cold-start problem) are estimated by using our QDEE framework and additional textual features. We also propose a model to route newly posted questions to appropriate users based on the difficulty level of the question and the expertise of the user. Extensive experiments on real-world CQAs such as Yahoo! Answers and Stack Overflow data demonstrate the improved efficacy of our approach over contemporary state-of-the-art models.
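For readers unfamiliar with social agony, the sketch below evaluates the agony of a directed graph under a given ranking, using the definition from the social-hierarchy literature: an edge u -> v costs max(rank(u) - rank(v) + 1, 0), so edges that point up the hierarchy are free. How QDEE orients edges in its competition graph and how it searches for the minimising ranking are not stated in the abstract, so the edge direction and the toy ranks here are assumptions.

```python
def edge_agony(rank_u, rank_v):
    """Cost of an edge u -> v: zero when v is ranked strictly above u."""
    return max(rank_u - rank_v + 1, 0)


def total_agony(edges, rank):
    """Sum the agony of every edge under a given ranking (dict node -> int)."""
    return sum(edge_agony(rank[u], rank[v]) for u, v in edges)


# Toy graph: a -> b -> c follows the hierarchy, c -> a goes against it.
edges = [("a", "b"), ("b", "c"), ("c", "a")]
rank = {"a": 0, "b": 1, "c": 2}
agony = total_agony(edges, rank)
print(agony)
```

Only the backward edge c -> a contributes here. The agony of a graph is the minimum of this quantity over all rankings, and the minimising ranking is what exposes the hierarchy.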

qdee, question difficulty and expertise estimation

AAAI Conferences

Twelfth International AAAI Conference on Web and Social Media

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.60)


Semi-supervised Embedding in Attributed Networks with Outliers

Liang, Jiongqian, Jacobs, Peter, Sun, Jiankai, Parthasarathy, Srinivasan

arXiv.org Artificial Intelligence, Apr-26-2018

In this paper, we propose a novel framework, called Semi-supervised Embedding in Attributed Networks with Outliers (SEANO), to learn a low-dimensional vector representation that systematically captures the topological proximity, attribute affinity and label similarity of vertices in a partially labeled attributed network (PLAN). Our method is designed to work in both transductive and inductive settings while explicitly alleviating noise effects from outliers. Experimental results on various datasets drawn from the web, text and image domains demonstrate the advantages of SEANO over state-of-the-art methods in semi-supervised classification under transductive as well as inductive settings. We also show that a subset of parameters in SEANO is interpretable as an outlier score and can significantly outperform baseline methods when applied to detecting network outliers. Finally, we present the use of SEANO in a challenging real-world setting, flood mapping of satellite images, and show that it is able to outperform modern remote sensing algorithms for this task.

artificial intelligence, neural network, vertex, (19 more...)

arXiv.org Artificial Intelligence

1703.081

Country: North America > United States (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)


QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites

Sun, Jiankai, Moosavi, Sobhan, Ramnath, Rajiv, Parthasarathy, Srinivasan

arXiv.org Artificial Intelligence, Apr-20-2018

In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with suitable expertise. This problem domain has been the subject of much research and includes both language-agnostic and language-conscious solutions. We bring to bear a key language-agnostic insight: that users gain expertise and therefore tend to ask as well as answer more difficult questions over time. We use this insight within the popular competition (directed) graph model to estimate question difficulty and user expertise by identifying key hierarchical structure within said model. An important and novel contribution here is the application of "social agony" to this problem domain. Difficulty levels of newly posted questions (the cold-start problem) are estimated by using our QDEE framework and additional textual features. We also propose a model to route newly posted questions to appropriate users based on the difficulty level of the question and the expertise of the user. Extensive experiments on real-world CQAs such as Yahoo! Answers and Stack Overflow data demonstrate the improved efficacy of our approach over contemporary state-of-the-art models. The QDEE framework also allows us to characterize user expertise in novel ways by identifying interesting patterns and roles played by different users in such CQAs.

artificial intelligence, difficulty level, social media, (19 more...)

arXiv.org Artificial Intelligence

1804.00109

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.71)
