AITopics | Mukherjee, Subhabrata

Collaborating Authors

Mukherjee, Subhabrata

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Smart To-Do : Automatic Generation of To-Do Items from Emails

Mukherjee, Sudipto, Mukherjee, Subhabrata, Hasegawa, Marcello, Awadallah, Ahmed Hassan, White, Ryen

arXiv.org Artificial IntelligenceMay-4-2020

Intelligent features in email service applications aim to increase productivity by helping people organize their folders, compose their emails and respond to pending tasks. In this work, we explore a new application, Smart-To-Do, that helps users with task management over emails. We introduce a new task and dataset for automatically generating To-Do items from emails where the sender has promised to perform an action. We design a two-stage process leveraging recent advances in neural text generation and sequence-to-sequence learning, obtaining BLEU and ROUGE scores of 0:23 and 0:63 for this task. To the best of our knowledge, this is the first work to address the problem of composing To-Do items from emails.

deep learning, neural network, to-do item, (21 more...)

arXiv.org Artificial Intelligence

2005.06282

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

STANCY: Stance Classification Based on Consistency Cues

Popat, Kashyap, Mukherjee, Subhabrata, Yates, Andrew, Weikum, Gerhard

arXiv.org Artificial IntelligenceOct-14-2019

Controversial claims are abundant in online media and discussion forums. A better understanding of such claims requires analyzing them from different perspectives. Stance classification is a necessary step for inferring these perspectives in terms of supporting or opposing the claim. In this work, we present a neural network model for stance classification leveraging BERT representations and augmenting them with a novel consistency constraint. Experiments on the Perspectrum dataset, consisting of claims and users' perspectives from various debate websites, demonstrate the effectiveness of our approach over state-of-the-art baselines.

computational linguistics, deep learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

1910.06048

Country:

Europe (0.69)
Asia (0.68)
North America > United States > California (0.29)

Genre: Research Report (0.83)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback

Distilling Transformers into Simple Neural Networks with Unlabeled Transfer Data

Mukherjee, Subhabrata, Awadallah, Ahmed Hassan

arXiv.org Machine LearningOct-3-2019

Recent advances in pre-training huge models on large amounts of text through self supervision have obtained state-of-the-art results in various natural language processing tasks. However, these huge and expensive models are difficult to use in practise for downstream tasks. Some recent efforts use knowledge distillation to compress these models. However, we see a gap between the performance of the smaller student models as compared to that of the large teacher. In this work, we leverage large amounts of in-domain unlabeled transfer data in addition to a limited amount of labeled training instances to bridge this gap. We show that simple RNN based student models even with hard distillation can perform at par with the huge teachers given the transfer set. The student performance can be further improved with soft distillation and leveraging teacher intermediate representations. We show that our student models can compress the huge teacher by up to 26x while still matching or even marginally exceeding the teacher performance in low-resource settings with small amount of labeled data.

deep learning, distillation, educational technology, (23 more...)

arXiv.org Machine Learning

1910.01769

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (0.40)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

OpenKI: Integrating Open Information Extraction and Knowledge Bases with Relation Inference

Zhang, Dongxu, Mukherjee, Subhabrata, Lockard, Colin, Dong, Xin Luna, McCallum, Andrew

arXiv.org Machine LearningApr-12-2019

In this paper, we consider advancing web-scale knowledge extraction and alignment by integrating OpenIE extractions in the form of (subject, predicate, object) triples with Knowledge Bases (KB). Traditional techniques from universal schema and from schema mapping fall in two extremes: either they perform instance-level inference relying on embedding for (subject, object) pairs, thus cannot handle pairs absent in any existing triples; or they perform predicate-level mapping and completely ignore background evidence from individual entities, thus cannot achieve satisfying quality. We propose OpenKI to handle sparsity of OpenIE extractions by performing instance-level inference: for each entity, we encode the rich information in its neighborhood in both KB and OpenIE extractions, and leverage this information in relation inference by exploring different methods of aggregation and attention. In order to handle unseen entities, our model is designed without creating entity-specific parameters. Extensive experiments show that this method not only significantly improves state-of-the-art for conventional OpenIE extractions like ReVerb, but also boosts the performance on OpenIE from semi-structured data, where new entity pairs are abundant and data are fairly sparse.

bayesian inference, relation, survey article, (21 more...)

arXiv.org Machine Learning

1904.12606

Country:

North America > United States > New York (0.14)
North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry:

Media > Film (0.47)
Leisure & Entertainment (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.71)
(2 more...)

Add feedback

OpenTag: Open Attribute Value Extraction from Product Profiles

Zheng, Guineng, Mukherjee, Subhabrata, Dong, Xin Luna, Li, Feifei

arXiv.org Machine LearningJun-1-2018

Extraction of missing attribute values is to find values describing an attribute of interest from a free text input. Most past related work on extraction of missing attribute values work with a closed world assumption with the possible set of values known beforehand, or use dictionaries of values and hand-crafted features. How can we discover new attribute values that we have never seen before? Can we do this with limited human annotation or supervision? We study this problem in the context of product catalogs that often have missing values for many attributes of interest. In this work, we leverage product profile information such as titles and descriptions to discover missing values of product attributes. We develop a novel deep tagging model OpenTag for this extraction problem with the following contributions: (1) we formalize the problem as a sequence tagging task, and propose a joint model exploiting recurrent neural networks (specifically, bidirectional LSTM) to capture context and semantics, and Conditional Random Fields (CRF) to enforce tagging consistency, (2) we develop a novel attention mechanism to provide interpretable explanation for our model's decisions, (3) we propose a novel sampling strategy exploring active learning to reduce the burden of human annotation. OpenTag does not use any dictionary or hand-crafted features as in prior works. Extensive experiments in real-life datasets in different domains show that OpenTag with our active learning strategy discovers new attribute values from as few as 150 annotated samples (reduction in 3.3x amount of annotation effort) with a high F-score of 83%, outperforming state-of-the-art models.

deep learning, neural network, sequence, (19 more...)

arXiv.org Machine Learning

doi: 10.1145/3219819.3219839

1806.01264

Country: North America > United States (0.28)

Genre: Research Report > Promising Solution (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Item Recommendation with Continuous Experience Evolution of Users using Brownian Motion

Mukherjee, Subhabrata, Guennemann, Stephan, Weikum, Gerhard

arXiv.org Artificial IntelligenceAug-9-2017

Online review communities are dynamic as users join and leave, adopt new vocabulary, and adapt to evolving trends. Recent work has shown that recommender systems benefit from explicit consideration of user experience. However, prior work assumes a fixed number of discrete experience levels, whereas in reality users gain experience and mature continuously over time. This paper presents a new model that captures the continuous evolution of user experience, and the resulting language model in reviews and other posts. Our model is unsupervised and combines principles of Geometric Brownian Motion, Brownian Motion, and Latent Dirichlet Allocation to trace a smooth temporal progression of user experience and language model respectively. We develop practical algorithms for estimating the model parameters from data and for inference with our model (e.g., to recommend items). Extensive experiments with five real-world datasets show that our model not only fits data better than discrete-model baselines, but also outperforms state-of-the-art methods for predicting item ratings.

artificial intelligence, evolution, upstream oil & gas, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/2939672.2939780

1705.02669

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.84)

Industry:

Energy > Oil & Gas > Upstream (0.82)
Media > News (0.68)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Probabilistic Graphical Models for Credibility Analysis in Evolving Online Communities

Mukherjee, Subhabrata

arXiv.org Machine LearningJul-26-2017

One of the major hurdles preventing the full exploitation of information from online communities is the widespread concern regarding the quality and credibility of user-contributed content. Prior works in this domain operate on a static snapshot of the community, making strong assumptions about the structure of the data (e.g., relational tables), or consider only shallow features for text classification. To address the above limitations, we propose probabilistic graphical models that can leverage the joint interplay between multiple factors in online communities --- like user interactions, community dynamics, and textual content --- to automatically assess the credibility of user-contributed online content, and the expertise of users and their evolution with user-interpretable explanation. To this end, we devise new models based on Conditional Random Fields for different settings like incorporating partial expert knowledge for semi-supervised learning, and handling discrete labels as well as numeric ratings for fine-grained analysis. This enables applications such as extracting reliable side-effects of drugs from user-contributed posts in healthforums, and identifying credible content in news communities. Online communities are dynamic, as users join and leave, adapt to evolving trends, and mature over time. To capture this dynamics, we propose generative models based on Hidden Markov Model, Latent Dirichlet Allocation, and Brownian Motion to trace the continuous evolution of user expertise and their language model over time. This allows us to identify expert users and credible content jointly over time, improving state-of-the-art recommender systems by explicitly considering the maturity of users. This also enables applications such as identifying helpful product reviews, and detecting fake and anomalous reviews with limited information.

book review, cristian danescu-niculescu-mizil, us government, (32 more...)

arXiv.org Machine Learning

1707.08309

Country:

Asia (1.00)
Europe > Germany (0.92)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
(10 more...)

Genre:

Summary/Review (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
(2 more...)

Industry:

Media > News (1.00)
Media > Film (1.00)
Leisure & Entertainment (1.00)
(10 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(9 more...)

Add feedback

People on Media: Jointly Identifying Credible News and Trustworthy Citizen Journalists in Online Communities

Mukherjee, Subhabrata, Weikum, Gerhard

arXiv.org Machine LearningMay-9-2017

Media seems to have become more partisan, often providing a biased coverage of news catering to the interest of specific groups. It is therefore essential to identify credible information content that provides an objective narrative of an event. News communities such as digg, reddit, or newstrust offer recommendations, reviews, quality ratings, and further insights on journalistic works. However, there is a complex interaction between different factors in such online communities: fairness and style of reporting, language clarity and objectivity, topical perspectives (like political viewpoint), expertise and bias of community members, and more. This paper presents a model to systematically analyze the different interactions in a news community between users, news, and sources. We develop a probabilistic graphical model that leverages this joint interaction to identify 1) highly credible news articles, 2) trustworthy news sources, and 3) expert users who perform the role of "citizen journalists" in the community. Our method extends CRF models to incorporate real-valued ratings, as some communities have very fine-grained scales that cannot be easily discretized without losing information. To the best of our knowledge, this paper is the first full-fledged analysis of credibility, trust, and expertise in news communities.

news article, social media, us government, (21 more...)

arXiv.org Machine Learning

doi: 10.1145/2806416.2806537

1705.02667

Country:

Asia (0.67)
North America > United States (0.67)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)

Add feedback

Credible Review Detection with Limited Information using Consistency Analysis

Mukherjee, Subhabrata, Dutta, Sourav, Weikum, Gerhard

arXiv.org Machine LearningMay-7-2017

Online reviews provide viewpoints on the strengths and shortcomings of products/services, influencing potential customers' purchasing decisions. However, the proliferation of non-credible reviews -- either fake (promoting/ demoting an item), incompetent (involving irrelevant aspects), or biased -- entails the problem of identifying credible reviews. Prior works involve classifiers harnessing rich information about items/users -- which might not be readily available in several domains -- that provide only limited interpretability as to why a review is deemed non-credible. This paper presents a novel approach to address the above issues. We utilize latent topic models leveraging review texts, item ratings, and timestamps to derive consistency features without relying on item/user histories, unavailable for "long-tail" items/users. We develop models, for computing review credibility scores to provide interpretable evidence for non-credible reviews, that are also transferable to other domains -- addressing the scarcity of labeled data. Experiments on real-world datasets demonstrate improvements over state-of-the-art baselines.

amazon, artificial intelligence, natural language, (19 more...)

arXiv.org Machine Learning

1705.02668

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.88)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback

Exploring Latent Semantic Factors to Find Useful Product Reviews

Mukherjee, Subhabrata, Popat, Kashyap, Weikum, Gerhard

arXiv.org Machine LearningMay-6-2017

Online reviews provided by consumers are a valuable asset for e-Commerce platforms, influencing potential consumers in making purchasing decisions. However, these reviews are of varying quality, with the useful ones buried deep within a heap of non-informative reviews. In this work, we attempt to automatically identify review quality in terms of its helpfulness to the end consumers. In contrast to previous works in this domain exploiting a variety of syntactic and community-level features, we delve deep into the semantics of reviews as to what makes them useful, providing interpretable explanation for the same. We identify a set of consistency and semantic factors, all from the text, ratings, and timestamps of user-generated reviews, making our approach generalizable across all communities and domains. We explore review semantics in terms of several latent factors like the expertise of its author, his judgment about the fine-grained facets of the underlying product, and his writing style. These are cast into a Hidden Markov Model -- Latent Dirichlet Allocation (HMM-LDA) based model to jointly infer: (i) reviewer expertise, (ii) item facets, and (iii) review helpfulness. Large-scale experiments on five real-world datasets from Amazon show significant improvement over state-of-the-art baselines in predicting and ranking useful reviews.

artificial intelligence, helpfulness, information technology services, (20 more...)

arXiv.org Machine Learning

1705.02518

Country:

North America (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.64)

Industry:

Media > Film (0.46)
Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

Add feedback