AITopics | Canim, Mustafa

Collaborating Authors

Canim, Mustafa

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Topic Transferable Table Question Answering

Chemmengath, Saneem Ahmed, Kumar, Vishwajeet, Bharadwaj, Samarth, Sen, Jaydeep, Canim, Mustafa, Chakrabarti, Soumen, Gliozzo, Alfio, Sankaranarayanan, Karthik

arXiv.org Artificial IntelligenceSep-15-2021

Weakly-supervised table question-answering(TableQA) models have achieved state-of-art performance by using pre-trained BERT transformer to jointly encoding a question and a table to produce structured query for the question. However, in practical settings TableQA systems are deployed over table corpora having topic and word distributions quite distinct from BERT's pretraining corpus. In this work we simulate the practical topic shift scenario by designing novel challenge benchmarks WikiSQL-TS and WikiTQ-TS, consisting of train-dev-test splits in five distinct topic groups, based on the popular WikiSQL and WikiTableQuestions datasets. We empirically show that, despite pre-training on large open-domain text, performance of models degrades significantly when they are evaluated on unseen topics. In response, we propose T3QA (Topic Transferable Table Question Answering) a pragmatic adaptation framework for TableQA comprising of: (1) topic-specific vocabulary injection into BERT, (2) a novel text-to-text transformer generator (such as T5, GPT2) based natural language question generation pipeline focused on generating topic specific training data, and (3) a logical form reranker. We show that T3QA provides a reasonably good baseline for our topic shift benchmarks. We believe our topic split benchmarks will lead to robust TableQA solutions that are better suited for practical deployment.

artificial intelligence, natural language, question answering

arXiv.org Artificial Intelligence

2109.07377

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback

Capturing Row and Column Semantics in Transformer Based Question Answering over Tables

Glass, Michael, Canim, Mustafa, Gliozzo, Alfio, Chemmengath, Saneem, Kumar, Vishwajeet, Chakravarti, Rishav, Sil, Avi, Pan, Feifei, Bharadwaj, Samarth, Fauceglia, Nicolas Rodolfo

arXiv.org Artificial IntelligenceApr-26-2021

Transformer based architectures are recently used for the task of answering questions over tables. In order to improve the accuracy on this task, specialized pre-training techniques have been developed and applied on millions of open-domain web tables. In this paper, we propose two novel approaches demonstrating that one can achieve superior performance on table QA task without even using any of these specialized pre-training techniques. The first model, called RCI interaction, leverages a transformer based architecture that independently classifies rows and columns to identify relevant cells. While this model yields extremely high accuracy at finding cell values on recent benchmarks, a second model we propose, called RCI representation, provides a significant efficiency advantage for online QA systems over tables by materializing embeddings for existing tables. Experiments on recent benchmarks prove that the proposed methods can effectively locate cell values on tables (up to ~98% Hit@1 accuracy on WikiSQL lookup questions). Also, the interaction model outperforms the state-of-the-art transformer based approaches, pre-trained on very large table corpora (TAPAS and TaBERT), achieving ~3.4% and ~18.86% additional precision improvement on the standard WikiSQL benchmark.

artificial intelligence, natural language, representation, (20 more...)

arXiv.org Artificial Intelligence

2104.08303

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
North America > United States > North Carolina > Orange County > Chapel Hill (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre:

Overview (0.66)
Research Report > Promising Solution (0.34)

Industry: Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Uncheatable Machine Learning Inference

Canim, Mustafa, Kundu, Ashish, Payne, Josh

arXiv.org Machine LearningAug-8-2019

Classification-as-a-Service (CaaS) is widely deployed today in machine intelligence stacks for a vastly diverse set of applications including anything from medical prognosis to computer vision tasks to natural language processing to identity fraud detection. The computing power required for training complex models on large datasets to perform inference to solve these problems can be very resource-intensive. A CaaS provider may cheat a customer by fraudulently bypassing expensive training procedures in favor of weaker, less computationally-intensive algorithms which yield results of reduced quality. Given a classification service supplier $S$, intermediary CaaS provider $P$ claiming to use $S$ as a classification backend, and customer $C$, our work addresses the following questions: (i) how can $P$'s claim to be using $S$ be verified by $C$? (ii) how might $S$ make performance guarantees that may be verified by $C$? and (iii) how might one design a decentralized system that incentivizes service proofing and accountability? To this end, we propose a variety of methods for $C$ to evaluate the service claims made by $P$ using probabilistic performance metrics, instance seeding, and steganography. We also propose a method of measuring the robustness of a model using a blackbox adversarial procedure, which may then be used as a benchmark or comparison to a claim made by $S$. Finally, we propose the design of a smart contract-based decentralized system that incentivizes service accountability to serve as a trusted Quality of Service (QoS) auditor.

deep learning, law enforcement, scenario, (22 more...)

arXiv.org Machine Learning

1908.0327

Country: North America > United States (0.14)

Genre: Research Report (0.41)

Industry:

Law Enforcement & Public Safety > Fraud (0.54)
Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback