C, Rajmohan
The Mighty ToRR: A Benchmark for Table Reasoning and Robustness
Ashury-Tahan, Shir, Mai, Yifan, C, Rajmohan, Gera, Ariel, Perlitz, Yotam, Yehudai, Asaf, Bandel, Elron, Choshen, Leshem, Shnarch, Eyal, Liang, Percy, Shmueli-Scheuer, Michal
Despite its real-world significance, model performance on tabular data remains underexplored, leaving uncertainty about which model to rely on and which prompt configuration to adopt. To address this gap, we create ToRR, a benchmark for Table Reasoning and Robustness, measuring model performance and robustness on table-related tasks. The benchmark includes 10 datasets that cover different types of table reasoning capabilities across varied domains. ToRR goes beyond model performance rankings, and is designed to reflect whether models can handle tabular data consistently and robustly, across a variety of common table representation formats. We present a leaderboard as well as comprehensive analyses of the results of leading models over ToRR. Our results reveal a striking pattern of brittle model behavior, where even strong models are unable to perform robustly on tabular data tasks. Although no specific table format leads to consistently better performance, we show that testing over multiple formats is crucial for reliably estimating model capabilities. Moreover, we show that the reliability boost from testing multiple prompts can be equivalent to adding more test examples. Overall, our findings show that table understanding and reasoning tasks remain a significant challenge.
Automated Question Generation on Tabular Data for Conversational Data Exploration
Chaudhuri, Ritwik, C, Rajmohan, DB, Kirushikesh, Agarwal, Arvind
Exploratory data analysis (EDA) is an essential step for analyzing a dataset to derive insights. Several EDA techniques have been explored in the literature. Many of them leverage visualizations through various plots. But it is not easy to interpret them for a non-technical user, and producing appropriate visualizations is also tough when there are a large number of columns. Few other works provide a view of some interesting slices of data but it is still difficult for the user to draw relevant insights from them. Of late, conversational data exploration is gaining a lot of traction among non-technical users. It helps the user to explore the dataset without having deep technical knowledge about the data. Towards this, we propose a system that recommends interesting questions in natural language based on relevant slices of a dataset in a conversational setting. Specifically, given a dataset, we pick a select set of interesting columns and identify interesting slices of such columns and column combinations based on few interestingness measures. We use our own fine-tuned variation of a pre-trained language model(T5) to generate natural language questions in a specific manner. We then slot-fill values in the generated questions and rank them for recommendations. We show the utility of our proposed system in a coversational setting with a collection of real datasets.
Data Readiness Report
Afzal, Shazia, C, Rajmohan, Kesarwani, Manish, Mehta, Sameep, Patel, Hima
Data exploration and quality analysis is an important yet tedious process in the AI pipeline. Current practices of data cleaning and data readiness assessment for machine learning tasks are mostly conducted in an arbitrary manner which limits their reuse and results in loss of productivity. We introduce the concept of a Data Readiness Report as an accompanying documentation to a dataset that allows data consumers to get detailed insights into the quality of input data. Data characteristics and challenges on various quality dimensions are identified and documented keeping in mind the principles of transparency and explainability. The Data Readiness Report also serves as a record of all data assessment operations including applied transformations. This provides a detailed lineage for the purpose of data governance and management. In effect, the report captures and documents the actions taken by various personas in a data readiness and assessment workflow. Overtime this becomes a repository of best practices and can potentially drive a recommendation system for building automated data readiness workflows on the lines of AutoML [8]. We anticipate that together with the Datasheets [9], Dataset Nutrition Label [11], FactSheets [1] and Model Cards [15], the Data Readiness Report makes significant progress towards Data and AI lifecycle documentation.