Automated Question Generation on Tabular Data for Conversational Data Exploration

Chaudhuri, Ritwik, C, Rajmohan, DB, Kirushikesh, Agarwal, Arvind

Jul-10-2024–arXiv.org Artificial Intelligence

Exploratory data analysis (EDA) is an essential step in analyzing a dataset to derive insights. Several EDA techniques have been explored in the literature, many of which rely on visualizations through various plots. However, such plots are not easy for a non-technical user to interpret, and producing appropriate visualizations is difficult when a dataset has a large number of columns. A few other works provide a view of some interesting slices of the data, but it is still difficult for the user to draw relevant insights from them. Of late, conversational data exploration has been gaining traction among non-technical users, as it helps the user explore a dataset without deep technical knowledge about the data. Towards this, we propose a system that recommends interesting natural language questions based on relevant slices of a dataset in a conversational setting. Specifically, given a dataset, we pick a select set of interesting columns and identify interesting slices of those columns and column combinations based on a few interestingness measures. We use our own fine-tuned variant of a pre-trained language model (T5) to generate natural language questions in a specific manner. We then slot-fill values in the generated questions and rank them for recommendation. We show the utility of our proposed system in a conversational setting with a collection of real datasets.
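The slice, slot-fill, and rank steps described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy rows, the column names, the z-score-like interestingness measure, and the fixed question template (which stands in for the fine-tuned T5 generator) are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev

# Toy table; the rows and column names are made up for illustration.
ROWS = [
    {"department": "Sales", "salary": 50},
    {"department": "Sales", "salary": 54},
    {"department": "Engineering", "salary": 90},
    {"department": "Engineering", "salary": 94},
    {"department": "Support", "salary": 60},
    {"department": "Support", "salary": 62},
]

# Hypothetical question template with slots. The paper generates questions
# with a fine-tuned T5 model; a fixed string stands in for that step here.
TEMPLATE = "What is the average {measure} where {column} is {value}?"


def slice_scores(rows, column, measure):
    """Score each slice (column == value) by how far its mean deviates from
    the overall mean, in units of the overall standard deviation.
    This measure is an assumption, not the paper's interestingness measure."""
    values = [r[measure] for r in rows]
    overall, spread = mean(values), pstdev(values)
    groups = {}
    for r in rows:
        groups.setdefault(r[column], []).append(r[measure])
    return {
        key: abs(mean(vals) - overall) / spread if spread else 0.0
        for key, vals in groups.items()
    }


def recommend_questions(rows, column, measure, top_k=2):
    """Slot-fill the template for each slice and keep the top_k by score."""
    scores = slice_scores(rows, column, measure)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [
        TEMPLATE.format(measure=measure, column=column, value=value)
        for value, _ in ranked[:top_k]
    ]


questions = recommend_questions(ROWS, "department", "salary")
for q in questions:
    print(q)
```

In this sketch the slice whose mean deviates most from the overall mean is recommended first; a real system would combine several interestingness measures and score column combinations as well.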



Country:
- Asia > India (0.05)
- North America > United States
  - New York (0.05)
  - North Carolina > Orange County
    - Chapel Hill (0.04)
  - Massachusetts > Hampshire County
    - Amherst (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology
  - Data Science (1.00)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Natural Language > Question Answering (0.67)
