Human-Machine Collaboration for Democratizing Data Science

Gautrais, Clément, Dauxais, Yann, Teso, Stefano, Kolb, Samuel, Verbruggen, Gust, De Raedt, Luc

arXiv.org Artificial Intelligence 

Data science is a cornerstone of current business practices. A major obstacle to its adoption is that most data analysis techniques are beyond the reach of typical end-users. Spreadsheets are a prime example of this phenomenon: although they are central to all sorts of data processing pipelines, the functionality necessary for processing and analyzing them is hidden behind the high wall of spreadsheet formulas, which most end-users can neither write nor understand [Chambers and Scaffidi, 2010]. As a result, spreadsheets are often manipulated and analyzed manually. This increases the chance of making mistakes and prevents scaling beyond small data sets. Lowering the barrier to entry for specifying and solving data science tasks would help ameliorate these issues. Making data science tools more accessible would lower the cost of designing data processing pipelines and making data-driven decisions. At the same time, accessible data science tools can prevent non-experts from relying on fragile heuristics and improvised solutions. The question we ask is then: is it possible to enable non-technical end-users to specify and solve data science tasks that match their needs?
