Ramos D., Pereira J., Lynce I., Manquinho V., Martins R.

ASE 2020


Charts are commonly used for data visualization. Generating a chart usually involves performing data transformations, including data pre-processing and aggregation. These tasks can be cumbersome and time-consuming, even for experienced data scientists. Reproducing existing charts can also be a challenging task when information about data transformations is no longer available.

In this paper, we tackle the problem of recovering data transformations from existing charts. Given an input table and a chart, our goal is to automatically recover the data transformation program underlying the chart. We divide our approach into four steps: (1) data extraction, (2) candidate generation, (3) candidate ranking, and (4) candidate disambiguation. We implemented our approach in a tool called UnchartIt and evaluated it on a set of 50 benchmarks from Kaggle. Experimental results show that UnchartIt ranks the correct data transformation program in the top 10 in 92% of the instances. To disambiguate among those programs, we use our new interactive disambiguation procedure, which returns the correct program in 98% of the ambiguous instances while asking the user fewer than 2 questions on average.
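
To make the idea of interactive disambiguation concrete, the following is a minimal Python sketch. It is not UnchartIt's actual procedure, whose details are not given here. It assumes that each candidate program is a plain Python function over a table (a list of dicts), that a fixed pool of small probe tables is available, and that the user can be modeled as an `oracle` callback that picks the correct output for a probe table. The names `disambiguate`, `probe_inputs`, `oracle`, and the three example programs are all hypothetical.

```python
from collections import defaultdict


def disambiguate(candidates, probe_inputs, oracle):
    """Narrow down candidate programs by asking about distinguishing inputs.

    Assumption: oracle(table, options) returns the element of `options`
    (a repr of an output) that the intended program would produce.
    """
    remaining = list(candidates)
    questions = 0
    while len(remaining) > 1:
        best_groups, best_input = None, None
        for table in probe_inputs:
            # Group the remaining candidates by the output they produce.
            groups = defaultdict(list)
            for prog in remaining:
                groups[repr(prog(table))].append(prog)
            # Prefer the probe that splits the candidates into most groups.
            if len(groups) > 1 and (
                best_groups is None or len(groups) > len(best_groups)
            ):
                best_groups, best_input = groups, table
        if best_groups is None:
            break  # no probe distinguishes the remaining candidates
        answer = oracle(best_input, sorted(best_groups))
        questions += 1
        remaining = best_groups[answer]
    return remaining, questions


# Three hypothetical candidate transformations that could sit behind a bar chart.
def total_by_city(rows):
    out = {}
    for r in rows:
        out[r["city"]] = out.get(r["city"], 0) + r["sales"]
    return sorted(out.items())


def max_by_city(rows):
    out = {}
    for r in rows:
        out[r["city"]] = max(out.get(r["city"], r["sales"]), r["sales"])
    return sorted(out.items())


def count_by_city(rows):
    out = {}
    for r in rows:
        out[r["city"]] = out.get(r["city"], 0) + 1
    return sorted(out.items())


probes = [
    [{"city": "A", "sales": 1}],  # all three candidates agree on this table
    [{"city": "A", "sales": 2}, {"city": "A", "sales": 3}],  # all three differ
]


# Simulated user whose intended program is total_by_city.
def simulated_user(table, options):
    return repr(total_by_city(table))


remaining, questions = disambiguate(
    [total_by_city, max_by_city, count_by_city], probes, simulated_user
)
```

In this toy run the second probe table separates all three candidates at once, so a single question is enough to isolate `total_by_city`. Choosing the probe that produces the most distinct outputs is one simple greedy heuristic for keeping the number of questions small; other selection criteria are possible.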