Rao, Huaming
DataLab: A Unified Platform for LLM-Powered Business Intelligence
Weng, Luoxuan, Tang, Yinghao, Feng, Yingchaojie, Chang, Zhuo, Chen, Peng, Chen, Ruiqin, Feng, Haozhe, Hou, Chen, Huang, Danqing, Li, Yang, Rao, Huaming, Wang, Haonan, Wei, Canshi, Yang, Xiaofeng, Zhang, Yuhui, Zheng, Yifeng, Huang, Xiuqi, Zhu, Minfeng, Ma, Yuxin, Cui, Bin, Chen, Wei
Business intelligence (BI) transforms large volumes of data within modern organizations into actionable insights for informed decision-making. Recently, large language model (LLM)-based agents have streamlined the BI workflow by automatically performing task planning, reasoning, and actions in executable environments based on natural language (NL) queries. However, existing approaches primarily focus on individual BI tasks such as NL2SQL and NL2VIS. Because BI is iterative and collaborative, the fragmentation of tasks across different data roles and tools leads to inefficiencies and potential errors. In this paper, we introduce DataLab, a unified BI platform that integrates a one-stop LLM-based agent framework with an augmented computational notebook interface. DataLab supports a wide range of BI tasks for different data roles by seamlessly combining LLM assistance with user customization within a single environment. To achieve this unification, we design a domain knowledge incorporation module tailored for enterprise-specific BI tasks, an inter-agent communication mechanism to facilitate information sharing across the BI workflow, and a cell-based context management strategy to enhance context utilization efficiency in BI notebooks. Extensive experiments demonstrate that DataLab achieves state-of-the-art performance on various BI tasks across popular research benchmarks. Moreover, DataLab maintains high effectiveness and efficiency on real-world datasets from Tencent, achieving up to a 58.58% increase in accuracy and a 61.65% reduction in token cost on enterprise-specific BI tasks.
What Will Others Choose? How a Majority Vote Reward Scheme Can Improve Human Computation in a Spatial Location Identification Task
Rao, Huaming (Nanjing University of Science and Technology) | Huang, Shih-Wen (University of Illinois at Urbana-Champaign) | Fu, Wai-Tat (University of Illinois at Urbana-Champaign)
We created a spatial location identification task (SpLIT) in which workers recruited from Amazon Mechanical Turk were presented with a camera view of a location and were asked to identify the location on a two-dimensional map. In cases where the cues in the camera view were ambiguous or did not provide enough information to pinpoint the exact location, workers had to make a best guess. We tested the effects of two reward schemes. In the “ground truth” scheme, workers were rewarded if their answers were close enough to the correct locations. In the “majority vote” scheme, workers were told that they would be rewarded if their answers were similar to the majority of other workers. Results showed that the majority vote reward scheme led to consistently more accurate answers. Cluster analysis further showed that the majority vote reward scheme led to answers with higher reliability (a higher percentage of answers in the correct clusters) and precision (a smaller average distance to the cluster centers). Possible reasons why the majority vote reward scheme performed better are discussed.
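
The two reward schemes and the two cluster-based measures described in this abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names, the `radius` threshold used for "close enough" and "similar", the strict-majority rule, and the premise that cluster labels come from some separate clustering step are all hypothetical.

```python
import math
from statistics import mean


def distance(a, b):
    """Euclidean distance between two (x, y) map points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def ground_truth_reward(answer, truth, radius):
    """Hypothetical 'ground truth' rule: reward if the answer lies
    within `radius` of the correct location."""
    return distance(answer, truth) <= radius


def majority_vote_reward(answer, other_answers, radius):
    """Hypothetical 'majority vote' rule: reward if more than half of
    the other workers' answers lie within `radius` of this answer."""
    near = sum(1 for o in other_answers if distance(answer, o) <= radius)
    return near > len(other_answers) / 2


def cluster_metrics(answers, labels, correct_label):
    """Return (reliability, precision) for pre-clustered answers.

    reliability: fraction of answers assigned to the correct cluster.
    precision:   mean distance from each answer to its own cluster
                 centroid (smaller means tighter clusters).
    `labels` is assumed to come from an external clustering step.
    """
    reliability = sum(1 for l in labels if l == correct_label) / len(labels)
    centers = {}
    for label in set(labels):
        members = [a for a, l in zip(answers, labels) if l == label]
        centers[label] = (mean(p[0] for p in members),
                          mean(p[1] for p in members))
    precision = mean(distance(a, centers[l]) for a, l in zip(answers, labels))
    return reliability, precision
```

For example, `cluster_metrics([(0, 0), (0, 2), (10, 10)], [0, 0, 1], correct_label=0)` treats the first two answers as the correct cluster, so two of three answers count toward reliability, and the distances to the cluster centroids are averaged for precision.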