spreadsheet
SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models
Computer end users have spent billions of hours completing daily tasks like tabular data processing and project timeline scheduling. Most of these tasks are repetitive and error-prone, yet most end users lack the skills to automate this burdensome work. With the advent of large language models (LLMs), directing software with natural language user requests becomes a reachable goal. In this work, we propose SheetCopilot, an agent that takes a natural language task and controls a spreadsheet to fulfill the requirements. We propose a set of atomic actions as an abstraction of spreadsheet software functionalities.
SQuARE: Structured Query & Adaptive Retrieval Engine For Tabular Formats
Gondhalekar, Chinmay, Patel, Urjitkumar, Yeh, Fang-Chun
Accurate question answering over real spreadsheets remains difficult due to multirow headers, merged cells, and unit annotations that disrupt naive chunking, while rigid SQL views fail on files lacking consistent schemas. SQuARE computes a continuous score based on header depth and merge density, then routes queries either through structure-preserving chunk retrieval or SQL over an automatically constructed relational representation. A lightweight agent supervises retrieval, refinement, or combination of results across both paths when confidence is low. This design maintains header hierarchies, time labels, and units, ensuring that returned values are faithful to the original cells and straightforward to verify. Evaluated on multi-header corporate balance sheets, a heavily merged World Bank workbook, and diverse public datasets, SQuARE consistently surpasses single-strategy baselines and ChatGPT-4o on both retrieval precision and end-to-end answer accuracy while keeping latency predictable. By decoupling retrieval from model choice, the system is compatible with emerging tabular foundation models and offers a practical bridge toward more robust table understanding. I. Introduction Spreadsheets constitute the predominant medium for quantitative analysis across numerous disciplines, particularly in the field of finance.
- North America > United States (0.05)
- Asia > Macao (0.04)
- Asia > China (0.04)
- Information Technology > Software (1.00)
- Information Technology > Information Management (1.00)
- Information Technology > Data Science (1.00)
Oh That Looks Familiar: A Novel Similarity Measure for Spreadsheet Template Discovery
Krishnakumar, Anand, Ravikumaran, Vengadesh
Traditional methods for identifying structurally similar spreadsheets fail to capture the spatial layouts and type patterns defining templates. To quantify spreadsheet similarity, we introduce a hybrid distance metric that combines semantic embeddings, data type information, and spatial positioning. In order to calculate spreadsheet similarity, our method converts spreadsheets into cell-level embeddings and then uses aggregation techniques like Chamfer and Hausdorff distances. Experiments across template families demonstrate superior unsupervised clustering performance compared to the graph-based Mondrian baseline, achieving perfect template reconstruction (Adjusted Rand Index of 1.00 versus 0.90) on the FUSTE dataset. Our approach facilitates large-scale automated template discovery, which in turn enables downstream applications such as retrieval-augmented generation over tabular collections, model training, and bulk data cleaning.
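As a rough illustration of the aggregation step mentioned in the abstract above, the sketch below computes Chamfer and Hausdorff distances between two sets of cell vectors. It is not the authors' implementation: the 2-D tuples stand in for cell-level embeddings, the function names are hypothetical, and the Chamfer variant shown (sum of the two directed mean nearest-neighbour distances) is only one of several common definitions.

```python
import math

def nearest_distances(source, target):
    """For each vector in source, the Euclidean distance to its nearest vector in target."""
    return [min(math.dist(p, q) for q in target) for p in source]

def chamfer_distance(a, b):
    """Sum of the two directed mean nearest-neighbour distances (one common variant)."""
    ab = nearest_distances(a, b)
    ba = nearest_distances(b, a)
    return sum(ab) / len(ab) + sum(ba) / len(ba)

def hausdorff_distance(a, b):
    """Largest nearest-neighbour distance in either direction."""
    return max(max(nearest_distances(a, b)), max(nearest_distances(b, a)))

# Toy stand-ins for cell embeddings (assumption: real embeddings would be
# higher-dimensional and mix semantic, type, and position features).
sheet_a = [(0.0, 0.0), (1.0, 0.0)]
sheet_b = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]

print(chamfer_distance(sheet_a, sheet_b))    # average mismatch
print(hausdorff_distance(sheet_a, sheet_b))  # worst-case mismatch
```

Chamfer averages over all cells, so one stray cell shifts it only slightly, while Hausdorff is driven entirely by the single worst-matched cell.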
SODBench: A Large Language Model Approach to Documenting Spreadsheet Operations
Numerous knowledge workers utilize spreadsheets in business, accounting, and finance. However, a lack of systematic documentation methods for spreadsheets hinders automation, collaboration, and knowledge transfer, which risks the loss of crucial institutional knowledge. This paper introduces Spreadsheet Operations Documentation (SOD), an AI task that involves generating human-readable explanations from spreadsheet operations. Many previous studies have utilized Large Language Models (LLMs) for generating spreadsheet manipulation code; however, translating that code into natural language for SOD is a less-explored area. To address this, we present a benchmark of 111 spreadsheet manipulation code snippets, each paired with a corresponding natural language summary. We evaluate five LLMs, GPT-4o, GPT-4o-mini, LLaMA-3.3-70B, Mixtral-8x7B, and Gemma2-9B, using BLEU, GLEU, ROUGE-L, and METEOR metrics. Our findings suggest that LLMs can generate accurate spreadsheet documentation, making SOD a feasible prerequisite step toward enhancing reproducibility, maintainability, and collaborative workflows in spreadsheets, although there are challenges that need to be addressed.
- North America > United States > Hawaii (0.04)
- North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology (1.00)
- Education > Educational Setting (0.46)
- Information Technology > Software (1.00)
- Information Technology > Information Management (1.00)
- Information Technology > Data Science (1.00)
MMORE: Massive Multimodal Open RAG & Extraction
Sallinen, Alexandre, Krsteski, Stefan, Teiletche, Paul, Allard, Marc-Antoine, Lecoeur, Baptiste, Zhang, Michael, Nemo, Fabrice, Kalajdzic, David, Meyer, Matthias, Hartley, Mary-Anne
We introduce MMORE, an open-source pipeline for Massive Multimodal Open Retrieval-Augmented Generation and Extraction, designed to ingest, transform, and retrieve knowledge from heterogeneous document formats at scale. MMORE supports more than fifteen file types, including text, tables, images, emails, audio, and video, and processes them into a unified format to enable downstream applications for LLMs. The architecture offers modular, distributed processing, enabling scalable parallelization across CPUs and GPUs. On processing benchmarks, MMORE demonstrates a 3.8-fold speedup over single-node baselines and 40% higher accuracy than Docling on scanned PDFs. The pipeline integrates hybrid dense-sparse retrieval and supports both interactive APIs and batch RAG endpoints. Evaluated on PubMedQA, MMORE-augmented medical LLMs improve biomedical QA accuracy with increasing retrieval depth. MMORE provides a robust, extensible foundation for deploying task-agnostic RAG systems on diverse, real-world multimodal data. The codebase is available at https://github.com/swiss-ai/mmore.
- Europe > Switzerland (0.04)
- North America > United States (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
SheetDesigner: MLLM-Powered Spreadsheet Layout Generation with Rule-Based and Vision-Based Reflection
Chen, Qin, Ren, Yuanyi, Ma, Xiaojun, Liu, Mugeng, Shi, Han, Zhang, Dongmei
Spreadsheets are critical to data-centric tasks, with rich, structured layouts that enable efficient information transmission. Given the time and expertise required for manual spreadsheet layout design, there is an urgent need for automated solutions. However, existing automated layout models are ill-suited to spreadsheets, as they often (1) treat components as axis-aligned rectangles with continuous coordinates, overlooking the inherently discrete, grid-based structure of spreadsheets; and (2) neglect interrelated semantics, such as data dependencies and contextual links, unique to spreadsheets. In this paper, we first formalize the spreadsheet layout generation task, supported by a seven-criterion evaluation protocol and a dataset of 3,326 spreadsheets. We then introduce SheetDesigner, a zero-shot and training-free framework using Multimodal Large Language Models (MLLMs) that combines rule and vision reflection for component placement and content population. SheetDesigner outperforms five baselines by at least 22.6%. We further find that through the vision modality, MLLMs handle overlap and balance well but struggle with alignment, necessitating hybrid rule and visual reflection strategies. Our code and data are available on GitHub.
- Europe > Austria > Vienna (0.14)
- North America > Canada > British Columbia > Vancouver (0.04)
- Information Technology > Software (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Expediting data extraction using a large language model (LLM) and scoping review protocol: a methodological study within a complex scoping review
Stewart-Evans, James, Wilson, Emma, Langley, Tessa, Prayle, Andrew, Hands, Angela, Exley, Karen, Leonardi-Bee, Jo
The data extraction stages of reviews are resource-intensive, and researchers may seek to expedite data extraction using online large language models (LLMs) and review protocols. Claude 3.5 Sonnet was used to trial two approaches that used a review protocol to prompt data extraction from 10 evidence sources included in a case study scoping review. A protocol-based approach was also used to review extracted data. A limited performance evaluation found high accuracy for the two extraction approaches (83.3% and 100%) when extracting simple, well-defined citation details; accuracy was lower (9.6% and 15.8%) when extracting more complex, subjective data items. Considering all data items, both approaches had precision >90% but low recall (<25%) and F1 scores (<40%). The context of a complex scoping review, open response types and methodological approach likely impacted performance due to missed and misattributed data. LLM feedback considered the baseline extraction accurate and suggested minor amendments: four of 15 (26.7%) to citation details and 8 of 38 (21.1%) to key findings data items were considered to potentially add value. However, when repeating the process with a dataset featuring deliberate errors, only 2 of 39 (5%) errors were detected. Review-protocol-based methods used for expediency require more robust performance evaluation across a range of LLMs and review contexts with comparison to conventional prompt engineering approaches. We recommend researchers evaluate and report LLM performance if using them similarly to conduct data extraction or review extracted data. LLM feedback contributed to protocol adaptation and may assist future review protocol drafting.
- Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.14)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.67)
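The scoping-review abstract above reports high precision alongside low recall and F1. The sketch below shows how the three metrics relate. The counts are invented for illustration and are not the study's data, and the function name is hypothetical. They are chosen so that precision is 90% and recall is 25%, which puts F1 just under 40%.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical counts (assumption, not from the paper):
# 9 items extracted correctly, 1 extracted wrongly, 27 missed entirely.
p, r, f1 = precision_recall_f1(tp=9, fp=1, fn=27)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it stays close to the lower of the two values, so an extractor that is rarely wrong but misses most items still scores poorly.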
'Alexa, what do you know about us?' What I discovered when I asked Amazon to tell me everything my family's smart speaker had heard
She needs to be spoken to slowly and clearly, as you'd talk to an aged relative with diminished faculties. "Alexa, how long do wasps live for?" "Alexa, how long do wasps live if you hit them with a tea towel and then a saucepan?" In September 2016, a new presence appears in our house, squatting on the kitchen counter between the kettle and the coffee machine. It is blandly futuristic, a minimal cylinder with an LED ring that glows blue to alert us to the fact that it is ready, poised to answer our questions or carry out our instructions, as long as those instructions are clearly stated and fall within a narrow band of available "skills".
- Europe > United Kingdom (0.15)
- North America > Canada (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
- Leisure & Entertainment (0.48)
- Media (0.47)
- Appliances & Durable Goods (0.41)