AITopics | spreadsheet

spreadsheet

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

This $25 lifetime office suite puts docs, spreadsheets, slides, and email in one place

PCWorld | May-17-2026, 08:00:00 GMT

Switching between multiple apps just to finish basic work gets old fast -- one for writing, one for spreadsheets, another for slides, and somehow email still lives somewhere else entirely. MobiOffice Premium pulls everything into a single suite so you can create, calculate, present, and manage tasks without the constant app-hopping. Right now, the lifetime subscription is available for $24.97 (MSRP $119.97). You get MobiDocs for polished documents with templates, styles, images, tables, spell check, and even an AI paraphraser when you're stuck on phrasing.

artificial intelligence, buyer, consumer ai performance privacy productivity, (11 more...)

PCWorld

Industry:

Information Technology > Security & Privacy (0.77)
Leisure & Entertainment > Games > Computer Games (0.57)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence (1.00)

Towards Challenging Real World Spreadsheet Manipulation

Neural Information Processing Systems | Feb-17-2026, 09:22:56 GMT

Work was done during an internship at Zhipu AI.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Education > Educational Setting (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
(3 more...)

SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models

Neural Information Processing Systems | Dec-23-2025, 22:18:48 GMT

Computer end users have spent billions of hours completing daily tasks like tabular data processing and project timeline scheduling. Most of these tasks are repetitive and error-prone, yet most end users lack the skill to automate this burdensome work. With the advent of large language models (LLMs), directing software with natural language user requests becomes a reachable goal. In this work, we propose a SheetCopilot agent that takes a natural language task and controls a spreadsheet to fulfill the requirements. We propose a set of atomic actions as an abstraction of spreadsheet software functionalities.

bringing software productivity, name change, sheetcopilot, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)

SQuARE: Structured Query & Adaptive Retrieval Engine For Tabular Formats

Gondhalekar, Chinmay, Patel, Urjitkumar, Yeh, Fang-Chun

arXiv.org Artificial Intelligence | Dec-5-2025

Accurate question answering over real spreadsheets remains difficult due to multi-row headers, merged cells, and unit annotations that disrupt naive chunking, while rigid SQL views fail on files lacking consistent schemas. SQuARE computes a continuous score based on header depth and merge density, then routes queries either through structure-preserving chunk retrieval or SQL over an automatically constructed relational representation. A lightweight agent supervises retrieval, refinement, or combination of results across both paths when confidence is low. This design maintains header hierarchies, time labels, and units, ensuring that returned values are faithful to the original cells and straightforward to verify. Evaluated on multi-header corporate balance sheets, a heavily merged World Bank workbook, and diverse public datasets, SQuARE consistently surpasses single-strategy baselines and ChatGPT-4o on both retrieval precision and end-to-end answer accuracy while keeping latency predictable. By decoupling retrieval from model choice, the system is compatible with emerging tabular foundation models and offers a practical bridge toward more robust table understanding.

I. Introduction. Spreadsheets constitute the predominant medium for quantitative analysis across numerous disciplines, particularly in the field of finance.
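
The routing idea in this abstract can be illustrated with a small sketch. It is not the SQuARE implementation: the `SheetStats` fields, the equal weighting, the `max_depth` cap, the 0.3 threshold, and the choice to send complex layouts to chunk retrieval are all assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class SheetStats:
    """Hypothetical per-sheet layout statistics."""
    header_depth: int   # number of stacked header rows
    merged_cells: int   # cells that belong to merged regions
    total_cells: int    # non-empty cells in the sheet


def complexity_score(stats: SheetStats, max_depth: int = 4) -> float:
    """Blend header depth and merge density into a score in [0, 1].

    The 50/50 weighting and the max_depth cap are illustrative assumptions.
    """
    depth_term = min(max(stats.header_depth - 1, 0) / (max_depth - 1), 1.0)
    merge_density = stats.merged_cells / stats.total_cells if stats.total_cells else 0.0
    return 0.5 * depth_term + 0.5 * min(merge_density, 1.0)


def route(stats: SheetStats, threshold: float = 0.3) -> str:
    """Assumed policy: complex layouts use chunk retrieval, flat ones use SQL."""
    return "chunk_retrieval" if complexity_score(stats) >= threshold else "sql"
```

Under these assumptions, `route(SheetStats(1, 0, 200))` returns `"sql"` for a flat sheet, while `route(SheetStats(3, 80, 200))` returns `"chunk_retrieval"` for a sheet with three header rows and 40% merged cells.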

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2512.04292

Genre: Research Report (0.65)

Industry: Banking & Finance (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Oh That Looks Familiar: A Novel Similarity Measure for Spreadsheet Template Discovery

Krishnakumar, Anand, Ravikumaran, Vengadesh

arXiv.org Artificial Intelligence | Nov-12-2025

Traditional methods for identifying structurally similar spreadsheets fail to capture the spatial layouts and type patterns defining templates. To quantify spreadsheet similarity, we introduce a hybrid distance metric that combines semantic embeddings, data type information, and spatial positioning. In order to calculate spreadsheet similarity, our method converts spreadsheets into cell-level embeddings and then uses aggregation techniques like Chamfer and Hausdorff distances. Experiments across template families demonstrate superior unsupervised clustering performance compared to the graph-based Mondrian baseline, achieving perfect template reconstruction (Adjusted Rand Index of 1.00 versus 0.90) on the FUSTE dataset. Our approach facilitates large-scale automated template discovery, which in turn enables downstream applications such as retrieval-augmented generation over tabular collections, model training, and bulk data cleaning.
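
For readers unfamiliar with the two aggregation techniques named above, here is a minimal sketch of Chamfer and Hausdorff distances over two sets of embedding vectors. It assumes plain Euclidean distance between cell embeddings and the mean-based symmetric Chamfer variant; the paper's hybrid metric (semantic, type, and positional terms) is not reproduced here, and the function names are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def _nearest(point: Vector, others: Sequence[Vector]) -> float:
    """Euclidean distance from point to its closest neighbour in others."""
    return min(math.dist(point, other) for other in others)


def chamfer_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Symmetric Chamfer distance: mean nearest-neighbour distance, both directions."""
    a_to_b = sum(_nearest(p, b) for p in a) / len(a)
    b_to_a = sum(_nearest(q, a) for q in b) / len(b)
    return a_to_b + b_to_a


def hausdorff_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Symmetric Hausdorff distance: the worst nearest-neighbour distance."""
    return max(max(_nearest(p, b) for p in a), max(_nearest(q, a) for q in b))


sheet_a = [(0.0, 0.0), (1.0, 0.0)]
sheet_b = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
print(chamfer_distance(sheet_a, sheet_b))    # 1.0
print(hausdorff_distance(sheet_a, sheet_b))  # 3.0
```

Chamfer averages over all cells, so one stray cell shifts it only slightly, whereas Hausdorff is driven entirely by the single worst-matched cell.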

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2511.06973

Genre: Research Report (0.52)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

SODBench: A Large Language Model Approach to Documenting Spreadsheet Operations

Indika, Amila, Molybog, Igor

arXiv.org Artificial Intelligence | Oct-24-2025

Numerous knowledge workers utilize spreadsheets in business, accounting, and finance. However, a lack of systematic documentation methods for spreadsheets hinders automation, collaboration, and knowledge transfer, which risks the loss of crucial institutional knowledge. This paper introduces Spreadsheet Operations Documentation (SOD), an AI task that involves generating human-readable explanations from spreadsheet operations. Many previous studies have utilized Large Language Models (LLMs) for generating spreadsheet manipulation code; however, translating that code into natural language for SOD is a less-explored area. To address this, we present a benchmark of 111 spreadsheet manipulation code snippets, each paired with a corresponding natural language summary. We evaluate five LLMs, GPT-4o, GPT-4o-mini, LLaMA-3.3-70B, Mixtral-8x7B, and Gemma2-9B, using BLEU, GLEU, ROUGE-L, and METEOR metrics. Our findings suggest that LLMs can generate accurate spreadsheet documentation, making SOD a feasible prerequisite step toward enhancing reproducibility, maintainability, and collaborative workflows in spreadsheets, although there are challenges that need to be addressed.
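
Of the metrics listed above, ROUGE-L is the simplest to show in a few lines: it scores a generated summary by its longest common subsequence (LCS) with the reference. The sketch below assumes lowercase whitespace tokenization and the balanced F1 form; published ROUGE implementations differ in tokenization, stemming, and the F-measure weighting, so this is not the benchmark's exact scorer, and the example sentences are invented.

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L as the F1 of LCS-based precision and recall (simplified)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Invented example pair: LCS is "sum column b cell c1" (5 tokens).
score = rouge_l_f1("sum column B into cell C1",
                   "sum column B and write to cell C1")
print(round(score, 3))  # 0.714
```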

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.19864

Country: North America > United States (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards Challenging Real World Spreadsheet Manipulation

Neural Information Processing Systems | Oct-10-2025, 12:55:56 GMT

Work was done during an internship at Zhipu AI.

benchmark, instruction, spreadsheet, (14 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Education > Educational Setting (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
(3 more...)

MMORE: Massive Multimodal Open RAG & Extraction

Sallinen, Alexandre, Krsteski, Stefan, Teiletche, Paul, Allard, Marc-Antoine, Lecoeur, Baptiste, Zhang, Michael, Nemo, Fabrice, Kalajdzic, David, Meyer, Matthias, Hartley, Mary-Anne

arXiv.org Artificial Intelligence | Sep-16-2025

We introduce MMORE, an open-source pipeline for Massive Multimodal Open Retrieval-Augmented Generation and Extraction, designed to ingest, transform, and retrieve knowledge from heterogeneous document formats at scale. MMORE supports more than fifteen file types, including text, tables, images, emails, audio, and video, and processes them into a unified format to enable downstream applications for LLMs. The architecture offers modular, distributed processing, enabling scalable parallelization across CPUs and GPUs. On processing benchmarks, MMORE demonstrates a 3.8-fold speedup over single-node baselines and 40% higher accuracy than Docling on scanned PDFs. The pipeline integrates hybrid dense-sparse retrieval and supports both interactive APIs and batch RAG endpoints. Evaluated on PubMedQA, MMORE-augmented medical LLMs improve biomedical QA accuracy with increasing retrieval depth. MMORE provides a robust, extensible foundation for deploying task-agnostic RAG systems on diverse, real-world multimodal data. The codebase is available at https://github.com/swiss-ai/mmore.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2509.11937

Country: Europe (0.28)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

SheetDesigner: MLLM-Powered Spreadsheet Layout Generation with Rule-Based and Vision-Based Reflection

Chen, Qin, Ren, Yuanyi, Ma, Xiaojun, Liu, Mugeng, Shi, Han, Zhang, Dongmei

arXiv.org Artificial Intelligence | Sep-10-2025

Spreadsheets are critical to data-centric tasks, with rich, structured layouts that enable efficient information transmission. Given the time and expertise required for manual spreadsheet layout design, there is an urgent need for automated solutions. However, existing automated layout models are ill-suited to spreadsheets, as they often (1) treat components as axis-aligned rectangles with continuous coordinates, overlooking the inherently discrete, grid-based structure of spreadsheets; and (2) neglect interrelated semantics, such as data dependencies and contextual links, unique to spreadsheets. In this paper, we first formalize the spreadsheet layout generation task, supported by a seven-criterion evaluation protocol and a dataset of 3,326 spreadsheets. We then introduce SheetDesigner, a zero-shot and training-free framework using Multimodal Large Language Models (MLLMs) that combines rule and vision reflection for component placement and content population. SheetDesigner outperforms five baselines by at least 22.6%. We further find that through the vision modality, MLLMs handle overlap and balance well but struggle with alignment, necessitating hybrid rule and visual reflection strategies. Our code and data are available on GitHub.
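
The point about discrete, grid-based layouts can be made concrete with a small rule-style check. The sketch below is not SheetDesigner's code: the `Component` class, its fields, and `find_overlaps` are hypothetical names, and it assumes a rule-based reflection step could flag components that occupy the same cells.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Component:
    """Hypothetical layout component placed on integer grid cells."""
    name: str
    top: int      # first row, 0-based, inclusive
    left: int     # first column, 0-based, inclusive
    height: int   # number of rows
    width: int    # number of columns

    def cells(self) -> set[tuple[int, int]]:
        """All (row, column) cells covered by this component."""
        return {(r, c)
                for r in range(self.top, self.top + self.height)
                for c in range(self.left, self.left + self.width)}


def find_overlaps(components: list[Component]) -> list[tuple[str, str, int]]:
    """Return (name_a, name_b, shared_cell_count) for every overlapping pair."""
    overlaps = []
    for a, b in combinations(components, 2):
        shared = len(a.cells() & b.cells())
        if shared:
            overlaps.append((a.name, b.name, shared))
    return overlaps


layout = [
    Component("title", top=0, left=0, height=1, width=4),
    Component("table", top=1, left=0, height=5, width=4),
    Component("chart", top=4, left=2, height=4, width=3),
]
print(find_overlaps(layout))  # [('table', 'chart', 4)]
```

Because positions are whole cells rather than continuous coordinates, overlap reduces to set intersection, and the count of shared cells gives a simple penalty that could be fed back to the model.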

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.07473

Country: Europe > Austria (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Expediting data extraction using a large language model (LLM) and scoping review protocol: a methodological study within a complex scoping review

Stewart-Evans, James, Wilson, Emma, Langley, Tessa, Prayle, Andrew, Hands, Angela, Exley, Karen, Leonardi-Bee, Jo

arXiv.org Artificial Intelligence | Jul-10-2025

The data extraction stages of reviews are resource-intensive, and researchers may seek to expedite data extraction using online large language models (LLMs) and review protocols. Claude 3.5 Sonnet was used to trial two approaches that used a review protocol to prompt data extraction from 10 evidence sources included in a case study scoping review. A protocol-based approach was also used to review extracted data. Limited performance evaluation was undertaken which found high accuracy for the two extraction approaches (83.3% and 100%) when extracting simple, well-defined citation details; accuracy was lower (9.6% and 15.8%) when extracting more complex, subjective data items. Considering all data items, both approaches had precision >90% but low recall (<25%) and F1 scores (<40%). The context of a complex scoping review, open response types and methodological approach likely impacted performance due to missed and misattributed data. LLM feedback considered the baseline extraction accurate and suggested minor amendments: four of 15 (26.7%) to citation details and 8 of 38 (21.1%) to key findings data items were considered to potentially add value. However, when repeating the process with a dataset featuring deliberate errors, only 2 of 39 (5%) errors were detected. Review-protocol-based methods used for expediency require more robust performance evaluation across a range of LLMs and review contexts with comparison to conventional prompt engineering approaches. We recommend researchers evaluate and report LLM performance if using them similarly to conduct data extraction or review extracted data. LLM feedback contributed to protocol adaptation and may assist future review protocol drafting.
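
The relationship between the reported precision, recall, and F1 bounds can be checked with the standard F1 formula. The sketch below uses boundary values (0.90, 1.00, 0.25) as illustrative inputs only, since the abstract gives inequalities rather than exact scores; the helper names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def percent(part: int, whole: int) -> float:
    """Share of part in whole as a percentage, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Even with perfect precision, recall of 0.25 caps F1 at 0.40,
# so recall below 25% is consistent with the reported F1 below 40%.
print(round(f1_score(0.90, 0.25), 3))  # 0.391
print(round(f1_score(1.00, 0.25), 3))  # 0.4

# Proportions quoted in the abstract.
print(percent(4, 15))  # 26.7
print(percent(8, 38))  # 21.1
print(percent(2, 39))  # 5.1
```

High precision with low recall means the items the model did extract were mostly right, but most of the relevant items were missed, which matches the abstract's note about missed data.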

extraction, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.06623

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
