AITopics | table detection

Collaborating Authors

table detection

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

0d97fe65d7a1dc12a05642d9fa4cd578-Paper-Conference.pdf

Neural Information Processing SystemsFeb-7-2026, 20:55:00 GMT

In this paper, we present a novel large vision-language model, TabPedia, equipped with aconcept synergy mechanism.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Table Detection with Active Learning

Gautam, Somraj, Purohit, Nachiketa, Harit, Gaurav

arXiv.org Artificial IntelligenceSep-25-2025

Efficient data annotation remains a critical challenge in machine learning, particularly for object detection tasks requiring extensive labeled data. Active learning (AL) has emerged as a promising solution to minimize annotation costs by selecting the most informative samples. While traditional AL approaches primarily rely on uncertainty-based selection, recent advances suggest that incorporating diversity-based strategies can enhance sampling efficiency in object detection tasks. Our approach ensures the selection of representative examples that improve model generalization. We evaluate our method on two benchmark datasets (TableBank-LaTeX, TableBank-Word) using state-of-the-art table detection architectures, CascadeTabNet and YOLOv9. Our results demonstrate that AL-based example selection significantly outperforms random sampling, reducing annotation effort given a limited budget while maintaining comparable performance to fully supervised models. Our method achieves higher mAP scores within the same annotation budget.

active learning, artificial intelligence, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2509.20003

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Synthetic Data Augmentation for Table Detection: Re-evaluating TableNet's Performance with Automatically Generated Document Images

Sahukara, Krishna, Bettouche, Zineddine, Fischer, Andreas

arXiv.org Artificial IntelligenceJun-18-2025

Document pages captured by smartphones or scanners often contain tables, yet manual extraction is slow and error-prone. We introduce an automated LaTeX-based pipeline that synthesizes realistic two-column pages with visually diverse table layouts and aligned ground-truth masks. The generated corpus augments the real-world Marmot benchmark and enables a systematic resolution study of TableNet. Training TableNet on our synthetic data achieves a pixel-wise XOR error of 4.04% on our synthetic test set with a 256x256 input resolution, and 4.33% with 1024x1024. The best performance on the Marmot benchmark is 9.18% (at 256x256), while cutting manual annotation effort through automation.

machine learning, natural language, table detection, (15 more...)

arXiv.org Artificial Intelligence

2506.14583

Country: Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
(2 more...)

Add feedback

SynFinTabs: A Dataset of Synthetic Financial Tables for Information and Table Extraction

Bradley, Ethan, Roman, Muhammad, Rafferty, Karen, Devereux, Barry

arXiv.org Artificial IntelligenceDec-5-2024

Table extraction from document images is a challenging AI problem, and labelled data for many content domains is difficult to come by. Existing table extraction datasets often focus on scientific tables due to the vast amount of academic articles that are readily available, along with their source code. However, there are significant layout and typographical differences between tables found across scientific, financial, and other domains. Current datasets often lack the words, and their positions, contained within the tables, instead relying on unreliable OCR to extract these features for training modern machine learning models on natural language processing tasks. Therefore, there is a need for a more general method of obtaining labelled data. We present SynFinTabs, a large-scale, labelled dataset of synthetic financial tables. Our hope is that our method of generating these synthetic tables is transferable to other domains. To demonstrate the effectiveness of our dataset in training models to extract information from table images, we create FinTabQA, a layout large language model trained on an extractive question-answering task. We test our model using real-world financial tables and compare it to a state-of-the-art generative model and discuss the results. We make the dataset, model, and dataset generation code publicly available.

dataset, end position, start and end position, (15 more...)

arXiv.org Artificial Intelligence

2412.04262

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)

Add feedback

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Tian, Yuzhang, Zhao, Jianbo, Dong, Haoyu, Xiong, Junyu, Xia, Shiyu, Zhou, Mengyu, Lin, Yun, Cambronero, José, He, Yeye, Han, Shi, Zhang, Dongmei

arXiv.org Artificial IntelligenceJul-12-2024

Spreadsheets, with their extensive two-dimensional grids, various layouts, and diverse formatting options, present notable challenges for large language models (LLMs). In response, we introduce SpreadsheetLLM, pioneering an efficient encoding method designed to unleash and optimize LLMs' powerful understanding and reasoning capability on spreadsheets. Initially, we propose a vanilla serialization approach that incorporates cell addresses, values, and formats. However, this approach was limited by LLMs' token constraints, making it impractical for most applications. To tackle this challenge, we develop SheetCompressor, an innovative encoding framework that compresses spreadsheets effectively for LLMs. It comprises three modules: structural-anchor-based compression, inverse index translation, and data-format-aware aggregation. It significantly improves performance in spreadsheet table detection task, outperforming the vanilla approach by 25.6% in GPT4's in-context learning setting. Moreover, fine-tuned LLM with SheetCompressor has an average compression ratio of 25 times, but achieves a state-of-the-art 78.9% F1 score, surpassing the best existing models by 12.3%. Finally, we propose Chain of Spreadsheet for downstream tasks of spreadsheet understanding and validate in a new and demanding spreadsheet QA task. We methodically leverage the inherent layout and structure of spreadsheets, demonstrating that SpreadsheetLLM is highly effective across a variety of spreadsheet tasks.

boundary, llm, spreadsheet, (12 more...)

arXiv.org Artificial Intelligence

2407.09025

Country:

North America > United States > District of Columbia > Washington (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > France (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ClusterTabNet: Supervised clustering method for table detection and table structure recognition

Polewczyk, Marek, Spinaci, Marco

arXiv.org Artificial IntelligenceFeb-12-2024

We present a novel deep-learning-based method to cluster words in documents which we apply to detect and recognize tables given the OCR output. We interpret table structure bottom-up as a graph of relations between pairs of words (belonging to the same row, column, header, as well as to the same table) and use a transformer encoder model to predict its adjacency matrix. We demonstrate the performance of our method on the PubTables-1M dataset as well as PubTabNet and FinTabNet datasets. Compared to the current state-of-the-art detection methods such as DETR and Faster R-CNN, our method achieves similar or better accuracy, while requiring a significantly smaller model.

dataset, recognition, table detection, (14 more...)

arXiv.org Artificial Intelligence

2402.07502

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.64)

Add feedback

TDeLTA: A Light-weight and Robust Table Detection Method based on Learning Text Arrangement

Fan, Yang, Wu, Xiangping, Chen, Qingcai, Li, Heng, Huang, Yan, Cai, Zhixiang, Wu, Qitian

arXiv.org Artificial IntelligenceDec-18-2023

The diversity of tables makes table detection a great challenge, leading to existing models becoming more tedious and complex. Despite achieving high performance, they often overfit to the table style in training set, and suffer from significant performance degradation when encountering out-of-distribution tables in other domains. To tackle this problem, we start from the essence of the table, which is a set of text arranged in rows and columns. Based on this, we propose a novel, light-weighted and robust Table Detection method based on Learning Text Arrangement, namely TDeLTA. TDeLTA takes the text blocks as input, and then models the arrangement of them with a sequential encoder and an attention module. To locate the tables precisely, we design a text-classification task, classifying the text blocks into 4 categories according to their semantic roles in the tables. Experiments are conducted on both the text blocks parsed from PDF and extracted by open-source OCR tools, respectively. Compared to several state-of-the-art methods, TDeLTA achieves competitive results with only 3.1M model parameters on the large-scale public datasets. Moreover, when faced with the cross-domain data under the 0-shot setting, TDeLTA outperforms baselines by a large margin of nearly 7%, which shows the strong robustness and transferability of the proposed model.

information, tdelta, text block, (16 more...)

arXiv.org Artificial Intelligence

2312.11043

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)

Add feedback

A Review On Table Recognition Based On Deep Learning

Jiyuan, Shi, chunqi, Shi

arXiv.org Artificial IntelligenceDec-7-2023

Table recognition is using the computer to automatically understand the table, to detect the position of the table from the document or picture, and to correctly extract and identify the internal structure and content of the table. After earlier mainstream approaches based on heuristic rules and machine learning, the development of deep learning techniques has brought a new paradigm to this field. This review mainly discusses the table recognition problem from five aspects. The first part introduces data sets, benchmarks, and commonly used evaluation indicators. This section selects representative data sets, benchmarks, and evaluation indicators that are frequently used by researchers. The second part introduces the table recognition model. This survey introduces the development of the table recognition model, especially the table recognition model based on deep learning. It is generally accepted that table recognition is divided into two stages: table detection and table structure recognition. This section introduces the models that follow this paradigm (TD and TSR). The third part is the End-to-End method, this section introduces some scholars' attempts to use an end-to-end approach to solve the table recognition problem once and for all and the part are Data-centric methods, such as data augmentation, aligning benchmarks, and other methods. The fourth part is the data-centric approach, such as data enhancement, alignment benchmark, and so on. The fifth part summarizes and compares the experimental data in the field of form recognition, and analyzes the mainstream and more advantageous methods. Finally, this paper also discusses the possible development direction and trend of form processing in the future, to provide some ideas for researchers in the field of table recognition. (Resource will be released at https://github.com/Wa1den-jy/Topic-on-Table-Recognition .)

arxiv, document analysis, recognition, (12 more...)

arXiv.org Artificial Intelligence

2312.04808

Country:

Europe > Switzerland (0.04)
North America > United States > Massachusetts (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
(4 more...)

Genre: Overview (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Sketch2FullStack: Generating Skeleton Code of Full Stack Website and Application from Sketch using Deep Learning and Computer Vision

Barua, Somoy Subandhu, Zulkarnain, Imam Mohammad, Roy, Abhishek, Alam, Md. Golam Rabiul, Uddin, Md Zia

arXiv.org Artificial IntelligenceNov-26-2022

For a full-stack web or app development, it requires a software firm or more specifically a team of experienced developers to contribute a large portion of their time and resources to design the website and then convert it to code. As a result, the efficiency of the development team is significantly reduced when it comes to converting UI wireframes and database schemas into an actual working system. It would save valuable resources and fasten the overall workflow if the clients or developers can automate this process of converting the pre-made full-stack website design to get a partially working if not fully working code. In this paper, we present a novel approach of generating the skeleton code from sketched images using Deep Learning and Computer Vision approaches. The dataset for training are first-hand sketched images of low fidelity wireframes, database schemas and class diagrams. The approach consists of three parts. First, the front-end or UI elements detection and extraction from custom-made UI wireframes. Second, individual database table creation from schema designs and lastly, creating a class file from class diagrams.

artificial intelligence, detection, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2211.14607

Country:

Europe > Switzerland > Basel-City > Basel (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre:

Workflow (0.88)
Research Report (0.84)
Overview > Innovation (0.34)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DATa: Domain Adaptation-Aided Deep Table Detection Using Visual-Lexical Representations

Kwon, Hyebin, An, Joungbin, Lee, Dongwoo, Shin, Won-Yong

arXiv.org Artificial IntelligenceNov-12-2022

Considerable research attention has been paid to table detection by developing not only rule-based approaches reliant on hand-crafted heuristics but also deep learning approaches. Although recent studies successfully perform table detection with enhanced results, they often experience performance degradation when they are used for transferred domains whose table layout features might differ from the source domain in which the underlying model has been trained. To overcome this problem, we present DATa, a novel Domain Adaptation-aided deep Table detection method that guarantees satisfactory performance in a specific target domain where few trusted labels are available. To this end, we newly design lexical features and an augmented model used for re-training. More specifically, after pre-training one of state-of-the-art vision-based models as our backbone network, we re-train our augmented model, consisting of the vision-based model and the multilayer perceptron (MLP) architecture. Using new confidence scores acquired based on the trained MLP architecture as well as an initial prediction of bounding boxes and their confidence scores, we calculate each confidence score more accurately. To validate the superiority of DATa, we perform experimental evaluations by adopting a real-world benchmark dataset in a source domain and another dataset in our target domain consisting of materials science articles. Experimental results demonstrate that the proposed DATa method substantially outperforms competing methods that only utilize visual representations in the target domain. Such gains are possible owing to the capability of eliminating high false positives or false negatives according to the setting of a confidence score threshold.

artificial intelligence, machine learning, table detection, (15 more...)

arXiv.org Artificial Intelligence

2211.06648

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)
Asia > South Korea > Gyeonggi-do > Suwon (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback