tablenet
Synthetic Data Augmentation for Table Detection: Re-evaluating TableNet's Performance with Automatically Generated Document Images
Sahukara, Krishna, Bettouche, Zineddine, Fischer, Andreas
Document pages captured by smartphones or scanners often contain tables, yet manual extraction is slow and error-prone. We introduce an automated LaTeX-based pipeline that synthesizes realistic two-column pages with visually diverse table layouts and aligned ground-truth masks. The generated corpus augments the real-world Marmot benchmark and enables a systematic resolution study of TableNet. Training TableNet on our synthetic data achieves a pixel-wise XOR error of 4.04% on our synthetic test set with a 256x256 input resolution, and 4.33% with 1024x1024. The best result on the Marmot benchmark is 9.18% (at 256x256), and the automated pipeline reduces manual annotation effort.
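The abstract reports a pixel-wise XOR error without defining it. A minimal sketch follows, assuming the metric is the percentage of pixels on which a binarized predicted mask and the ground-truth mask disagree. The function name xor_error and the toy masks are hypothetical and are not taken from the paper.

```python
def xor_error(pred_mask, true_mask):
    """Percentage of pixels where two binary masks disagree.

    Assumption: masks are equally sized 2D sequences of 0/1 values,
    and the error is (number of differing pixels) / (total pixels) * 100.
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of rows")
    mismatched = 0
    total = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        if len(pred_row) != len(true_row):
            raise ValueError("rows must have the same length")
        for p, t in zip(pred_row, true_row):
            # XOR is 1 exactly when one mask marks the pixel and the other does not.
            mismatched += bool(p) ^ bool(t)
            total += 1
    if total == 0:
        raise ValueError("masks must not be empty")
    return 100.0 * mismatched / total


# Toy 2x4 masks: 1 of 8 pixels differs.
pred = [[1, 1, 0, 0],
        [1, 1, 0, 0]]
true = [[1, 1, 1, 0],
        [1, 1, 0, 0]]
error_percent = xor_error(pred, true)
```

Under this reading, lower is better, and identical masks give an error of 0%.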
Table Detection and Extraction -- TableNet, Deep Learning model with PyTorch from images
The loss function used for this model is torch.nn.BCEWithLogitsLoss(), which combines a Sigmoid layer and the Binary Cross Entropy loss in a single function. The train function returns a metric dictionary containing the F1 Score, Accuracy, Precision, Recall, and Loss for the current epoch. The F1 Score already takes recall and precision into account, but I report both separately because I wanted to know which of the two is better or worse. The test function is very similar to the train function and returns the same metrics for the current epoch. The model is trained for about 100 epochs with early stopping. In each epoch, I run both the train_on_epoch and the test_on_epoch functions, display their results, and compare them against the scores from the previous epoch.
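To make the pieces above concrete, here is a small dependency-free sketch of what they compute. It is not the author's code: bce_with_logits mirrors the mean-reduced formula that torch.nn.BCEWithLogitsLoss documents, while binary_metrics, EarlyStopping, the 0.5 threshold, and the patience value are hypothetical helpers chosen for illustration. The early-stopping rule shown here compares against the best score seen so far, which is an assumption, since the text only says scores are checked against the last epoch.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy computed directly from raw logits.

    Uses the stable form max(x, 0) - x*y + log(1 + exp(-|x|)), which equals
    -[y*log(sigmoid(x)) + (1 - y)*log(1 - sigmoid(x))].
    """
    losses = [max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
              for x, y in zip(logits, targets)]
    return sum(losses) / len(losses)


def binary_metrics(logits, targets, threshold=0.5):
    """Metric dictionary for one epoch (threshold is an assumed default)."""
    preds = [1 if sigmoid(x) >= threshold else 0 for x in logits]
    tp = sum(1 for p, t in zip(preds, targets) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(preds, targets) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(preds, targets) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(preds, targets) if p == 0 and t == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "f1": f1,
        "accuracy": (tp + tn) / len(targets),
        "precision": precision,
        "recall": recall,
        "loss": bce_with_logits(logits, targets),
    }


class EarlyStopping:
    """Stop once the monitored score has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, score):
        """Record this epoch's score; return True when training should stop."""
        if score > self.best:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Toy batch: four pixel logits and their ground-truth labels.
metrics = binary_metrics([2.0, -1.0, 0.5, -3.0], [1, 0, 0, 1])
```

Reporting precision and recall next to F1, as the author does, shows whether a low F1 comes from false positives or from missed table pixels. In a real training loop the logits would come from the model and the loss from torch.nn.BCEWithLogitsLoss itself.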
Deep Dive of TableNet
Originally published on Towards AI. People increasingly rely on technology to obtain the data they need. With the growing use of mobile phones and scanners to capture and upload documents, the need to extract information locked in unstructured document images such as shop receipts, insurance claim forms, and bank invoices is growing as well. One major hurdle is that these images usually contain information in the form of tables, and extracting data from tabular sub-images poses a distinct set of challenges.