Improving Text Relationship Modeling with Artificial Data

Oct-27-2020, arXiv.org Artificial Intelligence

Identifying whole/part relationships between books in digital libraries can be a valuable tool for better understanding and cataloging the works found in bibliographic collections, irrespective of the form in which they were printed. However, this relationship is difficult to learn computationally because little ground truth is available. In this paper, we present an approach for augmenting whole/part training data with artificially generated books. We find artificial data to be a robust approach to training deep neural network classifiers on books with limited real ground truth: it helps prevent over-fitting and improves classification by 91.0%. Modern cataloging standards support encoding complex work-level relationships, opening the possibility for bibliographic collections that better represent the complex ways that works are changed, iterated, and collated in library books.

artificial intelligence, deep learning, machine learning

Country:
- North America > United States (0.04)
- Oceania > Australia
  - Victoria > Melbourne (0.04)
- Europe
  - Germany > Bavaria
    - Upper Bavaria > Munich (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
- Asia > China
  - Hubei Province > Wuhan (0.04)

Genre:
- Research Report (1.00)

Industry:
- Education (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
