Collaborating Authors

 Surinta, Olarik


Multi-language Video Subtitle Dataset for Image-based Text Recognition

arXiv.org Artificial Intelligence

The Multi-language Video Subtitle Dataset is a collection designed to support research in text recognition across multiple languages. It comprises 4,224 subtitle images extracted from 24 videos sourced from online platforms and covers 157 unique characters, including Thai consonants, vowels, tone marks, punctuation marks, numerals, Roman characters, and Arabic numerals. The dataset addresses the growing need for high-quality, multilingual text recognition data, particularly as videos with embedded subtitles become increasingly dominant on platforms like YouTube and Facebook. Variability in text length, font, and placement within the images, together with complex backgrounds, makes recognition challenging and makes the dataset a useful resource for developing and evaluating deep learning models. The dataset facilitates accurate text transcription from video content while providing a foundation for improving computational efficiency in text recognition systems. As a result, it has the potential to advance research across several computer science disciplines, including artificial intelligence, deep learning, computer vision, and pattern recognition.


EcoCropsAID: Economic Crops Aerial Image Dataset for Land Use Classification

arXiv.org Artificial Intelligence

The EcoCropsAID dataset is a collection of 5,400 aerial images captured between 2014 and 2018 using the Google Earth application. It focuses on five key economic crops in Thailand: rice, sugarcane, cassava, rubber, and longan. The images were collected at various crop growth stages (early cultivation, growth, and harvest), resulting in significant variability within each category and similarities across different categories. These variations, coupled with differences in resolution, color, and contrast introduced by multiple remote imaging sensors, present substantial challenges for land use classification. The dataset is an interdisciplinary resource that spans multiple research domains, including remote sensing, geoinformatics, artificial intelligence, and computer vision. Its features offer opportunities for researchers to explore novel approaches, such as extracting spatial and temporal features, developing deep learning architectures, and implementing transformer-based models. The EcoCropsAID dataset provides a platform for advancing research in land use classification, with implications for optimizing agricultural practices and enhancing sustainable development. This study investigates the use of deep learning algorithms to classify economic crop areas in northeastern Thailand, utilizing satellite imagery to address the challenges posed by diverse patterns and similarities across categories.