Collaborating Authors

 Gomułka, Jakub


Microsoft Cloud-based Digitization Workflow with Rich Metadata Acquisition for Cultural Heritage Objects

arXiv.org Artificial Intelligence

In response to several cultural heritage initiatives at the Jagiellonian University, we have developed a new digitization workflow in collaboration with the Jagiellonian Library (JL). The workflow is built on readily accessible technology: the Microsoft 365 cloud with MS Excel files as metadata acquisition interfaces, Office Scripts for validation, and MS SharePoint for storage. This allows domain experts (philologists, historians, philosophers, librarians, archivists, curators, etc.) to acquire metadata regardless of their experience with information systems. The ultimate goal is to create a knowledge graph that describes the analyzed holdings and is linked to general knowledge bases as well as to other cultural heritage collections, so careful attention is paid to the accuracy of the metadata and to proper links to external sources. The workflow has already been evaluated in two pilots in the DiHeLib project focused on digitizing the so-called "Berlin Collection" and in two workshops with international guests, which allowed us to refine it and to confirm its correctness and usability for JL. Because the proposed workflow does not interfere with existing systems or domain guidelines regarding digitization and basic metadata collection in a given institution (e.g., file type, image quality, use of Dublin Core/MARC-21), but extends them to enable rich metadata collection that was not previously possible, we believe that it could be of interest to all GLAMs (galleries, libraries, archives, and museums).


Efficient argument classification with compact language models and ChatGPT-4 refinements

arXiv.org Artificial Intelligence

Argument mining (AM) is a multidisciplinary research field encompassing diverse areas such as logic and philosophy, language, rhetoric and law, psychology, and computer science. The theory of argumentation and the use of logical reasoning to justify claims and conclusions have been studied extensively, but the application of data science methods to automate these processes is a relatively recent development. In nearly every field, the ability to automatically extract arguments and their relationships from an input source is of significant importance. Over the last decade, AM has become one of the core areas of study within artificial intelligence [1, 2] due to its ability to combine representational needs with user-related cognitive models and computational models for automated reasoning [3]. As a subfield of Natural Language Processing (NLP) and computational linguistics, AM focuses on automatically identifying, extracting, and analyzing argumentative structures within natural language texts, which includes recognizing core components of arguments, such as claims and evidence [4].