8th international conference
Harmonizing Metadata of Language Resources for Enhanced Querying and Accessibility
This paper addresses the harmonization of metadata from diverse repositories of language resources (LRs). Leveraging linked data and RDF techniques, we integrate data from multiple sources into a unified model based on DCAT and META-SHARE OWL ontology. Our methodology supports text-based search, faceted browsing, and advanced SPARQL queries through Linghub, a newly developed portal. Real user queries from the Corpora Mailing List (CML) were evaluated to assess Linghub capability to satisfy actual user needs. Results indicate that while some limitations persist, many user requests can be successfully addressed. The study highlights significant metadata issues and advocates for adherence to open vocabularies and standards to enhance metadata harmonization. This initial research underscores the importance of API-based access to LRs, promoting machine usability and data subset extraction for specific purposes, paving the way for more efficient and standardized LR utilization.
Task segmentation based on transition state clustering for surgical robot assistance
Yamada, Yutaro, Colan, Jacinto, Davila, Ana, Hasegawa, Yasuhisa
Understanding surgical tasks represents an important challenge for autonomy in surgical robotic systems. To achieve this, we propose an online task segmentation framework that uses hierarchical transition state clustering to activate predefined robot assistance. Our approach involves performing a first clustering on visual features and a subsequent clustering on robot kinematic features for each visual cluster. This enables to capture relevant task transition information on each modality independently. The approach is implemented for a pick-and-place task commonly found in surgical training. The validation of the transition segmentation showed high accuracy and fast computation time. We have integrated the transition recognition module with predefined robot-assisted tool positioning. The complete framework has shown benefits in reducing task completion time and cognitive workload.
Constrained Motion Planning for a Robotic Endoscope Holder based on Hierarchical Quadratic Programming
Colan, Jacinto, Davila, Ana, Hasegawa, Yasuhisa
Minimally Invasive Surgeries (MIS) are challenging for surgeons due to the limited field of view and constrained range of motion imposed by narrow access ports. These challenges can be addressed by robot-assisted endoscope systems which provide precise and stabilized positioning, as well as constrained and smooth motion control of the endoscope. In this work, we propose an online hierarchical optimization framework for visual servoing control of the endoscope in MIS. The framework prioritizes maintaining a remote-center-of-motion (RCM) constraint to prevent tissue damage, while a visual tracking task is defined as a secondary task to enable autonomous tracking of visual features of interest. We validated our approach using a 6-DOF Denso VS050 manipulator and achieved optimization solving times under 0.4 ms and maximum RCM deviation of approximately 0.4 mm. Our results demonstrate the effectiveness of the proposed approach in addressing the constrained motion planning challenges of MIS, enabling precise and autonomous endoscope positioning and visual tracking.
Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions
Chen, Jiayu, Ganguly, Bhargav, Xu, Yang, Mei, Yongsheng, Lan, Tian, Aggarwal, Vaneet
Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control also necessitate learning a generator function from the offline data to serve as the strategy or policy. In this case, applying deep generative models in offline policy learning exhibits great potential, and numerous studies have explored in this direction. However, this field still lacks a comprehensive review and so developments of different branches are relatively independent. In this paper, we provide the first systematic review on the applications of deep generative models for offline policy learning. In particular, we cover five mainstream deep generative models, including Variational Auto-Encoders, Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion Models, and their applications in both offline reinforcement learning (offline RL) and imitation learning (IL). Offline RL and IL are two main branches of offline policy learning and are widely-adopted techniques for sequential decision-making. Notably, for each type of DGM-based offline policy learning, we distill its fundamental scheme, categorize related works based on the usage of the DGM, and sort out the development process of algorithms in that field. Subsequent to the main content, we provide in-depth discussions on deep generative models and offline policy learning as a summary, based on which we present our perspectives on future research directions. This work offers a hands-on reference for the research progress in deep generative models for offline policy learning, and aims to inspire improved DGM-based offline RL or IL algorithms. For convenience, we maintain a paper list on https://github.com/LucasCJYSDL/DGMs-for-Offline-Policy-Learning.
A Use Case: Reformulating Query Rewriting as a Statistical Machine Translation Problem
Algan, Abdullah Can, Yürekli, Emre, Çayır, Aykut
One of the most important challenges for modern search engines is to retrieve relevant web content based on user queries. In order to achieve this challenge, search engines have a module to rewrite user queries. That is why modern web search engines utilize some statistical and neural models used in the natural language processing domain. Statistical machine translation is a well-known NLP method among them. The paper proposes a query rewriting pipeline based on a monolingual machine translation model that learns to rewrite Arabic user search queries. This paper also describes preprocessing steps to create a mapping between user queries and web page titles.
User-Centered Security in Natural Language Processing
This dissertation proposes a framework of user-centered security in Natural Language Processing (NLP), and demonstrates how it can improve the accessibility of related research. Accordingly, it focuses on two security domains within NLP with great public interest. First, that of author profiling, which can be employed to compromise online privacy through invasive inferences. Without access and detailed insight into these models' predictions, there is no reasonable heuristic by which Internet users might defend themselves from such inferences. Secondly, that of cyberbullying detection, which by default presupposes a centralized implementation; i.e., content moderation across social platforms. As access to appropriate data is restricted, and the nature of the task rapidly evolves (both through lexical variation, and cultural shifts), the effectiveness of its classifiers is greatly diminished and thereby often misrepresented. Under the proposed framework, we predominantly investigate the use of adversarial attacks on language; i.e., changing a given input (generating adversarial samples) such that a given model does not function as intended. These attacks form a common thread between our user-centered security problems; they are highly relevant for privacy-preserving obfuscation methods against author profiling, and adversarial samples might also prove useful to assess the influence of lexical variation and augmentation on cyberbullying detection.
Mining Duplicate Questions of Stack Overflow
Kale, Mihir, Rayasam, Anirudha, Parik, Radhika, Dheram, Pranav
There has a been a significant rise in the use of Community Question Answering sites (CQAs) over the last decade owing primarily to their ability to leverage the wisdom of the crowd. Duplicate questions have a crippling effect on the quality of these sites. Tackling duplicate questions is therefore an important step towards improving quality of CQAs. In this regard, we propose two neural network based architectures for duplicate question detection on Stack Overflow. We also propose explicitly modeling the code present in questions to achieve results that surpass the state of the art.
Bayesian Deep Learning for Graphs
The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles over which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automatizes the choice of almost all model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.