Hạ Long
CON: Continual Object Navigation via Data-Free Inter-Agent Knowledge Transfer in Unseen and Unfamiliar Places
Terashima, Kouki, Iwata, Daiki, Tanaka, Kanji
This work explores the potential of brief inter-agent knowledge transfer (KT) to enhance robotic object goal navigation (ON) in unseen and unfamiliar environments. Drawing on the analogy of human travelers acquiring local knowledge, we propose a framework in which a traveler robot (student) communicates with local robots (teachers) to obtain ON knowledge through minimal interactions. We frame this process as a data-free continual learning (CL) challenge, aiming to transfer knowledge from a black-box model (teacher) to a new model (student). In contrast to approaches like zero-shot ON using large language models (LLMs), which utilize inherently communication-friendly natural language for knowledge representation, the other two major ON approaches -- frontier-driven methods using object feature maps and learning-based ON using neural state-action maps -- present complex challenges, and data-free KT for them remains largely uncharted. To address this gap, we propose a lightweight, plug-and-play KT module targeting non-cooperative black-box teachers in open-world settings. Using the universal assumption that every teacher robot has vision and mobility capabilities, we define state-action history as the primary knowledge base. Our formulation leads to the development of a query-based occupancy map that dynamically represents target object locations, serving as an effective and communication-friendly knowledge representation. We validate the effectiveness of our method through experiments conducted in the Habitat environment.
A Hybrid-Layered System for Image-Guided Navigation and Robot Assisted Spine Surgery
T, Suhail Ansari, Maik, Vivek, Naheem, Minhas, Ram, Keerthi, Lakshmanan, Manojkumar, Sivaprakasam, Mohanasankar
In response to the growing demand for precise and affordable solutions for Image-Guided Spine Surgery (IGSS), this paper presents the development of a Robot-Assisted and Navigation-Guided IGSS System. The work integrates cutting-edge technologies to attain the required surgical precision and limit user radiation exposure, thereby addressing the limitations of manual surgical methods. We propose an IGSS workflow and system architecture employing a hybrid-layered approach, combining modular and integrated system architectures in distinct layers to develop an affordable system that supports seamless integration, scalability, and reconfigurability. We developed and integrated the system and extensively tested it on phantoms and cadavers. On phantoms, the proposed system's accuracy is 1.020 mm using navigation guidance and 1.11 mm using robot assistance. Cadaveric validation showed similar performance: as per the Gertzbein-Robbins scale, 84% of screw placements were grade A and 10% were grade B using navigation guidance, while 90% were grade A and 10% were grade B using robot assistance, supporting the system's efficacy for IGSS. The evaluated performance is adequate for IGSS and on par with existing systems in the literature and those commercially available. User radiation exposure is lower than reported in the literature, given that the system requires only an average of 3 C-Arm images per pedicle screw placement and verification.
AAPMT: AGI Assessment Through Prompt and Metric Transformer
The emergence of text-to-image models marks a significant milestone in the evolution of AI-generated images (AGIs), expanding their use in diverse domains like design, entertainment, and more. Despite these breakthroughs, the quality of AGIs often remains suboptimal, highlighting the need for effective evaluation methods. These methods are crucial for assessing the quality of images relative to their textual descriptions, and they must accurately mirror human perception. Substantial progress has been achieved in this domain, with innovative techniques such as BLIP and DBCNN contributing significantly. However, recent studies, including AGIQA-3K, reveal a notable discrepancy between current methods and state-of-the-art (SOTA) standards. This gap emphasizes the necessity for a more sophisticated and precise evaluation metric. In response, our objective is to develop a model that rates images on metrics such as perceptual quality, authenticity, and the correspondence between text and image, in a way that more closely aligns with human perception. In our paper, we introduce a range of effective methods, including prompt designs and the Metric Transformer. The Metric Transformer is a novel structure inspired by the complex interrelationships among various AGI quality metrics. The code is available at https://github.com/huskydoge/CS3324-Digital-Image-Processing/tree/main/Assignment1
From Lengthy to Lucid: A Systematic Literature Review on NLP Techniques for Taming Long Sentences
Passali, Tatiana, Chatzikyriakidis, Efstathios, Andreadis, Stelios, Stavropoulos, Thanos G., Matonaki, Anastasia, Fachantidis, Anestis, Tsoumakas, Grigorios
Long sentences have been a persistent issue in written communication for many years since they make it challenging for readers to grasp the main points or follow the initial intention of the writer. This survey, conducted using the PRISMA guidelines, systematically reviews two main strategies for addressing the issue of long sentences: a) sentence compression and b) sentence splitting. An increased trend of interest in this area has been observed since 2005, with significant growth after 2017. Current research is dominated by supervised approaches for both sentence compression and splitting. Yet, there is a considerable gap in weakly and self-supervised techniques, suggesting an opportunity for further research, especially in domains with limited data. In this survey, we categorize and group the most representative methods into a comprehensive taxonomy. We also conduct a comparative evaluation analysis of these methods on common sentence compression and splitting datasets. Finally, we discuss the challenges and limitations of current methods, providing valuable insights for future research directions. This survey is meant to serve as a comprehensive resource for addressing the complexities of long sentences. We aim to enable researchers to make further advancements in the field until long sentences are no longer a barrier to effective communication.
VariTex: Variational Neural Face Textures
Bühler, Marcel C., Meka, Abhimitra, Li, Gengyan, Beeler, Thabo, Hilliges, Otmar
Deep generative models have recently demonstrated the ability to synthesize photorealistic images of human faces with novel identities. A key challenge to the wide applicability of such techniques is to provide independent control over semantically meaningful parameters: appearance, head pose, face shape, and facial expressions. In this paper, we propose VariTex, to the best of our knowledge the first method that learns a variational latent feature space of neural face textures, which allows sampling of novel identities. We combine this generative model with a parametric face model and gain explicit control over head pose and facial expressions. To generate images of complete human heads, we propose an additive decoder that generates plausible additional details such as hair. A novel training scheme enforces a pose-independent latent space and, in consequence, allows learning of a one-to-many mapping between latent codes and pose-conditioned exterior regions. The resulting method can generate geometrically consistent images of novel identities, allowing fine-grained control over head pose, face shape, and facial expressions and facilitating a broad range of downstream tasks, like sampling novel identities, re-posing, expression transfer, and more.
6 Privacy Solutions for Big Data and Machine Learning
Travelers who wander the banana pancake trail through Southeast Asia will all get roughly the same experience. They'll eat crummy food on one of fifty boats floating around Ha Long Bay, then head up to the highlands of Sa Pa for a faux cultural experience with hill tribes that grow dreadful cannabis. After that, it's on to Laos to float the river in Vang Vieng while smashed on opium tea. Eventually, they'll see someone wearing a t-shirt with the classic slogan – "same same, but different." The phrase originates with Southeast Asian vendors, who often respond to queries about the authenticity of the fake goods they're selling with "same same, but different." It's a phrase that appropriately describes how the technology world loves to spin things as fresh and new when they've hardly changed at all.
Multifactorial Evolutionary Algorithm For Clustered Minimum Routing Cost Problem
Trung, Tran Ba, Binh, Huynh Thi Thanh, Thanh, Le Tien, Hieu, Ly Trung, Thanh, Pham Dinh
The Minimum Routing Cost Clustered Tree Problem (CluMRCT) arises in various fields, in both theory and application. Because CluMRCT is NP-hard, approximate approaches are suitable for solving it. Recently, the Multifactorial Evolutionary Algorithm (MFEA) has emerged as one of the most efficient approximation algorithms for many different kinds of problems. This paper therefore studies the application of MFEA to CluMRCT problems. In the proposed MFEA, we focus on crossover and mutation operators that create a valid CluMRCT solution in two levels: the first level constructs spanning trees for the graphs within clusters, while the second level builds a spanning tree connecting the clusters. To reduce resource consumption, we also introduce a new method of calculating the cost of a CluMRCT solution. The proposed algorithm is evaluated on numerous types of datasets. The experimental results demonstrate the effectiveness of the proposed algorithm, partially on large instances.
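For readers unfamiliar with the objective, the sketch below shows one common way to compute the routing cost of a spanning tree. It is not the cost-calculation method proposed in the paper, which the abstract does not describe. It assumes the routing cost is the sum of tree-path lengths over all unordered vertex pairs (some definitions count ordered pairs, which doubles the value), and the function name `routing_cost` and the edge-list input format are hypothetical. The idea is that each edge lies on the path between every pair of vertices it separates, so it contributes its weight times the product of the two component sizes.

```python
from collections import defaultdict


def routing_cost(n, edges):
    """Routing cost of a spanning tree on vertices 0..n-1.

    Assumption: cost = sum of tree-path lengths over all unordered
    vertex pairs. `edges` is a list of (u, v, weight) tuples.
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    # Iterative DFS from vertex 0; every vertex is recorded after its parent.
    parent = {0: (None, 0)}
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, w in adj[u]:
            if v not in parent:
                parent[v] = (u, w)
                stack.append(v)

    if len(order) != n or len(edges) != n - 1:
        raise ValueError("edges do not form a spanning tree on n vertices")

    # Process children before parents to accumulate subtree sizes.
    size = [1] * n
    total = 0
    for u in reversed(order):
        p, w = parent[u]
        if p is not None:
            # The edge (p, u) separates size[u] vertices from the other n - size[u].
            total += w * size[u] * (n - size[u])
            size[p] += size[u]
    return total


if __name__ == "__main__":
    # Path 0 -(1)- 1 -(2)- 2: pair distances are 1, 2, and 3.
    print(routing_cost(3, [(0, 1, 1), (1, 2, 2)]))
```

This runs in time linear in the number of vertices, whereas summing distances pair by pair would be quadratic or worse.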
Using Frame Semantics for Knowledge Extraction from Twitter
Søgaard, Anders (University of Copenhagen) | Plank, Barbara (University of Copenhagen) | Alonso, Hector Martinez (University of Copenhagen)
Knowledge bases have the potential to advance artificial intelligence, but often suffer from recall problems, i.e., lack of knowledge of new entities and relations. In contrast, social media such as Twitter provide an abundance of data in a timely manner: information spreads at an incredible pace and is posted long before it makes it into more commonly used resources for knowledge extraction. In this paper we address the question of whether we can exploit social media to extract new facts, which may at first seem like finding needles in haystacks. We collect tweets about 60 entities in Freebase and compare four methods to extract binary relation candidates, based on syntactic and semantic parsing and a simple mechanism for factuality scoring. The extracted facts are manually evaluated in terms of their correctness and relevance for search. We show that moving from bottom-up syntactic or semantic dependency parsing formalisms to top-down frame-semantic processing improves the robustness of knowledge extraction, producing more intelligible fact candidates of better quality. In order to evaluate the quality of frame semantic parsing on Twitter intrinsically, we make a multiply frame-annotated dataset of tweets publicly available.