Goto

Collaborating Authors

 Overview


V-RECS, a Low-Cost LLM4VIS Recommender with Explanations, Captioning and Suggestions

arXiv.org Artificial Intelligence

NL2VIS (natural language to visualization) is a promising and recent research area that involves interpreting natural language queries and translating them into visualizations that accurately represent the underlying data. As we navigate the era of big data, NL2VIS holds considerable application potential since it greatly facilitates data exploration by non-expert users. Following the increasingly widespread usage of generative AI in NL2VIS applications, in this paper we present V-RECS, the first LLM-based Visual Recommender augmented with explanations(E), captioning(C), and suggestions(S) for further data exploration. V-RECS' visualization narratives facilitate both response verification and data exploration by non-expert users. Furthermore, our proposed solution mitigates computational, controllability, and cost issues associated with using powerful LLMs by leveraging a methodology to effectively fine-tune small models. To generate insightful visualization narratives, we use Chain-of-Thoughts (CoT), a prompt engineering technique to help LLM identify and generate the logical steps to produce a correct answer. Since CoT is reported to perform poorly with small LLMs, we adopted a strategy in which a large LLM (GPT-4), acting as a Teacher, generates CoT-based instructions to fine-tune a small model, Llama-2-7B, which plays the role of a Student. Extensive experiments-based on a framework for the quantitative evaluation of AI-based visualizations and on manual assessment by a group of participants-show that V-RECS achieves performance scores comparable to GPT-4, at a much lower cost. The efficacy of the V-RECS teacher-student paradigm is also demonstrated by the fact that the un-tuned Llama fails to perform the task in the vast majority of test cases. We release V-RECS for the visualization community to assist visualization designers throughout the entire visualization generation process.


A Survey on Intelligent Internet of Things: Applications, Security, Privacy, and Future Directions

arXiv.org Artificial Intelligence

The rapid advances in the Internet of Things (IoT) have promoted a revolution in communication technology and offered various customer services. Artificial intelligence (AI) techniques have been exploited to facilitate IoT operations and maximize their potential in modern application scenarios. In particular, the convergence of IoT and AI has led to a new networking paradigm called Intelligent IoT (IIoT), which has the potential to significantly transform businesses and industrial domains. This paper presents a comprehensive survey of IIoT by investigating its significant applications in mobile networks, as well as its associated security and privacy issues. Specifically, we explore and discuss the roles of IIoT in a wide range of key application domains, from smart healthcare and smart cities to smart transportation and smart industries. Through such extensive discussions, we investigate important security issues in IIoT networks, where network attacks, confidentiality, integrity, and intrusion are analyzed, along with a discussion of potential countermeasures. Privacy issues in IIoT networks were also surveyed and discussed, including data, location, and model privacy leakage. Finally, we outline several key challenges and highlight potential research directions in this important area.


Online detection and infographic explanation of spam reviews with data drift adaptation

arXiv.org Artificial Intelligence

Spam reviews are a pervasive problem on online platforms due to its significant impact on reputation. However, research into spam detection in data streams is scarce. Another concern lies in their need for transparency. Consequently, this paper addresses those problems by proposing an online solution for identifying and explaining spam reviews, incorporating data drift adaptation. It integrates (i) incremental profiling, (ii) data drift detection & adaptation, and (iii) identification of spam reviews employing Machine Learning. The explainable mechanism displays a visual and textual prediction explanation in a dashboard. The best results obtained reached up to 87 % spam F-measure. Key words: Data drift, interpretability and explainability, Natural Language Processing, online Machine Learning, spam detection.


MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens

arXiv.org Artificial Intelligence

Numerous advanced Large Language Models (LLMs) now support context lengths up to 128K, and some extend to 200K. Some benchmarks in the generic domain have also followed up on evaluating long-context capabilities. In the medical domain, tasks are distinctive due to the unique contexts and need for domain expertise, necessitating further evaluation. However, despite the frequent presence of long texts in medical scenarios, evaluation benchmarks of long-context capabilities for LLMs in this field are still rare. In this paper, we propose MedOdyssey, the first medical long-context benchmark with seven length levels ranging from 4K to 200K tokens. MedOdyssey consists of two primary components: the medical-context "needles in a haystack" task and a series of tasks specific to medical applications, together comprising 10 datasets. The first component includes challenges such as counter-intuitive reasoning and novel (unknown) facts injection to mitigate knowledge leakage and data contamination of LLMs. The second component confronts the challenge of requiring professional medical expertise. Especially, we design the ``Maximum Identical Context'' principle to improve fairness by guaranteeing that different LLMs observe as many identical contexts as possible. Our experiment evaluates advanced proprietary and open-source LLMs tailored for processing long contexts and presents detailed performance analyses. This highlights that LLMs still face challenges and need for further research in this area. Our code and data are released in the repository: \url{https://github.com/JOHNNY-fans/MedOdyssey.}


Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild

arXiv.org Artificial Intelligence

Virtual Try-On (VTON) is a highly active line of research, with increasing demand. It aims to replace a piece of garment in an image with one from another, while preserving person and garment characteristics as well as image fidelity. Current literature takes a supervised approach for the task, impairing generalization and imposing heavy computation. In this paper, we present a novel zero-shot training-free method for inpainting a clothing garment by reference. Our approach employs the prior of a diffusion model with no additional training, fully leveraging its native generalization capabilities. The method employs extended attention to transfer image information from reference to target images, overcoming two significant challenges. We first initially warp the reference garment over the target human using deep features, alleviating "texture sticking". We then leverage the extended attention mechanism with careful masking, eliminating leakage of reference background and unwanted influence. Through a user study, qualitative, and quantitative comparison to state-of-the-art approaches, we demonstrate superior image quality and garment preservation compared unseen clothing pieces or human figures.


Detecting AI-Generated Text: Factors Influencing Detectability with Current Methods

arXiv.org Artificial Intelligence

Large language models (LLMs) have advanced to a point that even humans have difficulty discerning whether a text was generated by another human, or by a computer. However, knowing whether a text was produced by human or artificial intelligence (AI) is important to determining its trustworthiness, and has applications in many domains including detecting fraud and academic dishonesty, as well as combating the spread of misinformation and political propaganda. The task of AI-generated text (AIGT) detection is therefore both very challenging, and highly critical. In this survey, we summarize state-of-the art approaches to AIGT detection, including watermarking, statistical and stylistic analysis, and machine learning classification. We also provide information about existing datasets for this task. Synthesizing the research findings, we aim to provide insight into the salient factors that combine to determine how "detectable" AIGT text is under different scenarios, and to make practical recommendations for future work towards this significant technical and societal challenge.


A survey on fairness of large language models in e-commerce: progress, application, and challenge

arXiv.org Artificial Intelligence

This survey explores the fairness of large language models (LLMs) in e-commerce, examining their progress, applications, and the challenges they face. LLMs have become pivotal in the e-commerce domain, offering innovative solutions and enhancing customer experiences. This work presents a comprehensive survey on the applications and challenges of LLMs in e-commerce. The paper begins by introducing the key principles underlying the use of LLMs in e-commerce, detailing the processes of pretraining, fine-tuning, and prompting that tailor these models to specific needs. It then explores the varied applications of LLMs in e-commerce, including product reviews, where they synthesize and analyze customer feedback; product recommendations, where they leverage consumer data to suggest relevant items; product information translation, enhancing global accessibility; and product question and answer sections, where they automate customer support. The paper critically addresses the fairness challenges in e-commerce, highlighting how biases in training data and algorithms can lead to unfair outcomes, such as reinforcing stereotypes or discriminating against certain groups. These issues not only undermine consumer trust, but also raise ethical and legal concerns. Finally, the work outlines future research directions, emphasizing the need for more equitable and transparent LLMs in e-commerce. It advocates for ongoing efforts to mitigate biases and improve the fairness of these systems, ensuring they serve diverse global markets effectively and ethically. Through this comprehensive analysis, the survey provides a holistic view of the current landscape of LLMs in e-commerce, offering insights into their potential and limitations, and guiding future endeavors in creating fairer and more inclusive e-commerce environments.


Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey

arXiv.org Artificial Intelligence

Recent breakthroughs in large language modeling have facilitated rigorous exploration of their application in diverse tasks related to tabular data modeling, such as prediction, tabular data synthesis, question answering, and table understanding. Each task presents unique challenges and opportunities. However, there is currently a lack of comprehensive review that summarizes and compares the key techniques, metrics, datasets, models, and optimization approaches in this research domain. This survey aims to address this gap by consolidating recent progress in these areas, offering a thorough survey and taxonomy of the datasets, metrics, and methodologies utilized. It identifies strengths, limitations, unexplored territories, and gaps in the existing literature, while providing some insights for future research directions in this vital and rapidly evolving field. It also provides relevant code and datasets references. Through this comprehensive review, we hope to provide interested readers with pertinent references and insightful perspectives, empowering them with the necessary tools and knowledge to effectively navigate and address the prevailing challenges in the field.


Graph Neural Networks in Histopathology: Emerging Trends and Future Directions

arXiv.org Artificial Intelligence

Histopathological analysis of Whole Slide Images (WSIs) has seen a surge in the utilization of deep learning methods, particularly Convolutional Neural Networks (CNNs). However, CNNs often fall short in capturing the intricate spatial dependencies inherent in WSIs. Graph Neural Networks (GNNs) present a promising alternative, adept at directly modeling pairwise interactions and effectively discerning the topological tissue and cellular structures within WSIs. Recognizing the pressing need for deep learning techniques that harness the topological structure of WSIs, the application of GNNs in histopathology has experienced rapid growth. In this comprehensive review, we survey GNNs in histopathology, discuss their applications, and explore emerging trends that pave the way for future advancements in the field. We begin by elucidating the fundamentals of GNNs and their potential applications in histopathology. Leveraging quantitative literature analysis, we identify four emerging trends: Hierarchical GNNs, Adaptive Graph Structure Learning, Multimodal GNNs, and Higher-order GNNs. Through an in-depth exploration of these trends, we offer insights into the evolving landscape of GNNs in histopathological analysis. Based on our findings, we propose future directions to propel the field forward. Our analysis serves to guide researchers and practitioners towards innovative approaches and methodologies, fostering advancements in histopathological analysis through the lens of graph neural networks.


Underwater Human-Robot and Human-Swarm Interaction: A Review and Perspective

arXiv.org Artificial Intelligence

There has been a growing interest in extending the capabilities of autonomous underwater vehicles (AUVs) in subsea missions, particularly in integrating underwater human-robot interaction (UHRI) for control. UHRI and its subfield,underwater gesture recognition (UGR), play a significant role in enhancing diver-robot communication for marine research. This review explores the latest developments in UHRI and examines its promising applications for multi-robot systems. With the developments in UGR, opportunities are presented for underwater robots to work alongside human divers to increase their functionality. Human gestures creates a seamless and safe collaborative environment where divers and robots can interact more efficiently. By highlighting the state-of-the-art in this field, we can potentially encourage advancements in underwater multi-robot system (UMRS) blending the natural communication channels of human-robot interaction with the multi-faceted coordination capabilities of underwater swarms,thus enhancing robustness in complex aquatic environments.