AITopics | Huang, Jiaqi

Collaborating Authors

Huang, Jiaqi

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

MetaFold: Language-Guided Multi-Category Garment Folding Framework via Trajectory Generation and Foundation Model

Chen, Haonan, Li, Junxiao, Wu, Ruihai, Liu, Yiwei, Hou, Yiwen, Xu, Zhixuan, Guo, Jingxiang, Gao, Chongkai, Wei, Zhenyu, Xu, Shensi, Huang, Jiaqi, Shao, Lin

arXiv.org Artificial IntelligenceMar-11-2025

Garment folding is a common yet challenging task in robotic manipulation. The deformability of garments leads to a vast state space and complex dynamics, which complicates precise and fine-grained manipulation. Previous approaches often rely on predefined key points or demonstrations, limiting their generalization across diverse garment categories. This paper presents a framework, MetaFold, that disentangles task planning from action prediction, learning each independently to enhance model generalization. It employs language-guided point cloud trajectory generation for task planning and a low-level foundation model for action prediction. This structure facilitates multi-category learning, enabling the model to adapt flexibly to various user instructions and folding tasks. Experimental results demonstrate the superiority of our proposed framework. Supplementary materials are available on our website: https://meta-fold.github.io/.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2503.08372

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Detecting Backdoor Attacks via Similarity in Semantic Communication Systems

Wei, Ziyang, Jiang, Yili, Huang, Jiaqi, Zhong, Fangtian, Gyawali, Sohan

arXiv.org Artificial IntelligenceFeb-5-2025

Semantic communication systems, which leverage Generative AI (GAI) to transmit semantic meaning rather than raw data, are poised to revolutionize modern communications. However, they are vulnerable to backdoor attacks, a type of poisoning manipulation that embeds malicious triggers into training datasets. As a result, Backdoor attacks mislead the inference for poisoned samples while clean samples remain unaffected. The existing defenses may alter the model structure (such as neuron pruning that potentially degrades inference performance on clean inputs, or impose strict requirements on data formats (such as ``Semantic Shield" that requires image-text pairs). To address these limitations, this work proposes a defense mechanism that leverages semantic similarity to detect backdoor attacks without modifying the model structure or imposing data format constraints. By analyzing deviations in semantic feature space and establishing a threshold-based detection framework, the proposed approach effectively identifies poisoned samples. The experimental results demonstrate high detection accuracy and recall across varying poisoning ratios, underlining the significant effectiveness of our proposed solution.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.03721

Country:

North America > United States > Montana (0.14)
North America > United States > North Carolina (0.14)
North America > United States > Missouri (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

IDEA: Image Description Enhanced CLIP-Adapter

Ye, Zhipeng, Jiang, Feng, Wang, Qiufeng, Huang, Kaizhu, Huang, Jiaqi

arXiv.org Artificial IntelligenceJan-15-2025

CLIP (Contrastive Language-Image Pre-training) has attained great success in pattern recognition and computer vision. Transferring CLIP to downstream tasks (e.g. zero- or few-shot classification) is a hot topic in multimodal learning. However, current studies primarily focus on either prompt learning for text or adapter tuning for vision, without fully exploiting the complementary information and correlations among image-text pairs. In this paper, we propose an Image Description Enhanced CLIP-Adapter (IDEA) method to adapt CLIP to few-shot image classification tasks. This method captures fine-grained features by leveraging both visual features and textual descriptions of images. IDEA is a training-free method for CLIP, and it can be comparable to or even exceeds state-of-the-art models on multiple tasks. Furthermore, we introduce Trainable-IDEA (T-IDEA), which extends IDEA by adding two lightweight learnable components (i.e., a projector and a learnable latent space), further enhancing the model's performance and achieving SOTA results on 11 datasets. As one important contribution, we employ the Llama model and design a comprehensive pipeline to generate textual descriptions for images of 11 datasets, resulting in a total of 1,637,795 image-text pairs, named "IMD-11". Our code and data are released at https://github.com/FourierAI/IDEA.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2501.08816

Country:

Asia > China (0.28)
Asia > Middle East > Israel (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

DSDE: Using Proportion Estimation to Improve Model Selection for Out-of-Distribution Detection

Geng, Jingyao, Zhang, Yuan, Huang, Jiaqi, Xue, Feng, Tan, Falong, Xie, Chuanlong, Zhang, Shumei

arXiv.org Machine LearningNov-3-2024

Model library is an effective tool for improving the performance of single-model Out-of-Distribution (OoD) detector, mainly through model selection and detector fusion. However, existing methods in the literature do not provide uncertainty quantification for model selection results. Additionally, the model ensemble process primarily focuses on controlling the True Positive Rate (TPR) while neglecting the False Positive Rate (FPR). In this paper, we emphasize the significance of the proportion of models in the library that identify the test sample as an OoD sample. This proportion holds crucial information and directly influences the error rate of OoD detection.To address this, we propose inverting the commonly-used sequential p-value strategies. We define the rejection region initially and then estimate the error rate. Furthermore, we introduce a novel perspective from change-point detection and propose an approach for proportion estimation with automatic hyperparameter selection. We name the proposed approach as DOS-Storey-based Detector Ensemble (DSDE). Experimental results on CIFAR10 and CIFAR100 demonstrate the effectiveness of our approach in tackling OoD detection challenges. Specifically, the CIFAR10 experiments show that DSDE reduces the FPR from 11.07% to 3.31% compared to the top-performing single-model detector.

artificial intelligence, detector, machine learning, (14 more...)

arXiv.org Machine Learning

2411.01487

Country: Asia > China (0.28)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.38)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Subtoxic Questions: Dive Into Attitude Change of LLM's Response in Jailbreak Attempts

Zhang, Tianyu, Zhao, Zixuan, Huang, Jiaqi, Hua, Jingyu, Zhong, Sheng

arXiv.org Artificial IntelligenceApr-12-2024

As Large Language Models (LLMs) of Prompt Jailbreaking are getting more and more attention, it is of great significance to raise a generalized research paradigm to evaluate attack strengths and a basic model to conduct subtler experiments. In this paper, we propose a novel approach by focusing on a set of target questions that are inherently more sensitive to jailbreak prompts, aiming to circumvent the limitations posed by enhanced LLM security. Through designing and analyzing these sensitive questions, this paper reveals a more effective method of identifying vulnerabilities in LLMs, thereby contributing to the advancement of LLM security. This research not only challenges existing jailbreaking methodologies but also fortifies LLMs against potential exploits.

large language model, machine learning, subtoxic question, (18 more...)

arXiv.org Artificial Intelligence

2404.08309

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback