Zeng, Xinyi
YanTian: An Application Platform for AI Global Weather Forecasting Models
Cheng, Wencong, Xia, Jiangjiang, Qu, Chang, Wang, Zhigang, Zeng, Xinyi, Huang, Fang, Li, Tianye
To promote the practical application of AI Global Weather Forecasting Models (AIGWFMs), we have developed an adaptable application platform named 'YanTian'. The platform augments existing open-source AIGWFMs with a suite of capability-enhancing modules and is built on a loosely coupled plug-in architecture. The goal of 'YanTian' is to address the limitations of current open-source AIGWFMs in operational application, including improving local forecast accuracy, providing high-spatial-resolution forecasts, increasing the density of forecast intervals, and generating diverse products with AIGC capabilities. 'YanTian' also provides a simple, visualized user interface, allowing meteorologists to access both the basic and extended capabilities of the platform simply by configuring the UI, without requiring specialized artificial intelligence knowledge or coding skills. Additionally, 'YanTian' can be deployed on a PC with GPUs. We hope 'YanTian' will facilitate the widespread operational adoption of AIGWFMs.
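To make the "loosely coupled plug-in architecture" concrete, the following is a minimal sketch of how capability-enhancing modules could be registered and invoked behind a configuration-driven UI. The abstract does not publish YanTian's actual API, so every class, function, and module name here is an assumption for illustration only.

```python
# Hypothetical plug-in registry: each capability module (e.g., local bias
# correction, downscaling, product generation) registers itself under a name
# and is invoked only if the user enables it in the platform configuration.
from typing import Callable, Dict


class PluginRegistry:
    """Maps module names to callables that post-process an AIGWFM forecast."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Callable] = {}

    def register(self, name: str):
        def wrapper(fn: Callable) -> Callable:
            self._plugins[name] = fn
            return fn
        return wrapper

    def run(self, name: str, forecast):
        return self._plugins[name](forecast)


registry = PluginRegistry()


@registry.register("bias_correction")
def correct_local_bias(forecast):
    # Placeholder: a real module would adjust the model's gridded output
    # against local observations before products are generated.
    return forecast


# Driven by a (hypothetical) UI configuration listing the enabled modules.
enabled_modules = ["bias_correction"]
forecast = {"t2m": [...]}  # stand-in for a gridded forecast field
for module in enabled_modules:
    forecast = registry.run(module, forecast)
```

Because modules interact only through the registry, new capabilities can be added or removed without touching the underlying forecasting model, which is the kind of decoupling the abstract describes.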
Root Defence Strategies: Ensuring Safety of LLM at the Decoding Level
Zeng, Xinyi, Shang, Yuying, Zhu, Yutao, Chen, Jiawei, Tian, Yu
Large language models (LLMs) have demonstrated immense utility across various industries. However, as LLMs advance, the risk of harmful outputs increases due to incorrect or malicious instruction prompts. While current methods effectively address jailbreak risks, they share a common limitation: judging harmful responses at the prefill level fails to utilize the model's decoding outputs, leading to relatively lower effectiveness and robustness. This paper examines LLMs' capability to recognize harmful outputs, revealing and quantifying their proficiency in assessing the danger of previously generated tokens. Our novel decoder-oriented, step-by-step defense architecture corrects harmful queries directly rather than rejecting them outright, and we introduce speculative decoding to boost secure decoding speed, enhancing usability and facilitating deployment. Extensive experiments demonstrate that our approach improves model security without compromising reasoning speed. Notably, our method leverages the model's ability to discern hazardous information, maintaining its helpfulness compared to existing methods.

In recent years, significant progress has been made in developing large language models (LLMs). Meanwhile, the safety of LLMs has attracted significant attention from the research community and industry (Weidinger et al., 2021; Achiam et al., 2023; Wu et al., 2023b). One of the primary safety concerns is jailbreaking, where malicious actors or errant inputs prompt LLMs to produce harmful or inappropriate content, effectively bypassing ethical guidelines.
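As a rough illustration of the decode-level idea described above, the sketch below scores the danger of the tokens generated so far at each decoding step and corrects the trajectory instead of refusing the whole query. This is not the authors' implementation; the scorer, threshold, and function names are assumptions made only to show the control flow.

```python
# Minimal sketch of a decoder-level, step-by-step safety check (assumed names).
import torch


def harm_score(token_ids: torch.Tensor) -> float:
    """Stand-in for a classifier that rates the danger of the prefix in [0, 1]."""
    return float(torch.rand(()))  # placeholder score, not a real safety model


def guarded_decode(step_fn, max_steps: int = 32, threshold: float = 0.8):
    generated = torch.empty(0, dtype=torch.long)
    for _ in range(max_steps):
        next_tokens = step_fn(generated)            # propose the next chunk of tokens
        candidate = torch.cat([generated, next_tokens])
        if harm_score(candidate) > threshold:
            # Correct the trajectory (e.g., re-sample a safer continuation)
            # rather than rejecting the user's request outright.
            continue
        generated = candidate
    return generated


# Dummy usage: a stand-in step function that emits 4 random token ids per step.
out = guarded_decode(lambda prefix: torch.randint(0, 100, (4,)))
```

In the paper's setting the per-step check runs alongside speculative decoding so that verifying and correcting candidate tokens does not slow generation; the sketch omits that machinery.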
From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models
Shang, Yuying, Zeng, Xinyi, Zhu, Yutao, Yang, Xiao, Fang, Zhengwei, Zhang, Jingyuan, Chen, Jiawei, Liu, Zinan, Tian, Yu
Hallucinations in large vision-language models (LVLMs), i.e., generating objects that are not present in the visual input, are a significant challenge that impairs their reliability. Recent studies often attribute hallucinations to a lack of understanding of visual input, yet ignore a more fundamental issue: the model's inability to effectively extract or decouple visual features. In this paper, we revisit hallucinations in LVLMs from an architectural perspective, investigating whether the primary cause lies in the visual encoder (feature extraction) or the modal alignment module (feature decoupling). Motivated by the findings of our preliminary investigation, we propose a novel tuning strategy, PATCH, to mitigate hallucinations in LVLMs. This plug-and-play method can be integrated into various LVLMs, utilizing adaptive virtual tokens to extract object features from bounding boxes, thereby addressing hallucinations caused by insufficient decoupling of visual features. PATCH achieves state-of-the-art performance on multiple multi-modal hallucination datasets. We hope this approach provides researchers with deeper insights into the underlying causes of hallucinations in LVLMs, fostering further advancements and innovation in this field.

Large vision-language models (LVLMs) have demonstrated remarkable performance across a broad range of tasks, even surpassing human capabilities in specific scenarios (Xu et al., 2023; Li et al., 2023a; Zhang et al., 2024a). However, their practical applications are hindered by multi-modal hallucinations, where models generate factually incorrect, inconsistent, or entirely fictitious outputs when interpreting visual features.
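The sketch below shows one plausible reading of the "adaptive virtual token" idea: features pooled from detected bounding boxes are projected into a few extra tokens and prepended to the visual sequence fed to the language model. The module name, dimensions, and token count are assumptions; this is not the released PATCH code.

```python
# Hedged sketch of injecting bounding-box-derived virtual tokens (assumed shapes).
import torch
import torch.nn as nn


class VirtualTokenAdapter(nn.Module):
    def __init__(self, region_dim: int = 1024, llm_dim: int = 4096, n_tokens: int = 4):
        super().__init__()
        self.proj = nn.Linear(region_dim, llm_dim * n_tokens)
        self.n_tokens = n_tokens
        self.llm_dim = llm_dim

    def forward(self, region_feats: torch.Tensor, visual_tokens: torch.Tensor):
        # region_feats: (batch, region_dim), pooled from object bounding boxes
        # visual_tokens: (batch, seq_len, llm_dim), output of the alignment module
        virtual = self.proj(region_feats).view(-1, self.n_tokens, self.llm_dim)
        # Prepend the virtual tokens so the LLM can attend to explicit object cues.
        return torch.cat([virtual, visual_tokens], dim=1)


adapter = VirtualTokenAdapter()
fused = adapter(torch.randn(2, 1024), torch.randn(2, 576, 4096))
print(fused.shape)  # torch.Size([2, 580, 4096])
```

Because only the small adapter is trained, such a module can be bolted onto different LVLMs without retraining the visual encoder or the backbone, which matches the plug-and-play claim in the abstract.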
Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition
Yang, Yuxiang, Wen, Lu, Zeng, Xinyi, Xu, Yuanyuan, Wu, Xi, Zhou, Jiliu, Wang, Yan
Facial Expression Recognition (FER) holds significant importance in human-computer interaction. Existing cross-domain FER methods often transfer knowledge solely from a single labeled source domain to an unlabeled target domain, neglecting the comprehensive information available across multiple sources. However, cross-multidomain FER (CMFER) is very challenging due to (i) the inherent inter-domain shifts across multiple domains and (ii) the intra-domain shifts stemming from ambiguous expressions and low inter-class distinctions. In this paper, we propose a novel Learning with Alignments CMFER framework, named LA-CMFER, to handle both inter- and intra-domain shifts. Specifically, LA-CMFER is constructed with a global branch and a local branch to extract features from the full images and the local subtle expressions, respectively. On this basis, LA-CMFER presents a dual-level inter-domain alignment method that forces the model to prioritize hard-to-align samples in knowledge transfer at the sample level, while gradually building a well-clustered feature space under the guidance of class attributes at the cluster level, thus narrowing the inter-domain shifts. To address the intra-domain shifts, LA-CMFER introduces a multi-view intra-domain alignment method with a multi-view clustering consistency constraint, in which a prediction similarity matrix is built to pursue consistency between the global and local views, thus refining pseudo-labels and eliminating latent noise. Extensive experiments on six benchmark datasets validate the superiority of LA-CMFER.
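The following is a minimal sketch, under our own assumptions, of the multi-view consistency constraint described above: prediction-similarity matrices are built from the global- and local-branch class probabilities and their disagreement is penalized. It is illustrative only and not the authors' released code; the loss form and normalization are assumptions.

```python
# Hedged sketch of a prediction-similarity consistency loss between two views.
import torch
import torch.nn.functional as F


def similarity_matrix(probs: torch.Tensor) -> torch.Tensor:
    # probs: (batch, num_classes) softmax outputs of one branch
    probs = F.normalize(probs, dim=1)
    return probs @ probs.t()          # (batch, batch) pairwise prediction similarity


def multiview_consistency_loss(global_logits: torch.Tensor,
                               local_logits: torch.Tensor) -> torch.Tensor:
    sim_global = similarity_matrix(global_logits.softmax(dim=1))
    sim_local = similarity_matrix(local_logits.softmax(dim=1))
    # Penalize disagreement between the global and local similarity structures.
    return F.mse_loss(sim_global, sim_local)


# Dummy usage with a batch of 8 samples and 7 expression classes.
loss = multiview_consistency_loss(torch.randn(8, 7), torch.randn(8, 7))
```

Enforcing agreement at the level of pairwise similarities, rather than on individual predictions, lets the two views regularize each other's clustering structure, which is how the abstract motivates refining pseudo-labels and suppressing latent noise.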