AITopics | Zhang, Weiwei

Collaborating Authors

Zhang, Weiwei

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Is AI Robust Enough for Scientific Research?

Zhang, Jun-Jie, Song, Jiahao, Wang, Xiu-Cheng, Li, Fu-Peng, Liu, Zehan, Chen, Jian-Nan, Dang, Haoning, Wang, Shiyao, Zhang, Yiyan, Xu, Jianhui, Shi, Chunxiang, Wang, Fei, Pang, Long-Gang, Cheng, Nan, Zhang, Weiwei, Zhang, Duo, Meng, Deyu

arXiv.org Artificial IntelligenceDec-18-2024

Artificial Intelligence (AI) has become a transformative tool in scientific research, driving breakthroughs across numerous disciplines [5-11]. Despite these achievements, neural networks, which form the backbone of many AI systems, exhibit significant vulnerabilities. One of the most concerning issues is their susceptibility to adversarial attacks [1, 2, 12, 13]. These attacks involve making small, often imperceptible changes to the input data, causing AI systems to make incorrect predictions (Figure 1), highlighting a critical weakness: AI systems can fail under minimal perturbations - a phenomenon completely unseen in classical methods. The impact of adversarial attacks has been extensively studied in the context of image classification [14-16].

artificial intelligence, machine learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

2412.16234

Country: Asia > China (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.87)
Energy > Oil & Gas (0.68)
Government > Military (0.55)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Vision (0.87)

Add feedback

LLM-R: A Framework for Domain-Adaptive Maintenance Scheme Generation Combining Hierarchical Agents and RAG

Tao, Laifa, Huang, Qixuan, Wu, Xianjun, Zhang, Weiwei, Wu, Yunlong, Li, Bin, Lu, Chen, Hai, Xingshuo

arXiv.org Artificial IntelligenceNov-7-2024

The increasing use of smart devices has emphasized the critical role of maintenance in production activities. Interactive Electronic Technical Manuals (IETMs) are vital tools that support the maintenance of smart equipment. However, traditional IETMs face challenges such as transitioning from Graphical User Interfaces (GUIs) to natural Language User Interfaces (LUIs) and managing complex logical relationships. Additionally, they must meet the current demands for higher intelligence. This paper proposes a Maintenance Scheme Generation Method based on Large Language Models (LLM-R). The proposed method includes several key innovations: We propose the Low Rank Adaptation-Knowledge Retention (LORA-KR) loss technology to proportionally adjust mixed maintenance data for fine-tuning the LLM. This method prevents knowledge conflicts caused by mixed data, improving the model's adaptability and reasoning ability in specific maintenance domains, Besides, Hierarchical Task-Based Agent and Instruction-level Retrieval-Augmented Generation (RAG) technologies are adopted to optimize the generation steps and mitigate the phenomenon of hallucination caused by the model's Inability to access contextual information. This enhancement improves the model's flexibility and accuracy in handling known or unknown maintenance objects and maintenance scheme scenarios. To validate the proposed method's effectiveness in maintenance tasks, a maintenance scheme dataset was constructed using objects from different fields. The experimental results show that the accuracy of the maintenance schemes generated by the proposed method reached 91.59%, indicating which improvement enhances the intelligence of maintenance schemes and introduces novel technical approaches for equipment maintenance.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2411.04476

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Air (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

An Outline of Prognostics and Health Management Large Model: Concepts, Paradigms, and Challenges

Tao, Laifa, Li, Shangyu, Liu, Haifei, Huang, Qixuan, Ma, Liang, Ning, Guoao, Chen, Yiling, Wu, Yunlong, Li, Bin, Zhang, Weiwei, Zhao, Zhengduo, Zhan, Wenchao, Cao, Wenyan, Wang, Chao, Liu, Hongmei, Ma, Jian, Suo, Mingliang, Cheng, Yujie, Ding, Yu, Song, Dengwei, Lu, Chen

arXiv.org Artificial IntelligenceJul-1-2024

Prognosis and Health Management (PHM), critical for ensuring task completion by complex systems and preventing unexpected failures, is widely adopted in aerospace, manufacturing, maritime, rail, energy, etc. However, PHM's development is constrained by bottlenecks like generalization, interpretation and verification abilities. Presently, generative artificial intelligence (AI), represented by Large Model, heralds a technological revolution with the potential to fundamentally reshape traditional technological fields and human production methods. Its capabilities, including strong generalization, reasoning, and generative attributes, present opportunities to address PHM's bottlenecks. To this end, based on a systematic analysis of the current challenges and bottlenecks in PHM, as well as the research status and advantages of Large Model, we propose a novel concept and three progressive paradigms of Prognosis and Health Management Large Model (PHM-LM) through the integration of the Large Model with PHM. Subsequently, we provide feasible technical approaches for PHM-LM to bolster PHM's core capabilities within the framework of the three paradigms. Moreover, to address core issues confronting PHM, we discuss a series of technical challenges of PHM-LM throughout the entire process of construction and application. This comprehensive effort offers a holistic PHM-LM technical framework, and provides avenues for new PHM technologies, methodologies, tools, platforms and applications, which also potentially innovates design, research & development, verification and application mode of PHM. And furthermore, a new generation of PHM with AI will also capably be realized, i.e., from custom to generalized, from discriminative to generative, and from theoretical conditions to practical applications.

knowledge management, large language model, machine learning, (23 more...)

arXiv.org Artificial Intelligence

2407.03374

Country: Asia > China (0.67)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.65)

Industry:

Health & Medicine > Consumer Health (1.00)
Aerospace & Defense (1.00)
Energy > Oil & Gas > Upstream (0.67)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
(8 more...)

Add feedback

DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning

Feng, Yuxi, Li, Raymond, Fan, Zhenan, Carenini, Giuseppe, Pourreza, Mohammadreza, Zhang, Weiwei, Zhang, Yong

arXiv.org Artificial IntelligenceJun-12-2024

While in-context Learning (ICL) has proven to be an effective technique to improve the performance of Large Language Models (LLMs) in a variety of complex tasks, notably in translating natural language questions into Structured Query Language (NL2SQL), the question of how to select the most beneficial demonstration examples remains an open research problem. While prior works often adapted off-the-shelf encoders to retrieve examples dynamically, an inherent discrepancy exists in the representational capacities between the external retrievers and the LLMs. Further, optimizing the selection of examples is a non-trivial task, since there are no straightforward methods to assess the relative benefits of examples without performing pairwise inference. To address these shortcomings, we propose DeTriever, a novel demonstration retrieval framework that learns a weighted combination of LLM hidden states, where rich semantic information is encoded. To train the model, we propose a proxy score that estimates the relative benefits of examples based on the similarities between output queries. Experiments on two popular NL2SQL benchmarks demonstrate that our method significantly outperforms the state-of-the-art baselines on one-shot NL2SQL tasks.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2406.07913

Country:

Asia (0.93)
North America > Canada > Alberta (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Deploying Graph Neural Networks in Wireless Networks: A Link Stability Viewpoint

Li, Jun, Zhang, Weiwei, Wei, Kang, Chen, Guangji, Shi, Long, Chen, Wen

arXiv.org Artificial IntelligenceMay-9-2024

As an emerging artificial intelligence technology, graph neural networks (GNNs) have exhibited promising performance across a wide range of graph-related applications. However, information exchanges among neighbor nodes in GNN pose new challenges in the resource-constrained scenario, especially in wireless systems. In practical wireless systems, the communication links among nodes are usually unreliable due to wireless fading and receiver noise, consequently resulting in performance degradation of GNNs. To improve the learning performance of GNNs, we aim to maximize the number of long-term average (LTA) communication links by the optimized power control under energy consumption constraints. Using the Lyapunov optimization method, we first transform the intractable long-term problem into a deterministic problem in each time slot by converting the long-term energy constraints into the objective function. In spite of this non-convex combinatorial optimization problem, we address this problem via equivalently solving a sequence of convex feasibility problems together with a greedy based solver. Simulation results demonstrate the superiority of our proposed scheme over the baselines.

artificial intelligence, machine learning, node, (17 more...)

arXiv.org Artificial Intelligence

2405.05802

Country: Asia > China (0.29)

Genre: Research Report (0.70)

Industry:

Telecommunications (0.70)
Energy (0.51)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)

Add feedback

SQL-Encoder: Improving NL2SQL In-Context Learning Through a Context-Aware Encoder

Pourreza, Mohammadreza, Rafiei, Davood, Feng, Yuxi, Li, Raymond, Fan, Zhenan, Zhang, Weiwei

arXiv.org Artificial IntelligenceMar-24-2024

Detecting structural similarity between queries is essential for selecting examples in in-context learning models. However, assessing structural similarity based solely on the natural language expressions of queries, without considering SQL queries, presents a significant challenge. This paper explores the significance of this similarity metric and proposes a model for accurately estimating it. To achieve this, we leverage a dataset comprising 170k question pairs, meticulously curated to train a similarity prediction model. Our comprehensive evaluation demonstrates that the proposed model adeptly captures the structural similarity between questions, as evidenced by improvements in Kendall-Tau distance and precision@k metrics. Notably, our model outperforms strong competitive embedding models from OpenAI and Cohere. Furthermore, compared to these competitive models, our proposed encoder enhances the downstream performance of NL2SQL models in 1-shot in-context learning scenarios by 1-2\% for GPT-3.5-turbo, 4-8\% for CodeLlama-7B, and 2-3\% for CodeLlama-13B.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2403.16204

Country: North America > Canada > Alberta (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

Add feedback

Effective Quantization for Diffusion Models on CPUs

Chang, Hanwen, Shen, Haihao, Cai, Yiyang, Ye, Xinyu, Xu, Zhenzhong, Cheng, Wenhua, Lv, Kaokao, Zhang, Weiwei, Lu, Yintong, Guo, Heng

arXiv.org Artificial IntelligenceNov-29-2023

Diffusion models have gained popularity for generating images from textual descriptions. Nonetheless, the substantial need for computational resources continues to present a noteworthy challenge, contributing to time-consuming processes. Quantization, a technique employed to compress deep learning models for enhanced efficiency, presents challenges when applied to diffusion models. These models are notably more sensitive to quantization compared to other model types, potentially resulting in a degradation of image quality. In this paper, we introduce a novel approach to quantize the diffusion models by leveraging both quantization-aware training and distillation. Our results show the quantized models can maintain the high image quality while demonstrating the inference efficiency on CPUs.

artificial intelligence, diffusion model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2311.16133

Genre: Research Report > New Finding (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

TSONN: Time-stepping-oriented neural network for solving partial differential equations

Cao, Wenbo, Zhang, Weiwei

arXiv.org Artificial IntelligenceOct-25-2023

Deep neural networks (DNNs), especially physics-informed neural networks (PINNs), have recently become a new popular method for solving forward and inverse problems governed by partial differential equations (PDEs). However, these methods still face challenges in achieving stable training and obtaining correct results in many problems, since minimizing PDE residuals with PDE-based soft constraint make the problem ill-conditioned. Different from all existing methods that directly minimize PDE residuals, this work integrates time-stepping method with deep learning, and transforms the original ill-conditioned optimization problem into a series of well-conditioned sub-problems over given pseudo time intervals. The convergence of model training is significantly improved by following the trajectory of the pseudo time-stepping process, yielding a robust optimization-based PDE solver. Our results show that the proposed method achieves stable training and correct results in many problems that standard PINNs fail to solve, requiring only a simple modification on the loss function. In addition, we demonstrate several novel properties and advantages of time-stepping methods within the framework of neural network-based optimization approach, in comparison to traditional grid-based numerical method. Specifically, explicit scheme allows significantly larger time step, while implicit scheme can be implemented as straightforwardly as explicit scheme.

deep learning, machine learning, time-stepping-oriented neural network, (3 more...)

arXiv.org Artificial Intelligence

2310.16491

Genre: Research Report > New Finding (0.53)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Cheng, Wenhua, Zhang, Weiwei, Shen, Haihao, Cai, Yiyang, He, Xin, Lv, Kaokao

arXiv.org Artificial IntelligenceSep-28-2023

Large Language Models (LLMs) have proven their exceptional capabilities in performing language-related tasks. However, their deployment poses significant challenges due to their considerable memory and storage requirements. In response to this issue, weight-only quantization, particularly 3 and 4-bit weight-only quantization, has emerged as one of the most viable solutions. As the number of bits decreases, the quantization grid broadens, thus emphasizing the importance of up and down rounding. While previous studies have demonstrated that fine-tuning up and down rounding with the addition of perturbations can enhance accuracy in some scenarios, our study is driven by the precise and limited boundary of these perturbations, where only the threshold for altering the rounding value is of significance. Consequently, we propose a concise and highly effective approach for optimizing the weight rounding task. Our method, named SignRound, involves lightweight block-wise tuning using signed gradient descent, enabling us to achieve outstanding results within 400 steps. SignRound competes impressively against recent methods without introducing additional inference overhead. The source code will be publicly available at \url{https://github.com/intel/neural-compressor} soon.

large language model, natural language, optimize weight rounding, (4 more...)

arXiv.org Artificial Intelligence

2309.05516

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)

Add feedback

PUMGPT: A Large Vision-Language Model for Product Understanding

Wu, Shuhui, Tang, Zengming, Guo, Zongyi, Zhang, Weiwei, Cui, Baoliang, Tang, Haihong, Lu, Weiming

arXiv.org Artificial IntelligenceAug-18-2023

Recent developments of multi-modal large language models have demonstrated its strong ability in solving vision-language tasks. In this paper, we focus on the product understanding task, which plays an essential role in enhancing online shopping experience. Product understanding task includes a variety of sub-tasks, which require models to respond diverse queries based on multi-modal product information. Traditional methods design distinct model architectures for each sub-task. On the contrary, we present PUMGPT, a large vision-language model aims at unifying all product understanding tasks under a singular model structure. To bridge the gap between vision and text representations, we propose Layer-wise Adapters (LA), an approach that provides enhanced alignment with fewer visual tokens and enables parameter-efficient fine-tuning. Moreover, the inherent parameter-efficient fine-tuning ability allows PUMGPT to be readily adapted to new product understanding tasks and emerging products. We design instruction templates to generate diverse product instruction datasets. Simultaneously, we utilize open-domain datasets during training to improve the performance of PUMGPT and its generalization ability. Through extensive evaluations, PUMGPT demonstrates its superior performance across multiple product understanding tasks, including product captioning, category question-answering, attribute extraction, attribute question-answering, and even free-form question-answering about products.

artificial intelligence, natural language, pumgpt, (14 more...)

arXiv.org Artificial Intelligence

2308.09568

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology > Services > e-Commerce Services (0.34)
Retail > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback