AITopics | Gao, Ming

Collaborating Authors

Gao, Ming

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Circuit Diagram Retrieval Based on Hierarchical Circuit Graph Representation

Gao, Ming, Qiu, Ruichen, Chang, Zeng Hui, Zhang, Kanjian, Wei, Haikun, Chen, Hong Cai

arXiv.org Artificial IntelligenceFeb-5-2025

In the domain of analog circuit design, the retrieval of circuit diagrams has drawn a great interest, primarily due to its vital role in the consultation of legacy designs and the detection of design plagiarism. Existing image retrieval techniques are adept at handling natural images, which converts images into feature vectors and retrieval similar images according to the closeness of these vectors. Nonetheless, these approaches exhibit limitations when applied to the more specialized and intricate domain of circuit diagrams. This paper presents a novel approach to circuit diagram retrieval by employing a graph representation of circuit diagrams, effectively reformulating the retrieval task as a graph retrieval problem. The proposed methodology consists of two principal components: a circuit diagram recognition algorithm designed to extract the circuit components and topological structure of the circuit using proposed GAM-YOLO model and a 2-step connected domain filtering algorithm, and a hierarchical retrieval strategy based on graph similarity and different graph representation methods for analog circuits. Our methodology pioneers the utilization of graph representation in the retrieval of circuit diagrams, incorporating topological features that are commonly overlooked by standard image retrieval methods. The results of our experiments substantiate the efficacy of our approach in retrieving circuit diagrams across of different types.

artificial intelligence, circuit diagram, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2503.11658

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

PA-RAG: RAG Alignment via Multi-Perspective Preference Optimization

Wu, Jiayi, Cai, Hengyi, Yan, Lingyong, Sun, Hao, Li, Xiang, Wang, Shuaiqiang, Yin, Dawei, Gao, Ming

arXiv.org Artificial IntelligenceDec-18-2024

The emergence of Retrieval-augmented generation (RAG) has alleviated the issues of outdated and hallucinatory content in the generation of large language models (LLMs), yet it still reveals numerous limitations. When a general-purpose LLM serves as the RAG generator, it often suffers from inadequate response informativeness, response robustness, and citation quality. Past approaches to tackle these limitations, either by incorporating additional steps beyond generating responses or optimizing the generator through supervised fine-tuning (SFT), still failed to align with the RAG requirement thoroughly. Consequently, optimizing the RAG generator from multiple preference perspectives while maintaining its end-to-end LLM form remains a challenge. To bridge this gap, we propose Multiple Perspective Preference Alignment for Retrieval-Augmented Generation (PA-RAG), a method for optimizing the generator of RAG systems to align with RAG requirements comprehensively. Specifically, we construct high-quality instruction fine-tuning data and multi-perspective preference data by sampling varied quality responses from the generator across different prompt documents quality scenarios. Subsequently, we optimize the generator using SFT and Direct Preference Optimization (DPO). Extensive experiments conducted on four question-answer datasets across three LLMs demonstrate that PA-RAG can significantly enhance the performance of RAG generators. Our code and datasets are available at https://github.com/wujwyi/PA-RAG.

generator, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2412.1451

Country:

Asia (1.00)
Europe > United Kingdom > England (0.47)
North America > United States > Alaska (0.28)

Genre:

Personal (1.00)
Research Report (0.63)

Industry:

Media (1.00)
Leisure & Entertainment > Sports > Olympic Games (1.00)
Leisure & Entertainment > Sports > Motorsports > Formula One (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

DocEDA: Automated Extraction and Design of Analog Circuits from Documents with Large Language Model

Chen, Hong Cai, Wu, Longchang, Gao, Ming, Shen, Lingrui, Zhong, Jiarui, Xu, Yipin

arXiv.org Artificial IntelligenceNov-25-2024

Efficient and accurate extraction of electrical parameters from circuit datasheets and design documents is critical for accelerating circuit design in Electronic Design Automation (EDA). Traditional workflows often rely on engineers manually searching and extracting these parameters, which is time-consuming, and prone to human error. To address these challenges, we introduce DocEDA, an automated system that leverages advanced computer vision techniques and Large Language Models (LLMs) to extract electrical parameters seamlessly from documents. The layout analysis model specifically designed for datasheet is proposed to classify documents into circuit-related parts. Utilizing the inherent Chain-of-Thought reasoning capabilities of LLMs, DocEDA automates the extraction of electronic component parameters from documents. For circuit diagrams parsing, an improved GAM-YOLO model is hybrid with topology identification to transform diagrams into circuit netlists. Then, a space mapping enhanced optimization framework is evoked for optimization the layout in the document. Experimental evaluations demonstrate that DocEDA significantly enhances the efficiency of processing circuit design documents and the accuracy of electrical parameter extraction. It exhibits adaptability to various circuit design scenarios and document formats, offering a novel solution for EDA with the potential to transform traditional methodologies.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.05301

Country: Asia > China (0.16)

Genre: Research Report > Promising Solution (0.34)

Industry: Semiconductors & Electronics (0.49)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Cross-model Control: Improving Multiple Large Language Models in One-time Training

Wu, Jiayi, Sun, Hao, Cai, Hengyi, Su, Lixin, Wang, Shuaiqiang, Yin, Dawei, Li, Xiang, Gao, Ming

arXiv.org Artificial IntelligenceOct-23-2024

The number of large language models (LLMs) with varying parameter scales and vocabularies is increasing. While they deliver powerful performance, they also face a set of common optimization needs to meet specific requirements or standards, such as instruction following or avoiding the output of sensitive information from the real world. However, how to reuse the fine-tuning outcomes of one model to other models to reduce training costs remains a challenge. To bridge this gap, we introduce Cross-model Control (CMC), a method that improves multiple LLMs in one-time training with a portable tiny language model. Specifically, we have observed that the logit shift before and after fine-tuning is remarkably similar across different models. Based on this insight, we incorporate a tiny language model with a minimal number of parameters. By training alongside a frozen template LLM, the tiny model gains the capability to alter the logits output by the LLMs. To make this tiny language model applicable to models with different vocabularies, we propose a novel token mapping strategy named PM-MinED. We have conducted extensive experiments on instruction tuning and unlearning tasks, demonstrating the effectiveness of CMC. Our code is available at https://github.com/wujwyi/CMC.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.17599

Country:

Asia (0.68)
North America > United States (0.46)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (0.66)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction

Wang, Guangyu, Chen, Yujie, Gao, Ming, Wu, Zhiqiao, Tang, Jiafu, Zhao, Jiabi

arXiv.org Artificial IntelligenceSep-25-2024

Accurate traffic prediction faces significant challenges, necessitating a deep understanding of both temporal and spatial cues and their complex interactions across multiple variables. Recent advancements in traffic prediction systems are primarily due to the development of complex sequence-centric models. However, existing approaches often embed multiple variables and spatial relationships at each time step, which may hinder effective variable-centric learning, ultimately leading to performance degradation in traditional traffic prediction tasks. To overcome these limitations, we introduce variable-centric and prior knowledge-centric modeling techniques. Specifically, we propose a Heterogeneous Mixture of Experts (TITAN) model for traffic flow prediction. TITAN initially consists of three experts focused on sequence-centric modeling. Then, designed a low-rank adaptive method, TITAN simultaneously enables variable-centric modeling. Furthermore, we supervise the gating process using a prior knowledge-centric modeling strategy to ensure accurate routing. Experiments on two public traffic network datasets, METR-LA and PEMS-BAY, demonstrate that TITAN effectively captures variable-centric dependencies while ensuring accurate routing. Consequently, it achieves improvements in all evaluation metrics, ranging from approximately 4.37\% to 11.53\%, compared to previous state-of-the-art (SOTA) models. The code is open at \href{https://github.com/sqlcow/TITAN}{https://github.com/sqlcow/TITAN}.

data mining, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2409.1744

Country:

North America > United States > California (0.14)
Asia > China > Liaoning Province (0.14)

Genre:

Research Report (1.00)
Overview (0.93)

Industry: Consumer Products & Services > Travel (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(3 more...)

Add feedback

Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design

Gao, Ming, Chen, Hang, Du, Jun, Xu, Xin, Guo, Hongxiao, Bu, Hui, Yang, Jianxing, Li, Ming, Lee, Chin-Hui

arXiv.org Artificial IntelligenceJun-13-2024

Smart home technology has gained widespread adoption, facilitating effortless control of devices through voice commands. However, individuals with dysarthria, a motor speech disorder, face challenges due to the variability of their speech. This paper addresses the wake-up word spotting (WWS) task for dysarthric individuals, aiming to integrate them into real-world applications. To support this, we release the open-source Mandarin Dysarthria Speech Corpus (MDSC), a dataset designed for dysarthric individuals in home environments. MDSC encompasses information on age, gender, disease types, and intelligibility evaluations. Furthermore, we perform comprehensive experimental analysis on MDSC, highlighting the challenges encountered. We also develop a customized dysarthria WWS system that showcases robustness in handling intelligibility and achieving exceptional performance. MDSC will be released on https://www.aishelltech.com/AISHELL_6B.

artificial intelligence, dysarthria, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2406.10304

Country:

Asia > China (0.15)
North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.48)

Add feedback

A method for quantifying the generalization capabilities of generative models for solving Ising models

Ma, Qunlong, Ma, Zhi, Gao, Ming

arXiv.org Artificial IntelligenceMay-6-2024

For Ising models with complex energy landscapes, whether the ground state can be found by neural networks depends heavily on the Hamming distance between the training datasets and the ground state. Despite the fact that various recently proposed generative models have shown good performance in solving Ising models, there is no adequate discussion on how to quantify their generalization capabilities. Here we design a Hamming distance regularizer in the framework of a class of generative models, variational autoregressive networks (VAN), to quantify the generalization capabilities of various network architectures combined with VAN. The regularizer can control the size of the overlaps between the ground state and the training datasets generated by networks, which, together with the success rates of finding the ground state, form a quantitative metric to quantify their generalization capabilities. We conduct numerical experiments on several prototypical network architectures combined with VAN, including feed-forward neural networks, recurrent neural networks, and graph neural networks, to quantify their generalization capabilities when solving Ising models. Moreover, considering the fact that the quantification of the generalization capabilities of networks on small-scale problems can be used to predict their relative performance on large-scale problems, our method is of great significance for assisting in the Neural Architecture Search field of searching for the optimal network architectures when solving large-scale Ising models.

artificial intelligence, machine learning, network architecture, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/2632-2153/ad3710

2405.03435

Country: Asia > China (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Structure-aware Fine-tuning for Code Pre-trained Models

Wu, Jiayi, Zhu, Renyu, Chen, Nuo, Sun, Qiushi, Li, Xiang, Gao, Ming

arXiv.org Artificial IntelligenceApr-11-2024

Over the past few years, we have witnessed remarkable advancements in Code Pre-trained Models (CodePTMs). These models achieved excellent representation capabilities by designing structure-based pre-training tasks for code. However, how to enhance the absorption of structural knowledge when fine-tuning CodePTMs still remains a significant challenge. To fill this gap, in this paper, we present Structure-aware Fine-tuning (SAT), a novel structure-enhanced and plug-and-play fine-tuning method for CodePTMs. We first propose a structure loss to quantify the difference between the information learned by CodePTMs and the knowledge extracted from code structure. Specifically, we use the attention scores extracted from Transformer layer as the learned structural information, and the shortest path length between leaves in abstract syntax trees as the structural knowledge. Subsequently, multi-task learning is introduced to improve the performance of fine-tuning. Experiments conducted on four pre-trained models and two generation tasks demonstrate the effectiveness of our proposed method as a plug-and-play solution. Furthermore, we observed that SAT can benefit CodePTMs more with limited training data.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2404.07471

Country:

North America > Canada (0.28)
Asia > Japan (0.28)
Asia > Middle East > UAE (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Message Passing Variational Autoregressive Network for Solving Intractable Ising Models

Ma, Qunlong, Ma, Zhi, Xu, Jinlong, Zhang, Hairui, Gao, Ming

arXiv.org Artificial IntelligenceApr-9-2024

Many deep neural networks have been used to solve Ising models, including autoregressive neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks. Learning a probability distribution of energy configuration or finding the ground states of a disordered, fully connected Ising model is essential for statistical mechanics and NP-hard problems. Despite tremendous efforts, a neural network architecture with the ability to high-accurately solve these fully connected and extremely intractable problems on larger systems is still lacking. Here we propose a variational autoregressive architecture with a message passing mechanism, which can effectively utilize the interactions between spin variables. The new network trained under an annealing framework outperforms existing methods in solving several prototypical Ising spin Hamiltonians, especially for larger spin systems at low temperatures. The advantages also come from the great mitigation of mode collapse during the training process of deep neural networks. Considering these extremely difficult problems to be solved, our method extends the current computational limits of unsupervised neural networks to solve combinatorial optimization problems.

artificial intelligence, machine learning, mechanism, (19 more...)

arXiv.org Artificial Intelligence

2404.06225

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Personalized Programming Guidance based on Deep Programming Learning Style Capturing

Liu, Yingfan, Zhu, Renyu, Gao, Ming

arXiv.org Artificial IntelligenceFeb-20-2024

With the rapid development of big data and AI technology, programming is in high demand and has become an essential skill for students. Meanwhile, researchers also focus on boosting the online judging system's guidance ability to reduce students' dropout rates. Previous studies mainly targeted at enhancing learner engagement on online platforms by providing personalized recommendations. However, two significant challenges still need to be addressed in programming: C1) how to recognize complex programming behaviors; C2) how to capture intrinsic learning patterns that align with the actual learning process. To fill these gaps, in this paper, we propose a novel model called Programming Exercise Recommender with Learning Style (PERS), which simulates learners' intricate programming behaviors. Specifically, since programming is an iterative and trial-and-error process, we first introduce a positional encoding and a differentiating module to capture the changes of consecutive code submissions (which addresses C1). To better profile programming behaviors, we extend the Felder-Silverman learning style model, a classical pedagogical theory, to perceive intrinsic programming patterns. Based on this, we align three latent vectors to record and update programming ability, processing style, and understanding style, respectively (which addresses C2). We perform extensive experiments on two real-world datasets to verify the rationality of modeling programming learning styles and the effectiveness of PERS for personalized programming guidance.

data mining, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2403.14638

Country:

Asia > China > Zhejiang Province (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (1.00)

Industry:

Education > Educational Setting > Online (0.69)
Education > Educational Technology > Educational Software > Computer Based Training (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback