Zhang, Ruizhe
RAGraph: A General Retrieval-Augmented Graph Learning Framework
Jiang, Xinke, Qiu, Rihong, Xu, Yongxin, Zhang, Wentao, Zhu, Yichen, Zhang, Ruizhe, Fang, Yuchen, Chu, Xu, Zhao, Junfeng, Wang, Yasha
Graph Neural Networks (GNNs) have become essential in interpreting relational data across various domains, yet they often struggle to generalize to unseen graph data that differs markedly from training instances. In this paper, we introduce a novel framework called General Retrieval-Augmented Graph Learning (RAGraph), which brings external graph data into a general graph foundation model to improve generalization to unseen scenarios. At the core of our framework is a toy graph vector library that we construct, which captures key attributes such as features and task-specific label information. During inference, RAGraph retrieves similar toy graphs based on key similarities in downstream tasks and integrates the retrieved data to enrich the learning context via a message-passing prompting mechanism. Our extensive experimental evaluations demonstrate that RAGraph significantly outperforms state-of-the-art graph learning methods on multiple tasks, such as node classification, link prediction, and graph classification, across both dynamic and static datasets. Furthermore, extensive testing confirms that RAGraph consistently maintains high performance without task-specific fine-tuning, highlighting its adaptability, robustness, and broad applicability.
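The abstract does not spell out the retrieval step, but it can be pictured as nearest-neighbor search over a vector library of stored toy graphs. The following Python sketch illustrates that picture under our own assumptions (pooled embeddings as keys, cosine similarity as the matching criterion); class and function names are hypothetical and not taken from RAGraph.

```python
# Minimal sketch of retrieval over a "toy graph" vector library (assumption:
# each stored toy graph is summarized by a pooled embedding plus label info;
# names and the cosine-similarity criterion are illustrative, not RAGraph's design).
import numpy as np

class ToyGraphLibrary:
    def __init__(self):
        self.keys = []      # pooled embeddings of stored toy graphs
        self.values = []    # task-specific payloads (e.g., label information)

    def add(self, embedding: np.ndarray, payload: dict) -> None:
        self.keys.append(embedding / np.linalg.norm(embedding))
        self.values.append(payload)

    def retrieve(self, query: np.ndarray, k: int = 5):
        """Return the k most similar stored toy graphs for a query embedding."""
        q = query / np.linalg.norm(query)
        sims = np.stack(self.keys) @ q          # cosine similarities
        top = np.argsort(-sims)[:k]
        return [(self.values[i], float(sims[i])) for i in top]

# Usage: the retrieved payloads would then be injected into message passing
# as prompts; that prompting step is not shown here.
library = ToyGraphLibrary()
rng = np.random.default_rng(0)
for label in range(10):
    library.add(rng.normal(size=64), {"label": label})
neighbors = library.retrieve(rng.normal(size=64), k=3)
```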
LexEval: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models
Li, Haitao, Chen, You, Ai, Qingyao, Wu, Yueyue, Zhang, Ruizhe, Liu, Yiqun
Large language models (LLMs) have made significant progress in natural language processing tasks and demonstrate considerable potential in the legal domain. However, legal applications demand high standards of accuracy, reliability, and fairness. Applying existing LLMs to legal systems without careful evaluation of their potential and limitations could pose significant risks in legal practice. To this end, we introduce LexEval, a standardized, comprehensive Chinese legal benchmark. This benchmark is notable in three aspects: (1) Ability Modeling: We propose a new taxonomy of legal cognitive abilities to organize different tasks. (2) Scale: To our knowledge, LexEval is currently the largest Chinese legal evaluation dataset, comprising 23 tasks and 14,150 questions. (3) Data: We utilize formatted existing datasets, exam datasets, and datasets newly annotated by legal experts to comprehensively evaluate the various capabilities of LLMs. LexEval not only focuses on the ability of LLMs to apply fundamental legal knowledge but also examines the ethical issues involved in their application. We evaluated 38 open-source and commercial LLMs and obtained some interesting findings. The experiments and findings offer valuable insights into the challenges and potential solutions for developing Chinese legal systems and LLM evaluation pipelines. The LexEval dataset and leaderboard are publicly available at \url{https://github.com/CSHaitao/LexEval} and will be continuously updated.
Parenting: Optimizing Knowledge Selection of Retrieval-Augmented Language Models with Parameter Decoupling and Tailored Tuning
Xu, Yongxin, Zhang, Ruizhe, Jiang, Xinke, Feng, Yujie, Xiao, Yuzhen, Ma, Xinyu, Zhu, Runchuan, Chu, Xu, Zhao, Junfeng, Wang, Yasha
Retrieval-Augmented Generation (RAG) offers an effective solution to the hallucination and knowledge-obsolescence issues faced by Large Language Models (LLMs) by incorporating externally retrieved knowledge. However, existing methods lack effective control mechanisms for integrating internal and external knowledge. Inspired by human cognitive processes, we propose Parenting, a novel framework that decouples, identifies, and purposefully optimizes parameter subspaces related to adherence and robustness. Specifically, Parenting utilizes a key-parameter mining method that combines forward and backward propagation signals to localize subspaces representing different capabilities. Parenting then employs a type-tailored tuning strategy, applying specific and appropriate optimizations to different subspaces to achieve a balanced enhancement of both adherence and robustness. Extensive experiments on various datasets and models validate the effectiveness and generalizability of our method.
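The abstract describes combining forward and backward propagation signals to localize parameter subspaces. One common way to instantiate such a score is a parameter-times-gradient importance measure; the PyTorch sketch below shows that generic heuristic, as an assumption-laden illustration rather than Parenting's actual key-parameter mining rule.

```python
# Hedged sketch: score each parameter by a combined forward/backward signal
# (here, |parameter * gradient| on a probe batch), then keep the top fraction
# as a candidate subspace. This is a generic importance heuristic, not claimed
# to be Parenting's exact procedure.
import torch

def parameter_importance(model: torch.nn.Module, loss_fn, batch):
    model.zero_grad()
    inputs, targets = batch
    loss = loss_fn(model(inputs), targets)   # forward pass supplies activations
    loss.backward()                          # backward pass supplies gradients
    scores = {}
    for name, p in model.named_parameters():
        if p.grad is not None:
            scores[name] = (p.detach() * p.grad.detach()).abs()
    return scores

def top_fraction_mask(score: torch.Tensor, fraction: float = 0.05) -> torch.Tensor:
    """Boolean mask selecting the highest-scoring fraction of entries,
    i.e., a candidate parameter subspace to tune separately."""
    k = max(1, int(fraction * score.numel()))
    threshold = score.flatten().topk(k).values.min()
    return score >= threshold
```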
Evaluation Ethics of LLMs in Legal Domain
Zhang, Ruizhe, Li, Haitao, Wu, Yueyue, Ai, Qingyao, Liu, Yiqun, Zhang, Min, Ma, Shaoping
In recent years, the use of large language models (LLMs) for natural language dialogue has gained momentum, leading to their widespread adoption across various domains. However, their competence in addressing challenges specific to specialized fields such as law remains a subject of scrutiny, and the incorporation of legal ethics into these models has largely been overlooked by researchers. We assert that rigorous ethical evaluation is essential to ensure the effective integration of LLMs in legal domains, emphasizing the need to assess both domain-specific proficiency and domain-specific ethics. To address this, we propose a novel evaluation methodology that utilizes authentic legal cases to evaluate the fundamental language abilities, specialized legal knowledge, and legal robustness of LLMs. The findings from our comprehensive evaluation contribute significantly to the academic discourse surrounding the suitability and performance of large language models in legal domains.
Quantum Speedup for Spectral Approximation of Kronecker Products
Gao, Yeqi, Song, Zhao, Zhang, Ruizhe
Given its widespread application in machine learning and optimization, the Kronecker product is a pivotal linear algebra operator. However, its computational demands make it an expensive operation, and spectral approximation of it through traditional classical algorithms is correspondingly costly. Existing classical methods for spectral approximation exhibit a linear dependence on the matrix dimension $n$, for matrices $A_1 \in \mathbb{R}^{n \times d}$ and $A_2 \in \mathbb{R}^{n \times d}$. Our work introduces an innovative approach that efficiently addresses the spectral approximation of the Kronecker product $A_1 \otimes A_2$ using quantum methods. By treating matrices as quantum states, our proposed method significantly reduces the time complexity of spectral approximation to $O_{d,\epsilon}(\sqrt{n})$.
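For concreteness, a standard way to state the spectral approximation goal for $A_1 \otimes A_2 \in \mathbb{R}^{n^2 \times d^2}$ is to ask for a much shorter matrix $B$ (with $\mathrm{poly}(d/\epsilon)$ rows, say) such that
$$(1-\epsilon)\,(A_1 \otimes A_2)^\top (A_1 \otimes A_2) \;\preceq\; B^\top B \;\preceq\; (1+\epsilon)\,(A_1 \otimes A_2)^\top (A_1 \otimes A_2),$$
so that $B$ can stand in for $A_1 \otimes A_2$ in downstream tasks up to a $(1 \pm \epsilon)$ factor. This is the usual formulation in the sketching literature and is given here for orientation; the paper's exact guarantee may differ in its constants and parameters.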
Infinite-Horizon Graph Filters: Leveraging Power Series to Enhance Sparse Information Aggregation
Zhang, Ruizhe, Jiang, Xinke, Fang, Yuchen, Luo, Jiayuan, Xu, Yongxin, Zhu, Yichen, Chu, Xu, Zhao, Junfeng, Wang, Yasha
Graph Neural Networks (GNNs) have shown considerable effectiveness in a variety of graph learning tasks in recent years, particularly those based on the message-passing paradigm. However, their performance is often constrained by a limited receptive field, a challenge that becomes more acute on sparse graphs. Motivated by power series, which admit infinite expansions, we propose a novel Graph Power Filter Neural Network (GPFN) that enhances node classification by employing a power-series graph filter to enlarge the receptive field. Concretely, GPFN builds a graph filter with an infinite receptive field from a convergent power series, which can be analyzed in both the spectral and spatial domains. Moreover, we theoretically prove that GPFN is a general framework that can integrate any power series and capture long-range dependencies. Finally, experimental results on three datasets demonstrate the superiority of GPFN over state-of-the-art baselines.
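As one concrete member of the convergent-power-series family (not necessarily the paper's parameterization): with a symmetrically normalized adjacency $\hat{A}$ whose spectrum lies in $[-1, 1]$ and a decay $\beta \in (0,1)$, the geometric series $\sum_{k\ge 0} \beta^k \hat{A}^k X$ converges to $(I-\beta\hat{A})^{-1}X$, and truncating it at order $K$ yields a $K$-hop receptive field. A minimal NumPy sketch under these assumptions:

```python
# Truncated geometric power-series graph filter:
# H = sum_{k=0}^{K} beta^k * A_hat^k @ X, which for beta in (0, 1) approaches
# (I - beta * A_hat)^{-1} @ X as K grows. Illustrative only; GPFN's actual
# filter family and coefficients may differ.
import numpy as np

def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def power_series_filter(adj: np.ndarray, x: np.ndarray,
                        beta: float = 0.5, K: int = 16) -> np.ndarray:
    a_hat = normalized_adjacency(adj)
    out = np.zeros_like(x, dtype=float)
    term = x.astype(float)            # k = 0 term: A_hat^0 @ X
    for _ in range(K + 1):
        out += term
        term = beta * (a_hat @ term)  # next term: beta^{k+1} * A_hat^{k+1} @ X
    return out
```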
Think and Retrieval: A Hypothesis Knowledge Graph Enhanced Medical Large Language Models
Jiang, Xinke, Zhang, Ruizhe, Xu, Yongxin, Qiu, Rihong, Fang, Yue, Wang, Zhiyuan, Tang, Jinyi, Ding, Hongxin, Chu, Xu, Zhao, Junfeng, Wang, Yasha
We explore how the rise of Large Language Models (LLMs) significantly impacts task performance in the field of Natural Language Processing. We focus on two strategies, Retrieval-Augmented Generation (RAG) and Fine-Tuning (FT), and propose the Hypothesis Knowledge Graph Enhanced (HyKGE) framework, leveraging a knowledge graph to enhance medical LLMs. By integrating LLMs and knowledge graphs, HyKGE demonstrates superior performance in addressing accuracy and interpretability challenges, presenting potential applications in the medical domain. Our evaluations using real-world datasets highlight HyKGE's superiority in providing accurate knowledge with precise confidence, particularly in complex and difficult scenarios. The code will be available until published.
Revisiting Quantum Algorithms for Linear Regressions: Quadratic Speedups without Data-Dependent Parameters
Song, Zhao, Yin, Junze, Zhang, Ruizhe
Linear regression is one of the most fundamental linear algebra problems. Given a dense matrix $A \in \mathbb{R}^{n \times d}$ and a vector $b$, the goal is to find $x'$ such that $ \| Ax' - b \|_2^2 \leq (1+\epsilon) \min_{x} \| A x - b \|_2^2 $. The best classical algorithm takes $O(nd) + \mathrm{poly}(d/\epsilon)$ time [Clarkson and Woodruff STOC 2013, Nelson and Nguyen FOCS 2013]. On the other hand, quantum linear regression algorithms can achieve exponential quantum speedups, as shown in [Wang Phys. Rev. A 96, 012335, Kerenidis and Prakash ITCS 2017, Chakraborty, Gilyén and Jeffery ICALP 2019]. However, the running times of these algorithms depend on some quantum linear algebra-related parameters, such as $\kappa(A)$, the condition number of $A$. In this work, we develop a quantum algorithm that runs in $\widetilde{O}(\epsilon^{-1}\sqrt{n}d^{1.5}) + \mathrm{poly}(d/\epsilon)$ time. It provides a quadratic quantum speedup in $n$ over the classical lower bound without any dependence on data-dependent parameters. In addition, we show that our result can be generalized to multiple regression and ridge linear regression.
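As a back-of-the-envelope comparison of our own (not a claim quoted from the paper), ignoring polylogarithmic factors and the shared $\mathrm{poly}(d/\epsilon)$ term, the quantum bound improves on the classical $O(nd)$ cost roughly when
$$\frac{\epsilon^{-1}\sqrt{n}\,d^{1.5}}{nd} \;=\; \frac{d^{0.5}}{\epsilon\,\sqrt{n}} \;\ll\; 1 \quad\Longleftrightarrow\quad n \;\gg\; \frac{d}{\epsilon^{2}},$$
i.e., in the tall-matrix regime where $n$ dominates $d$.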
Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time
Song, Zhao, Zhang, Lichen, Zhang, Ruizhe
We consider the problem of training a multi-layer over-parametrized neural network to minimize the empirical risk induced by a loss function. In the typical over-parametrized setting, the network width $m$ is much larger than the data dimension $d$ and the number of training samples $n$ ($m=\mathrm{poly}(n,d)$), which induces a prohibitively large weight matrix $W\in \mathbb{R}^{m\times m}$ per layer. Naively, one has to pay $O(m^2)$ time to read the weight matrix and evaluate the neural network function in both the forward and backward computation. In this work, we show how to reduce the training cost per iteration. Specifically, we propose a framework that pays the $O(m^2)$ cost only in the initialization phase and achieves \emph{a truly subquadratic cost per iteration} in terms of $m$, i.e., $m^{2-\Omega(1)}$ per iteration. Our result has implications beyond standard over-parametrization theory, as it can be viewed as designing an efficient data structure on top of a pre-trained large model to further speed up fine-tuning, a core procedure for deploying large language models (LLMs).
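To ground the $O(m^2)$ figure: in a naive implementation, each layer's forward pass multiplies an $m \times m$ weight matrix by an activation vector and the backward pass does the same with its transpose, so every weight entry is read each iteration. The NumPy snippet below only illustrates this baseline cost; the paper's data-structure-based subquadratic update is not reproduced here.

```python
# Baseline illustration of the O(m^2)-per-layer cost the paper avoids:
# a dense m x m matrix-vector product touches every weight entry once.
import numpy as np

m = 4096
W = np.random.randn(m, m)          # one over-parametrized layer's weights
x = np.random.randn(m)

h = np.maximum(W @ x, 0.0)         # forward: O(m^2) multiply-adds, then ReLU
g_h = np.random.randn(m)           # stand-in for the upstream gradient
g_x = W.T @ (g_h * (h > 0))        # backward w.r.t. the input: another O(m^2)
```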
Fast Quantum Algorithm for Attention Computation
Gao, Yeqi, Song, Zhao, Yang, Xin, Zhang, Ruizhe
Large language models (LLMs) have demonstrated exceptional performance across a wide range of tasks. Powered by advanced deep learning techniques, these models have revolutionized the field of natural language processing (NLP), excelling in tasks such as machine translation, sentiment analysis, question answering, text generation, text classification, and language modeling, and proving highly effective at capturing complex linguistic patterns, understanding context, and generating coherent, contextually relevant text. The attention scheme plays a crucial role in the architecture of LLMs: it is the fundamental component that enables the model to capture and utilize contextual information effectively during language processing. Making attention computation faster is therefore one of the central questions in speeding up LLM computation. It is well known that quantum machines have certain computational advantages over classical machines, but it is currently unknown whether quantum computing can aid LLMs. In this work, we focus on utilizing Grover's Search algorithm to compute a sparse attention matrix efficiently, achieving a polynomial quantum speedup over the classical method. Moreover, the attention matrix output by our quantum algorithm exhibits an additional low-rank structure that will be useful for obtaining a faster training algorithm for LLMs. Additionally, we present a detailed analysis of the algorithm's error and time complexity in the context of computing the attention matrix.
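The quantum routine itself cannot be reproduced classically here, but the object it targets can be illustrated: a sparse attention matrix that keeps only query-key scores above a threshold, which is precisely the kind of "marked entry" search Grover's algorithm accelerates quadratically. The NumPy sketch below shows that classical sparsification target under our own assumptions (the thresholding rule and function names are illustrative, not the paper's algorithm).

```python
# Classical illustration of the sparse attention target: keep only
# query-key scores above a threshold, then softmax over the survivors.
# A Grover-style search would locate these "marked" entries with
# quadratically fewer queries; the quantum routine is not reproduced here.
import numpy as np

def sparse_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray,
                     tau: float) -> np.ndarray:
    d = Q.shape[1]
    scores = (Q @ K.T) / np.sqrt(d)            # pre-softmax attention scores
    mask = scores >= tau                       # entries a threshold search would mark
    scores = np.where(mask, scores, -1e30)     # effectively discard the rest
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights = np.where(mask, weights, 0.0)
    denom = weights.sum(axis=1, keepdims=True)
    denom[denom == 0.0] = 1.0                  # queries with no marked key -> zero row
    return (weights / denom) @ V
```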