AITopics | Peng, Zhiyuan

Collaborating Authors

Peng, Zhiyuan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ScopeQA: A Framework for Generating Out-of-Scope Questions for RAG

Peng, Zhiyuan, Nian, Jinming, Evfimievski, Alexandre, Fang, Yi

arXiv.org Artificial IntelligenceDec-19-2024

Conversational AI agents use Retrieval Augmented Generation (RAG) to provide verifiable document-grounded responses to user inquiries. However, many natural questions do not have good answers: about 25\% contain false assumptions~\cite{Yu2023:CREPE}, and over 50\% are ambiguous~\cite{DBLP:conf/emnlp/MinMHZ20}. RAG agents need high-quality data to improve their responses to confusing questions. This paper presents a novel guided hallucination-based method to efficiently generate a diverse set of borderline out-of-scope confusing questions for a given document corpus. We conduct an empirical comparative evaluation of several large language models as RAG agents to measure the accuracy of confusion detection and appropriate response generation. We contribute a benchmark dataset to the public domain.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.14567

Country:

North America > United States (0.69)
Europe (0.68)
Asia (0.68)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Sports > Football (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Add feedback

Evaluating and Enhancing Large Language Models for Novelty Assessment in Scholarly Publications

Lin, Ethan, Peng, Zhiyuan, Fang, Yi

arXiv.org Artificial IntelligenceSep-25-2024

Recent studies have evaluated the creativity/novelty of large language models (LLMs) primarily from a semantic perspective, using benchmarks from cognitive science. However, accessing the novelty in scholarly publications is a largely unexplored area in evaluating LLMs. In this paper, we introduce a scholarly novelty benchmark (SchNovel) to evaluate LLMs' ability to assess novelty in scholarly papers. SchNovel consists of 15000 pairs of papers across six fields sampled from the arXiv dataset with publication dates spanning 2 to 10 years apart. In each pair, the more recently published paper is assumed to be more novel. Additionally, we propose RAG-Novelty, which simulates the review process taken by human reviewers by leveraging the retrieval of similar papers to assess novelty. Extensive experiments provide insights into the capabilities of different LLMs to assess novelty and demonstrate that RAG-Novelty outperforms recent baseline models.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2409.16605

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.14)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.66)
Research Report > Promising Solution (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

Dy-mer: An Explainable DNA Sequence Representation Scheme using Sparse Recovery

Peng, Zhiyuan, Tang, Yuanbo, Li, Yang

arXiv.org Artificial IntelligenceJul-6-2024

DNA sequences encode vital genetic and biological information, yet these unfixed-length sequences cannot serve as the input of common data mining algorithms. Hence, various representation schemes have been developed to transform DNA sequences into fixed-length numerical representations. However, these schemes face difficulties in learning high-quality representations due to the complexity and sparsity of DNA data. Additionally, DNA sequences are inherently noisy because of mutations. While several schemes have been proposed for their effectiveness, they often lack semantic structure, making it difficult for biologists to validate and leverage the results. To address these challenges, we propose \textbf{Dy-mer}, an explainable and robust DNA representation scheme based on sparse recovery. Leveraging the underlying semantic structure of DNA, we modify the traditional sparse recovery to capture recurring patterns indicative of biological functions by representing frequent K-mers as basis vectors and reconstructing each DNA sequence through simple concatenation. Experimental results demonstrate that \textbf{Dy-mer} achieves state-of-the-art performance in DNA promoter classification, yielding a remarkable \textbf{13\%} increase in accuracy. Moreover, its inherent explainability facilitates DNA clustering and motif detection, enhancing its utility in biological research.

artificial intelligence, bioinformatics, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2407.12051

Country:

North America > United States (0.28)
Asia > China > Guangdong Province (0.14)

Genre: Research Report > New Finding (0.87)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.40)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Digital Twin-Assisted Data-Driven Optimization for Reliable Edge Caching in Wireless Networks

Zhang, Zifan, Liu, Yuchen, Peng, Zhiyuan, Chen, Mingzhe, Xu, Dongkuan, Cui, Shuguang

arXiv.org Artificial IntelligenceJun-28-2024

Optimizing edge caching is crucial for the advancement of next-generation (nextG) wireless networks, ensuring high-speed and low-latency services for mobile users. Existing data-driven optimization approaches often lack awareness of the distribution of random data variables and focus solely on optimizing cache hit rates, neglecting potential reliability concerns, such as base station overload and unbalanced cache issues. This oversight can result in system crashes and degraded user experience. To bridge this gap, we introduce a novel digital twin-assisted optimization framework, called D-REC, which integrates reinforcement learning (RL) with diverse intervention modules to ensure reliable caching in nextG wireless networks. We first develop a joint vertical and horizontal twinning approach to efficiently create network digital twins, which are then employed by D-REC as RL optimizers and safeguards, providing ample datasets for training and predictive evaluation of our cache replacement policy. By incorporating reliability modules into a constrained Markov decision process, D-REC can adaptively adjust actions, rewards, and states to comply with advantageous constraints, minimizing the risk of network failures. Theoretical analysis demonstrates comparable convergence rates between D-REC and vanilla data-driven methods without compromising caching performance. Extensive experiments validate that D-REC outperforms conventional approaches in cache hit rate and load balancing while effectively enforcing predetermined reliability intervention modules.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2407.00286

Country: North America > United States > North Carolina (0.14)

Genre:

Research Report (1.00)
Overview (0.67)

Industry:

Telecommunications (1.00)
Information Technology > Security & Privacy (0.93)
Information Technology > Networks (0.66)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models

Wu, Xuyang, Peng, Zhiyuan, Sai, Krishna Sravanthi Rajanala, Wu, Hsin-Tai, Fang, Yi

arXiv.org Artificial IntelligenceJun-20-2024

Effective passage retrieval and reranking methods have been widely utilized to identify suitable candidates in open-domain question answering tasks, recent studies have resorted to LLMs for reranking the retrieved passages by the log-likelihood of the question conditioned on each passage. Although these methods have demonstrated promising results, the performance is notably sensitive to the human-written prompt (or hard prompt), and fine-tuning LLMs can be computationally intensive and time-consuming. Furthermore, this approach limits the leverage of question-passage relevance pairs and passage-specific knowledge to enhance the ranking capabilities of LLMs. In this paper, we propose passage-specific prompt tuning for reranking in open-domain question answering (PSPT): a parameter-efficient method that fine-tunes learnable passage-specific soft prompts, incorporating passage-specific knowledge from a limited set of question-passage relevance pairs. The method involves ranking retrieved passages based on the log-likelihood of the model generating the question conditioned on each passage and the learned soft prompt. We conducted extensive experiments utilizing the Llama-2-chat-7B model across three publicly available open-domain question answering datasets and the results demonstrate the effectiveness of the proposed approach.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2405.20654

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Q-PEFT: Query-dependent Parameter Efficient Fine-tuning for Text Reranking with Large Language Models

Peng, Zhiyuan, Wu, Xuyang, Wang, Qifan, Rajanala, Sravanthi, Fang, Yi

arXiv.org Artificial IntelligenceApr-11-2024

Parameter Efficient Fine-Tuning (PEFT) methods have been extensively utilized in Large Language Models (LLMs) to improve the down-streaming tasks without the cost of fine-tuing the whole LLMs. Recent studies have shown how to effectively use PEFT for fine-tuning LLMs in ranking tasks with convincing performance; there are some limitations, including the learned prompt being fixed for different documents, overfitting to specific tasks, and low adaptation ability. In this paper, we introduce a query-dependent parameter efficient fine-tuning (Q-PEFT) approach for text reranking to leak the information of the true queries to LLMs and then make the generation of true queries from input documents much easier. Specifically, we utilize the query to extract the top-$k$ tokens from concatenated documents, serving as contextual clues. We further augment Q-PEFT by substituting the retrieval mechanism with a multi-head attention layer to achieve end-to-end training and cover all the tokens in the documents, guiding the LLMs to generate more document-specific synthetic queries, thereby further improving the reranking performance. Extensive experiments are conducted on four public datasets, demonstrating the effectiveness of our proposed approach.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2404.04522

Country:

Europe > Italy (0.46)
North America > United States > California (0.14)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.67)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ToolNet: Connecting Large Language Models with Massive Tools via Tool Graph

Liu, Xukun, Peng, Zhiyuan, Yi, Xiaoyuan, Xie, Xing, Xiang, Lirong, Liu, Yuchen, Xu, Dongkuan

arXiv.org Artificial IntelligenceFeb-28-2024

While achieving remarkable progress in a broad range of tasks, large language models (LLMs) remain significantly limited in properly using massive external tools. Existing in-context learning approaches simply format tools into a list of plain text descriptions and input them to LLMs, from which, LLMs generate a sequence of tool calls to solve problems step by step. Such a paradigm ignores the intrinsic dependency between tools and offloads all reasoning loads to LLMs, making them restricted to a limited number of specifically designed tools. It thus remains challenging for LLMs to operate on a library of massive tools, casting a great limitation when confronted with real-world scenarios. This paper proposes ToolNet, a plug-and-play framework that scales up the number of tools to thousands with a moderate increase in token consumption. ToolNet organizes tools into a directed graph. Each node represents a tool, and weighted edges denote tool transition. Starting from an initial tool node, an LLM navigates in the graph by iteratively choosing the next one from its successors until the task is resolved. Extensive experiments show that ToolNet can achieve impressive results in challenging multi-hop tool learning datasets and is resilient to tool failures.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2403.00839

Country: Asia (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Extending Whisper with prompt tuning to target-speaker ASR

Ma, Hao, Peng, Zhiyuan, Shao, Mingjie, Li, Jing, Liu, Ju

arXiv.org Artificial IntelligenceJan-11-2024

Target-speaker automatic speech recognition (ASR) aims to transcribe the desired speech of a target speaker from multi-talker overlapped utterances. Most of the existing target-speaker ASR (TS-ASR) methods involve either training from scratch or fully fine-tuning a pre-trained model, leading to significant training costs and becoming inapplicable to large foundation models. This work leverages prompt tuning, a parameter-efficient fine-tuning approach, to extend Whisper, a large-scale single-talker ASR model, to TS-ASR. Variants of prompt tuning approaches along with their configurations are explored and optimized for TS-ASR.Experimental results show that prompt tuning can achieve performance comparable to state-of-the-art full training approaches while only requiring about 1\% of task-specific model parameters. Notably, the original Whisper's features, such as inverse text normalization and timestamp tagging, are retained in target-speaker ASR, keeping the generated transcriptions natural and informative.

machine learning, natural language, whisper, (18 more...)

arXiv.org Artificial Intelligence

2312.08079

Country: Asia > China > Shandong Province (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Explainable Trajectory Representation through Dictionary Learning

Tang, Yuanbo, Peng, Zhiyuan, Li, Yang

arXiv.org Artificial IntelligenceDec-13-2023

Trajectory representation learning on a network enhances our understanding of vehicular traffic patterns and benefits numerous downstream applications. Existing approaches using classic machine learning or deep learning embed trajectories as dense vectors, which lack interpretability and are inefficient to store and analyze in downstream tasks. In this paper, an explainable trajectory representation learning framework through dictionary learning is proposed. Given a collection of trajectories on a network, it extracts a compact dictionary of commonly used subpaths called "pathlets", which optimally reconstruct each trajectory by simple concatenations. The resulting representation is naturally sparse and encodes strong spatial semantics. Theoretical analysis of our proposed algorithm is conducted to provide a probabilistic bound on the estimation error of the optimal dictionary. A hierarchical dictionary learning scheme is also proposed to ensure the algorithm's scalability on large networks, leading to a multi-scale trajectory representation. Our framework is evaluated on two large-scale real-world taxi datasets. Compared to previous work, the dictionary learned by our method is more compact and has better reconstruction rate for new trajectories. We also demonstrate the promising performance of this method in downstream tasks including trip time prediction task and data compression.

data mining, machine learning, trajectory, (16 more...)

arXiv.org Artificial Intelligence

2312.08052

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Ground (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Soft Prompt Tuning for Augmenting Dense Retrieval with Large Language Models

Peng, Zhiyuan, Wu, Xuyang, Fang, Yi

arXiv.org Artificial IntelligenceAug-29-2023

Dense retrieval (DR) converts queries and documents into dense embeddings and measures the similarity between queries and documents in vector space. One of the challenges in DR is the lack of domain-specific training data. While DR models can learn from large-scale public datasets like MS MARCO through transfer learning, evidence shows that not all DR models and domains can benefit from transfer learning equally. Recently, some researchers have resorted to large language models (LLMs) to improve the zero-shot and few-shot DR models. However, the hard prompts or human-written prompts utilized in these works cannot guarantee the good quality of generated weak queries. To tackle this, we propose soft prompt tuning for augmenting DR (SPTAR): For each task, we leverage soft prompt-tuning to optimize a task-specific soft prompt on limited ground truth data and then prompt the LLMs to tag unlabeled documents with weak queries, yielding enough weak document-query pairs to train task-specific dense retrievers. We design a filter to select high-quality example document-query pairs in the prompt to further improve the quality of weak tagged queries. To the best of our knowledge, there is no prior work utilizing soft prompt tuning to augment DR models. The experiments demonstrate that SPTAR outperforms the unsupervised baselines BM25 and the recently proposed LLMs-based augmentation method for DR.

augmenting dense retrieval, large language model, machine learning, (3 more...)

arXiv.org Artificial Intelligence

2307.08303

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback