AITopics | Jiang, Meng

Collaborating Authors

Jiang, Meng

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning

Tan, Zhaoxuan, Zeng, Qingkai, Tian, Yijun, Liu, Zheyuan, Yin, Bing, Jiang, Meng

arXiv.org Artificial IntelligenceFeb-6-2024

Personalization in large language models (LLMs) is increasingly important, aiming to align LLM's interactions, content, and recommendations with individual user preferences. Recent advances in LLM personalization have spotlighted effective prompt design, by enriching user queries with non-parametric knowledge through behavior history retrieval and textual profiles. However, these approaches were limited due to a lack of model ownership, resulting in constrained customization and privacy issues. Moreover, they often failed to accurately capture user behavior patterns, especially in cases where user data were complex and dynamic. To address these shortcomings, we introduce One PEFT Per User (OPPU), which employs personalized parameter-efficient fine-tuning (PEFT) modules, to store user-specific behavior patterns and preferences. By plugging in users' personal PEFT parameters, they can own and use their LLMs personally. OPPU integrates parametric user knowledge in the personal PEFT parameters with the non-parametric knowledge acquired through retrieval and profile. This integration adapts individual LLMs to user behavior shifts. Experimental results demonstrate that OPPU significantly outperforms existing prompt-based methods across seven diverse tasks in the LaMP benchmark. Further in-depth studies reveal OPPU's enhanced capabilities in handling user behavior shifts, modeling users at different active levels, maintaining robustness across various user history formats, and displaying versatility with different PEFT methods.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2402.04401

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Large Language Models on Graphs: A Comprehensive Survey

Jin, Bowen, Liu, Gang, Han, Chi, Jiang, Meng, Ji, Heng, Han, Jiawei

arXiv.org Artificial IntelligenceFeb-1-2024

Large language models (LLMs), such as GPT4 and LLaMA, are creating significant advancements in natural language processing, due to their strong text encoding/decoding ability and newly found emergent capability (e.g., reasoning). While LLMs are mainly designed to process pure texts, there are many real-world scenarios where text data is associated with rich structure information in the form of graphs (e.g., academic networks, and e-commerce networks) or scenarios where graph data is paired with rich textual information (e.g., molecules with descriptions). Besides, although LLMs have shown their pure text-based reasoning ability, it is underexplored whether such ability can be generalized to graphs (i.e., graph-based reasoning). In this paper, we provide a systematic review of scenarios and techniques related to large language models on graphs. We first summarize potential scenarios of adopting LLMs on graphs into three categories, namely pure graphs, text-attributed graphs, and text-paired graphs. We then discuss detailed techniques for utilizing LLMs on graphs, including LLM as Predictor, LLM as Encoder, and LLM as Aligner, and compare the advantages and disadvantages of different schools of models. Furthermore, we discuss the real-world applications of such methods and summarize open-source codes and benchmark datasets. Finally, we conclude with potential future research directions in this fast-growing field. The related source can be found at https://github.com/PeterGriffinJin/Awesome-Language-Model-on-Graphs.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2312.02783

Country: North America > United States (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Materials > Chemicals (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

TrustLLM: Trustworthiness in Large Language Models

Sun, Lichao, Huang, Yue, Wang, Haoran, Wu, Siyuan, Zhang, Qihui, Gao, Chujie, Huang, Yixin, Lyu, Wenhan, Zhang, Yixuan, Li, Xiner, Liu, Zhengliang, Liu, Yixin, Wang, Yijue, Zhang, Zhikun, Kailkhura, Bhavya, Xiong, Caiming, Xiao, Chaowei, Li, Chunyuan, Xing, Eric, Huang, Furong, Liu, Hao, Ji, Heng, Wang, Hongyi, Zhang, Huan, Yao, Huaxiu, Kellis, Manolis, Zitnik, Marinka, Jiang, Meng, Bansal, Mohit, Zou, James, Pei, Jian, Liu, Jian, Gao, Jianfeng, Han, Jiawei, Zhao, Jieyu, Tang, Jiliang, Wang, Jindong, Mitchell, John, Shu, Kai, Xu, Kaidi, Chang, Kai-Wei, He, Lifang, Huang, Lifu, Backes, Michael, Gong, Neil Zhenqiang, Yu, Philip S., Chen, Pin-Yu, Gu, Quanquan, Xu, Ran, Ying, Rex, Ji, Shuiwang, Jana, Suman, Chen, Tianlong, Liu, Tianming, Zhou, Tianyi, Wang, William, Li, Xiang, Zhang, Xiangliang, Wang, Xiao, Xie, Xing, Chen, Xun, Wang, Xuyu, Liu, Yan, Ye, Yanfang, Cao, Yinzhi, Chen, Yong, Zhao, Yue

arXiv.org Artificial IntelligenceJan-25-2024

Large language models (LLMs), exemplified by ChatGPT, have gained considerable attention for their excellent natural language processing capabilities. Nonetheless, these LLMs present many challenges, particularly in the realm of trustworthiness. Therefore, ensuring the trustworthiness of LLMs emerges as an important topic. This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TrustLLM, consisting of over 30 datasets. Our findings firstly show that in general trustworthiness and utility (i.e., functional effectiveness) are positively related. Secondly, our observations reveal that proprietary LLMs generally outperform most open-source counterparts in terms of trustworthiness, raising concerns about the potential risks of widely accessible open-source LLMs. However, a few open-source LLMs come very close to proprietary ones. Thirdly, it is important to note that some LLMs may be overly calibrated towards exhibiting trustworthiness, to the extent that they compromise their utility by mistakenly treating benign prompts as harmful and consequently not responding. Finally, we emphasize the importance of ensuring transparency not only in the models themselves but also in the technologies that underpin trustworthiness. Knowing the specific trustworthy technologies that have been employed is crucial for analyzing their effectiveness.

large language model, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2401.05561

Country:

Europe (1.00)
Asia > Middle East (0.67)
North America > United States > California (0.67)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Personal > Interview (0.92)

Industry:

Media > News (1.00)
Leisure & Entertainment (1.00)
Law > Civil Rights & Constitutional Law (1.00)
(13 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Inverse Molecular Design with Multi-Conditional Diffusion Guidance

Liu, Gang, Xu, Jiaxin, Luo, Tengfei, Jiang, Meng

arXiv.org Artificial IntelligenceJan-24-2024

Inverse molecular design with diffusion models holds great potential for advancements in material and drug discovery. Despite success in unconditional molecule generation, integrating multiple properties such as synthetic score and gas permeability as condition constraints into diffusion models remains unexplored. We introduce multi-conditional diffusion guidance. The proposed Transformer-based denoising model has a condition encoder that learns the representations of numerical and categorical conditions. The denoising model, consisting of a structure encoder-decoder, is trained for denoising under the representation of conditions. The diffusion process becomes graph-dependent to accurately estimate graph-related noise in molecules, unlike the previous models that focus solely on the marginal distributions of atoms or bonds. We extensively validate our model for multi-conditional polymer and small molecule generation. Results demonstrate our superiority across metrics from distribution learning to condition control for molecular properties. An inverse polymer design task for gas separation with feedback from domain experts further demonstrates its practical utility.

artificial intelligence, machine learning, polymer, (20 more...)

arXiv.org Artificial Intelligence

2401.13858

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area (0.51)
Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

Enabling On-Device Large Language Model Personalization with Self-Supervised Data Selection and Synthesis

Qin, Ruiyang, Xia, Jun, Jia, Zhenge, Jiang, Meng, Abbasi, Ahmed, Zhou, Peipei, Hu, Jingtong, Shi, Yiyu

arXiv.org Artificial IntelligenceJan-13-2024

After a large language model (LLM) is deployed on edge devices, it is desirable for these devices to learn from user-generated conversation data to generate user-specific and personalized responses in real-time. However, user-generated data usually contains sensitive and private information, and uploading such data to the cloud for annotation is not preferred if not prohibited. While it is possible to obtain annotation locally by directly asking users to provide preferred responses, such annotations have to be sparse to not affect user experience. In addition, the storage of edge devices is usually too limited to enable large-scale fine-tuning with full user-generated data. It remains an open question how to enable on-device LLM personalization, considering sparse annotation and limited on-device storage. In this paper, we propose a novel framework to select and store the most representative data online in a self-supervised way. Such data has a small memory footprint and allows infrequent requests of user annotations for further fine-tuning. To enhance fine-tuning quality, multiple semantically similar pairs of question texts and expected responses are generated using the LLM. Our experiments show that the proposed framework achieves the best user-specific content-generating capability (accuracy) and fine-tuning speed (performance) compared with vanilla baselines. To the best of our knowledge, this is the very first on-device LLM personalization framework.

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2311.12275

Country:

North America > United States (0.15)
Europe > Italy (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

User Modeling in the Era of Large Language Models: Current Research and Future Directions

Tan, Zhaoxuan, Jiang, Meng

arXiv.org Artificial IntelligenceDec-23-2023

User modeling (UM) aims to discover patterns or learn representations from user data about the characteristics of a specific user, such as profile, preference, and personality. The user models enable personalization and suspiciousness detection in many online applications such as recommendation, education, and healthcare. Two common types of user data are text and graph, as the data usually contain a large amount of user-generated content (UGC) and online interactions. The research of text and graph mining is developing rapidly, contributing many notable solutions in the past two decades. Recently, large language models (LLMs) have shown superior performance on generating, understanding, and even reasoning over text data. The approaches of user modeling have been equipped with LLMs and soon become outstanding. This article summarizes existing research about how and why LLMs are great tools of modeling and understanding UGC. Then it reviews a few categories of large language models for user modeling (LLM-UM) approaches that integrate the LLMs with text and graph-based methods in different ways. Then it introduces specific LLM-UM techniques for a variety of UM applications. Finally, it presents remaining challenges and future directions in the LLM-UM research. We maintain the reading list at: https://github.com/TamSiuhin/LLM-UM-Reading

arxiv preprint arxiv, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2312.11518

Country:

North America > United States (0.27)
Europe (0.27)
Asia (0.27)

Genre:

Overview (1.00)
Instructional Material (0.92)
Research Report > New Finding (0.92)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Law (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Get an A in Math: Progressive Rectification Prompting

Wu, Zhenyu, Jiang, Meng, Shen, Chao

arXiv.org Artificial IntelligenceDec-11-2023

Chain-of-Thought (CoT) prompting methods have enabled large language models (LLMs) to generate reasoning paths and solve math word problems (MWPs). However, they are sensitive to mistakes in the paths, as any mistake can result in an incorrect answer. We propose a novel method named Progressive Rectification Prompting (PRP) to improve average accuracy on eight MWP datasets from 77.3 to 90.5. Given an initial answer from CoT, PRP iterates a verify-then-rectify process to progressively identify incorrect answers and rectify the reasoning paths. With the most likely correct answer, the LLM predicts a masked numerical value in the question; if the prediction does not match the masked value, the answer is likely incorrect. Then the LLM is prompted to re-generate the reasoning path hinted with a set of incorrect answers to prevent itself from repeating previous mistakes. PRP achieves the best performance compared against the CoT methods. Our implementation is made publicly available at https://wzy6642.github.io/prp.github.io/.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2312.06867

Country:

Europe (0.92)
North America > United States > Maryland (0.14)
Asia > Middle East > Qatar (0.14)
Asia > China > Shaanxi Province (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Pre-training Language Models for Comparative Reasoning

Yu, Mengxia, Zhang, Zhihan, Yu, Wenhao, Jiang, Meng

arXiv.org Artificial IntelligenceNov-28-2023

Comparative reasoning is a process of comparing objects, concepts, or entities to draw conclusions, which constitutes a fundamental cognitive ability. In this paper, we propose a novel framework to pre-train language models for enhancing their abilities of comparative reasoning over texts. While there have been approaches for NLP tasks that require comparative reasoning, they suffer from costly manual data labeling and limited generalizability to different tasks. Our approach introduces a novel method of collecting scalable data for text-based entity comparison, which leverages both structured and unstructured data. Moreover, we present a framework of pre-training language models via three novel objectives on comparative reasoning. Evaluation on downstream tasks including comparative question answering, question generation, and summarization shows that our pre-training framework significantly improves the comparative reasoning abilities of language models, especially under low-resource conditions. This work also releases the first integrated benchmark for comparative reasoning.

large language model, machine learning, question answering, (20 more...)

arXiv.org Artificial Intelligence

2305.14457

Country:

Europe (1.00)
North America > United States > Colorado (0.14)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

Add feedback

PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning

Zhang, Zhihan, Lee, Dong-Ho, Fang, Yuwei, Yu, Wenhao, Jia, Mengzhao, Jiang, Meng, Barbieri, Francesco

arXiv.org Artificial IntelligenceNov-15-2023

Instruction tuning has remarkably advanced large language models (LLMs) in understanding and responding to diverse human instructions. Despite the success in high-resource languages, its application in lower-resource ones faces challenges due to the imbalanced foundational abilities of LLMs across different languages, stemming from the uneven language distribution in their pre-training data. To tackle this issue, we propose pivot language guided generation (PLUG), an approach that utilizes a high-resource language, primarily English, as the pivot to enhance instruction tuning in lower-resource languages. It trains the model to first process instructions in the pivot language, and then produce responses in the target language. To evaluate our approach, we introduce a benchmark, X-AlpacaEval, of instructions in 4 languages (Chinese, Korean, Italian, and Spanish), each annotated by professional translators. Our approach demonstrates a significant improvement in the instruction-following abilities of LLMs by 29% on average, compared to directly responding in the target language alone. Further experiments validate the versatility of our approach by employing alternative pivot languages beyond English to assist languages where LLMs exhibit lower proficiency.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2311.08711

Country:

North America > United States > California (0.28)
Europe > France > Île-de-France > Paris > Paris (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Explaining Tree Model Decisions in Natural Language for Network Intrusion Detection

Ziems, Noah, Liu, Gang, Flanagan, John, Jiang, Meng

arXiv.org Artificial IntelligenceOct-30-2023

Network intrusion detection (NID) systems which leverage machine learning have been shown to have strong performance in practice when used to detect malicious network traffic. Decision trees in particular offer a strong balance between performance and simplicity, but require users of NID systems to have background knowledge in machine learning to interpret. In addition, they are unable to provide additional outside information as to why certain features may be important for classification. In this work, we explore the use of large language models (LLMs) to provide explanations and additional background knowledge for decision tree NID systems. Further, we introduce a new human evaluation framework for decision tree explanations, which leverages automatically generated quiz questions that measure human evaluators' understanding of decision tree inference. Finally, we show LLM generated decision tree explanations correlate highly with human ratings of readability, quality, and use of background knowledge while simultaneously providing better understanding of decision boundaries.

explanation, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2310.19658

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback