Meng, Xuying
Enhance Graph Alignment for Large Language Models
Luo, Haitong, Meng, Xuying, Wang, Suhang, Zhao, Tianxiang, Wang, Fali, Cao, Hanyun, Zhang, Yujun
Graph-structured data is prevalent in the real world. Recently, owing to their powerful emergent capabilities, Large Language Models (LLMs) have shown promising performance in modeling graphs. The key to effectively applying LLMs to graphs is converting graph data into a format LLMs can comprehend. Graph-to-token approaches are popular for enabling LLMs to process graph information. They transform graphs into sequences of tokens and align them with text tokens through instruction tuning, where self-supervised instruction tuning helps LLMs acquire general knowledge about graphs, and supervised fine-tuning specializes LLMs for downstream tasks on graphs. Despite their initial success, we find that existing methods suffer from a misalignment between self-supervised tasks and supervised downstream tasks, resulting in negative transfer from self-supervised fine-tuning to downstream tasks. To address this issue, we propose Graph Alignment Large Language Models (GALLM), which benefit from aligned task templates. In the self-supervised tuning stage, we introduce a novel text matching task using templates aligned with downstream tasks. In the task-specific tuning stage, we propose two category prompt methods that learn supervision information from additional explanations using further aligned templates. Experimental evaluations on four datasets demonstrate substantial improvements in supervised learning, multi-dataset generalizability, and particularly zero-shot capability, highlighting the model's potential as a graph foundation model.
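Below is a hypothetical illustration of what "aligned templates" could look like: the self-supervised text-matching prompt and the downstream node-classification prompt share the same structure, so knowledge acquired in the first stage transfers to the second. The wording of these templates and the <graph> placeholder are assumptions for illustration, not GALLM's actual prompts.

```python
# Hypothetical sketch of aligned task templates. <graph> stands for the sequence
# of graph tokens fed to the LLM; the phrasing below is illustrative only.

def text_matching_prompt(node_text: str) -> str:
    # Self-supervised stage: match the graph tokens to the node's own text.
    return (f"Given the node <graph>, which of the following descriptions "
            f"matches it?\nDescription: {node_text}\nAnswer:")

def node_classification_prompt(categories: list[str]) -> str:
    # Task-specific stage: same "given the node <graph>, choose ..." structure.
    return (f"Given the node <graph>, which of the following categories "
            f"does it belong to?\nCategories: {', '.join(categories)}\nAnswer:")
```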
Not All Layers of LLMs Are Necessary During Inference
Fan, Siqi, Jiang, Xin, Li, Xiang, Meng, Xuying, Han, Peng, Shang, Shuo, Sun, Aixin, Wang, Yequan, Wang, Zhongyuan
Due to their large number of parameters, the inference phase of Large Language Models (LLMs) is resource-intensive. However, not all requests posed to LLMs are equally difficult to handle. Through analysis, we show that for some tasks, LLMs can achieve results comparable to the final output at some intermediate layers; that is, not all layers of LLMs are necessary during inference. If we can predict the layer at which the inferred result matches the final result (produced by evaluating all layers), we can significantly reduce the inference cost. To this end, we propose a simple yet effective algorithm named AdaInfer that adaptively terminates the inference process for an input instance. AdaInfer relies on easily obtainable statistical features and classic classifiers such as SVM. Experiments on well-known LLMs such as the Llama2 series and OPT show that AdaInfer achieves an average pruning ratio of 17.8%, and up to 43% on sentiment tasks, with nearly no performance drop (<1%). Because AdaInfer does not alter LLM parameters, LLMs equipped with AdaInfer retain their generalizability across tasks.
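The following is a minimal sketch of AdaInfer-style early exit, assuming a decoder-only Hugging Face model whose final LM head is applied to intermediate hidden states and a pre-fitted scikit-learn classifier `exit_clf`; the per-layer features (top probability and the gap to the second-best probability) are illustrative stand-ins for the paper's statistical features.

```python
# Illustrative sketch of AdaInfer-style early exit (not the authors' released code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def layer_features(hidden, lm_head):
    """Cheap statistics of the next-token distribution at one layer."""
    logits = lm_head(hidden[:, -1, :])            # project last position to vocab
    probs = torch.softmax(logits, dim=-1)
    top2 = torch.topk(probs, 2, dim=-1).values
    return torch.stack([top2[:, 0], top2[:, 0] - top2[:, 1]], dim=-1)  # [top1, gap]

@torch.no_grad()
def adaptive_next_token(text, exit_clf):
    ids = tok(text, return_tensors="pt").input_ids
    out = model(ids, output_hidden_states=True)
    for layer_idx, hidden in enumerate(out.hidden_states[1:], start=1):
        feats = layer_features(hidden, model.lm_head).numpy()
        if exit_clf.predict(feats)[0] == 1:        # classifier: this layer suffices
            logits = model.lm_head(hidden[:, -1, :])
            return layer_idx, int(logits.argmax(-1))
    return len(out.hidden_states) - 1, int(out.logits[:, -1, :].argmax(-1))
```

Note that this sketch still runs the full forward pass for brevity; a real implementation would execute the transformer blocks incrementally and stop as soon as the classifier fires, which is where the compute savings come from.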
FLM-101B: An Open LLM and How to Train It with $100K Budget
Li, Xiang, Yao, Yiqun, Jiang, Xin, Fang, Xuezhi, Meng, Xuying, Fan, Siqi, Han, Peng, Li, Jing, Du, Li, Qin, Bowen, Zhang, Zheng, Sun, Aixin, Wang, Yequan
Large language models (LLMs) have achieved remarkable success in NLP and multimodal tasks, among others. Despite these successes, two main challenges remain in developing LLMs: (i) high computational cost, and (ii) fair and objective evaluation. In this paper, we report a solution that significantly reduces LLM training cost through a growth strategy. We demonstrate that a 101B-parameter LLM can be trained on 0.31T tokens with a budget of 100K US dollars. Inspired by IQ tests, we also consolidate an additional range of evaluations on top of existing evaluations that focus on knowledge-oriented abilities. These IQ evaluations cover symbolic mapping, rule understanding, pattern mining, and anti-interference, and they minimize the potential impact of memorization. Experimental results show that our model, named FLM-101B, trained with a budget of 100K US dollars, achieves performance comparable to powerful and well-known models such as GPT-3 and GLM-130B, especially on the additional range of IQ evaluations. The checkpoint of FLM-101B is released at https://huggingface.co/CofeAI/FLM-101B.
FreeLM: Fine-Tuning-Free Language Model
Li, Xiang, Jiang, Xin, Meng, Xuying, Sun, Aixin, Wang, Yequan
Pre-trained language models (PLMs) have achieved remarkable success in NLP tasks. Despite this success, mainstream solutions largely follow the pre-training-then-fine-tuning paradigm, which incurs both high deployment costs and low training efficiency. Nevertheless, fine-tuning on a specific task is essential because PLMs are pre-trained only with a language signal from large amounts of raw data. In this paper, we propose a novel fine-tuning-free strategy for language models that considers both a language signal and a teacher signal. The teacher signal is an abstraction of a battery of downstream tasks, provided in a unified proposition format. Trained with both language and strong task-aware teacher signals in an interactive manner, our FreeLM model demonstrates strong generalization and robustness. In experiments, FreeLM outperforms large models such as GPT-3 and InstructGPT on a range of language understanding tasks. FreeLM is much smaller, with 0.3B parameters compared to 175B in these models.
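As a rough illustration of the unified proposition format, the sketch below rewrites downstream examples as natural-language propositions paired with true/false labels, so many tasks can supervise one model in a single format; the specific wordings are assumptions, not FreeLM's actual templates.

```python
# Hypothetical sketch of a "unified proposition format" for the teacher signal:
# each downstream example becomes (proposition text, truth label).

def sentiment_proposition(sentence: str, label: str) -> tuple[str, bool]:
    return (f'The sentiment of "{sentence}" is positive.', label == "positive")

def nli_proposition(premise: str, hypothesis: str, label: str) -> tuple[str, bool]:
    return (f'"{premise}" entails "{hypothesis}".', label == "entailment")

teacher_data = [
    sentiment_proposition("The movie was a delight.", "positive"),               # True
    nli_proposition("A man is cooking.", "A person prepares food.", "entailment"),  # True
]
```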
GCRE-GPT: A Generative Model for Comparative Relation Extraction
Wang, Yequan, Zhang, Hengran, Sun, Aixin, Meng, Xuying
Given comparative text, comparative relation extraction aims to extract the two targets in comparison (e.g., two cameras) and the aspect they are compared on (e.g., image quality). The extracted comparative relations form the basis of further opinion analysis. Existing solutions formulate this task as a sequence labeling task to extract targets and aspects; however, they cannot directly extract comparative relation(s) from text. In this paper, we show that comparative relations can be directly extracted with high accuracy by a generative model. Based on GPT-2, we propose a Generation-based Comparative Relation Extractor (GCRE-GPT). Experimental results show that GCRE-GPT achieves state-of-the-art accuracy on two datasets.
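A minimal sketch of generation-based extraction in this spirit is shown below, assuming GPT-2 fine-tuned on (sentence, linearized relation) pairs; the prompt and the linearized output format are assumptions, not the paper's exact scheme.

```python
# Illustrative sketch of generation-based comparative relation extraction.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()   # fine-tuned weights in practice

def extract_relations(sentence: str) -> str:
    prompt = f"sentence: {sentence}\nrelations:"          # hypothetical prompt format
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=48, num_beams=4,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True)

# After fine-tuning, one would expect output roughly like:
#   "subject: camera A | object: camera B | aspect: image quality | preferred: camera A"
print(extract_relations("Camera A produces sharper photos than camera B."))
```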
Perplexity from PLM Is Unreliable for Evaluating Text Quality
Wang, Yequan, Deng, Jiawen, Sun, Aixin, Meng, Xuying
Recently, many works have used perplexity (PPL) to evaluate the quality of generated text, assuming that a smaller PPL indicates better quality (i.e., fluency) of the text being evaluated. However, we find that PPL is an unqualified referee and cannot evaluate generated text fairly, for the following reasons: (i) the PPL of short text is larger than that of long text, which goes against common sense; (ii) repeated text spans can damage the reliability of PPL; and (iii) punctuation marks can heavily affect PPL. Experiments show that PPL is unreliable for evaluating the quality of a given text. Finally, we discuss the key problems with evaluating text quality using language models.
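For reference, the measurement under discussion can be reproduced in a few lines: the perplexity of a text under a PLM such as GPT-2 is the exponential of its mean token-level cross-entropy. The model choice and example texts below are illustrative only.

```python
# Minimal sketch of the PPL measurement the paper critiques.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss            # mean cross-entropy over tokens
    return torch.exp(loss).item()

# Short vs. long fluent text: the shorter one often receives the larger PPL,
# illustrating point (i) in the abstract.
print(perplexity("The cat sat."))
print(perplexity("The cat sat quietly on the warm windowsill in the afternoon sun."))
```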
PoKE: Prior Knowledge Enhanced Emotional Support Conversation with Latent Variable
Xu, Xiaohan, Meng, Xuying, Wang, Yequan
The emotional support conversation (ESC) task uses various support strategies to help people relieve emotional distress and overcome the problems they face, and it has attracted much attention in recent years. However, most state-of-the-art works rely heavily on external commonsense knowledge to infer the user's mental state in every dialogue round. Although effective, they require significant human effort and suffer from knowledge updates and domain changes in the long run. Therefore, in this article, we focus on exploring the task itself without using any external knowledge. We find that all existing works ignore two significant characteristics of ESC. (a) Abundant prior knowledge exists in historical conversations, such as responses to similar cases and the general order of support strategies, which has great reference value for the current conversation. (b) There is a one-to-many mapping between context and support strategy, i.e., multiple strategies are reasonable for a single context, which lays a foundation for diverse generation. Taking these two key factors into account, we propose PoKE, a Prior Knowledge Enhanced emotional support model with a latent variable. The proposed model fully taps the potential of prior knowledge in terms of exemplars and strategy sequences, and then uses a latent variable to model the one-to-many relationship between context and strategy. Furthermore, we introduce a memory schema to incorporate the encoded knowledge into the decoder. Experimental results on a benchmark dataset show that PoKE outperforms existing baselines on both automatic and human evaluation. Compared with models using external knowledge, PoKE still achieves slight improvements on some metrics. Further experiments show that abundant prior knowledge is conducive to high-quality emotional support, and a well-learned latent variable is critical to the diversity of generations.
Peeking the Impact of Points of Interests on Didi
Tian, Yonghong, Li, Zeyu, Xu, Zhiwei, Meng, Xuying, Zheng, Bing
Recently, the online car-hailing service Didi has emerged as a leader in the sharing economy. With extensive use by passengers and drivers, it becomes increasingly important for car-hailing service providers to minimize passenger waiting time and optimize vehicle utilization, thereby improving the overall user experience. Supply-demand estimation is therefore an indispensable ingredient of an efficient online car-hailing service. To improve the accuracy of the estimation, we analyze in this paper the implicit relationships between points of interest (POIs) and the supply-demand gap. Since different categories of POIs have positive or negative effects on the estimation, we propose a POI selection scheme and incorporate it into XGBoost [1] to achieve more accurate estimation results. Our experiments demonstrate that our method provides more accurate and more stable estimation results than existing methods.
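A minimal sketch of such a pipeline is given below, with a correlation-based stand-in for the POI selection scheme (the paper's actual scheme may differ) and hypothetical feature names.

```python
# Illustrative sketch of POI-aware supply-demand gap estimation with XGBoost.
import pandas as pd
import xgboost as xgb

def select_poi_columns(df: pd.DataFrame, gap: pd.Series, threshold: float = 0.05):
    """Keep POI-category counts whose absolute correlation with the gap is non-trivial."""
    poi_cols = [c for c in df.columns if c.startswith("poi_")]
    return [c for c in poi_cols if abs(df[c].corr(gap)) >= threshold]

def train_gap_model(df: pd.DataFrame, gap: pd.Series):
    base_cols = ["district_id", "time_slot", "weather", "traffic_level"]  # hypothetical
    cols = base_cols + select_poi_columns(df, gap)
    model = xgb.XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.1)
    model.fit(df[cols], gap)          # regress the supply-demand gap on selected features
    return model, cols
```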
Exploiting Emotion on Reviews for Recommender Systems
Meng, Xuying (Institute of Computing Technology, Chinese Academy of Sciences) | Wang, Suhang (Arizona State University) | Liu, Huan (Arizona State University) | Zhang, Yujun (Institute of Computing Technology, Chinese Academy of Sciences)
Review history is widely used by recommender systems to infer users' preferences and discover potential interests from huge volumes of data, yet its inadequacy also raises great concerns about the sparsity and cold-start problems. Research in psychology and sociology has shown that emotion information is a strong indicator of users' preferences. Meanwhile, with the fast development of online services, users are willing to express their emotions on others' reviews, which makes emotion information pervasively available. Moreover, recent research shows that the number of emotions expressed on reviews is typically much larger than the number of reviews. Therefore, incorporating emotions on reviews may help alleviate the data sparsity and cold-start problems of recommender systems. In this paper, we provide a principled and mathematical way to exploit both positive and negative emotions on reviews, and propose MIRROR, a novel framework exploiting eMotIon on Reviews for RecOmmendeR systems from both global and local perspectives. Empirical results on real-world datasets demonstrate the effectiveness of our proposed framework, and further experiments are conducted to understand how emotions on reviews contribute to the proposed framework.
Personalized Privacy-Preserving Social Recommendation
Meng, Xuying (Institute of Computing Technology, Chinese Academy of Sciences) | Wang, Suhang (Arizona State University) | Shu, Kai (Arizona State University) | Li, Jundong (Arizona State University) | Chen, Bo (Michigan Technological University) | Liu, Huan (Arizona State University) | Zhang, Yujun (Institute of Computing Technology, Chinese Academy of Sciences)
Privacy leakage is an important issue for social recommendation. Existing privacy-preserving social recommendation approaches usually allow the recommender to fully control users' information. This may be problematic since the recommender itself may be untrusted, leading to serious privacy leakage. Besides, building social relationships requires sharing interests as well as other private information, which may lead to further privacy leakage. Although users are sometimes allowed to hide their sensitive private data through privacy settings, the data being shared can still be abused by adversaries to infer sensitive private information. Supporting social recommendation with the least privacy leakage to an untrusted recommender and other users (i.e., friends) is an important yet challenging problem. In this paper, we aim to achieve privacy-preserving social recommendation under personalized privacy settings. We propose PrivSR, a novel framework for privacy-preserving social recommendation in which users can model ratings and social relationships privately. Meanwhile, by allocating different noise magnitudes to personalized sensitive and non-sensitive ratings, we protect users' privacy against the untrusted recommender and friends. Theoretical analysis and experimental evaluation on real-world datasets demonstrate that our framework can protect users' privacy while retaining the effectiveness of the underlying recommender system.
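A minimal sketch of the personalized-noise idea is given below: ratings are perturbed on the user's side, with a larger noise magnitude for ratings the user marks as sensitive. The Laplace mechanism and the two privacy budgets are illustrative assumptions, not PrivSR's exact mechanism or privacy accounting.

```python
# Illustrative sketch of personalized noise magnitudes for sensitive vs. non-sensitive ratings.
import numpy as np

rng = np.random.default_rng(0)

def perturb_ratings(ratings: np.ndarray, sensitive_mask: np.ndarray,
                    eps_sensitive: float = 0.5, eps_nonsensitive: float = 2.0,
                    rating_range: float = 4.0) -> np.ndarray:
    """Add Laplace noise with scale rating_range / epsilon; a smaller epsilon
    (used for sensitive ratings) means more noise and stronger protection."""
    scales = np.where(sensitive_mask,
                      rating_range / eps_sensitive,
                      rating_range / eps_nonsensitive)
    noisy = ratings + rng.laplace(0.0, scales)
    return np.clip(noisy, 1.0, 5.0)   # keep perturbed ratings in the valid range

# Example: a user with five ratings, the first two marked sensitive.
print(perturb_ratings(np.array([5.0, 4.0, 3.0, 2.0, 5.0]),
                      np.array([True, True, False, False, False])))
```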