AITopics

2410.1627

Country: North America > United States (0.48)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJun-24-2024

ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models

Zhao, Haiquan, Li, Lingyu, Chen, Shisong, Kong, Shuqi, Wang, Jiaan, Huang, Kexin, Gu, Tianle, Wang, Yixu, Liang, Dandan, Li, Zhixu, Teng, Yan, Xiao, Yanghua, Wang, Yingchun

Emotion Support Conversation (ESC) is a crucial application, which aims to reduce human stress, offer emotional guidance, and ultimately enhance human mental and physical well-being. With the advancement of Large Language Models (LLMs), many researchers have employed LLMs as the ESC models. However, the evaluation of these LLM-based ESCs remains uncertain. Inspired by the awesome development of role-playing agents, we propose an ESC Evaluation framework (ESC-Eval), which uses a role-playing agent to interact with ESC models, followed by a manual evaluation of the interactive dialogues. In detail, we first re-organize 2,801 role-playing cards from seven existing datasets to define the roles of the role-playing agent. Second, we train a specific role-playing model called ESC-Role which behaves more like a confused person than GPT-4. Third, through ESC-Role and organized role cards, we systematically conduct experiments using 14 LLMs as the ESC models, including general AI-assistant LLMs (ChatGPT) and ESC-oriented LLMs (ExTES-Llama). We conduct comprehensive human annotations on interactive multi-turn dialogues of different ESC models. The results show that ESC-oriented LLMs exhibit superior ESC abilities compared to general AI-assistant LLMs, but there is still a gap behind human performance. Moreover, to automate the scoring process for future ESC models, we developed ESC-RANK, which trained on the annotated data, achieving a scoring performance surpassing 35 points of GPT-4. Our data and code are available at https://github.com/haidequanbu/ESC-Eval.

large language model, machine learning, natural language, (19 more...)

2406.14952

Country:

Asia (0.14)
Europe > Spain (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.67)
Health & Medicine > Consumer Health (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.78)

arXiv.org Artificial IntelligenceJun-13-2024

MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models

Gu, Tianle, Zhou, Zeyang, Huang, Kexin, Liang, Dandan, Wang, Yixu, Zhao, Haiquan, Yao, Yuanqi, Qiao, Xingge, Wang, Keqing, Yang, Yujiu, Teng, Yan, Qiao, Yu, Wang, Yingchun

Powered by remarkable advancements in Large Language Models (LLMs), Multimodal Large Language Models (MLLMs) demonstrate impressive capabilities in manifold tasks. However, the practical application scenarios of MLLMs are intricate, exposing them to potential malicious instructions and thereby posing safety risks. While current benchmarks do incorporate certain safety considerations, they often lack comprehensive coverage and fail to exhibit the necessary rigor and robustness. For instance, the common practice of employing GPT-4V as both the evaluator and a model to be evaluated lacks credibility, as it tends to exhibit a bias toward its own responses. In this paper, we present MLLMGuard, a multidimensional safety evaluation suite for MLLMs, including a bilingual image-text evaluation dataset, inference utilities, and a lightweight evaluator. MLLMGuard's assessment comprehensively covers two languages (English and Chinese) and five important safety dimensions (Privacy, Bias, Toxicity, Truthfulness, and Legality), each with corresponding rich subtasks. Focusing on these dimensions, our evaluation dataset is primarily sourced from platforms such as social media, and it integrates text-based and image-based red teaming techniques with meticulous annotation by human experts. This can prevent inaccurate evaluation caused by data leakage when using open-source datasets and ensures the quality and challenging nature of our benchmark. Additionally, a fully automated lightweight evaluator termed GuardRank is developed, which achieves significantly higher evaluation accuracy than GPT-4. Our evaluation results across 13 advanced models indicate that MLLMs still have a substantial journey ahead before they can be considered safe and responsible.

large language model, machine learning, natural language, (19 more...)

2406.07594

Country:

North America > United States (0.45)
Asia > China (0.28)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceMar-3-2024

OVEL: Large Language Model as Memory Manager for Online Video Entity Linking

Zhao, Haiquan, Wang, Xuwu, Chen, Shisong, Li, Zhixu, Zheng, Xin, Xiao, Yanghua

In recent years, multi-modal entity linking (MEL) has garnered increasing attention in the research community due to its significance in numerous multi-modal applications. Video, as a popular means of information transmission, has become prevalent in people's daily lives. However, most existing MEL methods primarily focus on linking textual and visual mentions or offline videos's mentions to entities in multi-modal knowledge bases, with limited efforts devoted to linking mentions within online video content. In this paper, we propose a task called Online Video Entity Linking OVEL, aiming to establish connections between mentions in online videos and a knowledge base with high accuracy and timeliness. To facilitate the research works of OVEL, we specifically concentrate on live delivery scenarios and construct a live delivery entity linking dataset called LIVE. Besides, we propose an evaluation metric that considers timelessness, robustness, and accuracy. Furthermore, to effectively handle OVEL task, we leverage a memory block managed by a Large Language Model and retrieve entity candidates from the knowledge base to augment LLM performance on memory management. The experimental results prove the effectiveness and efficiency of our method.

information, large language model, machine learning, (17 more...)

2403.01411

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningNov-23-2017

Robustness of Maximum Correntropy Estimation Against Large Outliers

Chen, Badong, Xing, Lei, Zhao, Haiquan, Xu, Bin, Principe, Jose C.

The maximum correntropy criterion (MCC) has recently been successfully applied in robust regression, classification and adaptive filtering, where the correntropy is maximized instead of minimizing the well-known mean square error (MSE) to improve the robustness with respect to outliers (or impulsive noises). Considerable efforts have been devoted to develop various robust adaptive algorithms under MCC, but so far little insight has been gained as to how the optimal solution will be affected by outliers. In this work, we study this problem in the context of parameter estimation for a simple linear errors-in-variables (EIV) model where all variables are scalar. Under certain conditions, we derive an upper bound on the absolute value of the estimation error and show that the optimal solution under MCC can be very close to the true value of the unknown parameter even with outliers (whose values can be arbitrarily large) in both input and output variables. Illustrative examples are presented to verify and clarify the theory.

artificial intelligence, data mining, mcc, (14 more...)

1703.08065

Country: Asia > China (0.47)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.58)

arXiv.org Machine LearningAug-1-2016

Kernel Risk-Sensitive Loss: Definition, Properties and Application to Robust Adaptive Filtering

Chen, Badong, Xing, Lei, Xu, Bin, Zhao, Haiquan, Zheng, Nanning, Principe, Jose C.

Nonlinear similarity measures defined in kernel space, such as correntropy, can extract higher-order statistics of data and offer potentially significant performance improvement over their linear counterparts especially in non-Gaussian signal processing and machine learning. In this work, we propose a new similarity measure in kernel space, called the kernel risk-sensitive loss (KRSL), and provide some important properties. We apply the KRSL to adaptive filtering and investigate the robustness, and then develop the MKRSL algorithm and analyze the mean square convergence performance. Compared with correntropy, the KRSL can offer a more efficient performance surface, thereby enabling a gradient based method to achieve faster convergence speed and higher accuracy while still maintaining the robustness to outliers. Theoretical analysis results and superior performance of the new algorithm are confirmed by simulation.

algorithm, artificial intelligence, machine learning, (16 more...)

doi: 10.1109/TSP.2017.2669903

1608.00441

Country: Asia > China (0.47)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

arXiv.org Machine LearningFeb-3-2016

Diffusion Maximum Correntropy Criterion Algorithms for Robust Distributed Estimation

Ma, Wentao, Chen, Badong, Duan, Jiandong, Zhao, Haiquan

Robust diffusion adaptive estimation algorithms based on the maximum correntropy criterion (MCC), including adaptation to combination MCC and combination to adaptation MCC, are developed to deal with the distributed estimation over network in impulsive (long-tailed) noise environments. The cost functions used in distributed estimation are in general based on the mean square error (MSE) criterion, which is desirable when the measurement noise is Gaussian. In non-Gaussian situations, such as the impulsive-noise case, MCC based methods may achieve much better performance than the MSE methods as they take into account higher order statistics of error distribution. The proposed methods can also outperform the robust diffusion least mean p-power(DLMP) and diffusion minimum error entropy (DMEE) algorithms. The mean and mean square convergence analysis of the new algorithms are also carried out.

algorithm, artificial intelligence, data mining, (16 more...)

1508.01903

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications (0.68)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Machine LearningSep-15-2015

Maximum Correntropy Kalman Filter

Chen, Badong, Liu, Xi, Zhao, Haiquan, Príncipe, José C.

Traditional Kalman filter (KF) is derived under the well-known minimum mean square error (MMSE) criterion, which is optimal under Gaussian assumption. However, when the signals are non-Gaussian, especially when the system is disturbed by some heavy-tailed impulsive noises, the performance of KF will deteriorate seriously. To improve the robustness of KF against impulsive noises, we propose in this work a new Kalman filter, called the maximum correntropy Kalman filter (MCKF), which adopts the robust maximum correntropy criterion (MCC) as the optimality criterion, instead of using the MMSE. Similar to the traditional KF, the state mean and covariance matrix propagation equations are used to give prior estimations of the state and covariance matrix in MCKF. A novel fixed-point algorithm is then used to update the posterior estimations. A sufficient condition that guarantees the convergence of the fixed-point algorithm is given. Illustration examples are presented to demonstrate the effectiveness and robustness of the new algorithm.

algorithm, artificial intelligence, machine learning, (16 more...)

1509.0458

Country:

North America > United States > New York (0.14)
North America > United States > Florida > Alachua County > Gainesville (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Communications > Networks > Sensor Networks (0.46)

arXiv.org Machine LearningApr-11-2015

Generalized Correntropy for Robust Adaptive Filtering

Chen, Badong, Xing, Lei, Zhao, Haiquan, Zheng, Nanning, Príncipe, José C.

As a robust nonlinear similarity measure in kernel space, correntropy has received increasing attention in domains of machine learning and signal processing. In particular, the maximum correntropy criterion (MCC) has recently been successfully applied in robust regression and filtering. The default kernel function in correntropy is the Gaussian kernel, which is, of course, not always the best choice. In this work, we propose a generalized correntropy that adopts the generalized Gaussian density (GGD) function as the kernel (not necessarily a Mercer kernel), and present some important properties. We further propose the generalized maximum correntropy criterion (GMCC), and apply it to adaptive filtering. An adaptive algorithm, called the GMCC algorithm, is derived, and the mean square convergence performance is studied. We show that the proposed algorithm is very stable and can achieve zero probability of divergence (POD). Simulation results confirm the theoretical expectations and demonstrate the desirable performance of the new algorithm.

algorithm, artificial intelligence, machine learning, (18 more...)

doi: 10.1109/TSP.2016.2539127

1504.02931

Country:

Asia > China (0.46)
North America > United States > Florida > Alachua County > Gainesville (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)