Collaborating Authors

 Wang, Yan


Preference Construction: A Bayesian Interactive Preference Elicitation Framework Based on Monte Carlo Tree Search

arXiv.org Artificial Intelligence

We present a novel preference learning framework to capture participant preferences efficiently within limited interaction rounds. It makes three main contributions. First, we develop a variational Bayesian approach to infer the participant's preference model by estimating posterior distributions and managing the uncertainty that arises from limited information. Second, we propose an adaptive questioning policy that maximizes cumulative uncertainty reduction, formulating questioning as a finite Markov decision process and using Monte Carlo Tree Search to prioritize promising question trajectories. By considering long-term effects and leveraging the efficiency of the Bayesian approach, the policy avoids myopic question selection. Third, we apply the framework to Multiple Criteria Decision Aiding, with pairwise comparisons as the preference information and an additive value function as the preference model. We integrate the reparameterization trick to address high-variance gradient estimates, enhancing robustness and efficiency. Computational studies on real-world and synthetic datasets demonstrate the framework's practical usability: it outperforms baselines in capturing preferences and achieves superior uncertainty reduction within limited interactions.
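
To make the reparameterization step concrete, here is a minimal sketch of variational inference for an additive value function learned from pairwise comparisons, assuming a diagonal Gaussian posterior over criterion weights and a logistic (Bradley-Terry-style) likelihood; the paper's exact model and priors may differ.

```python
import torch

# Hypothetical setup: K criteria, an additive value function v(x) = w . x,
# and pairwise comparisons "a preferred to b" with a logistic likelihood.
K = 4
A = torch.rand(20, K)  # preferred alternatives (toy data)
B = torch.rand(20, K)  # non-preferred alternatives (toy data)

mu = torch.zeros(K, requires_grad=True)         # variational mean
log_sigma = torch.zeros(K, requires_grad=True)  # variational log-std
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

for step in range(500):
    eps = torch.randn(32, K)        # standard normal noise
    w = mu + log_sigma.exp() * eps  # reparameterized weight samples
    margin = (A - B) @ w.T          # v(a) - v(b) for each sample
    log_lik = torch.nn.functional.logsigmoid(margin).sum(0).mean()
    # Closed-form KL(q || N(0, I)) for a diagonal Gaussian posterior.
    kl = 0.5 * (mu**2 + (2 * log_sigma).exp() - 2 * log_sigma - 1).sum()
    loss = kl - log_lik  # negative ELBO; reparameterization keeps
    opt.zero_grad()      # gradient variance low versus score-function
    loss.backward()      # estimators
    opt.step()
```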


GenDR: Lightning Generative Detail Restorator

arXiv.org Artificial Intelligence

Recent research applying text-to-image (T2I) diffusion models to real-world super-resolution (SR) has achieved remarkable success. However, fundamental misalignments between T2I and SR objectives create a dilemma between inference speed and detail fidelity. Specifically, T2I tasks rely on multi-step inversion to synthesize coherent outputs aligned with textual prompts and shrink the latent space to reduce generation complexity. Conversely, SR tasks preserve most information from the low-resolution input and only restore high-frequency details, thus requiring a sufficiently large latent space and fewer inference steps. To bridge the gap, we present GenDR, a one-step diffusion model for generative detail restoration, distilled from a tailored diffusion model with a larger latent space. Specifically, we train a new SD2.1-VAE16 (0.9B) via representation alignment to expand the latent space without enlarging the model. For step distillation, we propose consistent score identity distillation (CiD), which incorporates an SR task-specific loss into score distillation to leverage more SR priors and align the training target. Furthermore, we extend CiD with adversarial learning and representation alignment (CiDA) to enhance perceptual quality and accelerate training, and we streamline the pipeline for more efficient inference. Experimental results demonstrate that GenDR achieves state-of-the-art performance in both quantitative metrics and visual fidelity.
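
To illustrate what one-step restoration looks like at inference time, here is a minimal sketch assuming a VAE-style encoder/decoder and a distilled network that predicts the clean latent in a single forward pass; the names (`vae`, `model`) and the fixed-timestep convention are illustrative assumptions, not GenDR's actual API.

```python
import torch

@torch.no_grad()
def one_step_sr(lr_image, vae, model, t_fixed=999):
    # Encode the low-resolution image into the (enlarged) latent space.
    z_lr = vae.encode(lr_image)
    # The distilled model runs at a single fixed timestep.
    t = torch.full((z_lr.shape[0],), t_fixed, device=z_lr.device)
    z_sr = model(z_lr, t)    # one forward pass restores detail
    return vae.decode(z_sr)  # decode back to a high-resolution image
```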


OrdRankBen: A Novel Ranking Benchmark for Ordinal Relevance in NLP

arXiv.org Artificial Intelligence

The evaluation of ranking tasks remains a significant challenge in natural language processing (NLP), particularly due to the lack of direct labels for results in real-world scenarios. Benchmark datasets play a crucial role in providing standardized testbeds that ensure fair comparisons, enhance reproducibility, and enable progress tracking, facilitating rigorous assessment and continuous improvement of ranking models. Existing NLP ranking benchmarks typically use binary relevance labels or continuous relevance scores, neglecting ordinal relevance. However, binary labels oversimplify relevance distinctions, while continuous scores lack a clear ordinal structure, making it difficult to capture nuanced ranking differences. To address these challenges, we introduce OrdRankBen, a novel benchmark designed to capture multi-granularity relevance distinctions. Unlike conventional benchmarks, OrdRankBen incorporates structured ordinal labels, enabling more precise ranking evaluation. Given the absence of suitable datasets for ordinal relevance ranking in NLP, we construct two datasets with distinct ordinal label distributions. We then evaluate three types of models on these datasets: ranking-based language models, general large language models, and ranking-focused large language models. Experimental results show that ordinal relevance modeling provides a more precise evaluation of ranking models, improving their ability to distinguish multi-granularity differences among ranked items, which is crucial for tasks that demand fine-grained relevance differentiation.
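
As an example of why ordinal labels matter for evaluation, here is a minimal sketch of graded NDCG, a standard metric that rewards placing higher-grade items earlier; whether OrdRankBen uses NDCG specifically is our assumption, not stated in the abstract.

```python
import numpy as np

def dcg(labels):
    # Discounted cumulative gain with exponential gain for ordinal grades.
    labels = np.asarray(labels, dtype=float)
    discounts = np.log2(np.arange(2, labels.size + 2))
    return float(((2 ** labels - 1) / discounts).sum())

def ndcg(ranked_labels):
    # Normalize by the DCG of the ideal (descending-grade) ordering.
    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal > 0 else 0.0

# Ordinal labels (0-3) in the order a model ranked four items:
print(ndcg([3, 1, 2, 0]))  # < 1.0: the grade-2 and grade-1 items are swapped
```

Binary labels would score both orderings identically; the graded metric distinguishes them, which is the multi-granularity sensitivity the benchmark targets.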


Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance

arXiv.org Artificial Intelligence

Despite Greece's pivotal role in the global economy, large language models (LLMs) remain underexplored in the Greek financial context due to the linguistic complexity of Greek and the scarcity of domain-specific datasets. Previous efforts in multilingual financial natural language processing (NLP) have exposed considerable performance disparities, yet no dedicated Greek financial benchmarks or Greek-specific financial LLMs had been developed until now. To bridge this gap, we introduce Plutus-ben, the first Greek Financial Evaluation Benchmark, and Plutus-8B, the first Greek financial LLM, fine-tuned on Greek domain-specific data. Plutus-ben addresses five core financial NLP tasks in Greek: numeric and textual named entity recognition, question answering, abstractive summarization, and topic classification, thereby enabling systematic and reproducible LLM assessment. To underpin these tasks, we present three novel, high-quality Greek financial datasets, thoroughly annotated by expert native Greek speakers, augmented by two existing resources. Our comprehensive evaluation of 22 LLMs on Plutus-ben reveals that Greek financial NLP remains challenging due to linguistic complexity, domain-specific terminology, and financial reasoning gaps. These findings underscore the limitations of cross-lingual transfer, the necessity of financial expertise in Greek-trained models, and the challenges of adapting financial LLMs to Greek text. We publicly release Plutus-ben, Plutus-8B, and all associated datasets to promote reproducible research and advance Greek financial NLP, fostering broader multilingual inclusivity in finance.


A Macro- and Micro-Hierarchical Transfer Learning Framework for Cross-Domain Fake News Detection

arXiv.org Artificial Intelligence

Cross-domain fake news detection aims to mitigate domain shift and improve detection performance by transferring knowledge across domains. Existing approaches transfer knowledge based on news content and user engagements from a source domain to a target domain. However, these approaches face two main limitations that hinder effective knowledge transfer and optimal detection performance. First, from a micro perspective, they neglect the negative impact of veracity-irrelevant features in news content when transferring domain-shared features across domains. Second, from a macro perspective, they ignore the relationship between user engagement and news content, which reveals the shared behaviors of common users across domains and can facilitate more effective knowledge transfer. To address these limitations, we propose a novel macro- and micro-hierarchical transfer learning framework (MMHT) for cross-domain fake news detection. First, we propose a micro-hierarchical disentangling module that disentangles veracity-relevant and veracity-irrelevant features from news content in the source domain to improve fake news detection in the target domain. Second, we propose a macro-hierarchical transfer learning module that generates engagement features based on common users' shared behaviors in different domains to improve the effectiveness of knowledge transfer. Extensive experiments on real-world datasets demonstrate that our framework significantly outperforms state-of-the-art baselines.
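
To make the micro-hierarchical idea concrete, here is a minimal sketch of one common way to disentangle veracity-relevant from veracity-irrelevant features: two encoders trained with a classification loss on the relevant part plus an orthogonality penalty between the two parts. This is an illustrative reading of the idea, not MMHT's actual architecture.

```python
import torch
import torch.nn as nn

class Disentangler(nn.Module):
    def __init__(self, dim=768, hid=128):
        super().__init__()
        self.relevant = nn.Sequential(nn.Linear(dim, hid), nn.ReLU())
        self.irrelevant = nn.Sequential(nn.Linear(dim, hid), nn.ReLU())
        self.clf = nn.Linear(hid, 2)  # fake vs. real, from relevant part only

    def forward(self, x, labels):
        zr, zi = self.relevant(x), self.irrelevant(x)
        ce = nn.functional.cross_entropy(self.clf(zr), labels)
        # Push the two subspaces apart: penalize per-sample correlation,
        # so veracity-irrelevant signal stays out of the transferred part.
        ortho = (zr * zi).sum(dim=1).pow(2).mean()
        return ce + 0.1 * ortho

x = torch.randn(8, 768)            # toy content embeddings
labels = torch.randint(0, 2, (8,)) # toy veracity labels
loss = Disentangler()(x, labels)
```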


VidLBEval: Benchmarking and Mitigating Language Bias in Video-Involved LVLMs

arXiv.org Artificial Intelligence

Recently, Large Vision-Language Models (LVLMs) have made significant strides across diverse multimodal tasks and benchmarks. This paper reveals a largely under-explored problem in existing video-involved LVLMs: language bias, where models tend to prioritize language over video and thus produce incorrect responses. To address this research gap, we first collect a Video Language Bias Evaluation Benchmark, specifically designed to assess language bias in video-involved LVLMs through two key tasks: ambiguous video contrast and interrogative question probing. Accordingly, we design accompanying evaluation metrics that penalize LVLMs for being biased by language. In addition, we propose Multi-branch Contrastive Decoding (MCD), which introduces two expert branches to simultaneously counteract the language bias potentially generated by the amateur text-only branch. Our experiments demonstrate that (i) existing video-involved LVLMs, both proprietary and open-source, are largely limited by the language bias problem, and (ii) MCD can effectively mitigate this issue and maintain general-purpose capabilities in various video-involved LVLMs without any additional retraining or alteration to model architectures.
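
The decoding-time idea can be sketched as follows, assuming next-token logits from expert branches (which see the video) and an amateur text-only branch; the exact combination rule and branch design are our assumptions about MCD, not the paper's specification.

```python
import torch

def mcd_logits(expert_logits, amateur_logits, alpha=1.0):
    # Average the expert branches, then subtract the amateur branch so
    # tokens favored only by language priors are down-weighted.
    expert = torch.stack(expert_logits).mean(dim=0)
    return (1 + alpha) * expert - alpha * amateur_logits

full = torch.randn(2, 32000)       # toy logits: video + text expert branch
aug = torch.randn(2, 32000)        # toy logits: second expert branch
text_only = torch.randn(2, 32000)  # toy logits: amateur text-only branch
next_token = mcd_logits([full, aug], text_only).argmax(dim=-1)
```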


Constructing a Norm for Children's Scientific Drawing: Distribution Features Based on Semantic Similarity of Large Language Models

arXiv.org Artificial Intelligence

Using children's drawings to examine their conceptual understanding has proven to be an effective method, but previous research has two major problems: (1) the content of the drawings depends heavily on the task, so the ecological validity of the conclusions is low; (2) the interpretation of drawings relies too heavily on the researchers' subjective judgments. To address these issues, this study uses a large language model (LLM) to identify 1,420 children's scientific drawings (covering 9 scientific themes/concepts) and uses the word2vec algorithm to calculate their semantic similarity. The study explores whether children produce consistent drawing representations for the same theme and attempts to establish a norm for children's scientific drawings, providing a baseline reference for follow-up research on children's drawings. The results show that the representations of most drawings are consistent, with most semantic similarities greater than 0.8. We also found that representational consistency is independent of the accuracy of the LLM's recognition, indicating the existence of a consistency bias. In a subsequent exploration of influencing factors, we used the Kendall rank correlation coefficient to investigate the effects of Sample Size, Abstract Degree, and Focus Points on the drawings, and used word frequency statistics to explore whether children represented abstract themes/concepts by reproducing what was taught in class.
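
The two quantitative steps named here, embedding similarity and rank correlation, can be sketched as follows with toy values; we assume the similarities are computed between word2vec-style representations of the LLM's textual identifications of the drawings.

```python
import numpy as np
from scipy.stats import kendalltau

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

desc_a = np.random.rand(300)   # toy embedding of one drawing's description
desc_b = np.random.rand(300)   # toy embedding of another, same theme
print(cosine(desc_a, desc_b))  # the study reports > 0.8 for most pairs

abstractness = [1, 2, 3, 4, 5]            # toy ordinal factor per theme
consistency = [0.9, 0.85, 0.8, 0.7, 0.6]  # toy mean similarity per theme
tau, p = kendalltau(abstractness, consistency)
print(tau, p)  # rank correlation between a factor and consistency
```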


Low-Complexity Cooperative Payload Transportation for Nonholonomic Mobile Robots Under Scalable Constraints

arXiv.org Artificial Intelligence

Cooperative transportation, a key aspect of logistics cyber-physical systems (CPS), is typically approached using distributed control or optimization-based methods. Distributed control methods consume less time but handle multiple constraints poorly and do not extend to them easily. Optimization-based methods, in contrast, handle constraints effectively, but they are usually centralized and time-consuming and thus do not scale easily to numerous robots. To overcome the drawbacks of both, we propose a novel cooperative transportation method for nonholonomic mobile robots that improves on conventional formation control: it is distributed, has low time-complexity, and accommodates scalable constraints. The proposed control-based method is validated on a cable-suspended payload and divided into two parts: robot trajectory generation and trajectory tracking. Unlike most time-consuming trajectory generation methods, ours generates trajectories with only constant time-complexity and without requiring global maps. As for trajectory tracking, our control-based method not only scales as easily to multiple constraints as optimization-based methods do, but also reduces their time-complexity from polynomial to linear. Simulations and experiments verify the feasibility of our method. Recently, logistics cyber-physical systems (CPS), particularly multi-robot cooperative transportation, have garnered increasing attention due to their advantages, such as cost reduction and enhanced productivity [1]-[17]. In this scenario, robots are required to cooperatively transport a payload from a starting location to a desired destination quickly. Typically, the robot formation is subject to numerous constraints in practical transportation, such as obstacle avoidance, inter-robot collision avoidance, velocity constraints, payload protection, and nonholonomic kinematics. How to satisfy as many constraints as possible in the shortest time has thus become an important issue in cooperative transportation. Most cooperative transportation algorithms are based on two frameworks: distributed control [3]-[8] and optimization [10]-[17].
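
As a sketch of why formation-based trajectory generation can run in constant time per robot, consider computing a follower's reference pose as a rigid offset from the leader's pose, a common pattern in formation control for nonholonomic robots; the offsets and kinematics below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def follower_reference(leader_pose, offset):
    # Map a body-frame formation offset to a world-frame reference pose.
    x, y, theta = leader_pose  # leader position and heading
    dx, dy = offset            # desired offset in the leader's frame
    # Rotating the offset into the world frame is O(1) per robot,
    # with no global map or optimization in the loop.
    xf = x + dx * np.cos(theta) - dy * np.sin(theta)
    yf = y + dx * np.sin(theta) + dy * np.cos(theta)
    return np.array([xf, yf, theta])  # follower tracks this pose

leader = (1.0, 2.0, np.pi / 4)
print(follower_reference(leader, offset=(-0.5, 0.5)))
```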


A Survey on Video Analytics in Cloud-Edge-Terminal Collaborative Systems

arXiv.org Artificial Intelligence

The explosive growth of video data has driven the development of distributed video analytics in cloud-edge-terminal collaborative (CETC) systems, enabling efficient video processing, real-time inference, and privacy-preserving analysis. Among their advantages, CETC systems can distribute video processing tasks and enable adaptive analytics across cloud, edge, and terminal devices, leading to breakthroughs in video surveillance, autonomous driving, and smart cities. In this survey, we first analyze fundamental architectural components, including hierarchical, distributed, and hybrid frameworks, alongside edge computing platforms and resource management mechanisms. Building on these foundations, edge-centric approaches emphasize on-device processing, edge-assisted offloading, and edge intelligence, while cloud-centric methods leverage powerful computational capabilities for complex video understanding and model training. Our investigation also covers hybrid video analytics that incorporate adaptive task offloading and resource-aware scheduling techniques to optimize performance across the entire system. Beyond conventional approaches, recent advances in large language models and multimodal integration reveal both opportunities and challenges in platform scalability, data protection, and system reliability. Finally, future directions encompass explainable systems, efficient processing mechanisms, and advanced video analytics, offering valuable insights for researchers and practitioners in this dynamic field.


Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance

arXiv.org Artificial Intelligence

Recent advancements in large language models (LLMs) have shown strong general reasoning abilities, yet their effectiveness in financial reasoning remains underexplored. In this study, we comprehensively evaluate 16 powerful reasoning and general LLMs on three complex financial tasks involving financial text, tabular data, and equations, assessing numerical reasoning, tabular interpretation, financial terminology comprehension, long-context processing, and equation-based problem solving. Our results show that while better datasets and pretraining improve financial reasoning, general enhancements like CoT fine-tuning do not always yield consistent gains. Moreover, all reasoning strategies face challenges in improving performance on long-context and multi-table tasks. To address these limitations, we develop a financial reasoning-enhanced model based on Llama-3.1-8B-Instruct via CoT fine-tuning and reinforcement learning with domain-specific reasoning paths. Even with simple fine-tuning on one financial dataset, our model achieves a consistent 10% performance improvement across tasks, surpassing all 8B models and even Llama3-70B-Instruct and Llama3.1-70B-Instruct on average. Our results highlight the need for domain-specific adaptations in financial tasks, emphasizing future directions such as multi-table reasoning, long-context processing, and financial terminology comprehension. All our datasets, models, and code are publicly available, and we introduce a leaderboard for benchmarking future datasets and models.
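
For concreteness, here is a minimal sketch of how a financial chain-of-thought (CoT) fine-tuning example might be formatted; the prompt template and field names are illustrative assumptions, not Fino1's actual data format.

```python
def format_cot_example(question, context, reasoning, answer):
    # Pack a financial QA instance into a prompt/completion pair for SFT,
    # with the reasoning path kept in the supervision target.
    prompt = (
        "You are a financial reasoning assistant.\n"
        f"Context:\n{context}\n\nQuestion: {question}\n"
        "Think step by step, then give the final answer."
    )
    target = f"{reasoning}\nFinal answer: {answer}"
    return {"prompt": prompt, "completion": target}

example = format_cot_example(
    question="What is the year-over-year revenue growth?",
    context="Revenue 2022: $80M. Revenue 2023: $100M.",
    reasoning="Growth = (100 - 80) / 80 = 0.25, i.e., 25%.",
    answer="25%",
)
print(example["prompt"])
```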