AITopics | Zhang, Cheng


Zhang, Cheng


Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations

Ji, Ziwei, Yu, Lei, Koishekenov, Yeskendir, Bang, Yejin, Hartshorn, Anthony, Schelten, Alan, Zhang, Cheng, Fung, Pascale, Cancedda, Nicola

arXiv.org Artificial Intelligence, Mar-18-2025

LLMs often adopt an assertive language style even when making false claims. Such "overconfident hallucinations" mislead users and erode trust. Achieving the ability to express in language the actual degree of uncertainty around a claim is therefore of great importance. We find that "verbal uncertainty" is governed by a single linear feature in the representation space of LLMs, and show that this has only moderate correlation with the actual "semantic uncertainty" of the model. We apply this insight and show that (1) the mismatch between semantic and verbal uncertainty is a better predictor of hallucinations than semantic uncertainty alone and (2) we can intervene on verbal uncertainty at inference time and reduce hallucinations on short-form answers, achieving an average relative reduction of 32%.
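The abstract does not spell out the intervention, so the following is only a generic sketch of what reading and shifting a hidden vector along one linear direction can look like. The vectors, the direction, and the function names (`unit`, `read_feature`, `steer`) are hypothetical and are not taken from the paper.

```python
import math


def unit(direction):
    """Return the direction scaled to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in direction]


def read_feature(hidden, direction):
    """Scalar coordinate of `hidden` along the normalized direction."""
    u = unit(direction)
    return sum(h * d for h, d in zip(hidden, u))


def steer(hidden, direction, target):
    """Shift `hidden` along the direction so its coordinate equals `target`.

    Components orthogonal to the direction are left unchanged.
    """
    u = unit(direction)
    shift = target - read_feature(hidden, direction)
    return [h + shift * d for h, d in zip(hidden, u)]


# Toy values only: a 3-dimensional stand-in for a model activation and an
# assumed feature direction (hypothetical, not estimated from any model).
hidden_state = [1.0, 2.0, 3.0]
uncertainty_dir = [0.0, 0.0, 2.0]
steered = steer(hidden_state, uncertainty_dir, target=5.0)
```

In this toy case only the coordinate along the chosen direction changes; how such a direction is found and which layer is modified are questions the abstract leaves to the paper.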

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.14477

Country:

Asia (0.93)
North America > United States (0.67)
Europe > United Kingdom > England (0.29)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


GarmentCrafter: Progressive Novel View Synthesis for Single-View 3D Garment Reconstruction and Editing

Wang, Yuanhao, Zhang, Cheng, Frazão, Gonçalo, Yang, Jinlong, Ichim, Alexandru-Eugen, Beeler, Thabo, De la Torre, Fernando

arXiv.org Artificial Intelligence, Mar-11-2025

We introduce GarmentCrafter, a new approach that enables non-professional users to create and modify 3D garments from a single-view image. While recent advances in image generation have facilitated 2D garment design, creating and editing 3D garments remains challenging for non-professional users. Existing methods for single-view 3D reconstruction often rely on pre-trained generative models to synthesize novel views conditioning on the reference image and camera pose, yet they lack cross-view consistency, failing to capture the internal relationships across different views. In this paper, we tackle this challenge through progressive depth prediction and image warping to approximate novel views. Subsequently, we train a multi-view diffusion model to complete occluded and unknown clothing regions, informed by the evolving camera pose. By jointly inferring RGB and depth, GarmentCrafter enforces inter-view coherence and reconstructs precise geometries and fine details. Extensive experiments demonstrate that our method achieves superior visual fidelity and inter-view coherence compared to state-of-the-art single-view 3D garment reconstruction methods.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.08678

Country:

Asia (0.28)
Europe > Germany (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Natural Language (0.88)


HVI: A New Color Space for Low-light Image Enhancement

Yan, Qingsen, Feng, Yixu, Zhang, Cheng, Pang, Guansong, Shi, Kangbiao, Wu, Peng, Dong, Wei, Sun, Jinqiu, Zhang, Yanning

arXiv.org Artificial Intelligence, Feb-28-2025

Low-Light Image Enhancement (LLIE) is a crucial computer vision task that aims to restore detailed visual information from corrupted low-light images. Many existing LLIE methods are based on the standard RGB (sRGB) space and often produce color bias and brightness artifacts due to the inherently high color sensitivity of sRGB. While converting the images to the Hue, Saturation and Value (HSV) color space helps resolve the brightness issue, it introduces significant red and black noise artifacts. To address this issue, we propose a new color space for LLIE, namely Horizontal/Vertical-Intensity (HVI), defined by polarized HS maps and learnable intensity. The former enforces small distances for red coordinates to remove the red artifacts, while the latter compresses the low-light regions to remove the black artifacts. To fully leverage the chromatic and intensity information, a novel Color and Intensity Decoupling Network (CIDNet) is further introduced to learn an accurate photometric mapping function under different lighting conditions in the HVI space. Comprehensive results from benchmark and ablation experiments show that the proposed HVI color space with CIDNet outperforms the state-of-the-art methods on 10 datasets. The code is available at https://github.com/Fediory/HVI-CIDNet.

artificial intelligence, enhancement, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.20272

Country:

North America > United States (0.14)
Europe > Italy (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)


Adversarial Transform Particle Filters

Gong, Chengxin, Lin, Wei, Zhang, Cheng

arXiv.org Machine Learning, Feb-10-2025

The particle filter (PF) and the ensemble Kalman filter (EnKF) are widely used for approximate inference in state-space models. From a Bayesian perspective, these algorithms represent the prior by an ensemble of particles and update it to the posterior with new observations over time. However, the PF often suffers from weight degeneracy in high-dimensional settings, whereas the EnKF relies on linear Gaussian assumptions that can introduce significant approximation errors. In this paper, we propose the Adversarial Transform Particle Filter (ATPF), a novel filtering framework that combines the strengths of the PF and the EnKF through adversarial learning. Specifically, importance sampling is used to ensure statistical consistency as in the PF, while adversarially learned transformations, such as neural networks, allow accurate posterior matching for nonlinear and non-Gaussian systems. In addition, we incorporate kernel methods to ease optimization and leverage regularization techniques based on optimal transport for better statistical properties and numerical stability. We provide theoretical guarantees, including generalization bounds for both the analysis and forecast steps of ATPF. Extensive experiments across various nonlinear and non-Gaussian scenarios demonstrate the effectiveness and practical advantages of our method.
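Weight degeneracy, mentioned above as the main weakness of the PF, is commonly diagnosed with the effective sample size (ESS) of the normalized importance weights. The sketch below illustrates that standard diagnostic only; it is not part of ATPF as described in the abstract, and the function names and log-weight values are illustrative assumptions.

```python
import math


def normalize_log_weights(log_weights):
    """Turn unnormalized log-weights into weights that sum to 1.

    The maximum is subtracted before exponentiating for numerical stability.
    """
    peak = max(log_weights)
    unnormalized = [math.exp(lw - peak) for lw in log_weights]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2); ranges from 1 (degenerate) to N (uniform)."""
    return 1.0 / sum(w * w for w in weights)


# Toy log-weights (assumed values): four equally plausible particles versus
# one particle that dominates the others.
uniform = normalize_log_weights([0.0, 0.0, 0.0, 0.0])
skewed = normalize_log_weights([0.0, -50.0, -50.0, -50.0])

ess_uniform = effective_sample_size(uniform)
ess_skewed = effective_sample_size(skewed)
```

With equal weights the ESS equals the number of particles; when one particle carries nearly all the weight it collapses toward 1, which is the situation that becomes typical as the state dimension grows.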

artificial intelligence, machine learning, particle, (17 more...)

arXiv.org Machine Learning

2502.06165

Country:

Asia > China (0.14)
North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)


PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders

Xie, Tianyu, Richman, Harry, Gao, Jiansi, Matsen, Frederick A. IV, Zhang, Cheng

arXiv.org Machine Learning, Feb-7-2025

Learning informative representations of phylogenetic tree structures is essential for analyzing evolutionary relationships. Classical distance-based methods have been widely used to project phylogenetic trees into Euclidean space, but they are often sensitive to the choice of distance metric and may lack sufficient resolution. In this paper, we introduce phylogenetic variational autoencoders (PhyloVAEs), an unsupervised learning framework designed for representation learning and generative modeling of tree topologies. Leveraging an efficient encoding mechanism inspired by autoregressive tree topology generation, we develop a deep latent-variable generative model that facilitates fast, parallelized topology generation. PhyloVAE combines this generative model with a collaborative inference model based on learnable topological features, allowing for high-resolution representations of phylogenetic tree samples. Extensive experiments demonstrate PhyloVAE's robust representation learning capabilities and fast generation of phylogenetic tree topologies. Phylogenetic trees are the foundational structure for describing the evolutionary processes among individuals or groups of biological entities. Reconstructing these trees based on collected biological sequences (e.g., DNA, RNA, protein) from observed species, also known as phylogenetic inference (Felsenstein, 2004), is an essential discipline of computational biology (Fitch, 1971; Felsenstein, 1981; Yang & Rannala, 1997; Ronquist et al., 2012).
Large collections of trees obtained from these approaches (e.g., posterior samples from MCMC runs (Ronquist et al., 2012)), however, are often difficult to summarize or visualize due to the discrete and non-Euclidean nature of the tree topology space. The classical approach to visualizing and analyzing distributions of phylogenetic trees is to calculate pairwise distances between the trees and project them into a plane using multidimensional scaling (MDS) (Amenta & Klingner, 2002; Hillis et al., 2005; Jombart et al., 2017). However, these approaches have the shortcoming that one cannot map an arbitrary point in the visualization to a tree, and therefore they do not form an actual visualization of the relevant tree space.

artificial intelligence, machine learning, tree topology, (18 more...)

arXiv.org Machine Learning

2502.0473

Country: Asia (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)


Provable Sample-Efficient Transfer Learning Conditional Diffusion Models via Representation Learning

Cheng, Ziheng, Xie, Tianyu, Zhang, Shiyue, Zhang, Cheng

arXiv.org Machine Learning, Feb-6-2025

While conditional diffusion models have achieved remarkable success in various applications, they require abundant data to train from scratch, which is often infeasible in practice. To address this issue, transfer learning has emerged as an essential paradigm in small data regimes. Despite its empirical success, the theoretical underpinnings of transfer learning conditional diffusion models remain unexplored. In this paper, we take the first step towards understanding the sample efficiency of transfer learning conditional diffusion models through the lens of representation learning. Inspired by practical training procedures, we assume that there exists a low-dimensional representation of conditions shared across all tasks. Our analysis shows that with a well-learned representation from source tasks, the sample complexity of target tasks can be reduced substantially. In addition, we investigate the practical implications of our theoretical results in several real-world applications of conditional diffusion models. Numerical experiments are also conducted to verify our results.

artificial intelligence, log 3, machine learning, (15 more...)

arXiv.org Machine Learning

2502.04491

Country:

North America > United States > California (0.14)
Europe > France (0.14)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


RMTransformer: Accurate Radio Map Construction and Coverage Prediction

Li, Yuxuan, Zhang, Cheng, Wang, Wen, Huang, Yongming

arXiv.org Artificial Intelligence, Jan-11-2025

Radio map prediction, or pathloss map prediction, is a crucial method for wireless network modeling and management. By leveraging deep learning to construct pathloss patterns from geographical maps, an accurate digital replica of the transmission environment can be established with less computational overhead and lower prediction error than with traditional model-driven techniques. While existing state-of-the-art (SOTA) methods predominantly rely on convolutional architectures, this paper introduces a hybrid transformer-convolution model, termed RMTransformer, to enhance the accuracy of radio map prediction. The proposed model features a multi-scale transformer-based encoder for efficient feature extraction and a convolution-based decoder for precise pixel-level image reconstruction. Simulation results demonstrate that the proposed scheme significantly improves prediction accuracy, achieving a reduction of over 30% in root mean square error (RMSE) compared to typical SOTA approaches.
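To make the reported metric concrete, the sketch below computes RMSE over a flattened map and the relative reduction between a baseline and a new model. The pixel values and the helper names are made up for illustration; they are not results from the paper.

```python
import math


def rmse(predicted, target):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_error, new_error):
    """Fraction by which `new_error` is lower than `baseline_error`."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical flattened pathloss maps (arbitrary units, toy values).
ground_truth = [1.0, 2.0, 3.0]
prediction = [1.0, 2.0, 5.0]
error = rmse(prediction, ground_truth)

# A drop from an RMSE of 1.0 to 0.7 corresponds to a 30% relative reduction.
reduction = relative_reduction(1.0, 0.7)
```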

machine learning, natural language, rmtransformer, (17 more...)

arXiv.org Artificial Intelligence

2501.0519

Country:

Asia (0.70)
Europe (0.46)
North America > United States > California (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


CARL-GT: Evaluating Causal Reasoning Capabilities of Large Language Models

Tu, Ruibo, Kjellström, Hedvig, Henter, Gustav Eje, Zhang, Cheng

arXiv.org Artificial Intelligence, Dec-23-2024

Causal reasoning capabilities are essential for large language models (LLMs) in a wide range of applications, such as education and healthcare. But there is still a lack of benchmarks for a better understanding of such capabilities. Current LLM benchmarks are mainly based on conversational tasks, academic math tests, and coding tests. Such benchmarks evaluate LLMs in well-regularized settings, but they are limited in assessing the skills and abilities to solve real-world problems. In this work, we provide a benchmark, named CARL-GT, which evaluates CAusal Reasoning capabilities of large Language models using Graphs and Tabular data. The benchmark has a diverse range of tasks for evaluating LLMs from causal graph reasoning, knowledge discovery, and decision-making aspects. In addition, effective zero-shot learning prompts are developed for the tasks. In our experiments, we leverage the benchmark for evaluating open-source LLMs and provide a detailed comparison of LLMs for causal reasoning abilities. We found that LLMs are still weak in causal reasoning, especially when using tabular data to discover new insights. Furthermore, we investigate and discuss the relationships of different benchmark tasks by analyzing the performance of LLMs. The experimental results show that LLMs have different strengths over different tasks and that their performance on tasks in different categories, i.e., causal graph reasoning, knowledge discovery, and decision-making, shows stronger correlation than tasks in the same category.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2412.1797

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)


Refining Salience-Aware Sparse Fine-Tuning Strategies for Language Models

Liu, Xinxin, Thomas, Aaron, Zhang, Cheng, Cheng, Jianyi, Zhao, Yiren, Gao, Xitong

arXiv.org Artificial Intelligence, Dec-17-2024

Parameter-Efficient Fine-Tuning (PEFT) has gained prominence through low-rank adaptation methods like LoRA. In this paper, we focus on sparsity-based PEFT (SPEFT), which introduces trainable sparse adaptations to the weight matrices in the model, offering greater flexibility in selecting fine-tuned parameters compared to low-rank methods. We conduct the first systematic evaluation of salience metrics for SPEFT, inspired by zero-cost NAS proxies, and find that simple gradient-based metrics are reliable, with results on par with the best alternatives, offering both computational efficiency and robust performance. Additionally, we compare static and dynamic masking strategies, finding that static masking, which predetermines non-zero entries before training, delivers efficiency without sacrificing performance, while dynamic masking offers no substantial benefits. Across NLP tasks, a simple gradient-based, static SPEFT consistently outperforms other fine-tuning methods for LLMs, providing a simple yet effective baseline for SPEFT. Our work challenges the notion that complexity is necessary for effective PEFT. Our work is open source and available to the community at https://github.com/0-ml/speft.
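As a rough illustration of static, gradient-based masking, the sketch below fixes a top-k mask from gradient magnitudes once and then updates only the selected entries of a sparse adaptation. Using the absolute gradient as the salience score is one assumed concrete choice; the function names, the flat-list layout, and the numbers are hypothetical and do not come from the linked repository.

```python
def topk_mask(gradients, k):
    """Static mask: 1 for the k entries with largest |gradient|, else 0."""
    ranked = sorted(range(len(gradients)),
                    key=lambda i: abs(gradients[i]),
                    reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(gradients))]


def masked_sgd_step(delta, gradients, mask, learning_rate):
    """Update only the masked entries of the sparse adaptation `delta`.

    The frozen base weights are not touched; the effective weight would be
    base + delta.
    """
    return [d - learning_rate * g * m
            for d, g, m in zip(delta, gradients, mask)]


# Toy flattened weight matrix with four entries (assumed values).
grads = [0.1, -0.9, 0.5, 0.0]
mask = topk_mask(grads, k=2)          # computed once, before training
delta = [0.0, 0.0, 0.0, 0.0]          # sparse adaptation starts at zero
delta = masked_sgd_step(delta, grads, mask, learning_rate=0.1)
```

Because the mask is chosen before training and never recomputed, the set of trainable entries stays fixed, which is what distinguishes static from dynamic masking in the abstract's terms.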

large language model, machine learning, salience metric, (13 more...)

arXiv.org Artificial Intelligence

2412.13488

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)


Resonant Inductive Coupling Power Transfer for Mid-Sized Inspection Robot

Hassan, Mohd Norhakim Bin, Watson, Simon, Zhang, Cheng

arXiv.org Artificial Intelligence, Nov-26-2024

This paper presents a wireless power transfer (WPT) system for a mid-sized inspection mobile robot. The objective is to transmit 100 W of power over a distance of 1 meter, achieved through lightweight Litz wire coils weighing 320 g held together with a coil structure of 3.54 kg. The Wireless Power Transfer System (WPTS) is mounted onto an unmanned ground vehicle (UGV). The study investigates coil design, accounting for misalignment and tolerance issues in resonance-coupled coils. In experimental validation, the system effectively transmits 109.7 W of power over a 1-meter distance, with obstacles present. This achievement yields a system efficiency of 47.14%, a value that is remarkably close to the maximum power transfer point (50%) when the WPTS utilises the full voltage allowance of the capacitor. The paper reports a WPTS charging time of 5 minutes for 12 V, 0.8 Ah lead-acid batteries.
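The reported figures can be cross-checked with simple arithmetic. The sketch below assumes that system efficiency means delivered power divided by input power, that the battery stores its nominal voltage times its rated capacity, and that charging losses are ignored; none of these assumptions is stated in the abstract, and the helper names are illustrative.

```python
def input_power_w(output_power_w, efficiency):
    """Input power implied by an output power and an efficiency fraction."""
    return output_power_w / efficiency


def battery_energy_wh(voltage_v, capacity_ah):
    """Nominal stored energy of a battery in watt-hours."""
    return voltage_v * capacity_ah


def delivered_energy_wh(power_w, minutes):
    """Energy delivered at constant power over a number of minutes."""
    return power_w * minutes / 60.0


# Figures quoted in the abstract.
p_in = input_power_w(109.7, 0.4714)        # roughly 232.7 W drawn at the source
e_batt = battery_energy_wh(12.0, 0.8)      # 9.6 Wh nominal capacity
e_5min = delivered_energy_wh(109.7, 5.0)   # roughly 9.14 Wh in five minutes
```

Under these assumptions, five minutes at 109.7 W delivers about 9.1 Wh, which is close to the 9.6 Wh nominal capacity of a 12 V, 0.8 Ah battery.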

artificial intelligence, coil, mobile robot, (10 more...)

arXiv.org Artificial Intelligence

2411.17505

Country:

Asia (0.46)
Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.82)

Industry:

Electrical Industrial Apparatus (1.00)
Energy > Power Industry (0.95)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Communications > Networks > Sensor Networks (0.46)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.39)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.34)
