AITopics | Wu, Hao

Collaborating Authors

Wu, Hao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A bio-inspired sand-rolling robot: effect of body shape on sand rolling performance

Liao, Xingjue, Liu, Wenhao, Wu, Hao, Qian, Feifei

arXiv.org Artificial IntelligenceMar-18-2025

The capability of effectively moving on complex terrains such as sand and gravel can empower our robots to robustly operate in outdoor environments, and assist with critical tasks such as environment monitoring, search-and-rescue, and supply delivery. Inspired by the Mount Lyell salamander's ability to curl its body into a loop and effectively roll down {\Revision hill slopes}, in this study we develop a sand-rolling robot and investigate how its locomotion performance is governed by the shape of its body. We experimentally tested three different body shapes: Hexagon, Quadrilateral, and Triangle. We found that Hexagon and Triangle can achieve a faster rolling speed on sand, but exhibited more frequent failures of getting stuck. Analysis of the interaction between robot and sand revealed the failure mechanism: the deformation of the sand produced a local ``sand incline'' underneath robot contact segments, increasing the effective region of supporting polygon (ERSP) and preventing the robot from shifting its center of mass (CoM) outside the ERSP to produce sustainable rolling. Based on this mechanism, a highly-simplified model successfully captured the critical body pitch for each rolling shape to produce sustained rolling on sand, and informed design adaptations that mitigated the locomotion failures and improved robot speed by more than 200$\%$. Our results provide insights into how locomotors can utilize different morphological features to achieve robust rolling motion across deformable substrates.

artificial intelligence, pitch angle, robot, (18 more...)

arXiv.org Artificial Intelligence

2503.13919

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.87)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Adaptive Stochastic Gradient Descents on Manifolds with an Application on Weighted Low-Rank Approximation

Yang, Peiqi, Xu, Conglong, Wu, Hao

arXiv.org Artificial IntelligenceMar-14-2025

We prove a convergence theorem for stochastic gradient descents on manifolds with adaptive learning rate and apply it to the weighted low-rank approximation problem.

artificial intelligence, lemma 2, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2503.11833

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Water Quality Data Imputation via A Fast Latent Factorization of Tensors with PID-based Optimizer

Liu, Qian, Wang, Lan, Yang, Bing, Wu, Hao

arXiv.org Artificial IntelligenceMar-10-2025

Water quality data can supply a substantial decision support for water resources utilization and pollution prevention. However, there are numerous missing values in water quality data due to inescapable factors like sensor failure, thereby leading to biased result for hydrological analysis and failing to support environmental governance decision accurately. A Latent Factorization of Tensors (LFT) with Stochastic Gradient Descent (SGD) proves to be an efficient imputation method. However, a standard SGD-based LFT model commonly surfers from the slow convergence that impairs its efficiency. To tackle this issue, this paper proposes a Fast Latent Factorization of Tensors (FLFT) model. It constructs an adjusted instance error into SGD via leveraging a nonlinear PID controller to incorporates the past, current and future information of prediction error for improving convergence rate. Comparing with state-of-art models in real world datasets, the results of experiment indicate that the FLFT model achieves a better convergence rate and higher accuracy.

artificial intelligence, latent factorization, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2503.06997

Country: Asia > China (1.00)

Genre: Research Report (0.64)

Industry: Water & Waste Management > Water Management > Water Supplies & Services (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)

Add feedback

StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition

Ding, Xin, Wu, Hao, Yang, Yifan, Jiang, Shiqi, Bai, Donglin, Chen, Zhibo, Cao, Ting

arXiv.org Artificial IntelligenceMar-8-2025

With the rise of real-world human-AI interaction applications, such as AI assistants, the need for Streaming Video Dialogue is critical. To address this need, we introduce \sys, a video LLM framework that achieves ultra-FPS streaming video processing (100 fps on a single A100) and enables proactive, always-on responses in real time, without explicit user intervention. To solve the key challenge of the contradiction between linear video streaming speed and quadratic transformer computation cost, we propose a novel perception-cognition interleaving paradigm named ''event-gated LLM invocation'', in contrast to the existing per-time-step LLM invocation. By introducing a Cognition Gate network between the video encoder and the LLM, LLM is only invoked when relevant events occur. To realize the event feature extraction with constant cost, we propose Event-Preserving Feature Extractor (EPFE) based on state-space method, generating a single perception token for spatiotemporal features. These techniques enable the video LLM with full-FPS perception and real-time cognition response. Experiments on Ego4D and SoccerNet streaming tasks, as well as standard offline benchmarks, demonstrate state-of-the-art performance in both model capability and real-time efficiency, paving the way for ultra-high-FPS applications, such as Game AI Copilot and interactive media.

artificial intelligence, large language model, natural language, (14 more...)

arXiv.org Artificial Intelligence

2503.0622

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports > Soccer (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

SpecServe: Efficient and SLO-Aware Large Language Model Serving with Adaptive Speculative Decoding

Huang, Kaiyu, Wu, Hao, Shi, Zhubo, Zou, Han, Yu, Minchen, Shi, Qingjiang

arXiv.org Artificial IntelligenceMar-6-2025

Large Language Model (LLM) services often face challenges in achieving low inference latency and meeting Service Level Objectives (SLOs) under dynamic request patterns. Speculative decoding, which exploits lightweight models for drafting and LLMs for verification, has emerged as a compelling technique to accelerate LLM inference. However, existing speculative decoding solutions often fail to adapt to varying workloads and system environments, resulting in performance variability and SLO violations. In this paper, we introduce SpecServe, an efficient LLM inference system that dynamically adjusts speculative strategies according to real-time request loads and system configurations. SpecServe proposes a theoretical model to understand and predict the efficiency of speculative decoding across diverse scenarios. Additionally, it implements intelligent drafting and verification algorithms to guarantee optimal performance while achieving high SLO attainment. Experimental results on real-world LLM traces demonstrate that SpecServe consistently meets SLOs and achieves substantial performance improvements, yielding 1.14$\times$-14.3$\times$ speedups over state-of-the-art speculative inference systems.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.05096

Country:

North America > United States (0.68)
North America > Mexico > Mexico City (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Flow-based Bayesian filtering for high-dimensional nonlinear stochastic dynamical systems

Wang, Xintong, Guan, Xiaofei, Guo, Ling, Wu, Hao

arXiv.org Machine LearningMar-5-2025

Bayesian filtering for high-dimensional nonlinear stochastic dynamical systems is a fundamental yet challenging problem in many fields of science and engineering. Existing methods face significant obstacles: Gaussian-based filters struggle with non-Gaussian distributions, while sequential Monte Carlo methods are computationally intensive and prone to particle degeneracy in high dimensions. Although generative models in machine learning have made significant progress in modeling high-dimensional non-Gaussian distributions, their inefficiency in online updating limits their applicability to filtering problems. To address these challenges, we propose a flow-based Bayesian filter (FBF) that integrates normalizing flows to construct a novel latent linear state-space model with Gaussian filtering distributions. This framework facilitates efficient density estimation and sampling using invertible transformations provided by normalizing flows, and it enables the construction of filters in a data-driven manner, without requiring prior knowledge of system dynamics or observation models. Numerical experiments demonstrate the superior accuracy and efficiency of FBF.

artificial intelligence, machine learning, section 4, (19 more...)

arXiv.org Machine Learning

2502.16232

Country: Asia > China (0.14)

Genre:

Research Report (1.00)
Overview (0.68)

Add feedback

Learning-Augmented Frequent Directions

Aamand, Anders, Chen, Justin Y., Gollapudi, Siddharth, Silwal, Sandeep, Wu, Hao

arXiv.org Artificial IntelligenceMar-2-2025

An influential paper of Hsu et al. (ICLR'19) introduced the study of learning-augmented streaming algorithms in the context of frequency estimation. A fundamental problem in the streaming literature, the goal of frequency estimation is to approximate the number of occurrences of items appearing in a long stream of data using only a small amount of memory. Hsu et al. develop a natural framework to combine the worst-case guarantees of popular solutions such as CountMin and CountSketch with learned predictions of high frequency elements. They demonstrate that learning the underlying structure of data can be used to yield better streaming algorithms, both in theory and practice. We simplify and generalize past work on learning-augmented frequency estimation. Our first contribution is a learning-augmented variant of the Misra-Gries algorithm which improves upon the error of learned CountMin and learned CountSketch and achieves the state-of-the-art performance of randomized algorithms (Aamand et al., NeurIPS'23) with a simpler, deterministic algorithm. Our second contribution is to adapt learning-augmentation to a high-dimensional generalization of frequency estimation corresponding to finding important directions (top singular vectors) of a matrix given its rows one-by-one in a stream. We analyze a learning-augmented variant of the Frequent Directions algorithm, extending the theoretical and empirical understanding of learned predictions to matrix streaming.

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2503.00937

Country:

North America > United States (0.93)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

BeamVQ: Beam Search with Vector Quantization to Mitigate Data Scarcity in Physical Spatiotemporal Forecasting

Wang, Weiyan, Shi, Xingjian, Shu, Ruiqi, Gao, Yuan, Chen, Rui Ray, Wang, Kun, Xu, Fan, Xue, Jinbao, Li, Shuaipeng, Tao, Yangyu, Wang, Di, Wu, Hao, Huang, Xiaomeng

arXiv.org Artificial IntelligenceFeb-26-2025

In practice, physical spatiotemporal forecasting can suffer from data scarcity, because collecting large-scale data is non-trivial, especially for extreme events. Hence, we propose \method{}, a novel probabilistic framework to realize iterative self-training with new self-ensemble strategies, achieving better physical consistency and generalization on extreme events. Following any base forecasting model, we can encode its deterministic outputs into a latent space and retrieve multiple codebook entries to generate probabilistic outputs. Then BeamVQ extends the beam search from discrete spaces to the continuous state spaces in this field. We can further employ domain-specific metrics (e.g., Critical Success Index for extreme events) to filter out the top-k candidates and develop the new self-ensemble strategy by combining the high-quality candidates. The self-ensemble can not only improve the inference quality and robustness but also iteratively augment the training datasets during continuous self-training. Consequently, BeamVQ realizes the exploration of rare but critical phenomena beyond the original dataset. Comprehensive experiments on different benchmarks and backbones show that BeamVQ consistently reduces forecasting MSE (up to 39%), enhancing extreme events detection and proving its effectiveness in handling data scarcity.

artificial intelligence, beamvq, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2502.18925

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.73)

Add feedback

Weighted Low-rank Approximation via Stochastic Gradient Descent on Manifolds

Xu, Conglong, Yang, Peiqi, Wu, Hao

arXiv.org Machine LearningFeb-19-2025

We solve a regularized weighted low-rank approximation problem by a stochastic gradient descent on a manifold. To guarantee the convergence of our stochastic gradient descent, we establish a convergence theorem on manifolds for retraction-based stochastic gradient descents admitting confinements. On sample data from the Netflix Prize training dataset, our algorithm outperforms the existing stochastic gradient descent on Euclidean spaces. We also compare the accelerated line search on this manifold to the existing accelerated line search on Euclidean spaces.

artificial intelligence, equation, machine learning, (15 more...)

arXiv.org Machine Learning

2502.14174

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Industry:

Media > Film (0.34)
Leisure & Entertainment (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Benchmarking Large Language Models via Random Variables

Hong, Zijin, Wu, Hao, Dong, Su, Dong, Junnan, Xiao, Yilin, Zhang, Yujing, Wang, Zhu, Huang, Feiran, Li, Linyi, Yang, Hongxia, Huang, Xiao

arXiv.org Artificial IntelligenceFeb-17-2025

Recent studies have raised concerns about the reliability of current mathematical benchmarks, highlighting issues such as simplistic design and potential data contamination. Therefore, creating a reliable benchmark that effectively evaluates the genuine capabilities of large language models (LLMs) in mathematical reasoning remains a significant challenge. To address this, we propose RV-Bench, a framework for Benchmarking LLMs via Random Variables in mathematical reasoning. Specifically, the background content of a random variable question (RV question) mirrors the original problem in existing benchmarks, but the variable combinations are randomized, making it "unseen" by the LLMs. Models must completely understand the question pattern of the original problem to correctly answer RV questions with various variable values. As a result, the LLM's genuine capability in mathematical reasoning is reflected by its accuracy and robustness on RV-Bench. We conducted extensive experiments on over 30 representative LLMs across more than 1000 RV questions. Our findings suggest that LLMs exhibit an imbalance in proficiency between encountered and "unseen" data domains. Proficiency generalization across similar mathematical reasoning tasks is verified to be limited by accuracy and robustness, but it can still be enhanced through test-time scaling.

artificial intelligence, large language model, random variable, (1 more...)

arXiv.org Artificial Intelligence

2501.1179

Genre: Research Report > New Finding (0.53)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback