AITopics

2502.14645

Country: Asia > Thailand (0.14)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceOct-21-2024

Hierarchical Search-Based Cooperative Motion Planning

Wu, Yuchen, Yang, Yifan, Xu, Gang, Cao, Junjie, Chen, Yansong, Wen, Licheng, Liu, Yong

Cooperative path planning, a crucial aspect of multi-agent systems research, serves a variety of sectors, including military, agriculture, and industry. Many existing algorithms, however, come with certain limitations, such as simplified kinematic models and inadequate support for multiple group scenarios. Focusing on the planning problem associated with a nonholonomic Ackermann model for Unmanned Ground Vehicles (UGV), we propose a leaderless, hierarchical Search-Based Cooperative Motion Planning (SCMP) method. The high-level utilizes a binary conflict search tree to minimize runtime, while the low-level fabricates kinematically feasible, collision-free paths that are shape-constrained. Our algorithm can adapt to scenarios featuring multiple groups with different shapes, outlier agents, and elaborate obstacles. We conduct algorithm comparisons, performance testing, simulation, and real-world testing, verifying the effectiveness and applicability of our algorithm. The implementation of our method will be open-sourced at https://github.com/WYCUniverStar/SCMP.

agent, algorithm, artificial intelligence, (17 more...)

2410.1571

Country: Asia > China (0.28)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)

arXiv.org Machine LearningOct-7-2024

Stochastic Runge-Kutta Methods: Provable Acceleration of Diffusion Models

Wu, Yuchen, Chen, Yuxin, Wei, Yuting

Diffusion models play a pivotal role in contemporary generative modeling, claiming state-of-the-art performance across various domains. Despite their superior sample quality, mainstream diffusion-based stochastic samplers like DDPM often require a large number of score function evaluations, incurring considerably higher computational cost compared to single-step generators like generative adversarial networks. While several acceleration methods have been proposed in practice, the theoretical foundations for accelerating diffusion models remain underexplored. In this paper, we propose and analyze a training-free acceleration algorithm for SDE-style diffusion samplers, based on the stochastic Runge-Kutta method. The proposed sampler provably attains $\varepsilon^2$ error -- measured in KL divergence -- using $\widetilde O(d^{3/2} / \varepsilon)$ score function evaluations (for sufficiently small $\varepsilon$), strengthening the state-of-the-art guarantees $\widetilde O(d^{3} / \varepsilon)$ in terms of dimensional dependency. Numerical experiments validate the efficiency of the proposed method.

artificial intelligence, diffusion model, machine learning, (14 more...)

2410.0476

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceApr-23-2024

Optimizing Multi-Touch Textile and Tactile Skin Sensing Through Circuit Parameter Estimation

Su, Bo Ying, Wu, Yuchen, Wen, Chengtao, Liu, Changliu

Tactile and textile skin technologies have become increasingly important for enhancing human-robot interaction and allowing robots to adapt to different environments. Despite notable advancements, there are ongoing challenges in skin signal processing, particularly in achieving both accuracy and speed in dynamic touch sensing. This paper introduces a new framework that poses the touch sensing problem as an estimation problem of resistive sensory arrays. Utilizing a Regularized Least Squares objective function which estimates the resistance distribution of the skin. We enhance the touch sensing accuracy and mitigate the ghosting effects, where false or misleading touches may be registered. Furthermore, our study presents a streamlined skin design that simplifies manufacturing processes without sacrificing performance. Experimental outcomes substantiate the effectiveness of our method, showing 26.9% improvement in multi-touch force-sensing accuracy for the tactile skin.

artificial intelligence, machine learning, resistance, (18 more...)

2404.15131

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (0.50)

Industry: Materials (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.84)

arXiv.org Machine LearningMar-3-2024

Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models

Wu, Yuchen, Chen, Minshuo, Li, Zihao, Wang, Mengdi, Wei, Yuting

Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties. Such information is coined as guidance. For example, in text-to-image synthesis, text input is encoded as guidance to generate semantically aligned images. Proper guidance inputs are closely tied to the performance of diffusion models. A common observation is that strong guidance promotes a tight alignment to the task-specific information, while reducing the diversity of the generated samples. In this paper, we provide the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models. Under mild conditions, we prove that incorporating diffusion guidance not only boosts classification confidence but also diminishes distribution diversity, leading to a reduction in the differential entropy of the output distribution. Our analysis covers the widely adopted sampling schemes including DDPM and DDIM, and leverages comparison inequalities for differential equations as well as the Fokker-Planck equation that characterizes the evolution of probability density function, which may be of independent theoretical interest.

artificial intelligence, machine learning, natural language, (15 more...)

2403.01639

Country: North America > United States (0.14)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine LearningFeb-26-2024

Failures and Successes of Cross-Validation for Early-Stopped Gradient Descent

Patil, Pratik, Wu, Yuchen, Tibshirani, Ryan J.

We analyze the statistical properties of generalized cross-validation (GCV) and leave-one-out cross-validation (LOOCV) applied to early-stopped gradient descent (GD) in high-dimensional least squares regression. We prove that GCV is generically inconsistent as an estimator of the prediction risk of early-stopped GD, even for a well-specified linear model with isotropic features. In contrast, we show that LOOCV converges uniformly along the GD trajectory to the prediction risk. Our theory requires only mild assumptions on the data distribution and does not require the underlying regression function to be linear. Furthermore, by leveraging the individual LOOCV errors, we construct consistent estimators for the entire prediction error distribution along the GD trajectory and consistent estimators for a wide class of error functionals. This in particular enables the construction of pathwise prediction intervals based on GD iterates that have asymptotically correct nominal coverage conditional on the training data.

artificial intelligence, exp, machine learning, (16 more...)

2402.16793

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.81)

arXiv.org Artificial IntelligenceDec-28-2023

ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation

Xu, Jiazheng, Liu, Xiao, Wu, Yuchen, Tong, Yuxuan, Li, Qinkai, Ding, Ming, Tang, Jie, Dong, Yuxiao

We present a comprehensive solution to learn and improve text-to-image models from human preference feedback. To begin with, we build ImageReward -- the first general-purpose text-to-image human preference reward model -- to effectively encode human preferences. Its training is based on our systematic annotation pipeline including rating and ranking, which collects 137k expert comparisons to date. In human evaluation, ImageReward outperforms existing scoring models and metrics, making it a promising automatic metric for evaluating text-to-image synthesis. On top of it, we propose Reward Feedback Learning (ReFL), a direct tuning algorithm to optimize diffusion models against a scorer. Both automatic and human evaluation support ReFL's advantages over compared methods. All code and datasets are provided at \url{https://github.com/THUDM/ImageReward}.

artificial intelligence, imagereward, machine learning, (17 more...)

2304.05977

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.50)

Industry: Media > Photography (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningSep-20-2023

Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models

Mei, Song, Wu, Yuchen

We investigate the approximation efficiency of score functions by deep neural networks in diffusion-based generative modeling. While existing approximation theories utilize the smoothness of score functions, they suffer from the curse of dimensionality for intrinsically high-dimensional data. This limitation is pronounced in graphical models such as Markov random fields, common for image distributions, where the approximation efficiency of score functions remains unestablished. To address this, we observe score functions can often be well-approximated in graphical models through variational inference denoising algorithms. Furthermore, these algorithms are amenable to efficient neural network representation. We demonstrate this in examples of graphical models, including Ising models, conditional Ising models, restricted Boltzmann machines, and sparse encoding models. Combined with off-the-shelf discretization error bounds for diffusion-based sampling, we provide an efficient sample complexity bound for diffusion-based generative modeling when the score function is learned by deep neural networks.

approximation, artificial intelligence, machine learning, (18 more...)

2309.1142

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Artificial IntelligenceJun-8-2023

Are We Ready for Radar to Replace Lidar in All-Weather Mapping and Localization?

Burnett, Keenan, Wu, Yuchen, Yoon, David J., Schoellig, Angela P., Barfoot, Timothy D.

We present an extensive comparison between three topometric localization systems: radar-only, lidar-only, and a cross-modal radar-to-lidar system across varying seasonal and weather conditions using the Boreas dataset. Contrary to our expectations, our experiments showed that our lidar-only pipeline achieved the best localization accuracy even during a snowstorm. Our results seem to suggest that the sensitivity of lidar localization to moderate precipitation has been exaggerated in prior works. However, our radar-only pipeline was able to achieve competitive accuracy with a much smaller map. Furthermore, radar localization and radar sensors still have room to improve and may yet prove valuable in extreme weather or as a redundant backup system. Code for this project can be found at: https://github.com/utiasASRL/vtr3

artificial intelligence, localization, radar, (16 more...)

2203.10174

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Ground > Road (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (0.70)

arXiv.org Artificial IntelligenceJan-26-2023

Boreas: A Multi-Season Autonomous Driving Dataset

Burnett, Keenan, Yoon, David J., Wu, Yuchen, Li, Andrew Zou, Zhang, Haowei, Lu, Shichen, Qian, Jingxing, Tseng, Wei-Kang, Lambert, Andrew, Leung, Keith Y. K., Schoellig, Angela P., Barfoot, Timothy D.

Lidar To date, autonomous vehicle research and development has focused on achieving sufficient reliability in ideal conditions GNSS/IMU Camera such as the sunny climates observed in San Francisco, California or Phoenix, Arizona. Adverse weather conditions such as rain and snow remain outside the operational envelope for many of these systems. Additionally, a majority of self-driving vehicles are currently reliant on highlyaccurate maps for both localization and perception. These maps are costly to maintain and may degrade as a result of seasonal changes. In order for self-driving vehicles to be deployed safely, these short-comings must be addressed. To encourage research in this area, we have created the Boreas dataset, a large multi-modal dataset collected by driving a repeated route over the course of one year.

artificial intelligence, dataset, sequence, (17 more...)

2203.10168

Country:

North America > United States > California > San Francisco County > San Francisco (0.54)
North America > United States > Arizona (0.54)

Genre: Research Report (0.40)

Industry:

Transportation > Ground > Road (0.65)
Automobiles & Trucks (0.51)
Information Technology > Robotics & Automation (0.41)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)