analysis and machine intelligence
Generative AI for Character Animation: A Comprehensive Survey of Techniques, Applications, and Future Directions
Abootorabi, Mohammad Mahdi, Ghahroodi, Omid, Zahraei, Pardis Sadat, Behzadasl, Hossein, Mirrokni, Alireza, Salimipanah, Mobina, Rasouli, Arash, Behzadipour, Bahar, Azarnoush, Sara, Maleki, Benyamin, Sadraiye, Erfan, Feriz, Kiarash Kiani, Nahad, Mahdi Teymouri, Moghadasi, Ali, Abianeh, Abolfazl Eshagh, Nazar, Nizi, Rabiee, Hamid R., Baghshah, Mahdieh Soleymani, Ahmadi, Meisam, Asgari, Ehsaneddin
Generative AI is reshaping art, gaming, and most notably animation. Recent breakthroughs in foundation and diffusion models have reduced the time and cost of producing animated content. Characters are central animation components, involving motion, emotions, gestures, and facial expressions. The pace and breadth of advances in recent months make it difficult to maintain a coherent view of the field, motivating the need for an integrative review. Unlike earlier overviews that treat avatars, gestures, or facial animation in isolation, this survey offers a single, comprehensive perspective on all the main generative AI applications for character animation. We begin by examining the state-of-the-art in facial animation, expression rendering, image synthesis, avatar creation, gesture modeling, motion synthesis, object generation, and texture synthesis. We highlight leading research, practical deployments, commonly used datasets, and emerging trends for each area. To support newcomers, we also provide a comprehensive background section that introduces foundational models and evaluation metrics, equipping readers with the knowledge needed to enter the field. We discuss open challenges and map future research directions, providing a roadmap to advance AI-driven character-animation technologies. This survey is intended as a resource for researchers and developers entering the field of generative AI animation or adjacent fields. Resources are available at: https://github.com/llm-lab-org/Generative-AI-for-Character-Animation-Survey.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
- Europe > Switzerland > Zürich > Zürich (0.13)
- North America > United States > New York > New York County > New York City (0.04)
- (14 more...)
- Research Report > Promising Solution (1.00)
- Overview (1.00)
- Media (1.00)
- Information Technology > Security & Privacy (1.00)
- Education (1.00)
- (3 more...)
SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks
Improving the efficiency of state-of-the-art methods in semantic segmentation requires overcoming the increasing computational cost as well as issues such as fusing semantic information from global and local contexts. Based on the recent success and problems that convolutional neural networks (CNNs) encounter in semantic segmentation, this research proposes an encoder-decoder architecture with a unique efficient residual network. Attention-boosting gates (AbGs) and attention-boosting modules (AbMs) are deployed by aiming to fuse the feature-based semantic information with the global context of the efficient residual network in the encoder. Respectively, the decoder network is developed with the additional attention-fusion networks (AfNs) inspired by AbM. AfNs are designed to improve the efficiency in the one-to-one conversion of the semantic information by deploying additional convolution layers in the decoder part. Our network is tested on the challenging CamVid and Cityscapes datasets, and the proposed methods reveal significant improvements on the existing baselines, such as ResNet-50. To the best of our knowledge, the developed network, SERNet-Former, achieves state-of-the-art results (84.62 % mean IoU) on CamVid dataset and challenging results (87.35 % mean IoU) on Cityscapes validation dataset.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
HPCR: Holistic Proxy-based Contrastive Replay for Online Continual Learning
Lin, Huiwei, Feng, Shanshan, Zhang, Baoquan, Li, Xutao, Ong, Yew-soon, Ye, Yunming
Online continual learning (OCL) aims to continuously learn new data from a single pass over the online data stream. It generally suffers from the catastrophic forgetting issue. Existing replay-based methods effectively alleviate this issue by replaying part of old data in a proxy-based or contrastive-based replay manner. In this paper, we conduct a comprehensive analysis of these two replay manners and find they can be complementary. Inspired by this finding, we propose a novel replay-based method called proxy-based contrastive replay (PCR), which replaces anchor-to-sample pairs with anchor-to-proxy pairs in the contrastive-based loss to alleviate the phenomenon of forgetting. Based on PCR, we further develop a more advanced method named holistic proxy-based contrastive replay (HPCR), which consists of three components. The contrastive component conditionally incorporates anchor-to-sample pairs to PCR, learning more fine-grained semantic information with a large training batch. The second is a temperature component that decouples the temperature coefficient into two parts based on their impacts on the gradient and sets different values for them to learn more novel knowledge. The third is a distillation component that constrains the learning process to keep more historical knowledge. Experiments on four datasets consistently demonstrate the superiority of HPCR over various state-of-the-art methods.
- Asia > China > Heilongjiang Province > Harbin (0.05)
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > Singapore (0.04)
- (3 more...)
- Research Report (1.00)
- Instructional Material > Online (0.71)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Low-Rank Tensor Function Representation for Multi-Dimensional Data Recovery
Luo, Yisi, Zhao, Xile, Li, Zhemin, Ng, Michael K., Meng, Deyu
Since higher-order tensors are naturally suitable for representing multi-dimensional data in real-world, e.g., color images and videos, low-rank tensor representation has become one of the emerging areas in machine learning and computer vision. However, classical low-rank tensor representations can only represent data on finite meshgrid due to their intrinsical discrete nature, which hinders their potential applicability in many scenarios beyond meshgrid. To break this barrier, we propose a low-rank tensor function representation (LRTFR), which can continuously represent data beyond meshgrid with infinite resolution. Specifically, the suggested tensor function, which maps an arbitrary coordinate to the corresponding value, can continuously represent data in an infinite real space. Parallel to discrete tensors, we develop two fundamental concepts for tensor functions, i.e., the tensor function rank and low-rank tensor function factorization. We theoretically justify that both low-rank and smooth regularizations are harmoniously unified in the LRTFR, which leads to high effectiveness and efficiency for data continuous representation. Extensive multi-dimensional data recovery applications arising from image processing (image inpainting and denoising), machine learning (hyperparameter optimization), and computer graphics (point cloud upsampling) substantiate the superiority and versatility of our method as compared with state-of-the-art methods. Especially, the experiments beyond the original meshgrid resolution (hyperparameter optimization) or even beyond meshgrid (point cloud upsampling) validate the favorable performances of our method for continuous representation.
- Asia > China > Shaanxi Province > Xi'an (0.04)
- Asia > China > Hong Kong (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
CPnP: Consistent Pose Estimator for Perspective-n-Point Problem with Bias Elimination
Zeng, Guangyang, Chen, Shiyu, Mu, Biqiang, Shi, Guodong, Wu, Junfeng
The Perspective-n-Point (PnP) problem has been widely studied in both computer vision and photogrammetry societies. With the development of feature extraction techniques, a large number of feature points might be available in a single shot. It is promising to devise a consistent estimator, i.e., the estimate can converge to the true camera pose as the number of points increases. To this end, we propose a consistent PnP solver, named \emph{CPnP}, with bias elimination. Specifically, linear equations are constructed from the original projection model via measurement model modification and variable elimination, based on which a closed-form least-squares solution is obtained. We then analyze and subtract the asymptotic bias of this solution, resulting in a consistent estimate. Additionally, Gauss-Newton (GN) iterations are executed to refine the consistent solution. Our proposed estimator is efficient in terms of computations -- it has $O(n)$ computational complexity. Experimental tests on both synthetic data and real images show that our proposed estimator is superior to some well-known ones for images with dense visual features, in terms of estimation precision and computing time.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
Differentially private subspace clustering
Wang, Yining, Wang, Yu-Xiang, Singh, Aarti
Subspace clustering is an unsupervised learning problem that aims at grouping data points into multiple ``clusters'' so that data points in a single cluster lie approximately on a low-dimensional linear subspace. It is originally motivated by 3D motion segmentation in computer vision, but has recently been generically applied to a wide range of statistical machine learning problems, which often involves sensitive datasets about human subjects. This raises a dire concern for data privacy. In this work, we build on the framework of ``differential privacy'' and present two provably private subspace clustering algorithms. We demonstrate via both theory and experiments that one of the presented methods enjoys formal privacy and utility guarantees; the other one asymptotically preserves differential privacy while having good performance in practice. Along the course of the proof, we also obtain two new provable guarantees for the agnostic subspace clustering and the graph connectivity problem which might be of independent interests.
- Asia > Singapore (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Montana (0.04)
- Information Technology > Security & Privacy (1.00)
- Education > Focused Education > Special Education (0.44)
Constructing a Non-Negative Low Rank and Sparse Graph with Data-Adaptive Features
Zhuang, Liansheng, Gao, Shenghua, Tang, Jinhui, Wang, Jingjing, Lin, Zhouchen, Ma, Yi
This paper aims at constructing a good graph for discovering intrinsic data structures in a semi-supervised learning setting. Firstly, we propose to build a non-negative low-rank and sparse (referred to as NNLRS) graph for the given data representation. Specifically, the weights of edges in the graph are obtained by seeking a nonnegative low-rank and sparse matrix that represents each data sample as a linear combination of others. The so-obtained NNLRS-graph can capture both the global mixture of subspaces structure (by the low rankness) and the locally linear structure (by the sparseness) of the data, hence is both generative and discriminative. Secondly, as good features are extremely important for constructing a good graph, we propose to learn the data embedding matrix and construct the graph jointly within one framework, which is termed as NNLRS with embedded features (referred to as NNLRS-EF). Extensive experiments on three publicly available datasets demonstrate that the proposed method outperforms the state-of-the-art graph construction method by a large margin for both semi-supervised classification and discriminative analysis, which verifies the effectiveness of our proposed method.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- (3 more...)
One-shot learning by inverting a compositional causal process
Lake, Brenden M., Salakhutdinov, Ruslan R., Tenenbaum, Josh
People can learn a new visual class from just one example, yet machine learning algorithms typically require hundreds or thousands of examples to tackle the same problems. Here we present a Hierarchical Bayesian model based on compositionality and causality that can learn a wide range of natural (although simple) visual concepts, generalizing in human-like ways from just one image. We evaluated performance on a challenging one-shot classification task, where our model achieved a human-level error rate while substantially outperforming two deep learning models. We also used a visual Turing test" to show that our model produces human-like performance on other conceptual tasks, including generating new examples and parsing."
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > New York (0.04)