AITopics

2110.11945

Country:

North America > United States (0.14)
Asia > China (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Mar-29-2021

Cloud2Curve: Generation and Vectorization of Parametric Sketches

Das, Ayan, Yang, Yongxin, Hospedales, Timothy, Xiang, Tao, Song, Yi-Zhe

Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations. We further aim to model sketches as a sequence of low-dimensional parametric curves. To this end, we propose an inverse graphics framework capable of approximating a raster or waypoint based stroke encoded as a point-cloud with a variable-degree Bézier curve. Building on this module, we present Cloud2Curve, a generative model for scalable high-resolution vector sketches that can be trained end-to-end using point-cloud data alone. As a consequence, our model is also capable of deterministic vectorization which can map novel raster or waypoint based sketches to their corresponding high-resolution scalable Bézier equivalent. We evaluate the generation and vectorization capabilities of our model on Quick, Draw! and K-MNIST datasets.
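As a rough illustration of the "variable-degree Bézier curve" the abstract refers to, the sketch below evaluates a Bézier curve of any degree with De Casteljau's algorithm. It is not the Cloud2Curve model itself; the function names `bezier_point` and `sample_curve` and the example control points are assumptions made for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bezier_point(control_points: List[Point], t: float) -> Point:
    """Evaluate a Bezier curve at parameter t in [0, 1].

    The degree is len(control_points) - 1, so the same function handles
    linear, quadratic, cubic, and higher-degree curves.
    Uses De Casteljau's algorithm: repeated linear interpolation between
    neighbouring points until a single point remains.
    """
    if not control_points:
        raise ValueError("need at least one control point")
    pts = list(control_points)
    while len(pts) > 1:
        pts = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(pts, pts[1:])
        ]
    return pts[0]


def sample_curve(control_points: List[Point], n: int = 5) -> List[Point]:
    """Return n evenly spaced (in t) points along the curve, n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [bezier_point(control_points, i / (n - 1)) for i in range(n)]


# Hypothetical quadratic stroke: three control points give degree 2.
stroke = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
midpoint = bezier_point(stroke, 0.5)
```

Because the curve is defined by a handful of control points rather than pixels, it can be resampled at any resolution, which is the sense in which such a representation is scalable.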

deep learning, neural network, sketch, (20 more...)

2103.15536

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Mar-3-2021

Domain Generalization: A Survey

Zhou, Kaiyang, Liu, Ziwei, Qiao, Yu, Xiang, Tao, Loy, Chen Change

Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce. This is because most statistical learning algorithms strongly rely on the i.i.d. assumption while in practice the target data often come from a different distribution than the source data, known as domain shift. Domain generalization (DG) aims to achieve OOD generalization by only using source domain data for model learning. Since first introduced in 2011, research in DG has undergone a decade of progress. Ten years of research in this topic have led to a broad spectrum of methodologies, e.g., based on domain alignment, meta-learning, data augmentation, or ensemble learning, just to name a few; and have covered various applications such as object recognition, segmentation, action recognition, and person re-identification. In this paper, for the first time, a comprehensive literature review is provided to summarize the ten-year development in DG. First, we cover the background by giving the problem definitions and discussing how DG is related to other fields like domain adaptation and transfer learning. Second, we conduct a thorough review of existing methods and present a taxonomy based on their methodologies and motivations. Finally, we conclude this survey with potential research directions.

deep learning, generalization, neural network, (19 more...)

2103.02503

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Information Technology (0.92)
Health & Medicine > Diagnostic Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, Sep-18-2020

The Hidden Vulnerability of Watermarking for Deep Neural Networks

Guo, Shangwei, Zhang, Tianwei, Qiu, Han, Zeng, Yi, Xiang, Tao, Liu, Yang

Watermarking has shown its effectiveness in protecting the intellectual property of Deep Neural Networks (DNNs). Existing techniques usually embed a set of carefully-crafted sample-label pairs into the target model during the training process. Then ownership verification is performed by querying a suspicious model with those watermark samples and checking the prediction results. These watermarking solutions claim to be robust against model transformations, which is challenged by this paper. We design a novel watermark removal attack, which can defeat state-of-the-art solutions without any prior knowledge of the adopted watermarking technique and training samples. We make two contributions in the design of this attack. First, we propose a novel preprocessing function, which embeds imperceptible patterns and performs spatial-level transformations over the input. This function can make the watermark sample unrecognizable by the watermarked model, while still maintaining the correct prediction results of normal samples. Second, we introduce a fine-tuning strategy using unlabelled and out-of-distribution samples, which can improve the model usability in an efficient manner. Extensive experimental results indicate that our proposed attack can effectively bypass existing watermarking solutions with very high success rates.

deep learning, intellectual property & technology law, watermark, (20 more...)

2009.08697

Country:

Asia (0.46)
North America > United States > California (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

arXiv.org Machine Learning, Feb-12-2020

Few-Shot Learning as Domain Adaptation: Algorithm and Analysis

Guan, Jiechao, Lu, Zhiwu, Xiang, Tao, Wen, Ji-Rong

To recognize the unseen classes with only a few samples, few-shot learning (FSL) uses prior knowledge learned from the seen classes. A major challenge for FSL is that the distribution of the unseen classes is different from that of those seen, resulting in poor generalization even when a model is meta-trained on the seen classes. This class-difference-caused distribution shift can be considered as a special case of domain shift. In this paper, for the first time, we propose a domain adaptation prototypical network with attention (DAPNA) to explicitly tackle such a domain shift problem in a meta-learning framework. Specifically, armed with a set transformer based attention module, we construct each episode with two sub-episodes without class overlap on the seen classes to simulate the domain shift between the seen and unseen classes. To align the feature distributions of the two sub-episodes with limited training samples, a feature transfer network is employed together with a margin disparity discrepancy (MDD) loss. Importantly, theoretical analysis is provided to give the learning bound of our DAPNA. Extensive experiments show that our DAPNA outperforms the state-of-the-art FSL alternatives, often by significant margins.

deep learning, few-shot learning, neural network, (18 more...)

2002.0205

Country:

Asia (0.28)
Europe > United Kingdom (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, Jan-8-2019

Tree Tensor Networks for Generative Modeling

Cheng, Song, Wang, Lei, Xiang, Tao, Zhang, Pan

The matrix product state (MPS), a tensor network designed for one-dimensional quantum systems, has recently been proposed for generative modeling of natural data (such as images) in the form of a "Born machine". However, the exponential decay of correlation in MPS restricts its representation power heavily for modeling complex data such as natural images. In this work, we push forward the effort of applying tensor networks to machine learning by employing the Tree Tensor Network (TTN) which exhibits balanced performance in expressibility and efficient training and sampling. We design the tree tensor network to utilize the 2-dimensional prior of the natural images and develop sweeping learning and sampling algorithms which can be efficiently implemented utilizing Graphics Processing Units (GPUs). We apply our model to random binary patterns and the binary MNIST dataset of handwritten digits. We show that TTN is superior to MPS for generative modeling in keeping correlation of pixels in natural images, as well as giving better log-likelihood scores in standard datasets of handwritten digits. We also compare its performance with state-of-the-art generative models such as the Variational AutoEncoders, Restricted Boltzmann machines, and PixelCNN. Finally, we discuss the future development of Tensor Network States in machine learning problems.
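To make the "Born machine" idea concrete, here is a minimal sketch, assuming real-valued tensors and binary data: an MPS assigns each bit string an amplitude (a product of small matrices, one per site), and the probability is the squared amplitude divided by a normalizer. This illustrates the MPS baseline mentioned in the abstract, not the paper's tree tensor network; the helper names and the brute-force normalization are assumptions chosen for readability and only work for a few sites.

```python
from itertools import product


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def amplitude(tensors, bits):
    """Amplitude of one bit string.

    tensors[i][b] is the matrix for site i when that site holds bit b.
    The first site's matrices have one row and the last site's have one
    column, so the full product is a 1x1 matrix.
    """
    m = tensors[0][bits[0]]
    for site, bit in zip(tensors[1:], bits[1:]):
        m = matmul(m, site[bit])
    return m[0][0]


def born_probabilities(tensors):
    """Born rule: p(x) = amplitude(x)^2 / Z, with Z summed by brute force."""
    configs = list(product((0, 1), repeat=len(tensors)))
    weights = [amplitude(tensors, c) ** 2 for c in configs]
    z = sum(weights)
    return {c: w / z for c, w in zip(configs, weights)}


# Hypothetical two-site example with bond dimension 2: the two bits are
# perfectly correlated, so only (0, 0) and (1, 1) get nonzero probability.
correlated = [
    [[[1.0, 0.0]], [[0.0, 1.0]]],      # site 0: 1x2 matrices for bit 0 / bit 1
    [[[1.0], [0.0]], [[0.0], [1.0]]],  # site 1: 2x1 matrices for bit 0 / bit 1
]
probs = born_probabilities(correlated)
```

Practical implementations avoid the exponential sum by contracting the network site by site; the brute-force version is used here only to keep the definition visible.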

deep learning, neural network, tensor, (18 more...)

1901.02217

Country: Asia > China (0.15)

Genre: Research Report (0.50)

Industry: Education (0.34)

Neural Information Processing Systems, Dec-31-2018

Domain-Invariant Projection Learning for Zero-Shot Recognition

Zhao, An, Ding, Mingyu, Guan, Jiechao, Lu, Zhiwu, Xiang, Tao, Wen, Ji-Rong

Zero-shot learning (ZSL) aims to recognize unseen object classes without any training samples, which can be regarded as a form of transfer learning from seen classes to unseen ones. This is made possible by learning a projection between a feature space and a semantic space (e.g., attribute space). Key to ZSL is thus to learn a projection function that is robust against the often large domain gap between the seen and unseen classes. In this paper, we propose a novel ZSL model termed domain-invariant projection learning (DIPL). Our model has two novel components: (1) A domain-invariant feature self-reconstruction task is introduced to the seen/unseen class data, resulting in a simple linear formulation that casts ZSL into a min-min optimization problem. Solving the problem is non-trivial, and a novel iterative algorithm is formulated as the solver, with rigorous theoretical analysis of the algorithm provided. (2) To further align the two domains via the learned projection, shared semantic structure among seen and unseen classes is explored via forming superclasses in the semantic space. Extensive experiments show that our model outperforms the state-of-the-art alternatives by significant margins.
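The generic projection-based ZSL recipe described at the start of the abstract can be sketched as follows: map an image feature into the semantic space with a linear projection, then pick the unseen class whose attribute vector is closest. This is a minimal sketch of that general idea, not the DIPL solver; the matrix `W`, the class names, and the attribute vectors are hypothetical, and cosine similarity is an assumed choice of metric.

```python
import math


def project(W, x):
    """Apply a linear projection W (list of rows) to feature vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0


def zero_shot_predict(W, x, class_attributes):
    """Return the class whose attribute vector best matches the projection of x."""
    s = project(W, x)
    return max(class_attributes, key=lambda name: cosine(s, class_attributes[name]))


# Hypothetical setup: 3-d visual features, 2-d attribute space.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
unseen = {"zebra": [1.0, 0.0], "whale": [0.0, 1.0]}
label = zero_shot_predict(W, [0.9, 0.1, 5.0], unseen)
```

In practice `W` would be learned on seen classes only, which is why the domain gap between seen and unseen classes matters for this step.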

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > China (0.14)
North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
(2 more...)

arXiv.org Artificial Intelligence, Aug-7-2018

SketchyScene: Richly-Annotated Scene Sketches

Zou, Changqing, Yu, Qian, Du, Ruofei, Mo, Haoran, Song, Yi-Zhe, Xiang, Tao, Gao, Chengying, Chen, Baoquan, Zhang, Hao

We contribute the first large-scale dataset of scene sketches, SketchyScene, with the goal of advancing research on sketch understanding at both the object and scene level. The dataset is created through a novel and carefully designed crowdsourcing pipeline, enabling users to efficiently generate large quantities of realistic and diverse scene sketches. SketchyScene contains more than 29,000 scene-level sketches, 7,000+ pairs of scene templates and photos, and 11,000+ object sketches. All objects in the scene sketches have ground-truth semantic and instance masks. The dataset is also highly scalable and extensible, making it easy to augment and/or change scene compositions. We demonstrate the potential impact of SketchyScene by training new computational models for semantic segmentation of scene sketches and showing how the new dataset enables several applications including image retrieval, sketch colorization, editing, and captioning. The dataset and code can be found at https://github.com/SketchyScene/SketchyScene.

crowdsourcing, neural network, sketch, (23 more...)

1808.02473

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.46)

arXiv.org Machine Learning, Feb-13-2018

Pose-Normalized Image Generation for Person Re-identification

Qian, Xuelin, Fu, Yanwei, Wang, Wenxuan, Xiang, Tao, Wu, Yang, Jiang, Yu-Gang, Xue, Xiangyang

Person Re-identification (re-id) faces two major challenges: the lack of cross-view paired training data and learning discriminative identity-sensitive and view-invariant features in the presence of large pose variations. In this work, we address both problems by proposing a novel deep person image generation model for synthesizing realistic person images conditional on pose. The model is based on a generative adversarial network (GAN) and used specifically for pose normalization in re-id, thus termed pose-normalization GAN (PN-GAN). With the synthesized images, we can learn a new type of deep re-id feature free of the influence of pose variations. We show that this feature is strong on its own and highly complementary to features learned with the original images. Importantly, we now have a model that generalizes to any new re-id dataset without the need for collecting any training data for model fine-tuning, thus making a deep re-id model truly scalable. Extensive experiments on five benchmarks show that our model outperforms the state-of-the-art models, often significantly. In particular, the features learned on Market-1501 can achieve a Rank-1 accuracy of 68.67% on VIPeR without any model fine-tuning, beating almost all existing models fine-tuned on the dataset.
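For readers unfamiliar with the Rank-1 accuracy quoted above, the sketch below shows the basic computation: for each query image, find the nearest gallery image by feature distance and count a hit if the identities match. It is a simplified illustration under stated assumptions, not the evaluation protocol of the paper; real re-id benchmarks typically add rules such as excluding same-camera matches, and the function name and toy distances here are hypothetical.

```python
def rank1_accuracy(dist, query_ids, gallery_ids):
    """Fraction of queries whose nearest gallery item has the same identity.

    dist[i][j] is the distance between query i and gallery item j
    (smaller means more similar).
    """
    if not query_ids:
        raise ValueError("need at least one query")
    hits = 0
    for row, qid in zip(dist, query_ids):
        # Index of the closest gallery item for this query.
        best = min(range(len(row)), key=row.__getitem__)
        if gallery_ids[best] == qid:
            hits += 1
    return hits / len(query_ids)


# Toy example: three queries against a two-item gallery.
dist = [
    [0.1, 0.9],  # query "a": nearest is gallery item 0 ("a"), a hit
    [0.8, 0.3],  # query "b": nearest is gallery item 1 ("b"), a hit
    [0.2, 0.7],  # query "b": nearest is gallery item 0 ("a"), a miss
]
score = rank1_accuracy(dist, ["a", "b", "b"], ["a", "b"])
```

In this toy case two of the three queries are matched correctly, so the score is 2/3.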

person re-identification, pose-normalized image generation

1712.02225

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)