AITopics | Zhu, Zhenyao

Collaborating Authors

Zhu, Zhenyao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Multi-View Perceptron: a Deep Model for Learning Face Identity and View Representations

Zhu, Zhenyao, Luo, Ping, Wang, Xiaogang, Tang, Xiaoou

Neural Information Processing SystemsFeb-14-2020, 05:11:37 GMT

artificial intelligence, identity and view representation, neural network, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.45)

Add feedback

Fully Supervised Speaker Diarization

Zhang, Aonan, Wang, Quan, Zhu, Zhenyao, Paisley, John, Wang, Chong

arXiv.org Machine LearningOct-27-2018

In this paper, we propose a fully supervised speaker diarization approach, named unbounded interleaved-state recurrent neural networks (UIS-RNN). Given extracted speaker-discriminative embeddings (a.k.a. d-vectors) from input utterances, each individual speaker is modeled by a parameter-sharing RNN, while the RNN states for different speakers interleave in the time domain. This RNN is naturally integrated with a distance-dependent Chinese restaurant process (ddCRP) to accommodate an unknown number of speakers. Our system is fully supervised and is able to learn from examples where time-stamped speaker labels are annotated. We achieved a 7.6% diarization error rate on NIST SRE 2000 CALLHOME, which is better than the state-of-the-art method using spectral clustering. Moreover, our method decodes in an online fashion while most state-of-the-art systems rely on offline clustering.

deep learning, neural network, utterance, (18 more...)

arXiv.org Machine Learning

1810.04719

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)

Add feedback

Principled Hybrids of Generative and Discriminative Domain Adaptation

Zhao, Han, Zhu, Zhenyao, Hu, Junjie, Coates, Adam, Gordon, Geoff

arXiv.org Artificial IntelligenceOct-27-2017

We propose a probabilistic framework for domain adaptation that blends both generative and discriminative modeling in a principled way. Under this framework, generative and discriminative models correspond to specific choices of the prior over parameters. This provides us a very general way to interpolate between generative and discriminative extremes through different choices of priors. By maximizing both the marginal and the conditional log-likelihoods, models derived from this framework can use both labeled instances from the source domain as well as unlabeled instances from both source and target domains. Under this framework, we show that the popular reconstruction loss of autoencoder corresponds to an upper bound of the negative marginal log-likelihoods of unlabeled instances, where marginal distributions are given by proper kernel density estimations. This provides a way to interpret the empirical success of autoencoders in domain adaptation and semi-supervised learning. We instantiate our framework using neural networks, and build a concrete model, DAuto. Empirically, we demonstrate the effectiveness of DAuto on text, image and speech datasets, showing that it outperforms related competitors when domain adaptation is possible.

artificial intelligence, deep learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

1705.09011

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California (0.14)

Industry: Government (0.95)

Add feedback

Face Model Compression by Distilling Knowledge from Neurons

Luo, Ping (The Chinese University of Hong Kong) | Zhu, Zhenyao (The Chinese University of Hong Kong) | Liu, Ziwei (The Chinese University of Hong Kong) | Wang, Xiaogang (The Chinese University of Hong Kong) | Tang, Xiaoou (The Chinese University of Hong Kong)

AAAI ConferencesApr-19-2016

The recent advanced face recognition systems werebuilt on large Deep Neural Networks (DNNs) or theirensembles, which have millions of parameters. However, the expensive computation of DNNs make theirdeployment difficult on mobile and embedded devices. This work addresses model compression for face recognition,where the learned knowledge of a large teachernetwork or its ensemble is utilized as supervisionto train a compact student network. Unlike previousworks that represent the knowledge by the soften labelprobabilities, which are difficult to fit, we represent theknowledge by using the neurons at the higher hiddenlayer, which preserve as much information as the label probabilities, but are more compact. By leveragingthe essential characteristics (domain knowledge) of thelearned face representation, a neuron selection methodis proposed to choose neurons that are most relevant toface recognition. Using the selected neurons as supervisionto mimic the single networks of DeepID2+ andDeepID3, which are the state-of-the-art face recognition systems, a compact student with simple network structure achieves better verification accuracy on LFW than its teachers, respectively. When using an ensemble of DeepID2+ as teacher, a mimicked student is able to outperform it and achieves 51.6 times compression ratio and 90 times speed-up in inference, making this cumbersome model applicable on portable devices.

deep learning, neural network, neuron, (20 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.29)
North America > United States > Massachusetts (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Multi-View Perceptron: a Deep Model for Learning Face Identity and View Representations

Zhu, Zhenyao, Luo, Ping, Wang, Xiaogang, Tang, Xiaoou

Neural Information Processing SystemsDec-31-2014

Various factors, such as identities, views (poses), and illuminations, are coupled in face images. Disentangling the identity and view representations is a major challenge in face recognition. Existing face recognition systems either use handcrafted features or learn features discriminatively to improve recognition accuracy. This is different from the behavior of human brain. Intriguingly, even without accessing 3D data, human not only can recognize face identity, but can also imagine face images of a person under different viewpoints given a single 2D image, making face perception in the brain robust to view changes. In this sense, human brain has learned and encoded 3D face models from 2D images. To take into account this instinct, this paper proposes a novel deep neural net, named multi-view perceptron (MVP), which can untangle the identity and view features, and infer a full spectrum of multi-view images in the meanwhile, given a single 2D face image. The identity features of MVP achieve superior performance on the MultiPIE dataset. MVP is also capable to interpolate and predict images under viewpoints that are unobserved in the training data.

deep learning, neural network, representation, (20 more...)

Neural Information Processing Systems

Country: Asia > China (0.29)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback