Huang, Kaizhu
FastAdaBelief: Improving Convergence Rate for Belief-based Adaptive Optimizer by Strong Convexity
Zhou, Yangfan, Huang, Kaizhu, Cheng, Cheng, Wang, Xuguang, Liu, Xin
The AdaBelief algorithm demonstrates generalization ability superior to that of the Adam algorithm by adapting its step sizes according to the deviation of observed gradients from their exponential moving average. AdaBelief is proved to have a data-dependent $O(\sqrt{T})$ regret bound when the objective functions are convex, where $T$ is the time horizon. However, it remains an open problem how to exploit strong convexity to further improve the convergence rate of AdaBelief. To tackle this problem, we present a novel optimization algorithm under strong convexity, called FastAdaBelief. We prove that FastAdaBelief attains a data-dependent $O(\log T)$ regret bound, which is substantially lower than that of AdaBelief. In addition, the theoretical analysis is validated by extensive experiments on open datasets (i.e., CIFAR-10 and Penn Treebank) for image classification and language modeling.
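For readers unfamiliar with the belief-based update that FastAdaBelief builds on, here is a minimal NumPy sketch of a single AdaBelief step. The step-size schedule FastAdaBelief uses to exploit strong convexity is not specified in the abstract, so only the base rule is shown, and all hyperparameter values are illustrative.

```python
import numpy as np

def adabelief_step(theta, grad, m, s, t, lr=1e-3,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaBelief update: the second moment tracks the deviation of the
    gradient from its exponential moving average (the 'belief'), rather
    than the raw squared gradient as in Adam."""
    m = beta1 * m + (1 - beta1) * grad                   # EMA of gradients
    s = beta2 * s + (1 - beta2) * (grad - m) ** 2 + eps  # EMA of squared deviation
    m_hat = m / (1 - beta1 ** t)                         # bias correction
    s_hat = s / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(s_hat) + eps)  # belief-scaled step
    return theta, m, s
```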
Partial Differential Equations is All You Need for Generating Neural Architectures -- A Theory for Physical Artificial Intelligence Systems
Guo, Ping, Huang, Kaizhu, Xu, Zenglin
In this work, we generalize the reaction-diffusion equation in statistical physics, the Schr\"odinger equation in quantum mechanics, and the Helmholtz equation in paraxial optics into a neural partial differential equation (NPDE), which can be considered the fundamental equation in the field of artificial intelligence research. We apply the finite difference method to discretize the NPDE and find numerical solutions, from which the basic building blocks of deep neural network architectures, including multi-layer perceptrons, convolutional neural networks, and recurrent neural networks, are generated. Learning strategies, such as adaptive moment estimation (Adam), L-BFGS, pseudoinverse learning algorithms, and partial-differential-equation-constrained optimization, are also presented. We believe it is significant that this work presents a clear physical picture of interpretable deep neural networks, which makes it possible to apply them to analog computing device design and paves the road towards physical artificial intelligence.
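As a concrete illustration of how discretizing a PDE yields a neural building block, here is a hedged NumPy sketch (not taken from the paper): one explicit finite-difference step of the 1-D diffusion equation, whose Laplacian stencil acts exactly like a fixed convolution kernel.

```python
import numpy as np

def diffusion_step(u, nu=0.1, dx=1.0, dt=0.1):
    """One explicit finite-difference step of u_t = nu * u_xx.
    The stencil [1, -2, 1] is a fixed 1-D convolution kernel, so iterating
    this step resembles a convolutional layer with tied weights.
    (Explicit stepping is stable only if nu * dt / dx**2 <= 0.5.)"""
    kernel = np.array([1.0, -2.0, 1.0]) / dx ** 2
    u_xx = np.convolve(u, kernel, mode="same")  # zero-padded boundaries
    return u + dt * nu * u_xx
```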
Towards Better Forecasting by Fusing Near and Distant Future Visions
Cheng, Jiezhu, Huang, Kaizhu, Zheng, Zibin
Multivariate time series forecasting is an important yet challenging problem in machine learning. Most existing approaches only forecast the series value at a single future moment, ignoring the interactions between predictions at future moments of different temporal distances. Such a deficiency probably prevents the model from obtaining enough information about the future, thus limiting forecasting accuracy. To address this problem, we propose the Multi-Level Construal Neural Network (MLCNN), a novel multi-task deep learning framework. Inspired by the Construal Level Theory of psychology, this model aims to improve predictive performance by fusing forecasting information (i.e., future visions) from different future times. We first use a convolutional neural network to extract multi-level abstract representations of the raw data for near and distant future predictions. We then model the interplay between the multiple predictive tasks and fuse their future visions through a modified encoder-decoder architecture. Finally, we combine a traditional autoregressive model with the neural network to solve the scale insensitivity problem. Experiments on three real-world datasets show that our method achieves statistically significant improvements over state-of-the-art baseline methods, with an average 4.59% reduction in RMSE and an average 6.87% reduction in MAE.
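The fusion of the linear and nonlinear parts can be sketched as follows; this is an assumption-laden toy version (the shapes and the additive fusion are our illustrative choices), not the MLCNN code.

```python
import numpy as np

def fused_forecast(nn_pred, window, ar_coef, ar_bias=0.0):
    """Toy fusion of a neural forecast with a linear autoregressive part.
    nn_pred: network forecast for each of v variables, shape (v,).
    window:  last p observations, shape (p, v).
    ar_coef: per-variable AR weights, shape (p, v).
    The linear AR component keeps the output sensitive to the input scale."""
    ar_pred = (window * ar_coef).sum(axis=0) + ar_bias  # per-variable linear AR
    return nn_pred + ar_pred                            # additive fusion
```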
Generative Adversarial Classifier for Handwriting Characters Super-Resolution
Qian, Zhuang, Huang, Kaizhu, Wang, Qiufeng, Xiao, Jimin, Zhang, Rui
Generative Adversarial Networks (GANs) have received great attention recently due to their excellent performance in image generation, transformation, and super-resolution. However, GANs have rarely been studied or trained with classification in mind, so the generated images may not be appropriate for classification. In this paper, we propose a novel Generative Adversarial Classifier (GAC), particularly for low-resolution handwritten character recognition. Specifically, by additionally involving a classifier in the training process of a standard GAN, GAC is calibrated to learn suitable structures and restore character images that benefit classification. Experimental results show that our proposed method achieves remarkable performance in 8x super-resolution of handwritten characters, approximately 10% and 20% higher than the present state-of-the-art methods on the benchmark datasets CASIA-HWDB1.1 and MNIST, respectively.
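A plausible reading of the GAC objective is an ordinary GAN generator loss augmented with a classification term on the restored characters; the PyTorch sketch below encodes that reading, with the loss weights and the reconstruction term being our illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def gac_generator_loss(d_fake_logits, cls_logits, labels,
                       sr_images, hr_images, lam_cls=1.0, lam_rec=1.0):
    """Hypothetical GAC-style generator objective: adversarial realism,
    plus a classification term so restored characters stay recognizable,
    plus a pixel reconstruction term."""
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))  # fool the discriminator
    cls = F.cross_entropy(cls_logits, labels)           # classifier feedback
    rec = F.l1_loss(sr_images, hr_images)               # stay close to ground truth
    return adv + lam_cls * cls + lam_rec * rec
```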
A Scalable Deep Neural Network Architecture for Multi-Building and Multi-Floor Indoor Localization Based on Wi-Fi Fingerprinting
Kim, Kyeong Soo, Lee, Sanghyuk, Huang, Kaizhu
One of the key technologies for future large-scale location-aware services covering a complex of multi-story buildings (e.g., a big shopping mall or a university campus) is a scalable indoor localization technique. In this paper, we report the current status of our investigation into the use of deep neural networks (DNNs) for scalable building/floor classification and floor-level position estimation based on Wi-Fi fingerprinting. Exploiting the hierarchical nature of building/floor estimation and floor-level coordinate estimation of a location, we propose a new DNN architecture consisting of a stacked autoencoder for reducing the dimension of the feature space and a feed-forward classifier for multi-label classification of building/floor/location, on which a multi-building and multi-floor indoor localization system based on Wi-Fi fingerprinting is built. Experimental results on building/floor estimation and floor-level coordinate estimation of a given location demonstrate the feasibility of the proposed DNN-based indoor localization system, which provides near state-of-the-art performance using a single DNN and thus enables implementations with lower complexity and energy consumption on mobile devices.
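The described architecture, an encoder for dimension reduction feeding a feed-forward multi-label head, can be sketched in a few lines of PyTorch; the layer sizes below are illustrative assumptions, not the paper's configuration.

```python
import torch.nn as nn

class WiFiLocalizer(nn.Module):
    """Sketch of the described pipeline: a stacked-autoencoder encoder reduces
    the RSSI fingerprint dimension, then a feed-forward head performs
    multi-label classification over building/floor/location. Train the head
    with BCEWithLogitsLoss; all sizes here are placeholders."""
    def __init__(self, n_aps=520, n_labels=118):
        super().__init__()
        self.encoder = nn.Sequential(        # encoder half of the SAE
            nn.Linear(n_aps, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU())
        self.classifier = nn.Sequential(     # feed-forward multi-label head
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, n_labels))        # sigmoid lives in the loss
    def forward(self, x):
        return self.classifier(self.encoder(x))
```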
Stochastic Conjugate Gradient Algorithm with Variance Reduction
Jin, Xiao-Bo, Zhang, Xu-Yao, Huang, Kaizhu, Geng, Guang-Gang
Conjugate gradient methods are an important class of methods for solving linear systems of equations and nonlinear optimization problems. In our work, we propose a new stochastic conjugate gradient algorithm with variance reduction (CGVR) and prove its linear convergence with the Fletcher-Reeves update for strongly convex and smooth functions. We experimentally demonstrate that the CGVR algorithm converges faster than its counterparts on six large-scale optimization problems that may be convex, non-convex, or non-smooth, and that its AUC (Area Under Curve) performance with $L_2$-regularized $L_2$-loss is comparable to that of LIBLINEAR, but with a significant improvement in computational efficiency.
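To make the combination concrete, here is a simplified sketch of an SVRG-style variance-reduced gradient driving a Fletcher-Reeves conjugate direction; the step-size rule and restart details of the actual CGVR algorithm are not in the abstract, so those parts are assumptions.

```python
import numpy as np

def cgvr_sketch(w, full_grad, stoch_grad, n, lr=0.1,
                epochs=5, inner=50, batch=16, seed=0):
    """Simplified CGVR-style loop. full_grad(w) returns the full gradient;
    stoch_grad(w, idx) returns the mini-batch gradient on sample indices
    idx; n is the number of samples."""
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)                  # full gradient at the snapshot
        g_prev, d = mu, -mu                     # start along steepest descent
        for _ in range(inner):
            w = w + lr * d                      # fixed step along the direction
            idx = rng.choice(n, size=batch, replace=False)
            g = stoch_grad(w, idx) - stoch_grad(w_snap, idx) + mu  # VR gradient
            beta = (g @ g) / (g_prev @ g_prev)  # Fletcher-Reeves coefficient
            d = -g + beta * d                   # conjugate direction update
            g_prev = g
    return w
```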
A Unified Gradient Regularization Family for Adversarial Examples
Lyu, Chunchuan, Huang, Kaizhu, Liang, Hai-Ning
Adversarial examples are augmented data points generated by imperceptible perturbations of input samples. They have recently drawn much attention within the machine learning and data mining community. Being difficult to distinguish from real examples, such adversarial examples can change the predictions of many of the best learning models, including state-of-the-art deep learning models. Recent attempts have been made to build robust models that take adversarial examples into account. However, these methods either lead to performance drops or lack mathematical motivation. In this paper, we propose a unified framework for building machine learning models that are robust against adversarial examples. More specifically, using the unified framework, we develop a family of gradient regularization methods that effectively penalize the gradient of the loss function with respect to the inputs. Our proposed framework is appealing in that it offers a unified view for dealing with adversarial examples, and it incorporates another recently proposed perturbation-based approach as a special case. In addition, we present visualizations that reveal semantic meaning in these perturbations, which supports our regularization method and provides another explanation for the generalizability of adversarial examples. By applying this technique to Maxout networks, we conduct a series of experiments and achieve encouraging results on two benchmark datasets. In particular, we attain the best accuracy on the MNIST data (without data augmentation) and competitive performance on the CIFAR-10 data.
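The core of the regularization family, penalizing the input gradient of the loss, is a double-backpropagation computation; a minimal PyTorch sketch follows, where the norm order p and the weight lam are illustrative knobs rather than the paper's exact family parameters.

```python
import torch
import torch.nn.functional as F

def grad_regularized_loss(model, x, y, lam=0.01, p=2):
    """Standard loss plus a penalty on the norm of the loss gradient with
    respect to the inputs (double backpropagation)."""
    x = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (g,) = torch.autograd.grad(loss, x, create_graph=True)  # input gradient
    penalty = g.flatten(1).norm(p=p, dim=1).mean()          # per-example norm
    return loss + lam * penalty
```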
Robust Metric Learning by Smooth Optimization
Huang, Kaizhu, Jin, Rong, Xu, Zenglin, Liu, Cheng-Lin
Most existing distance metric learning methods assume perfect side information, usually given in the form of pairwise or triplet constraints. Instead, in many real-world applications, the constraints are derived from side information such as users' implicit feedback and citations among articles. As a result, these constraints are usually noisy and contain many mistakes. In this work, we aim to learn a distance metric from noisy constraints by robust optimization in a worst-case scenario, which we refer to as robust metric learning. We formulate the learning task initially as a combinatorial optimization problem and show that it can be elegantly transformed into a convex programming problem. We present an efficient learning algorithm based on smooth optimization [7]. It has a worst-case convergence rate of $O(1/\sqrt{\varepsilon})$ for smooth optimization problems, where $\varepsilon$ is the desired error of the approximate solution. Finally, our empirical study with UCI datasets demonstrates the effectiveness of the proposed method in comparison to state-of-the-art methods.
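To fix ideas, the sketch below scores how strongly each (possibly noisy) pairwise constraint is violated under a Mahalanobis metric M; the paper's robust formulation optimizes M against the worst case over such violations, whereas this toy function (the names and the margin are ours) only evaluates them.

```python
import numpy as np

def constraint_violations(M, pairs, labels, margin=1.0):
    """Hinge-style violation scores for pairwise constraints under metric M.
    labels: +1 for 'similar' pairs (want distance <= margin),
            -1 for 'dissimilar' pairs (want distance >= margin)."""
    viol = []
    for (xi, xj), y in zip(pairs, labels):
        diff = xi - xj
        d = diff @ M @ diff                      # squared Mahalanobis distance
        viol.append(max(0.0, y * (d - margin)))  # hinge on the signed margin
    return np.array(viol)
```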
Sparse Metric Learning via Smooth Optimization
Ying, Yiming, Huang, Kaizhu, Campbell, Colin
In this paper we study the problem of learning a low-dimensional (sparse) distance matrix. We propose a novel metric learning model which can simultaneously conduct dimension reduction and learn a distance matrix. The sparse representation involves a mixed-norm regularization which is non-convex. We then show that it can be equivalently formulated as a convex saddle (min-max) problem. From this saddle representation, we develop an efficient smooth optimization approach for sparse metric learning, even though the learning model is based on a non-differentiable loss function. This smooth optimization approach has an optimal convergence rate of $O(1/\ell^2)$ for smooth problems, where $\ell$ is the number of iterations. Finally, we run experiments to validate the effectiveness and efficiency of our sparse metric learning model on various datasets.
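As an illustration of the kind of mixed-norm regularizer involved (the exact norm used in the model is not given in the abstract, so the common (2,1)-norm is used here as an assumption), the sketch below computes the norm and its proximal group soft-thresholding step, which zeroes whole rows and hence reduces dimension.

```python
import numpy as np

def mixed_norm_21(A):
    """(2,1) mixed norm: sum of the L2 norms of the rows of A. Penalizing it
    drives entire rows to zero, yielding a low-dimensional (sparse) map."""
    return np.linalg.norm(A, axis=1).sum()

def prox_21(A, tau):
    """Proximal step for tau * ||A||_{2,1}: shrink each row's norm by tau,
    zeroing rows whose norm falls below the threshold."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return A * scale
```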