Collaborating Authors: Qiu, Robert C.


"Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach

arXiv.org Machine Learning

Modern deep neural networks (DNNs) are extremely powerful; however, this power comes at the price of increased depth and more parameters per layer, making their training and inference more computationally challenging. In an attempt to address this key limitation, efforts have been devoted to the compression (e.g., sparsification and/or quantization) of these large-scale machine learning models, so that they can be deployed on low-power IoT devices. In this paper, building upon recent advances in neural tangent kernel (NTK) and random matrix theory (RMT), we provide a novel compression approach for wide and fully-connected \emph{deep} neural nets. Specifically, we demonstrate that in the high-dimensional regime where the number of data points $n$ and their dimension $p$ are both large, and under a Gaussian mixture model for the data, there exists \emph{asymptotic spectral equivalence} between the NTK matrices for a large family of DNN models. This theoretical result enables "lossless" compression of a given DNN, in the sense that the compressed network yields asymptotically the same NTK as the original (dense and unquantized) network, with its weights and activations taking values \emph{only} in $\{ 0, \pm 1 \}$ up to a scaling. Experiments on both synthetic and real-world data support the advantages of the proposed compression scheme, with code available at \url{https://github.com/Model-Compression/Lossless_Compression}.
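
To make the ternarization step concrete, below is a minimal sketch of mapping a weight matrix to $\{0, \pm 1\}$ up to a scaling. The magnitude-quantile threshold and the Frobenius-norm-matching scale are illustrative assumptions for this demo; the paper derives its compression parameters from the NTK/RMT analysis.

```python
import numpy as np

def ternarize(W, sparsity=0.5):
    """Map weights to {0, +1, -1} times a scalar: zero out the
    smallest-magnitude fraction `sparsity` of entries and choose the
    scale so the Frobenius norm of the original matrix is preserved."""
    thresh = np.quantile(np.abs(W), sparsity)
    T = np.sign(W) * (np.abs(W) > thresh)      # entries in {0, +1, -1}
    nnz = max(T.astype(bool).sum(), 1)
    alpha = np.linalg.norm(W) / np.sqrt(nnz)   # so that ||alpha * T||_F = ||W||_F
    return alpha, T

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)) / np.sqrt(512)
alpha, T = ternarize(W)
print(alpha, np.unique(T))                     # scale and the value set {-1, 0, 1}
```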


Deep Equilibrium Models are Almost Equivalent to Not-so-deep Explicit Models for High-dimensional Gaussian Mixtures

arXiv.org Artificial Intelligence

Deep equilibrium models (DEQs), a typical class of implicit neural networks, have demonstrated remarkable success on various tasks. There is, however, a lack of theoretical understanding of the connections and differences between implicit DEQs and explicit neural network models. In this paper, leveraging recent advances in random matrix theory (RMT), we perform an in-depth analysis of the eigenspectra of the conjugate kernel (CK) and neural tangent kernel (NTK) matrices for implicit DEQs, when the input data are drawn from a high-dimensional Gaussian mixture. We prove, in this setting, that the spectral behavior of these implicit CKs and NTKs depends on the DEQ activation function and initial weight variances, but only via a system of four nonlinear equations. As a direct consequence of this theoretical result, we demonstrate that a shallow explicit network can be carefully designed to produce the same CK or NTK as a given DEQ. Although derived here for Gaussian mixture data, empirical results show that the proposed theory and design principle also apply to popular real-world datasets.
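
For intuition, a DEQ's forward pass computes the fixed point of a single implicit layer. The sketch below assumes a contraction (the spectral norm of the recurrent weight is kept well below 1) so that plain fixed-point iteration converges; it does not reproduce the paper's four-equation spectral characterization.

```python
import numpy as np

def deq_forward(x, W, U, max_iter=100, tol=1e-6):
    """Implicit layer: return the fixed point z* of z = tanh(W z + U x),
    computed by plain fixed-point iteration."""
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_new = np.tanh(W @ z + U @ x)
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z_new

d, p = 128, 64
rng = np.random.default_rng(1)
# Keep the spectral norm of W well below 1 so the map contracts
# (an assumption made for this demo, not a condition from the paper).
W = 0.25 * rng.standard_normal((d, d)) / np.sqrt(d)
U = rng.standard_normal((d, p)) / np.sqrt(p)
x = rng.standard_normal(p)
z_star = deq_forward(x, W, U)
print(np.linalg.norm(z_star - np.tanh(W @ z_star + U @ x)))  # ~0: a fixed point
```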


Timestamp-supervised Wearable-based Activity Segmentation and Recognition with Contrastive Learning and Order-Preserving Optimal Transport

arXiv.org Artificial Intelligence

Human activity recognition (HAR) with wearables is a key enabling technology in ubiquitous and mobile computing applications. The widely adopted sliding-window scheme suffers from the multi-class windows problem. As a result, there is a growing focus on joint segmentation and recognition with deep-learning methods, aiming to deal simultaneously with HAR and time-series segmentation. However, obtaining full activity annotations for wearable data sequences is resource-intensive and time-consuming, while unsupervised methods yield poor performance. To address these challenges, we propose a novel method for joint activity segmentation and recognition with timestamp supervision, in which only a single annotated sample is needed in each activity segment. However, the limited information in such sparse annotations exacerbates the gap between the recognition and segmentation tasks, leading to sub-optimal model performance. We therefore estimate prototypes from class-activation maps to form a sample-to-prototype contrast module that yields well-structured embeddings. Moreover, using optimal transport theory, our approach generates sample-level pseudo-labels that exploit the unlabeled data between timestamp annotations for further performance improvement. Comprehensive experiments on four public HAR datasets demonstrate that our model trained with timestamp supervision is superior to state-of-the-art weakly-supervised methods and achieves performance comparable to fully-supervised approaches.
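As a rough illustration of generating sample-level pseudo-labels between timestamp annotations, the sketch below places one order-preserving boundary between each pair of consecutive annotated samples by minimizing within-segment distances to the two annotated prototypes. This is a simplified stand-in for the paper's order-preserving optimal-transport formulation, and using the timestamp sample itself as the prototype (rather than a class-activation-map estimate) is an assumption made here.

```python
import numpy as np

def pseudo_labels(emb, stamps, labels):
    """Between consecutive timestamp annotations, place a single boundary
    that minimizes each side's squared distance to its annotated prototype.
    A simplified, order-preserving stand-in for the paper's formulation."""
    y = np.full(len(emb), -1)
    for (s, a), (e, b) in zip(zip(stamps, labels), zip(stamps[1:], labels[1:])):
        proto_a, proto_b = emb[s], emb[e]   # timestamp samples act as prototypes
        costs = [((emb[s:t] - proto_a) ** 2).sum() +
                 ((emb[t:e + 1] - proto_b) ** 2).sum()
                 for t in range(s + 1, e + 1)]
        t_star = s + 1 + int(np.argmin(costs))
        y[s:t_star] = a
        y[t_star:e + 1] = b
    return y

rng = np.random.default_rng(0)
emb = np.concatenate([rng.normal(0, 0.1, (50, 8)), rng.normal(1, 0.1, (50, 8))])
print(pseudo_labels(emb, stamps=[10, 80], labels=[0, 1]))  # boundary near index 50
```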


Zero-shot Inversion Process for Image Attribute Editing with Diffusion Models

arXiv.org Artificial Intelligence

Denoising diffusion models have shown outstanding performance in image editing. Existing works tend to use either image-guided methods, which provide a visual reference but lack control over semantic coherence, or text-guided methods, which ensure faithfulness to the text guidance but lack visual quality. To address this problem, we propose the Zero-shot Inversion Process (ZIP), a framework that injects a fusion of a generated visual reference and text guidance into the semantic latent space of a \textit{frozen} pre-trained diffusion model. Using only a tiny neural network, the proposed ZIP produces diverse content and attributes under the intuitive control of the text prompt. Moreover, ZIP shows remarkable robustness for both in-domain and out-of-domain attribute manipulation on real images. We perform detailed experiments on various benchmark datasets. Compared to state-of-the-art methods, ZIP produces images of equivalent quality while providing realistic editing effects.
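
The "tiny neural network" can be pictured as a small fusion module mapping text and visual-reference embeddings into a single semantic latent. The PyTorch sketch below is a generic stand-in; the dimensions and the concat-then-MLP fusion rule are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Tiny network fusing a text embedding and a visual-reference embedding
    into one semantic latent. Dimensions and the fusion rule are illustrative
    assumptions only."""
    def __init__(self, text_dim=768, img_dim=512, latent_dim=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(text_dim + img_dim, latent_dim),
            nn.SiLU(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, text_emb, img_emb):
        return self.mlp(torch.cat([text_emb, img_emb], dim=-1))

# The fused latent would be injected into the (frozen) diffusion model's
# semantic latent space; only FusionNet's parameters would be trained.
net = FusionNet()
z = net(torch.randn(1, 768), torch.randn(1, 512))
print(z.shape)
```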


On the Equivalence between Implicit and Explicit Neural Networks: A High-dimensional Viewpoint

arXiv.org Machine Learning

Implicit neural networks have demonstrated remarkable success in various tasks. However, there is a lack of theoretical analysis of the connections and differences between implicit and explicit networks. In this paper, we study high-dimensional implicit neural networks and provide the high-dimensional equivalents of the corresponding conjugate kernels and neural tangent kernels. Building on this, we establish the equivalence between implicit and explicit networks in high dimensions.
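
For reference, these are the two standard kernels the analysis concerns: for a network $f(x;\theta)$ with last-layer representation $\phi(x)$,
\[
K_{\mathrm{CK}}(x, x') = \phi(x)^\top \phi(x'), \qquad
K_{\mathrm{NTK}}(x, x') = \left\langle \nabla_\theta f(x;\theta),\, \nabla_\theta f(x';\theta) \right\rangle .
\]
The CK captures the learned representation at initialization, while the NTK governs the network's training dynamics in the wide-network limit.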


Multi-level Contrast Network for Wearables-based Joint Activity Segmentation and Recognition

arXiv.org Artificial Intelligence

Human activity recognition (HAR) with wearables is a promising research direction that can be widely adopted in smart healthcare applications. In recent years, deep-learning-based HAR models have achieved impressive recognition performance. However, most HAR algorithms are susceptible to the multi-class windows problem, an essential yet rarely exploited issue. In this paper, we propose to relieve this challenging problem by introducing segmentation technology into HAR, yielding joint activity segmentation and recognition. Specifically, we introduce the Multi-Stage Temporal Convolutional Network (MS-TCN) architecture for sample-level activity prediction, to jointly segment and recognize the activity sequence. Furthermore, to enhance the robustness of HAR against inter-class similarity and intra-class heterogeneity, we propose a multi-level contrastive loss, containing sample-level and segment-level contrast, to learn a well-structured embedding space for better activity segmentation and recognition performance. Finally, with comprehensive experiments, we verify the effectiveness of the proposed method on two public HAR datasets, achieving significant improvements across various evaluation metrics.
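
The sample-level part of such a multi-level contrastive loss can be sketched as a supervised InfoNCE-style objective over per-sample embeddings. This is a generic stand-in, not the paper's exact loss, which also contains a segment-level term.

```python
import torch
import torch.nn.functional as F

def sample_level_contrast(z, y, tau=0.1):
    """Supervised contrastive loss: pull embeddings of the same activity
    together and push different activities apart."""
    z = F.normalize(z, dim=1)                      # (N, d) unit embeddings
    sim = z @ z.t() / tau                          # pairwise similarities
    mask = (y[:, None] == y[None, :]).float()      # positives share a label
    mask.fill_diagonal_(0)                         # exclude self-pairs
    logit = sim - torch.eye(len(z)) * 1e9          # remove self from the softmax
    log_prob = logit - torch.logsumexp(logit, dim=1, keepdim=True)
    pos_count = mask.sum(1).clamp(min=1)
    return -((mask * log_prob).sum(1) / pos_count).mean()

z = torch.randn(32, 64)                            # per-sample embeddings
y = torch.randint(0, 6, (32,))                     # activity labels
print(sample_level_contrast(z, y))
```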


Nonconvex Regularization Based Sparse and Low-Rank Recovery in Signal Processing, Statistics, and Machine Learning

arXiv.org Machine Learning

In the past decade, sparse and low-rank recovery have drawn much attention in many areas such as signal/image processing, statistics, bioinformatics, and machine learning. To induce sparsity and/or low-rankness, the $\ell_1$ norm and the nuclear norm are among the most popular regularization penalties due to their convexity. While the $\ell_1$ and nuclear norms are convenient because the related convex optimization problems are usually tractable, it has been shown in many applications that a nonconvex penalty can yield significantly better performance. Recently, nonconvex regularization based sparse and low-rank recovery has attracted considerable interest, and it is in fact a main driver of the recent progress in nonconvex and nonsmooth optimization. This paper gives an overview of this topic in various fields of signal processing, statistics, and machine learning, including compressive sensing (CS), sparse regression and variable selection, sparse signal separation, sparse principal component analysis (PCA), estimation of large covariance and inverse covariance matrices, matrix completion, and robust PCA. We present recent developments of nonconvex regularization based sparse and low-rank recovery in these fields, addressing the issues of penalty selection, applications, and the convergence of nonconvex algorithms.
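
The bias issue that motivates nonconvex penalties is easy to see from the corresponding thresholding (proximal) operators: soft thresholding for the $\ell_1$ norm shrinks every coefficient, whereas a nonconvex penalty such as the minimax concave penalty (MCP) leaves large coefficients untouched. A minimal sketch:

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of the convex l1 penalty (lasso)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def mcp_threshold(x, lam, gamma=3.0):
    """Proximal operator of the (nonconvex) minimax concave penalty,
    gamma > 1: large coefficients pass through without shrinkage."""
    return np.where(
        np.abs(x) <= gamma * lam,
        gamma / (gamma - 1.0) * soft_threshold(x, lam),
        x,
    )

x = np.linspace(-3, 3, 7)
print(soft_threshold(x, 1.0))   # shrinks every nonzero entry by lam
print(mcp_threshold(x, 1.0))    # leaves |x| > gamma * lam unbiased
```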


Spectrum concentration in deep residual learning: a free probability approach

arXiv.org Machine Learning

We revisit the initialization of deep residual networks (ResNets) by introducing to the deep learning community a novel analytical tool from free probability. This tool deals with non-Hermitian random matrices, rather than their conventional Hermitian counterparts in the literature. As a consequence, this new tool enables us to evaluate the singular value spectrum of the input-output Jacobian of a fully-connected deep ResNet in both the linear and nonlinear cases. With the powerful tool of free probability, we conduct an asymptotic analysis of the spectrum in the single-layer case, and then extend this analysis to the multi-layer case with an arbitrary number of layers. In particular, we propose to rescale the classical random initialization by the number of residual units, so that the spectrum remains of order $O(1)$ even for large network width and depth. We empirically demonstrate that the proposed initialization scheme learns orders of magnitude faster than classical ones, attesting to the strong practical relevance of this investigation.
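
A minimal demonstration of the proposed rescaling in the linear case: with $L$ residual units $x_{l+1} = x_l + W_l x_l$, drawing $W_l$ with variance scaled by $1/L$ keeps the singular values of the input-output Jacobian of order $O(1)$, while the classical scaling lets them grow with depth. Width, depth, and the linear setting are illustrative choices for this sketch.

```python
import numpy as np

def resnet_jacobian_singvals(L, width=256, rescale=True, seed=0):
    """Singular values of the input-output Jacobian of a linear ResNet
    x_{l+1} = x_l + W_l x_l. With `rescale`, the weight variance is divided
    by the number of residual units L, as the abstract proposes."""
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(width * (L if rescale else 1))
    J = np.eye(width)
    for _ in range(L):
        W = scale * rng.standard_normal((width, width))
        J = (np.eye(width) + W) @ J
    return np.linalg.svd(J, compute_uv=False)

for flag in (False, True):
    s = resnet_jacobian_singvals(L=64, rescale=flag)
    print(f"rescale={flag}: max singular value = {s[0]:.2f}")
```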


Positive Definite Estimation of Large Covariance Matrix Using Generalized Nonconvex Penalties

arXiv.org Machine Learning

This work addresses the issue of large covariance matrix estimation in high-dimensional statistical analysis. Recently, improved iterative algorithms with a positive-definiteness guarantee have been developed. However, these algorithms cannot be directly extended to use a nonconvex penalty for inducing sparsity. Generally, a nonconvex penalty can ameliorate the bias problem of the popular convex lasso penalty, and is thus more advantageous. In this work, we propose a class of positive-definite covariance estimators using generalized nonconvex penalties. We develop a first-order algorithm based on the alternating direction method framework to solve the nonconvex optimization problem efficiently, and prove its convergence. Further, the statistical properties of the new estimators are analyzed for generalized nonconvex penalties, and an extension of the algorithm to covariance estimation from sketched measurements is considered. The performance of the new estimators is demonstrated by both a simulation study and a gene clustering example for tumor tissues. Code for the proposed estimators is available at https://github.com/FWen/Nonconvex-PDLCE.git.
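
The alternating-direction scheme can be pictured as follows: one step handles the sparsity penalty by thresholding the off-diagonal entries, the other projects onto the positive-definite cone via an eigenvalue floor. The sketch below uses a convex $\ell_1$ penalty as a stand-in for the paper's generalized nonconvex penalties; see the linked repository for the actual estimators.

```python
import numpy as np

def pd_sparse_cov(S, lam=0.1, eps=1e-3, rho=1.0, iters=200):
    """ADMM-style sketch for  min_X 0.5*||X - S||_F^2 + lam*||X||_1,off
    subject to X >= eps*I, where S is the sample covariance. Alternates an
    l1 shrinkage step with an eigenvalue projection onto {X : X >= eps*I}."""
    p = S.shape[0]
    X, Z, U = S.copy(), S.copy(), np.zeros_like(S)
    off = ~np.eye(p, dtype=bool)
    for _ in range(iters):
        # X-update: soft-threshold the off-diagonal entries.
        A = (S + rho * (Z - U)) / (1.0 + rho)
        X = A.copy()
        X[off] = np.sign(A[off]) * np.maximum(np.abs(A[off]) - lam / (1.0 + rho), 0)
        # Z-update: project X + U onto the positive-definite cone.
        w, V = np.linalg.eigh(X + U)
        Z = (V * np.maximum(w, eps)) @ V.T
        U += X - Z
    return Z

rng = np.random.default_rng(0)
Y = rng.standard_normal((200, 30))                 # 200 samples, 30 variables
Sigma_hat = pd_sparse_cov(Y.T @ Y / 200)
print(np.linalg.eigvalsh(Sigma_hat).min())         # >= eps: positive definite
```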


Generalized FMD Detection for Spectrum Sensing Under Low Signal-to-Noise Ratio

arXiv.org Artificial Intelligence

Spectrum sensing is a fundamental problem in cognitive radio. We propose a detection algorithm for spectrum sensing in cognitive radio networks based on a function of the covariance matrix. The cornerstone of the algorithm is the monotonically increasing property of matrix functions involving the trace operation. The advantage of the proposed algorithm is that it works under extremely low signal-to-noise ratios, e.g., below -30 dB, with limited sample data. A theoretical analysis of the threshold setting for the algorithm is discussed, and a performance comparison between the proposed algorithm and other state-of-the-art methods is provided via simulations on captured digital television (DTV) signals.
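
The core idea of a function-of-covariance-matrix detector can be sketched as follows: form the sample covariance of delayed signal snapshots, normalize the noise power, and compare a monotone matrix functional to a threshold. The log-determinant functional below is an illustrative choice, not necessarily the exact function analyzed in the paper, and the SNR in the demo is far milder than the paper's -30 dB regime.

```python
import numpy as np

def covariance_statistic(x, L=10):
    """Detection statistic from a function of the sample covariance matrix.
    Here: -log det of the trace-normalized covariance, which is ~0 for
    white noise and grows as the samples become correlated."""
    N = len(x) - L + 1
    X = np.stack([x[i:i + N] for i in range(L)])  # L delayed snapshots
    R = X @ X.T / N                               # L x L sample covariance
    R /= np.trace(R) / L                          # normalize the noise power
    sign, logdet = np.linalg.slogdet(R)
    return -logdet                                # >= 0; larger under a signal

rng = np.random.default_rng(0)
noise = rng.standard_normal(4096)
signal = noise + 0.5 * np.sin(0.1 * np.arange(4096))  # weak sinusoid, ~ -9 dB
print(covariance_statistic(noise), covariance_statistic(signal))
```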