AITopics

Aesthetic quality prediction is a challenging task in the computer vision community because of the complex interplay between semantic contents and photographic technologies. Recent deep-learning-based studies of aesthetic quality assessment usually use a binary high/low label or a numerical score to represent aesthetic quality. However, a scalar representation cannot adequately describe the underlying variety in human perception of aesthetics. In this work, we propose to predict the aesthetic score distribution (i.e., a score distribution vector over the ordinal basic human ratings) using a Deep Convolutional Neural Network (DCNN). Conventional DCNNs, which aim to minimize the difference between the predicted scalar numbers or vectors and the ground truth, cannot be directly used for the ordinal basic rating distribution. Thus, a novel CNN based on the Cumulative distribution with Jensen-Shannon divergence (CJS-CNN) is presented to predict the aesthetic score distribution of human ratings, together with a new reliability-sensitive learning method based on the kurtosis of the score distribution, which eliminates the requirement for the original full data of human ratings (without normalization). Experimental results on a large-scale aesthetic dataset demonstrate the effectiveness of the proposed CJS-CNN on this task.
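To illustrate why a cumulative representation suits ordinal ratings, the sketch below compares a Jensen-Shannon-style divergence computed on raw score histograms with the same quantity computed on their cumulative sums. It is only an illustration under stated assumptions: the function names (`cumulative`, `js_style`, `cumulative_js`) and the toy five-bin histograms are hypothetical, and the exact CJS loss used by the authors may differ from this symmetric form.

```python
import math

def cumulative(p):
    """Running sum of a histogram, i.e. its cumulative distribution."""
    total, out = 0.0, []
    for x in p:
        total += x
        out.append(total)
    return out

def js_style(p, q):
    """0.5 * sum_i [p_i log(p_i/m_i) + q_i log(q_i/m_i)], with m = (p + q) / 2.

    Terms with a zero numerator contribute 0 (the 0 * log 0 = 0 convention).
    For normalised p and q this is the standard Jensen-Shannon divergence.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        mi = 0.5 * (pi + qi)
        if pi > 0:
            total += pi * math.log(pi / mi)
        if qi > 0:
            total += qi * math.log(qi / mi)
    return 0.5 * total

def cumulative_js(p, q):
    """Assumed variant: the same formula applied to cumulative vectors."""
    return js_style(cumulative(p), cumulative(q))

# Toy five-bin rating histograms (hypothetical data).
target = [1.0, 0.0, 0.0, 0.0, 0.0]   # all raters chose the lowest score
near = [0.0, 1.0, 0.0, 0.0, 0.0]     # prediction off by one bin
far = [0.0, 0.0, 0.0, 0.0, 1.0]      # prediction off by four bins

print("plain JS:     ", js_style(target, near), js_style(target, far))
print("cumulative JS:", cumulative_js(target, near), cumulative_js(target, far))
```

On the raw histograms both predictions are equally far from the target because their supports do not overlap, whereas the cumulative version penalises the prediction that is further away on the ordinal scale.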

deep learning, neural network, score distribution

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.68)
Europe (0.68)
North America > Canada > Quebec (0.14)

Genre: Research Report (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Consistent and Specific Multi-View Subspace Clustering

Luo, Shirui (Chinese Academy of Sciences) | Zhang, Changqing (University of Chinese Academy of Sciences; Institute of Information Engineering; School of Cyber Security) | Zhang, Wei (Tianjin University) | Cao, Xiaochun (Chinese Academy of Sciences; Institute of Information Engineering)

Multi-view clustering has attracted intensive attention due to its effectiveness in exploiting multiple views of data. However, most existing multi-view clustering methods aim only to explore the consistency or to enhance the diversity of different views. In this paper, we propose a novel multi-view subspace clustering method (CSMSC), where consistency and specificity are jointly exploited for subspace representation learning. We formulate the multi-view self-representation property using a shared consistent representation and a set of specific representations, which better fits real-world datasets. Specifically, consistency models the common properties among all views, while specificity captures the inherent differences in each view. In addition, to optimize the non-convex problem, we introduce a convex relaxation and develop an alternating optimization algorithm to recover the corresponding data representations. Experimental evaluations on four benchmark datasets demonstrate that the proposed approach achieves better performance than several state-of-the-art methods.
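One way to write the self-representation property described above is sketched below. All symbols are assumptions introduced for illustration and do not come from the abstract: X^{(v)} is the data matrix of view v with samples as columns, C is the consistent representation shared by all views, D^{(v)} is the representation specific to view v, and E^{(v)} is a residual term.

```latex
X^{(v)} = X^{(v)}\bigl(C + D^{(v)}\bigr) + E^{(v)}, \qquad v = 1, \dots, V
```

Under this reading, each sample in a view is reconstructed from the other samples of the same view, with the coefficients split into a part common to every view and a part that only that view uses.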

artificial intelligence, optimization problem, representation

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.55)

Latent Semantic Aware Multi-View Multi-Label Classification

For real-world applications, data are often associated with multiple labels and represented by multiple views. Most existing multi-label learning methods do not sufficiently consider the complementary information among multiple views, leading to unsatisfactory performance. To address this issue, we propose a novel approach for multi-view multi-label learning based on matrix factorization that exploits complementarity among different views. Specifically, under the assumption that there exists a common representation across different views, the uncovered latent patterns are enforced to be aligned across different views in kernel spaces. In this way, the latent semantic patterns underlying the data can be well uncovered, which makes the common representation of multiple views more reasonable. As a result, a consensus multi-view representation is obtained that encodes the complementarity and consistency of different views in the latent semantic space. We provide a theoretical guarantee of strict convexity for our method when its parameters are set properly. Empirical evidence shows clear advantages of our method over state-of-the-art ones.

artificial intelligence, different view, optimization problem

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.28)
Europe (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Variational Recurrent Neural Machine Translation

Partially inspired by successful applications of variational recurrent neural networks, we propose a novel variational recurrent neural machine translation (VRNMT) model in this paper. Unlike variational NMT, VRNMT introduces a series of latent random variables, instead of a single latent variable, to model the translation procedure of a sentence in a generative way. Specifically, the latent random variables are incorporated into the hidden states of the NMT decoder with elements from the variational autoencoder. In this way, these variables are recurrently generated, which enables them to further capture strong and complex dependencies among the output translations at different timesteps. To deal with the challenges of performing efficient posterior inference and large-scale training when incorporating latent variables, we build a neural posterior approximator and equip it with a reparameterization technique to estimate the variational lower bound. Experiments on Chinese-English and English-German translation tasks demonstrate that the proposed model achieves significant improvements over both the conventional and variational NMT models.
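The reparameterization technique mentioned above can be sketched generically as follows. This is not the VRNMT model itself: it assumes diagonal Gaussian posterior and prior distributions at a single decoder timestep, and the names `sample_latent` and `gaussian_kl` are hypothetical. The sample is written as a deterministic function of the mean, the log-variance, and independent noise, and the KL term is the one that appears in a variational lower bound.

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Reparameterized draw: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def gaussian_kl(mu_q, log_var_q, mu_p, log_var_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, log_var_q, mu_p, log_var_p):
        total += 0.5 * (lp - lq
                        + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp)
                        - 1.0)
    return total

rng = random.Random(0)
# Hypothetical posterior and prior parameters for one timestep.
mu_q, log_var_q = [0.5, -1.0], [-0.2, 0.1]
mu_p, log_var_p = [0.0, 0.0], [0.0, 0.0]

z = sample_latent(mu_q, log_var_q, rng)
kl = gaussian_kl(mu_q, log_var_q, mu_p, log_var_p)
print(z, kl)
```

Because the randomness enters only through `eps`, gradients with respect to the mean and log-variance can pass through the sample when the same computation is expressed in an automatic differentiation framework.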

deep learning, neural network, translation

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.29)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Collaborative Dynamic Sparse Topic Regression with User Profile Evolution for Item Recommendation

Gao, Li (Chinese Academy of Sciences) | Wu, Jia (University of Technology Sydney) | Zhou, Chuan (Chinese Academy of Sciences) | Hu, Yue (Chinese Academy of Sciences)

In many time-aware item recommender systems, accurately modeling the evolution of both user profiles and item contents over time is essential. However, most existing methods focus on learning users' dynamic interests and assume that the contents of items are stable over time. They thus fail to capture dynamic changes in items' contents. In this paper, we present a novel method, CDUE, for time-aware item recommendation, which captures the evolution of both users' interests and items' content information via topic dynamics. Specifically, we propose a dynamic sparse topic model to track the evolution of topics as items' contents change over time, and adapt a vector autoregressive model to profile users' dynamic interests. Items' topics, users' interests, and their evolution are learned collaboratively and simultaneously in a unified learning framework. Experimental results on two real-world data sets demonstrate the quality and effectiveness of the proposed method and show that it can be used to make better future recommendations.

artificial intelligence, machine learning, recommendation

Thirty-First AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.14)
Oceania > Australia (0.14)

Genre: Research Report > Promising Solution (0.66)

Industry:

Media > Film (0.68)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Attention Based LSTM for Target Dependent Sentiment Classification

Yang, Min (The University of Hong Kong) | Tu, Wenting (The University of Hong Kong) | Wang, Jingxuan (The University of Hong Kong) | Xu, Fei (Chinese Academy of Sciences) | Chen, Xiaojun (Shenzhen University)

We present an attention-based bidirectional LSTM approach to improve target-dependent sentiment classification. Our method learns the alignment between the target entities and the most distinguishing features. We conduct extensive experiments on a real-life dataset. The experimental results show that our model achieves state-of-the-art results.

deep learning, neural network, sentiment classification

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > China (0.50)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Cross-View People Tracking by Scene-Centered Spatio-Temporal Parsing

Xu, Yuanlu (University of California, Los Angeles) | Liu, Xiaobai (San Diego State University) | Qin, Lei (Chinese Academy of Sciences) | Zhu, Song-Chun (University of California, Los Angeles)

In this paper, we propose a Spatio-temporal Attributed Parse Graph (ST-APG) to integrate semantic attributes with trajectories for cross-view people tracking. Given videos from multiple cameras with overlapping fields of view (FOV), our goal is to parse the videos and organize the trajectories of all targets into a scene-centered representation. We leverage rich semantic attributes of humans, e.g., facing directions, postures, and actions, to enhance cross-view tracklet associations, in addition to the appearance and geometry features frequently used in the literature. In particular, the facing direction of a human in 3D, once detected, often coincides with his/her moving direction or trajectory. Similarly, the actions of humans, once recognized, provide strong cues for distinguishing one subject from the others. The inference is solved by iteratively grouping tracklets with cluster sampling and estimating people's semantic attributes by dynamic programming. In experiments, we validate our method on one public dataset and create another new dataset that records people's daily life in public, e.g., food court, office reception, and plaza, each of which includes 3-4 cameras. We evaluate the proposed method on these challenging videos and achieve promising multi-view tracking results.

Riemannian Submanifold Tracking on Low-Rank Algebraic Variety

Li, Qian (Chinese Academy of Sciences) | Wang, Zhichao (Tsinghua University)

Matrix recovery aims to learn a low-rank structure from high-dimensional data, a problem that arises in numerous learning applications. As a popular heuristic for matrix recovery, convex relaxation involves iteratively calling singular value decomposition (SVD). Riemannian optimization-based methods can alleviate the expensive cost of SVD for improved scalability, but their performance is usually degraded when the rank is unknown. This paper proposes a novel algorithm, RIST, that exploits the algebraic variety of the low-rank manifold for matrix recovery. In particular, RIST utilizes an efficient scheme that automatically estimates the potential rank on the real algebraic variety and tracks the favorable Riemannian submanifold. Moreover, RIST utilizes the second-order geometric characterization and achieves provable superlinear convergence, which is superior to the linear convergence of most existing methods. Extensive comparison experiments demonstrate the accuracy and efficiency of the RIST algorithm.

algorithm, artificial intelligence, optimization problem

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Unsupervised Large Graph Embedding

Nie, Feiping (Northwestern Polytechnical University) | Zhu, Wei (Northwestern Polytechnical University) | Li, Xuelong (Chinese Academy of Sciences)

There are many successful spectral-based unsupervised dimensionality reduction methods, including Laplacian Eigenmap (LE), Locality Preserving Projection (LPP), and Spectral Regression (SR). LPP and SR are two different linear spectral-based methods; however, we discover that LPP and SR are equivalent if the symmetric similarity matrix is doubly stochastic, Positive Semi-Definite (PSD), and of rank p, where p is the reduced dimension. This discovery prompts us to seek a low-rank and doubly stochastic similarity matrix, and we then propose an unsupervised linear dimensionality reduction method called Unsupervised Large Graph Embedding (ULGE). ULGE starts from an idea similar to LPP: it adopts an efficient approach to construct the similarity matrix and then performs spectral analysis efficiently. The computational complexity is reduced to O(ndm), a significant improvement over conventional spectral-based methods, which need at least O(n^2d), where n, d, and m are the numbers of samples, dimensions, and anchors, respectively. Extensive experiments on several publicly available data sets demonstrate the efficiency and effectiveness of the proposed method.
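A similarity matrix with the properties listed above (symmetric, doubly stochastic, PSD, and low rank) can be obtained from an anchor-based construction known from the anchor-graph literature. The sketch below assumes that construction; whether ULGE builds its matrix in exactly this way is not stated in the abstract. The names `anchor_similarity` and `low_rank_similarity`, the Gaussian kernel, and the toy data are all hypothetical. Computing the n x m matrix Z costs O(ndm); the full n x n matrix W is formed here only to check its properties on a tiny example.

```python
import math

def anchor_similarity(points, anchors, bandwidth=1.0):
    """Z (n x m): Gaussian affinity of each point to each anchor, rows sum to 1."""
    Z = []
    for x in points:
        row = []
        for a in anchors:
            sq_dist = sum((xi - ai) ** 2 for xi, ai in zip(x, a))
            row.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
        s = sum(row)
        Z.append([r / s for r in row])
    return Z

def low_rank_similarity(Z):
    """W = Z diag(column sums of Z)^-1 Z^T, which has rank at most m."""
    n, m = len(Z), len(Z[0])
    col_sums = [sum(Z[i][k] for i in range(n)) for k in range(m)]
    return [[sum(Z[i][k] * Z[j][k] / col_sums[k] for k in range(m))
             for j in range(n)]
            for i in range(n)]

# Toy data: five 2-D points and two anchors (hypothetical values).
points = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [2.0, 0.0]]
anchors = [[0.0, 0.1], [1.0, 1.0]]

Z = anchor_similarity(points, anchors)
W = low_rank_similarity(Z)
row_sums = [sum(row) for row in W]
print(row_sums)
```

Each row of W sums to one because the rows of Z do, and symmetry then gives the same for the columns. In a large-scale setting one would work with Z directly rather than materialise W.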

artificial intelligence, similarity matrix, survey article

Thirty-First AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.15)
Asia > China (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Asynchronous Stochastic Proximal Optimization Algorithms with Variance Reduction

Regularized empirical risk minimization (R-ERM) is an important branch of machine learning, since it constrains the capacity of the hypothesis space and guarantees the generalization ability of the learning algorithm. Two classic proximal optimization algorithms, i.e., proximal stochastic gradient descent (ProxSGD) and proximal stochastic coordinate descent (ProxSCD), have been widely used to solve the R-ERM problem. Recently, variance reduction techniques were proposed to improve ProxSGD and ProxSCD, and the corresponding ProxSVRG and ProxSVRCD have better convergence rates. These proximal algorithms with variance reduction have also achieved great success in applications at small and moderate scales. However, in order to solve large-scale R-ERM problems and make more practical impact, parallel versions of these algorithms are sorely needed. In this paper, we propose asynchronous ProxSVRG (Async-ProxSVRG) and asynchronous ProxSVRCD (Async-ProxSVRCD) algorithms, and prove that Async-ProxSVRG can achieve near-linear speedup when the training data is sparse, while Async-ProxSVRCD can achieve near-linear speedup regardless of the sparsity condition, as long as the number of block partitions is appropriately set. We have conducted experiments on a regularized logistic regression task. The results verify our theoretical findings and demonstrate the practical efficiency of the asynchronous stochastic proximal algorithms with variance reduction.
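For orientation, the sketch below shows a serial ProxSVRG loop, not the asynchronous algorithms proposed in the paper. It assumes an L1-regularized least-squares objective rather than the regularized logistic regression used in the experiments, takes the last inner iterate as the next snapshot, and uses hypothetical names (`prox_svrg`, `soft_threshold`) and synthetic data. Each inner step combines a variance-reduced stochastic gradient with the proximal operator of the regularizer.

```python
import random

def soft_threshold(x, t):
    """Proximal operator of t * |x|."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def grad_i(w, a, b):
    """Gradient of the single-sample loss 0.5 * (a . w - b)^2."""
    r = sum(ai * wi for ai, wi in zip(a, w)) - b
    return [r * ai for ai in a]

def objective(w, A, y, lam):
    n = len(A)
    loss = sum(0.5 * (sum(ai * wi for ai, wi in zip(a, w)) - b) ** 2
               for a, b in zip(A, y)) / n
    return loss + lam * sum(abs(wi) for wi in w)

def prox_svrg(A, y, lam, step, epochs, inner_steps, seed=0):
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    w_snap = [0.0] * d
    for _ in range(epochs):
        # Full gradient of the smooth part at the snapshot.
        full = [0.0] * d
        for a, b in zip(A, y):
            g = grad_i(w_snap, a, b)
            full = [f + gi / n for f, gi in zip(full, g)]
        w = list(w_snap)
        for _ in range(inner_steps):
            i = rng.randrange(n)
            g_w = grad_i(w, A[i], y[i])
            g_s = grad_i(w_snap, A[i], y[i])
            # Variance-reduced gradient, then the proximal (soft-threshold) step.
            w = [soft_threshold(wj - step * (gw - gs + fj), step * lam)
                 for wj, gw, gs, fj in zip(w, g_w, g_s, full)]
        w_snap = w  # assumption: last iterate becomes the new snapshot
    return w_snap

# Synthetic, noise-free regression data (hypothetical).
data_rng = random.Random(1)
w_true = [1.0, -2.0, 0.0]
A = [[data_rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(20)]
y = [sum(ai * wi for ai, wi in zip(a, w_true)) for a in A]

w_hat = prox_svrg(A, y, lam=0.01, step=0.05, epochs=20, inner_steps=40)
print(w_hat, objective(w_hat, A, y, 0.01))
```

An asynchronous variant would let several workers run the inner loop concurrently against shared parameters, which is where the sparsity and block-partition conditions in the abstract come into play.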

algorithm, artificial intelligence, machine learning

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia (0.28)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)