individual information
Extracting individual variable information for their decoupling, direct mutual information and multi-feature Granger causality
Multiple variables usually carry complex dependencies that are difficult to control. This article proposes extracting their individual information, e.g. $\overline{X|Y}$ as a random variable containing the information from $X$ but with the information about $Y$ removed, by using the reversible normalization $(x,y) \leftrightarrow (\bar{x}=\textrm{CDF}_{X|Y=y}(x),y)$. One application is decoupling the individual information of variables: reversibly transforming $(X_1,\ldots,X_n)\leftrightarrow(\tilde{X}_1,\ldots \tilde{X}_n)$ so that together they contain the same information but are independent: $\forall_{i\neq j} \tilde{X}_i\perp \tilde{X}_j, \tilde{X}_i\perp X_j$. This requires detailed models of complex conditional probability distributions, which is generally a difficult task, but here it can be done through multiple dependency-reducing iterations using imperfect methods (here HCR: Hierarchical Correlation Reconstruction). The approach could also be used for direct mutual information: evaluating direct information transfer, without the use of intermediate variables. For causality direction, multi-feature Granger causality is discussed, e.g. to trace various types of individual information transfer between such decoupled variables, including propagation time (delay).
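As a rough illustration of the normalization $(x,y) \leftrightarrow (\bar{x}=\textrm{CDF}_{X|Y=y}(x),y)$ described above, the sketch below assumes a toy case in which the conditional distribution is known exactly: a standard bivariate Gaussian with correlation `RHO`, for which $X|Y=y \sim N(\rho y, 1-\rho^2)$. This closed-form model is an assumption made only for the example; it is not the HCR estimator the abstract mentions, and the function names (`normalize`, `denormalize`, `sample_pairs`, `pearson`) are hypothetical.

```python
import math
import random
from statistics import NormalDist

RHO = 0.8  # assumed correlation of the toy bivariate Gaussian


def conditional(y, rho=RHO):
    """Distribution of X given Y=y for a standard bivariate Gaussian."""
    return NormalDist(mu=rho * y, sigma=math.sqrt(1.0 - rho ** 2))


def normalize(x, y, rho=RHO):
    """Forward map: x_bar = CDF_{X|Y=y}(x), a value in (0, 1)."""
    return conditional(y, rho).cdf(x)


def denormalize(x_bar, y, rho=RHO):
    """Inverse map: recover x from (x_bar, y)."""
    return conditional(y, rho).inv_cdf(x_bar)


def sample_pairs(n, rho=RHO, seed=0):
    """Draw n correlated (x, y) pairs from the toy model."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        x = rho * y + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    return pairs


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    pairs = sample_pairs(5000)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bars = [normalize(x, y) for x, y in pairs]
    print("corr(X, Y)     =", pearson(xs, ys))
    print("corr(X_bar, Y) =", pearson(x_bars, ys))
```

In this toy setting the normalized variable should be approximately uniform on (0, 1) and nearly uncorrelated with $Y$, while `denormalize` recovers the original $x$ from $(\bar{x}, y)$, which is what makes the transformation reversible. With real data the conditional CDF would have to be estimated, and the quality of the decoupling depends on that estimate.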
Tensor Valued Common and Individual Feature Extraction: Multi-dimensional Perspective
Kisil, Ilia, Calvi, Giuseppe G., Mandic, Danilo P.
Modern datasets in data science applications have immense volume, veracity, velocity and variety (the four V's of big data) [1, 2], and often exhibit a large degree of structural richness among their entries. These data characteristics are often prohibitive to the application of classical matrix algebra, as its "flat-view" way of operation cannot cope with the sheer volume of data and the corresponding imbalanced matrix structures, such as "tall and narrow" or "short and wide" ones. On the other hand, when arranged in multidimensional structures (tensors), the same data often admit much more convenient and mathematically tractable ways of analysis, by virtue of the associated multi-linear algebra. However, until recently, such an approach to data analysis was not very popular, due to its high demand for storage and computational resources. There are several ways to tensorize data prior to further analysis, such as through: (i) natural tensor formation, (ii) experimental design, or (iii) mathematical construction [3]. This flexibility and the highly informative nature of multi-way data representation is supported by
Figure 1: Efficient representation of an imbalanced block-matrix structure (a set of video frames, top row) in the form of a much more convenient and flexible tensor structure (a cube of frames, bottom row).
Multi-View Correlated Feature Learning by Uncovering Shared Component
Xue, Xiaowei (Zhejiang University) | Nie, Feiping (Northwestern Polytechnical University) | Wang, Sen (Griffith University) | Chang, Xiaojun (University of Technology Sydney) | Stantic, Bela (Griffith University) | Yao, Min (Zhejiang University)
Learning multiple heterogeneous features from different data sources is challenging. One research topic is how to exploit and utilize the correlations among various features across multiple views with the aim of improving the performance of learning tasks, such as classification. In this paper, we propose a new multi-view feature learning algorithm that simultaneously analyzes features from different views. Compared to most of the existing subspace learning methods that only focus on exploiting a shared latent subspace, our algorithm not only learns individual information in each view but also captures feature correlations among multiple views by learning a shared component. By assuming that such a component is shared by all views, we simultaneously exploit the shared component and individual information of each view in a batch mode. Since the objective function is non-smooth and difficult to solve, we propose an efficient iterative algorithm for optimization with guaranteed convergence. Extensive experiments are conducted on several benchmark datasets. The results demonstrate that our proposed algorithm performs better than all the compared multi-view learning algorithms.
Unification of Information Maximization and Minimization
In the present paper, we propose a method to unify information maximization and minimization in hidden units. Information maximization and minimization are performed on two different levels: the collective and the individual level. Accordingly, two kinds of information are defined: collective information and individual information. By maximizing collective information and minimizing individual information, networks that are simple in terms of the number of connections and the number of hidden units can be generated. The obtained networks are expected to give better generalization and improved interpretation of internal representations.