
Correlogram in R: how to highlight the most correlated variables in a dataset


To tackle this issue and make it much more insightful, let's transform the correlation matrix into a correlation plot. A correlation plot, also referred to as a correlogram, highlights the variables that are most (positively and negatively) correlated. The correlogram represents the correlations for all pairs of variables. Positive correlations are displayed in blue and negative correlations in red. The intensity of the color is proportional to the correlation coefficient, so the stronger the correlation (i.e., the closer to -1 or 1), the darker the boxes.
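
As a minimal sketch of the idea, the R snippet below computes a correlation matrix and lists the most strongly correlated pairs. The built-in `mtcars` data and the column selection are assumptions made for illustration, since the article's own dataset is not shown here. The plot is drawn only if the `corrplot` package happens to be installed, and it is one possible way to draw a correlogram, not necessarily the one the article uses.

```r
# Example data: a few numeric columns from the built-in mtcars dataset
# (an assumption; the original article's dataset is not shown here).
dat <- mtcars[, c("mpg", "cyl", "disp", "hp", "wt", "qsec")]

# Pearson correlation matrix for all pairs of variables
corr_mat <- cor(dat)

# List each pair once (upper triangle) and sort by absolute correlation
idx <- which(upper.tri(corr_mat), arr.ind = TRUE)
pairs <- data.frame(
  var1 = rownames(corr_mat)[idx[, "row"]],
  var2 = colnames(corr_mat)[idx[, "col"]],
  r    = corr_mat[idx]
)
pairs <- pairs[order(-abs(pairs$r)), ]
print(head(pairs, 5), row.names = FALSE)

# Optional: draw the correlogram if the corrplot package is installed.
# Its default palette shows positive values in blue and negative in red.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(corr_mat, method = "color", type = "upper")
}
```

Sorting by absolute value puts strong negative and strong positive correlations side by side, which mirrors what the darker boxes convey visually.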

Elegant correlation table using xtable R package


Correlation matrix analysis is an important method for finding dependence between variables. Computing a correlation matrix and drawing a correlogram are explained here. The aim of this article is to show you how to get the lower and the upper triangular parts of a correlation matrix. We will also use the xtable R package to display a nice correlation table in HTML or LaTeX format. Note that online software is also available here to compute a correlation matrix and to plot a correlogram without any installation.
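
The triangular extraction described above can be sketched in base R as follows. The `mtcars` columns are an assumed example dataset, and the final step runs only if the xtable package is installed; this is an illustration under those assumptions, not the article's own code.

```r
# Correlation matrix on a few columns of the built-in mtcars dataset
# (an assumed example; any numeric data frame would do).
corr_mat <- round(cor(mtcars[, c("mpg", "disp", "hp", "wt")]), 2)

# Lower triangular part: blank out the upper triangle and the diagonal
lower <- corr_mat
lower[upper.tri(lower, diag = TRUE)] <- NA

# Upper triangular part: blank out the lower triangle and the diagonal
upper <- corr_mat
upper[lower.tri(upper, diag = TRUE)] <- NA

# Print with empty cells in place of NA
print(lower, na.print = "")
print(upper, na.print = "")

# Optional: render the lower triangle as an HTML table with xtable
# (use type = "latex" for LaTeX output instead)
if (requireNamespace("xtable", quietly = TRUE)) {
  print(xtable::xtable(lower), type = "html")
}
```

Setting `diag = TRUE` also drops the diagonal of ones, which carries no information in a correlation table.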

Linear readout from a neural population with partial correlation data

Neural Information Processing Systems

How much information does a neural population convey about a stimulus? Answers to this question are known to depend strongly on the correlation of response variability in neural populations. These noise correlations, however, are essentially immeasurable, as the number of parameters in a noise correlation matrix grows quadratically with population size. Here, we suggest bypassing this problem by imposing a parametric model on the noise correlation matrix. Our basic assumption is that noise correlations arise due to common inputs between neurons.

How to detect spurious correlations, and how to find the real ones


Originally posted on DataScienceCentral, by Dr. Granville. Specifically designed in the context of big data in our research lab, the new and simple strong correlation synthetic metric proposed in this article should be used whenever you want to check if there is a real association between two variables, especially in large-scale automated data science or machine learning projects. Use this new metric now to avoid being accused of reckless data science and even being sued for wrongful analytic practice. In this paper, the traditional correlation is referred to as the weak correlation, as it captures only a small part of the association between two variables: weak correlation results in capturing spurious correlations and predictive modeling deficiencies, even with as few as 100 variables.

Maxima in the thermodynamic response and correlation functions of deeply supercooled water


One explanation for the divergence of many of the thermodynamic properties of water is that there is a critical point in deeply supercooled water at some positive pressure. For bulk water samples, these conditions are described as "no man's land," because ice nucleates before such temperatures can be reached. Kim et al. used femtosecond x-ray laser pulses to probe micrometer-sized water droplets cooled to 227 K (see the Perspective by Gallo and Stanley). The temperature dependence of the isothermal compressibility and correlation length extracted from x-ray scattering functions showed maxima at 229 K for H2O and 233 K for D2O, rather than diverging to infinity. These results point to the existence of the Widom line, a locus of maximum correlation lengths emanating from a critical point in the supercooled regime.