Dependency detection with similarity constraints
Lahti, Leo; Myllykangas, Samuel; Knuutila, Sakari; Kaski, Samuel
Unsupervised two-view learning, or detection of dependencies between two paired data sets, is typically done by some variant of canonical correlation analysis (CCA). CCA searches for a linear projection of each view such that the correlation between the projections is maximized. The solution is invariant to any invertible linear transformation of either or both of the views. For tasks with small sample size, such flexibility leads to overfitting, and the problem is even more severe for more flexible nonparametric or kernel-based dependency discovery methods. We develop variants that reduce the degrees of freedom by imposing constraints on the similarity of the projections in the two views. A particular example is provided by a cancer gene discovery application, where chromosomal distance affects the dependencies between gene copy number and activity levels. Similarity constraints are shown to improve the detection of known cancer genes.
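To make the CCA objective concrete, the following is a minimal sketch of ordinary, unconstrained CCA for the leading pair of projections, computed by whitening each view and taking an SVD of the whitened cross-covariance. It is not the similarity-constrained variant developed in the paper. The function name `cca_first_pair` and the small ridge term `reg` are assumptions introduced here for illustration and numerical stability.

```python
import numpy as np


def cca_first_pair(X, Y, reg=1e-8):
    """Leading canonical correlation and projection vectors (illustrative sketch).

    X has shape (n, p) and Y has shape (n, q); rows are paired samples.
    `reg` is a small ridge term added for numerical stability (an assumption
    of this sketch, not part of the paper).
    """
    # Center each view.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariance and cross-covariance matrices.
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    # Cholesky factors: Cxx = Lx Lx^T and Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)

    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T

    # Singular values of M are the canonical correlations.
    U, s, Vt = np.linalg.svd(M)

    # Map the leading singular vectors back to the original coordinates.
    wx = np.linalg.solve(Lx.T, U[:, 0])
    wy = np.linalg.solve(Ly.T, Vt[0])
    return s[0], wx, wy
```

Applying an invertible matrix `A` to one view, as in `cca_first_pair(X @ A, Y)`, should leave the leading correlation essentially unchanged (up to the effect of `reg`). This is the invariance noted in the abstract, and it is the freedom that similarity constraints are meant to restrict.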
Jan-31-2011