One-sided Matrix Completion from Two Observations Per Row

Steven Cao, Percy Liang, Gregory Valiant

Jun-6-2023–arXiv.org Artificial Intelligence

Given only a few observed entries from a low-rank matrix X, matrix completion is the problem of imputing the missing entries, and it formalizes a wide range of real-world settings that involve estimating missing data. However, when there are too few observed entries to complete the matrix, what other aspects of the underlying matrix can be reliably recovered? We study one such problem setting, that of "one-sided" matrix completion, where our goal is to recover the right singular vectors of X, even in the regime where recovering the left singular vectors is impossible, which arises when there are more rows than columns and very few observations. We propose a natural algorithm that involves imputing [...]

[...] However, most of our understanding is restricted to settings where each row and each column have more observations than the rank of the underlying matrix. It is natural that past work operated under this assumption because full matrix completion is impossible without it: for a rank-r matrix X with shape m × d, one can show that estimating the matrix is impossible with o(r(m + d)) observations. Nonetheless, many important applications do not satisfy this assumption: for example, in low-coverage genotype imputation (Li et al., 2009), we might sequence d = 2,000 people for 10,000 genetic variants each, out of the m = 10,000,000 genetic variants in humans. Represented as a matrix, we have a 10,000,000 × 2,000 matrix with 2,000 × 10,000 = 20,000,000 total observations, or about two observations per row on average, which is certainly much less than the rank of the matrix.
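The arithmetic in the genotype example can be checked with a short Python sketch. The function name `observation_budget` and the value `assumed_rank = 10` are illustrative assumptions, not taken from the paper, which does not state a specific rank here. The quantity r(m + d) is used only as a rough yardstick, since the o(r(m + d)) statement is asymptotic.

```python
def observation_budget(m, d, observed_per_column, rank):
    """Return (total observations, observations per row, r * (m + d)).

    m: number of rows (genetic variants)
    d: number of columns (people sequenced)
    observed_per_column: entries observed for each column
    rank: assumed rank of the underlying matrix (hypothetical)
    """
    total = d * observed_per_column
    per_row = total / m
    yardstick = rank * (m + d)
    return total, per_row, yardstick


# Numbers from the genotype imputation example in the text.
m = 10_000_000               # genetic variants (rows)
d = 2_000                    # people (columns)
observed_per_person = 10_000 # variants sequenced per person
assumed_rank = 10            # hypothetical value, for illustration only

total, per_row, yardstick = observation_budget(
    m, d, observed_per_person, assumed_rank
)
print(total)              # 20000000
print(per_row)            # 2.0
print(total < yardstick)  # True
```

Even with a rank as small as the assumed 10, the 20,000,000 observations fall well short of r(m + d) = 100,020,000, which is consistent with the text's point that full completion is out of reach in this regime.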
