Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent

Neural Information Processing Systems

Matrix completion, where we wish to recover a low-rank matrix by observing a few of its entries, is a widely studied problem in both theory and practice. Most provable algorithms for this problem have been restricted to the offline setting, where they produce an estimate of the unknown matrix using all observations simultaneously. In many applications, however, the online version, where we observe one entry at a time and dynamically update our estimate, is more appealing. While existing algorithms are efficient in the offline setting, they can be highly inefficient online. In this paper, we propose the first provable, efficient online algorithm for matrix completion. Our algorithm starts from an initial estimate of the matrix and then performs non-convex stochastic gradient descent (SGD). After every observation, it performs a fast update involving only one row of each of two tall factor matrices, giving near-linear total runtime. Our algorithm can naturally be used in the offline setting as well, where its sample complexity and runtime are competitive with state-of-the-art algorithms. Our proofs introduce a general framework showing that SGD updates tend to stay away from saddle surfaces, which may be of broader interest for other non-convex problems.
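The per-observation update described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact algorithm: it assumes a factorization M ≈ UVᵀ with tall factors U (n×k) and V (m×k), and on observing entry (i, j) it takes one squared-error SGD step that touches only row i of U and row j of V. The function name, learning rate, and initialization are illustrative choices.

```python
import numpy as np

def online_mc_step(U, V, i, j, M_ij, lr=0.05):
    """One online SGD step after observing entry (i, j) of the matrix.

    Only row i of U and row j of V are touched, so each update costs
    O(k) time, giving near-linear total runtime over all observations.
    """
    residual = U[i] @ V[j] - M_ij   # error on the observed entry
    grad_Ui = residual * V[j]       # gradient of 0.5 * residual**2 w.r.t. U[i]
    grad_Vj = residual * U[i]       # gradient w.r.t. V[j]
    U[i] -= lr * grad_Ui
    V[j] -= lr * grad_Vj
    return U, V
```

Repeatedly applying this step to streamed observations drives the prediction U[i]·V[j] toward the observed value; the paper's analysis concerns why such updates, from a good initial estimate, avoid saddle points of the non-convex objective.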









Neural Information Processing Systems

This section reports the recall performance of MHN and BayesPCN models on high-query-noise associative recall tasks. Table 5 describes the CIFAR-10 recall results of nine structurally identical BayesPCN models with four hidden layers of size 1024, a single particle, and GELU activations, but with different values of σ_W and σ_x. On visual inspection, we found that the model's auto-associative recall outputs for both observed and unobserved inputs became less blurry as more datapoints were written into memory. Both GPCN and BayesPCN are, at their core, as much generative models as they are associative memories.


Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Neural Information Processing Systems

Empowered by neural networks, deep reinforcement learning (DRL) has achieved tremendous empirical success. However, DRL requires a large dataset collected by interacting with the environment, which is unrealistic in critical scenarios such as autonomous driving and personalized medicine. In this paper, we study how to incorporate a dataset collected in the offline setting to improve sample efficiency in the online setting. In incorporating such observational data, we face two challenges.