Tran, Hoang
Ensemble score filter with image inpainting for data assimilation in tracking surface quasi-geostrophic dynamics with partial observations
Liang, Siming, Tran, Hoang, Bao, Feng, Chipilski, Hristo G., van Leeuwen, Peter Jan, Zhang, Guannan
Data assimilation plays a pivotal role in understanding and predicting turbulent systems in geoscience and weather forecasting, where it must address three fundamental challenges: high dimensionality, nonlinearity, and partial observations. Recent advances in machine learning (ML)-based data assimilation methods have demonstrated encouraging results. In this work, we develop an ensemble score filter (EnSF) that integrates image inpainting to solve data assimilation problems with partial observations. The EnSF method exploits a specially designed training-free diffusion model to solve high-dimensional nonlinear data assimilation problems. Its performance has been successfully demonstrated in the full-observation setting, i.e., when all state variables are directly or indirectly observed. However, because the EnSF does not use a covariance matrix to capture the dependence between observed and unobserved state variables, extending the original EnSF method to the partial-observation scenario is nontrivial. We therefore incorporate various image inpainting techniques into the EnSF to predict the unobserved states during data assimilation. At each filtering step, we first use the diffusion model to estimate the observed states by integrating the likelihood information into the score function; we then use image inpainting methods to predict the unobserved state variables. We demonstrate the performance of the EnSF with inpainting by tracking the surface quasi-geostrophic (SQG) model dynamics under a variety of scenarios. This successful proof of concept paves the way for more in-depth investigations into exploiting modern image inpainting techniques to advance data assimilation methodology for practical geoscience and weather forecasting problems.
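The abstract does not commit to a single inpainting method. As one concrete illustration of the second step, harmonic (Laplace) inpainting fills the unobserved grid points smoothly from the observed ones. Below is a minimal NumPy sketch of that fill step only, under our own simplifying assumptions; the function names are ours, and this is not the paper's implementation.

```python
import numpy as np

def harmonic_inpaint(field, observed_mask, n_iters=5000, tol=1e-7):
    """Fill entries where observed_mask is False by Jacobi iteration on the
    discrete Laplace equation; observed entries are held fixed."""
    filled = np.where(observed_mask, field, field[observed_mask].mean())
    for _ in range(n_iters):
        padded = np.pad(filled, 1, mode="edge")
        # Average of the four grid neighbors.
        neighbors = 0.25 * (padded[:-2, 1:-1] + padded[2:, 1:-1]
                            + padded[1:-1, :-2] + padded[1:-1, 2:])
        update = np.where(observed_mask, filled, neighbors)
        if np.max(np.abs(update - filled)) < tol:
            return update
        filled = update
    return filled

# Toy usage: a smooth 2D field observed at a random 30% of grid points.
rng = np.random.default_rng(0)
xx, yy = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
truth = np.sin(2 * np.pi * xx) * np.cos(2 * np.pi * yy)
mask = rng.random(truth.shape) < 0.3
recon = harmonic_inpaint(truth, mask)
print("RMSE on unobserved points:",
      np.sqrt(np.mean((recon[~mask] - truth[~mask]) ** 2)))
```

In a filtering loop, such a fill would run once per ensemble member, after the score-based update of the observed entries.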
Agricultural Landscape Understanding At Country-Scale
Dua, Radhika, Saxena, Nikita, Agarwal, Aditi, Wilson, Alex, Singh, Gaurav, Tran, Hoang, Deshpande, Ishan, Kaur, Amandeep, Aggarwal, Gaurav, Nath, Chandan, Basu, Arnab, Batchu, Vishal, Holla, Sharath, Kurle, Bindiya, Missura, Olana, Aggarwal, Rahul, Garg, Shubhika, Shah, Nishi, Singh, Avneet, Tewari, Dinesh, Dondzik, Agata, Adsul, Bharat, Sohoni, Milind, Praveen, Asim Rama, Dangi, Aaryan, Kadivar, Lisan, Abhishek, E, Sudhansu, Niranjan, Hattekar, Kamlakar, Datar, Sameer, Chaithanya, Musty Krishna, Reddy, Anumas Ranjith, Kumar, Aashish, Tirumala, Betala Laxmi, Talekar, Alok
The global food system is facing unprecedented challenges. In 2023, 2.4 billion people experienced moderate to severe food insecurity [1], a crisis precipitated by anthropogenic climate change and evolving dietary preferences. Furthermore, the food system itself significantly contributes to the climate crisis, with food loss and waste accounting for 2.4 gigatonnes of carbon dioxide equivalent emissions per year (Gt CO2e/yr) [2], and the production, mismanagement, and misapplication of agricultural inputs such as fertilizers and manure generating an additional 2.5 Gt CO2e/yr [3]. To sustain a projected global population of 9.6 billion by 2050, the Food and Agriculture Organization (FAO) estimates that food production must increase by at least 60% [1]. However, this also presents an opportunity: transitioning to sustainable agricultural practices can transform the sector from a net source of greenhouse gas emissions to a vital carbon sink.
Empirical Tests of Optimization Assumptions in Deep Learning
Tran, Hoang, Zhang, Qinzi, Cutkosky, Ashok
There is a significant gap between our theoretical understanding of optimization algorithms used in deep learning and their practical performance. Theoretical development usually focuses on proving convergence guarantees under a variety of different assumptions, which are themselves often chosen based on a rough combination of intuitive match to practice and analytical convenience. The theory/practice gap may then arise because of the failure to prove a theorem under such assumptions, or because the assumptions do not reflect reality. In this paper, we carefully measure the degree to which these assumptions are capable of explaining modern optimization algorithms by developing new empirical metrics that closely track the key quantities that must be controlled in theoretical analysis. All of our tested assumptions (including typical modern assumptions based on bounds on the Hessian) fail to reliably capture optimization performance. This highlights a need for new empirical verification of analytical assumptions used in theoretical analysis.
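For a flavor of the kind of quantity such metrics track (this is a classic probe, not necessarily one of the paper's metrics), consider the effective local smoothness constant $\hat{L}_t = \|\nabla f(x_{t+1}) - \nabla f(x_t)\| / \|x_{t+1} - x_t\|$ measured along the actual training trajectory; bounded-Hessian assumptions assert that this quantity stays controlled. A minimal sketch on a toy logistic-regression loss:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))                   # toy features
y = (A @ rng.normal(size=10) > 0).astype(float)  # toy labels

def grad(w):
    """Full-batch gradient of the logistic loss."""
    p = 1.0 / (1.0 + np.exp(-A @ w))
    return A.T @ (p - y) / len(y)

w, lr = np.zeros(10), 0.5
for t in range(101):
    g = grad(w)
    w_next = w - lr * g
    # Ratio of gradient change to iterate change along the trajectory:
    # the quantity that smoothness assumptions claim is bounded.
    L_hat = (np.linalg.norm(grad(w_next) - g)
             / (np.linalg.norm(w_next - w) + 1e-12))
    if t % 25 == 0:
        print(f"step {t:3d}: local smoothness estimate {L_hat:.4f}")
    w = w_next
```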
Orthogonally weighted $\ell_{2,1}$ regularization for rank-aware joint sparse recovery: algorithm and analysis
Petrosyan, Armenak, Pieper, Konstantin, Tran, Hoang
We propose and analyze an efficient algorithm for solving the joint sparse recovery problem using a new regularization-based method, named orthogonally weighted $\ell_{2,1}$ ($\mathit{ow}\ell_{2,1}$), which is specifically designed to take into account the rank of the solution matrix. This method has applications in feature extraction, matrix column selection, and dictionary learning, and it is distinct from commonly used $\ell_{2,1}$ regularization and other existing regularization-based approaches because it can exploit the full rank of the row-sparse solution matrix, a key feature in many applications. We prove the method's rank-awareness, establish the existence of solutions to the proposed optimization problem, and develop an efficient algorithm for solving it, together with a convergence analysis. We also present numerical experiments to illustrate the theory and demonstrate the effectiveness of our method on real-life problems.
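To make the contrast concrete, the sketch below computes the standard $\ell_{2,1}$ penalty (sum of row norms) alongside an orthogonally weighted variant. The exact definition of $\mathit{ow}\ell_{2,1}$ is given in the paper; here we assume, purely for illustration, the form $\|X(X^\top X)^{-1/2}\|_{2,1}$, i.e., the row norms taken after orthogonalizing the columns of $X$:

```python
import numpy as np

def l21(X):
    """Standard l_{2,1} penalty: sum of Euclidean norms of the rows."""
    return np.sum(np.linalg.norm(X, axis=1))

def ow_l21(X, eps=1e-10):
    """ASSUMED orthogonally weighted variant (our illustrative form, not
    necessarily the paper's definition): l_{2,1} norm of X after right-
    multiplying by a clamped inverse square root of the Gram matrix."""
    vals, vecs = np.linalg.eigh(X.T @ X)
    inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(np.maximum(vals, eps))) @ vecs.T
    return l21(X @ inv_sqrt)

# A row-sparse matrix with full column rank on its support:
rng = np.random.default_rng(0)
X = np.zeros((100, 3))
X[:5] = rng.normal(size=(5, 3))
print("l21   :", l21(X))
print("ow_l21:", ow_l21(X))  # unchanged if the columns of X are rescaled
```

Under this assumed form, the penalty depends only on the column space of $X$, which is one way a regularizer can be made sensitive to the rank of the row-sparse solution.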
Correcting Momentum with Second-order Information
Tran, Hoang, Cutkosky, Ashok
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point using the optimal $O(\epsilon^{-3})$ stochastic gradient and Hessian-vector product computations. Our algorithm uses Hessian-vector products to "correct" a bias term in the momentum of SGD with momentum. This leads to better gradient estimates in a manner analogous to variance reduction methods. In contrast to prior work, we do not require excessively large batch sizes (or indeed any restrictions at all on the batch size), and both our algorithm and its analysis are much simpler. We validate our results on a variety of large-scale deep learning benchmarks and architectures, where we see improvements over SGD and Adam.
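Our reading of the core update, paraphrased rather than quoted from the paper: the stale momentum is transported from $x_{t-1}$ to $x_t$ via the first-order Taylor expansion $\nabla f(x_t) \approx \nabla f(x_{t-1}) + H(x_t - x_{t-1})$, with the Hessian-vector product supplying the correction term. A minimal sketch on a noisy quadratic, approximating the Hessian-vector product by a finite difference of gradients:

```python
import numpy as np

rng = np.random.default_rng(0)
Q = rng.normal(size=(20, 20))
Q = Q @ Q.T / 20 + np.eye(20)  # positive-definite curvature

def grad(x, noise=0.1):
    """Stochastic gradient of the quadratic f(x) = 0.5 x^T Q x."""
    return Q @ x + noise * rng.normal(size=x.shape)

def hvp(x, v, eps=1e-5):
    """Finite-difference Hessian-vector product at x along v."""
    return (grad(x + eps * v, noise=0.0) - grad(x, noise=0.0)) / eps

x, x_prev = rng.normal(size=20), None
m, alpha, lr = np.zeros(20), 0.1, 0.05
for t in range(200):
    g = grad(x)
    if x_prev is None:
        m = g
    else:
        # Transport the old momentum from x_prev to x, then mix in the
        # fresh stochastic gradient.
        m = (1 - alpha) * (m + hvp(x, x - x_prev)) + alpha * g
    x_prev, x = x, x - lr * m
print("final |x| =", np.linalg.norm(x))
```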
On Cross-validation for Sparse Reduced Rank Regression
She, Yiyuan, Tran, Hoang
In high-dimensional data analysis, regularization methods pursuing sparsity and/or low rank have recently received considerable attention. To provide a proper amount of shrinkage, it is typical to use a grid search and a model comparison criterion to find the optimal regularization parameters. However, we show that fixing the parameters across all folds can lead to inconsistency, and that it is more appropriate to cross-validate projection-selection patterns to obtain the best coefficient estimate. Our in-sample error studies in jointly sparse and rank-deficient models lead to a new class of information criteria with four scale-free forms that bypass the estimation of the noise level. Using an identity, we propose a novel scale-free calibration that helps cross-validation achieve the minimax optimal error rate non-asymptotically. Experiments support the efficacy of the proposed methods.
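A minimal sketch of the pattern-based cross-validation idea, simplified to single-response lasso (the paper's setting is jointly sparse, rank-deficient multivariate regression, with projection as well as selection patterns): candidate selection patterns are generated once from a regularization path, and each pattern is then cross-validated with unpenalized refits on every fold, instead of fixing the penalty parameter across folds.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 120, 15
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + 0.5 * rng.normal(size=n)

def lasso_cd(X, y, lam, n_iters=500):
    """Plain coordinate-descent lasso, enough to generate sparsity patterns."""
    b = np.zeros(X.shape[1])
    col_sq = np.sum(X**2, axis=0)
    for _ in range(n_iters):
        for j in range(X.shape[1]):
            r = y - X @ b + X[:, j] * b[j]   # partial residual excluding j
            z = X[:, j] @ r
            b[j] = np.sign(z) * max(abs(z) - lam, 0) / col_sq[j]
    return b

# Step 1: candidate selection patterns from a lambda path on the full data.
patterns = {tuple(np.nonzero(lasso_cd(X, y, lam))[0])
            for lam in [5, 10, 20, 40, 80]}

# Step 2: cross-validate each pattern with unpenalized refits per fold.
folds = np.array_split(rng.permutation(n), 5)
def cv_error(support):
    S, errs = list(support), []
    for f in folds:
        tr = np.setdiff1d(np.arange(n), f)
        if S:
            bh = np.linalg.lstsq(X[np.ix_(tr, S)], y[tr], rcond=None)[0]
            pred = X[np.ix_(f, S)] @ bh
        else:
            pred = np.zeros(len(f))
        errs.append(np.mean((y[f] - pred) ** 2))
    return np.mean(errs)

best = min(patterns, key=cv_error)
print("selected support:", best)
```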