
A Theory of Nonparametric Covariance Function Estimation for Discretely Observed Data

Terada, Yoshikazu, Yara, Atsutomo

arXiv.org Machine Learning

We study nonparametric covariance function estimation for functional data observed with noise at discrete locations on a $d$-dimensional domain. Estimating the covariance function from discretely observed data is a challenging nonparametric problem, particularly in multidimensional settings, since the covariance function is defined on a product domain and thus suffers from the curse of dimensionality. This motivates the use of adaptive estimators, such as deep learning estimators. However, existing theoretical results are largely limited to estimators with explicit analytic representations, and the properties of general learning-based estimators remain poorly understood. We establish an oracle inequality for a broad class of learning-based estimators that applies to both sparse and dense observation regimes in a unified manner, and derive convergence rates for deep learning estimators over several classes of covariance functions. The resulting rates suggest that structural adaptation can mitigate the curse of dimensionality, similarly to classical nonparametric regression. We further compare the convergence rates of learning-based estimators with several existing procedures. For a one-dimensional smoothness class, deep learning estimators are suboptimal, whereas local linear smoothing estimators achieve a faster rate. For a structured function class, however, deep learning estimators attain the minimax rate up to polylogarithmic factors, whereas local linear smoothing estimators are suboptimal. These results reveal a distinctive adaptivity-variance trade-off in covariance function estimation.
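To make the estimation problem concrete, here is a minimal sketch of a classical kernel smoother for the covariance function in the one-dimensional case. It is not the estimator analyzed in the paper: it is a local-constant (Nadaraya-Watson) smoother, simpler than the local linear smoothing the abstract mentions, and it assumes the curves have mean zero. The function names, the Gaussian kernel, and the `bandwidth` parameter are illustrative assumptions.

```python
import numpy as np

def raw_covariance_pairs(times, values):
    """Collect off-diagonal products Y_ij * Y_ik (j != k) for every curve.

    times, values: lists of 1-D arrays, one pair per observed curve.
    Diagonal pairs (j == k) are dropped because they carry the noise variance.
    Assumes zero-mean curves (an assumption of this sketch).
    """
    s_list, t_list, p_list = [], [], []
    for t, y in zip(times, values):
        # Index pairs (j, k) with j != k for this curve.
        j, k = np.nonzero(~np.eye(len(t), dtype=bool))
        s_list.append(t[j])
        t_list.append(t[k])
        p_list.append(y[j] * y[k])
    return np.concatenate(s_list), np.concatenate(t_list), np.concatenate(p_list)

def kernel_covariance(s0, t0, s, t, prod, bandwidth):
    """Local-constant estimate of the covariance at (s0, t0) with a Gaussian kernel."""
    w = np.exp(-0.5 * ((s - s0) ** 2 + (t - t0) ** 2) / bandwidth ** 2)
    return np.sum(w * prod) / np.sum(w)
```

The smoothing runs over pairs of locations, so the effective domain has twice the dimension of the original one. This is the product-domain difficulty the abstract refers to.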


Modeling Dynamic Missingness of Implicit Feedback for Recommendation

Menghan Wang, Mingming Gong, Xiaolin Zheng, Kun Zhang

Neural Information Processing Systems

Collaborative filtering methods based on implicit feedback (e.g., purchase records and browsing history) are widely used in recommender systems. Compared to explicit feedback (e.g., 1-5 star ratings), implicit feedback is more abundant and accessible in real-world applications. However, the missing data of implicit feedback also brings two challenges.



HotEncodings

Neural Information Processing Systems

Landmarks were distributed at distance 200 from the origin, every π/3 radians, starting at 0. The listener particle had a mass of 0.1; its force actions Episodes lasted 50 timesteps. The 9-points environment used the same physics engine as triangle and episodes also lasted 50 timesteps. We found that additional time enabled more convergent behaviors: that is, agents had to learn to reach a location and stop for maximal reward, as opposed to traveling at full speed in a fixed direction. When experimenting with environmental noise, all agents were trained using identical noise models and evaluated with zero noise.
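The landmark layout described above can be sketched as follows. This is an illustration under two assumptions: the angular spacing is read as π/3 radians, and the landmarks cover the full circle, which gives six of them. The function name and its parameters are hypothetical and do not come from the paper's code.

```python
import math

def landmark_positions(radius=200.0, step=math.pi / 3, start=0.0):
    """Return (x, y) landmark coordinates placed on a circle around the origin.

    Assumes landmarks cover the full circle, so the count is 2*pi / step.
    """
    count = round(2 * math.pi / step)
    return [
        (radius * math.cos(start + i * step), radius * math.sin(start + i * step))
        for i in range(count)
    ]

positions = landmark_positions()
```

With the default arguments the first landmark lies on the positive x-axis and the rest follow counterclockwise.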


149ad6e32c08b73a3ecc3d11977fcc47-Paper-Conference.pdf

Neural Information Processing Systems

We propose a regularized pairwise pseudo-likelihood approach for matrix completion and prove that the proposed estimator can asymptotically recover the low-rank parameter matrix up to an identifiable equivalence class of a constant shift and scaling, at a near-optimal asymptotic convergence rate of the standard well-posed (non-informative missing) setting, while effectively mitigating the impact of informative missingness.


Theoretical Compression Bounds for Wide Multilayer Perceptrons

Cheairi, Houssam El, Gamarnik, David, Mazumder, Rahul

arXiv.org Artificial Intelligence

Pruning and quantization techniques have been broadly successful in reducing the number of parameters needed for large neural networks, yet theoretical justification for their empirical success falls short. We consider a randomized greedy compression algorithm for pruning and quantization post-training and use it to rigorously show the existence of pruned/quantized subnetworks of multilayer perceptrons (MLPs) with competitive performance. We further extend our results to structured pruning of MLPs and convolutional neural networks (CNNs), thus providing a unified analysis of pruning in wide networks. Our results are free of data assumptions, and showcase a tradeoff between compressibility and network width. The algorithm we consider bears some similarities with Optimal Brain Damage (OBD) and can be viewed as a post-training randomized version of it. The theoretical results we derive bridge the gap between theory and application for pruning/quantization, and provide a justification for the empirical success of compression in wide multilayer perceptrons.


Adversarial Attacks on Downstream Weather Forecasting Models: Application to Tropical Cyclone Trajectory Prediction

Deng, Yue, Santos, Francisco, Tan, Pang-Ning, Luo, Lifeng

arXiv.org Machine Learning

Deep learning based weather forecasting (DLWF) models leverage past weather observations to generate future forecasts, supporting a wide range of downstream tasks, including tropical cyclone (TC) trajectory prediction. In this paper, we investigate their vulnerability to adversarial attacks, where subtle perturbations to the upstream weather forecasts can alter the downstream TC trajectory predictions. Although research on adversarial attacks in DLWF models has grown recently, generating perturbed upstream forecasts that reliably steer downstream output toward attacker-specified trajectories remains a challenge. First, conventional TC detection systems are opaque, non-differentiable black boxes, making standard gradient-based attacks infeasible. Second, the extreme rarity of TC events leads to a severe class imbalance problem, making it difficult to develop efficient attack methods that produce the attacker's target trajectories. Third, maintaining physical consistency in adversarially generated forecasts presents another significant challenge. To overcome these limitations, we propose Cyc-Attack, a novel method that perturbs the upstream forecasts of DLWF models to generate adversarial trajectories. First, we pre-train a differentiable surrogate model to approximate the TC detector's output, enabling the construction of gradient-based attacks. Cyc-Attack also employs a skewness-aware loss function with a kernel dilation strategy to address the imbalance problem. Finally, a distance-based gradient weighting scheme and regularization are used to constrain the perturbations and eliminate spurious trajectories, ensuring that the adversarial forecasts are realistic and not easily detectable.



ERASOR++: Height Coding Plus Egocentric Ratio Based Dynamic Object Removal for Static Point Cloud Mapping

Zhang, Jiabao, Zhang, Yu

arXiv.org Artificial Intelligence

Mapping plays a crucial role in location and navigation within automatic systems. However, the presence of dynamic objects in 3D point cloud maps generated from scan sensors can introduce map distortion and long traces, thereby posing challenges for accurate mapping and navigation. To address this issue, we propose ERASOR++, an enhanced approach based on the Egocentric Ratio of Pseudo Occupancy for effective dynamic object removal. To begin, we introduce the Height Coding Descriptor, which combines height difference and height layer information to encode the point cloud. Subsequently, we propose the Height Stack Test, Ground Layer Test, and Surrounding Point Test methods to precisely and efficiently identify the dynamic bins within point cloud bins, thus overcoming the limitations of prior approaches. Through extensive evaluation on open-source datasets, our approach demonstrates superior performance in terms of precision and efficiency compared to existing methods. Furthermore, the techniques described in our work hold promise for addressing various challenging tasks or aspects through subsequent migration.