A Network Architecture

Neural Information Processing Systems

For a fair comparison, our network follows the same structure as CEM-RL [19]. The architecture originally comes from Fujimoto et al. [5]; the only difference is the use of tanh instead of ReLU. We use (400, 300) hidden layers for all environments except Humanoid-v2, for which we use (256, 256) as in TD3 [5]. Most hyperparameters take the same values as in CEM-RL [19].
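As a concrete illustration of the architecture described above, the following is a minimal numpy sketch of a fully connected policy network with tanh activations and (400, 300) hidden layers; the function names and initialization scheme are our own illustrative choices, not taken from the paper's code.

```python
import numpy as np

def init_layers(sizes, rng):
    """Initialize weights/biases for a fully connected net,
    e.g. sizes = [obs_dim, 400, 300, act_dim]."""
    weights = [rng.standard_normal((m, n)) * np.sqrt(1.0 / m)
               for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    return weights, biases

def tanh_mlp_forward(obs, weights, biases):
    """Forward pass using tanh on every layer; the final tanh also
    bounds the action in [-1, 1], as is typical for MuJoCo policies."""
    h = obs
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(h @ W + b)
    return np.tanh(h @ weights[-1] + biases[-1])
```

For Humanoid-v2, the hidden sizes would simply be swapped to `[obs_dim, 256, 256, act_dim]`.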


An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

Neural Information Processing Systems

Deep reinforcement learning (DRL) algorithms and evolution strategies (ES) have been applied to various tasks with excellent performance. The two have complementary properties: DRL offers good sample efficiency but poor stability, while ES offers good stability but poor sample efficiency. Recently, there have been attempts to combine these algorithms, but existing methods rely entirely on a synchronous update scheme, which prevents them from maximizing the benefits of the parallelism in ES. To address this challenge, we adopt an asynchronous update scheme, which enables both good time efficiency and diverse policy exploration. In this paper, we introduce Asynchronous Evolution Strategy-Reinforcement Learning (AES-RL), which maximizes the parallel efficiency of ES and integrates it with policy gradient methods. Specifically, we propose 1) a novel framework to merge ES and DRL asynchronously and 2) various asynchronous update methods that combine the advantages of asynchronism, ES, and DRL: exploration and time efficiency, stability, and sample efficiency, respectively. The proposed framework and update methods are evaluated on continuous control benchmarks, showing superior performance as well as time efficiency compared to previous methods.


reflecting reviewers' comments which are not mentioned in this response

Neural Information Processing Systems

We thank the reviewers for their reviews, which provide meaningful insight and constructive feedback. The result was reversed in Hopper, where RL actors contributed 200.86 while EA actors contributed 363.53. Therefore, all performance scores are measured at a fixed number of interaction steps. R2: Ablation study is missing. We presented the effect of the variance update rule in Appendix C.3 by comparing the results. We then provided all combinations of our proposed mean and variance updates in Table 2. We will add a section so that this can be seen at a glance.


Autoformalizing Mathematical Statements by Symbolic Equivalence and Semantic Consistency
Zenan Li, Yifan Wu, Zhaoyu Li, Xinming Wei

Neural Information Processing Systems

Autoformalization, the task of automatically translating natural language descriptions into a formal language, poses a significant challenge across various domains, especially in mathematics. Recent advancements in large language models (LLMs) have unveiled their promising capabilities to formalize even competition-level math problems. However, we observe a considerable discrepancy between pass@1 and pass@k accuracies in LLM-generated formalizations. To address this gap, we introduce a novel framework that scores and selects the best result from k autoformalization candidates based on two complementary self-consistency methods: symbolic equivalence and semantic consistency. Specifically, symbolic equivalence identifies the logical homogeneity among autoformalization candidates using automated theorem provers, and semantic consistency evaluates the preservation of the original meaning by informalizing the candidates and computing the similarity between the embeddings of the original and informalized texts. Our extensive experiments on the MATH and miniF2F datasets demonstrate that our approach significantly enhances autoformalization accuracy, achieving up to 0.22-1.35x
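The selection step described above can be sketched as follows. This is a schematic reconstruction under our own assumptions: `prove_equiv` stands in for an automated theorem prover, `embed` collapses the informalization-plus-embedding step into one placeholder, and the simple sum of the two scores is an illustrative combination, not necessarily the paper's.

```python
import numpy as np

def select_best_formalization(candidates, prove_equiv, embed, original_text):
    """Pick the best of k candidates by (1) symbolic equivalence:
    group provably equivalent candidates and favor larger groups, and
    (2) semantic consistency: cosine similarity to the original text."""
    k = len(candidates)
    # 1) Symbolic equivalence: union-find grouping of equivalent candidates
    group = list(range(k))
    def find(i):
        while group[i] != i:
            group[i] = group[group[i]]
            i = group[i]
        return i
    for i in range(k):
        for j in range(i + 1, k):
            if prove_equiv(candidates[i], candidates[j]):
                group[find(j)] = find(i)
    roots = [find(i) for i in range(k)]
    sizes = {r: roots.count(r) for r in set(roots)}
    sym_score = np.array([sizes[r] / k for r in roots])
    # 2) Semantic consistency via cosine similarity of embeddings
    ref = embed(original_text)
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    sem_score = np.array([cos(ref, embed(c)) for c in candidates])
    # Combine both signals and return the index of the best candidate
    return int(np.argmax(sym_score + sem_score))
```

With a real prover (e.g. one that checks bidirectional implication) and a real sentence-embedding model plugged in, the same selection logic applies unchanged.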


Supplementary Material for "AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery"

Neural Information Processing Systems

In Sec. 2 we include a datasheet for our dataset, following the methodology from "Datasheets for Datasets" (Gebru et al. [2021]). The data is publicly available at https://allclear.cs.cornell.edu. In this section, we include the prompts from Gebru et al. [2021] in blue, followed by our answers. For what purpose was the dataset created? Was there a specific task in mind? The dataset was created to facilitate research development on cloud removal in satellite imagery. Specifically, our task is more temporally aligned than previous benchmarks.


AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery

Neural Information Processing Systems

Clouds in satellite imagery pose a significant challenge for downstream applications. A major obstacle in current cloud removal research is the absence of a comprehensive benchmark and a sufficiently large and diverse training dataset. To address this problem, we introduce AllClear, the largest public dataset for cloud removal, featuring 23,742 globally distributed regions of interest (ROIs) with diverse land-use patterns, comprising 4 million images in total. Each ROI includes complete temporal captures from the year 2022, with (1) multi-spectral optical imagery from Sentinel-2 and Landsat 8/9, (2) synthetic aperture radar (SAR) imagery from Sentinel-1, and (3) auxiliary remote sensing products such as cloud masks and land cover maps. We validate the effectiveness of our dataset by benchmarking performance, demonstrating a scaling law (PSNR rises from 28.47 to 33.87 with 30x more data), and conducting ablation studies on the temporal length and the importance of individual modalities. This dataset aims to provide comprehensive coverage of the Earth's surface and promote better cloud removal results.
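For reference, the PSNR figures quoted above (28.47 to 33.87 dB) are computed with the standard peak signal-to-noise ratio; a minimal implementation, assuming images normalized to a known peak value, looks like this:

```python
import numpy as np

def psnr(reference, estimate, max_val=1.0):
    """Peak signal-to-noise ratio in dB between a reference image and an
    estimate (e.g. a cloud-removed reconstruction); higher is better."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For 12-bit Sentinel-2 reflectance products, `max_val` would be set to the data's actual dynamic range rather than 1.0.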


Provable Tempered Overfitting of Minimal Nets and Typical Nets

Neural Information Processing Systems

We study the overfitting behavior of fully connected deep neural networks (NNs) with binary weights fitted to perfectly classify a noisy training set. We consider interpolation using both the smallest NN (having the minimal number of weights) and a random interpolating NN. For both learning rules, we prove that overfitting is tempered. Our analysis rests on a new bound on the size of a threshold circuit consistent with a partial function. To the best of our knowledge, ours are the first theoretical results on benign or tempered overfitting that: (1) apply to deep NNs, and (2) do not require a very high or very low input dimension.


Private Identity Testing for High-Dimensional Distributions

Neural Information Processing Systems

We construct two types of testers, exhibiting tradeoffs between sample complexity and computational complexity. Finally, we provide a two-way reduction between testing a subclass of multivariate product distributions and testing univariate distributions, and thereby obtain upper and lower bounds for testing this subclass of product distributions.


comments on the presentation, which we will address while preparing our final manuscript

Neural Information Processing Systems

We thank all the reviewers for their careful reading and thoughtful comments. For example, ε = 1.1 would permit an algorithm to go from Additionally, data analysis pipelines (e.g., model selection) in practice typically contain many