
Supplement to 'Autoencoders that don't overfit towards the Identity'

Neural Information Processing Systems

This supplement provides, in Section 2, the proof of the Theorem in the paper; in Section 3, the derivation of the ADMM equations for optimizing Eq. 10 in the paper; in Section 4, the derivation of the update equations for optimizing Eq. 11 in the paper; and in Section 5, the generalization of Section 3 in the paper to dropout at different layers of a deep network. The first section of the proof provides an overview: we start with the objective function of Eq. 1 in the paper (re-stated in Eq. 2 below) and show that it is equal to the objective function in the Theorem of the paper (see Eq. 8 below) up to the factor ap + bq, an irrelevant constant when optimizing for B. In the following, we provide the detailed steps: we first give the full sequence of manipulations at once, and then describe each step in turn, starting by re-stating Eq. 1 in the paper. Line 5 states the analytic simplifications obtained for parts (a) and (b), respectively, as the number n of training epochs approaches infinity (for convergence); the details are outlined in Sections 2.2 and 2.3 below.
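The central identity behind this overview, namely that the expected dropout objective reduces to a deterministic objective with an L2 penalty Λ determined by p, can be checked numerically. Below is a minimal sketch of that equivalence for plain dropout, i.e., without the emphasis weights a and b of Eq. 1; matrix sizes and variable names are illustrative, not the paper's.

import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, p = 200, 30, 0.5
X = rng.normal(size=(n_users, n_items))
B = rng.normal(scale=0.1, size=(n_items, n_items))

# Monte-Carlo estimate of E ||X - Z B||_F^2 over random dropout epochs,
# where Z applies inverted dropout: zero with prob. p, scale kept entries.
n_epochs, mc = 2000, 0.0
for _ in range(n_epochs):
    keep = rng.random(X.shape) < (1 - p)
    Z = keep * X / (1 - p)
    mc += np.sum((X - Z @ B) ** 2)
mc /= n_epochs

# Equivalent deterministic objective:
#   ||X - X B||_F^2 + sum_j Lambda_jj * ||B[j, :]||^2,
# with Lambda = p/(1-p) * diagMat(diag(X^T X)).
lam = p / (1 - p) * np.sum(X ** 2, axis=0)
closed = np.sum((X - X @ B) ** 2) + np.sum(lam[:, None] * B ** 2)
print(mc, closed)  # the two values agree up to Monte-Carlo noise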



Author Feedback: 'Autoencoders that don't overfit towards the Identity'

Neural Information Processing Systems

We would like to thank all five (!) reviewers for their detailed reviews and their suggestions / questions, which will help improve the paper. In the following we try to address the main points raised. Due to space constraints, we had unfortunately shortened this part of the paper too much, as we now realize. When the learned model relies too much on reproducing each input feature from itself and 'not enough' on the other features it depends on, we call this 'overfitting towards the identity function' in this paper. This is reflected in the diagonal of B (which is controlled by the value of the dropout probability p, or Λ), see Eq. 6. As noted in l. 154-6 (reviewer 4), training (with the diagonal removed) differs from prediction (with the diagonal).
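As context for how p (via Λ) controls the diagonal of B, here is a minimal numerical sketch under our reading of Eq. 6, i.e., the standard closed form B_hat = (X^T X + Λ)^{-1} X^T X with Λ = p/(1-p) * diagMat(diag(X^T X)); the data and sizes are placeholders.

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 20))
G = X.T @ X                                  # Gram matrix X^T X

for p in (0.1, 0.5, 0.9):
    Lam = p / (1 - p) * np.diag(np.diag(G))  # Lambda grows with p
    B_hat = np.linalg.solve(G + Lam, G)      # solves (G + Lambda) B = G
    print(p, np.mean(np.diag(B_hat)))        # mean diagonal shrinks with p

The printed diagonal values decrease monotonically as p grows, illustrating how the dropout probability pulls B away from the identity.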


Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Neural Information Processing Systems

Despite the impressive capabilities of Large Language Models (LLMs) on various tasks, they still struggle with scenarios that involve complex reasoning and planning. Self-correction and self-learning have emerged as viable solutions, employing strategies that allow LLMs to refine their outputs and learn from self-assessed rewards. Yet the efficacy of LLMs in self-refining their responses, particularly on complex reasoning and planning tasks, remains dubious.



Appendix A Data and Code Availability; A.1 Code; A.2 Data; A.3 Results; B Dataset Documentation

Neural Information Processing Systems

The robust ability of LLMs to generate and acquire domain-specific knowledge has been a significant factor in this potential [17]. While researchers have explored the use of LLMs to answer agriculture-related exams [55], their performance in certain crop-cultivation scenarios, such as pest management, has been less than satisfactory [66]. Moreover, a considerable gap remains between answering exam questions and applying this knowledge in real-world situations. To bridge this gap and thoroughly assess LLMs in supporting the crop-science field, we introduce CROP. CROP comprises an instruction-tuning dataset that equips LLMs with the skills needed to aid tasks in crop production, along with a carefully designed benchmark to evaluate the extent to which LLMs meet the demands of real-world agricultural applications. We anticipate that CROP will serve the research community and also provide practical benefits to industry practitioners.

E.2 LLM-based Multi-turn Dialogue Generation

In recent research, several LLM-based approaches have emerged for constructing multi-turn dialogues.


Empowering and Assessing the Utility of Large Language Models in Crop Science

Neural Information Processing Systems

Large language models (LLMs) have demonstrated remarkable efficacy across knowledge-intensive tasks. Nevertheless, their untapped potential in crop science presents an opportunity for advancement.



Frequency-aware Generative Models for Multivariate Time Series Imputation

Neural Information Processing Systems

Missing data in multivariate time series are a common issue that can affect analysis and downstream applications. Although multivariate time series generally consist of trend, seasonal, and residual terms, existing works mainly focus on modeling the first two. However, we find that the residual term is more crucial for accurate imputation, since it reflects the diverse changes in the data and accounts for the largest share of imputation error. Therefore, in this study, we introduce frequency-domain information and design Frequency-aware Generative Models for Multivariate Time Series Imputation (FGTI). Specifically, FGTI employs a high-frequency filter to boost the residual-term imputation, supplemented by a dominant-frequency filter for the trend and seasonal imputation.
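As we read the abstract, the two filters amount to splitting a series' spectrum into its dominant frequency bins (trend and seasonality) and the remaining high-frequency content (residual). The following generic sketch, not the authors' implementation, illustrates such a split with an FFT mask; the number of retained bins k is an arbitrary choice.

import numpy as np

def split_frequencies(x: np.ndarray, k: int = 5):
    """Split a 1-D series into dominant-frequency and high-frequency parts."""
    spec = np.fft.rfft(x)
    dominant = np.zeros_like(spec)
    idx = np.argsort(np.abs(spec))[-k:]      # k strongest frequency bins
    dominant[idx] = spec[idx]
    low = np.fft.irfft(dominant, n=len(x))   # trend + seasonal estimate
    high = x - low                           # residual term
    return low, high

t = np.arange(512)
x = 0.01 * t + np.sin(2 * np.pi * t / 64) \
    + 0.3 * np.random.default_rng(2).normal(size=512)
trend_seasonal, residual = split_frequencies(x)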


Density-based User Representation using Gaussian Process Regression for Multi-interest Personalized Retrieval

Neural Information Processing Systems

Accurate modeling of the diverse and dynamic interests of users remains a significant challenge in the design of personalized recommender systems. Existing user modeling methods, like single-point and multi-point representations, have limitations w.r.t.
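The title's core idea, representing a user by a density over item-embedding space rather than by one or a few points, can be sketched with off-the-shelf Gaussian Process Regression. This is a generic illustration, not the paper's method; the embeddings and affinities below are synthetic placeholders.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(3)
item_emb = rng.normal(size=(1000, 16))       # catalog item embeddings
hist_idx = rng.choice(1000, size=50, replace=False)
affinity = rng.random(50)                    # user's observed affinities

# Fit a GP on the user's interaction history: a functional, density-like
# user representation instead of a single point in embedding space.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)
gp.fit(item_emb[hist_idx], affinity)

# Retrieval: the posterior mean scores every candidate; the posterior std
# can trade off relevance against covering less-explored interests.
mean, std = gp.predict(item_emb, return_std=True)
top10 = np.argsort(mean)[-10:]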