Regression
Learning-based estimation of cattle weight gain and its influencing factors
Hossain, Muhammad Riaz Hasib, Islam, Rafiqul, McGrath, Shawn R., Islam, Md Zahidul, Lamb, David
Many cattle farmers still depend on manual methods to measure the live weight gain of cattle at set intervals, which is time consuming, labour intensive, and stressful for both the animals and handlers. A remote and autonomous monitoring system using machine learning (ML) or deep learning (DL) can provide a more efficient and less invasive method, as well as predictive capabilities for future cattle weight gain (CWG). Such a system allows continuous monitoring and estimation of individual cattle live weight gain, growth rates and weight fluctuations, taking into account factors such as environmental conditions, genetic predispositions, feed availability, movement patterns and behaviour. Several researchers have explored the efficiency of estimating CWG using ML and DL algorithms. However, CWG estimation suffers from a lack of consistency in its application. Moreover, the features on which ML or DL weight gain estimates are based vary across existing research. Additionally, previous studies have encountered various data-related challenges when estimating CWG. This paper presents a comprehensive investigation into estimating CWG using advanced ML techniques, based on research articles published between 2004 and 2024. This study investigates the current tools, methods, and features used in CWG estimation, as well as their strengths and weaknesses. The findings highlight the significance of using advanced ML approaches in CWG estimation and of the factors that influence it. Furthermore, this study identifies potential research gaps and provides research directions on CWG prediction, serving as a reference for future research in this area.
Multi-modal Data Fusion and Deep Ensemble Learning for Accurate Crop Yield Prediction
Yewle, Akshay Dagadu, Mirzayeva, Laman, Karakuş, Oktay
This study introduces RicEns-Net, a novel Deep Ensemble model designed to predict crop yields by integrating diverse data sources through multimodal data fusion techniques. The research focuses specifically on the use of synthetic aperture radar (SAR), optical remote sensing data from Sentinel 1, 2, and 3 satellites, and meteorological measurements such as surface temperature and rainfall. The initial field data for the study were acquired through Ernst & Young's (EY) Open Science Challenge 2023. The primary objective is to enhance the precision of crop yield prediction by developing a machine-learning framework capable of handling complex environmental data. A comprehensive data engineering process was employed to select the most informative features from over 100 potential predictors, reducing the set to 15 features from 5 distinct modalities. This step mitigates the "curse of dimensionality" and enhances model performance. The RicEns-Net architecture combines multiple machine learning algorithms in a deep ensemble framework, integrating the strengths of each technique to improve predictive accuracy. Experimental results demonstrate that RicEns-Net achieves a mean absolute error (MAE) of 341 kg/Ha (roughly 5-6% of the lowest average yield in the region), significantly exceeding the performance of previous state-of-the-art models, including those developed during the EY challenge.
A Safe Screening Rule for Sparse Logistic Regression
Jie Wang, Jiayu Zhou, Jun Liu, Peter Wonka, Jieping Ye
Although many recent efforts have been devoted to the efficient implementation of sparse logistic regression, its application to high-dimensional data still poses significant challenges. In this paper, we present a fast and effective sparse logistic regression screening rule (Slores) to identify the "0" components in the solution vector, which may lead to a substantial reduction in the number of features entered into the optimization. An appealing feature of Slores is that the data set needs to be scanned only once to run the screening, and its computational cost is negligible compared to that of solving the sparse logistic regression problem. Moreover, Slores is independent of the solver used for sparse logistic regression, so it can be integrated with any existing solver to improve efficiency. We have evaluated Slores using high-dimensional data sets from different applications. Experiments demonstrate that Slores outperforms the existing state-of-the-art screening rules and that the efficiency of solving sparse logistic regression can be improved by one order of magnitude.
Review for NeurIPS paper: Extrapolation Towards Imaginary 0-Nearest Neighbour and Its Improved Convergence Rate
The paper presents a new nonparametric learning method, which seems to combine certain elements of k-nearest neighbors with elements of local regression estimation. It recovers the optimal rates for classification with smooth regression functions and Tsybakov noise, previously established for a local polynomial regression method, but uses a predictor representation involving far fewer parameters, as in a simple weighted k-NN predictor. The reviewers favor accepting the paper. However, they have some reservations, as they would prefer the paper be presented differently, with more space dedicated to presenting the new techniques, and with more investigation into the strengths of this particular method compared to the well-known standard techniques.
Privacy-Preserving Dataset Combination
Fuentes, Keren, Xu, Mimee, Chen, Irene
Access to diverse, high-quality datasets is crucial for machine learning model performance, yet data sharing remains limited by privacy concerns and competitive interests, particularly in regulated domains like healthcare. This dynamic especially disadvantages smaller organizations that lack resources to purchase data or negotiate favorable sharing agreements. We present SecureKL, a privacy-preserving framework that enables organizations to identify beneficial data partnerships without exposing sensitive information. Building on recent advances in dataset combination methods, we develop a secure multiparty computation protocol that maintains strong privacy guarantees while achieving >90% correlation with plaintext evaluations. In experiments with real-world hospital data, SecureKL successfully identifies beneficial data partnerships that improve model performance for intensive care unit mortality prediction while preserving data privacy. Our framework provides a practical solution for organizations seeking to leverage collective data resources while maintaining privacy and competitive advantages. These results demonstrate the potential for privacy-preserving data collaboration to advance machine learning applications in high-stakes domains while promoting more equitable access to data resources.
Rethinking Word Similarity: Semantic Similarity through Classification Confusion
Zhou, Kaitlyn, Gao, Haishan, Chen, Sarah, Edelstein, Dan, Jurafsky, Dan, Shani, Chen
Word similarity has many applications to social science and cultural analytics tasks like measuring meaning change over time and making sense of contested terms. Yet traditional similarity methods based on cosine similarity between word embeddings cannot capture the context-dependent, asymmetrical, polysemous nature of semantic similarity. We propose a new measure of similarity, Word Confusion, that reframes semantic similarity in terms of feature-based classification confusion. Word Confusion is inspired by Tversky's suggestion that similarity features be chosen dynamically. Here we train a classifier to map contextual embeddings to word identities and use the classifier confusion (the probability of choosing a confounding word c instead of the correct target word t) as a measure of the similarity of c and t. The set of potential confounding words acts as the chosen features. Our method is comparable to cosine similarity in matching human similarity judgments across several datasets (MEN, WordSim353, and SimLex), and can measure similarity using predetermined features of interest. We demonstrate our model's ability to make use of dynamic features by applying it to test a hypothesis about changes in the 18th-century meaning of the French word "revolution" from popular to state action during the French Revolution. We hope this reimagining of semantic similarity will inspire the development of new tools that better capture the multi-faceted and dynamic nature of language, advancing the fields of computational social science, cultural analytics, and beyond.
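The classifier-confusion idea described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the "contextual embeddings" are synthetic 2-D points, the classifier is a plain softmax regression trained by gradient descent, and the helper names (`train_softmax`, `predict_proba`, `word_confusion`) are hypothetical. Confusion of a confounding word c for a target word t is taken here as the average probability the classifier assigns to c on embeddings whose true word is t.

```python
import math
import random

def softmax(scores):
    # Subtract the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, x):
    # W holds one weight row per word; the last entry of each row is a bias.
    dim = len(x)
    return softmax([sum(w[j] * x[j] for j in range(dim)) + w[dim] for w in W])

def train_softmax(X, y, num_classes, epochs=200, lr=0.5):
    # Full-batch gradient descent on the softmax cross-entropy loss.
    dim, n = len(X[0]), len(X)
    W = [[0.0] * (dim + 1) for _ in range(num_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (dim + 1) for _ in range(num_classes)]
        for xi, yi in zip(X, y):
            p = predict_proba(W, xi)
            for c in range(num_classes):
                err = p[c] - (1.0 if c == yi else 0.0)
                for j in range(dim):
                    grad[c][j] += err * xi[j]
                grad[c][dim] += err
        for c in range(num_classes):
            for j in range(dim + 1):
                W[c][j] -= lr * grad[c][j] / n
    return W

def word_confusion(W, X, y, target, confound):
    # Mean probability of predicting `confound` on embeddings of `target`.
    probs = [predict_proba(W, xi)[confound] for xi, yi in zip(X, y) if yi == target]
    return sum(probs) / len(probs)

# Synthetic stand-ins for contextual embeddings (assumed, not real data):
# "cat" and "dog" occupy nearby regions, "car" sits far away.
words = ["cat", "dog", "car"]
centres = [(1.0, 1.0), (1.5, 1.0), (-2.0, -1.0)]
rng = random.Random(0)
X, y = [], []
for label, (cx, cy) in enumerate(centres):
    for _ in range(60):
        X.append([cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)])
        y.append(label)

W = train_softmax(X, y, num_classes=len(words))
cat_dog = word_confusion(W, X, y, target=0, confound=1)
cat_car = word_confusion(W, X, y, target=0, confound=2)
dog_cat = word_confusion(W, X, y, target=1, confound=0)
print(f"confusion(cat -> dog) = {cat_dog:.3f}")
print(f"confusion(cat -> car) = {cat_car:.3f}")
print(f"confusion(dog -> cat) = {dog_cat:.3f}")
```

In this toy setup the confusion of "dog" for "cat" comes out larger than that of "car" for "cat", and the measure need not be symmetric (`cat_dog` and `dog_cat` generally differ), which mirrors the asymmetry the abstract attributes to semantic similarity. Restricting the set of candidate words passed to the classifier would correspond to choosing the features of interest.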
Export Reviews, Discussions, Author Feedback and Meta-Reviews
Even these ideas are not so novel. For example, the local reparametrization trick is something that we use all the time when we do Variational Bayes (VB) (say in a logistic regression model) and transform high-dimensional integrals into one-dimensional integrals under a Gaussian approximate posterior. For example, if you have a likelihood of the form $\prod_{i=1}^n \sigma(w^T x_i)$ and apply VB with $q(w \mid \mu, \Sigma)$, then you end up with a sum of expectations of the form $\sum_{i=1}^n \int q(w \mid \mu, \Sigma) \log \sigma(w^T x_i) \, dw$, and then the local reparametrization trick is applied to transform each separate integral (initially a high-dimensional integral over the vector $w$) into a 1-D integral over the univariate standard normal. The authors essentially use this separately for each activation unit and apply stochastic approximation instead of integration. Having said that, I must admit that as far as the stochastic variational inference algorithms are concerned and the related research community (born a couple of years ago!) the use of this local reparametrization trick, as far as I know, is novel and people should know about it because it is useful.
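The reduction the reviewer describes can be checked numerically. The sketch below is an illustration under stated assumptions, not code from the paper under review: the posterior is parameterised by a lower-triangular factor `L` with Sigma = L L^T (an assumed convenience), and the function names are hypothetical. Because w ~ N(mu, Sigma) implies w^T x ~ N(mu^T x, x^T Sigma x), the expectation of log sigma(w^T x) over the vector w equals a 1-D expectation over z ~ N(0, 1) of log sigma(m + s z), with m = mu^T x and s^2 = x^T Sigma x. The code compares a 1-D trapezoid quadrature against Monte Carlo sampling of the full vector w.

```python
import math
import random

def log_sigmoid(a):
    # Numerically stable log(sigma(a)).
    if a >= 0:
        return -math.log1p(math.exp(-a))
    return a - math.log1p(math.exp(a))

def expectation_1d(mu, L, x, num_points=2001, half_width=8.0):
    # 1-D form: integrate N(z; 0, 1) * log sigma(m + s z) over z.
    d = len(mu)
    m = sum(mu[j] * x[j] for j in range(d))
    # s^2 = x^T Sigma x = ||L^T x||^2 when Sigma = L L^T.
    lt_x = [sum(L[j][k] * x[j] for j in range(d)) for k in range(d)]
    s = math.sqrt(sum(v * v for v in lt_x))
    h = 2.0 * half_width / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        z = -half_width + i * h
        weight = 0.5 if i in (0, num_points - 1) else 1.0  # trapezoid rule
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * pdf * log_sigmoid(m + s * z)
    return total * h

def expectation_mc(mu, L, x, num_samples, rng):
    # High-dimensional form: sample w = mu + L eps and average log sigma(w^T x).
    d = len(mu)
    total = 0.0
    for _ in range(num_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
        w = [mu[j] + sum(L[j][k] * eps[k] for k in range(j + 1)) for j in range(d)]
        total += log_sigmoid(sum(w[j] * x[j] for j in range(d)))
    return total / num_samples

# Illustrative values (assumed): a 3-D Gaussian posterior and one input x_i.
mu = [0.5, -1.0, 0.25]
L = [[1.0, 0.0, 0.0],
     [0.3, 0.8, 0.0],
     [-0.2, 0.1, 0.5]]
x = [1.0, 2.0, -1.0]

one_dim = expectation_1d(mu, L, x)
monte_carlo = expectation_mc(mu, L, x, 50000, random.Random(0))
print(f"1-D quadrature: {one_dim:.4f}")
print(f"Monte Carlo over w: {monte_carlo:.4f}")
```

The two estimates should agree up to Monte Carlo noise. Replacing the quadrature with a single sampled z per data point gives the stochastic-approximation variant that the reviewer says the authors apply per activation unit.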
Review for NeurIPS paper: LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-resolution and Beyond
Weaknesses: The dictionary used in reconstructing HR images is hand-crafted. Why can the filters in the dictionary not be learned as kernels in a neural network and enjoy the benefit of end-to-end learning, as in many pure deep learning-based SISR methods? In the experiments, when comparing with SOTA SISR methods, only x2 and x4 results are shown while x3 results are missing. The authors are recommended to provide x3 results as well. In addition, FALSR-C and FALSR-A in Table 2 used only DIV2K as the training set, while the proposed method is trained on both DIV2K and Flickr2K, and thus the comparison here is not fair.
Review for NeurIPS paper: LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-resolution and Beyond
This submission proposes to do single image super-resolution using a network which produces coefficients for a fixed bank of Gaussian/DoG filters. The method achieves nearly SotA super-resolution PSNR while being 1-2 orders of magnitude more efficient than SotA. Reviewers liked the idea of incorporating a filter bank dictionary. While all of the reviewers felt that these weaknesses put the submission below the acceptance threshold, metareviewers felt that the authors' response adequately addressed each of these concerns. Please add comparisons with the SotA approaches (EDSR, RCAN, ESRGAN, ProSR) in terms of PSNR, efficiency (MultAdds), and parameter count.
Export Reviews, Discussions, Author Feedback and Meta-Reviews
We thank the reviewers for their comments and interest. (R1 = Assigned_Reviewer_1). R2 proposes a baseline method to compare with. Our interpretation of the comment is that in the expression $\|Y - Z^t \beta\|_2$, R2 uses Z to denote the feature-vector and Y a 0-1 label, so this proposal corresponds to standard least-squares regression (with lasso). Generally, logistic (lasso) regression is preferable for binary responses [1]. As we already evaluated our approach against the latter method (Figure 1b), the proposed comparison seems unnecessary given the space constraints.