MAGPI: Multifidelity-Augmented Gaussian Process Inputs for Surrogate Modeling from Scarce Data

Rex, Atticus, Qian, Elizabeth, Peterson, David

arXiv.org Machine Learning

Supervised machine learning describes the practice of fitting a parameterized model to labeled input-output data. Supervised machine learning methods have demonstrated promise in learning efficient surrogate models that can (partially) replace expensive high-fidelity models, making many-query analyses, such as optimization, uncertainty quantification, and inference, tractable. However, when training data must be obtained through the evaluation of an expensive model or experiment, the amount of training data that can be obtained is often limited, which can make learned surrogate models unreliable. In many engineering and scientific settings, though, cheaper low-fidelity models may be available, for example arising from simplified physics modeling or coarse grids. These models may be used to generate additional low-fidelity training data. The goal of multifidelity machine learning is to use both high- and low-fidelity training data to learn a surrogate model which is cheaper to evaluate than the high-fidelity model, but more accurate than any available low-fidelity model. This work proposes a new multifidelity training approach for Gaussian process regression (GPR) which uses low-fidelity data to define additional features that augment the input space of the learned model. The approach unites desirable properties from two separate classes of existing multifidelity GPR approaches, cokriging and autoregressive estimators. Numerical experiments on several test problems demonstrate both increased predictive accuracy and reduced computational cost relative to the state of the art.
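The general idea of augmenting a Gaussian process's inputs with low-fidelity information can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy models `f_hi` and `f_lo`, the fixed RBF kernel, and all hyperparameter values are assumptions chosen for illustration, and the low-fidelity model is called directly at test points rather than learned from low-fidelity data.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.5):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-6):
    """Posterior mean of a zero-mean GP with a fixed RBF kernel (no tuning)."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train)
    return rbf_kernel(X_test, X_train) @ alpha

# Hypothetical toy models (assumptions, not taken from the paper).
def f_hi(x):
    return 1.5 * np.sin(8.0 * x) + 0.3 * x  # "expensive" high-fidelity model

def f_lo(x):
    return np.sin(8.0 * x)  # cheap low-fidelity approximation

def augment(x):
    """Stack the original input with the low-fidelity output as an extra feature."""
    return np.column_stack([x, f_lo(x)])

# Scarce high-fidelity data: only six evaluations of f_hi.
x_train = np.linspace(0.0, 1.0, 6)
y_train = f_hi(x_train)
x_test = np.linspace(0.0, 1.0, 50)

# Baseline GP on x alone versus GP on the augmented input (x, f_lo(x)).
pred_plain = gp_posterior_mean(x_train[:, None], y_train, x_test[:, None])
pred_aug = gp_posterior_mean(augment(x_train), y_train, augment(x_test))

rmse_plain = np.sqrt(np.mean((pred_plain - f_hi(x_test)) ** 2))
rmse_aug = np.sqrt(np.mean((pred_aug - f_hi(x_test)) ** 2))
print(f"RMSE without augmentation: {rmse_plain:.4f}")
print(f"RMSE with augmentation:    {rmse_aug:.4f}")
```

In this toy setup the high-fidelity output is a linear function of the augmented input, which is the kind of structure an extra low-fidelity feature is meant to expose. Real problems would need kernel hyperparameters fitted to data and a low-fidelity feature that is itself estimated.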


Auditing the Auditors: Does Community-based Moderation Get It Right?

Alimohammadi, Yeganeh, Huang, Karissa, Borgs, Christian, Chayes, Jennifer

arXiv.org Machine Learning

Online social platforms increasingly rely on crowd-sourced systems to label misleading content at scale, but these systems must both aggregate users' evaluations and decide whose evaluations to trust. To address the latter, many platforms audit users by rewarding agreement with the final aggregate outcome, a design we term consensus-based auditing. We analyze the consequences of this design in X's Community Notes, which in September 2022 adopted consensus-based auditing that ties users' eligibility for participation to agreement with the eventual platform outcome. We find evidence of strategic conformity: minority contributors' evaluations drift toward the majority and their participation share falls on controversial topics, where independent signals matter most. We formalize this mechanism in a behavioral model in which contributors trade off private beliefs against anticipated penalties for disagreement. Motivated by these findings, we propose a two-stage auditing and aggregation algorithm that weights contributors by the stability of their past residuals rather than by agreement with the majority. The method first accounts for differences across content and contributors, and then measures how predictable each contributor's evaluations are relative to the latent-factor model. Contributors whose evaluations are consistently informative receive greater influence in aggregation, even when they disagree with the prevailing consensus. In the Community Notes data, this approach improves out-of-sample predictive performance while avoiding penalization of disagreement.
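A rough sketch of the two-stage idea described above, under strong simplifying assumptions: an additive item-plus-contributor model stands in for the latent-factor model, the rating matrix is complete, and weights are taken as the inverse of each contributor's residual variance. The function names and the toy data are hypothetical and are not the authors' implementation.

```python
import numpy as np

def residual_stability_weights(ratings, eps=1e-6):
    """ratings: (n_items, n_contributors) complete matrix of scores."""
    grand = ratings.mean()
    item_effect = ratings.mean(axis=1, keepdims=True) - grand
    contributor_effect = ratings.mean(axis=0, keepdims=True) - grand
    # Stage 1: remove item and contributor effects
    # (an additive stand-in for a latent-factor model).
    residuals = ratings - grand - item_effect - contributor_effect
    # Stage 2: contributors whose residuals vary less get more weight.
    residual_var = residuals.var(axis=0)
    weights = 1.0 / (residual_var + eps)
    return weights / weights.sum(), contributor_effect.ravel()

def aggregate(ratings, weights, contributor_effect):
    """Weighted average of ratings after removing each contributor's offset."""
    debiased = ratings - contributor_effect[None, :]
    return debiased @ weights

# Toy data (assumed): contributors 0 and 1 have opposite systematic offsets
# but track item quality exactly; contributor 2 is unbiased on average but erratic.
quality = np.array([0.0, 1.0, 0.5, 0.2])
offsets = np.array([0.3, -0.3, 0.0])
ratings = quality[:, None] + offsets[None, :]
ratings[:, 2] += np.array([0.8, -0.8, 0.8, -0.8])

weights, contributor_effect = residual_stability_weights(ratings)
scores = aggregate(ratings, weights, contributor_effect)
print("weights:", np.round(weights, 3))
print("aggregate scores:", np.round(scores, 3))
```

Here a contributor's constant offset from the others does not reduce their weight; only erratic residuals do, which mirrors the stated goal of not penalizing disagreement as such.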





Change Event Dataset for Discovery from Spatio-temporal Remote Sensing Imagery

Neural Information Processing Systems

Thus, instead of simply detecting changed pixels, we want to identify change events. We define a change event as a group of pixels over space and time that are all changed by a single event. We are interested in developing systems that can automatically detect change events and assign to each a semantic label that indicates the nature of the event, e.g., forest fires, road construction, etc. Identifying change events is a much more challenging problem than change detection.



SEVIR: A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology

Neural Information Processing Systems

Modern deep learning approaches have shown promising results in meteorological applications like precipitation nowcasting, synthetic radar generation, front detection and several others. In order to effectively train and validate these complex algorithms, large and diverse datasets containing high-resolution imagery are required. Petabytes of weather data, such as from the Geostationary Environmental Satellite System (GOES) and the Next-Generation Radar (NEXRAD) system, are available to the public; however, the size and complexity of these datasets is a hindrance to developing and training deep models.