
Kissing to Find a Match: Efficient Low-Rank Permutation Representation - Supplementary Material (Zorah Lähner, University of Siegen)

Neural Information Processing Systems

Onofre Martorell, University of the Balearic Islands (Investigador ForInDoc del Govern de les Illes Balears); Yuval Bahat, Princeton University, Princeton, NJ 08544, United States (yuval.bahat@gmail.com). Our supplementary material includes a figure demonstrating the ability of our method to handle large matching problems, a graph showing the influence of permutation-matrix sparsity on computation speed, and accuracy values for the point cloud assignment experiments. As our method requires devising problem-specific adaptations, the supplementary material also includes a discussion of potential adaptations to our method, along with a short note on the non-linearity (ReLU) in our approach. Following our shape-matching experiments described in Sec.


Faster Integer Programming

Communications of the ACM

Many important practical computations, such as scheduling and combinatorial optimization problems, use techniques known as integer programming to find the best combination of many variables. In these problems, some or all of the variables are restricted to integer values, which requires exponentially greater resources to solve than if the variables could take any value. Last year, Victor Reis, now at the Institute for Advanced Study in Princeton, NJ, and his Ph.D. advisor Thomas Rothvoss of the University of Washington proved a new upper bound on the time required to solve any integer program. They analyzed an algorithm described more than a decade ago in the influential Ph.D. thesis of Daniel Dadush, now in the Netherlands at CWI (the Center for Mathematics and Informatics) and Utrecht University. "In some sense the algorithm is his, but we have proven that it works," Rothvoss said.
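To make the problem concrete: an integer program asks for the best integer assignment to variables under linear constraints. The toy sketch below (the objective and constraints are invented for illustration, and brute-force enumeration is nothing like the Reis-Rothvoss analysis) shows why integrality is hard: the search space grows exponentially with the number of variables.

```python
# Toy integer program: maximize 3x + 2y
# subject to 2x + y <= 10, x + 3y <= 15, with x, y non-negative integers.
# Brute-force enumeration over all feasible integer points.
from itertools import product

best_val, best_pt = None, None
for x, y in product(range(11), range(16)):
    if 2 * x + y <= 10 and x + 3 * y <= 15:
        val = 3 * x + 2 * y
        if best_val is None or val > best_val:
            best_val, best_pt = val, (x, y)

print(best_pt, best_val)  # -> (3, 4) 17
```

Note that the continuous relaxation (letting x and y take any real value) is solvable in polynomial time; it is the integrality restriction that forces search.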


Weak baselines and reporting biases lead to overoptimism in machine learning for fluid-related partial differential equations

arXiv.org Artificial Intelligence

One of the most promising applications of machine learning (ML) in computational physics is to accelerate the solution of partial differential equations (PDEs). The key objective of ML-based PDE solvers is to output a sufficiently accurate solution faster than standard numerical methods, which are used as a baseline comparison. We first perform a systematic review of the ML-for-PDE solving literature. Of articles that use ML to solve a fluid-related PDE and claim to outperform a standard numerical method, we determine that 79% (60/76) compare to a weak baseline. Second, we find evidence that reporting biases, especially outcome reporting bias and publication bias, are widespread. We conclude that ML-for-PDE solving research is overoptimistic: weak baselines lead to overly positive results, while reporting biases lead to underreporting of negative results. To a large extent, these issues appear to be caused by factors similar to those of past reproducibility crises: researcher degrees of freedom and a bias towards positive results. We call for bottom-up cultural changes to minimize biased reporting as well as top-down structural reforms intended to reduce perverse incentives for doing so.
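The headline figure can be reproduced from the reported counts (60 of 76 surveyed articles compared to a weak baseline); the standard error shown below is my own rough normal-approximation addition, not a figure from the paper.

```python
# Reproduce the 79% figure and attach a rough uncertainty estimate.
import math

k, n = 60, 76                      # weak-baseline articles / total surveyed
p = k / n                          # sample proportion
se = math.sqrt(p * (1 - p) / n)    # normal-approximation standard error
print(f"{100 * p:.0f}% +/- {100 * 1.96 * se:.0f} pp (95% CI, normal approx.)")
```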


Between Randomness and Arbitrariness: Some Lessons for Reliable Machine Learning at Scale

arXiv.org Machine Learning

To develop rigorous knowledge about ML models -- and the systems in which they are embedded -- we need reliable measurements. But reliable measurement is fundamentally challenging, and touches on issues of reproducibility, scalability, uncertainty quantification, epistemology, and more. This dissertation addresses criteria needed to take reliability seriously: both criteria for designing meaningful metrics, and for methodologies that ensure that we can dependably and efficiently measure these metrics at scale and in practice. In doing so, this dissertation articulates a research vision for a new field of scholarship at the intersection of machine learning, law, and policy. Within this frame, we cover topics that fit under three different themes: (1) quantifying and mitigating sources of arbitrariness in ML, (2) taming randomness in uncertainty estimation and optimization algorithms, in order to achieve scalability without sacrificing reliability, and (3) providing methods for evaluating generative-AI systems, with specific focuses on quantifying memorization in language models and training latent diffusion models on open-licensed data. By making contributions in these three themes, this dissertation serves as an empirical proof by example that research on reliable measurement for machine learning is intimately and inescapably bound up with research in law and policy. These different disciplines pose similar research questions about reliable measurement in machine learning. They are, in fact, two complementary sides of the same research vision, which, broadly construed, aims to construct machine-learning systems that cohere with broader societal values.


Generative Subspace Adversarial Active Learning for Outlier Detection in Multiple Views of High-dimensional Data

arXiv.org Artificial Intelligence

Outlier detection in high-dimensional tabular data is an important task in data mining, essential for many downstream tasks and applications. Existing unsupervised outlier detection algorithms face one or more problems, including inlier assumption (IA), curse of dimensionality (CD), and multiple views (MV). To address these issues, we introduce Generative Subspace Adversarial Active Learning (GSAAL), a novel approach that uses a Generative Adversarial Network with multiple adversaries. These adversaries learn the marginal class probability functions over different data subspaces, while a single generator in the full space models the entire distribution of the inlier class. GSAAL is specifically designed to address the MV limitation while also handling the IA and CD, being the only method to do so. We provide a comprehensive mathematical formulation of MV, convergence guarantees for the discriminators, and scalability results for GSAAL. Our extensive experiments demonstrate the effectiveness and scalability of GSAAL, highlighting its superior performance compared to other popular OD methods, especially in MV scenarios.
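The "multiple views" idea can be sketched in a few lines: score each point in several random subspaces and combine the per-subspace scores. GSAAL trains one GAN discriminator per subspace; in the sketch below a plain Mahalanobis distance stands in for each discriminator (an assumption purely for illustration, not the paper's method), so only the subspace-ensemble structure is being demonstrated.

```python
# Multiple-views outlier scoring: average per-subspace anomaly scores.
import numpy as np

rng = np.random.default_rng(0)

def subspace_outlier_scores(X, n_views=10, k=2):
    """Average a simple Mahalanobis score over random k-dim subspaces."""
    n, d = X.shape
    scores = np.zeros(n)
    for _ in range(n_views):
        dims = rng.choice(d, size=k, replace=False)   # one random "view"
        V = X[:, dims]
        mu = V.mean(axis=0)
        cov = np.cov(V, rowvar=False) + 1e-6 * np.eye(k)  # regularized
        inv = np.linalg.inv(cov)
        diff = V - mu
        # squared Mahalanobis distance of every point in this view
        scores += np.einsum('ij,jk,ik->i', diff, inv, diff)
    return scores / n_views

# Inliers plus one obvious outlier in a 5-D space.
X = rng.normal(size=(200, 5))
X[0] = 8.0  # outlier in every coordinate
s = subspace_outlier_scores(X)
print(int(np.argmax(s)))  # the outlier receives the largest average score
```

Scoring in low-dimensional views rather than the full space is what sidesteps the curse of dimensionality, and averaging over views is what addresses outliers that are only visible in some projections.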


Discovering Hidden Variables in Noisy-Or Networks using Quartet Tests

Neural Information Processing Systems

We give a polynomial-time algorithm for provably learning the structure and parameters of bipartite noisy-or Bayesian networks of binary variables where the top layer is completely hidden. Unsupervised learning of these models is a form of discrete factor analysis, enabling the discovery of hidden variables and their causal relationships with observed data. We obtain an efficient learning algorithm for a family of Bayesian networks that we call quartet-learnable. For each latent variable, the existence of a singly-coupled quartet allows us to uniquely identify and learn all parameters involving that latent variable. We give a proof of the polynomial sample complexity of our learning algorithm, and experimentally compare it to variational EM.
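A bipartite noisy-or network is easy to state generatively: each observed binary variable fires with probability 1 - (1 - leak) * prod_j (1 - w[j][i]) over its active hidden parents. The forward sampler below uses invented parameter values (`priors`, `w`, `leak` are illustrative, not from the paper) just to pin down the model the learning algorithm targets.

```python
# Forward sampler for a bipartite noisy-or network: 2 hidden causes,
# 3 observed effects. Parameters are illustrative placeholders.
import random

random.seed(0)

priors = [0.5, 0.3]            # P(hidden_j = 1)
w = [[0.9, 0.0, 0.8],          # w[j][i]: strength of hidden j on observed i
     [0.0, 0.7, 0.6]]
leak = 0.01                    # chance an observed variable fires anyway

def sample():
    hidden = [1 if random.random() < p else 0 for p in priors]
    obs = []
    for i in range(3):
        p_off = 1 - leak
        for j, h in enumerate(hidden):
            if h:
                p_off *= 1 - w[j][i]   # each active parent independently
        obs.append(1 if random.random() < 1 - p_off else 0)
    return hidden, obs

draws = [sample() for _ in range(20000)]
freq = sum(o[0] for _, o in draws) / len(draws)
# Exact marginal: P(obs_0=1) = 1 - 0.99 * (0.5*0.1 + 0.5*1) = 0.4555
print(round(freq, 2))
```

The structure-learning problem in the paper is the inverse of this sampler: recover `priors` and `w` from observations of `obs` alone, with `hidden` never seen.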



On the stochastics of human and artificial creativity

arXiv.org Artificial Intelligence

What constitutes human creativity, and is it possible for computers to exhibit genuine creativity? We argue that achieving human-level intelligence in computers, or so-called Artificial General Intelligence, also necessitates attaining human-level creativity. We contribute to this discussion by developing a statistical representation of human creativity, incorporating prior insights from stochastic theory, psychology, philosophy, neuroscience, and chaos theory. This highlights the stochastic nature of the human creative process, which includes both a bias-guided, random proposal step and an evaluation step that depends on a flexible or transformable bias structure. The acquired representation of human creativity is subsequently used to assess the creativity levels of various contemporary AI systems. Our analysis covers modern AI algorithms such as reinforcement learning, diffusion models, and large language models, addressing to what extent they measure up to human-level creativity. We conclude that these technologies currently lack the capability for autonomous creative action at a human level.


Online Robust Mean Estimation

arXiv.org Machine Learning

We study the problem of high-dimensional robust mean estimation in an online setting. Specifically, we consider a scenario where $n$ sensors are measuring some common, ongoing phenomenon. At each time step $t=1,2,\ldots,T$, the $i^{th}$ sensor reports its readings $x^{(i)}_t$ for that time step. The algorithm must then commit to its estimate $\mu_t$ for the true mean value of the process at time $t$. We assume that most of the sensors observe independent samples from some common distribution $X$, but an $\epsilon$-fraction of them may instead behave maliciously. The algorithm wishes to compute a good approximation $\mu$ to the true mean $\mu^\ast := \mathbf{E}[X]$. We note that if the algorithm is allowed to wait until time $T$ to report its estimate, this reduces to the well-studied problem of robust mean estimation. However, the requirement that our algorithm produces partial estimates as the data is coming in substantially complicates the situation. We prove two main results about online robust mean estimation in this model. First, if the uncorrupted samples satisfy the standard condition of $(\epsilon,\delta)$-stability, we give an efficient online algorithm that outputs estimates $\mu_t$, $t \in [T],$ such that with high probability it holds that $\|\mu-\mu^\ast\|_2 = O(\delta \log(T))$, where $\mu = (\mu_t)_{t \in [T]}$. We note that this error bound is nearly competitive with the best offline algorithms, which would achieve $\ell_2$-error of $O(\delta)$. Our second main result shows that with additional assumptions on the input (most notably that $X$ is a product distribution) there are inefficient algorithms whose error does not depend on $T$ at all.
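The setting can be simulated directly: at each step t the algorithm must commit to an estimate from the n readings before seeing any future data. The sketch below uses a per-step median as a simple robust baseline; it is NOT the paper's algorithm (which obtains the stronger joint O(delta log T) guarantee), and all simulation parameters are invented for illustration.

```python
# Online robust mean estimation, naive baseline: commit to the per-step
# median of the sensor readings while an eps-fraction reports garbage.
import random
import statistics

random.seed(0)
n, T, eps = 50, 20, 0.1           # sensors, time steps, corruption rate
true_mean = 3.0
bad = set(range(int(eps * n)))    # an eps-fraction of sensors is malicious

estimates = []
for t in range(T):
    readings = [100.0 if i in bad else random.gauss(true_mean, 1.0)
                for i in range(n)]
    estimates.append(statistics.median(readings))  # commit to mu_t now

# Worst per-step error stays small despite 10% adversarial corruption.
print(round(max(abs(m - true_mean) for m in estimates), 2))
```

The median tolerates the corrupted coordinates here, but in high dimensions the coordinate-wise median incurs error growing with dimension, which is why the stability-based machinery in the paper is needed.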


Reddit Is Already on the Rebound

WIRED

Social media researchers at the Network Contagion Research Institute in Princeton, New Jersey, got a rude awakening early last month. They were roused by 6:30 am phone calls from a colleague warning that Reddit had started blocking the institute's Pushshift service from updating its ongoing archive of every post on the discussion platform. That was a problem for more than just NCRI, because some of Reddit's 50,000 volunteer moderators depend on Pushshift to quickly investigate problem users, and many academics rely on the service. If it went stale, mods, as Reddit calls moderators, would have to work overtime or let more trash content accumulate. Researchers studying online communities would be forced to put projects and doctoral dissertations on ice.