Supplementary Material 1: Decoding using automatic differentiation variational inference (ADVI)

Neural Information Processing Systems

In the method section of our paper, we describe the general encoding-decoding paradigm. Here we provide a brief overview of our data preprocessing pipeline, which involves the following steps. We employ the method of Boussard et al. (2021) to estimate spike locations, and decentralized registration (Windolf et al., 2022) is applied to track and correct motion drift. To decode binary behaviors, such as the mouse's left or right choices, we use the ADVI-based decoder. In this section, we provide visualizations to gain insight into the effectiveness of our proposed decoder.

Figure 6: Motion drift in "good" and "bad" sorting recordings; the "bad" sorting example is still affected by drift even after registration.







The Search for Alien Artifacts Is Coming Into Focus

WIRED

From surveys of the pre-Sputnik skies to analysis of interstellar visitors, scientists are rethinking how and where to look for physical traces of alien technology. Science fiction is awash in the material remnants of extraterrestrial civilizations, which surface in everything from the classic books of Arthur C. Clarke to popular game franchises. The discovery of the first interstellar objects in the solar system within the past decade has sparked speculation that they could be alien artifacts or spaceships, though the scientific consensus remains that all three of these visitors have natural explanations. That said, scientists have been anticipating the possibility of encountering alien artifacts since the dawn of the space age. "In the history of technosignatures, the possibility that there could be artifacts in the solar system has been around for a long time," says Adam Frank, a professor of astrophysics at the University of Rochester.


Gaussian Process Probes (GPP) for Uncertainty-Aware Probing

Neural Information Processing Systems

Understanding which concepts models can and cannot represent has been fundamental to many tasks: from effective and responsible use of models to detecting out-of-distribution data. We introduce Gaussian process probes (GPP), a unified and simple framework for probing and measuring uncertainty about concepts represented by models. As a Bayesian extension of linear probing methods, GPP asks what kind of distribution over classifiers (of concepts) is induced by the model. This distribution can be used to measure both what the model represents and how confident the probe is about what the model represents. GPP can be applied to any pre-trained model with vector representations of inputs (e.g., activations). It does not require access to training data, gradients, or the architecture.


Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling

Neural Information Processing Systems

Safety of Large Language Models (LLMs) has become a central issue given their rapid progress and wide applications. Greedy Coordinate Gradient (GCG) has been shown to be effective in constructing prompts containing adversarial suffixes that break presumably safe LLMs, but the optimization of GCG is time-consuming, which limits its practicality. To reduce the time cost of GCG and enable more comprehensive studies of LLM safety, in this work we study a new algorithm called $\texttt{Probe sampling}$ to accelerate the GCG algorithm. At the core of the algorithm is a mechanism that dynamically determines how similar a smaller draft model's predictions are to the target model's predictions for prompt candidates. When the target model is similar to the draft model, we rely heavily on the draft model to filter out a large number of potential prompt candidates and reduce the computation time. Probe sampling achieves up to $5.6$ times speedup using Llama2-7b-chat and leads to an equal or improved attack success rate (ASR) on AdvBench. Furthermore, probe sampling is also able to accelerate other prompt optimization techniques and adversarial attack methods, leading to acceleration of $1.8\times$ for AutoPrompt, $2.4\times$ for APE and $2.4\times$ for AutoDAN.


Mixed Samples as Probes for Unsupervised Model Selection in Domain Adaptation

Neural Information Processing Systems

Unsupervised domain adaptation (UDA) has been widely applied in improving model generalization on unlabeled target data. However, accurately selecting the best UDA model for the target domain is challenging due to the absence of labeled target data and domain distribution shifts. Traditional model selection approaches involve training extra models with source data to estimate the target validation risk. Recent studies propose practical methods that are based on measuring various properties of model predictions on target data. Although effective for some UDA models, these methods often lack stability and may lead to poor selections for other UDA models. In this paper, we present MixVal, an innovative model selection method that operates solely with unlabeled target data during inference. MixVal leverages mixed target samples with pseudo labels to directly probe the learned target structure by each UDA model.