Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts
Leveraging the model's outputs, specifically the logits, is a common approach to estimating the test accuracy of a pre-trained neural network on out-of-distribution (OOD) samples without requiring access to the corresponding ground-truth labels. Despite their ease of implementation and computational efficiency, current logit-based methods are vulnerable to overconfidence, which biases their predictions, especially under natural distribution shifts.
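For concreteness, here is a minimal NumPy sketch contrasting a classic logit-based estimate (average maximum softmax probability) with a matrix-norm-style score; `matrix_norm_score` and its parameter `p` are illustrative stand-ins under assumed conventions, not the paper's exact estimator.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def confidence_score(logits):
    """Average maximum softmax probability: a classic logit-based
    accuracy estimate, prone to overconfidence under natural shift."""
    return softmax(logits).max(axis=1).mean()

def matrix_norm_score(logits, p=4):
    """Entry-wise L_p norm of the softmax-normalized logit matrix,
    rescaled to [0, 1]; a hypothetical stand-in for a matrix-norm
    estimator in the spirit of this paper."""
    probs = softmax(logits)  # n x K matrix of normalized outputs
    n, k = probs.shape
    return (probs ** p).sum() ** (1.0 / p) / (n * k) ** (1.0 / p)

# Toy usage with random logits for a 1000-sample, 10-class test set.
logits = np.random.randn(1000, 10) * 3.0
print(confidence_score(logits), matrix_norm_score(logits))
```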
Neural Stochastic Control
Wei Lin, School of Mathematical Sciences, SCMS, SCAM, and CCSB, Fudan University
Control problems are challenging because they arise from real-world systems in which stochasticity and randomness are ubiquitous. This naturally and urgently calls for efficient neural control policies that can stabilize not only deterministic equations but stochastic systems as well. To meet this call, we propose two types of controllers: the exponential stabilizer (ES), based on stochastic Lyapunov theory, and the asymptotic stabilizer (AS), based on stochastic asymptotic stability theory. The ES renders the controlled systems exponentially convergent but requires a long computational time; conversely, the AS trains much faster but guarantees only the asymptotic (not exponential) attractiveness of the control targets. The two stochastic controllers are thus complementary in applications. We also rigorously analyze the linear controller and the proposed neural stochastic controllers in terms of convergence time and energy cost, and compare them numerically on both measures. More significantly, we use several representative physical systems to illustrate the usefulness of the proposed controllers in stabilizing dynamical systems.
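As a rough illustration of the exponential-stabilizer idea, the following PyTorch sketch trains a neural controller to make a candidate Lyapunov function decay exponentially along one Euler-Maruyama step of a toy stochastic system; the system `f`, noise `g`, Lyapunov function `V`, and decay rate `lam` are all assumptions made for the example, not the paper's setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical 2-d system dx = f(x) dt + u(x) dt + g(x) dW.
f = lambda x: torch.stack([x[:, 1], x[:, 0] - x[:, 0] ** 3], dim=1)
g = lambda x: 0.1 * x                      # state-dependent noise
V = lambda x: (x ** 2).sum(dim=1)          # candidate Lyapunov function

controller = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))
opt = torch.optim.Adam(controller.parameters(), lr=1e-3)
dt, lam = 0.01, 1.0                        # step size, target decay rate

for step in range(2000):
    x = 4 * torch.rand(256, 2) - 2         # sample states in [-2, 2]^2
    dW = torch.randn_like(x) * dt ** 0.5
    x_next = x + (f(x) + controller(x)) * dt + g(x) * dW
    # ES-style loss: push V(x_next) below (1 - lam * dt) * V(x),
    # i.e. enforce exponential decay of V in expectation.
    loss = torch.relu(V(x_next) - (1 - lam * dt) * V(x)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```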
Supplementary Material: Neural Star Domain as Primitive Representation
We provide additional visualizations of the single-view reconstruction results for the rifle, airplane, chair, and table categories from ShapeNet [1] in Figure 1, and of the primitives for the plane, chair, and rifle categories from ShapeNet [1] in Figures 2, 3, and 4, respectively. NSD provides multiple differentiable shape and surface representations that are available both during training and inference: mesh, surface points, normals, indicator function (signed distance function), and texture. They are visualized in Figure 5.
Normal estimation. As shown in Figure 5, NSD can also estimate differentiable normal vectors. Unlike methods that use mesh templates, the proposed approach can derive normals at arbitrary resolution.
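A minimal sketch of how differentiable normals can be obtained at arbitrary query points from an implicit representation via automatic differentiation; `estimate_normals` and the unit-sphere test function below are illustrative, not the paper's implementation.

```python
import torch

def estimate_normals(implicit_fn, points):
    """Differentiable normals as the normalized gradient of an
    implicit (signed-distance-like) function, evaluated at arbitrary
    query points; implicit_fn is any callable mapping (N, 3) -> (N,)."""
    points = points.clone().requires_grad_(True)
    values = implicit_fn(points)
    (grad,) = torch.autograd.grad(values.sum(), points, create_graph=True)
    return grad / grad.norm(dim=-1, keepdim=True).clamp_min(1e-8)

# Toy check on a unit sphere, where normals should equal x / ||x||.
sphere = lambda x: x.norm(dim=-1) - 1.0
pts = torch.randn(8, 3)
print(estimate_normals(sphere, pts))
```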
Neural Star Domain as Primitive Representation
Reconstructing 3D objects from 2D images is a fundamental task in computer vision, and accurate structured reconstruction with parsimonious, semantic primitive representations has an even broader range of applications. When a target shape is reconstructed from multiple primitives, it is preferable that its fundamental properties, such as collective volume and surface, be readily and comprehensively accessible so that the primitives can be treated as if they were a single shape. This becomes possible with a primitive representation that unifies implicit and explicit representations. However, the primitive representations in current approaches do not satisfy these requirements. To resolve this, we propose a novel primitive representation termed neural star domain (NSD) that learns primitive shapes in the star domain. We demonstrate that NSD is a universal approximator of the star domain and is not only parsimonious and semantic but also an implicit and explicit shape representation. The proposed approach outperforms existing methods in single-view reconstruction tasks in terms of semantic capability as well as sampling speed and quality for high-resolution meshes.
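To make the implicit/explicit duality concrete, here is a small PyTorch sketch of a star-domain shape: a single radial function over unit directions yields both an explicit surface parameterization and an implicit indicator function. The center `c` and radial function `r` are toy assumptions, not NSD's learned components.

```python
import torch

# A star-domain shape around center c is fully described by a radial
# function r(d) over unit directions d.
c = torch.zeros(3)
r = lambda d: 1.0 + 0.3 * d[..., 2] ** 2    # toy radial function

def explicit_surface(directions):
    """Explicit form: the surface point for each unit direction."""
    return c + r(directions).unsqueeze(-1) * directions

def implicit_indicator(x):
    """Implicit form: positive inside, negative outside, zero on the
    surface, obtained from the same radial function."""
    offset = x - c
    dist = offset.norm(dim=-1).clamp_min(1e-8)
    return r(offset / dist.unsqueeze(-1)) - dist

d = torch.nn.functional.normalize(torch.randn(8, 3), dim=-1)
print(implicit_indicator(explicit_surface(d)))  # ~0 on the surface
```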
Exploiting LLM Quantization
Quantization leverages lower-precision weights to reduce the memory footprint of large language models (LLMs) and is a key technique for enabling their deployment on commodity hardware. While LLM quantization's impact on utility has been extensively explored, this work is the first to study its adverse effects from a security perspective. We reveal that widely used quantization methods can be exploited to produce a harmful quantized LLM even though its full-precision counterpart appears benign, potentially tricking users into deploying the malicious quantized model. We demonstrate this threat using a three-staged attack framework: (i) first, we obtain a malicious LLM through fine-tuning on an adversarial task; (ii) next, we quantize the malicious model and compute constraints that characterize all full-precision models that map to the same quantized model; (iii) finally, using projected gradient descent, we tune the poisoned behavior out of the full-precision model while ensuring that its weights satisfy the constraints computed in step (ii). This procedure results in an LLM that exhibits benign behavior in full precision but, when quantized, follows the adversarial behavior injected in step (i). We experimentally demonstrate the feasibility and severity of such an attack across three diverse scenarios: vulnerable code generation, content injection, and over-refusal attacks. In practice, an adversary could host the resulting full-precision model on an LLM community hub such as Hugging Face, exposing millions of users to the threat of deploying its malicious quantized version on their devices.
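A minimal sketch of the constraint-and-project idea for simple round-to-nearest quantization of a single weight tensor; the helper names (`quantize`, `interval_constraints`, `project`) are hypothetical, and the quantization methods attacked in the paper induce more involved constraint sets.

```python
import torch

def quantize(w, scale):
    """Round-to-nearest uniform quantization (illustrative)."""
    return torch.round(w / scale) * scale

def interval_constraints(w_mal, scale, eps=1e-6):
    """Step (ii): every full-precision weight that rounds to the same
    quantized value as w_mal lies in [q - scale/2, q + scale/2]
    (shrunk by eps to stay clear of rounding ties)."""
    q = quantize(w_mal, scale)
    return q - scale / 2 + eps, q + scale / 2 - eps

def project(w, lo, hi):
    """Step (iii): PGD projection keeping the quantized model fixed."""
    return torch.clamp(w, lo, hi)

scale = 0.05
w_malicious = torch.randn(4, 4)
lo, hi = interval_constraints(w_malicious, scale)
# "Repair" step: any update stays inside the constraint box, so the
# quantized weights (and hence the quantized behavior) never change.
w_repaired = project(w_malicious + 0.1 * torch.randn(4, 4), lo, hi)
assert torch.equal(quantize(w_repaired, scale), quantize(w_malicious, scale))
```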
" Why Not Other Classes? ": Towards Class-Contrastive Back-Propagation Explanations
Numerous methods have been developed to explain the inner mechanism of deep neural network (DNN) based classifiers. Existing explanation methods are often limited to explaining predictions of a pre-specified class, which answers the question "why is the input classified into this class?" However, such explanations with respect to a single class are inherently insufficient because they do not capture features with class-discriminative power. That is, features that are important for predicting one class may also be important for other classes. To capture features with true class-discriminative power, we should instead ask "why is the input classified into this class, but not others?"
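One simple instance of this contrastive view is to back-propagate the difference of two logits rather than a single logit; `contrastive_saliency` below is an illustrative sketch, not the paper's exact weighted formulation.

```python
import torch

def contrastive_saliency(model, x, target, other):
    """Gradient of the logit *difference* target - other, answering
    "why this class rather than that one?" instead of explaining a
    single pre-specified class in isolation."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    (logits[:, target] - logits[:, other]).sum().backward()
    return x.grad

# Toy usage with a hypothetical small classifier.
model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 3))
x = torch.randn(2, 8)
print(contrastive_saliency(model, x, target=0, other=1))
```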
A Universal Growth Rate for Learning with Smooth Surrogate Losses
This paper presents a comprehensive analysis of the growth rate of H-consistency bounds (and excess error bounds) for various surrogate losses used in classification. We prove a square-root growth rate near zero for smooth margin-based surrogate losses in binary classification, providing both upper and lower bounds under mild assumptions.
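As a classical instance of this square-root behavior (a standard calibration result, not one of this paper's new bounds), the exponential loss $\varphi(t) = e^{-t}$ satisfies:

```latex
% Calibration bound for the exponential loss \varphi(t) = e^{-t}
% (Bartlett, Jordan & McAuliffe, 2006):
% \psi(\theta) = 1 - \sqrt{1 - \theta^2} \ge \theta^2 / 2, hence
\[
  R_{0\text{-}1}(h) - R_{0\text{-}1}^*
  \;\le\; \sqrt{2\,\bigl(R_{\varphi}(h) - R_{\varphi}^*\bigr)},
\]
% so the zero-one excess risk grows like the square root of the
% surrogate excess risk as the latter tends to zero.
```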
Finite-Time Analysis of Round-Robin Kullback-Leibler Upper Confidence Bounds for Optimal Adaptive Allocation with Multiple Plays and Markovian Rewards
We study an extension of the classic stochastic multi-armed bandit problem that involves multiple plays and Markovian rewards in the rested-bandits setting. To tackle this problem, we consider an adaptive allocation rule which, at each stage, combines the information from the sample means of all the arms with the Kullback-Leibler upper confidence bound of a single arm selected in a round-robin way. For rewards generated from a one-parameter exponential family of Markov chains, we provide a finite-time upper bound on the regret incurred by this adaptive allocation rule, which reveals the logarithmic dependence of the regret on the time horizon and is asymptotically optimal. For our analysis we devise several concentration results for Markov chains, including a maximal inequality for Markov chains, that may be of interest in their own right. As a byproduct of our analysis, we also establish asymptotically optimal, finite-time guarantees for the case of multiple plays and i.i.d. rewards.
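As a sketch of the index computation in the familiar i.i.d. Bernoulli special case (the paper's setting of one-parameter exponential families of Markov chains with a round-robin index refresh is more involved), the KL-UCB index can be found by bisection:

```python
import math

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p = min(max(p, 1e-12), 1 - 1e-12)
    q = min(max(q, 1e-12), 1 - 1e-12)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(mean, pulls, t, c=0.0):
    """Largest q >= mean with pulls * KL(mean, q) <= log t + c log log t,
    located by bisection; this is the Bernoulli special case of the
    one-parameter exponential family indices."""
    budget = (math.log(t) + c * math.log(max(math.log(t), 1.0))) / pulls
    lo, hi = mean, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

print(kl_ucb_index(mean=0.4, pulls=25, t=1000))
```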
Author Feedback (excerpts)
As discussed in the paper, this generalizes the Bernoulli exponential family; here the state space is S = {0, ..., n}. Komiyama, Honda, and Nakagawa study Thompson sampling for multiple plays and i.i.d. rewards. Reviewer #3: we do not think that your suggestion works as an alternative to the round-robin scheme. B = T/2, where T is the time horizon.