performance
Agreement-on-the-line: Predicting the Performance of Neural Networks under Distribution Shift
Recently, Miller et al. showed that a model's in-distribution (ID) accuracy has a strong linear correlation with its out-of-distribution (OOD) accuracy on several OOD benchmarks, a phenomenon they dubbed "accuracy-on-the-line". While a useful tool for model selection (i.e., the model most likely to perform best OOD is the one with the highest ID accuracy), this fact does not help to estimate the actual OOD performance of models without access to a labeled OOD validation set. In this paper, we show that a similar, surprising phenomenon also holds for the agreement between pairs of neural network classifiers: whenever accuracy-on-the-line holds, the OOD agreement between the predictions of any pair of neural networks (with potentially different architectures) also exhibits a strong linear correlation with their ID agreement. Furthermore, the slope and bias of the OOD-vs-ID agreement line closely match those of the OOD-vs-ID accuracy line. This phenomenon, which we call agreement-on-the-line, has an important practical application: without any labeled data, we can predict the OOD accuracy of classifiers, since OOD agreement can be estimated with just unlabeled data. Our prediction algorithm outperforms previous methods both on shifts where agreement-on-the-line holds and, surprisingly, when accuracy is not on the line. The phenomenon also provides new insights into neural networks: unlike accuracy-on-the-line, agreement-on-the-line appears to hold only for neural network classifiers.
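The practical recipe is simple: agreement requires only predictions, so the OOD-vs-ID agreement line can be fit from unlabeled OOD data, and its slope and bias reused to map ID accuracy to OOD accuracy. The paper's estimators apply a probit transform before fitting; the sketch below, with hypothetical inputs, shows only the plain linear version.

```python
import numpy as np

def agreement(preds_a, preds_b):
    """Fraction of examples on which two classifiers agree (no labels needed)."""
    return float(np.mean(preds_a == preds_b))

def predict_ood_accuracy(id_acc, id_agree, ood_agree):
    """Estimate per-model OOD accuracy from ID accuracy via the agreement line.

    id_acc:    per-model ID accuracies, shape (m,)
    id_agree:  ID agreement for model pairs, shape (p,)
    ood_agree: OOD agreement for the same pairs, shape (p,); computable
               from unlabeled OOD inputs, since agreement compares predictions.
    """
    # Fit the OOD-vs-ID agreement line; under agreement-on-the-line its slope
    # and bias match those of the (unobservable) OOD-vs-ID accuracy line.
    slope, bias = np.polyfit(id_agree, ood_agree, deg=1)
    return slope * np.asarray(id_acc) + bias
```

It is precisely the observed match between the agreement line and the accuracy line that licenses reusing the fitted slope and bias in the last step.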
Unravelling the Performance of Physics-informed Graph Neural Networks for Dynamical Systems
Recently, graph neural networks have been gaining attention for simulating dynamical systems, owing to their inductive nature, which leads to zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. A growing body of literature attempts to combine these two approaches. Here, we evaluate the performance of thirteen graph neural networks, namely Hamiltonian and Lagrangian graph neural networks, graph neural ODEs, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulations, highlighting the similarities and differences in the inductive biases and graph architectures of these systems. We then evaluate them on spring, pendulum, gravitational, and 3D deformable solid systems, comparing performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than those seen in training, providing a promising route to simulating large-scale realistic systems.
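For readers unfamiliar with the two central evaluation metrics, rollout error under autoregressive prediction and drift of a conserved quantity, a minimal sketch might look as follows; `model_step` and `energy_fn` are hypothetical stand-ins for a trained GNN integrator and the system's energy function, not the paper's code.

```python
import numpy as np

def rollout_error(model_step, x0, true_traj):
    """Relative L2 rollout error: the model is fed its own predictions."""
    x, errs = x0, []
    for x_true in true_traj:
        x = model_step(x)  # one predicted integration step (hypothetical model)
        errs.append(np.linalg.norm(x - x_true) / (np.linalg.norm(x_true) + 1e-12))
    return np.array(errs)

def energy_drift(energy_fn, pred_traj):
    """Absolute drift of the conserved energy along a predicted trajectory."""
    e = np.array([energy_fn(x) for x in pred_traj])
    return np.abs(e - e[0])
```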
Singleton-Optimized Conformal Prediction
Tao Wang, Yan Sun, Edgar Dobriban
Conformal prediction can be used to construct prediction sets that cover the true outcome with a desired probability, but it can sometimes produce large prediction sets that are costly in practice. The most useful outcome is a singleton prediction (an unambiguous decision), yet existing efficiency-oriented methods primarily optimize average set size. Motivated by this, we propose a new nonconformity score that aims to minimize the probability of producing non-singleton sets. Starting from a non-convex constrained optimization problem as motivation, we provide a geometric reformulation and an associated algorithm for computing the nonconformity score and the corresponding split conformal prediction sets in O(K) time for K-class problems. Using this score in split conformal prediction yields our proposed Singleton-Optimized Conformal Prediction (SOCOP) method. We evaluate our method in experiments on image classification and LLM multiple-choice question answering, comparing against standard nonconformity scores such as the (negative) label probability estimates and their cumulative distribution function, both of which are motivated by optimizing length. The results show that SOCOP increases singleton frequency (sometimes by over 20%) compared to the above scores, with minimal impact on average set size.
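For context, the split conformal scaffold that SOCOP plugs into is standard. Below is a minimal sketch using the baseline score the paper compares against, one minus the predicted probability of the true label, not the SOCOP score itself.

```python
import numpy as np

def split_conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate the score threshold on held-out data (score: 1 - p(true label))."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected (1 - alpha) quantile for marginal coverage.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def prediction_sets(test_probs, q):
    """For each test point, include every label whose score is within the threshold."""
    return [np.flatnonzero(1.0 - p <= q) for p in test_probs]
```

Swapping a different nonconformity score into `split_conformal_threshold` and `prediction_sets` preserves the coverage guarantee, which is what lets SOCOP optimize for singletons without losing validity.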
LM-HT SNN: Enhancing the Performance of SNN to ANN Counterpart through Learnable Multi-hierarchical Threshold Model
Compared to traditional Artificial Neural Networks (ANNs), Spiking Neural Networks (SNNs) have garnered widespread academic interest for their intrinsic ability to transmit information in a more energy-efficient manner. The recently proposed multi-threshold model provides more possibilities for further enhancing the learning capability of SNNs. In this paper, we rigorously analyze the relationships among the multi-threshold model, the vanilla spiking model, and quantized ANNs from a mathematical perspective, and then propose a novel LM-HT model: an equidistant multi-threshold model that can dynamically regulate the global input current and membrane potential leakage along the time dimension. The LM-HT model can also be transformed into a vanilla single-threshold model through reparameterization, enabling more flexible hardware deployment. In addition, we note that the LM-HT model can seamlessly integrate with the ANN-SNN conversion framework under a special initialization.
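As a toy illustration of the equidistant multi-threshold idea (not the paper's exact LM-HT formulation), one update step of an integrate-and-fire neuron with L evenly spaced thresholds could look like this; all parameter names are illustrative.

```python
import numpy as np

def multi_threshold_if_step(v, x, theta=1.0, levels=4, leak=1.0):
    """One step of a toy equidistant multi-threshold IF neuron.

    v:      membrane potential;  x: input current
    theta:  base threshold; the L thresholds sit at theta, 2*theta, ..., L*theta
    leak:   membrane leakage factor (1.0 means no leak)
    """
    v = leak * v + x
    # Spike intensity = number of equidistant thresholds crossed, capped at L.
    s = np.clip(np.floor(v / theta), 0, levels)
    v = v - s * theta  # soft reset by the emitted amount
    return v, s
```

The multi-level output s is what ties such models to quantized ANN activations, and setting levels=1 recovers the vanilla single-threshold neuron.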
Reviews: The Effect of Network Width on the Performance of Large-batch Training
How to develop algorithms that allow large batches, so that one can train neural networks in a distributed environment, has been widely discussed. The paper investigates the effect of network width on the performance of large-batch training, both theoretically and experimentally. The authors claim that, for the same number of parameters, neural networks with a wide architecture are more likely to train easily with suitably large batches. Theoretical support is given for two-layer linear and nonlinear networks and for multilayer linear networks. The paper is well written and easy to follow.
DSD$^2$: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?
Victor Quétu, Enzo Tartaglione
Recent works have shown that modern deep learning models can exhibit a sparse double descent phenomenon: as the sparsity of the model increases, the test performance first worsens as the model overfits the training data; then the overfitting recedes, leading to an improvement in performance; and finally the model begins to forget critical information, resulting in underfitting. Such behavior prevents the use of traditional early-stopping criteria. In this work, we make three key contributions. First, we propose a learning framework that avoids this phenomenon and improves generalization. Second, we introduce an entropy measure that provides more insight into the emergence of the phenomenon and enables the use of traditional stopping criteria. Third, we provide a comprehensive quantitative analysis of contingent factors such as re-initialization methods, model width and depth, and dataset noise. The contributions are supported by empirical evidence in typical setups. Our code is available at https://github.com/VGCQ/DSD2.
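To make the phenomenon concrete, the kind of sparsity sweep that exposes the curve (prune, fine-tune, evaluate at each level) is sketched below; `train_fn` and `eval_fn` are hypothetical training and evaluation routines, not the authors' framework.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    w = weights.copy().ravel()
    k = int(sparsity * w.size)
    if k > 0:
        idx = np.argsort(np.abs(w))[:k]
        w[idx] = 0.0
    return w.reshape(weights.shape)

def sparsity_sweep(train_fn, eval_fn, weights, sparsities):
    """Trace test performance across sparsity levels; under sparse double
    descent the curve goes worse -> better -> collapse as sparsity grows."""
    curve = []
    for s in sparsities:
        pruned = magnitude_prune(weights, s)
        finetuned = train_fn(pruned)            # hypothetical fine-tuning step
        curve.append((s, eval_fn(finetuned)))   # hypothetical test metric
    return curve
```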
BetterUp Announces Sales Performance Solution on Salesforce AppExchange
BetterUp, the human transformation company, announced it has launched BetterUp Sales Performance on Salesforce AppExchange, empowering customers to reach peak performance through dedicated professional sales coaching and to measure the impact coaching has on business outcomes, including quota attainment, deal velocity, time to productivity, and more. "BetterUp's Sales Performance is a welcome addition to AppExchange, as it powers transformation for customers by unlocking team performance across sales organizations." Integrated directly with Salesforce, BetterUp's Sales Performance is currently available on AppExchange. Additionally, BetterUp and Salesforce customers will soon have access to a dashboard embedded in Salesforce that measures revenue-related metrics for coached and non-coached sales representatives. "Salesforce is one of the first adopters of the Human Transformation Platform and remains committed to the importance of mental fitness, inclusion, and connection," said Alexi Robichaux, CEO and co-founder of BetterUp. "With our expanded partnership, sales managers and sellers using Sales Cloud will now have support to strategize on deals, prep for prospect calls, and develop the right mindsets around resilience, focus, and agility. We're thrilled to bring this solution to Salesforce customers to help them achieve and exceed sales performance."
A New Pattern For The Jamstack: Segmented Rendering -- Smashing Magazine
Eric lives in France and owns a company named LBKE, where he works as a developer and an R&D consultant. If you think that static rendering is limited to generic, public content that is the same for every user of your website, you should definitely read this article. Segmented Rendering is a new pattern for the Jamstack that lets you personalize content statically, without any client-side rendering or per-request server-side rendering. Let's focus on a scenario very useful for blog owners: handling paid content.
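To make the pattern concrete: each user segment gets its own statically pre-rendered variant of a page at build time, and a thin routing layer picks a variant per request based on a cookie, so no rendering work happens per request. A minimal sketch in Python, with a hypothetical `plan` cookie and `dist/<segment>/` build layout (the article itself targets Jamstack hosts and CDN-level rewrites, not a Python server):

```python
from http.cookies import SimpleCookie
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

STATIC_ROOT = Path("dist")        # hypothetical build output: dist/free/..., dist/premium/...
SEGMENTS = {"free", "premium"}

class SegmentedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The segment comes from a cookie set at login/purchase time;
        # default to the free tier for anonymous visitors.
        cookies = SimpleCookie(self.headers.get("Cookie", ""))
        segment = cookies["plan"].value if "plan" in cookies else "free"
        if segment not in SEGMENTS:
            segment = "free"
        # Serve the variant pre-rendered for this segment at build time.
        name = self.path.strip("/") or "index.html"
        page = STATIC_ROOT / segment / name
        if page.is_file():
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(page.read_bytes())
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("", 8000), SegmentedHandler).serve_forever()
```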