Appendix
Figure 9: Example showing how a single line of HTML code is rendered by a browser. Block-level tags delimit different blocks, which are therefore separated by line breaks, while inline tags are rendered on the same line as the text that precedes and follows them.
Fast, Provably Convergent IRLS Algorithm for p-norm Linear Regression
Deeksha Adil, Richard Peng, Sushant Sachdeva
Iteratively Reweighted Least Squares (IRLS) is an easy-to-implement family of algorithms for solving p-norm regression problems that has been studied for over 50 years. However, these algorithms often diverge for p > 3, and since the work of Osborne (1985) it has been an open problem whether there is an IRLS algorithm that is guaranteed to converge rapidly for p > 3. We propose p-IRLS, the first IRLS algorithm that provably converges geometrically for any p ∈ [2, ∞). Our algorithm is simple to implement and is guaranteed to find a high-accuracy solution in a sub-linear number of iterations. Our experiments demonstrate that it performs even better than our theoretical bounds, beats the standard MATLAB/CVX implementation for solving these problems by 10-50x, and is the fastest among available implementations in the high-accuracy regime.
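To make the scheme concrete, here is a minimal textbook IRLS iteration for p-norm regression in Python. This is the plain reweighting scheme that, as noted above, can diverge for p > 3; it is not the paper's p-IRLS variant, and the function name, initialisation, and absence of damping are illustrative assumptions.

```python
import numpy as np

def irls_pnorm(A, b, p=4, iters=50, eps=1e-8):
    """Plain IRLS sketch for min_x ||Ax - b||_p with p >= 2 (no safeguards)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]     # least-squares init (the p = 2 solution)
    for _ in range(iters):
        r = A @ x - b
        w = np.abs(r) ** (p - 2) + eps           # reweighting: w_i = |r_i|^(p - 2)
        Aw = A * w[:, None]                      # W A, with W = diag(w)
        x = np.linalg.solve(A.T @ Aw, Aw.T @ b)  # solve A^T W A x = A^T W b
    return x
```

The paper's contribution is to modify this basic update so that geometric convergence becomes provable for any p ∈ [2, ∞) rather than only in practice.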
Interpreting Learned Feedback Patterns in Large Language Models
Luke Marks, Amir Abdullah, Clement Neo
Reinforcement learning from human feedback (RLHF) is widely used to train large language models (LLMs). However, it is unclear whether LLMs accurately learn the underlying preferences in human feedback data. We coin the term Learned Feedback Pattern (LFP) for patterns in an LLM's activations learned during RLHF that improve its performance on the fine-tuning task. We hypothesize that LLMs with LFPs accurately aligned to the fine-tuning feedback exhibit consistent activation patterns for outputs that would have received similar feedback during RLHF. To test this, we train probes to estimate the feedback signal implicit in the activations of a fine-tuned LLM. We then compare these estimates to the true feedback, measuring how faithfully the LFPs capture the fine-tuning feedback. Our probes are trained on a condensed, sparse, and interpretable representation of LLM activations, making it easier to correlate features of the input with our probes' predictions. We validate our probes by comparing the neural features they correlate with positive feedback against the features that GPT-4 describes and classifies as related to LFPs. Understanding LFPs can help minimize discrepancies between LLM behavior and training objectives, which is essential for the safety and alignment of LLMs.
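As a hedged illustration of the probing setup, the sketch below fits a linear probe from a sparse feature representation to a scalar feedback signal. The function name, the ridge-regression choice, and the synthetic data are assumptions for the example, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import Ridge

def train_feedback_probe(sparse_features, feedback_signal):
    """Fit a linear probe mapping sparse, interpretable activation
    features to the feedback each output would have received."""
    probe = Ridge(alpha=1.0)     # illustrative probe; not the paper's exact model
    probe.fit(sparse_features, feedback_signal)
    return probe

# Toy usage: synthetic mostly-zero features stand in for a sparse
# representation of LLM activations, with a noisy linear feedback signal.
rng = np.random.default_rng(0)
X = rng.random((256, 512)) * (rng.random((256, 512)) < 0.05)
y = X @ rng.standard_normal(512) + 0.1 * rng.standard_normal(256)
probe = train_feedback_probe(X, y)
print(probe.coef_.shape)         # one weight per interpretable feature
```

The probe's largest coefficients point to the sparse features most associated with positive feedback, which is what would then be compared against GPT-4's feature descriptions.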
Why the argument for WFH could get a big boost from AI
The pandemic changed how people worked, shifting most professionals to remote or hybrid models. For the software company Atlassian, this flexible, distributed approach persists to this day. "We have 13,000 employees spread across the globe, and individuals can choose their working location every day," said Annie Dean, Head of Team Anywhere, Atlassian's distributed work policy. "It's about how we work, not where we work." The implementation of the flexible model has produced positive effects for employees and the company alike. Internal data reveals that even though only 34% of employees have opted to work from home, 92% of Atlassian employees reported that the ability to work from anywhere allows them to perform their best, and 91% said it's an important reason for staying at the company.
Average-Case Averages: Private Algorithms for Smooth Sensitivity and Mean Estimation
The simplest and most widely applied method for guaranteeing differential privacy is to add instance-independent noise, scaled to the global sensitivity of the statistic of interest. However, global sensitivity is a worst-case notion that is often too conservative for realized dataset instances. We provide methods for scaling noise in an instance-dependent way and demonstrate that they provide greater accuracy under average-case distributional assumptions. Specifically, we consider the basic problem of privately estimating the mean of a real distribution from i.i.d. samples.
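For contrast with the instance-dependent approach, here is a minimal sketch of the worst-case baseline the abstract describes: Laplace noise scaled to the global sensitivity of a clamped mean. The function name and clamping range are illustrative assumptions.

```python
import numpy as np

def dp_mean_global(samples, lo, hi, epsilon, rng=None):
    """epsilon-DP mean via the Laplace mechanism with GLOBAL sensitivity.

    For n values clamped to [lo, hi], changing one record moves the mean
    by at most (hi - lo) / n, so that is the noise scale's numerator.
    """
    rng = rng or np.random.default_rng()
    x = np.clip(samples, lo, hi)
    sensitivity = (hi - lo) / len(x)          # worst-case (global) sensitivity
    return x.mean() + rng.laplace(scale=sensitivity / epsilon)
```

The paper's point is that this worst-case scale is often far larger than a typical dataset requires, and that instance-dependent noise can be more accurate under distributional assumptions.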
Separations in the Representational Capabilities of Transformers and Recurrent Architectures
Michael Hahn, Phil Blunsom
Transformer architectures have been widely adopted in foundation models. Due to their high inference costs, there is renewed interest in exploring the potential of efficient recurrent architectures (RNNs). In this paper, we analyze the differences in the representational capabilities of Transformers and RNNs across several tasks of practical relevance, including index lookup, nearest neighbor, recognizing bounded Dyck languages, and string equality. For the tasks considered, our results show separations based on the size of the model required for different architectures. For example, we show that a one-layer Transformer of logarithmic width can perform index lookup, whereas an RNN requires a hidden state of linear size. Conversely, while constant-size RNNs can recognize bounded Dyck languages, we show that one-layer Transformers require linear size for this task. Furthermore, we show that two-layer Transformers of logarithmic size can perform decision tasks such as string equality or disjointness, whereas both one-layer Transformers and recurrent models require linear size for these tasks. We also show that a log-size two-layer Transformer can implement the nearest neighbor algorithm in its forward pass; recurrent models, on the other hand, require linear size. Our constructions are based on the existence of N nearly orthogonal vectors in O(log N)-dimensional space, and our lower bounds are based on reductions from communication complexity problems.
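The geometric fact underpinning the upper-bound constructions is easy to check numerically: N random unit vectors in d = O(log N) dimensions are nearly orthogonal with high probability. A small hedged demo follows; the choices of N and d are arbitrary, not the paper's parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 1024, 128                                 # d on the order of log N, up to constants
V = rng.standard_normal((N, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)    # N random unit vectors
G = V @ V.T                                      # pairwise inner products
worst = np.abs(G - np.eye(N)).max()              # largest off-diagonal |<u, v>|
print(f"max |<u, v>| over distinct pairs: {worst:.3f}")  # concentrates near sqrt(2 ln N / d)
```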
Polynomially Over-Parameterized Convolutional Neural Networks Contain Structured Strong Winning Lottery Tickets
Arthur da Cunha (Université Côte d'Azur, Inria, CNRS, I3S; Aarhus University, Aarhus, Denmark), dac@cs.au.dk · Francesco d'Amore (Aalto University, Bocconi University, Espoo, Finland), francesco.damore@aalto.fi · Emanuele Natale (Université Côte d'Azur, Inria, CNRS, I3S, Sophia Antipolis, France), emanuele.natale@inria.fr
The Strong Lottery Ticket Hypothesis (SLTH) states that randomly-initialised neural networks likely contain subnetworks that perform well without any training. Although unstructured pruning has been extensively studied in this context, its structured counterpart, which can deliver significant computational and memory efficiency gains, has been largely unexplored. One of the main reasons for this gap is the limitations of the underlying mathematical tools used in formal analyses of the SLTH. In this paper, we overcome these limitations: we leverage recent advances in the multidimensional generalisation of the Random Subset-Sum Problem and obtain a variant that admits the stochastic dependencies that arise when addressing structured pruning in the SLTH. We apply this result to prove, for a wide class of random Convolutional Neural Networks, the existence of structured subnetworks that can approximate any sufficiently smaller network. This result provides the first sub-exponential bound around the SLTH for structured pruning, opening up new avenues for further research on the hypothesis and contributing to the understanding of the role of over-parameterization in deep learning.
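As a toy illustration of the one-dimensional Random Subset-Sum phenomenon that the paper generalises, the brute-force check below shows that a small set of uniform samples already contains a subset whose sum approximates an arbitrary target very closely. The sample size and target are arbitrary choices for the demo, not quantities from the paper.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
xs = rng.uniform(-1, 1, size=16)      # 16 i.i.d. uniform samples -> 2^16 candidate subsets
z = 0.314159                          # arbitrary target in [-1, 1]
best = min(abs(sum(s) - z)
           for r in range(len(xs) + 1)
           for s in itertools.combinations(xs, r))
print(f"best subset-sum error for target {z}: {best:.2e}")  # typically ~ 2^-Theta(n)
```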
TANGO: Text-driven Photorealistic and Robust 3D Stylization via Lighting Decomposition
Creation of 3D content by stylization is a promising yet challenging problem in computer vision and graphics research. In this work, we focus on stylizing photorealistic appearance renderings of a given surface mesh of arbitrary topology. Motivated by the recent surge of cross-modal supervision from the Contrastive Language-Image Pre-training (CLIP) model, we propose TANGO, which transfers the appearance style of a given 3D shape according to a text prompt in a photorealistic manner. Technically, we propose to disentangle the appearance style into the spatially varying bidirectional reflectance distribution function, the local geometric variation, and the lighting condition, which are jointly optimized, via supervision of the CLIP loss, by a spherical-Gaussians-based differentiable renderer. As such, TANGO enables photorealistic 3D style transfer by automatically predicting reflectance effects even for bare, low-quality meshes, without training on a task-specific dataset. Extensive experiments show that TANGO outperforms existing methods of text-driven 3D style transfer in terms of photorealistic quality, consistency of 3D geometry, and robustness when stylizing low-quality meshes. Our code and results are available at our project webpage https://cyw-3d.github.io/tango.
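To make the supervision signal concrete, below is a hedged sketch of the CLIP-guided optimization pattern the abstract describes, with the spherical-Gaussians renderer replaced by a directly learnable image. The prompt, learning rate, and step count are illustrative assumptions; this is not TANGO's actual pipeline.

```python
import torch
import clip  # OpenAI's CLIP package (github.com/openai/CLIP), assumed installed

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model.float()  # keep weights in fp32 so gradients flow uniformly

with torch.no_grad():
    txt = model.encode_text(clip.tokenize(["a shoe made of gold"]).to(device))
    txt = txt / txt.norm(dim=-1, keepdim=True)

# Stand-in for the differentiable renderer: a learnable 224x224 RGB image.
# In TANGO, gradients would instead flow into SVBRDF, geometric-variation,
# and lighting parameters through a spherical-Gaussians renderer.
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=1e-2)

for step in range(200):
    img = model.encode_image(image)
    img = img / img.norm(dim=-1, keepdim=True)
    loss = 1.0 - (img * txt).sum(dim=-1).mean()  # minimize CLIP cosine distance
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```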
Will Elden Ring film be 'awesome' or 'meh'? Fans have thoughts
Elden Ring is a role-playing adventure game set in the war-torn, devastated Lands Between, where players must collect runes, which represent that world's order and laws, in order to restore it and become the Elden Lord. TikToker Everythingethan added a note of caution, saying: "I want to know what part of the timeline we're adapting... I don't know if I want to see this live action. I think it would be kind of cursed at times. I think animation is the best way to adapt video games nine times out of 10."
5 AI terms you keep hearing and what they actually mean
Tyler Saltsman, founder and CEO of EdgeRunner AI, warns that creating artificial general intelligence could "destroy the world as we know it." Whether it's powering your phone's autocorrect or helping someone create a new recipe with a few words, artificial intelligence (AI) is everywhere right now. But if you're still nodding along when someone mentions "neural networks" or "generative AI," you're not alone. Today I am breaking down five buzzy AI terms that you've probably seen in headlines, group chats or app updates, minus the tech talk. Understanding these basics will help you talk AI with confidence, even if you're not a programmer.