Anthropic Is at War With Itself
The AI company shouting about AI's dangers can't quite bring itself to slow down.
These are not the words you want to hear when it comes to human extinction, but I was hearing them: "Things are moving uncomfortably fast." I was sitting in a conference room with Sam Bowman, a safety researcher at Anthropic. Worth $183 billion at the latest estimate, the AI firm has every incentive to speed things up, ship more products, and develop more advanced chatbots to stay competitive with the likes of OpenAI, Google, and the industry's other giants. But Anthropic is at odds with itself, thinking deeply, even anxiously, about seemingly every decision. The company has positioned itself as the AI industry's superego: the firm that speaks with the most authority about the big questions surrounding the technology, while rival companies develop advertisements and affiliate shopping links (a difference that Anthropic's CEO, Dario Amodei, was eager to call out during an interview in Davos last week).
Scaling Laws for Hyperparameter Optimization
Hyperparameter optimization is an important subfield of machine learning that focuses on tuning the hyperparameters of a chosen algorithm to achieve peak performance. Recently, a stream of methods has tackled hyperparameter optimization; however, most of them do not exploit the dominant power-law nature of learning curves for Bayesian optimization. In this work, we propose Deep Power Laws (DPL), an ensemble of neural network models conditioned to yield predictions that follow a power-law scaling pattern. Our method dynamically decides which configurations to pause and which to train incrementally by making use of gray-box evaluations. We compare our method against 7 state-of-the-art competitors on 3 benchmarks covering tabular, image, and NLP datasets and 59 diverse tasks. Our method achieves the best any-time results across all benchmarks, outperforming all competitors.
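The core modeling assumption is easy to prototype: fit a power law to the partial learning curve of a configuration and extrapolate its final performance. A minimal sketch of that idea, using plain least squares on one hypothetical curve rather than the paper's neural-ensemble DPL (function names and numbers are ours):

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(t, y_inf, a, b):
    # Validation error after t epochs: saturates at y_inf, decays like t^-b.
    return y_inf + a * np.power(t, -b)

# Hypothetical partial learning curve: validation error over 10 epochs.
rng = np.random.default_rng(0)
epochs = np.arange(1, 11).astype(float)
errors = 0.12 + 0.5 * epochs ** -0.7 + rng.normal(0, 0.005, 10)

params, _ = curve_fit(power_law, epochs, errors, p0=[0.1, 0.5, 0.5], maxfev=10000)
print("extrapolated error at epoch 100: %.4f" % power_law(100.0, *params))
```

DPL replaces this single least-squares fit with an ensemble of neural networks whose outputs are constrained to the same functional form, which also yields the uncertainty estimates Bayesian optimization needs.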
User-Specified Local Differential Privacy in Unconstrained Adaptive Online Learning
Local differential privacy is a strong notion of privacy in which the provider of the data guarantees privacy by perturbing the data with random noise. In the standard application of local differential privacy, the distribution of the noise is fixed and known to the learner. In this paper we generalize this approach by allowing the provider of the data to choose the distribution of the noise without disclosing any of its parameters to the learner, under the constraint that the distribution is symmetric. We consider this problem in the unconstrained Online Convex Optimization setting with noisy feedback, in which the learner receives the subgradient of a loss function perturbed by noise and aims to achieve sublinear regret with respect to some competitor, without constraints on the norm of the competitor. We derive the first algorithms with adaptive regret bounds in this setting, i.e., algorithms that adapt to the unknown competitor norm, unknown noise, and unknown sum of the norms of the subgradients, matching state-of-the-art bounds in all cases.
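To make the feedback model concrete: the provider perturbs each subgradient with zero-mean symmetric noise from a distribution of its own choosing, and the learner only ever sees the noisy subgradient. A hedged toy sketch (Laplace noise is just one valid symmetric choice, and the plain decaying-step SGD below is a stand-in, not the paper's adaptive algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

def provider_perturb(subgrad, scale=0.5):
    # Zero-mean symmetric noise; the scale is never revealed to the learner.
    return subgrad + rng.laplace(0.0, scale, size=subgrad.shape)

# Toy unconstrained problem: minimize f(w) = |w - 3| over all of R.
w = np.zeros(1)
for t in range(1, 5001):
    g = np.sign(w - 3.0)            # true subgradient of the loss
    g_noisy = provider_perturb(g)   # the only feedback the learner receives
    w -= 0.1 / np.sqrt(t) * g_noisy
print(w)  # drifts toward 3 despite the noise
```

The paper's contribution is regret bounds that adapt simultaneously to the competitor norm, the noise, and the subgradient norms, none of which this fixed step-size schedule can do.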
Fast Factorized Learning: Powered by In-Memory Database Systems
Stöckl, Bernhard, Schüle, Maximilian E.
Learning models over factorized joins avoids redundant computation by identifying and pre-computing shared cofactors. Previous work has investigated the performance gain of computing cofactors on traditional disk-based database systems, but in the absence of published code those experiments could not be reproduced on in-memory database systems. This work describes an implementation of cofactor-based in-database factorized learning. We benchmark our open-source implementation for learning linear regression on factorized joins with PostgreSQL (a disk-based database system) and HyPer (an in-memory engine). The evaluation shows that factorized learning on in-memory database systems outperforms non-factorized learning by 70% and disk-based database systems by a factor of 100. Thus, modern database engines can contribute to the machine learning pipeline by pre-computing aggregates prior to data extraction to accelerate training.
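The underlying trick: linear regression only needs the aggregates X^T X and X^T y, and over a join those aggregates factorize, so each relation can pre-aggregate its cofactors per join key and the joined table is never materialized. A hedged sketch in pandas under hypothetical table names (the paper's implementation runs as SQL inside the database engine):

```python
import numpy as np
import pandas as pd

# Hypothetical relations sharing join key k; names and data are ours.
R = pd.DataFrame({"k": [1, 1, 2, 2, 2], "x1": [1.0, 2.0, 3.0, 4.0, 5.0]})
S = pd.DataFrame({"k": [1, 2, 2], "x2": [10.0, 20.0, 30.0]})

# Per-key cofactors on each side: count, sum, and sum of squares.
r = R.groupby("k")["x1"].agg(n="count", s="sum", ss=lambda v: (v ** 2).sum())
s = S.groupby("k")["x2"].agg(n="count", s="sum", ss=lambda v: (v ** 2).sum())
c = r.join(s, lsuffix="_r", rsuffix="_s")

# Aggregates over the join of R and S, computed without materializing it.
sum_x1x2 = (c.s_r * c.s_s).sum()    # cross term factorizes per key
sum_x1sq = (c.ss_r * c.n_s).sum()   # each x1^2 repeats once per matching S row
sum_x2sq = (c.ss_s * c.n_r).sum()

# Sanity check against the materialized join.
J = R.merge(S, on="k")
assert np.isclose(sum_x1x2, (J.x1 * J.x2).sum())
assert np.isclose(sum_x1sq, (J.x1 ** 2).sum())
assert np.isclose(sum_x2sq, (J.x2 ** 2).sum())
```

On an in-memory engine like HyPer these groupby-style aggregations run at memory speed, which plausibly accounts for much of the reported factor-of-100 gap over the disk-based setup.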
Mitigating Self-Preference by Authorship Obfuscation
Language model (LM) judges are widely used to evaluate the quality of LM outputs. Despite many advantages, LM judges display concerning biases that can impair their integrity in evaluations. One such bias is self-preference: LM judges prefer their own answers over those produced by other LMs or humans. The bias is hard to eliminate because frontier LM judges can distinguish their own outputs from those of others, even when the evaluation candidates are not labeled with their sources. In this paper, we investigate strategies to mitigate self-preference by reducing LM judges' ability to recognize their own outputs. We apply black-box perturbations to evaluation candidates in pairwise comparisons to obfuscate authorship and reduce self-recognition. We find that perturbations as simple as replacing a few words with synonyms predictably reduce self-preference. However, we also uncover fundamental challenges to eliminating the bias: when we extrapolate our perturbations toward a more complete neutralization of stylistic differences between the evaluation candidates, self-preference recovers. Our findings suggest that self-recognition and self-preference can operate on many semantic levels, and complete mitigation remains challenging despite promising initial results.
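The simplest perturbation the paper describes, synonym replacement, is easy to sketch. A hedged toy version using WordNet (the paper's exact pipeline may differ; `obfuscate` and its parameters are our own):

```python
import random
import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)

def obfuscate(text, rate=0.1, seed=0):
    # Swap a small fraction of words for WordNet synonyms to blur
    # authorship style before the candidate is shown to the LM judge.
    rng = random.Random(seed)
    words = text.split()
    for i, w in enumerate(words):
        if rng.random() > rate:
            continue
        lemmas = {l.name().replace("_", " ")
                  for syn in wordnet.synsets(w) for l in syn.lemmas()}
        lemmas.discard(w)
        if lemmas:
            words[i] = rng.choice(sorted(lemmas))
    return " ".join(words)

print(obfuscate("The model produces a detailed and accurate answer", rate=0.3))
```

The paper's negative result is worth noting: pushing such perturbations toward full stylistic neutralization does not keep suppressing self-preference, so light-touch obfuscation is not a complete fix.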
Universality in Collective Intelligence on the Rubik's Cube
Krakauer, David, Kardeş, Gülce, Grochow, Joshua
Progress in understanding expert performance is limited by the scarcity of quantitative data on long-term knowledge acquisition and deployment. Here we use the Rubik's Cube as a cognitive model system existing at the intersection of puzzle solving, skill learning, expert knowledge, cultural transmission, and group theory. By studying competitive cube communities, we find evidence for universality in the collective learning of the Rubik's Cube in both sighted and blindfolded conditions: expert performance follows exponential progress curves whose parameters reflect the delayed acquisition of algorithms that shorten solution paths. Blindfold solves form a distinct problem class from sighted solves and are constrained not only by expert knowledge but also by the skill improvements required to overcome short-term memory bottlenecks, a constraint shared with blindfold chess. Cognitive artifacts such as the Rubik's Cube help solvers navigate an otherwise enormous mathematical state space. In doing so, they sustain collective intelligence by integrating communal knowledge stores with individual expertise and skill, illustrating how expertise can, in practice, continue to deepen over the course of a single lifetime.
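The exponential progress curves are simple to fit to record data. A hedged sketch with made-up numbers (not the paper's data) showing the functional form:

```python
import numpy as np
from scipy.optimize import curve_fit

def progress_curve(year, t_floor, a, r):
    # Record solve time decays exponentially toward an asymptotic floor.
    return t_floor + a * np.exp(-r * year)

# Hypothetical world-record solve times (seconds) vs. years since 2003.
years = np.array([0.0, 2, 4, 6, 8, 10, 12, 14, 16])
records = np.array([16.5, 11.8, 9.9, 7.1, 6.2, 5.6, 4.9, 4.6, 3.8])

(t_floor, a, r), _ = curve_fit(progress_curve, years, records, p0=[3.0, 14.0, 0.2])
print("estimated floor: %.2f s, rate: %.2f / yr" % (t_floor, r))
```

In the paper's framing, shifts in the fitted parameters mark community-wide adoption of new algorithms that shorten solution paths, which is what makes the curves a window onto collective rather than individual learning.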
Learning Sparse Gaussian Graphical Models with Overlapping Blocks
We present a novel framework, called GRAB (GRaphical models with overlApping Blocks), to capture densely connected components in a network estimate. GRAB takes as input a data matrix of p variables and n samples, and jointly learns both a network among the p variables and densely connected groups of variables (called "blocks"). GRAB has four major novelties compared to existing network estimation methods: 1) It does not require the blocks to be given a priori.
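For context, the non-overlapping baseline is the classical sparse Gaussian graphical model estimate, obtainable via the graphical lasso; GRAB's novelty is learning the block structure jointly with the network rather than assuming it. A hedged baseline sketch (synthetic data, not GRAB itself):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))      # n = 200 samples, p = 10 variables

model = GraphicalLasso(alpha=0.1).fit(X)
precision = model.precision_            # sparse inverse covariance estimate
edges = np.abs(precision) > 1e-4        # nonzero entries define the network
np.fill_diagonal(edges, False)
print("estimated edges:", int(edges.sum()) // 2)
```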
We thank the reviewers for their careful reading of the paper. Below we repeat or paraphrase the reviewers' comments and questions and then offer our responses.
R1: Computational complexity, especially in light of integer linear programming. Bag pairing does indeed reduce
This is a design choice in setting up the experiment. LPs, and we chose ours to be uniform on the given intervals.
R4: Generalisation bounds not formulated in terms of excess of risk. The term "generalization error bound" The reviewer may be asking about a "calibration" type excess risk bound.
In Alex Karp's World, Palantir Is the Underdog
My parents didn't go to college, but his father was a pediatrician, Jewish American. His mother was an artist, and she still is an artist, and she's African American. So he is of Black and Jewish parentage. He is dyslexic, and that's a big part of his identity. And then we talked about going to Central High School, which is kind of a magnet school: it's all academic, and it draws from all over the city.
Tokenize Once, Recommend Anywhere: Unified Item Tokenization for Multi-domain LLM-based Recommendation
Large language model (LLM)-based recommender systems have achieved high-quality performance by bridging the discrepancy between the item space and the language space through item tokenization. However, existing item tokenization methods typically require training separate models for each item domain, limiting generalization. Moreover, the diverse distributions and semantics across item domains make it difficult to construct a unified tokenization that preserves domain-specific information. To address these challenges, we propose UniTok, a Unified item Tokenization framework that integrates a mixture-of-experts (MoE) architecture with a series of codebooks to convert items into discrete tokens, enabling scalable tokenization while preserving semantic information across multiple item domains. Specifically, items from different domains are first projected into a unified latent space through a shared encoder. They are then routed to domain-specific experts to capture unique semantics, while a shared expert, which is always active, encodes common knowledge transferable across domains. Additionally, to mitigate semantic imbalance across domains, we present a mutual information calibration mechanism, which guides the model toward retaining similar levels of semantic information for each domain. Comprehensive experiments on wide-ranging real-world datasets demonstrate that the proposed UniTok framework is (a) highly effective, achieving up to 51.89% improvements over strong baselines; (b) theoretically sound, showing the analytical validity of our architectural design and optimization; and (c) highly generalizable, demonstrating robust performance across diverse domains without requiring per-domain retraining, a capability not supported by existing baselines.
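The architecturally distinctive piece is the routing: a shared expert that is always active plus domain-routed experts. A hedged PyTorch-style sketch of that combination (module and parameter names are ours; the real UniTok feeds the result into codebook quantization):

```python
import torch
import torch.nn as nn

class SharedPlusDomainMoE(nn.Module):
    """Toy sketch: shared expert (always on) + per-domain experts."""
    def __init__(self, dim, n_domains):
        super().__init__()
        self.shared = nn.Linear(dim, dim)
        self.domain_experts = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(n_domains)]
        )

    def forward(self, h, domain_id):
        out = self.shared(h)                 # common cross-domain knowledge
        extra = torch.zeros_like(out)
        for d in domain_id.unique():
            mask = domain_id == d            # route each item to its domain expert
            extra[mask] = self.domain_experts[int(d)](h[mask])
        return out + extra

moe = SharedPlusDomainMoE(dim=64, n_domains=3)
h = torch.randn(8, 64)                       # items after the shared encoder
domains = torch.randint(0, 3, (8,))
z = moe(h, domains)                          # would feed the codebooks next
```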