Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models

Neural Information Processing Systems

Large language models are usually fine-tuned to align with human preferences. However, fine-tuning a large language model can be challenging. In this work, we introduce weak-to-strong search, framing the alignment of a large language model as a test-time greedy search that maximizes the log-probability difference between small tuned and untuned models while sampling from the frozen large model. This method serves both as (1) a compute-efficient model up-scaling strategy that avoids directly tuning the large model and as (2) an instance of weak-to-strong generalization that enhances a strong model with weak test-time guidance.
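The decoding rule the abstract describes can be sketched concretely. The toy below is a minimal illustration, not the paper's implementation: the three model stand-ins are hypothetical NumPy functions, and the reward uses only next-token log-probabilities rather than full-sequence scores.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 8

# Toy stand-ins for the three models: each maps a token prefix to
# next-token log-probabilities. In practice these would be a frozen
# large LM and a small tuned/untuned pair.
def logprobs(seed, prefix):
    r = np.random.default_rng(seed + 31 * len(prefix) + sum(prefix))
    logits = r.normal(size=VOCAB)
    return logits - np.log(np.exp(logits).sum())

large = lambda p: logprobs(1, p)
small_tuned = lambda p: logprobs(2, p)
small_untuned = lambda p: logprobs(3, p)

def weak_to_strong_step(prefix, k=4):
    """Sample k candidate tokens from the frozen large model, then greedily
    keep the one maximizing the small tuned-minus-untuned log-prob gap."""
    lp = large(prefix)
    candidates = rng.choice(VOCAB, size=k, replace=False, p=np.exp(lp))
    reward = small_tuned(prefix) - small_untuned(prefix)
    return int(max(candidates, key=lambda t: reward[t]))

prefix = []
for _ in range(5):
    prefix.append(weak_to_strong_step(prefix))
print(prefix)
```

The key design point survives even in this toy: tokens are always *sampled from the large model*, so fluency comes from the strong model, while the small tuned/untuned pair only re-ranks candidates.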


Blackbox Attacks via Surrogate Ensemble Search

Neural Information Processing Systems

Blackbox adversarial attacks can be categorized into transfer- and query-based attacks. Transfer methods do not require any feedback from the victim model, but provide lower success rates compared to query-based methods. Query attacks often require a large number of queries for success. To achieve the best of both approaches, recent efforts have tried to combine them, but still require hundreds of queries to achieve high success rates (especially for targeted attacks). In this paper, we propose a novel method for Blackbox Attacks via Surrogate Ensemble Search (BASES) that can generate highly successful blackbox attacks using an extremely small number of queries.
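A rough sketch of the query loop the abstract describes, with toy linear classifiers standing in for real surrogate networks and the blackbox victim. The specific weight-update rule below is an assumption for illustration; BASES searches over surrogate ensemble weights, only querying the victim to check success.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_SURROGATES = 16, 4

# Toy linear "models": logit = w @ x. Hypothetical stand-ins for real
# surrogate classifiers and the blackbox victim model.
surrogates = [rng.normal(size=DIM) for _ in range(N_SURROGATES)]
victim = sum(surrogates) / N_SURROGATES + 0.1 * rng.normal(size=DIM)

def craft(x, weights, steps=50, lr=0.1, eps=0.5):
    """White-box attack on the weight-averaged surrogate loss
    (push the weighted logit positive), clipped to an eps-ball."""
    adv = x.copy()
    for _ in range(steps):
        grad = sum(w * s for w, s in zip(weights, surrogates))
        adv = np.clip(adv + lr * np.sign(grad), x - eps, x + eps)
    return adv

def bases_attack(x, max_queries=20):
    weights = np.ones(N_SURROGATES) / N_SURROGATES
    for q in range(1, max_queries + 1):
        adv = craft(x, weights)
        if victim @ adv > 0:          # one victim query: success check
            return adv, q
        # On failure, adjust one surrogate's weight and retry
        # (a simple coordinate-wise update; the real search differs).
        i = q % N_SURROGATES
        weights[i] *= 1.5
        weights /= weights.sum()
    return None, max_queries

adv, queries = bases_attack(-np.abs(rng.normal(size=DIM)))
print(queries)
```

The query efficiency comes from the structure of the loop: all gradient work happens against the local surrogate ensemble, and each victim query spends only one bit of feedback on updating the ensemble weights.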


ADT Smart Home Security System review: Smart home security

PCWorld

You can install this home security/smart home system yourself, but professional monitoring is mandatory (and that's A-OK with me). There is no support for Amazon's Alexa, and the ADT Base doesn't include a display. The ADT Smart Home Security System emphasizes security over convenience, but there are enough smart home elements for us to recommend it, whether you set it up on your own or pay for ADT's white-glove installation. Keep a close eye on the services you sign up for, as they're not all mandatory. ADT is one of the oldest home security companies in the U.S., and the ADT Smart Home Security product reviewed here is its latest offering that melds home security with a robust smart home system. As with every ADT product, you must commit to paying for professional monitoring of this system, where the staff at a central office keeps track of emergency events and will offer to dispatch police, fire, and medical personnel as needed. But unlike many of ADT's other products, you can either have ADT's technicians install the system in your home or you can do it yourself. This is a security-first system, but smart home features don't completely take a back seat. ADT sells smart light bulbs and smart plugs as well as Nest smart thermostats (more on that in a bit), and there's a Z-Wave radio in the ADT Base that forms the heart of the system, so you can add smart home components--including third-party products--on your own.



Base of RoPE Bounds Context Length

Neural Information Processing Systems

Position embedding is a core component of current Large Language Models (LLMs). Rotary position embedding (RoPE), a technique that encodes position information with a rotation matrix, has been the de facto choice for position embedding in many LLMs, such as the Llama series. RoPE has been further utilized to extend long-context capability, which is roughly based on adjusting the base parameter of RoPE to mitigate out-of-distribution (OOD) problems in position embedding. However, in this paper, we find that LLMs may obtain only a superficial long-context ability under the OOD theory. We revisit the role of RoPE in LLMs, propose a novel property of long-term decay, and derive that the base of RoPE bounds context length: there is an absolute lower bound on the base value required to obtain a given context length capability.
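To make the role of the base parameter concrete, here is a minimal NumPy sketch of RoPE itself (variable names are mine, not the paper's). Each consecutive pair of dimensions is rotated by an angle proportional to the position, with per-pair frequencies set by the base; a larger base slows the rotations, which is the knob long-context extensions turn.

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Apply rotary position embedding to vector x at position `pos`.
    Pair (x[2i], x[2i+1]) is rotated by angle pos * base**(-2i/d),
    so query/key dot products depend only on the relative offset."""
    d = x.shape[-1]
    freqs = base ** (-np.arange(0, d, 2) / d)   # per-pair rotation speeds
    theta = pos * freqs
    cos, sin = np.cos(theta), np.sin(theta)
    x2 = x.reshape(-1, 2)
    out = np.empty_like(x2)
    out[:, 0] = x2[:, 0] * cos - x2[:, 1] * sin
    out[:, 1] = x2[:, 0] * sin + x2[:, 1] * cos
    return out.reshape(d)

q = np.ones(8)
k = np.ones(8)
# Relative-position property: <rope(q, m), rope(k, n)> depends only on m - n.
a = rope(q, 5) @ rope(k, 3)
b = rope(q, 12) @ rope(k, 10)
print(np.allclose(a, b))  # True: both offsets equal 2
```

Increasing `base` compresses the range of angles seen at any given position, keeping long-range rotations in-distribution; the abstract's claim is that this alone is not sufficient, and that the base value itself lower-bounds the usable context length.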


Reviews: Meta Architecture Search

Neural Information Processing Systems

The authors propose Bayesian Meta Architecture Search (BASE), a method for meta-learning neural network architectures and their weights across tasks. The paper frames this as a Bayesian inference problem and employs Gumbel-Softmax, reparametrization, and optimization embedding, a variational inference method, to optimize a distribution over neural network architectures and their weights across different tasks.

Originality: Meta-learning neural network architectures is a very natural next step for NAS research, which has not been done so far (at least I'm not aware of any such work). It is not only very natural but also very important, as it makes NAS more scalable and of more practical relevance. The Bayesian view, however, is not really novel, but rather an obvious extension of [1]. In general, the related work section is very short and does not provide a proper summary of the current state of the art in this field of research.

Quality: BASE is well motivated and derived.


Learning-assisted Stochastic Capacity Expansion Planning: A Bayesian Optimization Approach

Brenner, Aron, Khorramfar, Rahman, Mallapragada, Dharik, Amin, Saurabh

arXiv.org Artificial Intelligence

Solving large-scale capacity expansion problems (CEPs) is central to cost-effective decarbonization of regional-scale energy systems. To ensure the intended outcomes of CEPs, modeling uncertainty due to weather-dependent variable renewable energy (VRE) supply and energy demand becomes crucially important. However, the resulting stochastic optimization models are often less computationally tractable than their deterministic counterparts. Here, we propose a learning-assisted approximate solution method to tractably solve two-stage stochastic CEPs. Our method identifies low-cost planning decisions by constructing and solving a sequence of tractable temporally aggregated surrogate problems. We adopt a Bayesian optimization approach to searching the space of time series aggregation hyperparameters and compute approximate solutions that minimize costs on a validation set of supply-demand projections. Importantly, we evaluate solved planning outcomes on a held-out set of test projections. We apply our approach to generation and transmission expansion planning for a joint power-gas system spanning New England. We show that our approach yields an estimated cost savings of up to 3.8% in comparison to benchmark time series aggregation approaches.
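The search the abstract describes can be sketched as a standard Bayesian optimization loop over a single aggregation hyperparameter. Everything below is a toy stand-in: the cost function replaces "solve the aggregated surrogate CEP and evaluate on validation projections", and the GP/acquisition choices (RBF kernel, lower confidence bound) are my assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical proxy for: solve the temporally aggregated surrogate CEP
# at this aggregation level, then score planning cost on validation
# supply-demand projections (here: a noisy 1-D cost curve).
def validation_cost(n_days):
    return (n_days - 30) ** 2 / 100 + rng.normal(scale=0.1)

def gp_posterior(X, y, Xs, ls=10.0, noise=0.01):
    """Zero-mean GP with RBF kernel: posterior mean/std at points Xs."""
    k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ls**2))
    K = k(X, X) + noise * np.eye(len(X))
    Ks, Kss = k(X, Xs), k(Xs, Xs)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = np.diag(Kss - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

candidates = np.arange(5.0, 80.0)        # aggregation levels to search
X = np.array([10.0, 60.0])
y = np.array([validation_cost(x) for x in X])
for _ in range(10):                       # BO loop: minimize validation cost
    mu, sd = gp_posterior(X, y, candidates)
    lcb = mu - 1.5 * sd                   # lower-confidence-bound acquisition
    x_next = candidates[np.argmin(lcb)]
    X = np.append(X, x_next)
    y = np.append(y, validation_cost(x_next))
print(X[np.argmin(y)])                    # best aggregation level found
```

The point mirrored from the paper is the split between roles: each candidate is scored on a *validation* set of projections during the search, while final planning outcomes would be judged on a held-out test set.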


Unsupervised Embedding Quality Evaluation

Tsitsulin, Anton, Munkhoeva, Marina, Perozzi, Bryan

arXiv.org Artificial Intelligence

Unsupervised learning has recently gained significantly in popularity, especially with deep learning-based approaches. Despite numerous successes and approaching supervised-level performance on a variety of academic benchmarks, it is still hard to train and evaluate self-supervised learning (SSL) models in practice due to the unsupervised nature of the problem. Even with networks trained in a supervised fashion, it is often unclear whether they will perform well when transferred to another domain. Past works are generally limited to assessing the amount of information contained in embeddings, which is most relevant for self-supervised learning of deep neural networks. This work follows a different approach: can we quantify how easy it is to linearly separate the data in a stable way? We survey the literature and uncover three methods that could potentially be used for evaluating the quality of representations. We also introduce one novel method based on recent advances in understanding the high-dimensional geometric structure of self-supervised learning. We conduct extensive experiments and study the properties of these metrics and of those introduced in previous work. Our results suggest that while there is no free lunch, there are metrics that can robustly estimate embedding quality in an unsupervised way.
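As one concrete example of the kind of label-free metric this line of work studies, here is a spectrum-based effective-rank score (a RankMe-style quantity from prior work; whether it is among the exact metrics evaluated here is an assumption). It detects dimensional collapse of an embedding without any labels:

```python
import numpy as np

def effective_rank(Z):
    """Effective rank of an embedding matrix: exponentiated entropy of
    the normalized singular-value distribution. Higher values mean the
    embedding spreads information across more directions of its space."""
    s = np.linalg.svd(Z - Z.mean(0), compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

rng = np.random.default_rng(0)
full = rng.normal(size=(1000, 64))                   # healthy 64-dim embedding
collapsed = full[:, :4] @ rng.normal(size=(4, 64))   # rank-4 collapsed embedding
print(effective_rank(full), effective_rank(collapsed))
```

A collapsed embedding scores near its true rank (here at most 4), while a well-spread one scores near the ambient dimension; no downstream labels are needed, which is the property such unsupervised quality metrics are after.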