From the Posterior Sampling Lemma, we know that if ψ is the distribution of f, then for any σ(H_t)-measurable function g, E[g(f) | H_t] = E[g(f_t) | H_t]. We can further know from the construction of the confidence set (c.f. …). This lemma is widely adopted in RL; proofs can be found in various previous works. Prior work that shares similarities with ours includes DPI [59] and GPS [31, 39], as dual policy optimization procedures are adopted.
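The posterior sampling property invoked above can be stated compactly (a restatement in the fragment's notation, where f_t is drawn from the posterior of f given the history H_t):

```latex
% Since f_t is sampled from the posterior of f given H_t, f and f_t are
% identically distributed conditional on H_t, so for any
% \sigma(\mathcal{H}_t)-measurable function g:
\mathbb{E}\!\left[\, g(f) \mid \mathcal{H}_t \,\right]
  \;=\; \mathbb{E}\!\left[\, g(f_t) \mid \mathcal{H}_t \,\right].
```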
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
Zhou, Zhanhui, Liu, Zhixuan, Liu, Jie, Dong, Zhichen, Yang, Chao, Qiao, Yu
Large language models are usually fine-tuned to align with human preferences. However, fine-tuning a large language model can be challenging. In this work, we introduce weak-to-strong search, framing the alignment of a large language model as a test-time greedy search to maximize the log-probability difference between small tuned and untuned models while sampling from the frozen large model. This method serves both as (i) a compute-efficient model up-scaling strategy that avoids directly tuning the large model and as (ii) an instance of weak-to-strong generalization that enhances a strong model with weak test-time guidance. Empirically, we demonstrate the flexibility of weak-to-strong search across different tasks. In controlled-sentiment generation and summarization, we use tuned and untuned gpt2s to effectively improve the alignment of large models without additional training. Crucially, in a more difficult instruction-following benchmark, AlpacaEval 2.0, we show that reusing off-the-shelf small models (e.g., zephyr-7b-beta and its untuned version) can significantly improve the length-controlled win rates of both white-box and black-box large models against gpt-4-turbo (e.g., 34.4 → 37.9 for Llama-3-70B-Instruct and 16.0 → 20.1 for gpt-3.5-turbo-instruct).
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.95)
- (3 more...)
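The test-time guided decoding described in the abstract above can be sketched in a few lines. The toy next-token distributions below are invented stand-ins for the frozen large model and the small tuned/untuned pair; a real implementation would score candidates with actual language models:

```python
import math

# Toy stand-ins for language models. Each maps a prefix (tuple of tokens)
# to a dict of next-token log-probabilities. These distributions are
# invented for illustration; in practice they come from real LMs.
def large_model(prefix):
    return {"good": math.log(0.5), "bad": math.log(0.4), "<eos>": math.log(0.1)}

def small_tuned(prefix):
    return {"good": math.log(0.7), "bad": math.log(0.2), "<eos>": math.log(0.1)}

def small_untuned(prefix):
    return {"good": math.log(0.3), "bad": math.log(0.6), "<eos>": math.log(0.1)}

def weak_to_strong_greedy(max_len=5, top_k=2):
    """Greedy test-time search: among the frozen large model's top-k
    next tokens, keep the one maximizing the small models' log-probability
    difference, log p_tuned - log p_untuned."""
    prefix = ()
    for _ in range(max_len):
        # Candidates are proposed by the frozen large model (never tuned).
        probs = large_model(prefix)
        cands = sorted(probs, key=probs.get, reverse=True)[:top_k]
        # The small tuned/untuned pair only re-ranks the candidates.
        tuned, untuned = small_tuned(prefix), small_untuned(prefix)
        token = max(cands, key=lambda t: tuned[t] - untuned[t])
        if token == "<eos>":
            break
        prefix += (token,)
    return prefix

print(weak_to_strong_greedy())  # ('good', 'good', 'good', 'good', 'good')
```

Because the large model only proposes candidates and the small pair only re-ranks them, the large model's weights are never touched, which is what makes the method a compute-efficient up-scaling strategy.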
Beyond Spatio-Temporal Representations: Evolving Fourier Transform for Temporal Graphs
Bastos, Anson, Singh, Kuldeep, Nadgeri, Abhishek, Singh, Manish, Suzumura, Toyotaro
We present the Evolving Graph Fourier Transform (EFT), the first invertible spectral transform that captures evolving representations on temporal graphs. We motivate our work by the inadequacy of existing methods for capturing the evolving graph spectra, which are also computationally expensive due to the temporal aspect along with the graph vertex domain. We view the problem as an optimization over the Laplacian of the continuous-time dynamic graph. Additionally, we propose pseudo-spectrum relaxations that decompose the transformation process, making it highly computationally efficient. As a reference implementation, we develop a simple neural model induced with EFT for capturing evolving graph spectra. We empirically validate our theoretical findings on a number of large-scale and standard temporal graph benchmarks and demonstrate that our model achieves state-of-the-art performance.
In numerous practical situations, graphs exhibit temporal characteristics, as seen in applications like social networks, citation graphs, and bank transactions, among others (Kazemi et al., 2020). These temporal graphs can be divided into two types: 1) temporal graphs with a constant graph structure (Grassi et al., 2017; Cao et al., 2020), and 2) temporal graphs with dynamic structures (Zhou et al., 2022; Bastos et al., 2023; da Xu et al., 2020). Our focus in this work is the latter case. Evolving graphs have been comprehensively studied from the spatio-temporal graph neural network (GNN) perspective, focusing on propagating local information (Pareja et al., 2020; Shi et al., 2021; Xiang et al., 2022; da Xu et al., 2020). Despite the success of spectral GNNs at capturing non-local dependencies in graph signals on static graphs (Wang & Zhang, 2022), they have not been applied to temporal graphs with evolving structure. To make spectral GNNs work for temporal graphs effectively and efficiently, an invertible transform is needed that jointly captures the evolving spectra along the graph vertex and time domains.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > Greece > Central Macedonia > Thessaloniki (0.04)
- (4 more...)
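For context, the static graph Fourier transform that EFT generalizes to the temporal setting can be illustrated in NumPy. This is a minimal sketch of the standard Laplacian-eigenbasis transform, not the authors' EFT implementation:

```python
import numpy as np

def graph_fourier(signal, adj):
    """Static graph Fourier transform: project a vertex signal onto the
    eigenbasis of the combinatorial graph Laplacian."""
    lap = np.diag(adj.sum(axis=1)) - adj          # L = D - A
    eigvals, eigvecs = np.linalg.eigh(lap)        # symmetric -> orthonormal U
    return eigvecs.T @ signal, eigvecs

def inverse_graph_fourier(coeffs, eigvecs):
    # Invertible because U is orthonormal: U @ U.T = I.
    return eigvecs @ coeffs

# A 3-node path graph and a vertex signal; applying this transform
# independently per snapshot is the naive temporal baseline, whereas EFT
# couples the vertex and time domains.
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
x = np.array([1.0, 2.0, 3.0])
coeffs, U = graph_fourier(x, adj)
x_rec = inverse_graph_fourier(coeffs, U)
print(np.allclose(x, x_rec))  # True
```

Invertibility here follows directly from the orthonormality of the Laplacian eigenbasis; the challenge EFT addresses is preserving such a property when the graph structure itself changes over time.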
Episodic-free Task Selection for Few-shot Learning
Episodic training is a mainstream training strategy for few-shot learning. In few-shot scenarios, however, this strategy is often inferior to some non-episodic training strategies, e.g., Neighbourhood Component Analysis (NCA), which challenges the principle that training conditions must match testing conditions. Thus, a question naturally arises: how do we search for episodic-free tasks that yield better few-shot learning? In this work, we propose a novel meta-training framework that goes beyond episodic training. In this framework, episodic tasks are not used directly for training but for evaluating the effectiveness of selected episodic-free tasks from a task set, which are performed to train the meta-learners. The selection criterion is designed with the affinity, which measures the degree to which the loss decreases when executing the target tasks after training with the selected tasks. In experiments, the training task set contains some promising types, e.g., contrastive learning and classification, and the target few-shot tasks are achieved with nearest-centroid classifiers on the miniImageNet, tieredImageNet and CIFAR-FS datasets. The experimental results demonstrate the effectiveness of our approach.
- Europe > Ukraine (0.15)
- Asia > Russia (0.15)
- North America > United States > Illinois (0.05)
- (5 more...)
- Health & Medicine (0.56)
- Government (0.50)
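The affinity criterion from the abstract above (how much the target-task loss drops after training on a candidate task) can be illustrated on a toy scalar problem; the quadratic loss, task gradients, and learning rate below are invented for illustration and are not the paper's setup:

```python
# Toy sketch of the affinity criterion. A candidate training task's
# affinity is the drop in target-task loss after one gradient step on
# that task: positive affinity means the task helps the target.

def affinity(theta, task_grad, target_loss, lr=0.1):
    before = target_loss(theta)
    theta_after = theta - lr * task_grad(theta)   # one step on the candidate task
    return before - target_loss(theta_after)      # loss decrease on the target

target_loss = lambda t: (t - 3.0) ** 2            # stand-in for the few-shot target task
aligned_grad = lambda t: 2 * (t - 3.0)            # task whose gradient matches the target
conflicting_grad = lambda t: 2 * (t + 3.0)        # task that pulls the parameter away

theta0 = 0.0
scores = {"aligned": affinity(theta0, aligned_grad, target_loss),
          "conflicting": affinity(theta0, conflicting_grad, target_loss)}
selected = max(scores, key=scores.get)            # pick the highest-affinity task
print(selected)  # aligned
```

In the framework, this selection runs over a task set that includes, e.g., contrastive and classification tasks, with the episodic target tasks used only for the evaluation step.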
AI Chatbot for Generating Episodic Future Thinking (EFT) Cue Texts for Health
We describe an AI-powered chatbot that aids health improvement by generating Episodic Future Thinking (EFT) cue texts intended to reduce delay discounting. In prior studies, EFT has been shown to address maladaptive health behaviors. In those studies, participants, working with researchers, vividly imagined future events and wrote descriptions that they would subsequently review frequently, to encourage a shift away from an inclination toward immediate rewards. This shift promotes behavior change, aiding in health tasks such as treatment adherence and lifestyle modifications. The AI chatbot is designed to guide users in generating personalized EFT cue texts, automating the current labor-intensive, interview-based process. This can enhance the efficiency of EFT interventions and make them more accessible, specifically targeting those with limited educational backgrounds or communication challenges. By leveraging AI for EFT interventions, we anticipate broadened access and improved health outcomes across diverse populations.
Bridging The Gap: Entailment Fused-T5 for Open-retrieval Conversational Machine Reading Comprehension
Zhang, Xiao, Huang, Heyan, Chi, Zewen, Mao, Xian-Ling
Open-retrieval conversational machine reading comprehension (OCMRC) simulates real-life conversational interaction scenes. Machines are required to make a decision of Yes/No/Inquire, or to generate a follow-up question when the decision is Inquire, based on retrieved rule texts, the user scenario, the user question, and the dialogue history. Recent studies explored methods to reduce the information gap between decision-making and question generation and thus improve the performance of generation. However, the information gap still exists because these pipeline structures are limited to three stages: decision-making, span extraction, and question rephrasing. Decision-making and generation reason separately, and the entailment reasoning utilized in decision-making is hard to share across all stages. To tackle the above problem, we propose a novel one-stage end-to-end framework, called Entailment Fused-T5 (EFT), to bridge the information gap between decision-making and generation in a global understanding manner. Extensive experimental results demonstrate that our proposed framework achieves new state-of-the-art performance on the OR-ShARC benchmark.
Figure 1: An example in the OCMRC dataset. Given the user scenario and user question, machines are required to first retrieve related rule texts in the knowledge database, and then make a decision of Yes/No/Inquire or generate a follow-up question when the decision is Inquire, based on the retrieved rule texts.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Beijing > Beijing (0.05)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Asia > China > Hong Kong (0.04)