zephyr
Stratospheric internet could finally start taking off this year
High-altitude platforms could help connect the more than 2 billion people around the world who are still offline. Today, an estimated 2.2 billion people lack internet access, but that number could drop this year thanks to tests of stratospheric airships, uncrewed aircraft, and other high-altitude platforms for internet delivery. Even with nearly 10,000 active Starlink satellites in orbit and the OneWeb constellation of 650 satellites, solid internet coverage is not a given across vast swathes of the planet. One of the most prominent efforts to plug the connectivity gap was Google X's Loon project. Launched in 2011, it aimed to deliver access using high-altitude balloons stationed above predetermined spots on Earth. But the project faced literal headwinds: the balloons kept drifting away and new ones had to be launched constantly, making the venture economically unfeasible.
- Asia > Japan (0.07)
- Europe > United Kingdom > Scotland (0.05)
- Asia > Indonesia (0.05)
- (5 more...)
- Telecommunications (1.00)
- Information Technology (1.00)
- Transportation > Air (0.94)
- Aerospace & Defense > Aircraft (0.69)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Communications > Social Media (0.98)
Camera Control at the Edge with Language Models for Scene Understanding
Buynitsky, Alexiy, Ehsani, Sina, Pallakonda, Bhanu, Mishra, Pragyana
In this paper, we present Optimized Prompt-based Unified System (OPUS), a framework that utilizes a Large Language Model (LLM) to control Pan-Tilt-Zoom (PTZ) cameras, providing contextual understanding of natural environments. To achieve this goal, the OPUS system improves cost-effectiveness by generating keywords from a high-level camera control API and transferring knowledge from larger closed-source language models to smaller ones through Supervised Fine-Tuning (SFT) on synthetic data. This enables efficient edge deployment while maintaining performance comparable to larger models like GPT-4. OPUS enhances environmental awareness by converting data from multiple cameras into textual descriptions for language models, eliminating the need for specialized sensory tokens. In benchmark testing, our approach significantly outperformed both traditional language model techniques and more complex prompting methods, achieving a 35% improvement over advanced techniques and a 20% higher task accuracy compared to closed-source models like Gemini Pro. The system demonstrates OPUS's capability to simplify PTZ camera operations through an intuitive natural language interface. This approach eliminates the need for explicit programming and provides a conversational method for interacting with camera systems, representing a significant advancement in how users can control and utilize PTZ camera technology.
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
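The OPUS abstract above describes converting multi-camera data into text for a language model and driving PTZ cameras through a keyword-based control API. A minimal sketch of that loop, where the scene description, the PAN/TILT/ZOOM keywords, and the reply parser are all illustrative assumptions rather than the paper's actual interface:

```python
# Hypothetical sketch of OPUS-style camera control: the LLM is prompted with a
# compact keyword API, and its free-text reply is parsed into concrete PTZ
# calls. Keyword names and the parser are assumptions, not the paper's API.
import re

def describe_scene(detections: list) -> str:
    """Convert detector output into a textual description for the LLM prompt."""
    return "The camera sees: " + ", ".join(detections) + "."

def parse_llm_reply(reply: str) -> list:
    """Extract (command, value) pairs such as 'PAN -25' from the model's text."""
    commands = []
    for keyword, value in re.findall(r"\b(PAN|TILT|ZOOM)\s+(-?\d+(?:\.\d+)?)", reply):
        commands.append((keyword, float(value)))
    return commands

# Example with a stubbed LLM reply asking the camera to center on a subject.
scene = describe_scene(["a person near the left edge", "a parked car"])
llm_reply = "To center the person: PAN -25 then ZOOM 2."
calls = parse_llm_reply(llm_reply)
```

In a real deployment the stubbed reply would come from the fine-tuned edge model, and each parsed pair would be dispatched to the camera's actual control endpoint.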
Minor Embedding for Quantum Annealing with Reinforcement Learning
Nembrini, Riccardo, Dacrema, Maurizio Ferrari, Cremonesi, Paolo
Quantum Annealing (QA) is a quantum computing paradigm for solving combinatorial optimization problems formulated as Quadratic Unconstrained Binary Optimization (QUBO) problems. An essential step in QA is minor embedding, which maps the problem graph onto the sparse topology of the quantum processor. This process is computationally expensive and scales poorly with increasing problem size and hardware complexity. Existing heuristics are often developed for specific problem graphs or hardware topologies and are difficult to generalize. Reinforcement Learning (RL) offers a promising alternative by treating minor embedding as a sequential decision-making problem, where an agent learns to construct minor embeddings by iteratively mapping the problem variables to the hardware qubits. We propose an RL-based approach to minor embedding using a Proximal Policy Optimization agent, testing its ability to embed both fully connected and randomly generated problem graphs on two hardware topologies, Chimera and Zephyr. The results show that our agent consistently produces valid minor embeddings with a reasonably efficient number of qubits, in particular on the more modern Zephyr topology. Our proposed approach also scales to moderate problem sizes and adapts well to different graph structures, highlighting RL's potential as a flexible and general-purpose framework for minor embedding in QA.
- Europe > Austria > Vienna (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > New York > Richmond County > New York City (0.04)
- (13 more...)
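The minor-embedding abstract above hinges on what makes an embedding "valid": each problem variable maps to a disjoint, connected chain of hardware qubits, and every problem edge must be realized by at least one hardware edge between the corresponding chains. A minimal validity checker over simple adjacency-set graphs (not tied to the Chimera or Zephyr topologies) sketches this:

```python
# Minimal minor-embedding validity check, assuming adjacency-set graph
# representations. This is an illustrative checker, not the paper's code.
from itertools import combinations

def chain_connected(chain, hw_adj):
    """Depth-first search restricted to the chain's qubits."""
    start = next(iter(chain))
    seen, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for nb in hw_adj[q]:
            if nb in chain and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == chain

def valid_embedding(problem_edges, embedding, hw_adj):
    chains = list(embedding.values())
    # Chains must be pairwise disjoint and internally connected.
    for a, b in combinations(chains, 2):
        if a & b:
            return False
    if not all(chain_connected(c, hw_adj) for c in chains):
        return False
    # Every problem edge needs a hardware edge between the two chains.
    for u, v in problem_edges:
        if not any(nb in embedding[v] for q in embedding[u] for nb in hw_adj[q]):
            return False
    return True

# Toy hardware graph: a 4-cycle 0-1-2-3-0.
hw_adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
# Embed a triangle (3 fully connected variables) using a 2-qubit chain for 'a'.
embedding = {"a": {0, 1}, "b": {2}, "c": {3}}
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
ok = valid_embedding(triangle, embedding, hw_adj)
```

The triangle cannot be embedded in the 4-cycle without a chain, which is exactly the qubit overhead the abstract's "reasonably efficient number of qubits" refers to.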
Harnessing AI Agents to Advance Research on Refugee Child Mental Health
Shrivastava, Aditya, Gupta, Komal, Arora, Shraddha
As the international refugee crisis deepens, millions of displaced children are exposed to extreme psychological trauma. This research proposes a compact, AI-based framework for processing unstructured refugee health data and distilling knowledge on child mental health. We compare two Retrieval-Augmented Generation (RAG) pipelines, Zephyr-7B-beta and DeepSeek R1-7B, to determine how well they process challenging humanitarian datasets while avoiding hallucination hazards. By combining cutting-edge AI methods with migration research and child psychology, this study presents a scalable strategy to help policymakers, mental health practitioners, and humanitarian agencies better assist displaced children and understand their mental wellbeing. Overall, both models performed adequately, but DeepSeek R1 significantly outperformed Zephyr, achieving an answer-relevance score of 0.91. Keywords: Retrieval-Augmented Generation, Zephyr-7B-beta, DeepSeek R1-7B, Answer Relevance, Hallucination, LLM as a Judge, Refugee Crises
- North America > United States (0.04)
- Asia > India > Haryana (0.04)
Online Preference Alignment for Language Models via Count-based Exploration
Bai, Chenjia, Zhang, Yang, Qiu, Shuang, Zhang, Qiaosheng, Xu, Kang, Li, Xuelong
Reinforcement Learning from Human Feedback (RLHF) has shown great potential in fine-tuning Large Language Models (LLMs) to align with human preferences. Existing methods perform preference alignment from a fixed dataset, which can be limited in data coverage, and the resulting reward model is hard to generalize to out-of-distribution responses. Thus, online RLHF is more desirable to empower the LLM to explore outside the support of the initial dataset by iteratively collecting prompt-response pairs. In this paper, we study the fundamental problem in online RLHF, i.e. how to explore for the LLM. We give a theoretical motivation under a linear reward assumption to show that an optimistic reward with an upper confidence bound (UCB) term leads to a provably efficient RLHF policy. Then, we reformulate our objective as direct preference optimization with an exploration term, where the UCB term can be converted to a count-based exploration bonus. We further propose a practical algorithm, named Count-based Online Preference Optimization (COPO), which leverages a simple coin-flip counting module to estimate the pseudo-count of a prompt-response pair in previously collected data. COPO encourages LLMs to balance exploration and preference optimization in an iterative manner, which enlarges the exploration space and the entire data coverage of iterative LLM policies. We conduct online RLHF experiments on Zephyr and Llama-3 models. The results on instruction-following and standard academic benchmarks show that COPO significantly increases performance.
- North America > United States > Washington > King County > Seattle (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- (2 more...)
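The COPO abstract above turns a UCB term into a count-based exploration bonus. A toy sketch of that idea, using an exact dictionary counter in place of the paper's coin-flip counting module, and a standard 1/sqrt(N+1) bonus shape as an illustrative assumption:

```python
# Count-based exploration bonus sketch: the bonus shrinks as a prompt-response
# pair is observed more often, steering the policy toward novel responses.
# Exact counting stands in for COPO's coin-flip counting approximation.
import math
from collections import Counter

counts = Counter()

def observe(prompt: str, response: str) -> None:
    counts[(prompt, response)] += 1

def exploration_bonus(prompt: str, response: str) -> float:
    """Bonus b(x, y) = 1 / sqrt(N(x, y) + 1) on the stored pseudo-count."""
    return 1.0 / math.sqrt(counts[(prompt, response)] + 1)

# A novel pair gets the full bonus; repeated pairs get progressively less.
first = exploration_bonus("q", "a")   # unseen pair
for _ in range(3):
    observe("q", "a")
later = exploration_bonus("q", "a")   # seen three times
```

In the full algorithm this bonus would be added to the preference-optimization objective for each generated response, which is what enlarges the data coverage across iterations.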
Mutagenesis screen to map the functionals of parameters of Large Language Models
Hu, Yue, Hu, Kai, Zhao, Patrick X., Khan, Javed, Xu, Chengming
Large Language Models (LLMs) have significantly advanced artificial intelligence, excelling in numerous tasks. Although the functionality of a model is inherently tied to its parameters, a systematic method for exploring the connections between the parameters and the functionality is lacking. Models sharing similar structure and parameter counts exhibit significant performance disparities across various tasks, prompting investigations into the varying patterns that govern their performance. We adopted a mutagenesis screen approach, inspired by methods used in biological studies, to investigate Llama2-7b and Zephyr. This technique involved mutating elements within the models' matrices to their maximum or minimum values to examine the relationship between model parameters and their functionalities. Our research uncovered multiple levels of fine structure within both models. Many matrices showed a mixture of maximum and minimum mutations following mutagenesis, but others were predominantly sensitive to one type. Notably, mutations that produced phenotypes, especially those with severe outcomes, tended to cluster along axes. Additionally, the locations of maximum and minimum mutations often displayed a complementary pattern on a matrix in both models, with the Gate matrix showing a unique two-dimensional asymmetry after rearrangement. In Zephyr, certain mutations consistently resulted in poetic or conversational rather than descriptive outputs. These "writer" mutations grouped according to the high-frequency initial word of the output, with a marked tendency to share the row coordinate even when they are in different matrices. Our findings affirm that the mutagenesis screen is an effective tool for deciphering the complexities of large language models and identifying unexpected ways to expand their potential, providing deeper insights into the foundational aspects of AI systems.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Maryland > Montgomery County > Bethesda (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
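The mutagenesis screen above perturbs individual matrix elements to the matrix-wide maximum or minimum and looks for output changes. A toy analogue, with a plain nested-list "weight matrix" and a scalar probe function standing in for a real LLM layer and its phenotype:

```python
# Toy mutagenesis screen: mutate each element to the matrix max or min and
# flag positions where a probe output shifts beyond a threshold. The probe
# and threshold are illustrative stand-ins for actual model evaluation.
import copy

def probe(matrix):
    """Stand-in 'phenotype': here simply the sum of the first row."""
    return sum(matrix[0])

def mutagenesis_screen(matrix, threshold=5.0):
    flat = [x for row in matrix for x in row]
    hi, lo = max(flat), min(flat)
    baseline = probe(matrix)
    hits = []
    for i, row in enumerate(matrix):
        for j, _ in enumerate(row):
            for label, value in (("max", hi), ("min", lo)):
                mutant = copy.deepcopy(matrix)
                mutant[i][j] = value  # point mutation to an extreme value
                if abs(probe(mutant) - baseline) >= threshold:
                    hits.append((i, j, label))
    return hits

weights = [[1.0, 2.0], [3.0, 10.0]]
hits = mutagenesis_screen(weights)
```

Only mutations in the first row move this probe, so the hits cluster along one axis, a miniature version of the axis-aligned clustering the abstract reports.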
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
Jiang, Liwei, Rao, Kavel, Han, Seungju, Ettinger, Allyson, Brahman, Faeze, Kumar, Sachin, Mireshghallah, Niloofar, Lu, Ximing, Sap, Maarten, Choi, Yejin, Dziri, Nouha
We introduce WildTeaming, an automatic LLM safety red-teaming framework that mines in-the-wild user-chatbot interactions to discover 5.7K unique clusters of novel jailbreak tactics, and then composes multiple tactics for systematic exploration of novel jailbreaks. Compared to prior work that performed red-teaming via recruited human workers, gradient-based optimization, or iterative revision with LLMs, our work investigates jailbreaks from chatbot users who were not specifically instructed to break the system. WildTeaming reveals previously unidentified vulnerabilities of frontier LLMs, resulting in up to 4.6x more diverse and successful adversarial attacks compared to state-of-the-art jailbreak methods. While many datasets exist for jailbreak evaluation, very few open-source datasets exist for jailbreak training, as safety training data has been closed even when model weights are open. With WildTeaming we create WildJailbreak, a large-scale open-source synthetic safety dataset with 262K vanilla (direct request) and adversarial (complex jailbreak) prompt-response pairs. To mitigate exaggerated safety behaviors, WildJailbreak provides two contrastive types of queries: 1) harmful queries (vanilla & adversarial) and 2) benign queries that resemble harmful queries in form but contain no harm. As WildJailbreak considerably upgrades the quality and scale of existing safety resources, it uniquely enables us to examine the scaling effects of data and the interplay of data properties and model capabilities during safety training. Through extensive experiments, we identify the training properties that enable an ideal balance of safety behaviors: appropriate safeguarding without over-refusal, effective handling of vanilla and adversarial queries, and minimal, if any, decrease in general capabilities. All components of WildJailbreak contribute to achieving balanced safety behaviors of models.
- North America > United States (1.00)
- Africa > South Africa (0.04)
- Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
- (5 more...)
- Media (1.00)
- Law > Civil Rights & Constitutional Law (1.00)
- Information Technology > Security & Privacy (1.00)
- (6 more...)
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Zhang, Shenao, Yu, Donghan, Sharma, Hiteshi, Yang, Ziyi, Wang, Shuohang, Hassan, Hany, Wang, Zhaoran
Preference optimization, particularly through Reinforcement Learning from Human Feedback (RLHF), has achieved significant success in aligning Large Language Models (LLMs) to adhere to human intentions. Unlike offline alignment with a fixed dataset, online feedback collection from humans or AI on model generations typically leads to more capable reward models and better-aligned LLMs through an iterative process. However, achieving a globally accurate reward model requires systematic exploration to generate diverse responses that span the vast space of natural language. Random sampling from standard reward-maximizing LLMs alone is insufficient to fulfill this requirement. To address this issue, we propose a bilevel objective optimistically biased towards potentially high-reward responses to actively explore out-of-distribution regions. By solving the inner-level problem with the reparameterized reward function, the resulting algorithm, named Self-Exploring Language Models (SELM), eliminates the need for a separate reward model and iteratively updates the LLM with a straightforward objective. Compared to Direct Preference Optimization (DPO), the SELM objective reduces indiscriminate favoring of unseen extrapolations and enhances exploration efficiency. Our experimental results demonstrate that when finetuned on Zephyr-7B-SFT and Llama-3-8B-Instruct models, SELM significantly boosts the performance on instruction-following benchmarks such as MT-Bench and AlpacaEval 2.0, as well as various standard academic benchmarks in different settings. Our code and models are available at https://github.com/shenao-zhang/SELM.
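The SELM abstract above describes optimistically biasing a DPO-style objective toward potentially high-reward responses. A scalar sketch of that idea, computing the standard DPO loss for one preference pair and then adding an optimism term on the chosen response's log-probability; all log-probabilities and the alpha weight are illustrative numbers, not the paper's exact formulation:

```python
# Scalar sketch: DPO loss plus an optimism bonus on the chosen response,
# biasing updates toward assigning it more probability mass.
import math

def dpo_loss(lp_w, lp_l, ref_w, ref_l, beta=0.1):
    """Standard DPO loss, -log(sigmoid(margin)), for one (chosen, rejected)
    pair of policy and reference log-probabilities."""
    margin = beta * ((lp_w - ref_w) - (lp_l - ref_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

def selm_loss(lp_w, lp_l, ref_w, ref_l, beta=0.1, alpha=0.01):
    """DPO loss minus an optimism term alpha * log pi(chosen | x)."""
    return dpo_loss(lp_w, lp_l, ref_w, ref_l, beta) - alpha * lp_w

base = dpo_loss(-10.0, -12.0, -11.0, -11.0)
optimistic = selm_loss(-10.0, -12.0, -11.0, -11.0)
```

Because log-probabilities are negative, the optimism term raises the loss less for responses the policy already favors, so minimizing it pushes probability toward the sampled high-reward regions.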
SLPL SHROOM at SemEval2024 Task 06: A comprehensive study on models ability to detect hallucination
Fallah, Pouya, Gooran, Soroush, Jafarinasab, Mohammad, Sadeghi, Pouya, Farnia, Reza, Tarabkhah, Amirreza, Taghavi, Zainab Sadat, Sameti, Hossein
Language models, particularly generative models, are susceptible to hallucinations, generating outputs that contradict factual knowledge or the source text. This study explores methods for detecting hallucinations in three SemEval-2024 Task 6 tasks: Machine Translation, Definition Modeling, and Paraphrase Generation. We evaluate two methods: semantic similarity between the generated text and factual references, and an ensemble of language models that judge each other's outputs. Our results show that semantic similarity achieves moderate accuracy and correlation scores in trial data, while the ensemble method offers insights into the complexities of hallucination detection but falls short of expectations. This work highlights the challenges of hallucination detection and underscores the need for further research in this critical area.
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.81)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.73)
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.49)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.49)
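The first method in the hallucination-detection abstract above scores semantic similarity between generated text and a factual reference. A minimal stand-in using cosine similarity over bag-of-words vectors with a threshold, where the paper would use proper sentence embeddings; the 0.5 threshold is an illustrative assumption:

```python
# Bag-of-words cosine similarity as a crude proxy for embedding similarity;
# outputs far from the reference are flagged as potential hallucinations.
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def flag_hallucination(generated: str, reference: str, threshold: float = 0.5) -> bool:
    """Flag the output when it drifts too far from the factual reference."""
    return cosine(generated, reference) < threshold

faithful = flag_hallucination("the cat sat on the mat", "the cat sat on the mat")
drifted = flag_hallucination("quantum finance rumors", "the cat sat on the mat")
```

Word-overlap cosine misses paraphrases entirely, which is one reason embedding-based similarity only reaches the moderate accuracy the abstract reports.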
WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models
Molenda, Piotr, Liusie, Adian, Gales, Mark J. F.
Watermarking generative-AI systems, such as LLMs, has gained considerable interest, driven by their enhanced capabilities across a wide range of tasks. Although current approaches have demonstrated that small, context-dependent shifts in the word distributions can be used to apply and detect watermarks, there has been little work analyzing the impact these perturbations have on the quality of generated texts. Balancing high detectability with minimal performance degradation is crucial when selecting the appropriate watermarking setting; therefore, this paper proposes a simple analysis framework where comparative assessment, a flexible NLG evaluation framework, is used to assess the quality degradation caused by a particular watermark setting. We demonstrate that our framework provides easy visualization of the quality-detection trade-off of watermark settings, enabling a simple solution to find an LLM watermark operating point that provides a well-balanced performance. This approach is applied to two different summarization systems and a translation system, enabling cross-model analysis for a task, and cross-task analysis.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
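The context-dependent distribution shifts mentioned in the WaterJudge abstract are commonly implemented as a "green list" scheme: hash the previous token to select a pseudorandom half of the vocabulary, bias generation toward it, and detect by counting the green-token fraction. A sketch of that detection side, illustrating the detectability half of the trade-off the paper measures (WaterJudge itself scores the quality half via comparative NLG assessment); the toy vocabulary and always-green generator are assumptions:

```python
# Green-list watermark sketch: the previous token seeds a pseudorandom
# partition of the vocabulary, and detection counts green-token hits.
import random

VOCAB = [f"tok{i}" for i in range(100)]

def green_list(prev_token: str) -> set:
    """Pseudorandom half of the vocabulary, seeded by the previous token."""
    rng = random.Random(hash(prev_token) % (2 ** 32))
    return set(rng.sample(VOCAB, len(VOCAB) // 2))

def green_fraction(tokens: list) -> float:
    """Detection statistic: fraction of tokens drawn from their green list."""
    hits = sum(tokens[i] in green_list(tokens[i - 1]) for i in range(1, len(tokens)))
    return hits / (len(tokens) - 1)

# A fully watermarked sequence always picks a token from the green list.
tokens = ["tok0"]
for _ in range(50):
    tokens.append(sorted(green_list(tokens[-1]))[0])
watermarked_score = green_fraction(tokens)
```

Unwatermarked text would score near 0.5 on this statistic; tightening the bias raises detectability but shifts the word distribution further, which is exactly the quality cost WaterJudge quantifies.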