Philippine Sea
Pigs have been island hopping for 50,000 years
With human help, the mammals can defy 'the world's most fundamental natural boundaries.' Despite not exactly being world-renowned swimmers, pigs have spread across the Asia-Pacific region for thousands of years. Using genetic and archeological data from more than 700 pigs, a team of scientists documented how people helped the mammals make their way across thousands of miles. "This research reveals what happens when people transport animals enormous distances, across one of the world's most fundamental natural boundaries," evolutionary geneticist and study co-author Dr. David Stanton of the University of Cardiff and Queen Mary University of London said in a statement. "These movements led to pigs with a melting pot of ancestries. These patterns were technically very difficult to disentangle, but have ultimately helped us understand how and why animals came to be distributed across the Pacific islands."
Toward Copyright Integrity and Verifiability via Multi-Bit Watermarking for Intelligent Transportation Systems
Wang, Yihao, Li, Lingxiao, Tang, Yifan, Zhang, Ru, Liu, Jianyi
Intelligent transportation systems (ITS) use advanced technologies such as artificial intelligence to significantly improve the efficiency of traffic flow management and to promote the intelligent development of the transportation industry. However, if the data in ITS is attacked, for example by tampering or forgery, public safety is endangered and social losses follow. Therefore, this paper proposes ITSmark, a watermarking scheme that verifies copyright integrity in response to the needs of ITS. ITSmark focuses on functions such as extracting watermarks, verifying permission, and tracing tampered locations. The scheme uses the copyright information to build the multi-bit space and divides this space into multiple segments. These segments are assigned to tokens, so the next token is determined by the segment that contains the copyright. In this way, the obtained data contains the custom watermark. To ensure authorization, key parameters are encrypted during copyright embedding to obtain cipher data. Only a user who holds the correct cipher data and private key can fully extract the watermark. Experiments show that ITSmark surpasses baseline performances in data quality, extraction accuracy, and unforgeability. It also shows unique capabilities of permission verification and tampered location tracing, which ensure the security of extraction and the reliability of copyright verification. Furthermore, ITSmark can customize the watermark embedding position and proportion according to user needs, making embedding more flexible.
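The segment idea described above can be illustrated with a toy sketch. This is not the ITSmark algorithm: the keyed hash, the two bits per token, the `<s>` start marker, and the names `segment_of`, `embed`, and `extract` are all assumptions made for illustration, and the encryption and permission-verification steps are not modeled. Each candidate token is assigned to one of four segments, and the embedder picks the highest-ranked candidate whose segment matches the next two payload bits.

```python
import hashlib

BITS_PER_TOKEN = 2                  # hypothetical payload carried by each token
NUM_SEGMENTS = 1 << BITS_PER_TOKEN  # 4 segments, one per 2-bit value


def segment_of(key, prev_token, token):
    """Keyed, context-dependent assignment of a token to a segment."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode("utf-8")).digest()
    return digest[0] % NUM_SEGMENTS


def embed(bits, candidates_per_step, key, start="<s>"):
    """Choose one token per step so that its segment encodes the next bit chunk."""
    chunks = [bits[i:i + BITS_PER_TOKEN] for i in range(0, len(bits), BITS_PER_TOKEN)]
    if len(bits) % BITS_PER_TOKEN or len(chunks) != len(candidates_per_step):
        raise ValueError("need one candidate list per full bit chunk")
    tokens, prev = [], start
    for chunk, ranked in zip(chunks, candidates_per_step):
        target = int(chunk, 2)
        # Take the highest-ranked candidate whose segment matches the chunk.
        token = next((t for t in ranked if segment_of(key, prev, t) == target), None)
        if token is None:
            raise ValueError("no candidate falls in the required segment")
        tokens.append(token)
        prev = token
    return tokens


def extract(tokens, key, start="<s>"):
    """Recompute each token's segment and read the payload bits back."""
    bits, prev = [], start
    for token in tokens:
        bits.append(format(segment_of(key, prev, token), f"0{BITS_PER_TOKEN}b"))
        prev = token
    return "".join(bits)


# Stand-in for a language model's ranked next-token candidates (hypothetical).
vocab = [f"tok{i}" for i in range(64)]
payload = "11010010"
marked = embed(payload, [vocab] * 4, key="demo-key")
recovered = extract(marked, key="demo-key")
```

In this sketch extraction only needs the key and the token sequence; a real scheme would also have to handle text quality and robustness to edits.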
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
Jin, Zhuoran, Yuan, Hongbang, Men, Tianyi, Cao, Pengfei, Chen, Yubo, Liu, Kang, Zhao, Jun
Despite the significant progress made by existing retrieval augmented language models (RALMs) in providing trustworthy responses and grounding in reliable sources, they often overlook effective alignment with human preferences. In the alignment process, reward models (RMs) act as a crucial proxy for human values to guide optimization. However, it remains unclear how to evaluate and select a reliable RM for preference alignment in RALMs. To this end, we propose RAG-RewardBench, the first benchmark for evaluating RMs in RAG settings. First, we design four crucial and challenging RAG-specific scenarios to assess RMs, including multi-hop reasoning, fine-grained citation, appropriate abstain, and conflict robustness. Then, we incorporate 18 RAG subsets, six retrievers, and 24 RALMs to increase the diversity of data sources. Finally, we adopt an LLM-as-a-judge approach to improve preference annotation efficiency and effectiveness, exhibiting a strong correlation with human annotations. Based on the RAG-RewardBench, we conduct a comprehensive evaluation of 45 RMs and uncover their limitations in RAG scenarios. We also reveal that existing trained RALMs show almost no improvement in preference alignment, highlighting the need for a shift towards preference-aligned training. We release our benchmark and code publicly at https://huggingface.co/datasets/jinzhuoran/RAG-RewardBench/ for future work.
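One common way to score a reward model on preference data is pairwise accuracy: the fraction of pairs in which the model rates the preferred response above the rejected one. The abstract does not state the benchmark's exact metric or data schema, so the sketch below is an assumption throughout: the field names (`scenario`, `prompt`, `chosen`, `rejected`), the function `pairwise_accuracy`, and the deliberately naive `toy_reward` are hypothetical.

```python
from collections import defaultdict


def pairwise_accuracy(reward_fn, examples):
    """Per-scenario share of pairs where the chosen response scores higher."""
    hits, totals = defaultdict(int), defaultdict(int)
    for ex in examples:
        totals[ex["scenario"]] += 1
        chosen_score = reward_fn(ex["prompt"], ex["chosen"])
        rejected_score = reward_fn(ex["prompt"], ex["rejected"])
        if chosen_score > rejected_score:
            hits[ex["scenario"]] += 1
    return {name: hits[name] / totals[name] for name in totals}


def toy_reward(prompt, response):
    """Naive stand-in for a reward model: counts citation markers."""
    return response.count("[")


examples = [
    {"scenario": "fine-grained citation", "prompt": "Capital of France?",
     "chosen": "Paris is the capital [1].", "rejected": "Paris is the capital."},
    {"scenario": "appropriate abstain", "prompt": "When was it founded?",
     "chosen": "The documents do not say.", "rejected": "It was 1999 [2]."},
    {"scenario": "fine-grained citation", "prompt": "Summarize A and B.",
     "chosen": "A [1] B [2]", "rejected": "A B [1]"},
]

scores = pairwise_accuracy(toy_reward, examples)
# scores == {"fine-grained citation": 1.0, "appropriate abstain": 0.0}
```

The toy reward does well on citation pairs and fails on the abstain pair, which is the kind of scenario-specific weakness a per-scenario breakdown is meant to expose.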
After months fighting Houthis on the USS Eisenhower, sailors face a new kind of sea threat
Sailors aboard the aircraft carrier USS Dwight D. Eisenhower and its accompanying warships have spent four months straight at sea defending against ballistic missiles and flying attack drones fired by Iranian-backed Houthis, and are now more regularly also defending against a new threat -- fast unmanned vessels that are fired at them through the water. While the Houthis have launched unmanned surface vessels, or USVs, in the past against Saudi coalition forces that have intervened in Yemen's civil war, they were used for the first time against U.S. military and commercial vessels in the Red Sea on Jan. 4. In the weeks since, the Navy has had to intercept and destroy multiple USVs. It's "more of an unknown threat that we don't have a lot of intel on, that could be extremely lethal -- an unmanned surface vessel," said Rear Adm. Marc Miguez, commander of Carrier Strike Group Two, of which the Eisenhower is the flagship. The Houthis "have ways of obviously controlling them just like they do the (unmanned aerial vehicles), and we have very little fidelity as to all the stockpiles of what they have USV-wise," Miguez said.
A Closer Look at the Limitations of Instruction Tuning
Ghosh, Sreyan, Evuru, Chandra Kiran Reddy, Kumar, Sonal, S, Ramaneswaran, Aneja, Deepali, Jin, Zeyu, Duraiswami, Ramani, Manocha, Dinesh
Instruction Tuning (IT), the process of training large language models (LLMs) using instruction-response pairs, has emerged as the predominant method for transforming base pre-trained LLMs into open-domain conversational agents. While IT has achieved notable success and widespread adoption, its limitations and shortcomings remain underexplored. In this paper, through rigorous experiments and an in-depth analysis of the changes LLMs undergo through IT, we reveal various limitations of IT. In particular, we show that (1) IT fails to enhance knowledge or skills in LLMs. LoRA fine-tuning is limited to learning response initiation and style tokens, and full-parameter fine-tuning leads to knowledge degradation. (2) Copying response patterns from IT datasets derived from knowledgeable sources leads to a decline in response quality. (3) Full-parameter fine-tuning increases hallucination by inaccurately borrowing tokens from conceptually similar instances in the IT dataset for generating responses. (4) Popular methods to improve IT do not lead to performance improvements over a simple LoRA fine-tuned model. Our findings reveal that responses generated solely from pre-trained knowledge consistently outperform responses by models that learn any form of new knowledge from IT on open-source datasets. We hope the insights and challenges revealed inspire future work.
Pipeline and Dataset Generation for Automated Fact-checking in Almost Any Language
Drchal, Jan, Ullrich, Herbert, Mlynář, Tomáš, Moravec, Václav
This article presents a pipeline for automated fact-checking leveraging publicly available Language Models and data. The objective is to assess the accuracy of textual claims using evidence from a ground-truth evidence corpus. The pipeline consists of two main modules -- the evidence retrieval and the claim veracity evaluation. Our primary focus is on the ease of deployment in various languages that remain unexplored in the field of automated fact-checking. Unlike most similar pipelines, which work with evidence sentences, our pipeline processes data on a paragraph level, simplifying the overall architecture and data requirements. Given the high cost of annotating language-specific fact-checking training data, our solution builds on the Question Answering for Claim Generation (QACG) method, which we adapt and use to generate the data for all models of the pipeline. Our strategy enables the introduction of new languages through machine translation of only two fixed datasets of moderate size. Subsequently, any number of training samples can be generated based on an evidence corpus in the target language. We provide open access to all data and fine-tuned models for Czech, English, Polish, and Slovak pipelines, as well as to our codebase that may be used to reproduce the results. We comprehensively evaluate the pipelines for all four languages, including human annotations and per-sample difficulty assessment using Pointwise V-information. The presented experiments are based on full Wikipedia snapshots to promote reproducibility. To facilitate implementation and user interaction, we develop the FactSearch application featuring the proposed pipeline and the preliminary feedback on its performance.
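The two-module structure described above (paragraph-level evidence retrieval followed by claim veracity evaluation) can be sketched as follows. This is only an illustration of the data flow: the word-overlap retriever, the rule-based `toy_classify`, the label strings, and all function names are hypothetical stand-ins for the fine-tuned models the authors use.

```python
import re


def tokenize(text):
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(claim, paragraphs, k=2):
    """Module 1: rank whole paragraphs by word overlap with the claim."""
    claim_tokens = tokenize(claim)
    ranked = sorted(paragraphs,
                    key=lambda p: len(claim_tokens & tokenize(p)),
                    reverse=True)
    return ranked[:k]


def toy_classify(claim, evidence):
    """Module 2 stand-in: 'supported' if one paragraph covers every claim word."""
    claim_tokens = tokenize(claim)
    covered = any(claim_tokens <= tokenize(p) for p in evidence)
    return "supported" if covered else "not enough info"


def check_claim(claim, paragraphs, classify=toy_classify, k=2):
    """Run retrieval, then pass the claim and its evidence to the classifier."""
    evidence = retrieve(claim, paragraphs, k)
    return classify(claim, evidence), evidence


corpus = [
    "Prague is the capital of the Czech Republic.",
    "Warsaw is the capital of Poland.",
    "Bratislava lies on the Danube river.",
]

verdict_a, evidence_a = check_claim("Warsaw is the capital of Poland", corpus)
verdict_b, evidence_b = check_claim("Warsaw lies on the Danube", corpus)
```

Because the classifier is passed in as a callable, either module can be swapped for a trained model without changing `check_claim`.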
Large Language Models for User Interest Journeys
Christakopoulou, Konstantina, Lalama, Alberto, Adams, Cj, Qu, Iris, Amir, Yifat, Chucri, Samer, Vollucci, Pierce, Soldo, Fabio, Bseiso, Dina, Scodel, Sarah, Dixon, Lucas, Chi, Ed H., Chen, Minmin
Large language models (LLMs) have shown impressive capabilities in natural language understanding and generation. Their potential for deeper user understanding and improved personalized user experience on recommendation platforms is, however, largely untapped. This paper aims to address this gap. Recommender systems today capture users' interests through encoding their historical activities on the platforms. The generated user representations are hard to examine or interpret. On the other hand, if we were to ask people about interests they pursue in their life, they might talk about their hobbies, like "I just started learning the ukulele," or their relaxation routines, e.g., "I like to watch Saturday Night Live," or "I want to plant a vertical garden." We argue, and demonstrate through extensive experiments, that LLMs as foundation models can reason through user activities, and describe their interests in nuanced and interesting ways, similar to how a human would. We define interest journeys as the persistent and overarching user interests, in other words, the non-transient ones. These are the interests that we believe will benefit most from the nuanced and personalized descriptions. We introduce a framework in which we first perform personalized extraction of interest journeys, and then summarize the extracted journeys via LLMs, using techniques like few-shot prompting, prompt-tuning and fine-tuning. Together, our results in prompting LLMs to name extracted user journeys in a large-scale industrial platform demonstrate great potential of these models in providing deeper, more interpretable, and controllable user understanding. We believe LLM-powered user understanding can be a stepping stone to entirely new user experiences on recommendation platforms that are journey-aware, assistive, and enabling frictionless conversation down the line.
Evaluation of DRAIN, a Deep-Learning Approach to Rain Retrieval from GPM Passive Microwave Radiometer
Viltard, Nicolas, Sambath, Vibolroth, Lepetit, Pierre, Martini, Audrey, Barthès, Laurent, Mallet, Cécile
LATMOS-IPSL, Université Paris-Saclay, UVSQ, CNRS, 78280, Guyancourt, France; *Météo-France, Avenue Coriolis, Toulouse. Abstract -- Retrieval of rain from Passive Microwave radiometers data has been a challenge ever since the launch of the first Defense Meteorological Satellite [...]. Enormous progress has been made since the launch of the Tropical Rainfall Measuring Mission (TRMM) in 1997, but until recently the data were processed pixel-by-pixel or [...]. Deep learning has obtained remarkable improvement in the computer vision field, and offers a whole new way to tackle the rain retrieval problem. The Global Precipitation Measurement (GPM) Core satellite carries, similarly to TRMM, a passive microwave [...] [...] from about 52,000 images to about 103,000, allowing us to build a database of 70,000 images for training and 33,000 images for validation. [...] years 2014 to 2018 and a few months from 2020 and 2021 are used, but the whole year 2019 was kept separate for the performance assessment (test), and most results presented hereafter are computed for that year. [...] large database is meant to dampen the effects of seasonal and interannual variability of rain. Second, DRAIN now retrieves a set of 99 quantiles instead of a simple averaged rain rate as in [1]. These quantiles represent the probability that the rain rate is below a certain threshold.
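The reading of quantiles as "the probability that the rain rate is below a certain threshold" can be made concrete with a short sketch. It assumes the 99 quantiles correspond to probability levels 0.01 through 0.99, which the text does not state explicitly; the function name `prob_below`, the linear interpolation between neighbouring quantiles, and the example values are hypothetical and do not come from DRAIN.

```python
from bisect import bisect_right

# Assumed probability levels for the 99 quantiles: 0.01, 0.02, ..., 0.99.
LEVELS = [i / 100 for i in range(1, 100)]


def prob_below(threshold, quantile_values):
    """Estimate P(rain rate <= threshold) from 99 increasing quantile values."""
    if len(quantile_values) != len(LEVELS):
        raise ValueError("expected one value per probability level")
    if threshold < quantile_values[0]:
        return 0.0          # true probability is below 0.01; reported as 0.0 here
    if threshold >= quantile_values[-1]:
        return LEVELS[-1]   # true probability is at least 0.99
    j = bisect_right(quantile_values, threshold)  # first value above the threshold
    lo_v, hi_v = quantile_values[j - 1], quantile_values[j]
    lo_p, hi_p = LEVELS[j - 1], LEVELS[j]
    # Linear interpolation between the two neighbouring quantiles.
    return lo_p + (hi_p - lo_p) * (threshold - lo_v) / (hi_v - lo_v)


# Made-up quantile values for one pixel, in mm/h: 0.1, 0.2, ..., 9.9.
pixel_quantiles = [i / 10 for i in range(1, 100)]

p_5mm = prob_below(5.0, pixel_quantiles)   # the 0.50 quantile equals 5.0 mm/h
median = pixel_quantiles[49]               # value at level 0.50
```

Compared with a single averaged rain rate, a set of quantiles like this lets a user ask threshold questions directly, such as how likely it is that the rate stays under 5 mm/h.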
East China Sea: Japanese Fighter Jets Intercept Three PLA Drones Over The Week
The Japan Air Self Defense Force (JASDF) had to scramble fighter jets three times over the week to monitor Chinese drones that flew over the East China Sea and the strategic Miyako Strait, which opens to the Philippine Sea and the broader Western Pacific Ocean. A People's Liberation Army Tengden TB-001 Scorpion medium-altitude, long-endurance (MALE) drone flew into the East China Sea northwest of Okinawa on Tuesday, prompting JASDF to send fighters to investigate its activities, reported The Drive. A PLA Harbin BZK-005 MALE drone then flew a sortie back and forth through the Miyako Strait on Wednesday, followed by another TB-001 through the Miyako Strait, which lies southwest of the island of Okinawa, on Thursday. According to Japanese officials, one Shaanxi Y-8Q maritime patrol plane and one Shaanxi Y-9JB electronic intelligence aircraft accompanied the drones on their flights on the last two days. This comes as a testament to the PLA's growing unmanned aircraft capabilities and its focus on deploying increasingly sophisticated unmanned aerial vehicles.