hecker
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.74)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)
Enhancing Health Fact-Checking with LLM-Generated Synthetic Data
Zhang, Jingze, Qian, Jiahe, Zhou, Yiliang, Peng, Yifan
Fact-checking for health-related content is challenging due to the limited availability of annotated training data. In this study, we propose a synthetic data generation pipeline that leverages large language models (LLMs) to augment training data for health-related fact checking. In this pipeline, we summarize source documents, decompose the summaries into atomic facts, and use an LLM to construct sentence-fact entailment tables. From the entailment relations in the table, we further generate synthetic text-claim pairs with binary veracity labels. These synthetic data are then combined with the original data to fine-tune a BERT-based fact-checking model. Evaluation on two public datasets, PubHealth and SciFact, shows that our pipeline improved F1 scores by up to 0.019 and 0.049, respectively, compared to models trained only on the original data.
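The step from an entailment table to labeled text-claim pairs can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the table layout (a dict keyed by sentence and atomic fact), the function name `pairs_from_entailment_table`, and the toy sentences are all assumptions made for the example. In the pipeline described above, an LLM would fill the table.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (sentence, atomic fact) -> True if the sentence entails the fact.
EntailmentTable = Dict[Tuple[str, str], bool]


def pairs_from_entailment_table(table: EntailmentTable) -> List[Tuple[str, str, int]]:
    """Turn each table cell into a (text, claim, label) triple.

    The label is 1 when the text entails the claim and 0 otherwise
    (an assumed mapping onto the binary veracity labels).
    """
    return [
        (sentence, fact, int(entailed))
        for (sentence, fact), entailed in table.items()
    ]


# Toy, made-up table used only to show the data shape.
table: EntailmentTable = {
    ("The study enrolled 120 adults.", "The study enrolled adults."): True,
    ("The study enrolled 120 adults.", "The study enrolled children."): False,
}
synthetic_pairs = pairs_from_entailment_table(table)
```

Under these assumptions, each triple could then be appended to the original training set before fine-tuning the classifier.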
- Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.14)
- Asia > Middle East > Jordan (0.04)
- South America > Brazil (0.04)
- Personal > Obituary (0.46)
- Research Report > New Finding (0.35)
- Health & Medicine > Therapeutic Area (0.94)
- Media (0.93)
- Leisure & Entertainment > Sports (0.69)
RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation
Ru, Dongyu, Qiu, Lin, Hu, Xiangkun, Zhang, Tianhang, Shi, Peng, Chang, Shuaichen, Jiayang, Cheng, Wang, Cunxiang, Sun, Shichao, Li, Huanyu, Zhang, Zizhao, Wang, Binjie, Jiang, Jiarong, He, Tong, Wang, Zhiguo, Liu, Pengfei, Zhang, Yue, Zhang, Zheng
Despite Retrieval-Augmented Generation (RAG) showing promising capability in leveraging external knowledge, a comprehensive evaluation of RAG systems is still challenging due to the modular nature of RAG, evaluation of long-form responses and reliability of measurements. In this paper, we propose a fine-grained evaluation framework, RAGChecker, that incorporates a suite of diagnostic metrics for both the retrieval and generation modules. Meta evaluation verifies that RAGChecker has significantly better correlations with human judgments than other evaluation metrics. Using RAGChecker, we evaluate 8 RAG systems and conduct an in-depth analysis of their performance, revealing insightful patterns and trade-offs in the design choices of RAG architectures. The metrics of RAGChecker can guide researchers and practitioners in developing more effective RAG systems. This work has been open sourced at https://github.com/amazon-science/RAGChecker.
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
US military resumes drone, crewed aircraft operations in post-coup Niger
The United States military has resumed operations in Niger, flying drones and other aircraft out of airbases in the country more than a month after a coup halted activities, the head of Air Forces in Europe and Air Forces Africa said. Since the July coup that removed President Mohamed Bazoum, the approximately 1,100 US soldiers deployed in the West African country have been confined to their military bases. General James Hecker said on Wednesday that negotiations with the military rulers of Niger resulted in some intelligence and surveillance missions resuming. "For a while, we weren't doing any missions on the bases, they pretty much closed down the airfields," Hecker told reporters at the annual Air and Space Forces Association convention. "Through the diplomatic process, we are now doing, I wouldn't say 100 percent of the missions that we were doing before, but we're doing a large amount of missions that we're doing before," he said.
- North America > United States (1.00)
- Africa > Niger > Agadez > Agadez (0.11)
- Europe > France (0.09)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models
Li, Miaoran, Peng, Baolin, Zhang, Zhu
Fact-checking is an essential task in NLP that is commonly utilized for validating the factual accuracy of claims. Prior work has mainly focused on fine-tuning pre-trained language models on specific datasets, which can be computationally intensive and time-consuming. With the rapid development of large language models (LLMs), such as ChatGPT and GPT-3, researchers are now exploring their in-context learning capabilities for a wide range of tasks. In this paper, we aim to assess the capacity of LLMs for fact-checking by introducing Self-Checker, a framework comprising a set of plug-and-play modules that facilitate fact-checking by purely prompting LLMs in an almost zero-shot setting. This framework provides a fast and efficient way to construct fact-checking systems in low-resource environments. Empirical results demonstrate the potential of Self-Checker in utilizing LLMs for fact-checking. However, there is still significant room for improvement compared to SOTA fine-tuned models, which suggests that LLM adoption could be a promising approach for future fact-checking research.
- North America > United States > Rhode Island (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Iowa (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
LongChecker: Improving scientific claim verification by modeling full-abstract context
Wadden, David, Lo, Kyle, Wang, Lucy Lu, Cohan, Arman, Beltagy, Iz, Hajishirzi, Hannaneh
We introduce the LongChecker system for scientific claim verification. Given a scientific claim and an evidence-containing research abstract, LongChecker predicts a veracity label and identifies supporting rationales in a multitask fashion based on a shared encoding of the claim and abstract. We perform experiments on the SciFact dataset, and find that LongChecker achieves state-of-the-art performance. We conduct analysis to understand the source of this improvement, and find that identifying the relationship between a claim and a rationale reporting a scientific finding often requires understanding the context in which the rationale appears. By making labeling decisions based on all available context, LongChecker achieves better performance on cases requiring this type of understanding. In addition, we show that LongChecker is able to leverage weakly-supervised in-domain data to facilitate few-shot domain adaptation for scientific claim verification.
- North America > United States > Washington > King County > Seattle (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
10 years later, 'SpyParty' hits Steam Early Access on April 12th
Chris Hecker, the creator of SpyParty, is smiling and gesturing wildly over the open lid of a laptop showcasing the game's six new, upgraded maps. After 10 years of development, SpyParty is finally going to land on Steam Early Access on April 12th, and Hecker is barely containing a cacophony of emotions -- not all of them bad. "That's fucking crazy and I'm terrified, like literally, abject terrified, and my anxiety level is through the fucking roof," he says. "But I'm excited too, and we'll see. There's a whole bunch of things I'm concerned about with that and excited about that. Like the fact that I have the best online competitive gaming community ever."
'SpyParty' finally looks like a real video game
Yes, after nearly 10 years, SpyParty is still in development. It's an underground kind of independent, competitive game where one player is a spy attempting to complete discreet tasks at a fancy party, and another player is positioned outside, observing the scene through the scope of a sniper rifle. The spy attempts to blend in with a room full of AI-powered partygoers while the sniper tries to figure out which one is actually human (and then shoot that character, of course). And soon, it will all be much, much prettier. Creator Chris Hecker, artist John Cimino and newly hired environment artist Reika Yoshino today revealed five new characters, a professional-looking UI and an updated version of SpyParty's largest map, Veranda.