Denton County
'It Was Nuts': The Extreme Tests that Show Why Hail Is a Multibillion-Dollar Problem
'It Was Nuts': The Extreme Tests that Show Why Hail Is a Multibillion-Dollar Problem The costs of a hail damage have ballooned over the past two decades, prompting researchers to resort to extreme measures to understand how these storms destroy buildings. The scars left on houses look like shotgun blasts, sometimes. In the aftermath of major storms, Andrew Shick, owner and chief executive of Illinois-based firm Roofing USA, has driven through suburbs blasted by hail and been left stunned by the damage. Earlier this year, he visited a farm complex in western Illinois where roofs, even sturdy metal ones, were left pockmarked and perforated after 3-inch balls of ice fell from the sky. "It was nuts," he recalls.
ReviewGuard: Enhancing Deficient Peer Review Detection via LLM-Driven Data Augmentation
Zhang, Haoxuan, Li, Ruochi, Shrestha, Sarthak, Mamidala, Shree Harshini, Putta, Revanth, Aggarwal, Arka Krishan, Xiao, Ting, Ding, Junhua, Chen, Haihua
Peer review serves as the gatekeeper of science, yet the surge in submissions and widespread adoption of large language models (LLMs) in scholarly evaluation present unprecedented challenges. While recent work has focused on using LLMs to improve review efficiency, unchecked deficient reviews from both human experts and AI systems threaten to systematically undermine academic integrity. To address this issue, we introduce ReviewGuard, an automated system for detecting and categorizing deficient reviews through a four-stage LLM-driven framework: data collection from ICLR and NeurIPS on OpenReview, GPT-4.1 annotation with human validation, synthetic data augmentation yielding 6,634 papers with 24,657 real and 46,438 synthetic reviews, and fine-tuning of encoder-based models and open-source LLMs. Feature analysis reveals that deficient reviews exhibit lower rating scores, higher self-reported confidence, reduced structural complexity, and more negative sentiment than sufficient reviews. AI-generated text detection shows dramatic increases in AI-authored reviews since ChatGPT's emergence. Mixed training with synthetic and real data substantially improves detection performance - for example, Qwen 3-8B achieves recall of 0.6653 and F1 of 0.7073, up from 0.5499 and 0.5606 respectively. This study presents the first LLM-driven system for detecting deficient peer reviews, providing evidence to inform AI governance in peer review. Code, prompts, and data are available at https://github.com/haoxuan-unt2024/ReviewGuard
ClaimIQ at CheckThat! 2025: Comparing Prompted and Fine-Tuned Language Models for Verifying Numerical Claims
Anik, Anirban Saha, Chowdhury, Md Fahimul Kabir, Wyckoff, Andrew, Choudhury, Sagnik Ray
This paper presents our system for Task 3 of the CLEF 2025 CheckThat! Lab, which focuses on verifying numerical and temporal claims using retrieved evidence. We explore two complementary approaches: zero-shot prompting with instruction-tuned large language models (LLMs) and supervised fine-tuning using parameter-efficient LoRA. To enhance evidence quality, we investigate several selection strategies, including full-document input and top-k sentence filtering using BM25 and MiniLM. Our best-performing model LLaMA fine-tuned with LoRA achieves strong performance on the English validation set. However, a notable drop in the test set highlights a generalization challenge. These findings underscore the importance of evidence granularity and model adaptation for robust numerical fact verification.
A Multi-Modal Deep Learning Framework for Colorectal Pathology Diagnosis: Integrating Histological and Colonoscopy Data in a Pilot Study
Ramesh, Krithik, Koneru, Ritvik
Colorectal diseases, including inflammatory conditions and neoplasms, require quick, accurate care to be effectively treated. Traditional diagnostic pipelines require extensive preparation and rely on separate, individual evaluations on histological images and colonoscopy footage, introducing possible variability and inefficiencies. This pilot study proposes a unified deep learning network that uses convolutional neural networks (CN N s) to classify both histopathological slides and colonoscopy video frames in one pipeline. The pipeline integrates class-balancing learning, robust augmentation, and calibration methods to ensure accurate results. Static colon histology images were taken from the PathMNIST dataset, and the lower gastrointestinal (colonoscopy) videos were drawn from the HyperKvasir dataset. The CNN architecture used was ResNet-50. This study demonstrates an interpretable and reproducible diagnostic pipeline that unifies multiple diagnostic modalities to advance and ease the detection of colorectal diseases.