 fatality


Forests of Uncertaint(r)ees: Using tree-based ensembles to estimate probability distributions of future conflict

Mittermaier, Daniel, Bohne, Tobias, Hofer, Martin, Racek, Daniel

arXiv.org Artificial Intelligence

Predictions of fatalities from violent conflict on the PRIO-GRID-month (pgm) level are characterized by high levels of uncertainty, limiting their usefulness in practical applications. We discuss the two main sources of uncertainty for this prediction task, the nature of violent conflict and data limitations, embedding this in the wider literature on uncertainty quantification in machine learning. We develop a strategy to quantify uncertainty in conflict forecasting, shifting from traditional point predictions to full predictive distributions. Our approach compares and combines multiple tree-based classifiers and distributional regressors in a custom auto-ML setup, estimating distributions for each pgm individually. We also test the integration of regional models in spatial ensembles as a potential avenue to reduce uncertainty. The models are able to consistently outperform a suite of benchmarks derived from conflict history in predictions up to one year in advance, with performance driven by regions where conflict was observed. With our evaluation, we emphasize the need to understand how a metric behaves for a given prediction problem, in our case characterized by extremely high zero-inflatedness. While not resulting in better predictions, the integration of smaller models does not decrease performance for this prediction task, opening avenues to integrate data sources with less spatial coverage in the future.
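The point about metric behaviour under zero inflation can be illustrated with a small sketch. The abstract does not name the metric used, so the sample-based continuous ranked probability score (CRPS) below is an assumption chosen for illustration, and the observations and forecasts are made-up toy values, not data from the paper.

```python
def crps_from_samples(samples, observed):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|. Lower is better."""
    n = len(samples)
    term_obs = sum(abs(x - observed) for x in samples) / n
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


def mean_crps(forecast_samples, observations):
    """Average CRPS when the same forecast distribution is issued for every cell."""
    return sum(crps_from_samples(forecast_samples, y) for y in observations) / len(observations)


# Toy zero-inflated target (hypothetical): 99 cells without fatalities, one with 50.
observations = [0] * 99 + [50]

all_zero_forecast = [0] * 20               # always predicts "no fatalities"
calibrated_forecast = [0] * 99 + [50]      # puts 1% of its mass on 50 fatalities

score_all_zero = mean_crps(all_zero_forecast, observations)
score_calibrated = mean_crps(calibrated_forecast, observations)
```

In this toy setting the all-zero forecast scores 0.5 and the calibrated forecast about 0.495, so a forecast that matches the true distribution is barely distinguishable from a trivial baseline. This is one way to see why average scores need careful reading when almost all observations are zero.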


Revealed: The 5 most DANGEROUS TikTok trends - including one that has caused 100 fatalities

Daily Mail - Science & tech

TikTok has given rise to many strange trends over the years, from 'rawdogging boredom' to the viral 'turtle rabbit' choreography. While most trends are harmless fun, experts have raised concerns about others, including some that have proved deadly. In a new report, the Omega Law Group has highlighted five of the most dangerous trends that have swept social media in recent years.


Bears have attacked over 100 people in Japan since March

Popular Science

The dangerous encounters are on the rise. Japan is experiencing an unprecedented bear problem. According to the nation's Ministry of the Environment, the furry predators are confirmed to have killed seven people since March, the highest number on record since the government first started monitoring deaths in 2006. Given the country's ongoing ecological and demographic issues, however, it may be more accurate to say that the bears are facing an unprecedented human problem. Conservationists explained to the AFP that there are multiple factors contributing to the multiplying encounters between the wild animals and humans.


GPT-5 Model Corrected GPT-4V's Chart Reading Errors, Not Prompting

Yang, Kaichun, Chen, Jian

arXiv.org Artificial Intelligence

We present a quantitative evaluation of how zero-shot large language models (LLMs) and prompting strategies affect chart reading tasks. We asked LLMs to answer 107 visualization questions to compare inference accuracy between the agentic GPT-5 and the multimodal GPT-4V on difficult image instances where GPT-4V failed to produce correct answers. Our results show that model architecture dominates inference accuracy: GPT-5 largely improved accuracy, while prompt variants yielded only small effects. Pre-registration of this work is available here; the Google Drive materials are here. Benchmarking visual literacy, i.e., "the ability and skill to read and interpret visually represented data and to extract information from data visualizations" [1], shapes progress in measuring AI's ability to handle visualization images. Often, tasks designed to assess visual literacy, traditionally performed by human observers, are now being assigned to algorithms. Following this trend, our goal in this paper is to quantify the new GPT-5's ability to read charts. Specifically, we used questions where GPT-4V failed and other LLMs achieved only low accuracy, as reported in Verma et al.'s CHART-6 benchmark [2].


Design and Application of Multimodal Large Language Model Based System for End to End Automation of Accident Dataset Generation

Chowdhury, MD Thamed Bin Zaman, Hossain, Moazzem

arXiv.org Artificial Intelligence

Road traffic accidents remain a major public safety and socio-economic issue in developing countries like Bangladesh. Existing accident data collection is largely manual, fragmented, and unreliable, resulting in underreporting and inconsistent records. This research proposes a fully automated system using Large Language Models (LLMs) and web scraping techniques to address these challenges. The pipeline consists of four components: automated web scraping code generation, news collection from online sources, accident news classification with structured data extraction, and duplicate removal. The system uses the multimodal generative LLM Gemini-2.0-Flash for seamless automation. The code generation module classifies webpages into pagination, dynamic, or infinite scrolling categories and generates suitable Python scripts for scraping. LLMs also classify and extract key accident information such as date, time, location, fatalities, injuries, road type, vehicle types, and pedestrian involvement. A deduplication algorithm ensures data integrity by removing duplicate reports. The system scraped 14 major Bangladeshi news sites over 111 days (Oct 1, 2024 - Jan 20, 2025), processing over 15,000 news articles and identifying 705 unique accidents. The code generation module achieved 91.3% calibration and 80% validation accuracy. Chittagong reported the highest number of accidents (80), fatalities (70), and injuries (115), followed by Dhaka, Faridpur, Gazipur, and Cox's Bazar. Peak accident times were morning (8-9 AM), noon (12-1 PM), and evening (6-7 PM). A public repository was also developed with usage instructions. This study demonstrates the viability of an LLM-powered, scalable system for accurate, low-effort accident data collection, providing a foundation for data-driven road safety policymaking in Bangladesh.


Move Fast and Break Nothing

The Atlantic - Technology

Every trip in a self-driving Waymo has the same dangerous moment. But at the very end, you, a flawed human being, will have to place your hand on the door handle, look both ways, and push the door open. From mid-February to mid-August of this year, Waymo's driverless cars were involved in three collisions that came down to roughly identical circumstances: A passenger flung their door open and hit somebody passing by on a bike or scooter. That's according to an independent analysis of crash reports the company has disclosed to the government, which found that most of the 45 serious accidents involving Waymos were the fault of other motorists or seemingly an act of God.


Debiased Front-Door Learners for Heterogeneous Effects

Jung, Yonghan

arXiv.org Machine Learning

In observational settings where treatment and outcome share unmeasured confounders but an observed mediator remains unconfounded, the front-door (FD) adjustment identifies causal effects through the mediator. We study the heterogeneous treatment effect (HTE) under FD identification and introduce two debiased learners: FD-DR-Learner and FD-R-Learner. Both attain fast, quasi-oracle rates (i.e., performance comparable to an oracle that knows the nuisances) even when nuisance functions converge as slowly as n^{-1/4}. We provide error analyses establishing debiasedness and demonstrate robust empirical performance in synthetic studies and a real-world case study of primary seat-belt laws using the Fatality Analysis Reporting System (FARS) dataset. Together, these results indicate that the proposed learners deliver reliable and sample-efficient HTE estimates in FD scenarios. The implementation is available at https://github.com/yonghanjung/FD-CATE. Keywords: Front-door adjustment; Heterogeneous treatment effects; Debiased learning; Quasi-oracle rates; Causal inference.
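For orientation, the standard (population-level) front-door adjustment formula is written out below. The symbols are assumptions, since the abstract fixes no notation: X is the treatment, M the mediator, Y the outcome. The covariate-conditional version that a heterogeneous-effect analysis would need is not given in the abstract and is not reproduced here.

```latex
P\bigl(y \mid \mathrm{do}(x)\bigr)
  = \sum_{m} P(m \mid x) \sum_{x'} P\bigl(y \mid m, x'\bigr)\, P(x')
```

The outer sum carries the effect of X on M, which is unconfounded by assumption. The inner sum recovers the effect of M on Y by averaging over treatment values x', which blocks the back-door path through the unmeasured confounder.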


Harnessing ADAS for Pedestrian Safety: A Data-Driven Exploration of Fatality Reduction

Sulle, Methusela, Mwakalonge, Judith, Comert, Gurcan, Siuhi, Saidi, Gyimah, Nana Kankam

arXiv.org Artificial Intelligence

Pedestrian fatalities continue to rise in the United States, driven by factors such as human distraction, increased vehicle size, and complex traffic environments. Advanced Driver Assistance Systems (ADAS) offer a promising avenue for improving pedestrian safety by enhancing driver awareness and vehicle responsiveness. This study conducts a comprehensive data-driven analysis utilizing the Fatality Analysis Reporting System (FARS) to quantify the effectiveness of specific ADAS features, such as Pedestrian Automatic Emergency Braking (PAEB), Forward Collision Warning (FCW), and Lane Departure Warning (LDW), in lowering pedestrian fatalities. By linking vehicle specifications with crash data, we assess how ADAS performance varies under different environmental and behavioral conditions, such as lighting, weather, and driver/pedestrian distraction. Results indicate that while ADAS can reduce crash severity and prevent some fatalities, its effectiveness is diminished in low-light and adverse weather. The findings highlight the need for enhanced sensor technologies and improved driver education. This research informs policymakers, transportation planners, and automotive manufacturers on optimizing ADAS deployment to improve pedestrian safety and reduce traffic-related deaths.


A self-supervised neural-analytic method to predict the evolution of COVID-19 in Romania

Stochiţoiu, Radu D., Petrica, Marian, Rebedea, Traian, Popescu, Ionel, Leordeanu, Marius

arXiv.org Artificial Intelligence

Analysing and understanding the transmission and evolution of the COVID-19 pandemic is mandatory to be able to design the best social and medical policies, foresee their outcomes and deal with all the subsequent socio-economic effects. We address this important problem from a computational and machine learning perspective. More specifically, we want to statistically estimate all the relevant parameters for the new coronavirus COVID-19, such as the reproduction number, fatality rate or length of infectiousness period, based on Romanian patients, as well as be able to predict future outcomes. This endeavor is important, since it is well known that these factors vary across the globe, and might be dependent on many causes, including social, medical, age and genetic factors. We use a recently published improved version of SEIR, which is the classic, established model for infectious diseases. We want to infer all the parameters of the model, which govern the evolution of the pandemic in Romania, based on the only reliable, true measurement, which is the number of deaths. Once the model parameters are estimated, we are able to predict all the other relevant measures, such as the number of exposed and infectious people. To this end, we propose a self-supervised approach to train a deep convolutional network to guess the correct set of Modified-SEIR model parameters, given the observed number of daily fatalities. Then, we refine the solution with a stochastic coordinate descent approach. We compare our deep learning optimization scheme with the classic grid search approach and show great improvement in both computational time and prediction accuracy. We find an optimistic result in the case fatality rate for Romania which may be around 0.3% and we also demonstrate that our model is able to correctly predict the number of daily fatalities for up to three weeks in the future.
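As a rough illustration of the kind of compartmental model the abstract refers to, the sketch below integrates the classic SEIR equations with a forward Euler step. It is not the modified SEIR variant used by the authors, and every parameter value is a hypothetical placeholder, except the 0.3% fatality rate quoted above. Treating deaths as a fixed fraction of newly removed individuals is a simplifying assumption made here.

```python
def simulate_seir(beta, sigma, gamma, population, initial_infectious, days, steps_per_day=10):
    """Integrate the classic SEIR equations with forward Euler.

    Returns a list of (S, E, I, R) tuples, one per day, starting at day 0.
    """
    dt = 1.0 / steps_per_day
    s = float(population - initial_infectious)
    e = 0.0
    i = float(initial_infectious)
    r = 0.0
    trajectory = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt   # S -> E
            new_infectious = sigma * e * dt                # E -> I
            new_removed = gamma * i * dt                   # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        trajectory.append((s, e, i, r))
    return trajectory


def daily_deaths(trajectory, fatality_rate):
    """Approximate daily deaths as a fixed fraction of people removed on each day."""
    return [fatality_rate * (after[3] - before[3])
            for before, after in zip(trajectory, trajectory[1:])]


# Hypothetical parameters, chosen only to make the sketch run.
trajectory = simulate_seir(beta=0.5, sigma=0.2, gamma=0.1,
                           population=1_000_000, initial_infectious=10, days=60)
deaths = daily_deaths(trajectory, fatality_rate=0.003)
```

Fitting such a model means searching for the parameters (here `beta`, `sigma`, `gamma`, and the fatality rate) whose simulated death curve best matches the observed one. According to the abstract, the authors do this with a convolutional network followed by stochastic coordinate descent, and compare it against grid search.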


LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World

Semnani, Sina J., Zhang, Pingyue, Zhai, Wanyue, Li, Haozhuo, Beauchamp, Ryan, Billing, Trey, Kishi, Katayoun, Li, Manling, Lam, Monica S.

arXiv.org Artificial Intelligence

This paper presents LEMONADE, a large-scale conflict event dataset comprising 39,786 events across 20 languages and 171 countries, with extensive coverage of region-specific entities. LEMONADE is based on a partially reannotated subset of the Armed Conflict Location & Event Data (ACLED), which has documented global conflict events for over a decade. To address the challenge of aggregating multilingual sources for global event analysis, we introduce abstractive event extraction (AEE) and its subtask, abstractive entity linking (AEL). Unlike conventional span-based event extraction, our approach detects event arguments and entities through holistic document understanding and normalizes them across the multilingual dataset. We evaluate various large language models (LLMs) on these tasks, adapt existing zero-shot event extraction systems, and benchmark supervised models. Additionally, we introduce ZEST, a novel zero-shot retrieval-based system for AEL. Our best zero-shot system achieves an end-to-end F1 score of 58.3%, with LLMs outperforming specialized event extraction models such as GoLLIE. For entity linking, ZEST achieves an F1 score of 45.7%, significantly surpassing OneNet, a state-of-the-art zero-shot baseline that achieves only 23.7%. However, these zero-shot results lag behind the best supervised systems by 20.1% and 37.0% in the end-to-end and AEL tasks, respectively, highlighting the need for further research.