margarine
News Deja Vu: Connecting Past and Present with Semantic Search
Franklin, Brevin, Silcock, Emily, Arora, Abhishek, Bryan, Tom, Dell, Melissa
Social scientists and the general public often analyze contemporary events by drawing parallels with the past, a process complicated by the vast, noisy, and unstructured nature of historical texts. For example, hundreds of millions of page scans from historical newspapers have been noisily transcribed. Traditional sparse methods for searching for relevant material in these vast corpora, e.g., with keywords, can be brittle given complex vocabularies and OCR noise. This study introduces News Deja Vu, a novel semantic search tool that leverages transformer large language models and a bi-encoder approach to identify historical news articles that are most similar to modern news queries. News Deja Vu first recognizes and masks entities, in order to focus on broader parallels rather than the specific named entities being discussed. Then, a contrastively trained, lightweight bi-encoder retrieves historical articles that are most similar semantically to a modern query, illustrating how phenomena that might seem unique to the present have varied historical precedents. Aimed at social scientists, the user-friendly News Deja Vu package is designed to be accessible for those who lack extensive familiarity with deep learning. It works with large text datasets, and we show how it can be deployed to a massive scale corpus of historical, open-source news articles. While human expertise remains important for drawing deeper insights, News Deja Vu provides a powerful tool for exploring parallels in how people have perceived past and present.
- North America > Canada (0.14)
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > New York (0.04)
- (12 more...)
- Media > News (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Marginal Post Processing of Bayesian Inference Products with Normalizing Flows and Kernel Density Estimators
Bevins, Harry T. J., Handley, William J., Lemos, Pablo, Sims, Peter H., Acedo, Eloy de Lera, Fialkov, Anastasia, Alsing, Justin
Bayesian analysis has become an indispensable tool across many different cosmological fields including the study of gravitational waves, the Cosmic Microwave Background and the 21-cm signal from the Cosmic Dawn among other phenomena. The method provides a way to fit complex models to data describing key cosmological and astrophysical signals and a whole host of contaminating signals and instrumental effects modelled with `nuisance parameters'. In this paper, we summarise a method that uses Masked Autoregressive Flows and Kernel Density Estimators to learn marginal posterior densities corresponding to core science parameters. We find that the marginal or 'nuisance-free' posteriors and the associated likelihoods have an abundance of applications including; the calculation of previously intractable marginal Kullback-Leibler divergences and marginal Bayesian Model Dimensionalities, likelihood emulation and prior emulation. We demonstrate each application using toy examples, examples from the field of 21-cm cosmology and samples from the Dark Energy Survey. We discuss how marginal summary statistics like the Kullback-Leibler divergences and Bayesian Model Dimensionalities can be used to examine the constraining power of different experiments and how we can perform efficient joint analysis by taking advantage of marginal prior and likelihood emulators. We package our multipurpose code up in the pip-installable code margarine for use in the wider scientific community.
- North America > Canada > Quebec > Montreal (0.14)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)