 serial interval


Large Language Models for Scientific Information Extraction: An Empirical Study for Virology

arXiv.org Artificial Intelligence

In this paper, we champion the use of structured and semantic content representation of discourse-based scholarly communication, inspired by tools like Wikipedia infoboxes or structured Amazon product descriptions. These representations provide users with a concise overview, aiding scientists in navigating the dense academic landscape. Our novel automated approach leverages the robust text generation capabilities of LLMs to produce structured scholarly contribution summaries, offering both a practical solution and insights into LLMs' emergent abilities. For LLMs, the prime focus is on improving their general intelligence as conversational agents. We argue that these models can also be applied effectively in information extraction (IE), specifically in complex IE tasks within terse domains like Science. This paradigm shift replaces the traditional modular, pipelined machine learning approach with a simpler objective expressed through instructions. Our results show that finetuned FLAN-T5 with 1000x fewer parameters than the state-of-the-art GPT-davinci is competitive for the task.


Serial interval of SARS-CoV-2 was shortened over time by nonpharmaceutical interventions

Science

In epidemiology, serial intervals are measured from when one infected person starts to show symptoms to when the next person infected becomes symptomatic. For any specific infection, the serial interval is assumed to be a fixed characteristic. Using valuable transmission pair data for coronavirus disease (COVID-19) in mainland China, Ali et al. noticed that the average serial interval changed as nonpharmaceutical interventions were introduced. In mid-January 2020, serial intervals were on average 7.8 days, whereas in early February 2020, they decreased to an average of 2.2 days. The more quickly infected persons were identified and isolated, the shorter the serial interval became and the fewer the opportunities for virus transmission. The change in serial interval may not only measure the effectiveness of infection control interventions but may also indicate rising population immunity. Science, this issue p. 1106 (doi: 10.1126/science.abc9004)

Studies of novel coronavirus disease 2019 (COVID-19), which is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), have reported varying estimates of epidemiological parameters, including serial interval distributions (i.e., the time between illness onset in successive cases in a transmission chain) and reproduction numbers. By compiling a line-list database of transmission pairs in mainland China, we show that mean serial intervals of COVID-19 shortened substantially from 7.8 to 2.6 days within a month (9 January to 13 February 2020). This change was driven by enhanced nonpharmaceutical interventions, particularly case isolation. We also show that using real-time estimation of serial intervals allowing for variation over time provides more accurate estimates of reproduction numbers than using conventionally fixed serial interval distributions. These findings could improve our ability to assess transmission dynamics, forecast future incidence, and estimate the impact of control measures.
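As a rough illustration of the definition above, the following Python sketch computes serial intervals from transmission pairs and compares the mean before and after a cutoff date. The onset dates, the cutoff, and the function names are hypothetical and chosen only for illustration; they are not data or methods from Ali et al., whose analysis is considerably more involved.

```python
from datetime import date
from statistics import mean

# Hypothetical transmission pairs: (infector symptom onset, infectee symptom onset).
# These dates are invented for illustration and are NOT from the study.
pairs = [
    (date(2020, 1, 10), date(2020, 1, 18)),
    (date(2020, 1, 12), date(2020, 1, 19)),
    (date(2020, 2, 3), date(2020, 2, 5)),
    (date(2020, 2, 5), date(2020, 2, 8)),
]


def serial_interval_days(infector_onset, infectee_onset):
    """Days from the infector's symptom onset to the infectee's symptom onset.

    The value can be negative when the infectee shows symptoms first.
    """
    return (infectee_onset - infector_onset).days


def mean_serial_interval_by_period(pairs, cutoff):
    """Mean serial interval for pairs whose infector onset falls before / on-or-after cutoff.

    Assumes both periods contain at least one pair; statistics.mean raises
    StatisticsError on an empty list.
    """
    early = [serial_interval_days(a, b) for a, b in pairs if a < cutoff]
    late = [serial_interval_days(a, b) for a, b in pairs if a >= cutoff]
    return mean(early), mean(late)


# Hypothetical cutoff separating an "early" and a "late" period.
early_mean, late_mean = mean_serial_interval_by_period(pairs, date(2020, 1, 23))
print(f"early mean: {early_mean} days, late mean: {late_mean} days")
```

Grouping pairs by the infector's onset date is one simple way to let the serial interval vary over time rather than treating it as a single fixed value.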