Verner, Duane R.
WildfireGPT: Tailored Large Language Model for Wildfire Analysis
Xie, Yangxinyu, Mallick, Tanwi, Bergerson, Joshua David, Hutchison, John K., Verner, Duane R., Branham, Jordan, Alexander, M. Ross, Ross, Robert B., Feng, Yan, Levy, Leslie-Anne, Su, Weijie
Understanding and adapting to climate change is paramount for professionals such as urban planners, emergency managers, and infrastructure operators, as it directly influences urban development, disaster response, and the maintenance of essential services. Nonetheless, this task presents a complex challenge that necessitates the integration of advanced technology and scientific insights. Recent advances in large language models (LLMs) present an innovative solution, particularly in democratizing climate science. They can interpret and explain technical aspects of climate change through conversations, making this crucial information accessible to people from all backgrounds (Rillig et al. [2023]; Bulian et al. [2023]; Chen et al. [2023]). However, because LLMs are generalized models, their performance can be improved by providing additional domain-specific information. Recent research has focused on augmenting LLMs with external tools and data sources to ensure that the information provided is scientifically accurate: for example, leveraging authoritative data sources such as ClimateWatch (Kraus et al. [2023]) and findings from the IPCC AR6 reports (Vaghefi et al. [2023]) helps refine the LLM's outputs and ground them in the latest research.
Analyzing Regional Impacts of Climate Change using Natural Language Processing Techniques
Mallick, Tanwi, Murphy, John, Bergerson, Joshua David, Verner, Duane R., Hutchison, John K., Levy, Leslie-Anne
Understanding the multifaceted effects of climate change across diverse geographic locations is crucial for timely adaptation and the development of effective mitigation strategies. As the volume of scientific literature on this topic continues to grow exponentially, manually reviewing these documents has become an immensely challenging task. Utilizing Natural Language Processing (NLP) techniques to analyze this wealth of information presents an efficient and scalable solution. By gathering extensive amounts of peer-reviewed articles and studies, we can extract and process critical information about the effects of climate change in specific regions. We employ BERT (Bidirectional Encoder Representations from Transformers) for Named Entity Recognition (NER), which enables us to efficiently identify specific geographies within the climate literature. This, in turn, facilitates location-specific analyses. We conduct region-specific climate trend analyses to pinpoint the predominant themes or concerns related to climate change within a particular area, trace the temporal progression of these identified issues, and evaluate their frequency, severity, and potential development over time. These in-depth examinations of location-specific climate data enable the creation of more customized policy-making, adaptation, and mitigation strategies, addressing each region's unique challenges and providing more effective solutions rooted in data-driven insights. This approach, founded on a thorough exploration of scientific texts, offers actionable insights to a wide range of stakeholders, from policymakers to engineers to environmentalists. By proactively understanding these impacts, societies are better positioned to prepare, allocate resources wisely, and design tailored strategies to cope with future climate conditions, ensuring a more resilient future for all.
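The region-specific trend analysis described above can be sketched as a simple aggregation. This is a minimal illustration, not the authors' pipeline: it assumes an upstream NER step has already produced (year, region, theme) records, and the function name `theme_trends`, the record layout, and the sample regions and themes are all hypothetical.

```python
from collections import Counter, defaultdict


def theme_trends(records, region):
    """Count how often each theme appears per year for one region.

    `records` is an iterable of (year, region, theme) tuples, assumed to
    come from an upstream NER and theme-tagging step (hypothetical layout).
    Returns a dict mapping each year, in ascending order, to a list of
    (theme, count) pairs sorted from most to least frequent.
    """
    trends = defaultdict(Counter)
    for year, rec_region, theme in records:
        if rec_region == region:
            trends[year][theme] += 1
    return {year: trends[year].most_common() for year in sorted(trends)}


# Illustrative records only; these are not results from the paper.
RECORDS = [
    (2019, "Chicago", "flooding"),
    (2019, "Chicago", "heat"),
    (2020, "Chicago", "flooding"),
    (2020, "Chicago", "flooding"),
    (2020, "Miami", "sea level rise"),
]

TRENDS = theme_trends(RECORDS, "Chicago")
for year, themes in TRENDS.items():
    print(year, themes)
```

Grouping by year first keeps the temporal progression of each theme visible, which is the part of the analysis the abstract emphasizes.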
Analyzing the impact of climate change on critical infrastructure from the scientific literature: A weakly supervised NLP approach
Mallick, Tanwi, Bergerson, Joshua David, Verner, Duane R., Hutchison, John K., Levy, Leslie-Anne, Balaprakash, Prasanna
Natural language processing (NLP) is a promising approach for analyzing large volumes of climate-change and infrastructure-related scientific literature. However, best-in-practice NLP techniques require a large collection of relevant documents (a corpus). Furthermore, NLP methods based on machine learning and deep learning require labels that group the articles by user-defined criteria for a significant subset of the corpus in order to train the supervised model. Even labeling a few hundred documents with human subject-matter experts is a time-consuming process. To expedite this process, we developed a weak supervision-based NLP approach that leverages semantic similarity between categories and documents to (i) establish a topic-specific corpus by subsetting a large-scale open-access corpus and (ii) generate category labels for the topic-specific corpus. In comparison with a months-long process of subject-matter expert labeling, we assign category labels to the whole corpus using weak supervision and supervised learning in about 13 hours. The labeled climate and National Critical Function (NCF) corpus enables targeted, efficient identification of documents discussing a topic (or combination of topics) of interest and identification of various effects of climate change on critical infrastructure, improving the usability of scientific literature and ultimately supporting enhanced policy and decision making. To demonstrate this capability, we conduct topic modeling on pairs of climate hazards and NCFs to discover trending topics at the intersection of these categories. This method helps analysts and decision-makers quickly grasp the relevant topics and the most important documents linked to each topic.
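The similarity-based labeling idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: bag-of-words cosine similarity stands in here for whatever semantic similarity measure the paper uses, and the function names, the `threshold` value, the category descriptions, and the sample documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def weak_labels(documents, categories, threshold=0.1):
    """Assign each document its most similar category, or None.

    Documents whose best score falls below `threshold` (an assumed
    cutoff) get None, which corresponds to leaving them out of the
    topic-specific corpus.
    """
    cat_vecs = {name: Counter(tokenize(desc)) for name, desc in categories.items()}
    labels = []
    for doc in documents:
        doc_vec = Counter(tokenize(doc))
        scores = {name: cosine(doc_vec, vec) for name, vec in cat_vecs.items()}
        best = max(scores, key=scores.get)
        labels.append(best if scores[best] >= threshold else None)
    return labels


# Hypothetical category descriptions and documents, for illustration only.
CATEGORIES = {
    "flooding": "flood flooding storm surge inundation rainfall",
    "wildfire": "wildfire fire burn smoke drought",
}
DOCUMENTS = [
    "Storm surge and coastal flooding threaten electricity substations.",
    "Wildfire smoke and burn scars disrupt regional water supply.",
    "A survey of medieval poetry.",
]

LABELS = weak_labels(DOCUMENTS, CATEGORIES)
print(LABELS)
```

In this sketch the third document matches neither category and is dropped, mirroring step (i), while the first two receive weak labels as in step (ii). The supervised model that the abstract says is then trained on such labels is not shown.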