 Indian Ocean


Applications of machine learning to predict seasonal precipitation for East Africa

arXiv.org Machine Learning

Seasonal climate forecasts are commonly based on model runs from fully coupled forecasting systems that use Earth system models to represent interactions between the atmosphere, ocean, land and other Earth-system components. Increasingly, machine learning (ML) methods are being investigated for this task, in which large-scale climate variability is linked to local or regional temperature or precipitation in a linear or non-linear fashion. This paper investigates the use of interpretable ML methods to predict seasonal precipitation for East Africa in an operational setting. Dimension reduction is performed by decomposing the precipitation fields via empirical orthogonal functions (EOFs), such that only the respective factor loadings need to be predicted. Indices of large-scale climate variability--including the rate of change in individual indices as well as interactions between different indices--are then used as potential features to obtain tercile forecasts from an interpretable ML algorithm. Several research questions regarding the use of data and the effect of model complexity are studied. The results are compared against the ECMWF seasonal forecasting system (SEAS5) for three seasons--MAM, JJAS and OND--over the period 1993-2020. Compared to climatology for the same period, the ECMWF forecasts have negative skill in MAM and JJAS and significant positive skill in OND. The ML approach is on par with climatology in MAM and JJAS and has significantly positive skill in OND, if not quite at the level of the OND ECMWF forecast.
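The abstract speaks of tercile forecasts and of skill relative to climatology without naming the score used. One common way to make that comparison concrete is the ranked probability skill score (RPSS), sketched below as an illustration only. The choice of RPSS, the equal-odds (1/3, 1/3, 1/3) climatological reference, and all function names are assumptions, not details taken from the paper.

```python
from typing import Sequence


def tercile_category(value: float, lower: float, upper: float) -> int:
    """Map a seasonal total to 0 (below normal), 1 (near normal) or 2 (above normal).

    `lower` and `upper` are the tercile boundaries of the reference climatology
    (hypothetical inputs, computed elsewhere).
    """
    if value < lower:
        return 0
    if value <= upper:
        return 1
    return 2


def rps(probs: Sequence[float], observed: int) -> float:
    """Ranked probability score of one probabilistic forecast (lower is better)."""
    score = 0.0
    cum_forecast = 0.0
    cum_observed = 0.0
    for k, p in enumerate(probs):
        cum_forecast += p
        cum_observed += 1.0 if k == observed else 0.0
        # Compare cumulative distributions, category by category.
        score += (cum_forecast - cum_observed) ** 2
    return score


def rpss(forecasts: Sequence[Sequence[float]], observations: Sequence[int]) -> float:
    """Skill relative to an equal-odds climatological forecast.

    Positive values beat climatology, zero matches it, negative values are worse.
    """
    climatology = (1 / 3, 1 / 3, 1 / 3)
    rps_forecast = sum(rps(p, o) for p, o in zip(forecasts, observations))
    rps_reference = sum(rps(climatology, o) for o in observations)
    return 1.0 - rps_forecast / rps_reference
```

Under this definition, "on par with climatology" corresponds to an RPSS near zero and "negative skill" to an RPSS below zero.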


How to navigate the green economy: Here are four success stories

Los Angeles Times

No one knows just how many green jobs will be created in the United States over the next decade, but there's little disagreement that the demand will create a huge opportunity for the next generation of workers. Thanks to the Inflation Reduction Act of 2022, an estimated $800 billion will flow from the federal government over the next 10 years to fund massive clean energy programs, from solar and wind installations to energy-efficient buildings to business and neighborhood microgrid power production. Billions more in state and federal money will help build out an electrified transportation system, including electric cars, trucks, trains and the infrastructure to support them. High school and university educators are reporting increasing interest in "green" careers: jobs that help address global warming and other environmental issues, with enough of a future to pay the bills and then some. Given the drumbeat of bad news on a changing climate, it might even be considered a matter of survival.


Identity-related Speech Suppression in Generative AI Content Moderation

arXiv.org Artificial Intelligence

Automated content moderation systems have long been used to help reduce the occurrence of violent, hateful, sexual, or otherwise undesired user-generated content online, including in online comment sections and by social media platforms [7, 19, 24]. As content is generated by AI systems, automated content moderation techniques are being applied to the text generated by these systems to filter unwanted content before it is shown to users [21, 22]. However, content moderation is known to suffer from identity-related biases, such that speech by or about marginalized identities is more likely to be incorrectly flagged as inappropriate content [5, 10, 27]. In this paper, we conduct an audit of five content moderation systems to measure identity-related speech suppression, introducing benchmark datasets and definitions to quantify these biases in the context of generative AI systems. Previous assessments of content moderation systems have used benchmark datasets to measure effectiveness and bias. These include datasets composed of user-generated content, such as tweets or internet comments, that have been hand-labeled according to a content moderation rubric [2, 8]. However, most of these datasets are composed of short-form content and do not include the types of text involved in generative AI systems, be they user-generated prompts or system-provided responses. Automated content moderation systems applied in generative AI settings may have unexpected or undesired results, for example flagging PG-rated movie scripts as inappropriate content [21]. As generative AI is increasingly used for creative and expressive text generation from schools to Hollywood, this paper is motivated by this question: whose stories won't be told?
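The abstract does not give the paper's formal definition of speech suppression. As a rough illustration of the kind of quantity such an audit might compare, the sketch below computes, for each identity group, the share of non-violating texts that a moderation system flagged anyway. The record layout, the group labels, and the function name are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# One audit record: (identity group, flagged by the system?, truly violating?).
Record = Tuple[str, bool, bool]


def false_flag_rates(records: Iterable[Record]) -> Dict[str, float]:
    """Per-group rate at which benign (non-violating) texts were flagged."""
    benign = defaultdict(int)
    flagged_benign = defaultdict(int)
    for group, was_flagged, is_violating in records:
        if is_violating:
            continue  # only incorrectly flagged benign content counts here
        benign[group] += 1
        if was_flagged:
            flagged_benign[group] += 1
    return {group: flagged_benign[group] / benign[group] for group in benign}


# Toy data, invented for illustration only.
sample = [
    ("group_a", True, False),
    ("group_a", False, False),
    ("group_b", False, False),
    ("group_b", False, False),
    ("group_b", True, True),
]
rates = false_flag_rates(sample)
```

A gap between groups in these rates is one simple signal of the identity-related bias the abstract describes.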


Houthis claim downing another US MQ-9 Reaper drone over Yemen

Al Jazeera

The Houthis have claimed to have shot down a United States military drone over Yemen, in the latest attack by the group, which has disrupted shipping trade through the crucial Bab al-Mandeb Strait, drawing US strikes. The Yemeni group has carried out dozens of attacks on ships with links to Israel in a show of solidarity with Palestinians amid Israel's 11-month-old war on Gaza. Yahya Saree, the military spokesman of the Houthi group, said in a prerecorded video message released early on Sunday that the MQ-9 Reaper was shot down by air defences over Marib as "it was carrying out hostile activities". This is the eighth drone of this type to be shot down since the start of the war on Gaza, he said. The group has not so far released footage of the downed attack and surveillance aircraft, which costs about $30m.


Evaluation of Tropical Cyclone Track and Intensity Forecasts from Artificial Intelligence Weather Prediction (AIWP) Models

arXiv.org Artificial Intelligence

In just the past few years, multiple data-driven Artificial Intelligence Weather Prediction (AIWP) models have been developed, with new versions appearing almost monthly. Given this rapid development, the applicability of these models to operational forecasting has yet to be adequately explored and documented. To assess their utility for operational tropical cyclone (TC) forecasting, the NHC verification procedure is used to evaluate seven-day track and intensity predictions for northern hemisphere TCs from May-November 2023. Four open-source AIWP models are considered (FourCastNetv1, FourCastNetv2-small, GraphCast-operational and Pangu-Weather). The AIWP track forecast errors and detection rates are comparable to those from the best-performing operational forecast models. However, the AIWP intensity forecast errors are larger than those of even the simplest intensity forecasts based on climatology and persistence. The AIWP models almost always reduce the TC intensity, especially within the first 24 h of the forecast, resulting in a substantial low bias. The contribution of the AIWP models to the NHC model consensus was also evaluated. The consensus track errors are reduced by up to 11% at the longer time periods. The five-day NHC official track forecasts have improved by about 2% per year since 2001, so this represents more than a five-year gain in accuracy. Despite substantial negative intensity biases, the AIWP models have a neutral impact on the intensity consensus. These results show that the current formulations of the AIWP models have promise for operational TC track forecasts, but improved bias corrections or model reformulations will be needed for accurate intensity forecasts.
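The "more than a five-year gain" follows from the two figures quoted in the abstract: an 11% error reduction set against roughly 2% improvement per year. The abstract does not say whether the yearly improvement is treated as linear or compounding, so the sketch below works out both readings. The function name and the `compound` switch are illustrative assumptions.

```python
import math


def years_of_progress(reduction: float, annual_rate: float, compound: bool = False) -> float:
    """Years of steady improvement needed to match a one-off error reduction.

    reduction   -- fractional error reduction, e.g. 0.11 for 11%
    annual_rate -- fractional improvement per year, e.g. 0.02 for 2%
    compound    -- if True, assume errors shrink by `annual_rate` of the
                   previous year's error; if False, by a fixed amount per year
    """
    if compound:
        return math.log(1.0 - reduction) / math.log(1.0 - annual_rate)
    return reduction / annual_rate


linear_years = years_of_progress(0.11, 0.02)            # about 5.5 years
compound_years = years_of_progress(0.11, 0.02, True)    # about 5.8 years
```

Either reading gives a figure above five years, consistent with the abstract's statement.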


Improving Water Quality Time-Series Prediction in Hong Kong using Sentinel-2 MSI Data and Google Earth Engine Cloud Computing

arXiv.org Artificial Intelligence

Effective water quality monitoring in coastal regions is crucial due to the progressive deterioration caused by pollution and human activities. To address this, this study develops time-series models to predict chlorophyll-a (Chl-a), suspended solids (SS), and turbidity using Sentinel-2 satellite data and Google Earth Engine (GEE) in the coastal regions of Hong Kong. Leveraging Long Short-Term Memory (LSTM) Recurrent Neural Networks, the study incorporates extensive temporal datasets to enhance prediction accuracy. The models utilize spectral data from Sentinel-2, focusing on optically active components, and demonstrate that selected variables closely align with the spectral characteristics of Chl-a and SS. The results indicate improved predictive performance over previous methods, highlighting the potential for remote sensing technology in continuous and comprehensive water quality assessment.


DAAD: Dynamic Analysis and Adaptive Discriminator for Fake News Detection

arXiv.org Artificial Intelligence

In the current web environment, fake news spreads rapidly across online social networks, posing serious threats to society. Existing multimodal fake news detection (MFND) methods can be classified into knowledge-based and semantic-based approaches. However, these methods are overly dependent on human expertise and feedback, lacking flexibility. To address this challenge, we propose a Dynamic Analysis and Adaptive Discriminator (DAAD) approach for fake news detection. For knowledge-based methods, we introduce the Monte Carlo Tree Search (MCTS) algorithm to leverage the self-reflective capabilities of large language models (LLMs) for prompt optimization, providing richer, domain-specific details and guidance to the LLMs, while enabling more flexible integration of LLM comments on news content. For semantic-based methods, we define four typical deceit patterns: emotional exaggeration, logical inconsistency, image manipulation, and semantic inconsistency, to reveal the mechanisms behind fake news creation. To detect these patterns, we carefully design four discriminators and expand them in depth and breadth, using the soft-routing mechanism to explore optimal detection models. Experimental results on three real-world datasets demonstrate the superiority of our approach. The code will be available at: https://github.com/SuXinqi/DAAD.
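The abstract mentions a soft-routing mechanism over four discriminators but does not describe it. A generic form of soft routing weights each discriminator's output by a softmax over gate scores, so every discriminator contributes rather than a single one being selected. The sketch below shows that generic idea only. It is not the DAAD implementation, and the gate scores and per-discriminator outputs are invented inputs.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: non-negative weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_route(gate_scores: Sequence[float], outputs: Sequence[float]) -> float:
    """Blend discriminator outputs using softmax weights from the gate scores."""
    weights = softmax(gate_scores)
    return sum(w * o for w, o in zip(weights, outputs))


# Hypothetical fake-news scores from four discriminators, in the order the
# abstract lists the deceit patterns (emotional exaggeration, logical
# inconsistency, image manipulation, semantic inconsistency).
outputs = [0.2, 0.4, 0.6, 0.8]
uniform_blend = soft_route([0.0, 0.0, 0.0, 0.0], outputs)   # equal gates: plain average
focused_blend = soft_route([0.0, 0.0, 0.0, 50.0], outputs)  # one gate dominates
```

With equal gate scores the blend reduces to a plain average. As one gate score grows, the blend approaches that discriminator's output alone.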


Watch a huge 'No Boys Allowed' shark slumber party

Popular Science

It appears that no boy sharks were invited to this gathering of sleeping female Port Jackson sharks (Heterodontus portusjacksoni) in Australia. The fish were spotted snuggled up along the seafloor at Beagle Marine Park in the central Bass Strait. "There were thousands of sharks tightly packed like a carpet spread across the seafloor," voyage leader and University of Tasmania quantitative marine spatial ecologist Jacquomo Monk said in a statement. "Port Jackson sharks grow to 1.65 meters [5.4 feet] in length and are found across southern Australia." Scientists supported by Australia's National Environmental Science Program from the South Australian Research and Development Institute's research vessel MRV Ngerin were operating an underwater robot when they spotted and recorded the gathering.


DataNarrative: Automated Data-Driven Storytelling with Visualizations and Texts

arXiv.org Artificial Intelligence

Data-driven storytelling is a powerful method for conveying insights by combining narrative techniques with visualizations and text. These stories integrate visual aids, such as highlighted bars and lines in charts, along with textual annotations explaining insights. However, creating such stories requires a deep understanding of the data and meticulous narrative planning, often necessitating human intervention, which can be time-consuming and mentally taxing. While Large Language Models (LLMs) excel in various NLP tasks, their ability to generate coherent and comprehensive data stories remains underexplored. In this work, we introduce a novel task for data story generation and a benchmark containing 1,449 stories from diverse sources. To address the challenges of crafting coherent data stories, we propose a multiagent framework employing two LLM agents designed to replicate the human storytelling process: one for understanding and describing the data (Reflection), generating the outline, and narration, and another for verification at each intermediary step. While our agentic framework generally outperforms non-agentic counterparts in both model-based and human evaluations, the results also reveal unique challenges in data story generation.


Enhancing Complex Causality Extraction via Improved Subtask Interaction and Knowledge Fusion

arXiv.org Artificial Intelligence

Event Causality Extraction (ECE) aims at extracting causal event pairs from texts. Despite ChatGPT's recent success, fine-tuning small models remains the best approach for the ECE task. However, existing fine-tuning based ECE methods cannot address all three key challenges in ECE simultaneously: 1) Complex Causality Extraction, where multiple cause-effect pairs occur within a single sentence; 2) Subtask Interaction, which involves modeling the mutual dependence between the two subtasks of ECE, i.e., extracting events and identifying the causal relationship between extracted events; and 3) Knowledge Fusion, which requires effectively fusing the knowledge in two modalities, i.e., the expressive pretrained language models and the structured knowledge graphs. In this paper, we propose a unified ECE framework (UniCE) to address all three issues in ECE simultaneously. Specifically, we design a subtask interaction mechanism to enable mutual interaction between the two ECE subtasks. Besides, we design a knowledge fusion mechanism to fuse knowledge in the two modalities. Furthermore, we employ separate decoders for each subtask to facilitate complex causality extraction. Experiments on three benchmark datasets demonstrate that our method achieves state-of-the-art performance and outperforms ChatGPT by a margin of at least 30% F1-score. More importantly, our model can also be used to effectively improve the ECE performance of ChatGPT via in-context learning.
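The abstract reports results as F1-score without stating the matching criterion. A minimal sketch of how F1 is typically computed over extracted (cause, effect) pairs is given below. It assumes exact-match scoring of pairs, which may differ from the paper's evaluation protocol, and the example sentence and pairs are invented.

```python
from typing import Iterable, Tuple

Pair = Tuple[str, str]  # (cause event, effect event)


def pair_f1(predicted: Iterable[Pair], gold: Iterable[Pair]) -> float:
    """F1 over exactly matching (cause, effect) pairs."""
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A sentence with two gold causal pairs (the "complex causality" case);
# the hypothetical system recovers one and invents another.
gold = [("heavy rain", "flooding"), ("flooding", "road closures")]
predicted = [("heavy rain", "flooding"), ("heavy rain", "road closures")]
score = pair_f1(predicted, gold)
```

Here precision and recall are both 0.5, so the F1-score is 0.5. Direction matters under this scoring: a pair with cause and effect swapped counts as a miss.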