Atlantic Ocean
War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars
Hua, Wenyue, Fan, Lizhou, Li, Lingyao, Mei, Kai, Ji, Jianchao, Ge, Yingqiang, Hemphill, Libby, Zhang, Yongfeng
Can we avoid wars at the crossroads of history? This question has been pursued by individuals, scholars, policymakers, and organizations throughout human history. In this research, we attempt to answer the question based on the recent advances of Artificial Intelligence (AI) and Large Language Models (LLMs). We propose WarAgent, an LLM-powered multi-agent AI system, to simulate the participating countries, their decisions, and the consequences, in historical international conflicts, including World War I (WWI), World War II (WWII), and the Warring States Period (WSP) in Ancient China. By evaluating the simulation effectiveness, we examine the advancements and limitations of cutting-edge AI systems' abilities in studying complex collective human behaviors such as international conflicts under diverse settings. In these simulations, the emergent interactions among agents also offer a novel perspective for examining the triggers and conditions that lead to war. Our findings offer data-driven and AI-augmented insights that can redefine how we approach conflict resolution and peacekeeping strategies. The implications stretch beyond historical analysis, offering a blueprint for using AI to understand human history and possibly prevent future international conflicts. Code and data are available at https://github.com/agiresearch/WarAgent.
ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks
Ma, Bolei, Nie, Ercong, Yuan, Shuzhou, Schmid, Helmut, Färber, Michael, Kreuter, Frauke, Schütze, Hinrich
Prompt-based methods have been successfully applied to multilingual pretrained language models for zero-shot cross-lingual understanding. However, most previous studies primarily focused on sentence-level classification tasks, and only a few considered token-level labeling tasks such as Named Entity Recognition (NER) and Part-of-Speech (POS) tagging. In this paper, we propose Token-Level Prompt Decomposition (ToPro), which facilitates the prompt-based method for token-level sequence labeling tasks. The ToPro method decomposes an input sentence into single tokens and applies one prompt template to each token. Our experiments on multilingual NER and POS tagging datasets demonstrate that ToPro-based fine-tuning outperforms Vanilla fine-tuning and Prompt-Tuning in zero-shot cross-lingual transfer, especially for languages that are typologically different from the source language English. Our method also attains state-of-the-art performance when employed with the mT5 model. Besides, our exploratory study in multilingual large language models shows that ToPro performs much better than the current in-context learning method. Overall, the performance improvements show that ToPro could potentially serve as a novel and simple benchmarking method for sequence labeling tasks.
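The decomposition step described above (one prompt per token) can be sketched as follows. This is a minimal illustration only: the template wording, the `[MASK]` placeholder, and the function name are assumptions made for the example, not the template used by the ToPro authors.

```python
# Hypothetical prompt template; the actual ToPro template wording may differ.
TEMPLATE = 'Sentence: {sentence} The label of the word "{token}" is [MASK].'


def decompose_into_prompts(tokens, template=TEMPLATE):
    """Build one prompt per token, each carrying the full sentence as context."""
    sentence = " ".join(tokens)
    return [template.format(sentence=sentence, token=tok) for tok in tokens]


if __name__ == "__main__":
    for prompt in decompose_into_prompts(["Paris", "is", "nice"]):
        print(prompt)
```

A model would then predict a label for each prompt independently, so a sentence of n tokens yields n classification queries.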
Gravity-Informed Deep Learning Framework for Predicting Ship Traffic Flow and Invasion Risk of Non-Indigenous Species via Ballast Water Discharge
Song, Ruixin, Spadon, Gabriel, Pelot, Ronald, Matwin, Stan, Soares, Amilcar
Invasive species in water bodies pose a major threat to the environment and biodiversity globally. Due to increased transportation and trade, non-native species have been introduced to new environments, causing damage to ecosystems and leading to economic losses in agriculture, forestry, and fisheries. Therefore, there is a pressing need for risk assessment and management techniques to mitigate the impact of these invasions. This study aims to develop a new physics-inspired model to forecast maritime shipping traffic and thus inform risk assessment of invasive species spread through global transportation networks. Inspired by the gravity model for international trade, our model considers various factors that influence the likelihood and impact of vessel activities, such as shipping flux density, distance between ports, trade flow, and centrality measures of transportation hubs. Additionally, by analyzing the risk network of invasive species, we provide a comprehensive framework for assessing the invasion threat level given a pair of origin and destination. Accordingly, this paper introduces transformers to gravity models to rebuild the short- and long-term dependencies that make the risk analysis feasible. Thus, we introduce a physics-inspired framework that achieves an 89% segmentation accuracy for existing and non-existing trajectories and an 84.8% accuracy for the number of vessels flowing between key port areas, representing more than 10% improvement over the traditional deep-gravity model. Along these lines, this research contributes to a better understanding of invasive species risk assessment. It allows policymakers, conservationists, and stakeholders to prioritize management actions by identifying high-risk invasion pathways. Besides, our model is versatile and can include new data sources, making it suitable for assessing species invasion risks in a changing global landscape.
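For readers unfamiliar with the gravity model that inspires this framework, the classical form can be sketched as below. This is the textbook formulation (flow proportional to the product of two "masses" and inversely proportional to a power of distance), not the transformer-based model of the paper; the function name, parameter names, and default exponents are assumptions chosen for illustration.

```python
def gravity_flow(mass_origin, mass_dest, distance,
                 g=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Classical gravity model: g * m_i^alpha * m_j^beta / d_ij^gamma.

    The masses could stand for, e.g., port throughput or trade volume
    (an assumption for this sketch); distance must be positive.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return g * mass_origin ** alpha * mass_dest ** beta / distance ** gamma


if __name__ == "__main__":
    # Two ports with masses 10 and 20 separated by distance 5.
    print(gravity_flow(10.0, 20.0, 5.0))
```

In a learned variant, the fixed exponents would be replaced by a model fitted to observed traffic.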
Estimation of AMOC transition probabilities using a machine learning based rare-event algorithm
Jacques-Dumas, Valérian, van Westen, René M., Dijkstra, Henk A.
The Atlantic Meridional Overturning Circulation (AMOC) is an important component of the global climate, known to be a tipping element, as it could collapse under global warming. The main objective of this study is to compute the probability that the AMOC collapses within a specified time window, using a rare-event algorithm called Trajectory-Adaptive Multilevel Splitting (TAMS). However, the efficiency and accuracy of TAMS depend on the choice of the score function. Although the definition of the optimal score function, called the "committor function", is known, it is impossible in general to compute it a priori. Here, we combine TAMS with a Next-Generation Reservoir Computing technique that estimates the committor function from the data generated by the rare-event algorithm. We test this technique in a stochastic box model of the AMOC for which two types of transition exist, the so-called F(ast)-transitions and S(low)-transitions. Results for the F-transitions compare favorably with those in the literature where a physically-informed score function was used. We show that coupling a rare-event algorithm with machine learning allows for a correct estimation of transition probabilities, transition times, and even transition paths for a wide range of model parameters. We then extend these results to the more difficult problem of S-transitions in the same model. In both cases of F- and S-transitions, we also show how the Next-Generation Reservoir Computing technique can be interpreted to retrieve an analytical estimate of the committor function.
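To make the target quantity concrete, the sketch below estimates the probability of a transition within a time window by brute-force Monte Carlo. It is not TAMS and not the AMOC box model: it uses a generic one-dimensional double-well system, dX = (X - X^3) dt + sigma dW, chosen as an assumption for illustration, and all names and parameter values are hypothetical. Rare-event algorithms such as TAMS exist precisely because this direct approach needs a prohibitive number of trajectories when the probability is small.

```python
import math
import random


def transition_probability(x0, sigma, t_max, n_traj=200, dt=0.01,
                           x_target=1.0, seed=0):
    """Fraction of Euler-Maruyama trajectories started at x0 that reach
    x >= x_target before time t_max, for dX = (X - X^3) dt + sigma dW."""
    rng = random.Random(seed)
    n_steps = int(round(t_max / dt))
    hits = 0
    for _ in range(n_traj):
        x = x0
        reached = x >= x_target
        for _ in range(n_steps):
            if reached:
                break
            noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            x += (x - x ** 3) * dt + noise
            reached = x >= x_target
        hits += reached
    return hits / n_traj


if __name__ == "__main__":
    # Start in the left well (x = -1) and ask how often the right well is reached.
    print(transition_probability(-1.0, sigma=0.7, t_max=20.0))
```

Viewed as a function of the starting point `x0`, this estimate plays the role of a finite-time committor for the toy system.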
Data-Driven Strategies for Coping with Incomplete DVL Measurements
Autonomous underwater vehicles are specialized platforms engineered for deep underwater operations. Critical to their functionality is autonomous navigation, typically relying on an inertial navigation system and a Doppler velocity log (DVL). In real-world scenarios, incomplete Doppler velocity log measurements occur, resulting in positioning errors and mission aborts. To cope with such situations, both model-based and learning-based approaches have been derived. This paper presents a comparative analysis of two cutting-edge deep learning methodologies, namely LiBeamsNet and MissBeamNet, alongside a model-based average estimator. These approaches are evaluated for their efficacy in regressing missing Doppler velocity log beams when two beams are unavailable. In our study, we used data recorded by a DVL mounted on an autonomous underwater vehicle operated in the Mediterranean Sea. We found that both deep learning architectures outperformed model-based approaches by over 16% in velocity prediction accuracy.
Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation
Tang, Xinyu, Shin, Richard, Inan, Huseyin A., Manoel, Andre, Mireshghallah, Fatemehsadat, Lin, Zinan, Gopi, Sivakanth, Kulkarni, Janardhan, Sim, Robert
We study the problem of in-context learning (ICL) with large language models (LLMs) on private datasets. This scenario poses privacy risks, as LLMs may leak or regurgitate the private examples demonstrated in the prompt. We propose a novel algorithm that generates synthetic few-shot demonstrations from the private dataset with formal differential privacy (DP) guarantees, and show empirically that it can achieve effective ICL. We conduct extensive experiments on standard benchmarks and compare our algorithm with non-private ICL and zero-shot solutions. Our results demonstrate that our algorithm can achieve competitive performance with strong privacy levels.
The emergence of in-context learning (ICL) with large language models (LLMs), popularized by the seminal work of Brown et al. (2020), has revolutionized the field of natural language processing and machine learning; see Dong et al. (2023) for a survey on ICL and the references therein. In-context learning involves downstream task adaptation without modifying a pre-trained model's weights. This is achieved by conditioning the model through a series of demonstrations of the task at hand appended as a prompt. An advantage of ICL is that it offers a cost-effective and adaptable alternative to finetuning LLMs. By leveraging the model's pre-trained knowledge, it enables efficient generalization across tasks, allows for quick adaptation to new domains or concepts, and requires only a handful of labeled examples for adaptation. However, privacy is a concern when deploying LLMs with users' data incorporated into prompts. As an example, consider healthcare AI applications, where clinical reports belonging to the patients may be used as demonstrations to provide relevant context to the LLM to answer queries. A malicious adversary might attempt to circumvent API restrictions through jailbreaking, thereby gaining direct access to the demonstrations as depicted in Figure 1.
More generally, it is a major concern that LLMs may regurgitate prompt data in their output (Priyanshu et al., 2023; Duan et al., 2023; Wang et al., 2023). These scenarios raise privacy risks regarding the data used for constructing the prompt.
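The way demonstrations enter an ICL prompt, and hence why they can leak, can be illustrated with a minimal sketch. The "Input:/Label:" layout and the function name are assumptions for this example, not the prompt format used by the authors; in the proposed approach, the demonstrations passed in would be differentially private synthetic examples rather than raw private records.

```python
def build_icl_prompt(demonstrations, query, instruction=""):
    """Concatenate (text, label) demonstrations and a final unlabeled query."""
    parts = []
    if instruction:
        parts.append(instruction)
    for text, label in demonstrations:
        parts.append(f"Input: {text}\nLabel: {label}")
    # The model is expected to continue the text after the final "Label:".
    parts.append(f"Input: {query}\nLabel:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    demos = [("The film was wonderful.", "positive"),
             ("I want my money back.", "negative")]
    print(build_icl_prompt(demos, "A pleasant surprise."))
```

Because every demonstration appears verbatim in the prompt, any output that echoes the prompt exposes the underlying examples.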
SCENE: Self-Labeled Counterfactuals for Extrapolating to Negative Examples
Fu, Deqing, Godbole, Ameya, Jia, Robin
Detecting negatives (such as non-entailment relationships, unanswerable questions, and false claims) is an important and challenging aspect of many natural language understanding tasks. Though manually collecting challenging negative examples can help models detect them, it is both costly and domain-specific. In this work, we propose Self-labeled Counterfactuals for Extrapolating to Negative Examples (SCENE), an automatic method for synthesizing training data that greatly improves models' ability to detect challenging negative examples. In contrast with standard data augmentation, which synthesizes new examples for existing labels, SCENE can synthesize negative examples zero-shot from only positive ones. Given a positive example, SCENE perturbs it with a mask infilling model, then determines whether the resulting example is negative based on a self-training heuristic. With access to only answerable training examples, SCENE can close 69.6% of the performance gap on SQuAD 2.0, a dataset where half of the evaluation examples are unanswerable, compared to a model trained on SQuAD 2.0. Our method also extends to boolean question answering and recognizing textual entailment, and improves generalization from SQuAD to ACE-whQA, an out-of-domain extractive QA benchmark.
Russia-Ukraine war: List of key events, day 702
Ukraine's air force said Russia launched 14 attack drones and five missiles on the southern Black Sea regions with air defence systems destroying 11 of the drones. The Ministry of Internal Affairs of Ukraine said six people were injured in the historic city of Odesa and residential buildings and a warehouse were damaged. Ukrainian security sources said they orchestrated a drone attack on an oil refinery in the southern Russian town of Tuapse, about 240 kilometres (150 miles) southeast of the Russian-annexed Crimean peninsula. The attack caused a major fire, but there were no reports of casualties. Nepal's Foreign Minister Narayan Prakash Saud told the Associated Press news agency that Nepal had asked Russia to send back hundreds of Nepali nationals who had been recruited to fight against Ukraine and repatriate the bodies of those who had died in the conflict.
Validating Climate Models with Spherical Convolutional Wasserstein Distance
Garrett, Robert C., Harris, Trevor, Li, Bo, Wang, Zhuo
We introduce the spherical convolutional Wasserstein distance to more comprehensively measure differences between climate models and reanalysis data. This new similarity measure accounts for spatial variability using convolutional projections and quantifies local differences in the distribution of climate variables. We apply this method to evaluate the historical model outputs of the Coupled Model Intercomparison Project [...]
[...] historical simulations coincide with observational measurements, we can compare each model's synthetic climate distribution to the distribution of observational or quasiobservational data products (Räisänen, 2007), to assess their reconstructive skill. For complete spatial coverage we compare against reanalysis data, a blend of observations and short-range weather forecasts through data assimilation (Bengtsson et al., 2004). This has become one popular climate model validation method (Flato et al., 2014).
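The Wasserstein distance underlying this measure can be illustrated in its simplest setting: two one-dimensional empirical samples of equal size, for which the 1-Wasserstein distance reduces to the mean absolute difference between sorted values. The sketch below covers only this basic case, as an assumption-laden illustration; it does not implement the spherical convolutional variant, and the function name is hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical samples.

    For equal sample sizes, the optimal transport plan pairs the k-th
    smallest value of xs with the k-th smallest value of ys.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


if __name__ == "__main__":
    model_temps = [0.0, 1.0, 2.0]        # illustrative values only
    reanalysis_temps = [3.0, 1.0, 2.0]   # illustrative values only
    print(wasserstein_1d(model_temps, reanalysis_temps))
```

Unlike a comparison of means, this distance is sensitive to differences in the whole distribution, including its spread and tails.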
Next-Generation Earth System Models: Towards Reliable Hybrid Models for Weather and Climate Applications
Beucler, Tom, Koch, Erwan, Kotlarski, Sven, Leutwyler, David, Michel, Adrien, Koh, Jonathan
Recommendation 1: Develop Hybrid AI-Physical Models: Emphasize the integration of AI and physical modeling for improved reliability, especially for longer prediction horizons, acknowledging the delicate balance between knowledge-based and data-driven components required for optimal performance.
Recommendation 2: Emphasize Robustness in AI Downscaling Approaches, favoring techniques that respect physical laws, preserve inter-variable dependencies and spatial structures, and accurately represent extremes at the local scale.
Recommendation 3: Promote Inclusive Model Development: Ensure Earth System Model development is open and accessible to diverse stakeholders, enabling forecasters, the public, and AI/statistics experts to use, develop, and engage with the model and its predictions/projections.
Figure Caption: Advancements in data collection, data access, hybrid AI-physical Earth system modeling, and downscaling empower stakeholders with increased accessibility to local predictions and projections, encouraging collaborative efforts across disciplines to improve climate change preparedness.
Here, we review how machine learning has [...] that can be integrated forward in time, serve the double purpose of understanding and prediction [...]
[...] interactions (Rosenfeld et al., 2014). In the ocean, uncertainties persist due to unresolved mesoscale eddies and turbulent processes (Couldrey et al., 2021).