Arctic Ocean
Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models
Mukherjee, Anjishnu, Zhu, Ziwei, Anastasopoulos, Antonios
In this work, we present a comprehensive three-phase study to examine (1) the effectiveness of large multimodal models (LMMs) in recognizing cultural contexts; (2) the accuracy of their representations of diverse cultures; and (3) their ability to adapt content across cultural boundaries. We first introduce Dalle Street, a large-scale dataset generated by DALL-E 3 and validated by humans, containing 9,935 images of 67 countries and 10 concept classes. We reveal disparities in cultural understanding at the sub-region level with both open-weight (LLaVA) and closed-source (GPT-4V) models on Dalle Street and other existing benchmarks. Next, we assess models' deeper culture understanding by an artifact extraction task and identify over 18,000 artifacts associated with different countries. Finally, we propose a highly composable pipeline, CultureAdapt, to adapt images from culture to culture. Our findings reveal a nuanced picture of the cultural competence of LMMs, highlighting the need to develop culture-aware systems. Dataset and code are available at https://github.com/iamshnoo/crossroads
Deep Learning of Multivariate Extremes via a Geometric Representation
Murphy-Barltrop, Callum J. R., Majumder, Reetam, Richards, Jordan
The study of geometric extremes, where extremal dependence properties are inferred from the deterministic limiting shapes of scaled sample clouds, provides an exciting approach to modelling the extremes of multivariate data. These shapes, termed limit sets, link together several popular extremal dependence modelling frameworks. Although the geometric approach is becoming an increasingly popular modelling tool, current inference techniques are limited to a low dimensional setting (d < 4), and generally require rigid modelling assumptions. In this work, we propose a range of novel theoretical results to aid with the implementation of the geometric extremes framework and introduce the first approach to modelling limit sets using deep learning. By leveraging neural networks, we construct asymptotically-justified yet flexible semi-parametric models for extremal dependence of high-dimensional data. We showcase the efficacy of our deep approach by modelling the complex extremal dependencies between meteorological and oceanographic variables in the North Sea off the coast of the UK.
Applications of Explainable artificial intelligence in Earth system science
Huang, Feini, Jiang, Shijie, Li, Lu, Zhang, Yongkun, Zhang, Ye, Zhang, Ruqing, Li, Qingliang, Li, Danxi, Shangguan, Wei, Dai, Yongjiu
In recent years, artificial intelligence (AI) rapidly accelerated its influence and is expected to promote the development of Earth system science (ESS) if properly harnessed. In application of AI to ESS, a significant hurdle lies in the interpretability conundrum, an inherent problem of black-box nature arising from the complexity of AI algorithms. To address this, explainable AI (XAI) offers a set of powerful tools that make the models more transparent. The purpose of this review is twofold: First, to provide ESS scholars, especially newcomers, with a foundational understanding of XAI, serving as a primer to inspire future research advances; second, to encourage ESS professionals to embrace the benefits of AI, free from preconceived biases due to its lack of interpretability. We begin with elucidating the concept of XAI, along with typical methods. We then delve into a review of XAI applications in the ESS literature, highlighting the important role that XAI has played in facilitating communication with AI model decisions, improving model diagnosis, and uncovering scientific insights. We identify four significant challenges that XAI faces within the ESS, and propose solutions. Furthermore, we provide a comprehensive illustration of multifaceted perspectives. Given the unique challenges in ESS, an interpretable hybrid approach that seamlessly integrates AI with domain-specific knowledge appears to be a promising way to enhance the utility of AI in ESS. A visionary outlook for ESS envisions a harmonious blend where process-based models govern the known, AI models explore the unknown, and XAI bridges the gap by providing explanations.
Are We Done with MMLU?
Gema, Aryo Pradipta, Leang, Joshua Ong Jun, Hong, Giwon, Devoto, Alessio, Mancino, Alberto Carlo Maria, Saxena, Rohit, He, Xuanli, Zhao, Yu, Du, Xiaotang, Madani, Mohammad Reza Ghasemi, Barale, Claire, McHardy, Robert, Harris, Joshua, Kaddour, Jean, van Krieken, Emile, Minervini, Pasquale
We identify and analyse errors in the popular Massive Multitask Language Understanding (MMLU) benchmark. Even though MMLU is widely adopted, our analysis demonstrates numerous ground truth errors that obscure the true capabilities of LLMs. For example, we find that 57% of the analysed questions in the Virology subset contain errors. To address this issue, we introduce a comprehensive framework for identifying dataset errors using a novel error taxonomy. Then, we create MMLU-Redux, which is a subset of 3,000 manually re-annotated questions across 30 MMLU subjects. Using MMLU-Redux, we demonstrate significant discrepancies with the model performance metrics that were originally reported. Our results strongly advocate for revising MMLU's error-ridden questions to enhance its future utility and reliability as a benchmark.
SOMson -- Sonification of Multidimensional Data in Kohonen Maps
Kohonen Maps, aka. Self-organizing maps (SOMs) are neural networks that visualize a high-dimensional feature space on a low-dimensional map. While SOMs are an excellent tool for data examination and exploration, they inherently cause a loss of detail. Visualizations of the underlying data do not integrate well and, therefore, fail to provide an overall picture. Consequently, we suggest SOMson, an interactive sonification of the underlying data, as a data augmentation technique. The sonification increases the amount of information provided simultaneously by the SOM. Instead of a user study, we present an interactive online example, so readers can explore SOMson themselves. Its strengths, weaknesses, and prospects are discussed.
Marrying Causal Representation Learning with Dynamical Systems for Science
Yao, Dingling, Muller, Caroline, Locatello, Francesco
Causal representation learning promises to extend causal models to hidden causal variables from raw entangled measurements. However, most progress has focused on proving identifiability results in different settings, and we are not aware of any successful real-world application. At the same time, the field of dynamical systems benefited from deep learning and scaled to countless applications but does not allow parameter identification. In this paper, we draw a clear connection between the two and their key assumptions, allowing us to apply identifiable methods developed in causal representation learning to dynamical systems. At the same time, we can leverage scalable differentiable solvers developed for differential equations to build models that are both identifiable and practical. Overall, we learn explicitly controllable models that isolate the trajectory-specific parameters for further downstream tasks such as out-of-distribution classification or treatment effect estimation. We experiment with a wind simulator with partially known factors of variation. We also apply the resulting model to real-world climate data and successfully answer downstream causal questions in line with existing literature on climate change.
Trajectory tracking control of a Remotely Operated Underwater Vehicle based on Fuzzy Disturbance Adaptation and Controller Parameter Optimization
The exploration of under-ice environments presents unique challenges due to limited access for scientific research. This report investigates the potential of deploying a fully actuated Remotely Operated Vehicle (ROV) for shallow area exploration beneath ice sheets. Leveraging advancements in marine robotics technology, ROVs offer a promising solution for extending human presence into remote underwater locations. To enable successful under-ice exploration, the ROV must follow precise trajectories for effective localization signal reception. This study develops a multi-input-multi-output (MIMO) nonlinear system controller, incorporating a Lyapunov-based stability guarantee and an adaptation law to mitigate unknown environmental disturbances. Fuzzy logic is employed to dynamically adjust adaptation rates, enhancing performance in highly nonlinear ROV dynamic systems. Additionally, a Particle Swarm Optimization (PSO) algorithm automates the tuning of controller parameters for optimal trajectory tracking. The report details the ROV dynamic model, the proposed control framework, and the PSO-based tuning process. Simulation-based experiments validate the efficacy of the methodology, with experimental results demonstrating superior trajectory tracking performance compared to baseline controllers. This work contributes to the advancement of under-ice exploration capabilities and sets the stage for future research in marine robotics and autonomous underwater systems.
Estimating Direct and Indirect Causal Effects of Spatiotemporal Interventions in Presence of Spatial Interference
Ali, Sahara, Faruque, Omar, Wang, Jianwu
Spatial interference (SI) occurs when the treatment at one location affects the outcomes at other locations. Accounting for spatial interference in spatiotemporal settings poses further challenges as interference violates the stable unit treatment value assumption, making it infeasible for standard causal inference methods to quantify the effects of time-varying treatment at spatially varying outcomes. In this paper, we first formalize the concept of spatial interference in case of time-varying treatment assignments by extending the potential outcome framework under the assumption of no unmeasured confounding. We then propose our deep learning based potential outcome model for spatiotemporal causal inference. We utilize latent factor modeling to reduce the bias due to time-varying confounding while leveraging the power of U-Net architecture to capture global and local spatial interference in data over time. Our causal estimators are an extension of average treatment effect (ATE) for estimating direct (DATE) and indirect effects (IATE) of spatial interference on treated and untreated data. Being the first of its kind deep learning based spatiotemporal causal inference technique, our approach shows advantages over several baseline methods based on the experiment results on two synthetic datasets, with and without spatial interference. Our results on real-world climate dataset also align with domain knowledge, further demonstrating the effectiveness of our proposed method.
Unicorn: U-Net for Sea Ice Forecasting with Convolutional Neural Ordinary Differential Equations
Park, Jaesung, Hong, Sungchul, Cho, Yoonseo, Jeon, Jong-June
Sea ice at the North Pole is vital to global climate dynamics. However, accurately forecasting sea ice poses a significant challenge due to the intricate interaction among multiple variables. Leveraging the capability to integrate multiple inputs and powerful performances seamlessly, many studies have turned to neural networks for sea ice forecasting. This paper introduces a novel deep architecture named Unicorn, designed to forecast weekly sea ice. Our model integrates multiple time series images within its architecture to enhance its forecasting performance. Moreover, we incorporate a bottleneck layer within the U-Net architecture, serving as neural ordinary differential equations with convolution operations, to capture the spatiotemporal dynamics of latent variables. Through real data analysis with datasets spanning from 1998 to 2021, our proposed model demonstrates significant improvements over state-of-the-art models in the sea ice concentration forecasting task. It achieves an average MAE improvement of 12% compared to benchmark models. Additionally, our method outperforms existing approaches in sea ice extent forecasting, achieving a classification performance improvement of approximately 18%. These experimental results show the superiority of our proposed model.
Efficient Bi-manipulation using RGBD Multi-model Fusion based on Attention Mechanism
Shen, Jian, Huang, Jiaxin, Song, Zhigong
Dual-arm robots have great application prospects in intelligent manufacturing due to their human-like structure when deployed with advanced intelligence algorithm. However, the previous visuomotor policy suffers from perception deficiencies in environments where features of images are impaired by the various conditions, such as abnormal lighting, occlusion and shadow etc. The Focal CVAE framework is proposed for RGB-D multi-modal data fusion to address this challenge. In this study, a mixed focal attention module is designed for the fusion of RGB images containing color features and depth images containing 3D shape and structure information. This module highlights the prominent local features and focuses on the relevance of RGB and depth via cross-attention. A saliency attention module is proposed to improve its computational efficiency, which is applied in the encoder and the decoder of the framework. We illustrate the effectiveness of the proposed method via extensive simulation and experiments. It's shown that the performances of bi-manipulation are all significantly improved in the four real-world tasks with lower computational cost. Besides, the robustness is validated through experiments under different scenarios where there is a perception deficiency problem, demonstrating the feasibility of the method.