AITopics | Antarctica

How Does the Spatial Distribution of Pre-training Data Affect Geospatial Foundation Models?

Purohit, Mirali, Muhawenayo, Gedeon, Rolf, Esther, Kerner, Hannah

arXiv.org Artificial Intelligence, Jan-21-2025

Foundation models have made rapid advances in many domains including Earth observation, where Geospatial Foundation Models (GFMs) can help address global challenges such as climate change, agriculture, and disaster response. Previous work on GFMs focused on tailoring model architecture and pre-text tasks, and did not investigate the impact of pre-training data selection on model performance. However, recent works from other domains show that the pre-training data distribution is an important factor influencing the performance of the foundation models. With this motivation, our research explores how the geographic distribution of pre-training data affects the performance of GFMs. We evaluated several pre-training data distributions by sampling different compositions from a global data pool. Our experiments with two GFMs on downstream tasks indicate that balanced and globally representative data compositions often outperform region-specific sampling, highlighting the importance of diversity and global coverage in pre-training data. Our results suggest that the most appropriate data sampling technique may depend on the specific GFM architecture. These findings will support the development of robust GFMs by incorporating quality pre-training data distributions, ultimately improving machine learning solutions for Earth observation.
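
As an illustration of what "sampling different compositions from a global data pool" can look like, here is a minimal Python sketch. It is not the authors' pipeline: the `region` field, the helper names `sample_balanced` and `sample_region`, and the toy pool are all hypothetical, chosen only to contrast a balanced composition with a region-specific one.

```python
import random
from collections import Counter, defaultdict


def sample_balanced(pool, n, seed=0):
    """Draw about n/k items from each of the k regions present in the pool."""
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for item in pool:
        by_region[item["region"]].append(item)
    per_region = n // len(by_region)
    sample = []
    for region in sorted(by_region):
        items = by_region[region]
        # A region with fewer items than its quota contributes all it has.
        sample.extend(rng.sample(items, min(per_region, len(items))))
    return sample


def sample_region(pool, n, region, seed=0):
    """Draw up to n items from a single region only."""
    rng = random.Random(seed)
    items = [item for item in pool if item["region"] == region]
    return rng.sample(items, min(n, len(items)))


# Hypothetical, deliberately imbalanced pool of image tiles.
regions = ["Africa"] * 50 + ["Europe"] * 200 + ["Asia"] * 100
pool = [{"id": i, "region": r} for i, r in enumerate(regions)]

balanced = sample_balanced(pool, 90)
regional = sample_region(pool, 90, "Europe")

print(Counter(item["region"] for item in balanced))
print(Counter(item["region"] for item in regional))
```

Both samples have the same size, so a comparison between models pre-trained on each would isolate the effect of geographic composition rather than data volume.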

artificial intelligence, cropharvest, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2501.12535

Country:

South America (0.04)
Oceania (0.04)
Europe (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Anomalous Agreement: How to find the Ideal Number of Anomaly Classes in Correlated, Multivariate Time Series Data

Rewicki, Ferdinand, Denzler, Joachim, Niebling, Julia

arXiv.org Machine Learning, Jan-13-2025

Detecting and classifying abnormal system states is critical for condition monitoring, but supervised methods often fall short due to the rarity of anomalies and the lack of labeled data. Therefore, clustering is often used to group similar abnormal behavior. However, evaluating cluster quality without ground truth is challenging, as existing measures such as the Silhouette Score (SSC) only evaluate the cohesion and separation of clusters and ignore possible prior knowledge about the data. To address this challenge, we introduce the Synchronized Anomaly Agreement Index (SAAI), which exploits the synchronicity of anomalies across multivariate time series to assess cluster quality. We demonstrate the effectiveness of SAAI by showing that maximizing SAAI improves accuracy on the task of finding the true number of anomaly classes K in correlated time series by 0.23 compared to SSC and by 0.32 compared to X-Means. We also show that clusters obtained by maximizing SAAI are easier to interpret compared to SSC.
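
For reference, the Silhouette Score that the abstract uses as a baseline can be computed as follows. This is a plain-Python sketch of the standard silhouette definition with Euclidean distance, not of SAAI, whose formula the abstract does not give. The function name and the toy points are illustrative.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_score(points, [0, 0, 1, 1]))  # well-separated grouping
print(silhouette_score(points, [0, 1, 0, 1]))  # poor grouping
```

The score depends only on distances within and between clusters, which is the limitation the abstract points to: it cannot use prior knowledge such as anomalies occurring at the same time across series.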

anomaly, saai, time series, (15 more...)

arXiv.org Machine Learning

2501.07172

Country:

Europe > Lithuania > Vilnius County > Vilnius (0.04)
Europe > Germany (0.04)
Asia > Singapore (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)

Zero-shot Shark Tracking and Biometrics from Aerial Imagery

Lalgudi, Chinmay K, Leone, Mark E, Clark, Jaden V, Madrigal-Mora, Sergio, Espinoza, Mario

arXiv.org Artificial Intelligence, Jan-10-2025

The recent widespread adoption of drones for studying marine animals provides opportunities for deriving biological information from aerial imagery. The large scale of imagery data acquired from drones is well suited for machine learning (ML) analysis. Development of ML models for analyzing marine animal aerial imagery has followed the classical paradigm of training, testing, and deploying a new model for each dataset, requiring significant time, human effort, and ML expertise. We introduce Frame Level ALIgment and tRacking (FLAIR), which leverages the video understanding of Segment Anything Model 2 (SAM2) and the vision-language capabilities of Contrastive Language-Image Pre-training (CLIP). FLAIR takes a drone video as input and outputs segmentation masks of the species of interest across the video. Notably, FLAIR leverages a zero-shot approach, eliminating the need for labeled data, training a new model, or fine-tuning an existing model to generalize to other species. With a dataset of 18,000 drone images of Pacific nurse sharks, we trained state-of-the-art object detection models to compare against FLAIR. We show that FLAIR massively outperforms these object detectors and performs competitively against two human-in-the-loop methods for prompting SAM2, achieving a Dice score of 0.81. FLAIR readily generalizes to other shark species without additional human effort and can be combined with novel heuristics to automatically extract relevant information including length and tailbeat frequency. FLAIR has significant potential to accelerate aerial imagery analysis workflows, requiring markedly less human effort and expertise than traditional machine learning workflows, while achieving superior accuracy. By reducing the effort required for aerial imagery analysis, FLAIR allows scientists to spend more time interpreting results and deriving insights about marine ecosystems.
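
The Dice score reported above measures the overlap between a predicted mask A and a reference mask B as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that metric, assuming binary masks stored as nested lists of 0/1. The function name and the example masks are illustrative and are not taken from FLAIR.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(dice_score(pred, target))  # 2 shared pixels, 3 + 3 foreground pixels
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground pixels.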

artificial intelligence, machine learning, shark, (16 more...)

arXiv.org Artificial Intelligence

2501.05717

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
Pacific Ocean (0.04)
Oceania > New Zealand > North Island > Wellington Region > Wellington (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Information Technology (0.68)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Video game giant Ubisoft delays release date of Assassin's Creed Shadows again

BBC News, Jan-9-2025, 18:42:14 GMT

Video game giant Ubisoft has announced a further delay to its upcoming Assassin's Creed Shadows. The long-running series is one of the French publisher's flagship franchises, with recent instalment, Valhalla, reportedly making more than 1bn. Assassin's Creed Shadows, set in 16th Century Japan, was due to be released on PC, PlayStation and Xbox last November before an initial delay to February 2025. Announcing the new release date of 20 March, executive producer Marc-Alexis Coté said a "few additional weeks are needed" to ensure the game's launch goes smoothly. Players complained that Ubisoft's major 2024 release, Star Wars Outlaws, was launched with bugs and glitches.

assassin, creed shadow, ubisoft, (7 more...)

BBC News

Country:

Asia > Japan (0.28)
South America (0.16)
North America > Central America (0.16)
(14 more...)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Games > Computer Games (1.00)

Comparison Study: Glacier Calving Front Delineation in Synthetic Aperture Radar Images With Deep Learning

Gourmelon, Nora, Heidler, Konrad, Loebel, Erik, Cheng, Daniel, Klink, Julian, Dong, Anda, Wu, Fei, Maul, Noah, Koch, Moritz, Dreier, Marcel, Pyles, Dakota, Seehaus, Thorsten, Braun, Matthias, Maier, Andreas, Christlein, Vincent

arXiv.org Artificial Intelligence, Jan-9-2025

Calving front position variation of marine-terminating glaciers is an indicator of ice mass loss and a crucial parameter in numerical glacier models. Deep Learning (DL) systems can automatically extract this position from Synthetic Aperture Radar (SAR) imagery, enabling continuous, weather- and illumination-independent, large-scale monitoring. This study presents the first comparison of DL systems on a common calving front benchmark dataset. A multi-annotator study with ten annotators is performed to contrast the best-performing DL system against human performance. The best DL model's outputs deviate 221 m on average, while the average deviation of the human annotators is 38 m. This significant difference shows that current DL systems do not yet match human performance and that further research is needed to enable fully automated monitoring of glacier calving fronts. The study of Vision Transformers, foundation models, and the inclusion and processing strategy of more information are identified as avenues for future research.

artificial intelligence, glacier, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2501.05281

Country:

Europe > Germany > Bavaria (0.28)
North America > United States > California (0.28)
Asia > China (0.28)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.67)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

How 'scientist' whales are helping uncover the secrets of climate change

Al Jazeera, Dec-29-2024, 12:01:38 GMT

I arrive in Hermanus, a picturesque South African coastal village an hour-and-a-half from Cape Town, at about 11am on a sunny October morning. Ignoring the restaurants and art galleries on the main drag and the throngs of tourists watching southern right whales from the cliff path, I drive straight to the harbour to meet Els Vermeulen, the Belgium-born scientist who heads up the whale unit for the University of Pretoria's Mammal Research Institute. She is waiting for her colleagues to return from the last whale-tagging sortie of the 2024 season. "I would normally be out on the boat with the team," says Vermeulen, who is dressed in a bold geometric print dress and a denim jacket. "But I had to drop my kids at school and couldn't get down here early enough." The water next to the concrete pier is so clear that I can see a giant orange starfish inching its way along the rocky seabed.

scientist, vermeulen, whale, (15 more...)

Al Jazeera

Country:

Africa > South Africa > Western Cape > Cape Town (0.25)
Africa > South Africa > Gauteng > Pretoria (0.25)
Europe > Belgium (0.24)
(7 more...)

Technology: Information Technology > Artificial Intelligence (0.95)

Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models

Banerjee, Somnath, Layek, Sayan, Shrawgi, Hari, Mandal, Rajarshi, Halder, Avik, Kumar, Shanu, Basu, Sagnik, Agrawal, Parag, Hazra, Rima, Mukherjee, Animesh

arXiv.org Artificial Intelligence, Dec-23-2024

As LLMs are increasingly deployed in global applications, the importance of cultural sensitivity becomes paramount, ensuring that users from diverse backgrounds feel respected and understood. Cultural harm can arise when these models fail to align with specific cultural norms, resulting in misrepresentations or violations of cultural values. This work addresses the challenges of ensuring cultural sensitivity in LLMs, especially in small-parameter models that often lack the extensive training data needed to capture global cultural nuances. We present two key contributions: (1) A cultural harm test dataset, created to assess model outputs across different cultural contexts through scenarios that expose potential cultural insensitivities, and (2) A culturally aligned preference dataset, aimed at restoring cultural sensitivity through fine-tuning based on feedback from diverse annotators. These datasets facilitate the evaluation and enhancement of LLMs, ensuring their ethical and safe deployment across different cultural landscapes. Our results show that integrating culturally aligned feedback leads to a marked improvement in model behavior, significantly reducing the likelihood of generating culturally insensitive or harmful content. Ultimately, this work paves the way for more inclusive and respectful AI systems, fostering a future where LLMs can safely and ethically navigate the complexities of diverse cultural landscapes.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2410.1288

Country:

Asia > North Korea (0.27)
Europe > Russia (0.14)
Asia > Russia (0.14)
(33 more...)

Genre:

Research Report > New Finding (1.00)
Personal > Interview (0.93)

Industry:

Media > News (1.00)
Law > Criminal Law (1.00)
Law > Civil Rights & Constitutional Law (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Uncertainties of Satellite-based Essential Climate Variables from Deep Learning

Gou, Junyang, Salberg, Arnt-Børre, Shahvandi, Mostafa Kiani, Tourian, Mohammad J., Meyer, Ulrich, Boergens, Eva, Waldeland, Anders U., Velicogna, Isabella, Dahl, Fredrik, Jäggi, Adrian, Schindler, Konrad, Soja, Benedikt

arXiv.org Artificial Intelligence, Dec-23-2024

Accurate uncertainty information associated with essential climate variables (ECVs) is crucial for reliable climate modeling and understanding the spatiotemporal evolution of the Earth system. In recent years, geoscience and climate scientists have benefited from rapid progress in deep learning to advance the estimation of ECV products with improved accuracy. However, the quantification of uncertainties associated with the output of such deep learning models has yet to be thoroughly adopted. This survey explores the types of uncertainties associated with ECVs estimated from deep learning and the techniques to quantify them. The focus is on highlighting the importance of quantifying uncertainties inherent in ECV estimates, considering the dynamic and multifaceted nature of climate data. The survey starts by clarifying the definition of aleatoric and epistemic uncertainties and their roles in a typical satellite observation processing workflow, followed by bridging the gap between conventional statistical and deep learning views on uncertainties. Then, we comprehensively review the existing techniques for quantifying uncertainties associated with deep learning algorithms, focusing on their application in ECV studies. The specific need for modification to fit the requirements from both the Earth observation side and the deep learning side in such interdisciplinary tasks is discussed. Finally, we demonstrate our findings with two ECV examples, snow cover and terrestrial water storage, and provide our perspectives for future research.
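
One common way to separate the two kinds of uncertainty named above, assuming an ensemble whose members each predict a mean and a variance for the same input, is the law of total variance: the aleatoric part is the average of the predicted variances and the epistemic part is the variance of the predicted means. The sketch below illustrates only that decomposition. The abstract does not say which techniques the survey favours, and the function name and numbers are made up for the example.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split predictive variance into aleatoric and epistemic parts.

    member_means: predicted mean from each ensemble member for one input.
    member_variances: predicted (data-noise) variance from each member.
    """
    if len(member_means) != len(member_variances):
        raise ValueError("one mean and one variance per ensemble member")
    aleatoric = fmean(member_variances)   # average predicted noise
    epistemic = pvariance(member_means)   # disagreement between members
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical predictions from three ensemble members for one grid cell.
aleatoric, epistemic, total = decompose_uncertainty(
    [1.0, 2.0, 3.0], [0.5, 0.5, 0.5]
)
print(aleatoric, epistemic, total)
```

In this decomposition, adding more training data would be expected to shrink the epistemic term, while the aleatoric term reflects noise in the observations themselves.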

artificial intelligence, epistemic uncertainty, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2412.17506

Country:

Europe > Sweden (0.28)
Europe > Norway (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(19 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable (0.94)
Water & Waste Management > Water Management (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for Foundation Models

Chen, Daoyuan, Huang, Yilun, Pan, Xuchen, Jiang, Nana, Wang, Haibin, Ge, Ce, Chen, Yushuo, Zhang, Wenhao, Ma, Zhijian, Zhang, Yilei, Huang, Jun, Lin, Wei, Li, Yaliang, Ding, Bolin, Zhou, Jingren

arXiv.org Artificial Intelligence, Dec-23-2024

The burgeoning field of foundation models necessitates advanced data processing mechanisms capable of harnessing vast valuable data with varied types utilized by these models. Nevertheless, the current landscape presents unique challenges that traditional data processing frameworks cannot handle effectively, especially with multimodal intricacies. In response, we present Data-Juicer 2.0, a new system offering fruitful data processing capabilities backed by over a hundred operators spanning various modalities like text, image, audio, and video. With seamless compatibility and dedicated optimization to popular dataset hubs like Hugging Face and computing engines like Ray, Data-Juicer 2.0 enhances its predecessor in usability, efficiency, and programmability. It features an easily accessible user interface layer that supports decoupled Python interactions, RESTful APIs, and conversational commands. Alongside this, it contains a core runtime layer optimized for adaptive execution and management across different dataset scales, processing demands, and computational environments, while shielding unnecessary system details. Extensive empirical evaluations demonstrate Data-Juicer 2.0's remarkable performance and scalability, highlighting its capability to efficiently process tens of billions of data samples with tens of thousands of CPU cores. The system is publicly available, actively maintained, and broadly adopted in diverse research endeavors, practical applications, and real-world products such as Alibaba Cloud PAI.

data mining, large language model, machine learning, (25 more...)

arXiv.org Artificial Intelligence

2501.14755

Country:

Antarctica (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Southern Ocean (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Software (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(7 more...)

Enhancing Multi-Text Long Video Generation Consistency without Tuning: Time-Frequency Analysis, Prompt Alignment, and Theory

Li, Xingyao, Zhang, Fengzhuo, Pan, Jiachun, Hou, Yunlong, Tan, Vincent Y. F., Yang, Zhuoran

arXiv.org Artificial Intelligence, Dec-22-2024

Despite the considerable progress achieved in the long video generation problem, there is still significant room to improve the consistency of the videos, particularly in terms of smoothness and transitions between scenes. We address these issues to enhance the consistency and coherence of videos generated with either single or multiple prompts. We propose the Time-frequency based temporal Attention Reweighting Algorithm (TiARA), which meticulously edits the attention score matrix based on the Discrete Short-Time Fourier Transform. Our method is supported by a theoretical guarantee, the first-of-its-kind for frequency-based methods in diffusion models. For videos generated by multiple prompts, we further investigate key factors affecting prompt interpolation quality and propose PromptBlend, an advanced prompt interpolation pipeline. The efficacy of our proposed method is validated via extensive experimental results, exhibiting consistent and impressive improvements over baseline methods. The code will be released upon acceptance.

high quality, video, video generation, (14 more...)

arXiv.org Artificial Intelligence

2412.17254

Country:

South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
Europe > Czechia > Prague (0.04)
Asia > Singapore (0.04)
Antarctica (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
