imagery
SSL4EO-L: Datasets and Foundation Models for Landsat Imagery Adam J. Stewart
The Landsat program is the longest-running Earth observation program in history, with 50+ years of data acquisition by 8 satellites. The multispectral imagery captured by sensors onboard these satellites is critical for a wide range of scientific fields. Despite the increasing popularity of deep learning and remote sensing, the majority of researchers still use decision trees and random forests for Landsat image analysis due to the prevalence of small labeled datasets and lack of foundation models. In this paper, we introduce SSL4EO-L, the first ever dataset designed for Self-Supervised Learning for Earth Observation for the Landsat family of satellites (including 3 sensors and 2 product levels) and the largest Landsat dataset in history (5M image patches). Additionally, we modernize and re-release the L7 Irish and L8 Biome cloud detection datasets, and introduce the first ML benchmark datasets for Landsats 4-5 TM and Landsat 7 ETM+ SR. Finally, we pre-train the first foundation models for Landsat imagery using SSL4EO-L and evaluate their performance on multiple semantic segmentation tasks.
- North America > United States > Virginia (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > Texas (0.04)
- North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
- North America > United States > Colorado (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Energy (1.00)
- Transportation > Infrastructure & Services (0.92)
- Transportation > Ground (0.67)
The State-Led Crackdown on Grok and xAI Has Begun
At least 37 attorneys general for US states and territories are taking action against xAI after Grok generated a flood of nonconsensual sexual images of women and minors. At least 37 attorneys general for US states and territories are taking action against xAI after people used its chatbot, Grok, to generate a flood of sexualized images earlier this year. On Friday, a bipartisan group of 35 attorneys general published an open letter to xAI demanding it "immediately take all available additional steps to protect the public and users of your platforms, especially the women and girls who are the overwhelming target of [non-consensual intimate images]." The letter comes amid an international wave of regulator attention on Grok users creating intimate deepfake images of people without their consent, as well as sexualized images of children. A recent report from the Center for Countering Digital Hate estimates that during an 11-day period starting on December 29, Grok's account on X generated around 3 million photorealistic sexualized images, including around 23,000 sexualized images of children.
- North America > United States > California (0.06)
- North America > United States > Arizona (0.06)
- North America > United States > Utah (0.04)
Use of AI to harm women has only just begun, experts warn
Elon Musk's AI tool, Grok, is being investigated by the UK's media regulator. "Since discovering Grok AI, regular porn doesn't do it for me anymore, it just sounds absurd now," one enthusiast for the Elon Musk-owned AI chatbot wrote on Reddit. Another agreed: "If I want a really specific person, yes." If those who have been horrified by the distribution of sexualised imagery on Grok hoped that last week's belated safeguards could put the genie back in the bottle, there are many such posts on Reddit and elsewhere that tell a different story.
- North America > United States (0.16)
- Europe > United Kingdom (0.16)
- Europe > Ukraine (0.06)
- Oceania > Australia (0.05)
- Media > News (0.91)
- Government > Regional Government (0.73)
- Leisure & Entertainment > Sports (0.71)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.90)
- North America > United States > California (0.04)
- Europe > United Kingdom (0.04)
- Europe > Slovakia (0.04)
Transient Neural Radiance Fields for Lidar View Synthesis and 3D Reconstruction
Neural radiance fields (NeRFs) have become a ubiquitous tool for modeling scene appearance and geometry from multiview imagery. Recent work has also begun to explore how to use additional supervision from lidar or depth sensor measurements in the NeRF framework. However, previous lidar-supervised NeRFs focus on rendering conventional camera imagery and use lidar-derived point cloud data as auxiliary supervision; thus, they fail to incorporate the underlying image formation model of the lidar. Here, we propose a novel method for rendering transient NeRFs that take as input the raw, time-resolved photon count histograms measured by a single-photon lidar system, and we seek to render such histograms from novel views. Different from conventional NeRFs, the approach relies on a time-resolved version of the volume rendering equation to render the lidar measurements and capture transient light transport phenomena at picosecond timescales. We evaluate our method on a first-of-its-kind dataset of simulated and captured transient multiview scans from a prototype single-photon lidar. Overall, our work brings NeRFs to a new dimension of imaging at transient timescales, newly enabling rendering of transient imagery from novel views. Additionally, we show that our approach recovers improved geometry and conventional appearance compared to point cloud-based supervision when training on few input viewpoints. Transient NeRFs may be especially useful for applications which seek to simulate raw lidar measurements for downstream tasks in autonomous driving, robotics, and remote sensing.
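As a rough illustration of what a time-resolved version of volume rendering can look like, the sketch below renders a single ray into a time-of-flight histogram instead of a single pixel value. It is a simplified assumption, not the authors' image formation model: it uses the standard NeRF compositing weights, assigns each sample to a time bin by its round-trip travel time, and omits distance falloff, two-way attenuation, and sensor response terms that a real single-photon lidar model would need. The function name, its parameters, and the toy scene are all hypothetical.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def render_transient_ray(densities, radiances, step_m, bin_width_s, num_bins):
    """Render one ray into a time-of-flight histogram (simplified sketch)."""
    histogram = [0.0] * num_bins
    transmittance = 1.0  # fraction of light that reaches the current sample
    for i, (sigma, radiance) in enumerate(zip(densities, radiances)):
        distance_m = (i + 0.5) * step_m  # sample at the segment midpoint
        alpha = 1.0 - math.exp(-sigma * step_m)  # opacity of this segment
        weight = transmittance * alpha  # standard volume rendering weight
        # Light travels to the sample and back, so the arrival time uses 2 * distance.
        round_trip_s = 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S
        bin_index = int(round_trip_s / bin_width_s)
        if bin_index < num_bins:
            histogram[bin_index] += weight * radiance
        transmittance *= 1.0 - alpha
    return histogram


# Hypothetical scene: empty space with one nearly opaque surface at sample 10.
num_samples = 32
densities = [0.0] * num_samples
densities[10] = 1000.0
radiances = [1.0] * num_samples

# 100 ps bins; the surface sits about 1.05 m away, a round trip of about 7 ns.
histogram = render_transient_ray(
    densities, radiances, step_m=0.1, bin_width_s=100e-12, num_bins=128
)
peak_bin = max(range(len(histogram)), key=histogram.__getitem__)
```

Summing the histogram over time recovers the conventional volume-rendered value for the ray, which is one way to see how a transient rendering generalizes the usual one.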
- Information Technology > Artificial Intelligence > Vision (0.65)
- Information Technology > Artificial Intelligence > Robots (0.59)
SSL4EO-L: Datasets and Foundation Models for Landsat Imagery
The Landsat program is the longest-running Earth observation program in history, with 50+ years of data acquisition by 8 satellites. The multispectral imagery captured by sensors onboard these satellites is critical for a wide range of scientific fields. Despite the increasing popularity of deep learning and remote sensing, the majority of researchers still use decision trees and random forests for Landsat image analysis due to the prevalence of small labeled datasets and lack of foundation models. In this paper, we introduce SSL4EO-L, the first ever dataset designed for Self-Supervised Learning for Earth Observation for the Landsat family of satellites (including 3 sensors and 2 product levels) and the largest Landsat dataset in history (5M image patches). Additionally, we modernize and re-release the L7 Irish and L8 Biome cloud detection datasets, and introduce the first ML benchmark datasets for Landsats 4-5 TM and Landsat 7 ETM+ SR. Finally, we pre-train the first foundation models for Landsat imagery using SSL4EO-L and evaluate their performance on multiple semantic segmentation tasks. All datasets and model weights are available via the TorchGeo library, making reproducibility and experimentation easy, and enabling scientific advancements in the burgeoning field of remote sensing for a multitude of downstream applications.
AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery
Clouds in satellite imagery pose a significant challenge for downstream applications. A major challenge in current cloud removal research is the absence of a comprehensive benchmark and a sufficiently large and diverse training dataset. To address this problem, we introduce *AllClear*, the largest public dataset for cloud removal, featuring 23,742 globally distributed regions of interest (ROIs) with diverse land-use patterns, comprising 4 million images in total. Each ROI includes complete temporal captures from the year 2022, with (1) multi-spectral optical imagery from Sentinel-2 and Landsat 8/9, (2) synthetic aperture radar (SAR) imagery from Sentinel-1, and (3) auxiliary remote sensing products such as cloud masks and land cover maps. We validate the effectiveness of our dataset by benchmarking performance, demonstrating a scaling law (the PSNR rises from $28.47$ to $33.87$ with $30\times$ more data), and conducting ablation studies on the temporal length and the importance of individual modalities. This dataset aims to provide comprehensive coverage of the Earth's surface and promote better cloud removal results.
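For readers unfamiliar with the metric quoted in the AllClear abstract, the following minimal sketch computes PSNR from its textbook definition and converts the reported gain into an equivalent drop in mean squared error. It assumes pixel values scaled to [0, 1] and flat pixel sequences; the function name and the `max_value` parameter are illustrative and are not taken from the AllClear benchmark code.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# PSNR is logarithmic in MSE, so a gain in dB maps to a multiplicative MSE drop.
gain_db = 33.87 - 28.47            # improvement quoted in the abstract
mse_ratio = 10 ** (gain_db / 10)   # factor by which the MSE shrinks
```

Under this definition, the roughly 5.4 dB improvement corresponds to the mean squared error falling by a factor of about 3.5, assuming both scores use the same peak value.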
OAM-TCD: A globally diverse dataset of high-resolution tree cover maps
Accurately quantifying tree cover is important for ecosystem monitoring and for assessing progress in restored sites. Recent works have shown that deep learning-based segmentation algorithms are capable of accurately mapping trees at country and continental scales using high-resolution aerial and satellite imagery. Mapping at high (ideally sub-meter) resolution is necessary to identify individual trees; however, there are few open-access datasets containing instance-level annotations, and those that exist are small or not geographically diverse. We present a novel open-access dataset for individual tree crown delineation (TCD) in high-resolution aerial imagery sourced from OpenAerialMap (OAM). Our dataset, OAM-TCD, comprises 5072 images of 2048x2048 px at 10 cm/px resolution with associated human-labeled instance masks for over 280k individual and 56k groups of trees. By sampling imagery from around the world, we are able to better capture the diversity and morphology of trees in different terrestrial biomes and in both urban and natural environments. Using our dataset, we train reference instance and semantic segmentation models that compare favorably to existing state-of-the-art models. We assess performance through k-fold cross-validation and comparison with existing datasets; additionally, we demonstrate compelling results on independent aerial imagery captured over Switzerland and compare to municipal tree inventories and LIDAR-derived canopy maps in the city of Zurich. Our dataset, models and training/benchmark code are publicly released under permissive open-source licenses: Creative Commons (majority CC BY 4.0), and Apache 2.0 respectively.
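The k-fold cross-validation mentioned in the OAM-TCD abstract can be sketched as follows. This is a generic illustration, not the authors' protocol: the abstract does not state the value of k or how folds were assigned, so `k=5` and the contiguous, unshuffled split are assumptions, and the helper name is hypothetical. A real geospatial split would likely also group images by location to limit leakage between folds.

```python
def k_fold_splits(num_items, k):
    """Yield (train_indices, val_indices) pairs for k contiguous folds."""
    if not 2 <= k <= num_items:
        raise ValueError("k must be between 2 and the number of items")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [num_items // k + (1 if i < num_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_indices = list(range(start, start + size))
        train_indices = list(range(start)) + list(range(start + size, num_items))
        yield train_indices, val_indices
        start += size


# 5072 is the image count from the abstract; k=5 is an assumed value.
splits = list(k_fold_splits(5072, k=5))
val_sizes = [len(val) for _, val in splits]
```

Each image lands in exactly one validation fold, so averaging a metric over the folds uses every image for evaluation once.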
Open High-Resolution Satellite Imagery: The WorldStrat Dataset – With Application to Super-Resolution
Analyzing the planet at scale with satellite imagery and machine learning is a dream that has been constantly hindered by the cost of difficult-to-access highly-representative high-resolution imagery. To remedy this, we introduce the WorldStratified dataset, the largest and most varied such publicly available dataset, at Airbus SPOT 6/7 satellites' high resolution of up to 1.5 m/pixel. Empowered by the European Space Agency's Phi-Lab as part of the ESA-funded QueryPlanet project, we curate 10,000 sq km of unique locations to ensure stratified representation of all types of land-use across the world: from agriculture to ice caps, from forests to multiple urbanization densities. We also enrich those with locations typically under-represented in ML datasets: sites of humanitarian interest, illegal mining sites, and settlements of persons at risk.