Goto

Collaborating Authors

 Africa


OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation

arXiv.org Artificial Intelligence

Earth observation data presents a unique challenge: it is spatial like images, sequential like video or text, and highly multimodal. We present OlmoEarth: a multimodal, spatio-temporal foundation model that employs a novel self-supervised learning formulation, masking strategy, and loss all designed for the Earth observation domain. OlmoEarth achieves state-of-the-art performance compared to 12 other foundation models across a variety of research benchmarks and real-world tasks from external partners. When evaluating embeddings OlmoEarth achieves the best performance on 15 out of 24 tasks, and with full fine-tuning it is the best on 19 of 29 tasks. We deploy OlmoEarth as the backbone of an end-to-end platform for data collection, labeling, training, and inference of Earth observation models. The OlmoEarth Platform puts frontier foundation models and powerful data management tools into the hands of non-profits and NGOs working to solve the world's biggest problems. OlmoEarth source code, training data, and pre-trained weights are available at $\href{https://github.com/allenai/olmoearth_pretrain}{\text{https://github.com/allenai/olmoearth_pretrain}}$.


TEMPO: Global Temporal Building Density and Height Estimation from Satellite Imagery

arXiv.org Artificial Intelligence

We present TEMPO, a global, temporally resolved dataset of building density and height derived from high-resolution satellite imagery using deep learning models. We pair building footprint and height data from existing datasets with quarterly PlanetScope basemap satellite images to train a multi-task deep learning model that predicts building density and building height at a 37.6-meter per pixel resolution. We apply this model to global PlanetScope basemaps from Q1 2018 through Q2 2025 to create global, temporal maps of building density and height. We validate these maps by comparing against existing building footprint datasets. Our estimates achieve an F1 score between 85% and 88% on different hand-labeled subsets, and are temporally stable, with a 0.96 five-year trend-consistency score. TEMPO captures quarterly changes in built settlements at a fraction of the computational cost of comparable approaches, unlocking large-scale monitoring of development patterns and climate impacts essential for global resilience and adaptation efforts.


FLEX: Feature Importance from Layered Counterfactual Explanations

arXiv.org Artificial Intelligence

Machine learning models achieve state-of-the-art performance across domains, yet their lack of interpretability limits safe deployment in high-stakes settings. Counterfactual explanations are widely used to provide actionable "what-if" recourse, but they typically remain instance-specific and do not quantify which features systematically drive outcome changes within coherent regions of the feature space or across an entire dataset. We introduce FLEX (Feature importance from Layered counterfactual EXplanations), a model- and domain-agnostic framework that converts sets of counterfactuals into feature change frequency scores at local, regional, and global levels. FLEX generalises local change-frequency measures by aggregating across instances and neighbourhoods, offering interpretable rankings that reflect how often each feature must change to flip predictions. The framework is compatible with different counterfactual generation methods, allowing users to emphasise characteristics such as sparsity, feasibility, or actionability, thereby tailoring the derived feature importances to practical constraints. We evaluate FLEX on two contrasting tabular tasks: traffic accident severity prediction and loan approval, and compare FLEX to SHAP- and LIME-derived feature importance values. Results show that (i) FLEX's global rankings correlate with SHAP while surfacing additional drivers, and (ii) regional analyses reveal context-specific factors that global summaries miss. FLEX thus bridges the gap between local recourse and global attribution, supporting transparent and intervention-oriented decision-making in risk-sensitive applications.


T2IBias: Uncovering Societal Bias Encoded in the Latent Space of Text-to-Image Generative Models

arXiv.org Artificial Intelligence

Text-to-image (T2I) generative models are largely used in AI-powered real-world applications and value creation. However, their strategic deployment raises critical concerns for responsible AI management, particularly regarding the reproduction and amplification of race- and gender-related stereotypes that can undermine organizational ethics. In this work, we investigate whether such societal biases are systematically encoded within the pretrained latent spaces of state-of-the-art T2I models. We conduct an empirical study across the five most popular open-source models, using ten neutral, profession-related prompts to generate 100 images per profession, resulting in a dataset of 5,000 images evaluated by diverse human assessors representing different races and genders. We demonstrate that all five models encode and amplify pronounced societal skew: caregiving and nursing roles are consistently feminized, while high-status professions such as corporate CEO, politician, doctor, and lawyer are overwhelmingly represented by males and mostly White individuals. We further identify model-specific patterns, such as QWEN-Image's near-exclusive focus on East Asian outputs, Kandinsky's dominance of White individuals, and SDXL's comparatively broader but still biased distributions. These results provide critical insights for AI project managers and practitioners, enabling them to select equitable AI models and customized prompts that generate images in alignment with the principles of responsible AI. We conclude by discussing the risks of these biases and proposing actionable strategies for bias mitigation in building responsible GenAI systems. The code and Data Repository: https://github.com/Sufianlab/T2IBias


PriVi: Towards A General-Purpose Video Model For Primate Behavior In The Wild

arXiv.org Artificial Intelligence

Non-human primates are our closest living relatives, and analyzing their behavior is central to research in cognition, evolution, and conservation. Computer vision could greatly aid this research, but existing methods often rely on human-centric pretrained models and focus on single datasets, which limits generalization. W e address this limitation by shifting from a model-centric to a data-centric approach and introduce PriVi, a large-scale primate-centric video pretraining dataset. PriVi contains 424 hours of curated video, combining 174 hours from behavioral research across 11 settings with 250 hours of diverse web-sourced footage, assembled through a scalable data cura-tion pipeline. W e continue pretraining V-JEP A, a large-scale video model, on PriVi to learn primate-specific representations and evaluate it using a lightweight frozen classifier . Across four benchmark datasets - ChimpACT, PanAf500, BaboonLand, and ChimpBehave - our approach consistently outperforms prior work, including fully fine-tuned baselines, and scales favorably with fewer labels. These results demonstrate that primate-centric pretraining substantially improves data efficiency and generalization, making it a promising approach for low-label applications. Code, models, and the majority of the dataset will be made available.


Enhancing failure prediction in nuclear industry: Hybridization of knowledge- and data-driven techniques

arXiv.org Artificial Intelligence

The convergence of the Internet of Things (IoT) and Industry 4.0 has significantly enhanced data-driven methodologies within the nuclear industry, notably enhancing safety and economic efficiency. This advancement challenges the precise prediction of future maintenance needs for assets, which is crucial for reducing downtime and operational costs. However, the effectiveness of data-driven methodologies in the nuclear sector requires extensive domain knowledge due to the complexity of the systems involved. Thus, this paper proposes a novel predictive maintenance methodology that combines data-driven techniques with domain knowledge from a nuclear equipment. The methodological originality of this paper is located on two levels: highlighting the limitations of purely data-driven approaches and demonstrating the importance of knowledge in enhancing the performance of the predictive models. The applicative novelty of this work lies in its use within a domain such as a nuclear industry, which is highly restricted and ultrasensitive due to security, economic and environmental concerns. A detailed real-world case study which compares the current state of equipment monitoring with two scenarios, demonstrate that the methodology significantly outperforms purely data-driven methods in failure prediction. While purely data-driven methods achieve only a modest performance with a prediction horizon limited to 3 h and a F1 score of 56.36%, the hybrid approach increases the prediction horizon to 24 h and achieves a higher F1 score of 93.12%.


LLM-Generated Negative News Headlines Dataset: Creation and Benchmarking Against Real Journalism

arXiv.org Artificial Intelligence

This research examines the potential of datasets generated by Large Language Models (LLMs) to support Natural Language Processing (NLP) tasks, aiming to overcome challenges related to data acquisition and privacy concerns associated with real-world data. Focusing on negative valence text, a critical component of sentiment analysis, we explore the use of LLM-generated synthetic news headlines as an alternative to real-world data. A specialized corpus of negative news headlines was created using tailored prompts to capture diverse negative sentiments across various societal domains. The synthetic headlines were validated by expert review and further analyzed in embedding space to assess their alignment with real-world negative news in terms of content, tone, length, and style. Key metrics such as correlation with real headlines, perplexity, coherence, and realism were evaluated. The synthetic dataset was benchmarked against two sets of real news headlines using evaluations including the Comparative Perplexity Test, Comparative Readability Test, Comparative POS Profiling, BERTScore, and Comparative Semantic Similarity. Results show the generated headlines match real headlines with the only marked divergence being in the proper noun score of the POS profile test.


Unravelling the mystery of the earliest life on Earth: Scientists uncover fresh chemical evidence of microbes in rocks more than 3.3 BILLION years old

Daily Mail - Science & tech

In 1996 Nasa and the White House made the explosive announcement that the rock contained traces of Martian bugs. The meteorite, catalogued as Allen Hills (ALH) 84001, crashed onto the frozen wastes of Antarctica 13,000 years ago and was recovered in 1984. Photographs were released showing elongated segmented objects that appeared strikingly lifelike.


China military reaches 'war footing' with new missile silos and advanced AI warfare systems

FOX News

A new congressional report warns China's military buildup has reached a war footing with 350 new missile silos and 20% nuclear expansion, threatening U.S. deterrence.


UK's sweeping asylum law changes: How will they impact refugees?

Al Jazeera

UK's sweeping asylum law changes: How will they impact refugees? Shabana Mahmood, the United Kingdom's home secretary, has said the country's asylum system is "not working" and is placing "intense strain on communities" ahead of proposals for major government reforms that would end refugees' automatic right to settle permanently in the UK. Speaking to the BBC on Sunday, Mahmood said undocumented migration is "tearing the country apart". First, they would end the automatic path to settled status for refugees after five years. And second, they would remove state benefits from those who have the right to work and can support themselves.