AITopics

Earth observation data presents a unique challenge: it is spatial like images, sequential like video or text, and highly multimodal. We present OlmoEarth: a multimodal, spatio-temporal foundation model that employs a novel self-supervised learning formulation, masking strategy, and loss all designed for the Earth observation domain. OlmoEarth achieves state-of-the-art performance compared to 12 other foundation models across a variety of research benchmarks and real-world tasks from external partners. When evaluating embeddings, OlmoEarth achieves the best performance on 15 out of 24 tasks, and with full fine-tuning it is the best on 19 of 29 tasks. We deploy OlmoEarth as the backbone of an end-to-end platform for data collection, labeling, training, and inference of Earth observation models. The OlmoEarth Platform puts frontier foundation models and powerful data management tools into the hands of non-profits and NGOs working to solve the world's biggest problems. OlmoEarth source code, training data, and pre-trained weights are available at https://github.com/allenai/olmoearth_pretrain.

artificial intelligence, machine learning, natural language, (19 more...)

2511.13655

Country:

North America > United States (1.00)
Africa (1.00)

Genre: Research Report (0.82)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Food & Agriculture > Agriculture (1.00)
Energy > Renewable > Geothermal (0.69)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)

TEMPO: Global Temporal Building Density and Height Estimation from Satellite Imagery

Glazer, Tammy, Hacheme, Gilles Q., Zaytar, Akram, Marotti, Luana, Michaels, Amy, Tadesse, Girmaw Abebe, White, Kevin, Dodhia, Rahul, Zolli, Andrew, Becker-Reshef, Inbal, Ferres, Juan M. Lavista, Robinson, Caleb

We present TEMPO, a global, temporally resolved dataset of building density and height derived from high-resolution satellite imagery using deep learning models. We pair building footprint and height data from existing datasets with quarterly PlanetScope basemap satellite images to train a multi-task deep learning model that predicts building density and building height at a 37.6-meter per pixel resolution. We apply this model to global PlanetScope basemaps from Q1 2018 through Q2 2025 to create global, temporal maps of building density and height. We validate these maps by comparing against existing building footprint datasets. Our estimates achieve an F1 score between 85% and 88% on different hand-labeled subsets, and are temporally stable, with a 0.96 five-year trend-consistency score. TEMPO captures quarterly changes in built settlements at a fraction of the computational cost of comparable approaches, unlocking large-scale monitoring of development patterns and climate impacts essential for global resilience and adaptation efforts.

artificial intelligence, deep learning, machine learning, (18 more...)

2511.12104

Country:

Africa (1.00)
North America > United States > Texas (0.69)

Genre: Research Report (0.64)

Industry:

Government > Regional Government > North America Government > United States Government (0.93)
Health & Medicine (0.93)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.71)
Transportation (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Keshtmand, Nawid, Nzoyem, Roussel Desmond, Clark, Jeffrey Nicholas

FLEX: Feature Importance from Layered Counterfactual Explanations

Machine learning models achieve state-of-the-art performance across domains, yet their lack of interpretability limits safe deployment in high-stakes settings. Counterfactual explanations are widely used to provide actionable "what-if" recourse, but they typically remain instance-specific and do not quantify which features systematically drive outcome changes within coherent regions of the feature space or across an entire dataset. We introduce FLEX (Feature importance from Layered counterfactual EXplanations), a model- and domain-agnostic framework that converts sets of counterfactuals into feature change frequency scores at local, regional, and global levels. FLEX generalises local change-frequency measures by aggregating across instances and neighbourhoods, offering interpretable rankings that reflect how often each feature must change to flip predictions. The framework is compatible with different counterfactual generation methods, allowing users to emphasise characteristics such as sparsity, feasibility, or actionability, thereby tailoring the derived feature importances to practical constraints. We evaluate FLEX on two contrasting tabular tasks: traffic accident severity prediction and loan approval, and compare FLEX to SHAP- and LIME-derived feature importance values. Results show that (i) FLEX's global rankings correlate with SHAP while surfacing additional drivers, and (ii) regional analyses reveal context-specific factors that global summaries miss. FLEX thus bridges the gap between local recourse and global attribution, supporting transparent and intervention-oriented decision-making in risk-sensitive applications.
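
As a rough illustration of the change-frequency idea described in the abstract above, the sketch below counts how often each feature differs between an instance and its counterfactual, then averages over all instances (a global score) or over a subset (a regional score). This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `change_frequency`, the tolerance `tol`, the toy data, and the choice of "region" as a plain list of row indices are all hypothetical.

```python
def change_frequency(originals, counterfactuals, tol=1e-9):
    """Return, per feature, the fraction of instances whose counterfactual
    changed that feature (hypothetical helper, not from the paper)."""
    n_features = len(originals[0])
    counts = [0] * n_features
    for x, x_cf in zip(originals, counterfactuals):
        for j in range(n_features):
            # A feature counts as "changed" if it moved by more than tol.
            if abs(x[j] - x_cf[j]) > tol:
                counts[j] += 1
    return [c / len(originals) for c in counts]


# Toy data (made up): 4 instances, 3 numeric features.
X = [
    [1.0, 0.0, 5.0],
    [2.0, 1.0, 5.0],
    [3.0, 0.0, 7.0],
    [4.0, 1.0, 7.0],
]
# One counterfactual per instance, assumed to come from any generator.
X_cf = [
    [1.5, 0.0, 5.0],
    [2.0, 0.0, 5.0],
    [3.5, 0.0, 8.0],
    [4.5, 1.0, 7.0],
]

# Global score: aggregate over every instance.
global_scores = change_frequency(X, X_cf)

# Regional score: aggregate over a subset of rows (assumed here to be a
# neighbourhood or cluster chosen elsewhere).
region = [0, 1]
regional_scores = change_frequency([X[i] for i in region],
                                   [X_cf[i] for i in region])

print(global_scores)    # [0.75, 0.25, 0.25]
print(regional_scores)  # [0.5, 0.5, 0.0]
```

In this toy case the first feature dominates globally, while inside the chosen region the first two features change equally often, which is the kind of context-specific difference the abstract attributes to regional analyses.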

artificial intelligence, machine learning, natural language, (16 more...)

2511.11891

Country: Africa > Ethiopia (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

T2IBias: Uncovering Societal Bias Encoded in the Latent Space of Text-to-Image Generative Models

Sufian, Abu, Distante, Cosimo, Leo, Marco, Salam, Hanan

Text-to-image (T2I) generative models are widely used in AI-powered real-world applications and value creation. However, their strategic deployment raises critical concerns for responsible AI management, particularly regarding the reproduction and amplification of race- and gender-related stereotypes that can undermine organizational ethics. In this work, we investigate whether such societal biases are systematically encoded within the pretrained latent spaces of state-of-the-art T2I models. We conduct an empirical study across the five most popular open-source models, using ten neutral, profession-related prompts to generate 100 images per profession, resulting in a dataset of 5,000 images evaluated by diverse human assessors representing different races and genders. We demonstrate that all five models encode and amplify pronounced societal skew: caregiving and nursing roles are consistently feminized, while high-status professions such as corporate CEO, politician, doctor, and lawyer are overwhelmingly represented by males and mostly White individuals. We further identify model-specific patterns, such as QWEN-Image's near-exclusive focus on East Asian outputs, Kandinsky's dominance of White individuals, and SDXL's comparatively broader but still biased distributions. These results provide critical insights for AI project managers and practitioners, enabling them to select equitable AI models and customized prompts that generate images in alignment with the principles of responsible AI. We conclude by discussing the risks of these biases and proposing actionable strategies for bias mitigation in building responsible GenAI systems. Code and data repository: https://github.com/Sufianlab/T2IBias

interdisciplinary workshop, machine learning, natural language, (19 more...)

2511.10089

Country:

Asia (0.28)
Africa (0.28)

Genre: Research Report (1.00)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

PriVi: Towards A General-Purpose Video Model For Primate Behavior In The Wild

Mueller, Felix B., Meier, Jan F., Lueddecke, Timo, Vogg, Richard, Freixanet, Roger L., Hassler, Valentin, Bosshard, Tiffany, Karakoc, Elif, O'Hearn, William J., Pereira, Sofia M., Sehner, Sandro, Wierucka, Kaja, Burkart, Judith, Fichtel, Claudia, Fischer, Julia, Gail, Alexander, Hobaiter, Catherine, Ostner, Julia, Samuni, Liran, Schülke, Oliver, Shahidi, Neda, Wessling, Erin G., Ecker, Alexander S.

Non-human primates are our closest living relatives, and analyzing their behavior is central to research in cognition, evolution, and conservation. Computer vision could greatly aid this research, but existing methods often rely on human-centric pretrained models and focus on single datasets, which limits generalization. We address this limitation by shifting from a model-centric to a data-centric approach and introduce PriVi, a large-scale primate-centric video pretraining dataset. PriVi contains 424 hours of curated video, combining 174 hours from behavioral research across 11 settings with 250 hours of diverse web-sourced footage, assembled through a scalable data curation pipeline. We continue pretraining V-JEPA, a large-scale video model, on PriVi to learn primate-specific representations and evaluate it using a lightweight frozen classifier. Across four benchmark datasets (ChimpACT, PanAf500, BaboonLand, and ChimpBehave), our approach consistently outperforms prior work, including fully fine-tuned baselines, and scales favorably with fewer labels. These results demonstrate that primate-centric pretraining substantially improves data efficiency and generalization, making it a promising approach for low-label applications. Code, models, and the majority of the dataset will be made available.

artificial intelligence, machine learning, natural language, (17 more...)

2511.09675

Country:

North America > United States (1.00)
Asia (0.93)
Africa (0.67)
Europe > Germany > Lower Saxony > Gottingen (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Saley, Amaratou Mahamadou, Moyaux, Thierry, Sekhari, Aïcha, Cheutet, Vincent, Danielou, Jean-Baptiste

Enhancing failure prediction in nuclear industry: Hybridization of knowledge- and data-driven techniques

The convergence of the Internet of Things (IoT) and Industry 4.0 has significantly enhanced data-driven methodologies within the nuclear industry, notably improving safety and economic efficiency. This advancement challenges the precise prediction of future maintenance needs for assets, which is crucial for reducing downtime and operational costs. However, the effectiveness of data-driven methodologies in the nuclear sector requires extensive domain knowledge due to the complexity of the systems involved. Thus, this paper proposes a novel predictive maintenance methodology that combines data-driven techniques with domain knowledge about nuclear equipment. The methodological originality of this paper lies on two levels: highlighting the limitations of purely data-driven approaches and demonstrating the importance of knowledge in enhancing the performance of the predictive models. The applicative novelty of this work lies in its use within the nuclear industry, a domain that is highly restricted and ultrasensitive due to security, economic, and environmental concerns. A detailed real-world case study, which compares the current state of equipment monitoring with two scenarios, demonstrates that the methodology significantly outperforms purely data-driven methods in failure prediction. While purely data-driven methods achieve only a modest performance, with a prediction horizon limited to 3 h and an F1 score of 56.36%, the hybrid approach increases the prediction horizon to 24 h and achieves a higher F1 score of 93.12%.

data mining, knowledge management, machine learning, (22 more...)

doi: 10.1016/j.cie.2025.111387

2511.11604

Country:

Asia (0.92)
Europe > France (0.46)
Africa > Middle East > Tunisia (0.28)

Genre:

Overview (1.00)
Workflow (0.93)
Research Report > New Finding (0.67)

Industry: Energy > Power Industry > Utilities > Nuclear (1.00)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
(7 more...)

Babalola, Olusola, Ojokoh, Bolanle, Boyinbode, Olutayo

LLM-Generated Negative News Headlines Dataset: Creation and Benchmarking Against Real Journalism

This research examines the potential of datasets generated by Large Language Models (LLMs) to support Natural Language Processing (NLP) tasks, aiming to overcome challenges related to data acquisition and privacy concerns associated with real-world data. Focusing on negative valence text, a critical component of sentiment analysis, we explore the use of LLM-generated synthetic news headlines as an alternative to real-world data. A specialized corpus of negative news headlines was created using tailored prompts to capture diverse negative sentiments across various societal domains. The synthetic headlines were validated by expert review and further analyzed in embedding space to assess their alignment with real-world negative news in terms of content, tone, length, and style. Key metrics such as correlation with real headlines, perplexity, coherence, and realism were evaluated. The synthetic dataset was benchmarked against two sets of real news headlines using evaluations including the Comparative Perplexity Test, Comparative Readability Test, Comparative POS Profiling, BERTScore, and Comparative Semantic Similarity. Results show that the generated headlines match real headlines, with the only marked divergence being the proper noun score of the POS profile test.

large language model, machine learning, natural language, (18 more...)

2511.11591

Country:

Africa > Nigeria (0.28)
Europe (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media > News (1.00)
Government (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Daily Mail - Science & tech, Nov-17-2025, 20:00:46 GMT

Unravelling the mystery of the earliest life on Earth: Scientists uncover fresh chemical evidence of microbes in rocks more than 3.3 BILLION years old

In 1996 Nasa and the White House made the explosive announcement that the rock contained traces of Martian bugs. The meteorite, catalogued as Allan Hills (ALH) 84001, crashed onto the frozen wastes of Antarctica 13,000 years ago and was recovered in 1984. Photographs were released showing elongated segmented objects that appeared strikingly lifelike.

artificial intelligence, chemical evidence, social media, (15 more...)

Daily Mail - Science & tech

Country:

Antarctica (0.24)
North America > Canada > Alberta (0.14)
South America > Brazil (0.04)
(12 more...)

Genre: Research Report (0.93)

Industry:

Media > Television (1.00)
Media > Music (1.00)
Media > Film (1.00)
(8 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

FOX News, Nov-17-2025, 19:57:39 GMT

China military reaches 'war footing' with new missile silos and advanced AI warfare systems

A new congressional report warns China's military buildup has reached a war footing with 350 new missile silos and 20% nuclear expansion, threatening U.S. deterrence.

artificial intelligence, china, social media, (12 more...)

FOX News

Country:

Asia > China > Beijing > Beijing (0.08)
Asia > Taiwan (0.07)
South America > Venezuela (0.04)
(12 more...)

Industry:

Media (1.00)
Leisure & Entertainment > Sports (1.00)
Health & Medicine > Therapeutic Area (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.74)

Al Jazeera, Nov-17-2025, 16:06:43 GMT

UK's sweeping asylum law changes: How will they impact refugees?

Shabana Mahmood, the United Kingdom's home secretary, has said the country's asylum system is "not working" and is placing "intense strain on communities" ahead of proposals for major government reforms that would end refugees' automatic right to settle permanently in the UK. Speaking to the BBC on Sunday, Mahmood said undocumented migration is "tearing the country apart". First, they would end the automatic path to settled status for refugees after five years. And second, they would remove state benefits from those who have the right to work and can support themselves.

artificial intelligence, government, refugee, (13 more...)

Al Jazeera

Country:

North America > United States (0.15)
Europe > Denmark (0.06)
Europe > Ukraine (0.05)
(15 more...)

Industry:

Law (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (1.00)
Government > Immigration & Customs (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)