A photo of Iran's bombed schoolgirl graveyard went around the world. Was it real, or AI?

The Guardian

Graves being prepared for the victims of an airstrike on a school in Minab in southern Iran, 2 March 2026. The graves, freshly dug, lie in neat rows of 20 across. More than 60 have already been carved out of the earth, with a few clusters of people gathered around them.




CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

Neural Information Processing Systems

Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used to test the ability of vision-language models to understand and reason on knowledge present in both visual and textual data.


Indonesia sues six companies over environmental harm in flood zones

Al Jazeera

Indonesia's government has filed multiple lawsuits seeking more than $200m in damages against six firms after deadly floods wreaked havoc across Sumatra, killing more than 1,000 people last year, although environmentalists criticised the moves as inadequate. Environmentalists, experts and the government pointed the finger at deforestation for its role in last year's disaster, which washed torrents of mud and wooden logs into villages across the northwestern part of the island. The sum represents both fines for damage and the proposed monetary value of recovery efforts. The suits were filed on Thursday with courts in Jakarta and in Medan, North Sumatra, the environment ministry added. "We firmly uphold the principle of polluter pays," Environment Minister Hanif Faisol Nurofiq said in a statement.


Hybrid LSTM and PPO Networks for Dynamic Portfolio Optimization

Kevin, Jun, Yugopuspito, Pujianto

arXiv.org Artificial Intelligence

This paper introduces a hybrid framework for portfolio optimization that fuses Long Short-Term Memory (LSTM) forecasting with a Proximal Policy Optimization (PPO) reinforcement learning strategy. The proposed system leverages the predictive power of deep recurrent networks to capture temporal dependencies, while the PPO agent adaptively refines portfolio allocations in continuous action spaces, allowing the system to anticipate trends while adjusting dynamically to market shifts. Using multi-asset datasets covering U.S. and Indonesian equities, U.S. Treasuries, and major cryptocurrencies from January 2018 to December 2024, the framework is benchmarked against equal-weighted, index-based, and single-model baselines (LSTM-only and PPO-only) using annualized return, volatility, Sharpe ratio, and maximum drawdown, each adjusted for transaction costs. The results indicate that the hybrid architecture delivers higher returns and stronger resilience under non-stationary market regimes, suggesting its promise as a robust, AI-driven framework for dynamic portfolio optimization.
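The pipeline the abstract describes, an LSTM forecaster whose hidden state feeds a policy that outputs portfolio weights, can be sketched minimally. The cell below is a generic LSTM forward step and a softmax allocation head, an illustration of the architecture class only, not the authors' implementation; all function names are hypothetical.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One forward step of a standard LSTM cell.

    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) bias.
    Gate order in the stacked weights: input, forget, output, candidate.
    """
    z = W @ x + U @ h + b
    H = h.size
    i, f, o = (1.0 / (1.0 + np.exp(-z[k * H:(k + 1) * H])) for k in range(3))
    g = np.tanh(z[3 * H:])
    c_new = f * c + i * g          # updated cell state
    h_new = o * np.tanh(c_new)     # hidden state doubles as forecast features
    return h_new, c_new

def allocate(logits):
    """Map policy logits to long-only portfolio weights on the simplex."""
    e = np.exp(logits - logits.max())
    return e / e.sum()
```

In a hybrid setup of this kind, the LSTM's hidden state (or its return forecast) is concatenated into the PPO agent's observation, and the policy's logits are squashed to weights that sum to one, which is what keeps the action space continuous and budget-constrained.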


What AI doesn't know: we could be creating a global 'knowledge collapse'

Deepak Varuvel Dennison

The Guardian

As GenAI becomes the primary way to find information, local and traditional wisdom is being lost, and we are only beginning to realise what we're missing. This article was originally published as 'Holes in the web' on Aeon.co. A few years back, my dad was diagnosed with a tumour on his tongue, which meant we had some choices to weigh up. My family has an interesting dynamic when it comes to medical decisions. While my older sister is a trained doctor in western allopathic medicine, my parents are big believers in traditional remedies. Having grown up in a small town in India, I am accustomed to rituals. My dad had a ritual, too. Every time we visited his home village in southern Tamil Nadu, he'd get a bottle of thick, pungent, herb-infused oil from a vaithiyar, a traditional doctor practising Siddha medicine. It was his way of maintaining his connection with the kind of medicine he had always known and trusted.


Generalizing to Unseen Disaster Events: A Causal View

Seeberger, Philipp, Freisinger, Steffen, Bocklet, Tobias, Riedhammer, Korbinian

arXiv.org Artificial Intelligence

Due to the rapid growth of social media platforms, these tools have become essential for monitoring information during ongoing disaster events. However, extracting valuable insights requires real-time processing of vast amounts of data. A major challenge in existing systems is their exposure to event-related biases, which negatively affects their ability to generalize to emerging events. While recent advancements in debiasing and causal learning offer promising solutions, they remain underexplored in the disaster event domain. In this work, we approach bias mitigation through a causal lens and propose a method to reduce event- and domain-related biases, enhancing generalization to future events. Our approach outperforms multiple baselines by up to +1.9% F1 and significantly improves a PLM-based classifier across three disaster classification tasks.


SynthWorlds: Controlled Parallel Worlds for Disentangling Reasoning and Knowledge in Language Models

Gu, Ken, Bhat, Advait, Merrill, Mike A, West, Robert, Liu, Xin, McDuff, Daniel, Althoff, Tim

arXiv.org Artificial Intelligence

Evaluating the reasoning ability of language models (LMs) is complicated by their extensive parametric world knowledge, where benchmark performance often reflects factual recall rather than genuine reasoning. Existing datasets and approaches (e.g., temporal filtering, paraphrasing, adversarial substitution) cannot cleanly separate the two. We present SynthWorlds, a framework that disentangles task reasoning complexity from factual knowledge. In SynthWorlds, we construct parallel corpora representing two worlds with identical interconnected structure: a real-mapped world, where models may exploit parametric knowledge, and a synthetic-mapped world, where such knowledge is meaningless. On top of these corpora, we design two mirrored tasks as case studies: multi-hop question answering and page navigation, which maintain equal reasoning difficulty across worlds. Experiments in parametric-only (e.g., closed-book QA) and knowledge-augmented (e.g., retrieval-augmented) LM settings reveal a persistent knowledge advantage gap, defined as the performance boost models gain from memorized parametric world knowledge. Knowledge acquisition and integration mechanisms reduce but do not eliminate this gap, highlighting opportunities for system improvements. Fully automatic and scalable, SynthWorlds provides a controlled environment for evaluating LMs in ways that were previously challenging, enabling precise and testable comparisons of reasoning and memorization.
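The paper's central metric, the knowledge advantage gap, is simple to state: per evaluation setting, subtract synthetic-world accuracy from real-world accuracy. A minimal sketch (function and dictionary names are hypothetical, not from the paper's code):

```python
def knowledge_advantage_gap(results):
    """Per-setting gap between real-mapped and synthetic-mapped accuracy.

    results: {setting: {"real": accuracy, "synthetic": accuracy}}.
    A positive gap is the performance boost a model gains from memorized
    parametric world knowledge, since the two worlds share identical
    interconnected structure and thus equal reasoning difficulty.
    """
    return {s: r["real"] - r["synthetic"] for s, r in results.items()}
```

Because the reasoning load is held fixed across the two worlds, any remaining gap isolates memorization rather than reasoning ability.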


Culture Cartography: Mapping the Landscape of Cultural Knowledge

Ziems, Caleb, Held, William, Yu, Jane, Goldberg, Amir, Grusky, David, Yang, Diyi

arXiv.org Artificial Intelligence

To serve global users safely and productively, LLMs need culture-specific knowledge that might not be learned during pre-training. How do we find such knowledge that is (1) salient to in-group users, but (2) unknown to LLMs? The most common solutions are single-initiative: either researchers define challenging questions that users passively answer (traditional annotation), or users actively produce data that researchers structure as benchmarks (knowledge extraction). The process would benefit from mixed-initiative collaboration, where users guide the process to meaningfully reflect their cultures, and LLMs steer the process towards more challenging questions that meet the researcher's goals. We propose a mixed-initiative methodology called CultureCartography. Here, an LLM initializes annotation with questions for which it has low-confidence answers, making explicit both its prior knowledge and the gaps therein. This allows a human respondent to fill these gaps and steer the model towards salient topics through direct edits. We implement this methodology as a tool called CultureExplorer. Compared to a baseline where humans answer LLM-proposed questions, we find that CultureExplorer more effectively produces knowledge that leading models like DeepSeek R1 and GPT-4o are missing, even with web search. Fine-tuning on this data boosts the accuracy of Llama-3.1-8B by up to 19.2% on related culture benchmarks.
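The "low-confidence answers" step that seeds CultureCartography's annotation could, for instance, be approximated with self-consistency sampling: sample several answers per question and treat majority-vote agreement as confidence. The sketch below illustrates that selection strategy under this assumption; it is not the CultureExplorer implementation, and all names are hypothetical.

```python
from collections import Counter

def self_consistency_confidence(sampled_answers):
    """Confidence = share of sampled answers agreeing with the majority."""
    counts = Counter(sampled_answers)
    return counts.most_common(1)[0][1] / len(sampled_answers)

def pick_low_confidence(question_samples, k=2):
    """Rank candidate questions by confidence; surface the least certain first.

    question_samples: {question: [sampled answer strings]}.
    The k lowest-confidence questions are the ones a human respondent
    would be asked to answer or edit.
    """
    scored = sorted(question_samples.items(),
                    key=lambda qs: self_consistency_confidence(qs[1]))
    return [q for q, _ in scored[:k]]
```

Surfacing the least-agreed-upon questions makes the model's knowledge gaps explicit, which is the hook that lets human respondents steer annotation toward culturally salient, genuinely unknown material.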