AITopics | Borneo


Borneo

Indonesia sues six companies over environmental harm in flood zones

Al Jazeera, Jan-16-2026, 10:28:26 GMT

Indonesia's government has filed multiple lawsuits seeking more than $200m in damages against six firms, after deadly floods wreaked havoc across Sumatra, killing more than 1,000 people last year, although environmentalists criticised the moves as inadequate. Environmentalists, experts and the government pointed the finger at deforestation for its role in last year's disaster that washed torrents of mud and wooden logs into villages across the northwestern part of the island. The sum represents both fines for damage and the proposed monetary value of recovery efforts. The suits were filed in courts on Thursday in Jakarta and North Sumatra's Medan, the ministry added. "We firmly uphold the principle of polluter pays," Environment Minister Hanif Faisol Nurofiq said in a statement.

environmental harm, government, sumatra, (13 more...)

Al Jazeera

Country:

Asia > Indonesia > Sumatra > North Sumatra (0.27)
Asia > Indonesia > Java > Jakarta > Jakarta (0.25)
North America > United States (0.16)
(11 more...)

Industry:

Government (1.00)
Law > Environmental Law (0.93)
Law > Litigation (0.57)

Technology: Information Technology > Artificial Intelligence (0.33)

Culture Cartography: Mapping the Landscape of Cultural Knowledge

Ziems, Caleb, Held, William, Yu, Jane, Goldberg, Amir, Grusky, David, Yang, Diyi

arXiv.org Artificial Intelligence, Nov-3-2025

To serve global users safely and productively, LLMs need culture-specific knowledge that might not be learned during pre-training. How do we find such knowledge that is (1) salient to in-group users, but (2) unknown to LLMs? The most common solutions are single-initiative: either researchers define challenging questions that users passively answer (traditional annotation), or users actively produce data that researchers structure as benchmarks (knowledge extraction). The process would benefit from mixed-initiative collaboration, where users guide the process to meaningfully reflect their cultures, and LLMs steer the process towards more challenging questions that meet the researcher's goals. We propose a mixed-initiative methodology called CultureCartography. Here, an LLM initializes annotation with questions for which it has low-confidence answers, making explicit both its prior knowledge and the gaps therein. This allows a human respondent to fill these gaps and steer the model towards salient topics through direct edits. We implement this methodology as a tool called CultureExplorer. Compared to a baseline where humans answer LLM-proposed questions, we find that CultureExplorer more effectively produces knowledge that leading models like DeepSeek R1 and GPT-4o are missing, even with web search. Fine-tuning on this data boosts the accuracy of Llama-3.1-8B by up to 19.2% on related culture benchmarks.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.27672

Country:

Africa > Nigeria (0.94)
North America (0.93)
Asia > Indonesia > Borneo > Kalimantan (0.93)
Asia > Indonesia > Java (0.93)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

What Do Indonesians Really Need from Language Technology? A Nationwide Survey

Kautsar, Muhammad Dehan Al, Susanto, Lucky, Wijaya, Derry, Koto, Fajri

arXiv.org Artificial Intelligence, Sep-30-2025

There is an emerging effort to develop NLP for Indonesia's 700+ local languages, but progress remains costly due to the need for direct engagement with native speakers. However, it is unclear what these language communities truly need from language technology. To address this, we conduct a nationwide survey to assess the actual needs of native speakers in Indonesia. Our findings indicate that addressing language barriers, particularly through machine translation and information retrieval, is the most critical priority. Although there is strong enthusiasm for advancements in language technology, concerns around privacy, bias, and the use of public data for AI training highlight the need for greater transparency and clear communication to support broader AI adoption.

artificial intelligence, chatbot, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.07506

Country:

Europe (1.00)
Asia > Indonesia > Sulawesi (1.00)
Asia > Indonesia > Borneo > Kalimantan (0.68)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)

LoraxBench: A Multitask, Multilingual Benchmark Suite for 20 Indonesian Languages

Aji, Alham Fikri, Cohn, Trevor

arXiv.org Artificial Intelligence, Aug-19-2025

As one of the world's most populous countries, with 700 languages spoken, Indonesia is behind in terms of NLP progress. We introduce LoraxBench, a benchmark that focuses on low-resource languages of Indonesia and covers 6 diverse tasks: reading comprehension, open-domain QA, language inference, causal reasoning, translation, and cultural QA. Our dataset covers 20 languages, with the addition of two formality registers for three languages. We evaluate a diverse set of multilingual and region-focused LLMs and find that this benchmark is challenging. We note a visible discrepancy between performance in Indonesian and other languages, especially the low-resource ones. There is no clear lead when using a region-specific model as opposed to the general multilingual model. Lastly, we show that a change in register affects model performance, especially with registers not commonly found in social media, such as high-level politeness 'Krama' Javanese.

computational linguistic, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.12459

Country:

North America (1.00)
Europe (1.00)
Asia > Indonesia > Sumatra (0.46)
(3 more...)

Genre: Research Report (0.40)

Industry: Education > Assessment & Standards > Student Performance (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Elevating Semantic Exploration: A Novel Approach Utilizing Distributed Repositories

Bellandi, Valerio

arXiv.org Artificial Intelligence, May-7-2025

Centralized and distributed systems are two main approaches to organizing ICT infrastructure, each with its pros and cons. Centralized systems concentrate resources in one location, making management easier but creating single points of failure. Distributed systems, on the other hand, spread resources across multiple nodes, offering better scalability and fault tolerance, but requiring more complex management. The choice between them depends on factors like application needs, scalability, and data sensitivity. Centralized systems suit applications with limited scalability and centralized control, while distributed systems excel in large-scale environments requiring high availability and performance. This paper explores a distributed document repository system developed for the Italian Ministry of Justice, using edge repositories to analyze textual data and metadata, enhancing semantic exploration capabilities.

artificial intelligence, natural language, text processing, (19 more...)

arXiv.org Artificial Intelligence

2505.03443

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Asia > Indonesia > Borneo > Kalimantan > East Kalimantan > Nusantara (0.04)
Asia > China (0.04)

Genre:

Research Report > Promising Solution (0.40)
Overview > Innovation (0.40)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Dual-Class Prompt Generation: Enhancing Indonesian Gender-Based Hate Speech Detection through Data Augmentation

Ibrahim, Muhammad Amien, Faisal, Winarto, Tora Sangputra Yopie, Sulistiya, Zefanya Delvin

arXiv.org Artificial Intelligence, Mar-6-2025

Detecting gender-based hate speech in Indonesian social media remains challenging due to limited labeled datasets. While binary hate speech classification has advanced, a more granular category like gender-targeted hate speech is understudied because of class imbalance issues. This paper addresses this gap by comparing three data augmentation techniques for Indonesian gender-based hate speech detection. We evaluate backtranslation, single-class prompt generation (using only hate speech examples), and our proposed dual-class prompt generation (using both hate speech and non-hate speech examples). Experiments show all augmentation methods improve classification performance, with our dual-class approach achieving the best results (88.5% accuracy, 88.1% F1-score using Random Forest). Semantic similarity analysis reveals dual-class prompt generation produces the most novel content, while t-SNE visualizations confirm these samples occupy distinct feature space regions while maintaining class characteristics. Our findings suggest that incorporating examples from both classes helps language models generate more diverse yet representative samples, effectively addressing limited data challenges in specialized hate speech detection.

dataset, detection, speech detection, (12 more...)

arXiv.org Artificial Intelligence

2503.04279

Country:

Asia > Indonesia > Borneo > Kalimantan > East Kalimantan > Nusantara (0.05)
Asia > Indonesia > Java > Jakarta > Jakarta (0.05)
North America > United States > Hawaii (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Enhancing Poverty Targeting with Spatial Machine Learning: An application to Indonesia

Martinez, Rolando Gonzales, Cooray, Mariza

arXiv.org Machine Learning, Mar-6-2025

This study leverages spatial machine learning (SML) to enhance the accuracy of Proxy Means Testing (PMT) for poverty targeting in Indonesia. Conventional PMT methodologies are prone to exclusion and inclusion errors due to their inability to account for spatial dependencies and regional heterogeneity. By integrating spatial contiguity matrices, SML models mitigate these limitations, facilitating a more precise identification and comparison of geographical poverty clusters. Utilizing household survey data from the Social Welfare Integrated Data Survey (DTKS) for the periods 2016 to 2020 and 2016 to 2021, this study examines spatial patterns in income distribution and delineates poverty clusters at both provincial and district levels. Empirical findings indicate that the proposed SML approach reduces exclusion errors from 28% to 20% compared to standard machine learning models, underscoring the critical role of spatial analysis in refining machine learning-based poverty targeting. These results highlight the potential of SML to inform the design of more equitable and effective social protection policies, particularly in geographically diverse contexts. Future research can explore the applicability of spatiotemporal models and assess the generalizability of SML approaches across varying socio-economic settings.

exclusion error, inclusion error, spatial machine, (12 more...)

arXiv.org Machine Learning

2503.043

Country:

North America > United States (0.05)
Asia > Indonesia > Nusa Tenggara Islands (0.05)
Asia > Indonesia > Sumatra > Bengkulu > Bengkulu (0.04)
(17 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Pretrained LLMs as Real-Time Controllers for Robot Operated Serial Production Line

Waseem, Muhammad, Bhatta, Kshitij, Li, Chen, Chang, Qing

arXiv.org Artificial Intelligence, Mar-5-2025

The manufacturing industry is undergoing a transformative shift, driven by cutting-edge technologies like 5G, AI, and cloud computing. Despite these advancements, effective system control, which is crucial for optimizing production efficiency, remains a complex challenge due to the intricate, knowledge-dependent nature of manufacturing processes and the reliance on domain-specific expertise. Conventional control methods often demand heavy customization and considerable computational resources, and lack transparency in decision-making. In this work, we investigate the feasibility of using Large Language Models (LLMs), particularly GPT-4, as a straightforward, adaptable solution for controlling manufacturing systems, specifically, mobile robot scheduling. We introduce an LLM-based control framework to assign mobile robots to different machines in robot-assisted serial production lines, evaluating its performance in terms of system throughput. Our proposed framework outperforms traditional scheduling approaches such as First-Come-First-Served (FCFS), Shortest Processing Time (SPT), and Longest Processing Time (LPT). While it achieves performance that is on par with state-of-the-art methods like Multi-Agent Reinforcement Learning (MARL), it offers a distinct advantage by delivering comparable throughput without the need for extensive retraining. These results suggest that the proposed LLM-based solution is well-suited for scenarios where technical expertise, computational resources, and financial investment are limited, while decision transparency and system scalability are critical concerns.

llm, manufacturing system, robot, (17 more...)

arXiv.org Artificial Intelligence

2503.03889

Country:

North America > United States > Virginia (0.05)
Asia > Indonesia > Borneo > Kalimantan > East Kalimantan > Nusantara (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Twenty Years of Personality Computing: Threats, Challenges and Future Directions

Celli, Fabio, Kartelj, Aleksandar, Đorđević, Miljan, Suhartono, Derwin, Filipović, Vladimir, Milutinović, Veljko, Spathoulas, Georgios, Vinciarelli, Alessandro, Kosinski, Michal, Lepri, Bruno

arXiv.org Artificial Intelligence, Mar-3-2025

Personality Computing is a field at the intersection of Personality Psychology and Computer Science. Research in the field, which started in 2005, uses computational methods to understand and predict human personality traits. The field has expanded very rapidly and, by analyzing digital footprints (text, images, social media, etc.), has helped to develop systems that recognize and even replicate human personality. While offering promising applications in talent recruiting, marketing and healthcare, the ethical implications of Personality Computing are significant. Concerns include data privacy, algorithmic bias, and the potential for manipulation by personality-aware Artificial Intelligence. This paper provides an overview of the field, explores key methodologies, discusses the challenges and threats, and outlines potential future directions for responsible development and deployment of Personality Computing technologies.

personality, personality computing, proceedings, (12 more...)

arXiv.org Artificial Intelligence

2503.02082

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Serbia > Central Serbia > Belgrade (0.05)
(23 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Media (1.00)
Law Enforcement & Public Safety (1.00)
Information Technology > Services (1.00)
(5 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Information Management (1.00)
Information Technology > Human Computer Interaction > Interfaces (1.00)
(11 more...)

Do Language Models Understand Honorific Systems in Javanese?

Farhansyah, Mohammad Rifqi, Darmawan, Iwan, Kusumawardhana, Adryan, Winata, Genta Indra, Aji, Alham Fikri, Wijaya, Derry Tanti

arXiv.org Artificial Intelligence, Feb-28-2025

The Javanese language features a complex system of honorifics that vary according to the social status of the speaker, listener, and referent. Despite its cultural and linguistic significance, there has been limited progress in developing a comprehensive corpus to capture these variations for natural language processing (NLP) tasks. In this paper, we present Unggah-Ungguh, a carefully curated dataset designed to encapsulate the nuances of Unggah-Ungguh Basa, the Javanese speech etiquette framework that dictates the choice of words and phrases based on social hierarchy and context. Using Unggah-Ungguh, we assess the ability of language models (LMs) to process various levels of Javanese honorifics through classification and machine translation tasks. To further evaluate cross-lingual LMs, we conduct machine translation experiments between Javanese (at specific honorific levels) and Indonesian. Additionally, we explore whether LMs can generate contextually appropriate Javanese honorifics in conversation tasks, where the honorific usage should align with the social role and contextual cues. Our findings indicate that current LMs struggle with most honorific levels, exhibiting a bias toward certain honorific tiers.

honorific level, ngoko, translation, (15 more...)

arXiv.org Artificial Intelligence

2502.20864

Country:

North America > Haiti (0.05)
Asia > Indonesia > Borneo > Kalimantan > East Kalimantan > Nusantara (0.04)
Asia > India > Haryana (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.86)
