 Lampung


NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts

Adilazuarda, Muhammad Farid, Wijanarko, Musa Izzanardi, Susanto, Lucky, Nur'aini, Khumaisa, Wijaya, Derry, Aji, Alham Fikri

arXiv.org Artificial Intelligence

Indonesia is rich in languages and scripts. However, most NLP progress has been made using romanized text. In this paper, we present NusaAksara, a novel public benchmark for Indonesian languages that includes their original scripts. Our benchmark covers both text and image modalities and encompasses diverse tasks such as image segmentation, OCR, transliteration, translation, and language identification. Our data is constructed by human experts through rigorous steps. NusaAksara covers 8 scripts across 7 languages, including low-resource languages not commonly seen in NLP benchmarks. Although unsupported by Unicode, the Lampung script is included in this dataset. We benchmark our data across several models, from LLMs and VLMs such as GPT-4o, Llama 3.2, and Aya 23 to task-specific systems such as PP-OCR and LangID, and show that most NLP technologies cannot handle Indonesia's local scripts, with many achieving near-zero performance.


Spatial Entity Resolution between Restaurant Locations and Transportation Destinations in Southeast Asia

Gao, Emily, Widdows, Dominic

arXiv.org Artificial Intelligence

Location matters in many businesses and services today, particularly for transportation and delivery, scenarios in which it is important to find the correct pickup and drop-off locations very quickly. User experience can be negatively affected if the location information is inaccurate or insufficient. Inaccuracies can originate from imprecise GPS data, manual error happening in the process of data entry, or the lack of effective data quality control. Insufficiencies can also take many forms, including lack of coverage and lack of detail: for example, we may know the latitude and longitude of a restaurant location in a mall, but this might not include information about where passengers should be dropped off, or where a delivery courier should park to collect food for delivery. Or the location of a business may be known, but not its contact details or opening hours. Solving this problem can improve precision by removing duplicates, and can enrich detail by (for example) merging a phone number from one record with the hours of operation from another, once these records are known to refer to the same thing. This problem is referred to as entity resolution (see Talburt, 2011), and it occurs with various datasets, including those representing people, products, works of literature, etc. For Grab, one entity resolution problem that arises for spatial data is the alignment of transportation destinations and restaurants. Currently Grab maintains two tables separately for transportation and food delivery, because each use case requires some specific features, i.e., food delivery needs information about the estimated delivery time, cuisine types, and opening hours which are absent in the POI table. However, it is highly likely that some entities from both tables
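The idea of merging details from two records once they are known to refer to the same place can be illustrated with a minimal sketch. This is not the method of the paper: it assumes a simple rule (records match when they are close together and have similar names), and the field names, sample records, and thresholds (`max_dist_m`, `min_name_sim`) are all hypothetical.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Similarity in [0, 1] between two names, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_records(pois, restaurants, max_dist_m=100.0, min_name_sim=0.8):
    """Pair each POI with its most similar nearby restaurant and merge fields.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    merged = []
    for p in pois:
        best = None  # (restaurant, similarity) of the best candidate so far
        for r in restaurants:
            if haversine_m(p["lat"], p["lon"], r["lat"], r["lon"]) > max_dist_m:
                continue
            sim = name_similarity(p["name"], r["name"])
            if sim >= min_name_sim and (best is None or sim > best[1]):
                best = (r, sim)
        if best is not None:
            # Keep the POI's fields and add only those the POI record lacks.
            extra = {k: v for k, v in best[0].items() if k not in p}
            merged.append({**p, **extra})
    return merged


# Hypothetical sample records (not from any real dataset).
POIS = [{"id": "poi-1", "name": "Warung Sari", "lat": 1.3000, "lon": 103.8000}]
RESTAURANTS = [
    {"rid": "r-9", "name": "warung sari ", "lat": 1.3001, "lon": 103.8001,
     "opening_hours": "10:00-22:00", "phone": "000-0000"},
    {"rid": "r-10", "name": "Far Away Cafe", "lat": 1.4000, "lon": 103.9000,
     "opening_hours": "08:00-18:00", "phone": "000-0001"},
]
merged = match_records(POIS, RESTAURANTS)
```

Here the POI record gains the restaurant's opening hours and phone number because the two records are a few meters apart and share a name. A production system would likely need blocking or spatial indexing instead of the quadratic loop used in this sketch.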


Enhanced Robot Motion Block of A-star Algorithm for Robotic Path Planning

Kabir, Raihan, Watanobe, Yutaka, Islam, Md. Rashedul, Naruse, Keitaro

arXiv.org Artificial Intelligence

The efficiency of a robot path-planning model is sensitive to the number of search nodes, the path cost, and the time complexity. The conventional A-star (A*) algorithm outperforms other grid-based algorithms because of its heuristic search. However, its performance in time, space, and number of search nodes can be suboptimal, depending on the robot motion block (RMB). To address this challenge, this study proposes an optimal RMB for the A* path-planning algorithm, where the robot movement costs are calculated by a proposed adaptive cost function. A selection process is also proposed to choose the optimal RMB size. The proposed model uses grid-based maps, where the robot's next move is determined by the adaptive cost function, searching among the eight surrounding neighbor grid cells. The cumulative value from the output data arrays is used to determine the optimal motion block size, which is formulated based on parameters. The proposed RMB significantly reduces the search time and the number of search nodes of the A* algorithm while maintaining almost the same path cost for reaching the goal position while avoiding obstacles. For the experiments, a benchmark online dataset is used, from which maps of three different dimensions are prepared. The proposed approach is validated using approximately 7000 different grid maps with various dimensions and obstacle environments. The proposed model with an optimal RMB demonstrated a remarkable improvement of 93.98% in the number of search cells and 98.94% in time complexity compared to the conventional A* algorithm. The path cost of the proposed model remained largely comparable to that of other state-of-the-art algorithms. Also, the proposed model outperforms other state-of-the-art algorithms.
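For orientation, the baseline that the abstract compares against can be sketched as a conventional A* search over the eight neighboring grid cells. This is a minimal sketch of the standard algorithm, not the proposed RMB or its adaptive cost function: the unit and diagonal step costs, the octile heuristic, and the function name `astar` are assumptions made here for illustration. The count of expanded cells is returned because it is the kind of quantity the abstract reports improvements on.

```python
import heapq
import math

SQRT2 = math.sqrt(2)
# Eight-connected moves: (row delta, column delta).
MOVES = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def astar(grid, start, goal):
    """Conventional A* on a grid where 0 is free and 1 is an obstacle.

    Returns (path, cost, expanded); path is None when the goal is unreachable.
    Simplification: diagonal moves are allowed to cut obstacle corners.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Octile distance: admissible for unit straight and sqrt(2) diagonal steps.
        dr, dc = abs(cell[0] - goal[0]), abs(cell[1] - goal[1])
        return max(dr, dc) + (SQRT2 - 1) * min(dr, dc)

    open_heap = [(h(start), 0.0, start)]  # entries are (f, g, cell)
    best_g = {start: 0.0}
    parent = {start: None}
    closed = set()
    expanded = 0

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        expanded += 1
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], g, expanded
        for dr, dc in MOVES:
            nr, nc = cell[0] + dr, cell[1] + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            ng = g + (SQRT2 if dr and dc else 1.0)
            nxt = (nr, nc)
            if ng < best_g.get(nxt, math.inf):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None, math.inf, expanded


free_grid = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
path, cost, expanded = astar(free_grid, (0, 0), (2, 2))
```

On the small open grid above, the search follows the diagonal from the start corner to the goal corner. Changing which neighbor cells are considered at each step is where a modified motion block would plug in.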


NusaCrowd: Open Source Initiative for Indonesian NLP Resources

Cahyawijaya, Samuel, Lovenia, Holy, Aji, Alham Fikri, Winata, Genta Indra, Wilie, Bryan, Mahendra, Rahmad, Wibisono, Christian, Romadhony, Ade, Vincentio, Karissa, Koto, Fajri, Santoso, Jennifer, Moeljadi, David, Wirawan, Cahya, Hudi, Frederikus, Parmonangan, Ivan Halim, Alfina, Ika, Wicaksono, Muhammad Satrio, Putra, Ilham Firdausi, Rahmadani, Samsul, Oenang, Yulianti, Septiandri, Ali Akbar, Jaya, James, Dhole, Kaustubh D., Suryani, Arie Ardiyanti, Putri, Rifki Afina, Su, Dan, Stevens, Keith, Nityasya, Made Nindyatama, Adilazuarda, Muhammad Farid, Ignatius, Ryan, Diandaru, Ryandito, Yu, Tiezheng, Ghifari, Vito, Dai, Wenliang, Xu, Yan, Damapuspita, Dyah, Tho, Cuk, Karo, Ichwanul Muslim Karo, Fatyanosa, Tirana Noor, Ji, Ziwei, Fung, Pascale, Neubig, Graham, Baldwin, Timothy, Ruder, Sebastian, Sujaini, Herry, Sakti, Sakriani, Purwarianti, Ayu

arXiv.org Artificial Intelligence

We present NusaCrowd, a collaborative initiative to collect and unify existing resources for Indonesian languages, including opening access to previously non-public resources. Through this initiative, we have brought together 137 datasets and 118 standardized data loaders. The quality of the datasets has been assessed manually and automatically, and their value is demonstrated through multiple experiments. NusaCrowd's data collection enables the creation of the first zero-shot benchmarks for natural language understanding and generation in Indonesian and the local languages of Indonesia. Furthermore, NusaCrowd enables the creation of the first multilingual automatic speech recognition benchmark in Indonesian and the local languages of Indonesia. Our work strives to advance natural language processing (NLP) research for languages that are under-represented despite being widely spoken.


Keeping it Real: Using Real-World Problems to Teach AI to Diverse Audiences

Sintov, Nicole (The Ohio State University) | Kar, Debarun (University of Southern California) | Nguyen, Thanh (University of Michigan) | Fang, Fei (Carnegie Mellon University) | Hoffman, Kevin (Aspire Public Schools) | Lyet, Arnaud (World Wildlife Fund) | Tambe, Milind (University of Southern California)

AI Magazine

In recent years, AI-based applications have increasingly been used in real-world domains. For example, game theory-based decision aids have been successfully deployed in various security settings to protect ports, airports, and wildlife. This article describes our unique problem-to-project educational approach that used games rooted in real-world issues to teach AI concepts to diverse audiences. Specifically, our educational program began by presenting real-world security issues, and progressively introduced complex AI concepts using lectures, interactive exercises, and ultimately hands-on games to promote learning. We describe our experience in applying this approach to several audiences, including students of an urban public high school, university undergraduates, and security domain experts who protect wildlife. We evaluated our approach based on results from the games and participant surveys.


From the Lab to the Classroom and Beyond: Extending a Game-Based Research Platform for Teaching AI to Diverse Audiences

Sintov, Nicole (University of Southern California) | Kar, Debarun (University of Southern California) | Nguyen, Thanh (University of Southern California) | Fang, Fei (University of Southern California) | Hoffman, Kevin (Aspire Public Schools) | Lyet, Arnaud (World Wildlife Fund) | Tambe, Milind (University of Southern California)

AAAI Conferences

Recent years have seen increasing interest in AI from outside the AI community. This is partly due to applications based on AI that have been used in real-world domains, for example, the successful deployment of game theory-based decision aids in security domains. This paper describes our teaching approach for introducing the AI concepts underlying security games to diverse audiences. We adapted a game-based research platform that served as a testbed for recent research advances in computational game theory into a set of interactive role-playing games. We guided learners in playing these games as part of our teaching strategy, which also included didactic instruction and interactive exercises on broader AI topics. We describe our experience in applying this teaching approach to diverse audiences, including students of an urban public high school, university undergraduates, and security domain experts who protect wildlife. We evaluate our approach based on results from the games and participant surveys.


Avian Influenza (H5N1) Warning System using Dempster-Shafer Theory and Web Mapping

Maseleno, Andino, Hasan, Md. Mahmud

arXiv.org Artificial Intelligence

According to the cumulative number of confirmed human cases of avian influenza (H5N1) reported to the World Health Organization (WHO) in 2011 from 15 countries, Indonesia has the largest number of deaths from avian influenza, with 146 deaths. In this research, we built an early warning system for avian influenza using Web Mapping and Dempster-Shafer theory. Early warning is the provision of timely and effective information, through identified institutions, that allows individuals exposed to a hazard to take action to avoid or reduce their risk and prepare for effective response. As an example, we use five major symptoms, which include depression, combs, wattle, bluish face region, swollen face region, narrowness of eyes, and balance disorders. The research location is Lampung Province, in southern Sumatra, chosen because it has a high poultry population. Geographically, Lampung Province is located at 103°40' to 105°50' East longitude and 6°45' to 3°45' South latitude, bordered by South Sumatera and Bengkulu on the north side, the Sunda Strait on the south side, the Java Sea on the east side, and the Indian Ocean on the west side. Our approach uses Dempster-Shafer theory to combine beliefs in certain hypotheses under conditions of uncertainty and ignorance, and allows quantitative measurement of the belief and plausibility in our identification result. Web Mapping is also used to display maps on a screen to visualize the result of the identification process. The results reveal that the avian influenza warning system has successfully identified the existence of avian influenza and that the maps can be displayed as the visualization.
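The combination of beliefs described above can be illustrated with a minimal sketch of Dempster's rule of combination, together with the belief and plausibility measures. The frame of discernment, the hypothesis names, and the mass values below are hypothetical and are not taken from the paper; each mass function is represented as a dictionary from a `frozenset` of hypotheses to its mass.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Products of masses whose focal sets do not intersect count as conflict K,
    and the remaining masses are renormalized by 1 - K.
    """
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict: the sources cannot be combined")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


def belief(m, hypothesis):
    """Sum of masses of all focal sets contained in the hypothesis."""
    return sum(v for s, v in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Sum of masses of all focal sets that intersect the hypothesis."""
    return sum(v for s, v in m.items() if s & hypothesis)


# Hypothetical frame of discernment and evidence from two observed symptoms.
FRAME = frozenset({"avian_influenza", "other_disease"})
AI = frozenset({"avian_influenza"})

m_symptom_1 = {AI: 0.6, FRAME: 0.4}  # mass on FRAME expresses ignorance
m_symptom_2 = {AI: 0.7, FRAME: 0.3}

m = combine(m_symptom_1, m_symptom_2)
bel_ai = belief(m, AI)
pl_ai = plausibility(m, AI)
```

With these illustrative masses, the two symptoms reinforce each other: the combined belief in avian influenza is 0.88, higher than either source alone, and the plausibility is 1.0 because no evidence contradicts the hypothesis. The gap between belief and plausibility reflects the remaining ignorance.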


Avian Influenza (H5N1) Expert System using Dempster-Shafer Theory

Maseleno, Andino, Hasan, Md. Mahmud

arXiv.org Artificial Intelligence

According to the cumulative number of confirmed human cases of avian influenza (H5N1) reported to the World Health Organization (WHO) in 2011 from 15 countries, Indonesia has the largest number of deaths from avian influenza, with 146 deaths. In this research, we built an Avian Influenza (H5N1) Expert System for identifying avian influenza disease and displaying the result of the identification process. In this paper, we describe five major symptoms, which include depression, combs, wattle, bluish face region, swollen face region, narrowness of eyes, and balance disorders. We use chickens as the research object. The research location is Lampung Province, in southern Sumatra, chosen because it has a high poultry population. The inference engine of the expert system uses Dempster-Shafer theory to quantify the degree of belief: our approach combines beliefs under conditions of uncertainty and ignorance, and allows quantitative measurement of the belief and plausibility in our identification result. The results reveal that the Avian Influenza (H5N1) Expert System has successfully identified the existence of avian influenza and displayed the result of the identification process.