AITopics | Ilocos Region

Collaborating Authors

Ilocos Region

Japan, Philippines to discuss surface-to-ship missile exports

The Japan TimesMay-31-2026, 10:48:00 GMT

A Japan Ground Self-Defense Force Type-88 surface-to-ship missile is fired during the Balikatan exercises at Culili Point Sand Dunes in Paoay, Ilocos Norte province, Philippines, on May 6. | REUTERS Singapore - Defense Minister Shinjiro Koizumi and his Philippine counterpart, Gilberto Teodoro, affirmed Sunday that talks will be launched on the export of surface-to-ship missiles from Japan to the Southeast Asian nation. Koizumi revealed this in talks with reporters after holding a meeting with the Philippine defense chief in Singapore earlier in the day. Type-88 surface-to-ship guided missiles of Japan's Ground Self-Defense Force are expected to be up for consideration. The Philippine side is believed to have shown an interest in procuring the missiles as the Self-Defense Forces used them in the Balikatan multilateral exercises conducted in Manila between April and May. The SDF, which had taken part in the annual exercises organized by the United States and the Philippines as an observer since 2012, joined the drills on a full scale for the first time this year following the entry into force of the Japan-Philippine reciprocal access agreement in September 2025. The possible procurement of Type-88 missiles is expected to help reinforce the deterrent and response capabilities of the Philippines, which is in a territorial dispute with China in the South China Sea.

artificial intelligence, philippines, social media, (14 more...)

The Japan Times

Country:

Asia > Japan (1.00)
Asia > Middle East > Iran (0.41)
Asia > Philippines > Luzon > National Capital Region > City of Manila (0.25)
Asia > Philippines > Luzon > Ilocos Region > Province of Ilocos Norte (0.25)

Industry: Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence (0.98)
Information Technology > Communications > Social Media (0.76)

Add feedback

A Appendix

Neural Information Processing SystemsFeb-17-2026, 07:56:21 GMT

The complete list may be seen in Table 8. Here are a few general notes about these strings: 1. Based on their recommendations, we did the following: 1. zh, zh_Latn: This resulted in the special filters described below. URLs) the corpora were in languages different from the LangID predictions. This is mainly mis-rendered PDFs and may have practical applications for denoising, or for decoding such garbled PDFs.

latn, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Oceania > Tonga (0.04)
North America > United States (0.04)
South America > Peru > Huánuco Department > Huánuco Province > Huánuco (0.04)
(24 more...)

Industry: Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.67)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

Add feedback

Language Model Tokenizers Introduce Unfairness Between Languages

Neural Information Processing SystemsFeb-14-2026, 13:54:12 GMT

Recent language models have shown impressive multilingual performance, even when not explicitly trained for it. Despite this, there are concerns about the quality of their outputs across different languages. In this paper, we show how disparity in the treatment of different languages arises at the tokenization stage, well before a model is even invoked. The same text translated into different languages can have drastically different tok-enization lengths, with differences up to 15 times in some cases. These disparities persist even for tokenizers that are intentionally trained for multilingual support.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > Haiti (0.14)
Asia > Philippines > Luzon > Ilocos Region > Province of Pangasinan (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(38 more...)

Genre: Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Add feedback

A Appendix A.1 LangID Details

Neural Information Processing SystemsOct-9-2025, 08:30:30 GMT

latn, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Oceania > Tonga (0.04)
North America > United States (0.04)
South America > Peru > Huánuco Department > Huánuco Province > Huánuco (0.04)
(24 more...)

Industry: Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.67)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

Add feedback

74bb24dca8334adce292883b4b651eda-Paper-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 22:13:22 GMT

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > Haiti (0.14)
Asia > Philippines > Luzon > Ilocos Region > Province of Pangasinan (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(38 more...)

Genre: Overview (0.46)

Technology:

Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
(2 more...)

Add feedback

Is LLM the Silver Bullet to Low-Resource Languages Machine Translation?

Song, Yewei, Li, Lujun, Lothritz, Cedric, Ezzini, Saad, Sleem, Lama, Gentile, Niccolo, State, Radu, Bissyandé, Tegawendé F., Klein, Jacques

arXiv.org Artificial IntelligenceMar-31-2025

Low-Resource Languages (LRLs) present significant challenges in natural language processing due to their limited linguistic resources and underrepresentation in standard datasets. While recent advancements in Large Language Models (LLMs) and Neural Machine Translation (NMT) have substantially improved translation capabilities for high-resource languages, performance disparities persist for LRLs, particularly impacting privacy-sensitive and resource-constrained scenarios. This paper systematically evaluates the limitations of current LLMs across 200 languages using benchmarks such as FLORES-200. We also explore alternative data sources, including news articles and bilingual dictionaries, and demonstrate how knowledge distillation from large pre-trained models can significantly improve smaller LRL translations. Additionally, we investigate various fine-tuning strategies, revealing that incremental enhancements markedly reduce performance gaps on smaller LLMs.

large language model, machine learning, translation, (17 more...)

arXiv.org Artificial Intelligence

2503.24102

Country:

North America > United States (1.00)
Oceania > Australia (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Do Not Trust Licenses You See: Dataset Compliance Requires Massive-Scale AI-Powered Lifecycle Tracing

Kim, Jaekyeom, Sohn, Sungryull, Jo, Gerrard Jeongwon, Choi, Jihoon, Bae, Kyunghoon, Lee, Hwayoung, Park, Yongmin, Lee, Honglak

arXiv.org Artificial IntelligenceMar-14-2025

This paper argues that a dataset's legal risk cannot be accurately assessed by its license terms alone; instead, tracking dataset redistribution and its full lifecycle is essential. However, this process is too complex for legal experts to handle manually at scale. Tracking dataset provenance, verifying redistribution rights, and assessing evolving legal risks across multiple stages require a level of precision and efficiency that exceeds human capabilities. Addressing this challenge effectively demands AI agents that can systematically trace dataset redistribution, analyze compliance, and identify legal risks. We develop an automated data compliance system called NEXUS and show that AI can perform these tasks with higher accuracy, efficiency, and cost-effectiveness than human experts. Our massive legal analysis of 17,429 unique entities and 8,072 license terms using this approach reveals the discrepancies in legal rights between the original datasets before redistribution and their redistributed subsets, underscoring the necessity of the data lifecycle-aware compliance. For instance, we find that out of 2,852 datasets with commercially viable individual license terms, only 605 (21%) are legally permissible for commercialization. This work sets a new standard for AI data governance, advocating for a framework that systematically examines the entire lifecycle of dataset redistribution to ensure transparent, legal, and responsible dataset management.

dataset, dependency, trust license, (12 more...)

arXiv.org Artificial Intelligence

2503.02784

Country:

Asia > Japan (0.14)
North America > United States > Wisconsin (0.04)
North America > United States > New York (0.04)
(26 more...)

Genre: Research Report (0.84)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(5 more...)

Add feedback

Franken-Adapter: Cross-Lingual Adaptation of LLMs by Embedding Surgery

Jiang, Fan, Yu, Honglin, Chung, Grace, Cohn, Trevor

arXiv.org Artificial IntelligenceFeb-11-2025

The capabilities of Large Language Models (LLMs) in low-resource languages lag far behind those in English, making their universal accessibility a significant challenge. To alleviate this, we present $\textit{Franken-Adapter}$, a modular language adaptation approach for decoder-only LLMs with embedding surgery. Our method begins by creating customized vocabularies for target languages and performing language adaptation through embedding tuning on multilingual data. These pre-trained embeddings are subsequently integrated with LLMs that have been instruction-tuned on English alignment data to enable zero-shot cross-lingual transfer. Our experiments on $\texttt{Gemma2}$ models with up to 27B parameters demonstrate improvements of up to 20% across 96 languages, spanning both discriminative and generative tasks, with minimal regressions ($<$1%) in English. Further in-depth analysis reveals the critical role of customizing tokenizers in enhancing language adaptation, while boosting inference efficiency. Additionally, we show the versatility of our method by achieving a 14% improvement over a math-optimized LLM across 20 languages, offering a modular solution to transfer reasoning abilities across languages post hoc.

franken-adapter, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.08037

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
(17 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Add feedback

DialUp! Modeling the Language Continuum by Adapting Models to Dialects and Dialects to Models

Bafna, Niyati, Chang, Emily, Robinson, Nathaniel R., Mortensen, David R., Murray, Kenton, Yarowsky, David, Sirin, Hale

arXiv.org Artificial IntelligenceJan-27-2025

Most of the world's languages and dialects are low-resource, and lack support in mainstream machine translation (MT) models. However, many of them have a closely-related high-resource language (HRL) neighbor, and differ in linguistically regular ways from it. This underscores the importance of model robustness to dialectical variation and cross-lingual generalization to the HRL dialect continuum. We present DialUp, consisting of a training-time technique for adapting a pretrained model to dialectical data (M->D), and an inference-time intervention adapting dialectical data to the model expertise (D->M). M->D induces model robustness to potentially unseen and unknown dialects by exposure to synthetic data exemplifying linguistic mechanisms of dialectical variation, whereas D->M treats dialectical divergence for known target dialects. These methods show considerable performance gains for several dialects from four language families, and modest gains for two other language families. We also conduct feature and error analyses, which show that language varieties with low baseline MT performance are more likely to benefit from these approaches.

large language model, latn, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2501.16581

Country:

Europe > Sweden (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Philippines > Luzon > Ilocos Region > Province of Pangasinan (0.05)
(21 more...)

Genre:

Personal > Honors (0.46)
Research Report > New Finding (0.45)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Experimental Setups Matter

Blaschke, Verena, Fedzechkina, Masha, ter Hoeve, Maartje

arXiv.org Artificial IntelligenceJan-24-2025

Cross-lingual transfer is a popular approach to increase the amount of training data for NLP tasks in a low-resource context. However, the best strategy to decide which cross-lingual data to include is unclear. Prior research often focuses on a small set of languages from a few language families and/or a single task. It is still an open question how these findings extend to a wider variety of languages and tasks. In this work, we analyze cross-lingual transfer for 266 languages from a wide variety of language families. Moreover, we include three popular NLP tasks: POS tagging, dependency parsing, and topic classification. Our findings indicate that the effect of linguistic similarity on transfer performance depends on a range of factors: the NLP task, the (mono- or multilingual) input representations, and the definition of linguistic similarity.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.14491

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.14)
(26 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback