AITopics | Aljunied, Mahani

Aljunied, Mahani

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

Zhao, Yiran, Liu, Chaoqun, Deng, Yue, Ying, Jiahao, Aljunied, Mahani, Li, Zhaodonghui, Bing, Lidong, Chan, Hou Pong, Rong, Yu, Zhao, Deli, Zhang, Wenxuan

arXiv.org Artificial Intelligence, Mar-2-2025

Large language models (LLMs) have revolutionized natural language processing (NLP), yet open-source multilingual LLMs remain scarce, with existing models often limited in language coverage. Such models typically prioritize well-resourced languages, while widely spoken but under-resourced languages are often overlooked. To address this disparity, we introduce Babel, an open multilingual LLM that covers the top 25 languages by number of speakers, supports over 90% of the global population, and includes many languages neglected by other open multilingual LLMs. Unlike traditional continued-pretraining approaches, Babel expands its parameter count through a layer extension technique that elevates Babel's performance ceiling. We introduce two variants: Babel-9B, designed for efficient inference and fine-tuning, and Babel-83B, which sets a new standard for open multilingual LLMs. Extensive evaluations on multilingual tasks demonstrate Babel's superior performance compared to open LLMs of comparable size. In addition, using open-source supervised fine-tuning datasets, Babel achieves remarkable performance, with Babel-9B-Chat leading among 10B-sized LLMs and Babel-83B-Chat setting a new standard for multilingual tasks, reaching the same level as commercial models.

large language model, machine learning, natural language, (17 more...)

arXiv:2503.00865

Country:

Asia (0.31)
North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

SeaExam and SeaBench: Benchmarking LLMs with Local Multilingual Questions in Southeast Asia

Liu, Chaoqun, Zhang, Wenxuan, Ying, Jiahao, Aljunied, Mahani, Luu, Anh Tuan, Bing, Lidong

arXiv.org Artificial Intelligence, Feb-10-2025

This study introduces two novel benchmarks, SeaExam and SeaBench, designed to evaluate the capabilities of Large Language Models (LLMs) in Southeast Asian (SEA) application scenarios. Unlike existing multilingual datasets primarily derived from English translations, these benchmarks are constructed based on real-world scenarios from SEA regions. SeaExam draws from regional educational exams to form a comprehensive dataset that encompasses subjects such as local history and literature. In contrast, SeaBench is crafted around multi-turn, open-ended tasks that reflect daily interactions within SEA communities. Our evaluations demonstrate that SeaExam and SeaBench discern LLM performance on SEA language tasks more effectively than their translated counterparts. This highlights the importance of using real-world queries to assess the multilingual capabilities of LLMs.

large language model, machine learning, seabench, (20 more...)

arXiv:2502.06298

Country:

Asia > Southeast Asia (0.40)
Europe > Middle East > Malta (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SeaLLMs -- Large Language Models for Southeast Asia

Nguyen, Xuan-Phi, Zhang, Wenxuan, Li, Xin, Aljunied, Mahani, Tan, Qingyu, Cheng, Liying, Chen, Guanzheng, Deng, Yue, Yang, Sen, Liu, Chaoqun, Zhang, Hang, Bing, Lidong

arXiv.org Artificial Intelligence, Dec-1-2023

Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages. To address this imbalance, we introduce SeaLLMs, an innovative series of language models that specifically focuses on Southeast Asian (SEA) languages. SeaLLMs are built upon the Llama-2 model and further advanced through continued pre-training with an extended vocabulary, specialized instruction and alignment tuning to better capture the intricacies of regional languages. This allows them to respect and reflect local cultural norms, customs, stylistic preferences, and legal considerations. Our comprehensive evaluation demonstrates that SeaLLM-13b models exhibit superior performance across a wide spectrum of linguistic tasks and assistant-style instruction-following capabilities relative to comparable open-source models. Moreover, they outperform ChatGPT-3.5 in non-Latin languages, such as Thai, Khmer, Lao, and Burmese, by large margins while remaining lightweight and cost-effective to operate.

large language model, machine learning, natural language, (19 more...)

arXiv:2312.00738

Country: Asia > Southeast Asia (0.41)

Genre: Research Report (0.82)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
