AITopics

Evolution of Concepts in Language Model Pre-Training

Ge, Xuyang, Shu, Wentao, Wu, Jiaxing, Zhou, Yunhua, He, Zhengfu, Qiu, Xipeng

Language models obtain extensive capabilities through pre-training. However, the pre-training process remains a black box. In this work, we track linear interpretable feature evolution across pre-training snapshots using a sparse dictionary learning method called crosscoders. We find that most features begin to form around a specific point, while more complex patterns emerge in later training stages. Feature attribution analyses reveal causal connections between feature evolution and downstream performance. Our feature-level observations are highly consistent with previous findings on Transformer's two-stage learning process, which we term a statistical learning phase and a feature learning phase. Our work opens up the possibility to track fine-grained representation progress during language model learning dynamics.

large language model, machine learning, natural language, (19 more...)

2509.17196

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Machine Learning, Sep-23-2025

Near-Optimal Sample Complexity Bounds for Constrained Average-Reward MDPs

Wei, Yukuan, Li, Xudong, Yang, Lin F.

Recent advances have significantly improved our understanding of the sample complexity of learning in average-reward Markov decision processes (AMDPs) under the generative model. However, much less is known about the constrained average-reward MDP (CAMDP), where policies must satisfy long-run average constraints. In this work, we address this gap by studying the sample complexity of learning an $\epsilon$-optimal policy in CAMDPs under a generative model. We propose a model-based algorithm that operates under two settings: (i) relaxed feasibility, which allows small constraint violations, and (ii) strict feasibility, where the output policy satisfies the constraint. We show that our algorithm achieves sample complexities of $\tilde{O}\left(\frac{S A (B+H)}{\epsilon^2}\right)$ and $\tilde{O}\left(\frac{S A (B+H)}{\epsilon^2 \zeta^2}\right)$ under the relaxed and strict feasibility settings, respectively. Here, $\zeta$ is the Slater constant indicating the size of the feasible region, $H$ is the span bound of the bias function, and $B$ is the transient time bound. Moreover, a matching lower bound of $\tilde{\Omega}\left(\frac{S A (B+H)}{\epsilon^2 \zeta^2}\right)$ for the strict feasibility case is established, thus providing the first minimax-optimal bounds for CAMDPs. Our results close the theoretical gap in understanding the complexity of constrained average-reward MDPs.
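To make the scaling in the two bounds concrete, the following is a minimal sketch that evaluates only the leading terms stated in the abstract. It drops all constants and logarithmic factors hidden by the tilde notation, so the numbers are not actual sample counts. The function name and the example values are illustrative assumptions, not taken from the paper.

```python
def camdp_bound_scaling(S, A, B, H, eps, zeta):
    """Leading-order scaling of the stated bounds.

    Assumption: constants and log factors hidden by the tilde notation
    are ignored, so only ratios between the results are meaningful.
    """
    if eps <= 0 or zeta <= 0:
        raise ValueError("eps and zeta must be positive")
    relaxed = S * A * (B + H) / eps**2   # relaxed feasibility
    strict = relaxed / zeta**2           # strict feasibility adds 1/zeta^2
    return relaxed, strict


# Hypothetical problem sizes, chosen only for illustration.
relaxed, strict = camdp_bound_scaling(S=10, A=4, B=5.0, H=3.0, eps=0.1, zeta=0.5)
print(f"strict / relaxed = {strict / relaxed:.1f}")
```

Under these assumptions the strict-feasibility term exceeds the relaxed one by exactly a factor of $1/\zeta^2$, which is the cost of requiring the output policy to satisfy the constraint.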

algorithm, conference paper, sample complexity, (15 more...)

arXiv.org Machine Learning

2509.16586

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation

Villa-Cueva, Emilio, Bolatzhanova, Sholpan, Turmakhan, Diana, Elzeky, Kareem, Ademtew, Henok Biadglign, Aji, Alham Fikri, Araujo, Vladimir, Azime, Israel Abebe, Baek, Jinheon, Belcavello, Frederico, Cristobal, Fermin, Cruz, Jan Christian Blaise, Dabre, Mary, Dabre, Raj, Ehsan, Toqeer, Etori, Naome A, Farooqui, Fauzan, Geng, Jiahui, Ivetta, Guido, Jayakumar, Thanmay, Jeong, Soyeong, Lim, Zheng Wei, Mandal, Aishik, Martinelli, Sofia, Mihaylov, Mihail Minkov, Orel, Daniil, Pramanick, Aniket, Purkayastha, Sukannya, Salazar, Israfel, Song, Haiyue, Torrent, Tiago Timponi, Yadeta, Debela Desalegn, Hamed, Injy, Tonja, Atnafu Lambebo, Solorio, Thamar

Translating cultural content poses challenges for machine translation systems due to the differences in conceptualizations between cultures, where language alone may fail to convey sufficient context to capture region-specific meanings. In this work, we investigate whether images can act as cultural context in multimodal translation. We introduce CaMMT, a human-curated benchmark of over 5,800 triples of images along with parallel captions in English and regional languages. Using this dataset, we evaluate five Vision Language Models (VLMs) in text-only and text+image settings. Through automatic and human evaluations, we find that visual context generally improves translation quality, especially in handling Culturally-Specific Items (CSIs), disambiguation, and correct gender marking. By releasing CaMMT, our objective is to support broader efforts to build and evaluate multimodal translation systems that are better aligned with cultural nuance and regional variations.

machine learning, natural language, translation, (18 more...)

2505.24456

Country:

Asia (1.00)
Europe (0.93)
Africa (0.68)
South America > Argentina (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

DeDisCo at the DISRPT 2025 Shared Task: A System for Discourse Relation Classification

Ju, Zhuoxuan, Wu, Jingni, Purushothama, Abhishek, Zeldes, Amir

This paper presents DeDisCo, Georgetown University's entry in the DISRPT 2025 shared task on discourse relation classification. We test two approaches: an mt5-based encoder and a decoder-based approach using the openly available Qwen model. We also experiment with training on an augmented dataset for low-resource languages, using matched data translated automatically from English, as well as with additional linguistic features inspired by entries in previous editions of the Shared Task. Our system achieves a macro-accuracy score of 71.28, and we provide some interpretation and error analysis for our results.

computational linguistic, large language model, machine learning, (17 more...)

2509.11498

Country:

Europe (1.00)
South America (0.67)
North America > United States > Maryland (0.28)
Asia > Japan > Honshū (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

An AutoML Framework using AutoGluonTS for Forecasting Seasonal Extreme Temperatures

Rodríguez-Bocca, Pablo, Pereira, Guillermo, Kiedanski, Diego, Collazo, Soledad, Basterrech, Sebastián, Rubino, Gerardo

In recent years, great progress has been made in the field of forecasting meteorological variables. Recently, deep learning architectures have made a major breakthrough in forecasting the daily average temperature over a ten-day horizon. However, advances in forecasting events related to the maximum temperature over short horizons remain a challenge for the community. A problem that is even more complex consists in making predictions of the maximum daily temperatures in the short, medium, and long term. In this work, we focus on forecasting events related to the maximum daily temperature over medium-term periods (90 days). Therefore, instead of addressing the problem from a meteorological point of view, this article tackles it from a climatological point of view. Due to the complexity of this problem, a common approach is to frame the study as a temporal classification problem with the classes: maximum temperature "above normal", "normal" or "below normal". From a practical point of view, we created a large historical dataset (from 1981 to 2018) collecting information from weather stations located in South America. In addition, we also integrated exogenous information from the Pacific, Atlantic, and Indian Ocean basins. We applied the AutoGluonTS platform to solve the above-mentioned problem. This AutoML tool shows competitive forecasting performance with respect to large operational platforms dedicated to tackling this climatological problem, but at a "relatively" low computational cost in terms of time and resources.

artificial intelligence, forecasting seasonal extreme temperature, machine learning, (14 more...)

2509.17734

Country:

South America (1.00)
Europe > Spain (0.28)
Europe > France (0.28)
Europe > Denmark (0.28)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.66)

Industry: Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Automated Labeling of Intracranial Arteries with Uncertainty Quantification Using Deep Learning

Bisbal, Javier, Winter, Patrick, Jofre, Sebastian, Ponce, Aaron, Ansari, Sameer A., Abdalla, Ramez, Markl, Michael, Odeback, Oliver Welin, Uribe, Sergio, Tejos, Cristian, Sotelo, Julio, Schnell, Susanne, Marlevi, David

Accurate anatomical labeling of intracranial arteries is essential for cerebrovascular diagnosis and hemodynamic analysis but remains time-consuming and subject to interoperator variability. We present a deep learning-based framework for automated artery labeling from 3D Time-of-Flight Magnetic Resonance Angiography (3D ToF-MRA) segmentations (n=35), incorporating uncertainty quantification to enhance interpretability and reliability. We evaluated three convolutional neural network architectures: (1) a UNet with residual encoder blocks, reflecting commonly used baselines in vascular labeling; (2) CS-Net, an attention-augmented UNet incorporating channel and spatial attention mechanisms for enhanced curvilinear structure recognition; and (3) nnUNet, a self-configuring framework that automates preprocessing, training, and architectural adaptation based on dataset characteristics. Among these, nnUNet achieved the highest labeling performance (average Dice score: 0.922; average surface distance: 0.387 mm), with improved robustness in anatomically complex vessels. To assess predictive confidence, we implemented test-time augmentation (TTA) and introduced a novel coordinate-guided strategy to reduce interpolation errors during augmented inference. The resulting uncertainty maps reliably indicated regions of anatomical ambiguity, pathological variation, or manual labeling inconsistency. We further validated clinical utility by comparing flow velocities derived from automated and manual labels in co-registered 4D Flow MRI datasets, observing close agreement with no statistically significant differences. Our framework offers a scalable, accurate, and uncertainty-aware solution for automated cerebrovascular labeling, supporting downstream hemodynamic analysis and facilitating clinical integration.

Introduction: The intracranial arterial system plays a critical role in brain perfusion to maintain normal cognitive function.
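For readers unfamiliar with the Dice score quoted in this entry, the sketch below computes it for a single label from two sets of voxel coordinates, using the standard definition 2|P ∩ T| / (|P| + |T|). It is a generic illustration, not the authors' evaluation code; the function name and the toy voxel sets are assumptions made for this example.

```python
def dice_score(pred_voxels, true_voxels):
    """Dice overlap between two collections of voxel coordinates.

    Assumption: each voxel is a hashable tuple such as (x, y, z).
    Two empty masks are treated as a perfect match (score 1.0).
    """
    pred, truth = set(pred_voxels), set(true_voxels)
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


# Toy example: two 3-voxel segments that share 2 voxels.
predicted = [(0, 0, 0), (0, 0, 1), (0, 0, 2)]
reference = [(0, 0, 1), (0, 0, 2), (0, 0, 3)]
print(round(dice_score(predicted, reference), 3))
```

A score of 1.0 means the automated and manual labels coincide exactly for that vessel, and 0.0 means they share no voxels.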

artificial intelligence, machine learning, nnunet, (18 more...)

2509.17726

Country:

North America > United States (0.68)
Europe (0.68)
South America (0.47)

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study > Negative Result (0.34)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mechanistic Interpretability with SAEs: Probing Religion, Violence, and Geography in Large Language Models

Simbeck, Katharina, Mahran, Mariam

Despite growing research on bias in large language models (LLMs), most work has focused on gender and race, with little attention to religious identity. This paper explores how religion is internally represented in LLMs and how it intersects with concepts of violence and geography. Using mechanistic interpretability and Sparse Autoencoders (SAEs) via the Neuronpedia API, we analyze latent feature activations across five models. We measure overlap between religion- and violence-related prompts and probe semantic patterns in activation contexts. While all five religions show comparable internal cohesion, Islam is more frequently linked to features associated with violent language. In contrast, geographic associations largely reflect real-world religious demographics, revealing how models embed both factual distributions and cultural stereotypes. These findings highlight the value of structural analysis in auditing not just outputs but also internal representations that shape model behavior.

large language model, machine learning, religion, (18 more...)

2509.17665

Country:

Oceania (1.00)
North America > Canada (1.00)
Europe > United Kingdom (1.00)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

CLaC at DISRPT 2025: Hierarchical Adapters for Cross-Framework Multi-lingual Discourse Relation Classification

Turk, Nawar, Comitogianni, Daniele, Kosseim, Leila

We present our submission to Task 3 (Discourse Relation Classification) of the DISRPT 2025 shared task. Task 3 introduces a unified set of 17 discourse relation labels across 39 corpora in 16 languages and six discourse frameworks, posing significant multilingual and cross-formalism challenges. We first benchmark the task by fine-tuning multilingual BERT-based models (mBERT, XLM-RoBERTa-Base, and XLM-RoBERTa-Large) with two argument-ordering strategies and progressive unfreezing ratios to establish strong baselines. We then evaluate prompt-based large language models (namely Claude Opus 4.0) in zero-shot and few-shot settings to understand how LLMs respond to the newly proposed unified labels. Finally, we introduce HiDAC, a Hierarchical Dual-Adapter Contrastive learning model. Results show that while larger transformer models achieve higher accuracy, the improvements are modest, and that unfreezing the top 75% of encoder layers yields performance comparable to full fine-tuning while training far fewer parameters. Prompt-based models lag significantly behind fine-tuned transformers, and HiDAC achieves the highest overall accuracy (67.5%) while remaining more parameter-efficient than full fine-tuning.
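As an illustration of the partial-unfreezing idea mentioned above, this minimal sketch selects which encoder layers would be trainable for a given ratio, counting from the top of the stack. The abstract does not say how the ratio is rounded or how layers are indexed, so the ceiling rounding, the zero-based indexing, and the function name are all assumptions.

```python
import math


def top_layers_to_unfreeze(num_layers: int, ratio: float) -> list[int]:
    """Return zero-based indices of the top `ratio` share of layers.

    Assumptions: layer 0 is closest to the input, and fractional layer
    counts are rounded up. The remaining lower layers would stay frozen.
    """
    if num_layers < 0 or not 0.0 <= ratio <= 1.0:
        raise ValueError("need num_layers >= 0 and 0 <= ratio <= 1")
    n_unfrozen = math.ceil(num_layers * ratio)
    return list(range(num_layers - n_unfrozen, num_layers))


# Example: a 12-layer encoder with the top 75% of layers trainable.
trainable = top_layers_to_unfreeze(num_layers=12, ratio=0.75)
print(trainable)
```

In a training script, the returned indices would typically decide which layers have gradient updates enabled, while the rest keep their pre-trained weights.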

large language model, machine learning, natural language, (14 more...)

2509.16903

Country:

North America > United States (1.00)
Europe (1.00)
North America > Canada (0.69)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Exploring AI Capabilities in Participatory Budgeting within Smart Cities: The Case of Sao Paulo

Sousa, Italo Alberto, da Silva, Mariana Carvalho, Machado, Jorge, Vaz, José Carlos

This research examines how Artificial Intelligence (AI) can improve participatory budgeting processes within smart cities. In response to challenges like declining civic participation and resource allocation conflicts, the study explores how online political participation can be improved by AI. It investigates the state capacity governments need to implement AI-enhanced participatory tools, considering technological dependencies and vulnerabilities. It analyzes technological and administrative structures, actors, interests, and strategies to understand the dynamics of online political participation technologies in the case of Sao Paulo, Brazil. The study contributes to understanding how technological advancements can reshape participatory budgeting processes. In a broader sense, the research highlights how AI can transform participatory institutions by offering new tools for citizens and also for government officials in charge of participatory processes within smart cities.

artificial intelligence, natural language, proposal, (17 more...)

2509.16724

Country: South America > Brazil > São Paulo > São Paulo (0.14)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Government (1.00)
Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Applied AI (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)