AITopics

Linguistic features remain essential for interpretability and tasks that involve style, structure, and readability, but existing Spanish tools offer limited coverage. We present PUCP-Metrix, an open-source and comprehensive toolkit for linguistic analysis of Spanish texts. PUCP-Metrix includes 182 linguistic metrics spanning lexical diversity, syntactic and semantic complexity, cohesion, psycholinguistics, and readability. It enables fine-grained, interpretable text analysis. We evaluate its usefulness on Automated Readability Assessment and Machine-Generated Text Detection, showing competitive performance compared to an existing repository and strong neural baselines. PUCP-Metrix offers a comprehensive and extensible resource for Spanish, supporting diverse NLP applications.

artificial intelligence, natural language, text processing, (17 more...)

2511.17402

Country: North America > Mexico > Mexico City (0.14)

Genre: Research Report (0.64)

Industry: Education (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)

Thomas, Marshall, Fish, Edward, Bowden, Richard

SignBind-LLM: Multi-Stage Modality Fusion for Sign Language Translation

Despite progress in gloss-free Sign Language Translation (SLT), traditional single-modality end-to-end approaches consistently fail on two critical components of natural signing: the precise recognition of high-speed fingerspelling and the integration of asynchronous non-manual cues from the face. Recent progress in SLT with Large Language Models has sidestepped this challenge, forcing a single network to learn these simultaneously, which results in poor performance when translating crucial information such as names, places, and technical terms. We introduce SignBind-LLM, a modular framework designed to overcome these limitations. Our approach employs separate, specialized predictors for continuous signing, fingerspelling, and lipreading. Each expert network first decodes its specific modality into a sequence of tokens. These parallel streams are then fused by a lightweight transformer that resolves temporal misalignments before passing the combined representation to a Large Language Model (LLM) for final sentence generation. Our method establishes a new state-of-the-art on the How2Sign, ChicagoFSWildPlus, and BOBSL datasets with a BLEU-4 score of 22.1, 73.2% letter accuracy, and a BLEU-4 score of 6.8, respectively. These results validate our core hypothesis: isolating and solving distinct recognition tasks before fusion provides a more powerful and effective pathway to robust, high-fidelity sign language translation.

large language model, machine learning, translation, (18 more...)

2509.0003

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Education > Curriculum > Subject-Specific Education (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Susman, Aviad, Lin, Baihan, Suárez-Fariñas, Mayte, Colonel, Joseph T

SoftStep: Learning Sparse Similarity Powers Deep Neighbor-Based Regression

Neighbor-based methods are a natural alternative to linear prediction for tabular data when relationships between inputs and targets exhibit complexity such as nonlinearity, periodicity, or heteroscedasticity. Yet in deep learning on unstructured data, nonparametric neighbor-based approaches are rarely implemented in lieu of simple linear heads. This is primarily due to the ability of systems equipped with linear regression heads to co-learn internal representations along with the linear head's parameters. To unlock the full potential of neighbor-based methods in neural networks we introduce SoftStep, a parametric module that learns sparse instance-wise similarity measures directly from data. When integrated with existing neighbor-based methods, SoftStep enables regression models that consistently outperform linear heads across diverse architectures, domains, and training scenarios. We focus on regression tasks, where we show theoretically that neighbor-based prediction with a mean squared error objective constitutes a metric learning algorithm that induces well-structured embedding spaces. We then demonstrate analytically and empirically that this representational structure translates into superior performance when combined with the sparse, instance-wise similarity measures introduced by SoftStep. Beyond regression, SoftStep is a general method for learning instance-wise similarity in deep neural networks, with broad applicability to attention mechanisms, metric learning, representational alignment, and related paradigms.

artificial intelligence, machine learning, softstep, (18 more...)

2506.08139

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.93)
Health & Medicine > Nuclear Medicine (0.68)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.46)
Education > Educational Technology > Educational Software > Computer Based Training (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data

Ji, Shaoxiong, Li, Zihao, Paavola, Jaakko, Luo, Hengyu, Tiedemann, Jörg

This paper investigates a critical design decision in the practice of massively multilingual continual pre-training -- the inclusion of parallel data. Specifically, we study the impact of bilingual translation data for massively multilingual language adaptation of the Llama3 family of models to 500 languages. To this end, we construct the MaLA bilingual translation corpus, containing data from more than 2,500 language pairs. Subsequently, we develop the EMMA-500 Llama 3 suite of four massively multilingual models -- continually pre-trained from the Llama 3 family of base models extensively on diverse data mixes up to 671B tokens -- and explore the effect of continual pre-training with or without bilingual translation data. Comprehensive evaluation across 7 tasks and 12 benchmarks demonstrates that bilingual data tends to enhance language transfer and performance, particularly for low-resource languages. We open-source the MaLA corpus, EMMA-500 Llama 3 suite artefacts, code, and model generations.

large language model, latn, machine learning, (21 more...)

2506.00469

Country:

Europe (1.00)
North America > Canada (0.27)
North America > United States > Pennsylvania (0.27)
Asia > Middle East (0.27)

Genre: Research Report > New Finding (0.92)

Industry:

Education (0.46)
Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Turing Test 2.0: The General Intelligence Threshold

Mappouras, Georgios

With the rise of artificial intelligence (A.I.) and large language models like ChatGPT, a new race for achieving artificial general intelligence (A.G.I.) has started. While many speculate how and when A.I. will achieve A.G.I., there is no clear agreement on how A.G.I. can be detected in A.I. models, even when popular tools like the Turing test (and its modern variations) are used to measure their intelligence. In this work, we discuss why traditional methods like the Turing test do not suffice for measuring or detecting A.G.I. and provide a new, practical method that can be used to decide if a system (computer or any other) has reached or surpassed A.G.I. To achieve this, we make two new contributions. First, we present a clear definition for general intelligence (G.I.) and set a G.I. Threshold (G.I.T.) that can be used to distinguish between systems that achieve A.G.I. and systems that do not. Second, we present a new framework on how to construct tests that can detect if a system has achieved G.I. in a simple, comprehensive, and clear-cut pass/fail way. We call this novel framework the Turing test 2.0. We then demonstrate real-life examples of applying tests that follow our Turing test 2.0 framework on modern A.I. models.

information, large language model, machine learning, (16 more...)

2505.1955

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry:

Education (1.00)
Leisure & Entertainment (0.93)
Health & Medicine > Therapeutic Area > Neurology (0.92)
Media (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Turing's Test (1.00)
(2 more...)

The Japan Times, Dec-4-2025, 08:40:00 GMT

Spotlight shines on humanoid robots at Tokyo show

A humanoid robot from GMO Internet Group dances and hops at the 2025 International Robot Exhibition on Wednesday in Tokyo. Robots equipped with cutting-edge technologies that perform duties on behalf of humans at workplaces and disaster-hit sites are on display at the 2025 International Robot Exhibition in Tokyo. At the exhibition, which kicked off at Tokyo Big Sight on Wednesday, the spotlight is on humanoid robots as well as those powered by artificial intelligence. Kawasaki Heavy Industries is showcasing the newest model of its humanoid robot Kaleido, which is equipped with technologies such as autonomous movement and remote control. In a demonstration held the same day, the robot extinguished a mock fire, removed a fallen shelf weighing 30 kilograms and rescued a dummy cat.

artificial intelligence, humanoid robot, robot, (9 more...)

The Japan Times

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (1.00)
North America > United States (0.16)
Asia > Taiwan (0.05)
(3 more...)

Industry:

Government > Regional Government (0.49)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.33)
Education > Educational Setting > K-12 Education (0.31)

Technology: Information Technology > Artificial Intelligence > Robots > Humanoid Robots (1.00)

The Japan Times, Dec-4-2025, 05:40:00 GMT

Police arrest high school student over cyberattack on net cafe operator

Tokyo police served an arrest warrant on a 17-year-old boy on Thursday for allegedly carrying out a cyberattack on the operator of the Kaikatsu Club internet cafe chain, investigative sources said. The Metropolitan Police Department arrested the second-year high school student from the city of Osaka over an alleged violation of the law against unauthorized computer access and fraudulent obstruction of business. According to the sources, the boy fraudulently obtained about 7.25 million sets of Kaikatsu Club membership information with a computer program he created using the ChatGPT artificial intelligence chatbot. The boy is said to have skills strong enough to have won awards in cybersecurity competitions, as reported by TBS.

artificial intelligence, chatbot, natural language, (12 more...)

The Japan Times

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.28)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.26)
North America > United States (0.05)
(4 more...)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government (1.00)
Education > Educational Setting > K-12 Education > Secondary School (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)
Information Technology > Communications > Social Media (0.76)

Los Angeles Times, Dec-4-2025, 04:45:42 GMT

Protest at synagogue in Koreatown ends in arrests, hate accusations

Two were arrested during a pro-Palestinian protest at Wilshire Boulevard Temple that ended in confrontation.

artificial intelligence, social media, wilshire boulevard temple, (13 more...)

Los Angeles Times

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.21)
Asia > Middle East > Israel (0.07)
Asia > Middle East > Palestine > Gaza Strip > Gaza Governorate > Gaza (0.05)
(5 more...)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology (1.00)
Health & Medicine (1.00)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Liu, Pangpang, Lu, Junwei, Sun, Will Wei

Uncertainty Quantification for Large Language Model Reward Learning under Heterogeneous Human Feedback

arXiv.org Machine Learning, Dec-4-2025

We study estimation and statistical inference for reward models used in aligning large language models (LLMs). A key component of LLM alignment is reinforcement learning from human feedback (RLHF), where humans compare pairs of model-generated answers and their preferences are used to train a reward model. However, human feedback is inherently heterogeneous, creating significant challenges for reliable reward learning. To address this, we adopt a heterogeneous preference framework that jointly models the latent reward of answers and human rationality. This leads to a challenging biconvex optimization problem, which we solve via an alternating gradient descent algorithm. We establish theoretical guarantees for the resulting estimator, including its convergence and asymptotic distribution. These results enable the construction of confidence intervals for reward estimates. Leveraging these uncertainty quantification results, we conduct valid statistical comparisons between rewards and incorporate uncertainty into the best-of-$N$ (BoN) policy framework. Extensive simulations demonstrate the effectiveness of our method, and applications to real LLM data highlight the practical value of accounting for uncertainty in reward modeling for LLM alignment.

arxiv preprint arxiv, assumption 1, probability, (14 more...)

arXiv.org Machine Learning

2512.03208

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.81)

Industry:

Education (0.67)
Media > Film (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

arXiv.org Artificial Intelligence, Dec-4-2025

SkillFactory: Self-Distillation For Learning Cognitive Behaviors

Sprague, Zayne, Lu, Jack, Wadhwa, Manya, Keh, Sedrick, Ren, Mengye, Durrett, Greg

Reasoning models leveraging long chains of thought employ various cognitive skills, such as verification of their answers, backtracking, retrying by an alternate method, and more. Previous work has shown that when a base language model exhibits these skills, further training that model with reinforcement learning (RL) can teach it to leverage them. How can we get models to leverage skills that aren't exhibited by base models? Our work, SkillFactory, is a method for fine-tuning models to roughly learn these skills during a supervised fine-tuning (SFT) stage prior to RL. Our approach does not rely on distillation from a stronger model, but instead uses samples from the model itself, rearranged to provide training data in the format of those skills. These "silver" SFT traces may be imperfect, but are nevertheless effective for priming a model to acquire skills during RL. Our evaluation shows that (1) starting from SkillFactory SFT initialization helps a model to generalize to harder variants of a task post-RL, despite lower performance pre-RL; (2) cognitive skills are indeed used by the model; (3) RLed SkillFactory models are more robust to regression on out-of-domain tasks than RLed base models. Our work suggests that inductive biases learned prior to RL help models learn robust cognitive skill use.

large language model, machine learning, natural language, (21 more...)

2512.04072

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
(2 more...)