AITopics

We present Boolean-aware attention, a novel attention mechanism that dynamically adjusts token focus based on Boolean operators (e.g., and, or, not). Our model employs specialized Boolean experts, each tailored to amplify or suppress attention for operator-specific contexts. A predefined gating mechanism activates the corresponding experts based on the detected Boolean type. Experiments on Boolean retrieval datasets demonstrate that integrating BoolAttn with BERT greatly enhances the model's capability to process Boolean queries.

boolean operator, mechanism, operator, (15 more...)

2503.01753

Country:

South America > Brazil (0.04)
North America > United States > Arkansas (0.04)
North America > Mexico (0.04)
Asia > Vietnam (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Betran, Santiago Bou, Longhini, Alberta, Vasco, Miguel, Zhang, Yuchong, Kragic, Danica

FLAME: A Federated Learning Benchmark for Robotic Manipulation

Recent progress in robotic manipulation has been fueled by large-scale datasets collected across diverse environments. Training robotic manipulation policies on these datasets is traditionally performed in a centralized manner, raising concerns regarding scalability, adaptability, and data privacy. While federated learning enables decentralized, privacy-preserving training, its application to robotic manipulation remains largely unexplored. We introduce FLAME (Federated Learning Across Manipulation Environments), the first benchmark designed for federated learning in robotic manipulation. FLAME consists of: (i) a set of large-scale datasets of over 160,000 expert demonstrations of multiple manipulation tasks, collected across a wide range of simulated environments; (ii) a training and evaluation framework for robotic policy learning in a federated setting. We evaluate standard federated learning algorithms in FLAME, showing their potential for distributed policy learning and highlighting key challenges. Our benchmark establishes a foundation for scalable, adaptive, and privacy-aware robotic learning.

federated learning, learning, robotic manipulation, (13 more...)

2503.01729

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > Canada > Alberta (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > China > Shandong Province > Qingdao (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.87)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Ueda, Ryo, Kuribayashi, Tatsuki, Kando, Shunsuke, Inui, Kentaro

Syntactic Learnability of Echo State Neural Language Models at Scale

What is a neural model with minimum architectural complexity that exhibits reasonable language learning capability? To explore such a simple but sufficient neural language model, we revisit a basic reservoir computing (RC) model, Echo State Network (ESN), a restricted class of simple Recurrent Neural Networks. Our experiments showed that ESN with a large hidden state is comparable or superior to Transformer in grammaticality judgment tasks when trained with about 100M words, suggesting that architectures as complex as that of Transformer may not always be necessary for syntactic learning.

esn, nll, validation nll, (14 more...)

2503.01724

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(6 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bitton, Yehonatan, Bitton, Elad, Nisan, Shai

Detecting Stylistic Fingerprints of Large Language Models

Large language models (LLMs) have distinct and consistent stylistic fingerprints, even when prompted to write in different writing styles. Detecting these fingerprints is important for many reasons, among them protecting intellectual property, ensuring transparency regarding AI-generated content, and preventing the misuse of AI technologies. In this paper, we present a novel method to classify texts based on the stylistic fingerprints of the models that generated them. We introduce an LLM-detection ensemble that is composed of three classifiers with varied architectures and training data. This ensemble is trained to classify texts generated by four well-known LLM families: Claude, Gemini, Llama, and OpenAI. As this task is highly cost-sensitive and might have severe implications, we want to minimize false-positives and increase confidence. We consider a prediction as valid when all three classifiers in the ensemble unanimously agree on the output classification. Our ensemble is validated on a test set of texts generated by Claude, Gemini, Llama, and OpenAI models, and achieves extremely high precision (0.9988) and a very low false-positive rate (0.0004). Furthermore, we demonstrate the ensemble's ability to distinguish between texts generated by seen and unseen models. This reveals interesting stylistic relationships between models. This approach to stylistic analysis has implications for verifying the originality of AI-generated texts and tracking the origins of model training techniques.

classifier, detecting stylistic fingerprint, ensemble, (14 more...)

2503.01659

Country:

Europe > Spain (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
Asia > India > Tamil Nadu > Chennai (0.04)

Genre: Research Report (1.00)

Industry: Law > Intellectual Property & Technology Law (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

CoT-VLM4Tar: Chain-of-Thought Guided Vision-Language Models for Traffic Anomaly Resolution

Ren, Tianchi, Hu, Haibo, Zuo, Jiacheng, Chen, Xinhong, Wang, Jianping, Xue, Chun Jason, Wu, Jen-Ming, Guan, Nan

CoT -VLM4T ar: Chain-of-Thought Guided Vision-Language Models for Traffic Anomaly Resolution Tianchi Ren, 1, Haibo Hu, 2, Jiacheng Zuo 3, Xinhong Chen 4, Jianping Wang 5, Chun Jason Xue 6, Jen-Ming Wu 7, Nan Guan, 8 Abstract -- With the acceleration of urbanization, modern urban traffic systems are becoming increasingly complex, leading to frequent traffic anomalies. These anomalies encompass not only common traffic jams but also more challenging issues such as phantom traffic jams, intersection deadlocks, and accident liability analysis, which severely impact traffic flow, vehicular safety, and overall transportation efficiency. Currently, existing solutions primarily rely on manual intervention by traffic police or artificial intelligence-based detection systems. However, these methods often suffer from response delays and inconsistent management due to inadequate resources, while AI detection systems, despite enhancing efficiency to some extent, still struggle to handle complex traffic anomalies in a real-time and precise manner . T o address these issues, we propose CoT -VLM4T ar: (Chain of Thought Visual-Language Model for Traffic Anomaly Resolution), this innovative approach introduces a new chain-of-thought to guide the VLM in analyzing, reasoning, and generating solutions for traffic anomalies with greater reasonable and effective solution, and to evaluate the performance and effectiveness of our method, we developed a closed-loop testing framework based on the CARLA simulator . Furthermore, to ensure seamless integration of the solutions generated by the VLM with the CARLA simulator, we implement an itegration module that converts these solutions into executable commands. Our results demonstrate the effectiveness of VLM in the resolution of real-time traffic anomalies, providing a proof-of-concept for its integration into autonomous traffic management systems.

anomaly, scenario, traffic anomaly, (14 more...)

2503.01632

Country:

Asia > China > Hong Kong (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(2 more...)

Genre: Research Report > New Finding (0.69)

Industry: Transportation > Ground > Road (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Lost in Moderation: How Commercial Content Moderation APIs Over- and Under-Moderate Group-Targeted Hate Speech and Linguistic Variations

Hartmann, David, Oueslati, Amin, Staufer, Dimitri, Pohlmann, Lena, Munzert, Simon, Heuer, Hendrik

Commercial content moderation APIs are marketed as scalable solutions to combat online hate speech. However, the reliance on these APIs risks both silencing legitimate speech, called over-moderation, and failing to protect online platforms from harmful speech, known as under-moderation. To assess such risks, this paper introduces a framework for auditing black-box NLP systems. Using the framework, we systematically evaluate five widely used commercial content moderation APIs. Analyzing five million queries based on four datasets, we find that APIs frequently rely on group identity terms, such as ``black'', to predict hate speech. While OpenAI's and Amazon's services perform slightly better, all providers under-moderate implicit hate speech, which uses codified messages, especially against LGBTQIA+ individuals. Simultaneously, they over-moderate counter-speech, reclaimed slurs and content related to Black, LGBTQIA+, Jewish, and Muslim people. We recommend that API providers offer better guidance on API implementation and threshold setting and more transparency on their APIs' limitations. Warning: This paper contains offensive and hateful terms and concepts. We have chosen to reproduce these terms for reasons of transparency.

computational linguistic, proceedings, speech, (12 more...)

doi: 10.1145/3706598.3713998

2503.01623

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(35 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Information Technology > Services (0.68)
Law > Civil Rights & Constitutional Law (0.67)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)

Tam, Zhi Rui, Wu, Cheng-Kuang, Lin, Chieh-Yen, Chen, Yun-Nung

None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering

Multiple-choice exam questions with "None of the above" (NA) options have been extensively studied in educational testing, in which existing research suggests that they better assess true knowledge. However, their impact on Large Language Models (LLMs) evaluation remains underexplored. Through systematic experiments with 28 LLMs on the MMLU benchmark, we examine how NA options affect model performance and confidence calibration. Our analysis reveals that NA options, when used as the correct answer, lead to a consistent 30-50\% performance drop across models regardless of scale--suggesting that LLMs lack the meta-cognitive ability to systematically evaluate and reject all given options when none are correct. This degradation shows strong domain dependence, with minimal impact on mathematical reasoning (14.6\% drop) but severe effects on tasks requiring uncertainty handling like business ethics (48.1\% drop). Our results highlight important implications for benchmark design and raise questions about LLMs' ability to handle uncertainty in real-world applications.

correct answer, language model, llm, (17 more...)

2503.0155

Country:

Africa (0.04)
South America > Uruguay > Montevideo > Montevideo (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Education (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Baranov, Alexander, Palatkina, Anna, Makovka, Yulia, Braslavski, Pavel

KoWit-24: A Richly Annotated Dataset of Wordplay in News Headlines

We present KoWit-24, a dataset with fine-grained annotation of wordplay in 2,700 Russian news headlines. KoWit-24 annotations include the presence of wordplay, its type, wordplay anchors, and words/phrases the wordplay refers to. Unlike the majority of existing humor collections of canned jokes, KoWit-24 provides wordplay contexts -- each headline is accompanied by the news lead and summary. The most common type of wordplay in the dataset is the transformation of collocations, idioms, and named entities -- the mechanism that has been underrepresented in previous humor datasets. Our experiments with five LLMs show that there is ample room for improvement in wordplay detection and interpretation tasks. The dataset and evaluation scripts are available at https://github.com/Humor-Research/KoWit-24

computational linguistic, dataset, wordplay, (14 more...)

2503.0151

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry: Media > News (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh

Koto, Fajri, Joshi, Rituraj, Mukhituly, Nurdaulet, Wang, Yuxia, Xie, Zhuohan, Pal, Rahul, Orel, Daniil, Mullah, Parvez, Turmakhan, Diana, Goloburda, Maiya, Kamran, Mohammed, Ghosh, Samujjwal, Jia, Bokang, Mansurov, Jonibek, Togmanov, Mukhammed, Banerjee, Debopriyo, Laiyk, Nurkhan, Sakip, Akhmed, Han, Xudong, Kochmar, Ekaterina, Aji, Alham Fikri, Singh, Aaryamonvikram, Jadhav, Alok Anil, Katipomu, Satheesh, Kamboj, Samta, Choudhury, Monojit, Gosal, Gurpreet, Ramakrishnan, Gokul, Mishra, Biswajit, Chandran, Sarath, Sheinin, Avraham, Vassilieva, Natalia, Sengupta, Neha, Murray, Larry, Nakov, Preslav

Llama-3.1-Sherkala-8B-Chat, or Sherkala-Chat (8B) for short, is a state-of-the-art instruction-tuned open generative large language model (LLM) designed for Kazakh. Sherkala-Chat (8B) aims to enhance the inclusivity of LLM advancements for Kazakh speakers. Adapted from the LLaMA-3.1-8B model, Sherkala-Chat (8B) is trained on 45.3B tokens across Kazakh, English, Russian, and Turkish. With 8 billion parameters, it demonstrates strong knowledge and reasoning abilities in Kazakh, significantly outperforming existing open Kazakh and multilingual models of similar scale while achieving competitive performance in English. We release Sherkala-Chat (8B) as an open-weight instruction-tuned model and provide a detailed overview of its training, fine-tuning, safety alignment, and evaluation, aiming to advance research and support diverse real-world applications.

computational linguistic, dataset, language model, (13 more...)

2503.01493

Country:

Asia > Kazakhstan (0.15)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Russia (0.05)
(20 more...)

Genre: Research Report (0.81)

Industry: Education > Curriculum > Subject-Specific Education (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization

Niu, Yifan, Gao, Ziqi, Xu, Tingyang, Liu, Yang, Bian, Yatao, Rong, Yu, Huang, Junzhou, Li, Jia

Exploring chemical space to find novel molecules that simultaneously satisfy multiple properties is crucial in drug discovery. However, existing methods often struggle with trading off multiple properties due to the conflicting or correlated nature of chemical properties. To tackle this issue, we introduce InversionGNN framework, an effective yet sample-efficient dual-path graph neural network (GNN) for multi-objective drug discovery. In the direct prediction path of InversionGNN, we train the model for multi-property prediction to acquire knowledge of the optimal combination of functional groups. Then the learned chemical knowledge helps the inversion generation path to generate molecules with required properties. In order to decode the complex knowledge of multiple properties in the inversion path, we propose a gradient-based Pareto search method to balance conflicting properties and generate Pareto optimal molecules. Additionally, InversionGNN is able to search the full Pareto front approximately in discrete chemical space. Comprehensive experimental evaluations show that InversionGNN is both effective and sample-efficient in various discrete multi-objective settings including drug discovery.

inversiongnn, molecule, optimization, (12 more...)

2503.01488

Country:

Asia > China > Hong Kong (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas > Tarrant County > Arlington (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)