AITopics

Country:

North America > United States (0.67)
Asia > China (0.46)
Europe > Austria > Vienna (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Neural Information Processing Systems, Feb-18-2026, 15:55:02 GMT

f02a816cd50c0b1441601dbde012fa24-Paper-Conference.pdf

artificial intelligence, camouflage, machine learning, (19 more...)

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
(2 more...)

Neural Information Processing Systems, Feb-18-2026, 01:42:21 GMT

c6483c8a68083af3383f91ee0dc6db95-Paper-Conference.pdf

large language model, machine learning, natural language, (20 more...)

Country: Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing Systems, Feb-8-2026, 20:13:05 GMT

Natural Counterfactuals With Necessary Backtracking

Our methodology incorporates a certain amount of backtracking when needed, allowing changes in causally preceding variables to minimize deviations from realistic scenarios. Specifically, we introduce a novel optimization framework that permits but also controls the extent of backtracking with a "naturalness" criterion. Empirical experiments demonstrate the effectiveness of our method.

artificial intelligence, counterfactual, machine learning, (19 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Slovakia > Bratislava > Bratislava (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Neural Information Processing Systems, Dec-24-2025, 13:56:06 GMT

Isometric 3D Adversarial Examples in the Physical World

Recently, several attempts have demonstrated that 3D deep learning models are as vulnerable to adversarial example attacks as 2D models. However, these methods are still far from stealthy and suffer from severe performance degradation in the physical world. Although 3D data is highly structured, it is difficult to bound the perturbations with simple metrics in the Euclidean space. In this paper, we propose a novel $\epsilon$-isometric ($\epsilon$-ISO) attack method to generate natural and robust 3D adversarial examples in the physical world by considering the geometric properties of 3D objects and the invariance to physical transformations. For naturalness, we constrain the adversarial example and the original one to be $\epsilon$-isometric by adopting the Gaussian curvature as the surrogate metric under a theoretical analysis. For robustness under physical transformations, we propose a maxima over transformation (MaxOT) method to actively search for the most difficult transformations rather than random ones to make the generated adversarial example more robust in the physical world. Extensive experiments on typical point cloud recognition models validate that our approach improves the attack success rate and naturalness of the generated 3D adversarial examples over state-of-the-art attack methods.

adversarial example, name change, physical world, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

arXiv.org Artificial Intelligence, Dec-2-2025

SpeechJudge: Towards Human-Level Judgment for Speech Naturalness

Zhang, Xueyao, Wang, Chaoren, Liao, Huan, Li, Ziniu, Wang, Yuancheng, Wang, Li, Jia, Dongya, Chen, Yuanzhe, Li, Xiulin, Chen, Zhuo, Wu, Zhizheng

Aligning large generative models with human feedback is a critical challenge. In speech synthesis, this is particularly pronounced due to the lack of a large-scale human preference dataset, which hinders the development of models that truly align with human perception. To address this, we introduce SpeechJudge, a comprehensive suite comprising a dataset, a benchmark, and a reward model centered on naturalness--one of the most fundamental subjective metrics for speech synthesis. First, we present SpeechJudge-Data, a large-scale human feedback corpus of 99K speech pairs. The dataset is constructed using a diverse set of advanced zero-shot text-to-speech (TTS) models across diverse speech styles and multiple languages, with human annotations for both intelligibility and naturalness preference. From this, we establish SpeechJudge-Eval, a challenging benchmark for speech naturalness judgment. Our evaluation reveals that existing metrics and AudioLLMs struggle with this task; the leading model, Gemini-2.5-Flash, achieves less than 70% agreement with human judgment, highlighting significant room for improvement. To bridge this gap, we develop SpeechJudge-GRM, a generative reward model (GRM) based on Qwen2.5-Omni-7B. It is trained on SpeechJudge-Data via a two-stage post-training process: Supervised Fine-Tuning (SFT) with Chain-of-Thought rationales followed by Reinforcement Learning (RL) with GRPO on challenging cases. On the SpeechJudge-Eval benchmark, the proposed SpeechJudge-GRM demonstrates superior performance, achieving 77.2% accuracy (and 79.4% after inference-time scaling @10) compared to a classic Bradley-Terry reward model (72.7%). Furthermore, SpeechJudge-GRM can also be employed as a reward function during the post-training of speech generation models to facilitate their alignment with human preferences.
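The Bradley-Terry baseline and the agreement metric mentioned in the abstract above can be illustrated with a minimal sketch. This is not the SpeechJudge implementation: the function names, the scalar scores, and the toy data are all assumptions made for illustration. It shows only the standard Bradley-Terry form, in which the probability that sample A is preferred over sample B is the logistic function of the score difference.

```python
import math


def bradley_terry_prob(score_a: float, score_b: float) -> float:
    """Probability that A is preferred over B under the Bradley-Terry model."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def pairwise_loss(score_chosen: float, score_rejected: float) -> float:
    """Negative log-likelihood of the human-preferred sample winning."""
    return -math.log(bradley_terry_prob(score_chosen, score_rejected))


def agreement(pairs) -> float:
    """Fraction of pairs where the higher score matches the human preference.

    Each pair is (score_a, score_b, human_prefers_a); hypothetical format.
    """
    correct = sum((sa > sb) == prefers_a for sa, sb, prefers_a in pairs)
    return correct / len(pairs)


# Toy scores, invented for illustration only.
toy_pairs = [(1.0, 0.0, True), (0.0, 1.0, True), (2.0, 1.0, True), (0.0, 3.0, False)]
print(agreement(toy_pairs))
```

A reward model trained this way emits one scalar per sample, so accuracy on a preference benchmark reduces to checking which of the two scores is larger.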

large language model, machine learning, natural language, (22 more...)

2511.07931

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.89)

arXiv.org Artificial Intelligence, Dec-2-2025

Comparative Evaluation of Expressive Japanese Character Text-to-Speech with VITS and Style-BERT-VITS2

Rackauckas, Zackary, Hirschberg, Julia

Synthesizing expressive Japanese character speech poses unique challenges due to pitch-accent sensitivity and stylistic variability. This paper empirically evaluates two open-source text-to-speech models--VITS and Style-BERT-VITS2 JP Extra (SBV2JE)--on in-domain, character-driven Japanese speech. Using three character-specific datasets, we evaluate models across naturalness (mean opinion score, MOS, and comparative mean opinion score, CMOS), intelligibility (word error rate, WER), and speaker consistency. SBV2JE matches human ground truth in naturalness (MOS 4.37 vs. 4.38), achieves lower WER, and is slightly preferred in CMOS. Enhanced by pitch-accent controls and a WavLM-based discriminator, SBV2JE proves effective for applications like language learning and character dialogue generation, despite higher computational demands.
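Word error rate, the intelligibility metric named in the abstract above, is conventionally the word-level edit distance between a reference transcript and a recognized transcript, divided by the reference length. The sketch below is a generic illustration, not the paper's evaluation code. It assumes both transcripts are already tokenized into words; Japanese text has no whitespace word boundaries, so a tokenizer would be needed upstream, and the abstract does not say which one the authors used.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Levenshtein distance over tokens, normalized by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between reference[:i-1] and hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]  # deleting all i reference tokens
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


# English toy example, chosen only because whitespace splitting is unambiguous.
ref = "the cat sat on the mat".split()
hyp = "the cat sit on mat".split()
print(word_error_rate(ref, hyp))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.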

artificial intelligence, machine learning, natural language, (18 more...)

2505.1732

Country: Asia > Japan > Honshū (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.75)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.63)

Neural Information Processing Systems, Nov-15-2025, 05:50:45 GMT

Isometric 3D Adversarial Examples in the Physical World

Miao, Yibo

Based on Definitions 1 and 2, we have the following theorem.

adversarial example, proceedings, transformation, (11 more...)

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.93)

Industry: Information Technology (0.95)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence, Nov-11-2025

Decomate: Leveraging Generative Models for Co-Creative SVG Animation

Park, Jihyeon, Myung, Jiyoon, Shin, Seone, Son, Jungki, Han, Joohyung

Designers often encounter friction when animating static SVG graphics, especially when the visual structure does not match the desired level of motion detail. Existing tools typically depend on predefined groupings or require technical expertise, which limits designers' ability to experiment and iterate independently. We present Decomate, a system that enables intuitive SVG animation through natural language. Decomate leverages a multimodal large language model to restructure raw SVGs into semantically meaningful, animation-ready components. Designers can then specify motions for each component via text prompts, after which the system generates corresponding HTML/CSS/JS animations. By supporting iterative refinement through natural language interaction, Decomate integrates generative AI into creative workflows, allowing animation outcomes to be directly shaped by user intent.

artificial intelligence, large language model, natural language, (16 more...)

2511.06297

Genre:

Research Report (0.64)
Questionnaire & Opinion Survey (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.85)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)

arXiv.org Artificial Intelligence, Nov-5-2025

Hey, wait a minute: on at-issue sensitivity in Language Models

Kim, Sanghee J., Misra, Kanishka

Evaluating the naturalness of dialogue in language models (LMs) is not trivial: notions of 'naturalness' vary, and scalable quantitative metrics remain limited. This study leverages the linguistic notion of 'at-issueness' to assess dialogue naturalness and introduces a new method: Divide, Generate, Recombine, and Compare (DGRC). DGRC (i) divides a dialogue as a prompt, (ii) generates continuations for subparts using LMs, (iii) recombines the dialogue and continuations, and (iv) compares the likelihoods of the recombined sequences. This approach mitigates bias in linguistic analyses of LMs and enables systematic testing of discourse-sensitive behavior. Applying DGRC, we find that LMs prefer to continue dialogue on at-issue content, with this effect enhanced in instruct-tuned models. They also reduce their at-issue preference when relevant cues (e.g., "Hey, wait a minute") are present. Although instruct-tuning does not further amplify this modulation, the pattern reflects a hallmark of successful dialogue dynamics.
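The four DGRC steps listed in the abstract above can be read as a small pipeline. The sketch below is one possible reading based on the abstract alone, not the authors' code: the function name `dgrc`, the pre-divided `parts` input, and the `generate` and `log_likelihood` callables are all hypothetical, and the toy stand-ins at the end only make the example runnable. A real experiment would back both callables with a language model.

```python
from typing import Callable, Dict, List


def dgrc(parts: List[str],
         generate: Callable[[str], str],
         log_likelihood: Callable[[str], float]) -> Dict[str, float]:
    """Score each subpart by the likelihood of the dialogue plus its continuation.

    parts: the dialogue already divided into subparts (step i, done by the caller).
    """
    dialogue = " ".join(parts)
    scores = {}
    for part in parts:
        continuation = generate(part)               # step ii: generate per subpart
        recombined = f"{dialogue} {continuation}"   # step iii: recombine
        scores[part] = log_likelihood(recombined)   # step iv: score for comparison
    return scores


# Toy stand-ins (assumptions, not real models): echo the part in upper case
# and penalize length, so the output is deterministic.
toy_generate = lambda part: part.upper()
toy_log_likelihood = lambda text: -float(len(text))

result = dgrc(["main point", "side remark here"], toy_generate, toy_log_likelihood)
preferred = max(result, key=result.get)
print(preferred)
```

Keeping generation and scoring as injected callables means the same comparison loop can be reused across models, which is the kind of systematic testing the abstract describes.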

large language model, machine learning, natural language, (18 more...)