AITopics | Natural Language

Collaborating Authors

Natural Language

One goal of AI work in natural language is to enable communication between people and computers without resorting to memorization of complex commands and procedures. Automatic translation – enabling scientists, business people and just plain folks to interact easily with people around the world – is another goal. Both are just part of the broad field of AI and natural language, along with the cognitive science aspect of using computers to study how humans understand language.

News Overviews Instructional Materials AI-Alerts Classics

SpatialRGPT: Grounded Spatial Reasoning in Vision-Language Models

Neural Information Processing SystemsJun-2-2025, 07:43:12 GMT

Vision Language Models (VLMs) have demonstrated remarkable performance in 2D vision and language tasks. However, their ability to reason about spatial arrangements remains limited. In this work, we introduce Spatial Region GPT (SpatialRGPT) to enhance VLMs' spatial perception and reasoning capabilities. SpatialRGPT advances VLMs' spatial understanding through two key innovations: (i) a data curation pipeline that enables effective learning of regional representation from 3D scene graphs, and (ii) a flexible "plugin" module for integrating depth information into the visual encoder of existing VLMs. During inference, when provided with user-specified region proposals, SpatialRGPT can accurately perceive their relative directions and distances. Additionally, we propose SpatialRGBT-Bench, a benchmark with ground-truth 3D annotations encompassing indoor, outdoor, and simulated environments, for evaluating 3D spatial cognition in VLMs. Our results demonstrate that SpatialRGPT significantly enhances performance in spatial reasoning tasks, both with and without local region prompts. The model also exhibits strong generalization capabilities, effectively reasoning about complex spatial relations and functioning as a region-aware dense reward annotator for robotic tasks. Code, dataset, and benchmark are released at https://www.anjiecheng.me/SpatialRGPT.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Transportation (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Metric Flow Matching for Smooth Interpolations on the Data Manifold

Neural Information Processing SystemsJun-2-2025, 07:42:29 GMT

Matching objectives underpin the success of modern generative models and rely on constructing conditional paths that transform a source distribution into a target distribution. Despite being a fundamental building block, conditional paths have been designed principally under the assumption of Euclidean geometry, resulting in straight interpolations. However, this can be particularly restrictive for tasks such as trajectory inference, where straight paths might lie outside the data manifold, thus failing to capture the underlying dynamics giving rise to the observed marginals.

machine learning, manifold, natural language, (16 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

Add feedback

Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration

Neural Information Processing SystemsJun-2-2025, 07:42:10 GMT

Membership Inference Attacks (MIA) aim to infer whether a target data record has been utilized for model training or not. Existing MIAs designed for large language models (LLMs) can be bifurcated into two types: reference-free and reference-based attacks. Although reference-based attacks appear promising performance by calibrating the probability measured on the target model with reference models, this illusion of privacy risk heavily depends on a reference dataset that closely resembles the training set. Both two types of attacks are predicated on the hypothesis that training records consistently maintain a higher probability of being sampled. However, this hypothesis heavily relies on the overfitting of target models, which will be mitigated by multiple regularization methods and the generalization of LLMs.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec (0.14)
Asia > Middle East > UAE (0.14)
North America > United States > Pennsylvania (0.14)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SeeClear: Semantic Distillation Enhances Pixel Condensation for Video Super-Resolution Qi Tang 1,2

Neural Information Processing SystemsJun-2-2025, 07:39:11 GMT

Diffusion-based Video Super-Resolution (VSR) is renowned for generating perceptually realistic videos, yet it grapples with maintaining detail consistency across frames due to stochastic fluctuations. The traditional approach of pixel-level alignment is ineffective for diffusion-processed frames because of iterative disruptions. To overcome this, we introduce SeeClear-a novel VSR framework leveraging conditional video generation, orchestrated by instance-centric and channel-wise semantic controls. This framework integrates a Semantic Distiller and a Pixel Condenser, which synergize to extract and upscale semantic details from low-resolution frames. The Instance-Centric Alignment Module (InCAM) utilizes video-clip-wise tokens to dynamically relate pixels within and across frames, enhancing coherency.

information, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Add feedback

f2b9e8e7a36d43ddfd3d55113d56b1e0-Supplemental-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsJun-2-2025, 07:36:50 GMT

Our final dataset comprised 200 molecules.

artificial intelligence, large language model, natural language, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Can LLMs Solve Molecule Puzzles A Benchmark for Molecular Structure Elucidation

Neural Information Processing SystemsJun-2-2025, 07:36:46 GMT

LLMs with improved chemical reasoning capabilities in the other two sub-tasks.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education (0.67)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Semantic Density: Uncertainty Quantification for Large Language Models through Confidence Measurement in Semantic Space

Neural Information Processing SystemsJun-2-2025, 07:31:35 GMT

With the widespread application of Large Language Models (LLMs) to various domains, concerns regarding the trustworthiness of LLMs in safety-critical scenarios have been raised, due to their unpredictable tendency to hallucinate and generate misinformation. Existing LLMs do not have an inherent functionality to provide the users with an uncertainty/confidence metric for each response it generates, making it difficult to evaluate trustworthiness. Although several studies aim to develop uncertainty quantification methods for LLMs, they have fundamental limitations, such as being restricted to classification tasks, requiring additional training and data, considering only lexical instead of semantic information, and being prompt-wise but not response-wise. A new framework is proposed in this paper to address these issues.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.67)
North America > Mexico > Mexico City (0.14)
North America > United States > Texas (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education (0.67)
Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

f26b29298ae8acd94bd7e839688e329b-Supplemental-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsJun-2-2025, 07:31:08 GMT

For what purpose was the dataset created? Was there a specific task in mind? This work is partially funded by Cisco and Amazon. Are there multiple types of instances (e.g. Each scenario essentially contains a user's LLM's generated program conforms to the user's intent.

artificial intelligence, dataset, natural language, (18 more...)

Neural Information Processing Systems

Industry:

Information Technology (0.69)
Law (0.69)
Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

IaC-Eval: A Code Generation Benchmark for Cloud Infrastructure-as-Code Programs, Xinyu Wang University of Michigan

Neural Information Processing SystemsJun-2-2025, 07:31:05 GMT

Infrastructure-as-Code (IaC), an important component of cloud computing, allows the definition of cloud infrastructure in high-level programs. However, developing IaC programs is challenging, complicated by factors that include the burgeoning complexity of the cloud ecosystem (e.g., diversity of cloud services and workloads), and the relative scarcity of IaC-specific code examples and public repositories. While large language models (LLMs) have shown promise in general code generation and could potentially aid in IaC development, no benchmarks currently exist for evaluating their ability to generate IaC code.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Michigan (0.40)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

Learning Where to Edit Vision Transformers Yunqiao Yang 1 Long-Kai Huang 2 Shengzhuang Chen 1 Kede Ma

Neural Information Processing SystemsJun-2-2025, 07:29:05 GMT

Model editing aims to data-efficiently correct predictive errors of large pre-trained models while ensuring generalization to neighboring failures and locality to minimize unintended effects on unrelated examples. While significant progress has been made in editing Transformer-based large language models, effective strategies for editing vision Transformers (ViTs) in computer vision remain largely untapped. In this paper, we take initial steps towards correcting predictive errors of ViTs, particularly those arising from subpopulation shifts. Taking a locate-then-edit approach, we first address the "where-to-edit" challenge by meta-learning a hypernetwork on CutMix-augmented data generated for editing reliability. This trained hypernetwork produces generalizable binary masks that identify a sparse subset of structured model parameters, responsive to real-world failure samples. Afterward, we solve the "how-to-edit" problem by simply fine-tuning the identified parameters using a variant of gradient descent to achieve successful edits. To validate our method, we construct an editing benchmark that introduces subpopulation shifts towards natural underrepresented images and AI-generated images, thereby revealing the limitations of pre-trained ViTs for object recognition. Our approach not only achieves superior performance on the proposed benchmark but also allows for adjustable trade-offs between generalization and locality. Our code is available at https://github.com/hustyyq/Where-to-Edit.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: