AITopics | Natural Language

Collaborating Authors

Natural Language

One goal of AI work in natural language is to enable communication between people and computers without resorting to memorization of complex commands and procedures. Automatic translation – enabling scientists, business people and just plain folks to interact easily with people around the world – is another goal. Both are just part of the broad field of AI and natural language, along with the cognitive science aspect of using computers to study how humans understand language.

News Overviews Instructional Materials AI-Alerts Classics

4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models

Neural Information Processing SystemsMay-24-2025, 08:13:42 GMT

Existing dynamic scene generation methods mostly rely on distilling knowledge from pre-trained 3D generative models, which are typically fine-tuned on synthetic object datasets. As a result, the generated scenes are often object-centric and lack photorealism. To address these limitations, we introduce a novel pipeline designed for photorealistic text-to-4D scene generation, discarding the dependency on multi-view generative models and instead fully utilizing video generative models trained on diverse real-world datasets. Our method begins by generating a reference video using the video generation model. We then learn the canonical 3D representation of the video using a freeze-time video, delicately generated from the reference video. To handle inconsistencies in the freeze-time video, we jointly learn a per-frame deformation to model these imperfections. We then learn the temporal deformation based on the canonical representation to capture dynamic interactions in the reference video.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Media > Film (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Large Scale Transfer Learning for Tabular Data via Language Modeling Josh Gardner, Juan C. Perdomo # Ludwig Schmidt

Neural Information Processing SystemsMay-24-2025, 08:03:59 GMT

Tabular data - structured, heterogeneous, spreadsheet-style data with rows and columns - is widely used in practice across many domains. However, while recent foundation models have reduced the need for developing task-specific datasets and predictors in domains such as language modeling and computer vision, this transfer learning paradigm has not had similar impact in the tabular domain.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia (0.45)
North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment (1.00)
Law (1.00)
Government (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Differentiable Convex Optimization Layers

Akshay Agrawal, Brandon Amos, Shane Barratt, Stephen Boyd, Steven Diamond, J. Zico Kolter

Neural Information Processing SystemsMay-24-2025, 06:12:26 GMT

Neural Information Processing Systems http://nips.cc/

convex program, optimization, optimization problem, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Florida (0.14)

Industry: Energy > Oil & Gas > Upstream (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization

Neural Information Processing SystemsMay-24-2025, 06:09:13 GMT

Generating ligand molecules for specific protein targets, known as structure-based drug design, is a fundamental problem in therapeutics development and biological discovery. Recently, target-aware generative models, especially diffusion models, have shown great promise in modeling protein-ligand interactions and generating candidate drugs. However, existing models primarily focus on learning the chemical distribution of all drug candidates, which lacks effective steerability on the chemical quality of model generations.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback

Hypotheses Paradise: An Open and Strong Baseline for Speech Recognition with Large Language Models Chen Chen 1, Chao-Han Huck Yang 2

Neural Information Processing SystemsMay-24-2025, 04:59:12 GMT

Advancements in deep neural networks have allowed automatic speech recognition (ASR) systems to attain human parity on several publicly available clean speech datasets. However, even state-of-the-art ASR systems experience performance degradation when confronted with adverse conditions, as a well-trained acoustic model is sensitive to variations in the speech domain, e.g., background noise. Intuitively, humans address this issue by relying on their linguistic knowledge: the meaning of ambiguous spoken terms is usually inferred from contextual cues thereby reducing the dependency on the auditory system. Inspired by this observation, we introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction, where N-best decoding hypotheses provide informative elements for true transcription prediction. This approach is a paradigm shift from the traditional language model rescoring strategy that can only select one candidate hypothesis as the output transcription.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States (0.67)
Asia > China (0.47)

Genre: Research Report > New Finding (0.46)

Industry:

Education (1.00)
Energy > Oil & Gas > Midstream (0.92)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models Chen Chen 1, Chao-Han Huck Yang 2,3

Neural Information Processing SystemsMay-24-2025, 04:59:09 GMT

arxiv preprint arxiv, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.92)
Asia > China (0.47)
North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Education (1.00)
Energy > Oil & Gas > Midstream (0.93)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Temporal Sentence Grounding with Relevance Feedback in Videos Jianfeng Dong 1 2

Neural Information Processing SystemsMay-24-2025, 04:16:34 GMT

As a widely explored multi-modal task, Temporal Sentence Grounding in videos (TSG) endeavors to retrieve a specific video segment matched with a given query text from a video. The traditional paradigm for TSG generally assumes that relevant segments always exist within a given video. However, this assumption is restrictive and unrealistic in real-world applications where the existence of a query-related segment is uncertain, easily resulting in erroneous grounding. Motivated by the research gap and practical application, this paper introduces a new task, named Temporal Sentence Grounding with Relevance Feedback (TSG-RF) in videos, which accommodates the possibility that a video may or may not include a segment related to the query. This task entails localizing precise video segments that semantically align with the query text when such content is present, while delivering definitive feedback on the non-existence of related segments when absent. Moreover, we propose a novel Relation-aware Temporal Sentence Grounding (RaTSG) network for addressing this challenging task.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Exploiting Local and Global Structure for Point Cloud Semantic Segmentation with Contextual Point Representations

Xu Wang, Jingming He, Lin Ma

Neural Information Processing SystemsMay-24-2025, 03:52:48 GMT

In this paper, we propose one novel model for point cloud semantic segmentation, which exploits both the local and global structures within the point cloud based on the contextual point representations. Specifically, we enrich each point representation by performing one novel gated fusion on the point itself and its contextual points. Afterwards, based on the enriched representation, we propose one novel graph pointnet module, relying on the graph attention block to dynamically compose and update each point representation within the local point cloud structure. Finally, we resort to the spatial-wise and channel-wise attention strategies to exploit the point cloud global structure and thereby yield the resulting semantic label for each point. Extensive results on the public point cloud databases, namely the S3DIS and ScanNet datasets, demonstrate the effectiveness of our proposed model, outperforming the state-of-the-art approaches. Our code for this paper is available at https://github.com/fly519/ELGS.

data mining, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Asia > China (0.15)
North America > Canada (0.14)
Europe > Italy (0.14)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Add feedback

VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections Roy Miles

Neural Information Processing SystemsMay-24-2025, 02:41:53 GMT

Large language models (LLMs) have recently emerged as powerful tools for tackling many language-processing tasks. Despite their success, training and finetuning these models is still far too computationally and memory intensive. In this paper, we identify and characterise the important components needed for effective model convergence using gradient descent. In doing so we find that the intermediate activations used to implement backpropagation can be excessively compressed without incurring any degradation in performance. This result leads us to a cheap and memory-efficient algorithm for both fine-tuning and pre-training LLMs. The proposed algorithm simply divides the tokens up into smaller sub-tokens before projecting them onto a fixed 1-dimensional subspace during the forward pass. These features are then coarsely reconstructed during the backward pass to implement the update rules. We confirm the effectiveness of our algorithm as being complimentary to many state-of-the-art PEFT methods on the VTAB-1k fine-tuning benchmark. Furthermore, we outperform QLoRA for fine-tuning LLaMA and show competitive performance against other memory-efficient pre-training methods on the large-scale C4 dataset.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (0.68)
Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Appendices A Additional Information on Prompts 16 A.1 Words and Word Frequencies 16 A.2 The Distribution of Prompt Types in the Benchmark 17 A.3 The Encoding Scheme for Task 2 Answers

Neural Information Processing SystemsMay-24-2025, 02:27:00 GMT

To answer the first question, we split the data into the two groups: the first group contains the subset of data for numeric-simple prompts, and the second group the subset of data for attribute-color prompts. We only consider prompts in both groups that contain the same numbers (1-4) and the same words ("cat", "apple", "koala", "bottle", "mushroom"), to isolate the effect of adding the color term as opposed to potential confounding factors. For example a confounding factor might be the word identity, as a model might be more accurate in generating correct images when the prompt contains the word "dog", and if this word exists only in the first prompt type and not in the second then responses in the first prompt type will on average have higher accuracy that may or may not

annotator, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.64)

Add feedback