AITopics | Large Language Model

AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation

Neural Information Processing SystemsApr-25-2026, 19:26:33 GMT

Open-vocabulary semantic segmentation is a challenging task that requires segmenting novel object categories at inference time. Recent works explore vision-language pre-training to handle this task, but suffer from unrealistic assumptions in practical scenarios, i.e., low-quality textual category names. For example, this paradigm assumes that new textual categories will be accurately and completely provided, and exist in lexicons during pre-training. However, exceptions often happen when meet with ambiguity for brief or incomplete names, new words that are not present in the pre-trained lexicons, and difficult-to-describe categories for users. To address these issues, this work proposes a novel attribute decomposition-aggregation framework, AttrSeg, inspired by human cognition in understanding new concepts.

large language model, machine learning, segmentation, (20 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry:

Media > Film (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.48)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.46)

Add feedback

1feb87871436031bdc0f2beaa62a049b-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 18:36:26 GMT

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.93)

Industry:

Leisure & Entertainment (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Communications > Social Media (0.93)

Add feedback

DeWave: Discrete EEGWaves Encoding for Brain Dynamics to Text Translation

Neural Information Processing SystemsApr-25-2026, 17:45:47 GMT

The translation of brain dynamics into natural language is pivotal for braincomputer interfaces (BCIs). With the swift advancement of large language models, such as ChatGPT, the need to bridge the gap between the brain and languages becomes increasingly pressing. Current methods, however, require eye-tracking fixations or event markers to segment brain dynamics into word-level features, which can restrict the practical application of these systems. To tackle these issues, we introduce a novel framework, DeWave, that integrates discrete encoding sequences into open-vocabulary EEG-to-text translation tasks. DeWave uses a quantized variational encoder to derive discrete codex encoding and align it with pre-trained language models. This discrete codex representation brings forth two advantages: 1) it realizes translation on raw waves without marker by introducing text-EEG contrastive alignment training, and 2) it alleviates the interference caused by individual differences in EEG waves through an invariant discrete codex with or without markers.

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America > United States (0.28)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

1eeacdf8770e6dd5164cdeec8bcfa8cc-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 17:31:44 GMT

large language model, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > Maryland (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(4 more...)

Add feedback

TextDiffuser: Diffusion Models as Text Painters

Neural Information Processing SystemsApr-25-2026, 15:46:49 GMT

Diffusion models have gained increasing attention for their impressive generation abilities but currently struggle with rendering accurate and coherent text. To address this issue, we introduce TextDiffuser, focusing on generating images with visually appealing text that is coherent with backgrounds. TextDiffuser consists of two stages: first, a Transformer model generates the layout of keywords extracted from text prompts, and then diffusion models generate images conditioned on the text prompt and the generated layout. Additionally, we contribute the first large-scale text images dataset with OCR annotations, MARIO-10M, containing 10 million image-text pairs with text recognition, detection, and character-level segmentation annotations. We further collect the MARIO-Eval benchmark to serve as a comprehensive tool for evaluating text rendering quality.

large language model, machine learning, textdiffuser, (20 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report (0.46)

Industry:

Leisure & Entertainment (0.93)
Media > Film (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

Natural Language Instruction following with Task related Language Development and Translation

Neural Information Processing SystemsApr-25-2026, 15:33:09 GMT

Natural language-conditioned reinforcement learning (RL) enables agents to follow human instructions. Previous approaches generally implemented languageconditioned RL by providing the policy with human instructions in natural language (NL) and training the policy to follow instructions. In this is outside-in approach, the policy must comprehend the NL and manage the task simultaneously. However, the unbounded NL examples often bring much extra complexity for solving concrete RL tasks, which can distract policy learning from completing the task. To ease the learning burden of the policy, we investigate an inside-out scheme for natural language-conditioned RL by developing a task language (TL) that is task-related and easily understood by the policy, thus reducing the policy learning burden. Besides, we employ a translator to translate natural language into the TL, which is used in RL to achieve efficient policy training. We implement this scheme as TALAR (TAsk Language with predicAte Representation) that learns multiple predicates to model object relationships as the TL. Experiments indicate that TALAR not only better comprehends NL instructions but also leads to a better instruction-following policy that significantly improves the success rate over baselines and adapts to unseen expressions of NL instruction. Besides, the TL is also an effective sub-task abstraction compatible with hierarchical RL.

large language model, machine learning, reinforcement learning, (21 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Learning Universal Policies via Text-Guided Video Generation

Neural Information Processing SystemsApr-25-2026, 15:13:39 GMT

A goal of artificial intelligence is to construct an agent that can solve a wide variety of tasks. Recent progress in text-guided image synthesis has yielded models with an impressive ability to generate complex novel images, exhibiting combinatorial generalization across domains. Motivated by this success, we investigate whether such tools can be used to construct more general-purpose agents. Specifically, we cast the sequential decision making problem as a text-conditioned video generation problem, where, given a text-encoded specification of a desired goal, a planner synthesizes a set of future frames depicting its planned actions in the future, after which control actions are extracted from the generated video. By leveraging text as the underlying goal specification, we are able to naturally and combinatorially generalize to novel goals. The proposed policy-as-video formulation can further represent environments with different state and action spaces in a unified space of images, which, for example, enables learning and generalization across a variety of robot manipulation tasks. Finally, by leveraging pretrained language embeddings and widely available videos from the internet, the approach enables knowledge transfer through predicting highly realistic video plans for real robots2.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

1bfd87d2d92f0556819467dc08034f76-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 13:34:44 GMT

arxiv preprint arxiv, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Add feedback

1b44b878bb782e6954cd888628510e90-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 12:43:43 GMT

large language model, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe (0.28)

Genre:

Personal (0.48)
Research Report (0.46)

Industry:

Media (0.68)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

Add feedback

Example Pair: Depth-> Image Output Example Pair: Hed-> Image Output In-Context Learning Unlocked for Diffusion Models

Neural Information Processing SystemsApr-25-2026, 12:25:34 GMT

Given a pair of task-specific example images, such as depth from/to image and scribble from/to image, and a text guidance, our model automatically understands the underlying task and performs the same task on a new query image following the text guidance.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Media (0.68)

Technology: