AITopics | Large Language Model

01b7575c38dac42f3cfb7d500438b875-Paper.pdf

Neural Information Processing Systems | Apr-24-2026, 10:10:46 GMT

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


5 Reasons to Think Twice Before Using ChatGPT--or Any Chatbot--for Financial Advice

WIRED | Apr-24-2026, 09:00:00 GMT

As people increasingly rely on AI chatbots for guidance, even on financial matters, a healthy dose of skepticism is critical. I've used ChatGPT to help me build a budget before, and it was genuinely helpful. After I input my monthly salary as well as my standard utilities and recurring expenses, the chatbot drafted a few solid options, and I tweaked them into penny-pinching perfection. "Millions of people turn to ChatGPT with money-related questions, from understanding debt to building budgets and learning financial concepts," says Niko Felix, an OpenAI spokesperson, when reached for comment. "ChatGPT can be a helpful tool for exploring options, preparing questions, and making financial topics easier to understand, but it is not a substitute for licensed financial professionals." OpenAI's Terms of Use state that the AI tool is not meant to replace professional financial advice.

large language model, machine learning, natural language, (21 more...)

WIRED

Country:

North America > United States > California (0.15)
Europe > Slovakia (0.05)
Europe > Czechia (0.05)
Asia > Middle East > Iran (0.05)

Genre: Research Report (0.97)

Industry:

Government (0.98)
Banking & Finance > Financial Services (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)


Birth of a Transformer: A Memory Viewpoint

Neural Information Processing Systems | Apr-24-2026, 07:53:30 GMT

Large language models based on transformers have achieved great empirical successes. However, as they are deployed more widely, there is a growing need to better understand their internal mechanisms in order to make them more reliable. These models appear to store vast amounts of knowledge from their training data, and to adapt quickly to new information provided in their context or prompt. We study how transformers balance these two types of knowledge by considering a synthetic setup where tokens are generated from either global or context-specific bigram distributions. By a careful empirical analysis of the training process on a simplified two-layer transformer, we illustrate the fast learning of global bigrams and the slower development of an "induction head" mechanism for the in-context bigrams. We highlight the role of weight matrices as associative memories, provide theoretical insights on how gradients enable their learning during training, and study the role of data-distributional properties.
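
The synthetic setup described in this abstract can be illustrated with a small data generator. This is a minimal sketch under stated assumptions, not the authors' code: the abstract does not say how the two kinds of bigrams are mixed, so the "trigger token" mechanism and the names `make_global_bigram`, `sample_sequence`, and `triggers` are hypothetical choices made here for illustration. Most transitions follow one bigram table shared by all sequences, while transitions out of a trigger token follow a pairing that is re-drawn for each sequence and so can only be predicted from context.

```python
import random


def make_global_bigram(vocab_size, rng):
    """Build a random table of next-token weights shared by every sequence."""
    return [[rng.random() for _ in range(vocab_size)] for _ in range(vocab_size)]


def sample_sequence(global_bigram, triggers, length, rng):
    """Sample one sequence mixing global and context-specific bigrams.

    Assumption for illustration: each trigger token is paired with one output
    token that is drawn fresh for this sequence. Every other transition is
    sampled from the shared global bigram table.
    """
    vocab_size = len(global_bigram)
    # Context-specific bigrams: re-drawn for every sequence.
    in_context = {t: rng.randrange(vocab_size) for t in triggers}
    seq = [rng.randrange(vocab_size)]
    while len(seq) < length:
        prev = seq[-1]
        if prev in in_context:
            # In-context bigram: the same output follows this trigger every time.
            seq.append(in_context[prev])
        else:
            # Global bigram: sample the next token from the shared table.
            weights = global_bigram[prev]
            seq.append(rng.choices(range(vocab_size), weights=weights)[0])
    return seq, in_context
```

A model trained on many such sequences can memorize the global table in its weights, but it can only predict the token after a trigger by looking back at an earlier occurrence of that trigger in the same sequence, which is the behavior the abstract attributes to an induction head.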

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


01c4593d60a020fed5607944330106b1-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:37:22 GMT

arxiv preprint, large language model, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)


00d1f03b87a401b1c7957e0cc785d0bc-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:36:03 GMT

[...] annotation of questions and answers for videos [...]. To tackle this problem, recent methods [...] of visual question-answer [...]. In particular, a promising approach adapts frozen autoregressive [...] pretrained on Web-scale text-only data to multi-modal inputs. [...] here build on frozen bidirectional language models (BiLM) [...]

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Passive learning of active causal strategies in agents and language models

Neural Information Processing Systems | Apr-24-2026, 06:51:34 GMT

What can be learned about causality and experimentation from passive data? This question is salient given recent successes of passively-trained language models in interactive domains such as tool use. Passive learning is inherently limited. However, we show that purely passive learning can in fact allow an agent to learn generalizable strategies for determining and using causal structures, as long as the agent can intervene at test time. We formally illustrate that, under certain assumptions, learning a strategy of first experimenting, then seeking goals, can allow generalization from passive learning in principle.

large language model, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report (0.68)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)


034d7bfeace2a9a258648b16fc626298-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 06:24:03 GMT

arxiv preprint arxiv, large language model, machine learning, (17 more...)

Neural Information Processing Systems

Industry:

Leisure & Entertainment > Sports (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


029df12a9363313c3e41047844ecad94-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 05:58:35 GMT

information retrieval, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County (0.28)

Genre: Workflow (0.47)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(6 more...)


Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models

Neural Information Processing Systems | Apr-24-2026, 05:31:16 GMT

Human-object interaction (HOI) detection aims to comprehend the intricate relationships between humans and objects, predicting <human, action, object> triplets, and serving as the foundation for numerous computer vision tasks. The complexity and diversity of human-object interactions in the real world, however, pose significant challenges for both annotation and recognition, particularly in recognizing interactions within an open-world context. This study explores universal interaction recognition in an open-world setting through the use of Vision-Language (VL) foundation models and large language models (LLMs). The proposed method is dubbed UniHOI. We conduct a deep analysis of the three hierarchical features inherent in visual HOI detectors and propose a method for high-level relation extraction aimed at VL foundation models, which we call HO prompt-based learning. Our design includes an HO Prompt-guided Decoder (HOPD), which facilitates the association of high-level relation representations in the foundation model with various HO pairs within the image. Furthermore, we utilize an LLM (i.e.

large language model, machine learning, natural language, (13 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Sports (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Large language models transition from integrating across position-yoked, exponential windows to structure-yoked, power-law windows

Neural Information Processing Systems | Apr-24-2026, 05:30:02 GMT

Modern language models excel at integrating across long temporal scales needed to encode linguistic meaning and show non-trivial similarities to biological neural systems. Prior work suggests that human brain responses to language exhibit hierarchically organized "integration windows" that substantially constrain the overall influence of an input token (e.g., a word) on the neural response. However, little prior work has attempted to use integration windows to characterize computations in large language models (LLMs). We developed a simple word-swap procedure for estimating integration windows from black-box language models that does not depend on access to gradients or knowledge of the model architecture (e.g., attention weights). Using this method, we show that trained LLMs exhibit stereotyped integration windows that are well-fit by a convex combination of an exponential and a power-law function, with a partial transition from exponential to power-law dynamics across network layers. We then introduce a metric for quantifying the extent to which these integration windows vary with structural boundaries (e.g., sentence boundaries), and using this metric, we show that integration windows become increasingly yoked to structure at later network layers. None of these findings were observed in an untrained model, which as expected integrated uniformly across its input. These results suggest that LLMs learn to integrate information in natural language using a stereotyped pattern: integrating across position-yoked, exponential windows at early layers, followed by structure-yoked, power-law windows at later layers. The methods we describe in this paper provide a general-purpose toolkit for understanding temporal integration in language models, facilitating cross-disciplinary research at the intersection of biological and artificial intelligence.
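
The "convex combination of an exponential and a power-law function" mentioned in this abstract can be written out as a short function. This is a sketch under stated assumptions rather than the paper's parametrization: the abstract gives no formula, so the function name `integration_window`, the parameters `mix`, `tau`, and `beta`, and the specific forms `exp(-d / tau)` and `(1 + d) ** (-beta)` are hypothetical choices that happen to equal 1 at distance 0.

```python
import math


def integration_window(distance, mix, tau, beta):
    """Influence of a token at a given distance from the current position.

    Assumed form (not taken from the paper):
        mix * exp(-distance / tau) + (1 - mix) * (1 + distance) ** (-beta)

    mix = 1 gives a purely exponential window and mix = 0 a purely
    power-law window. Values in between blend the two.
    """
    if not 0.0 <= mix <= 1.0:
        raise ValueError("mix must lie in [0, 1] for a convex combination")
    if distance < 0 or tau <= 0 or beta <= 0:
        raise ValueError("distance must be >= 0; tau and beta must be > 0")
    exponential = math.exp(-distance / tau)
    power_law = (1.0 + distance) ** (-beta)
    return mix * exponential + (1.0 - mix) * power_law
```

Under this parametrization, lowering `mix` shifts weight from the fast-decaying exponential term to the heavier-tailed power-law term, so distant tokens keep more influence. That matches the qualitative shift from exponential to power-law dynamics across layers that the abstract reports.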

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: