AITopics | dependency graph

Dependency Parsing is More Parameter-Efficient with Normalization

Neural Information Processing Systems, Jun-22-2026, 21:07:23 GMT

Dependency parsing is the task of inferring natural language structure, often approached by modeling word interactions via attention through biaffine scoring. This mechanism works like self-attention in Transformers, where scores are calculated for every pair of words in a sentence. However, unlike Transformer attention, biaffine scoring does not use normalization prior to taking the softmax of the scores. In this paper, we provide theoretical evidence and empirical results revealing that a lack of normalization necessarily results in overparameterized parser models, where the extra parameters compensate for the sharp softmax outputs produced by high variance inputs to the biaffine scoring function. We argue that biaffine scoring can be made substantially more efficient by performing score normalization. We conduct experiments on semantic and syntactic dependency parsing in multiple languages, along with latent graph inference on non-linguistic data, using various settings of a k-hop parser. We train N-layer stacked BiLSTMs and evaluate the parser's performance with and without normalizing biaffine scores. Normalizing allows us to achieve state-of-the-art performance with fewer samples and trainable parameters.
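The effect the abstract describes, sharp softmax outputs caused by high-variance biaffine scores, can be illustrated with a small sketch. It is not the authors' implementation: it keeps only the bilinear term of a biaffine scorer, uses random vectors in place of BiLSTM outputs, and assumes the normalization is the 1/sqrt(d) scaling used in Transformer attention, which the abstract does not spell out. The names `biaffine_scores`, `mean_entropy`, `n_words`, and `dim` are hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def biaffine_scores(deps, heads, weight, normalize=False):
    """Return scores[i][j] = deps[i]^T W heads[j], optionally scaled by 1/sqrt(d)."""
    d = len(weight)
    # Assumed normalization: the 1/sqrt(d) factor used in Transformer attention.
    scale = 1.0 / math.sqrt(d) if normalize else 1.0
    scores = []
    for dep in deps:
        # Project the dependent through W once, then dot it with every head.
        projected = [sum(dep[i] * weight[i][j] for i in range(d)) for j in range(d)]
        scores.append(
            [scale * sum(p * h for p, h in zip(projected, head)) for head in heads]
        )
    return scores


def mean_entropy(scores):
    """Average entropy (in nats) of the row-wise softmax distributions."""
    total = 0.0
    for row in scores:
        total -= sum(p * math.log(p) for p in softmax(row) if p > 0.0)
    return total / len(scores)


# Toy data: 6 "words" with 32-dimensional random representations (hypothetical sizes).
rng = random.Random(0)
n_words, dim = 6, 32
deps = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_words)]
heads = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_words)]
weight = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]

raw = biaffine_scores(deps, heads, weight, normalize=False)
scaled = biaffine_scores(deps, heads, weight, normalize=True)

print("mean entropy without normalization:", mean_entropy(raw))
print("mean entropy with normalization:   ", mean_entropy(scaled))
```

Multiplying logits by a factor below one can only flatten a softmax, so the second entropy is never smaller than the first. In this toy setting the unscaled distributions are close to one-hot, which is the sharpness that, according to the abstract, extra parameters otherwise have to compensate for.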

machine learning, natural language, normalization, (19 more...)

Country:

Europe (1.00)
Asia (0.67)
North America > United States > Minnesota (0.28)
North America > United States > Massachusetts (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

fce176458ff542940fa3ed16e6f9c852-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-30-2026, 09:55:34 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Industry: Leisure & Entertainment > Sports (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Supplementary Information 1 Code and Full Technical Documentation of Event Stream GPT

Neural Information Processing Systems, Apr-27-2026, 05:01:09 GMT

T/P refers to "tokens" per patient, C/T to codes per "token," and # Codes to the number of unique codes present in the dataset. Missing values reflect quantities not reported in the source publication.

data mining, machine learning, natural language, (17 more...)

Country: North America > United States (0.69)

Industry:

Health & Medicine > Health Care Providers & Services (0.71)
Health & Medicine > Health Care Technology > Medical Record (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.96)
Information Technology > Data Science > Data Mining (0.69)

DisenGCD: A Meta Multigraph-assisted Disentangled Graph Learning Framework for Cognitive Diagnosis

Neural Information Processing Systems, Mar-21-2026, 22:54:07 GMT

Existing graph learning-based cognitive diagnosis (CD) methods have achieved relatively good results, but they learn and exchange student, exercise, and concept representations in a single implicit unified graph. As a result, the interaction-agnostic exercise and concept representations are learned poorly, and the methods are not robust to noise in students' interactions. In addition, lower-order exercise latent representations obtained in shallow layers are not well exploited when learning the student representation. To tackle these issues, this paper proposes a meta multigraph-assisted disentangled graph learning framework for CD (DisenGCD), which learns three types of representations on three disentangled graphs: the student-exercise-concept interaction graph, the exercise-concept relation graph, and the concept dependency graph. Specifically, the latter two graphs are first disentangled from the interaction graph. Then, the student representation is learned from the interaction graph by a devised meta multigraph learning module; multiple learnable propagation paths in this module let the current student latent representation access lower-order exercise latent representations, which can lead to more effective and robust student representations. The exercise and concept representations are learned on the relation and dependency graphs by graph attention modules. Finally, a novel diagnostic function is devised to combine the three disentangled representations for prediction.

artificial intelligence, machine learning, representation, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.64)

fce176458ff542940fa3ed16e6f9c852-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-18-2026, 03:20:18 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

ANPL: Towards Natural Programming with Interactive Decomposition

Di Huang

Neural Information Processing Systems, Feb-17-2026, 11:23:28 GMT

Though LLMs are capable of generating plausible programs, it's challenging to interact with the LLMs further to revise the program, especially if the user's specific requirements are different from the initial proposal.

large language model, machine learning, programming language, (19 more...)

Country:

North America > Mexico (0.04)
Asia > China (0.04)
North America > United States > District of Columbia > Washington (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry: Education (0.93)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions

Zizhao Wang

Neural Information Processing Systems, Feb-16-2026, 13:50:57 GMT

Unsupervised skill discovery carries the promise that an intelligent agent can learn reusable skills through autonomous, reward-free environment interaction.

local dependency, machine learning, reinforcement learning, (16 more...)

Country: