lantern


Learning Latent Process from High-Dimensional Event Sequences via Efficient Sampling

Qitian Wu, Zixuan Zhang, Xiaofeng Gao, Junchi Yan, Guihai Chen

Neural Information Processing Systems

There are plenty of previous studies targeting this problem from different aspects. For temporal point processes, a great number of works [3, 13, 15, 16, 28] attempt to model the intensity function from a statistical view, and recent studies harness deep recurrent models [24], generative adversarial networks [23], and reinforcement learning [19, 18] to learn the temporal process. These studies mainly focus on one-dimensional event sequences where each event possesses the same marker.
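As an illustration of the intensity functions such statistical models work with, here is a minimal sketch of the conditional intensity of a one-dimensional Hawkes process with an exponential kernel; the parameter values are illustrative and not taken from any of the cited works:

```python
import math

def hawkes_intensity(t, history, mu=0.2, alpha=0.8, beta=1.0):
    """Conditional intensity of a 1-D Hawkes process with an exponential
    kernel: lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)).
    mu, alpha, beta are illustrative values, not from the paper."""
    return mu + sum(alpha * math.exp(-beta * (t - ti))
                    for ti in history if ti < t)

# With no past events the intensity is just the base rate mu.
print(hawkes_intensity(1.0, []))          # 0.2
# Each past event excites the process, with exponential decay.
print(hawkes_intensity(1.0, [0.0, 0.5]))
```

Each arriving event raises the intensity, which then decays back toward the base rate, capturing the self-exciting behavior that the cited works model statistically.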


Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming

Fei Wang, James Decker, Xilun Wu, Gregory Essertel, Tiark Rompf

Neural Information Processing Systems

In this paper we propose an implementation of backpropagation using functions with callbacks, where the forward pass is executed as a sequence of function calls, and the backward pass as a corresponding sequence of function returns. A key realization is that this technique of chaining callbacks is well known in the programming languages community as continuation-passing style (CPS).
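The chained-callback idea can be sketched in a few lines. The following toy Python sketch (not the paper's Scala implementation) runs the forward pass as nested calls and the backward pass as the corresponding returns:

```python
def const(x):
    # A value paired with a gradient accumulator.
    return {"val": x, "grad": 0.0}

def mul(a, b, k):
    """Multiply a*b, then invoke the continuation k on the result.
    The forward pass proceeds inside k; the backward pass runs
    as this function returns, after k comes back."""
    c = {"val": a["val"] * b["val"], "grad": 0.0}
    k(c)                                 # forward continues via the callback
    a["grad"] += b["val"] * c["grad"]    # backward runs on return
    b["grad"] += a["val"] * c["grad"]

def grad(f, x0):
    x = const(x0)
    def top(out):
        out["grad"] = 1.0                # seed: d(out)/d(out) = 1
    f(x, top)
    return x["grad"]

# d/dx (x * x) at x = 3 is 2x = 6
print(grad(lambda x, k: mul(x, x, k), 3.0))   # 6.0
```

The call stack itself stores the intermediate values, so no explicit tape is needed: every operator pushes its locals on the way in and consumes the propagated gradient on the way out.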



LANTERN: Scalable Distillation of Large Language Models for Job-Person Fit and Explanation

Fu, Zhoutong, Cao, Yihan, Chen, Yi-Lin, Lunia, Aman, Dong, Liming, Saraf, Neha, Jiang, Ruijie, Dai, Yun, Song, Qingquan, Wang, Tan, Li, Guoyao, Koh, Derek, Wei, Haichao, Wang, Zhipeng, Gupta, Aman, Jiang, Chengming, Shen, Jianqiang, Hong, Liangjie, Zhang, Wenjing

arXiv.org Artificial Intelligence

Large language models (LLMs) have achieved strong performance across a wide range of natural language processing tasks. However, deploying LLMs at scale for domain-specific applications, such as job-person fit and explanation on job-seeking platforms, introduces distinct challenges. At LinkedIn, the job-person fit task requires analyzing a candidate's public profile against job requirements to produce both a fit assessment and a detailed explanation. Directly applying open-source or fine-tuned LLMs to this task often fails to yield high-quality, actionable feedback due to the complexity of the domain and the need for structured outputs. Moreover, the large size of these models leads to high inference latency and limits scalability, making them unsuitable for online use. To address these challenges, we introduce LANTERN, a novel LLM knowledge distillation framework tailored specifically for job-person fit tasks. LANTERN involves modeling over multiple objectives: an encoder model for classification and a decoder model for explanation. To better distill knowledge from a strong black-box teacher model to multiple downstream models, LANTERN incorporates multi-level knowledge distillation that integrates both data- and logit-level insights. In addition to introducing the knowledge distillation framework, we share our insights on post-training techniques and prompt engineering, both of which are crucial for successfully adapting LLMs to domain-specific downstream tasks. Extensive experimental results demonstrate that LANTERN significantly improves task-specific metrics for both job-person fit and explanation. Online evaluations further confirm its effectiveness, showing measurable gains in job seeker engagement, including a 0.24% increase in apply rate and a 0.28% increase in qualified applications.
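The logit-level side of such multi-level distillation is commonly a KL divergence between temperature-softened teacher and student distributions. The following is a generic sketch of that standard loss, not LANTERN's actual objective, which is not specified in the abstract:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Logit-level distillation: KL(teacher || student) on
    temperature-softened distributions. T=2.0 is an illustrative
    choice, not a value reported by the paper."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identical logits give zero loss; divergence grows as they differ.
print(kd_loss([2.0, 0.5], [2.0, 0.5]))   # 0.0
```

Training the smaller encoder/decoder models against these softened targets, rather than hard labels alone, is what lets a compact student absorb the teacher's calibrated uncertainty.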



Modeling User Behavior from Adaptive Surveys with Supplemental Context

Shukla, Aman, Scantlebury, Daniel Patrick, Kumar, Rishabh

arXiv.org Artificial Intelligence

Modeling user behavior is critical across many industries where understanding preferences, intent, or decisions informs personalization, targeting, and strategic outcomes. Surveys have long served as a classical mechanism for collecting such behavioral data due to their interpretability, structure, and ease of deployment. However, surveys alone are inherently limited by user fatigue, incomplete responses, and practical constraints on their length, making them insufficient for capturing user behavior. In this work, we present LANTERN (Late-Attentive Network for Enriched Response Modeling), a modular architecture for modeling user behavior by fusing adaptive survey responses with supplemental contextual signals. We demonstrate the architectural value of maintaining survey primacy through selective gating, residual connections, and late fusion via cross-attention, treating survey data as the primary signal while incorporating external modalities only when relevant. LANTERN outperforms strong survey-only baselines in multi-label prediction of survey responses. We further investigate threshold sensitivity and the benefits of selective modality reliance through ablation and rare/frequent attribute analysis. LANTERN's modularity supports scalable integration of new encoders and evolving datasets. This work provides a practical and extensible blueprint for behavior modeling in survey-centric applications.
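The idea of survey primacy via gating and residual connections can be sketched as follows. This toy function uses invented names and a scalar gate in place of cross-attention, so it only illustrates the fusion pattern, not LANTERN's actual layers:

```python
import math

def gated_fusion(survey_vec, context_vec, gate_w):
    """Late fusion keeping the survey as the primary signal: a scalar
    gate in [0, 1], computed from the survey representation, decides
    how much context to mix in, and a residual connection preserves
    the survey vector unchanged. Toy sketch only."""
    # Gate from the survey vector: sigmoid of a dot product.
    g = 1.0 / (1.0 + math.exp(-sum(w * s for w, s in zip(gate_w, survey_vec))))
    # Residual: survey passes through; context enters only via the gate.
    return [s + g * c for s, c in zip(survey_vec, context_vec)]

# With zero gate weights the gate sits at 0.5, mixing in half the context.
print(gated_fusion([1.0, -0.5], [0.2, 0.3], gate_w=[0.0, 0.0]))
```

Because the gate is a function of the survey signal, the model can drive it toward zero whenever the supplemental context is irrelevant, recovering the survey-only baseline.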


Learn to Think: Bootstrapping LLM Reasoning Capability Through Graph Representation Learning

Gao, Hang, Zhang, Chenhao, Wang, Tie, Zhao, Junsuo, Wu, Fengge, Zheng, Changwen, Liu, Huaping

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have achieved remarkable success across various domains. However, they still face significant challenges, including high computational costs for training and limitations in solving complex reasoning problems. Although existing methods have extended the reasoning capabilities of LLMs through structured paradigms, these approaches often rely on task-specific prompts and predefined reasoning processes, which constrain their flexibility and generalizability. To address these limitations, we propose a novel framework that leverages graph learning to enable more flexible and adaptive reasoning capabilities for LLMs. Specifically, this approach models the reasoning process of a problem as a graph and employs LLM-based graph learning to guide the adaptive generation of each reasoning step. To further enhance the adaptability of the model, we introduce a Graph Neural Network (GNN) module to perform representation learning on the generated reasoning process, enabling real-time adjustments to both the model and the prompt. Experimental results demonstrate that this method significantly improves reasoning performance across multiple tasks without requiring additional training or task-specific prompt design. Code can be found at https://github.com/zch65458525/L2T.
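Modeling a reasoning process as a graph can be illustrated with a toy DAG of reasoning steps and a dependency-respecting generation order. The step names are invented for illustration; this sketch stands in for, rather than reproduces, the paper's LLM-based graph learning:

```python
from collections import deque

def reasoning_order(steps, edges):
    """Treat reasoning steps as nodes of a DAG (edge u -> v means v
    depends on u) and emit an order in which the steps can be
    generated (Kahn's topological sort)."""
    indeg = {s: 0 for s in steps}
    adj = {s: [] for s in steps}
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    queue = deque(s for s in steps if indeg[s] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return order

steps = ["parse problem", "extract facts", "derive lemma", "answer"]
edges = [("parse problem", "extract facts"),
         ("extract facts", "derive lemma"),
         ("parse problem", "derive lemma"),
         ("derive lemma", "answer")]
print(reasoning_order(steps, edges))
# ['parse problem', 'extract facts', 'derive lemma', 'answer']
```

Representing steps as graph nodes rather than a fixed chain is what gives the framework room to add, reorder, or branch reasoning steps adaptively.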


Push the Limit of Multi-modal Emotion Recognition by Prompting LLMs with Receptive-Field-Aware Attention Weighting

Zhang, Liyun, Ding, Dian, Lu, Yu, Chen, Yi-Chao, Xue, Guangtao

arXiv.org Artificial Intelligence

Understanding the emotions in a dialogue usually requires external knowledge to accurately interpret the contents. As LLMs become more and more powerful, we do not want to settle for the limited ability of a pre-trained language model. However, LLMs either can only process the text modality or are too expensive to process multimedia information. We aim to utilize both the power of LLMs and the supplementary features from the multimedia modalities. In this paper, we present a framework, Lantern, that can improve the performance of a vanilla model by prompting large language models with receptive-field-aware attention weighting. The framework trains a multi-task vanilla model to produce probabilities of emotion classes and dimension scores. These predictions are fed into the LLMs as references to adjust the predicted probabilities of each emotion class with their external knowledge and contextual understanding. We slice the dialogue into different receptive fields, and each sample is included in exactly t receptive fields. Finally, the predictions of the LLMs are merged with a receptive-field-aware attention-driven weighting module. In the experiments, the vanilla models CORECT and SDT are deployed in Lantern with GPT-4 or Llama-3.1-405B. Experiments on IEMOCAP with 4-way and 6-way settings demonstrate that Lantern significantly improves the performance of current vanilla models by up to 1.23% and 1.80%.
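The final merging step can be illustrated with a minimal sketch: per-receptive-field class probabilities combined under softmax attention weights. The real module learns its weights end to end; the names and values here are illustrative:

```python
import math

def merge_predictions(preds, scores):
    """Merge class-probability predictions from the t receptive fields
    covering one sample, weighting each field by a softmax over its
    attention score. Toy sketch of attention-driven merging."""
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    n_classes = len(preds[0])
    return [sum(w * p[i] for w, p in zip(weights, preds))
            for i in range(n_classes)]

# Two receptive fields with equal scores -> a simple average.
print(merge_predictions([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))   # [0.5, 0.5]
```

Raising one field's score shifts the merged distribution toward that field's prediction, which is how the module can favor the receptive field whose context best fits a given sample.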


CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning

Leang, Joshua Ong Jun, Gema, Aryo Pradipta, Cohen, Shay B.

arXiv.org Artificial Intelligence

Mathematical reasoning remains a significant challenge for large language models (LLMs), despite progress in prompting techniques such as Chain-of-Thought (CoT). We present Chain of Mathematically Annotated Thought (CoMAT), which enhances reasoning through two stages: Symbolic Conversion (converting natural language queries into symbolic form) and Reasoning Execution (deriving answers from symbolic representations). CoMAT operates entirely with a single LLM and without external solvers. Across four LLMs, CoMAT outperforms traditional CoT on six out of seven benchmarks, achieving gains of 4.48% on MMLU-Redux (MATH) and 4.58% on GaoKao MCQ. In addition to improved performance, CoMAT ensures faithfulness and verifiability, offering a transparent reasoning process for complex mathematical tasks.


Reviews: Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming

Neural Information Processing Systems

The paper describes Lantern, a framework for automatic differentiation in Scala, based on callbacks and continuation-passing style. It compares against PyTorch and TensorFlow on several benchmark tasks. There are two main aspects to the paper: reverse-mode automatic differentiation with continuations, and code generation via multi-stage programming. The submission does not provide code for the proposed framework, which I don't find acceptable for a paper on a software package. It is also unclear to me how the first aspect differs from any other implementation of automatic differentiation via operator overloading.