AITopics

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Neural Information Processing Systems, Oct-3-2025, 07:57:06 GMT

Unsupervised Text Generation by Learning from Search

Later, we perform max-margin (MM) learning to better distinguish between higher-scored sentences and other high-probability but sub-optimal sentences.

gpt2, text generation, tgl, (15 more...)

Country:

North America > Canada > Alberta (0.14)
Asia > China > Hong Kong (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Neural Information Processing Systems, Oct-2-2025, 08:00:38 GMT

Multi-Layer Feature Reduction for Tree Structured Group Lasso via Hierarchical Projection

Jie Wang, Jieping Ye

Neural Information Processing Systems http://nips.cc/

inactive node, mlfre, node, (16 more...)

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Arizona (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Fernandez, Jared, Bisk, Yonatan, Strubell, Emma

Gradient Localization Improves Lifelong Pretraining of Language Models

arXiv.org Artificial Intelligence, Nov-7-2024

Large Language Models (LLMs) trained on web-scale text corpora have been shown to capture world knowledge in their parameters. However, the mechanism by which language models store different types of knowledge is poorly understood. In this work, we examine two types of knowledge relating to temporally sensitive entities and demonstrate that each type is localized to a different set of parameters within the LLMs. We hypothesize that the lack of consideration of the locality of knowledge in existing continual learning methods contributes to both the failed uptake of new information and the catastrophic forgetting of previously learned information. We observe that sequences containing references to updated and newly mentioned entities exhibit larger gradient norms in a subset of layers. We demonstrate that targeting parameter updates to these relevant layers can improve the performance of continual pretraining on language containing temporal drift.
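The idea of targeting updates at layers with large gradient norms can be illustrated with a minimal sketch. It is not the authors' implementation: the layer names, the top-k selection rule, and the plain SGD step are all assumptions chosen for illustration, and the gradients are toy numbers rather than values from a language model.

```python
import math

def layer_grad_norms(grads):
    """Return the L2 norm of each layer's gradient.

    `grads` maps a (hypothetical) layer name to a flat list of gradient values.
    """
    return {name: math.sqrt(sum(g * g for g in values))
            for name, values in grads.items()}

def select_layers(norms, k):
    """Pick the k layers with the largest gradient norms (assumed selection rule)."""
    return set(sorted(norms, key=norms.get, reverse=True)[:k])

def localized_sgd_step(params, grads, k, lr=0.1):
    """Apply a plain SGD step only to the k layers with the largest gradient norms."""
    chosen = select_layers(layer_grad_norms(grads), k)
    updated = {}
    for name, values in params.items():
        if name in chosen:
            updated[name] = [p - lr * g for p, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)  # frozen layer: copied unchanged
    return updated, chosen

# Toy example with three hypothetical layers; only "layer1" has a large gradient.
params = {"layer0": [1.0, 1.0], "layer1": [1.0, 1.0], "layer2": [1.0, 1.0]}
grads = {"layer0": [0.1, 0.0], "layer1": [3.0, 4.0], "layer2": [0.0, 0.2]}
new_params, chosen = localized_sgd_step(params, grads, k=1)
```

In this toy run only `layer1` is updated, while the other layers keep their original values.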

gradient norm, knowledge, language model, (12 more...)

2411.04448

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Haiti (0.04)
Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)

Genre: Research Report (0.40)

Industry: Government > Regional Government > North America Government > United States Government (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Neural Information Processing Systems, Mar-13-2024

Multi-Layer Feature Reduction for Tree Structured Group Lasso via Hierarchical Projection

Jie Wang

Tree structured group Lasso (TGL) is a powerful technique for uncovering the tree structured sparsity over the features, where each node encodes a group of features. It has been applied successfully in many real-world applications. However, with extremely large feature dimensions, solving TGL remains a significant challenge due to its highly complicated regularizer. In this paper, we propose a novel Multi-Layer Feature reduction method (MLFre) to quickly identify the inactive nodes (the groups of features with zero coefficients in the solution) hierarchically in a top-down fashion; these nodes are guaranteed to be irrelevant to the response. Thus, we can remove the detected nodes from the optimization without sacrificing accuracy. The major challenge in developing such testing rules arises from the overlaps between parent nodes and their children. By a novel hierarchical projection algorithm, MLFre is able to test the nodes independently from any of their ancestor nodes. Moreover, we can integrate MLFre, which has a low computational cost, with any existing solver. Experiments on both synthetic and real data sets demonstrate that the speedup gained by MLFre can be orders of magnitude.
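To make the notions of overlapping groups and inactive nodes concrete, here is a small sketch. It assumes the commonly used form of the regularizer, a weighted sum of Euclidean norms over the groups attached to tree nodes. The tree, the weights, and the function names are hypothetical. The sketch only inspects a given coefficient vector; it does not implement the MLFre screening rules, which identify inactive nodes before the problem is solved.

```python
import math

def tgl_penalty(beta, groups, lam=1.0):
    """Assumed tree structured group penalty: lam * sum_j w_j * ||beta[G_j]||_2.

    `groups` is a list of (indices, weight) pairs, one per tree node;
    a child's index set is a subset of its parent's, so groups overlap.
    """
    total = 0.0
    for indices, weight in groups:
        total += weight * math.sqrt(sum(beta[i] ** 2 for i in indices))
    return lam * total

def inactive_nodes(beta, groups):
    """Return positions of nodes whose coefficients are all exactly zero."""
    return [k for k, (indices, _) in enumerate(groups)
            if all(beta[i] == 0.0 for i in indices)]

# Hypothetical tree over 4 features: root {0,1,2,3} with children {0,1} and {2,3}.
groups = [((0, 1, 2, 3), 1.0), ((0, 1), 1.0), ((2, 3), 1.0)]
beta = [3.0, 4.0, 0.0, 0.0]

penalty = tgl_penalty(beta, groups)     # root and first child each contribute 5.0
inactive = inactive_nodes(beta, groups)  # only the second child is all zeros
```

Because the child group `{2, 3}` is contained in the root group, features 2 and 3 are penalized at both levels. This overlap between parents and children is what the abstract names as the main difficulty for testing rules.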

inactive node, mlfre, node, (17 more...)

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Arizona (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Artificial Intelligence, Nov-30-2023

LasTGL: An Industrial Framework for Large-Scale Temporal Graph Learning

Li, Jintang, Dan, Jiawang, Wu, Ruofan, Zhou, Jing, Tian, Sheng, Liu, Yunfei, Wang, Baokun, Meng, Changhua, Wang, Weiqiang, Zhu, Yuchang, Chen, Liang, Zheng, Zibin

Over the past few years, graph neural networks (GNNs) have become powerful and practical tools for learning on (static) graph-structured data. However, many real-world applications, such as social networks and e-commerce, involve temporal graphs where nodes and edges are dynamically evolving. Temporal graph neural networks (TGNNs) have progressively emerged as an extension of GNNs to address time-evolving graphs and have gradually become a trending research topic in both academia and industry. Advancing research and application in such an emerging field necessitates the development of new tools to compose TGNN models and unify their different schemes for dealing with temporal graphs. In this work, we introduce LasTGL, an industrial framework that integrates unified and extensible implementations of common temporal graph learning algorithms for various advanced tasks. The purpose of LasTGL is to provide the essential building blocks for solving temporal graph learning tasks, focusing on the guiding principles of user-friendliness and quick prototyping on which PyTorch is based. In particular, LasTGL provides comprehensive temporal graph datasets, TGNN models and utilities along with well-documented tutorials, making it suitable for absolute beginners and expert deep learning practitioners alike.

graph, representation, temporal graph, (14 more...)

2311.16605

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Information Technology > Services (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Suys, Tom, Hwang, Sunyou, de Croon, Guido C. H. E., Remes, Bart D. W.

Autonomous Control for Orographic Soaring of Fixed-Wing UAVs

arXiv.org Artificial Intelligence, May-23-2023

We present a novel controller for fixed-wing UAVs that enables autonomous soaring in an orographic wind field, extending flight endurance. Our method identifies soaring regions and addresses position control challenges by introducing a target gradient line (TGL) on which the UAV achieves an equilibrium soaring position, where sink rate and updraft are balanced. We also demonstrate a single degree of control freedom in a soaring position through manipulation of the TGL.

I. INTRODUCTION

UAVs have benefited from advancements in battery technology and miniaturization of avionics, which have increased their endurance and range. However, the full potential of UAV applications remains limited by reduced flight time.

artificial intelligence, controller, wind field, (17 more...)

2305.13891

Country: Europe > Netherlands > South Holland > Delft (0.05)

Genre: Research Report > New Finding (0.47)

Industry:

Transportation > Air (1.00)
Energy (0.95)
Aerospace & Defense > Aircraft (0.90)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)

arXiv.org Artificial Intelligence, Jul-9-2020

Unsupervised Text Generation by Learning from Search

Li, Jingjing, Li, Zichao, Mou, Lili, Jiang, Xin, Lyu, Michael R., King, Irwin

In this work, we present TGLS, a novel framework for unsupervised Text Generation by Learning from Search. We start by applying a strong search algorithm (in particular, simulated annealing) towards a heuristically defined objective that (roughly) estimates the quality of sentences. Then, a conditional generative model learns from the search results and, in doing so, smooths out the noise of search. The alternation between search and learning can be repeated for performance bootstrapping. We demonstrate the effectiveness of TGLS on two real-world natural language generation tasks, paraphrase generation and text formalization. Our model significantly outperforms unsupervised baseline methods in both tasks. In particular, it achieves performance comparable to state-of-the-art supervised methods in paraphrase generation.
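The search stage described above can be sketched as simulated annealing over word-level edits. This is only a toy illustration: the vocabulary, the keyword-based `score` function, the edit operations, and the linear cooling schedule are assumptions. They stand in for the heuristically defined objective, which the abstract does not specify. The learning stage, in which a conditional generative model is trained on the search outputs, is not shown.

```python
import math
import random

VOCAB = ["the", "a", "cat", "dog", "sat", "slept", "on", "mat", "sofa"]
KEYWORDS = {"cat", "sat", "mat"}  # hypothetical stand-in for sentence "quality"

def score(words):
    """Toy objective: reward distinct keywords, penalize length."""
    return len(KEYWORDS & set(words)) - 0.1 * len(words)

def propose(words, rng):
    """Apply one random word-level edit: replace, insert, or delete."""
    words = list(words)
    op = rng.choice(["replace", "insert", "delete"])
    if op == "replace" and words:
        words[rng.randrange(len(words))] = rng.choice(VOCAB)
    elif op == "insert":
        words.insert(rng.randrange(len(words) + 1), rng.choice(VOCAB))
    elif op == "delete" and len(words) > 1:
        del words[rng.randrange(len(words))]
    return words

def simulated_annealing(start, steps=2000, t0=1.0, seed=0):
    """Search for a high-scoring word sequence, returning the best one seen."""
    rng = random.Random(seed)
    current, best = list(start), list(start)
    for step in range(steps):
        temperature = t0 * (1 - step / steps) + 1e-6  # linear cooling
        candidate = propose(current, rng)
        delta = score(candidate) - score(current)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
        if score(current) > score(best):
            best = list(current)
    return best

start = ["the", "dog", "slept", "on", "the", "sofa"]
best = simulated_annealing(start)
```

Tracking `best` separately from `current` means the returned sequence never scores below the starting one, even though the walk itself may accept worse candidates while the temperature is high.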

artificial intelligence, machine learning, natural language, (18 more...)

2007.08557

Country:

North America > Canada > Alberta (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)

Neural Information Processing Systems, Dec-31-2015

Multi-Layer Feature Reduction for Tree Structured Group Lasso via Hierarchical Projection

Wang, Jie, Ye, Jieping

Tree structured group Lasso (TGL) is a powerful technique for uncovering the tree structured sparsity over the features, where each node encodes a group of features. It has been applied successfully in many real-world applications. However, with extremely large feature dimensions, solving TGL remains a significant challenge due to its highly complicated regularizer. In this paper, we propose a novel Multi-Layer Feature reduction method (MLFre) to quickly identify the inactive nodes (the groups of features with zero coefficients in the solution) hierarchically in a top-down fashion; these nodes are guaranteed to be irrelevant to the response. Thus, we can remove the detected nodes from the optimization without sacrificing accuracy. The major challenge in developing such testing rules arises from the overlaps between parent nodes and their children. By a novel hierarchical projection algorithm, MLFre is able to test the nodes independently from any of their ancestor nodes. Moreover, we can integrate MLFre, which has a low computational cost, with any existing solver. Experiments on both synthetic and real data sets demonstrate that the speedup gained by MLFre can be orders of magnitude.

artificial intelligence, machine learning, natural language, (18 more...)