AITopics

Country: North America (0.28)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsApr-24-2026, 15:48:22 GMT

Differentiable Unsupervised Feature Selection based on a Gated Laplacian - Supplementary Materials

It is important to properly tune the kernel scale/bandwidth σb, which determines its scale of connectivity. Several studies have proposed schemes for tuning σb, see for example [10, 3, 12, 5]. Here, we focus on two schemes, a global bandwidth and a local bandwidth. The local bandwidth proposed in [12], involves setting a local-scale σi for each data point xi,i= 1,...,n. The scale is chosen using the L1 distance from the k-th nearest neighbor of the point xi.

artificial intelligence, dataset, machine learning, (15 more...)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Neural Information Processing SystemsFeb-14-2026, 03:06:16 GMT

cb12d7f933e7d102c52231bf62b8a678-AuthorFeedback.pdf

predicate, relation, sanctity, (16 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.52)

Neural Information Processing SystemsDec-25-2025, 10:52:35 GMT

Practical Differentially Private Hyperparameter Tuning with Subsampling

Tuning the hyperparameters of differentially private (DP) machine learning (ML) algorithms often requires use of sensitive data and this may leak private information via hyperparameter values. Recently, Papernot and Steinke (2022) proposed a certain class of DP hyperparameter tuning algorithms, where the number of random search samples is randomized. Commonly, these algorithms still considerably increase the DP privacy parameter $\varepsilon$ over non-tuned DP ML model training and can be computationally heavy as evaluating each hyperparameter candidate requires a new training run. We focus on lowering both the DP bounds and the compute cost of these methods by using only a random subset of the sensitive data for the hyperparameter tuning and by appropriately extrapolating the optimal values to a larger dataset. We carry out a Rényi differential privacy analysis for the proposed method and experimentally show that it consistently leads to better privacy-utility trade-off than the baseline method by Papernot and Steinke.

artificial intelligence, machine learning, proceedings, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.79)

Neural Information Processing SystemsDec-24-2025, 09:33:21 GMT

Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL

Despite a series of recent successes in reinforcement learning (RL), many RL algorithms remain sensitive to hyperparameters. As such, there has recently been interest in the field of AutoRL, which seeks to automate design decisions to create more general algorithms. Recent work suggests that population based approaches may be effective AutoRL algorithms, by learning hyperparameter schedules on the fly. In particular, the PB2 algorithm is able to achieve strong performance in RL tasks by formulating online hyperparameter optimization as time varying GP-bandit problem, while also providing theoretical guarantees. However, PB2 is only designed to work for \emph{continuous} hyperparameters, which severely limits its utility in practice. In this paper we introduce a new (provably) efficient hierarchical approach for optimizing \emph{both continuous and categorical} variables, using a new time-varying bandit algorithm specifically designed for the population based training regime. We evaluate our approach on the challenging Procgen benchmark, where we show that explicitly modelling dependence between data augmentation and other hyperparameters improves generalization.

efficient population, input hyperparameter, name change, (7 more...)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

Kamath, Gaurav, Vajjala, Sowmya

Does Synthetic Data Help Named Entity Recognition for Low-Resource Languages?

arXiv.org Artificial IntelligenceNov-6-2025

Named Entity Recognition(NER) for low-resource languages aims to produce robust systems for languages where there is limited labeled training data available, and has been an area of increasing interest within NLP. Data augmentation for increasing the amount of low-resource labeled data is a common practice. In this paper, we explore the role of synthetic data in the context of multilingual, low-resource NER, considering 11 languages from diverse language families. Our results suggest that synthetic data does in fact hold promise for low-resource language NER, though we see significant variation between languages.

computational linguistic, large language model, machine learning, (19 more...)

2505.16814

Country:

Europe (1.00)
North America > Canada (0.70)
Asia > Middle East > UAE (0.15)
North America > Mexico > Mexico City (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

arXiv.org Artificial IntelligenceNov-4-2025

Context Tuning for In-Context Optimization

Lu, Jack, Teehan, Ryan, Yang, Zhenbang, Ren, Mengye

We introduce Context Tuning, a simple and effective method to significantly enhance few-shot adaptation of language models (LLMs) without fine-tuning model parameters. While prompt-based adaptation techniques have demonstrated the effectiveness of lightweight adaptation methods for LLMs, they typically initialize a trainable prompt or prefix with irrelevant tokens for the task at hand. In contrast, Context Tuning initializes the trainable prompt or prefix with task-specific demonstration examples, leveraging the model's inherent In-Context Learning (ICL) ability to extract relevant information for improved few-shot learning performance. Extensive evaluations on benchmarks such as CrossFit, UnifiedQA, MMLU, BIG-Bench Hard, and ARC demonstrate that Context Tuning outperforms traditional prompt-based adaptation methods and achieves competitive accuracy to Test-Time Training with significantly higher training efficiency.

large language model, machine learning, natural language, (17 more...)

2507.04221

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Awlla, Kozhin muhealddin, Veisi, Hadi, Abdullah, Abdulhady Abas

KuBERT: Central Kurdish BERT Model and Its Application for Sentiment Analysis

arXiv.org Artificial IntelligenceSep-23-2025

This paper enhances the study of sentiment analysis for the Central Kurdish language by integrating the Bidirectional Encoder Representations from Transformers (BERT) into Natural Language Processing techniques. Kurdish is a low - resourced language, having a high level of linguistic diversity with minimal computational resources, making sentiment analysis somewhat challenging. Earlier, this was done using a traditional w ord embedding model, such as Word2Vec, but with the emergence of new language models, specifically BERT, there is hope for improvements. The better word embedding capabilities of BERT lend to this study, aiding in the capturing of the nuanced semantic pool and the contextual intricacies of the language under study, the Kurdish language, thus setting a new benchmark for sentiment analysis in low - resource languages. The steps include collecting and normalizing a large corpus of Kurdish texts, pretraining BERT with a special tokenizer for Kurdish, and developing different models for sentiment analysis including Bidirectional Long Short - Term Memory ( BiLSTM), Multi - L ayer Perceptron ( MLP), and finetuning the BERT classifier . The proposed approach consists of 3 cla sses: positive, negative, and neutral sentiment analysis using a sentiment embedding of BERT in four different configurations. The accuracy of the best - performing classifier, BiLSTM, is 74.09%. For the BERT with an MLP classifier model, the maximum accuracy achieved is 73.96%, while the fine - tuned BERT model tops the others with 75.37% accuracy. Additionally, the fine - tuned BERT model demonstrates a vast improvement when focused on t wo 2 - class sentiment analyses positive and negative with an accuracy of 86.

machine learning, natural language, sentiment analysis, (18 more...)

2509.16804

Country: Asia > Middle East > Iraq (0.68)

Genre: Research Report > New Finding (0.94)

Industry:

Media (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bayram, Erkan, Belabbas, Mohamed-Ali, Başar, Tamer

Geometric Foundations of Tuning without Forgetting in Neural ODEs

arXiv.org Artificial IntelligenceSep-4-2025

In our earlier work, we introduced the principle of Tuning without Forgetting (TwF) for sequential training of neural ODEs, where training samples are added iteratively and parameters are updated within the subspace of control functions that preserves the end-point mapping at previously learned samples on the manifold of output labels in the first-order approximation sense. In this letter, we prove that this parameter subspace forms a Banach submanifold of finite codimension under nonsingular controls, and we characterize its tangent space. This reveals that TwF corresponds to a continuation/deformation of the control function along the tangent space of this Banach submanifold, providing a theoretical foundation for its mapping-preserving (not forgetting) during the sequential training exactly, beyond first-order approximation.

artificial intelligence, banach submanifold, machine learning, (15 more...)