AITopics | scheduler

Appendix A: Variational Paragraph Embedder; A.1 Selection of substitution rate p

Neural Information Processing Systems, Apr-30-2026, 10:10:09 GMT

Figure 4: Impact of the proportion of injected noise for learning Paragraph Embeddings on the XSum dataset. Shown are PPL_int and the PPL of the generation obtained from training PLANNER on the corresponding z at different noise levels.

We observed that when the value of p is within (0, 0.7), there … Performing a grid search on each task using diffusion models is an expensive process. However, an increase in the value of p leads to a deviation between the two quantities. This could be attributed to a higher conversion error that occurs when p is excessively large.

A.2 Selection of number of latent codes k

The parameter k determines the number of latent codes used to represent a paragraph and therefore controls the compression level. Latent codes with smaller values of k are easier to model with the diffusion model, but may struggle to preserve all the information in the original text. Smaller values of k are also more computationally efficient, because the sequence length for the diffusion model is k. To determine the best set of latent codes, we conducted experiments using three different methods: 1) selecting the first k hidden vectors, 2) selecting the last k hidden vectors, and 3) selecting interleaving hidden vectors, one for every L/k hidden vectors. The results of the ablation study are presented in Table 5. We observed no significant difference among the choices, so we opted for option 1). Furthermore, increasing the value of k does not lead to a dramatic improvement in performance. To balance efficiency and performance, in most of our study we use only k = 16.

Setup             BLEU_clean   BLEU_robust
First k (k=16)    79.59        43.17

A.3 Reconstruction, denoising and interpolation examples

In Table 6, we present examples that demonstrate the adeptness of the trained Variational Paragraph Embedder in providing clean and denoised reconstructions. Additionally, we showcase interpolation results (Tables 7 and 8) derived from two random sentences in the hotel review dataset. The interpolated paragraph is usually coherent and incorporates inputs from both sentences, characterizing the distributional smoothness of the latent space.

Reconstructed text: complaints: after two nights stay, i asked the maid to clean our room (empty the wastebasket & make the bed).

Denoising reconstruction (hotel review), noise level 0.3

Original text: * * * check out the bathroom picture * * * i was in nyc by myself to watch some friends participate in the us olympic marathon trials.

Corrupted text: * * [unused697] check exams the bathroom picture * * slams i was in nyc mead myself yankee 2016 some scotch ruin in the outfielder olympicnca trials.
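The three ways of choosing k latent codes described in A.2 amount to picking k positions among the L hidden vectors. The sketch below is only an illustration of that idea, not the authors' code: the function name `select_latent_indices`, the `mode` strings, and the integer stride `L // k` for the interleaved case are all assumptions.

```python
def select_latent_indices(L, k, mode="first"):
    """Return k zero-based positions among L hidden vectors.

    Hypothetical helper: the name, the mode strings, and the
    stride L // k are assumptions made for illustration.
    """
    if not 0 < k <= L:
        raise ValueError("k must satisfy 0 < k <= L")
    if mode == "first":
        # Option 1: the first k hidden vectors.
        return list(range(k))
    if mode == "last":
        # Option 2: the last k hidden vectors.
        return list(range(L - k, L))
    if mode == "interleave":
        # Option 3: one hidden vector for every L // k positions.
        stride = L // k
        return [i * stride for i in range(k)]
    raise ValueError(f"unknown mode: {mode}")
```

With L = 64 and k = 16, the interleaved option picks every fourth position, while the first-k option (the one the excerpt says was adopted) picks positions 0 through 15.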

artificial intelligence, hotel, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia (0.93)
North America > United States > Maryland > Prince George's County (0.28)

Genre: Research Report > New Finding (0.86)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Consumer Products & Services (1.00)
Health & Medicine (0.93)
(6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Meta-learning with an Adaptive Task Scheduler

Neural Information Processing Systems, Apr-25-2026, 13:34:32 GMT

To benefit the learning of a new task, meta-learning has been proposed to transfer a well-generalized meta-model learned from various meta-training tasks. Existing meta-learning algorithms sample meta-training tasks uniformly at random, under the assumption that all tasks are equally important. However, given a limited number of meta-training tasks, some tasks are likely to be detrimental because of noise or imbalance. To prevent the meta-model from being corrupted by such detrimental tasks or dominated by tasks in the majority, in this paper we propose an adaptive task scheduler (ATS) for the meta-training process. In ATS, for the first time, we design a neural scheduler that decides which meta-training tasks to use next by predicting the probability of being sampled for each candidate task, and we train the scheduler to optimize the generalization capacity of the meta-model to unseen tasks. We identify two meta-model-related factors as the input of the neural scheduler, which characterize the difficulty of a candidate task to the meta-model. Theoretically, we show that a scheduler taking the two factors into account improves the meta-training loss and also the optimization landscape. Under the setting of meta-learning with noise and limited budgets, ATS improves the performance on both miniImageNet and a real-world drug discovery benchmark by up to 13% and 18%, respectively, compared to state-of-the-art task schedulers.
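The sampling step described in the abstract, where a scheduler assigns each candidate task a probability of being sampled, can be sketched as follows. This is a minimal illustration under stated assumptions and not the ATS implementation: the per-task scores are taken as given (the abstract does not name the two input factors or the network), and the softmax normalisation, the function names, and sampling without replacement are all hypothetical choices.

```python
import math
import random


def sampling_probabilities(scores):
    """Turn per-task scheduler scores into probabilities (softmax, an assumption)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def sample_tasks(scores, batch_size, rng):
    """Draw distinct task indices, favouring tasks with higher scores."""
    probs = sampling_probabilities(scores)
    candidates = list(range(len(scores)))
    chosen = []
    for _ in range(min(batch_size, len(candidates))):
        weights = [probs[i] for i in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # sample without replacement
    return chosen


# Hypothetical scores for five candidate tasks; in ATS these would come
# from the neural scheduler rather than being fixed by hand.
scores = [2.0, 0.5, -1.0, 1.5, 0.0]
batch = sample_tasks(scores, batch_size=3, rng=random.Random(0))
```

Uniform sampling, the baseline the abstract contrasts with, corresponds to giving every task the same score.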

artificial intelligence, machine learning, scheduler, (16 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

30d411fdc0e6daf092a74354094359bb-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 08:47:31 GMT

artificial intelligence, machine learning, pe solution, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Appendix: On the Overlooked Pitfalls of Weight Decay and How to Mitigate Them

Neural Information Processing Systems, Apr-24-2026, 06:50:14 GMT

Suppose we have a non-zero solution θ which is a stationary point of f(θ, t) at the t-th step, and SGD finds θ_t = θ at the t-th step. Theorem 2.2 of Shapiro and Wardi [9] tells us that the learning rate should be small enough for convergence. Obviously, we have η < … in practice. As η_t = η_{t+1} does not hold, SGD cannot converge to any non-zero stationary point. The proof is now complete.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

040d3b6af368bf71f952c18da5713b48-Paper-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 06:50:11 GMT

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Efficient LLM Scheduling by Learning to Rank

Neural Information Processing Systems, Mar-21-2026, 00:59:56 GMT

In Large Language Model (LLM) inference, the output length of an LLM request is typically regarded as not known a priori. Consequently, most LLM serving systems employ a simple First-come-first-serve (FCFS) scheduling strategy, leading to Head-Of-Line (HOL) blocking and reduced throughput and service quality. In this paper, we reexamine this assumption -- we show that, although predicting the exact generation length of each request is infeasible, it is possible to predict the relative ranks of output lengths in a batch of requests, using learning to rank. The ranking information offers valuable guidance for scheduling requests. Building on this insight, we develop a novel scheduler for LLM inference and serving that can approximate the shortest-job-first (SJF) schedule better than existing approaches. We integrate this scheduler with the state-of-the-art LLM serving system and show significant performance improvement in several important applications: 2.8x lower latency in chatbot serving and 6.5x higher throughput in synthetic data generation.
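The idea of scheduling by predicted rank rather than by exact output length can be illustrated with a toy queue. This is a sketch under strong simplifying assumptions and not the paper's scheduler: all requests are assumed to be queued already, the server processes one request at a time (real serving systems batch requests), and the `predicted_score` values stand in for the output of a learned ranking model. All names are hypothetical.

```python
def fcfs_order(requests):
    """First-come-first-serve: process requests in arrival order."""
    return sorted(requests, key=lambda r: r["arrival"])


def rank_order(requests, predicted_score):
    """Approximate shortest-job-first using only relative rank scores.

    A lower score means the ranker expects a shorter output.
    Ties fall back to arrival order.
    """
    return sorted(requests, key=lambda r: (predicted_score[r["id"]], r["arrival"]))


def mean_completion_time(ordered):
    """Average finish time when requests run one after another."""
    clock = 0
    total = 0
    for r in ordered:
        clock += r["length"]  # true output length, unknown to the scheduler
        total += clock
    return total / len(ordered)


# Toy workload: a long request arrives first and blocks two short ones.
requests = [
    {"id": 0, "arrival": 0, "length": 100},
    {"id": 1, "arrival": 1, "length": 5},
    {"id": 2, "arrival": 2, "length": 10},
]
# Hypothetical ranker output: only the ordering matters, not the values.
predicted_score = {0: 0.9, 1: 0.1, 2: 0.3}

fcfs_mean = mean_completion_time(fcfs_order(requests))
rank_mean = mean_completion_time(rank_order(requests, predicted_score))
```

In this toy case the long request at the head of the FCFS queue delays both short ones, which is the head-of-line blocking the abstract refers to. The rank-based order avoids it without ever knowing the true lengths.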

artificial intelligence, large language model, natural language, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

f0d629a734b56a642701bba7bc8bb3ed-Paper-Conference.pdf

Neural Information Processing Systems, Feb-18-2026, 15:56:16 GMT

arxiv preprint arxiv, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Ohio > Cuyahoga County > Cleveland (0.04)
Europe > Slovenia > Drava > Municipality of Maribor > Maribor (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.92)
Government > Regional Government > Asia Government > North Korea Government (0.46)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

c4e3b55ed4ac9ba52d7df11f8bddbbf4-Supplemental-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems, Feb-18-2026, 01:00:52 GMT

artificial intelligence, machine learning, superconductor, (18 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Austria > Vienna (0.04)
Africa > Chad > Salamat (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Supplemental Material A Proof for proposition

Neural Information Processing Systems, Feb-16-2026, 16:33:17 GMT

Reversing the process is not immediately obvious and thus several schedulers were proposed [23, 26, 31, 58]. In this paper, we employ the DDIM [58] scheduler, a popular deterministic scheduler. Other deterministic schedulers would also be suitable, and we show in Section I below that our method performs well with other schedulers.
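What "deterministic" means for DDIM can be seen from its update rule with the stochasticity parameter set to zero. The sketch below is an illustration and not code from the paper: it works on plain Python lists, takes the noise prediction `eps` as an argument instead of calling a network, and uses `abar` for the cumulative noise-schedule product. The function name is hypothetical.

```python
import math


def ddim_step(x_t, eps, abar_t, abar_next):
    """One deterministic DDIM update (eta = 0) between two noise levels.

    x_t       : current sample, as a list of floats
    eps       : noise prediction for x_t (passed in here; normally a model output)
    abar_t    : cumulative alpha product at the current step
    abar_next : cumulative alpha product at the target step
    """
    # Estimate the clean sample implied by x_t and the predicted noise.
    x0_pred = [
        (x - math.sqrt(1.0 - abar_t) * e) / math.sqrt(abar_t)
        for x, e in zip(x_t, eps)
    ]
    # Re-noise that estimate to the target level with the same predicted noise.
    return [
        math.sqrt(abar_next) * x0 + math.sqrt(1.0 - abar_next) * e
        for x0, e in zip(x0_pred, eps)
    ]


x = [0.5, -1.2]
eps = [0.1, 0.3]
less_noisy = ddim_step(x, eps, abar_t=0.5, abar_next=0.8)         # denoising direction
recovered = ddim_step(less_noisy, eps, abar_t=0.8, abar_next=0.5)  # reverse direction
```

Because no random noise is injected, the same formula can be run in the opposite direction. With an identical noise prediction at both ends, as in this toy example, the original sample is recovered exactly. In practice the prediction differs slightly between steps, so reversal is only approximate.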

artificial intelligence, machine learning, scheduler, (19 more...)

Neural Information Processing Systems

Technology: