AITopics

Country:

Asia > Singapore (0.04)
North America > United States (0.04)
North America > Canada (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.90)

Neural Information Processing SystemsDec-23-2025, 18:46:31 GMT

Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning

Priority dispatching rule (PDR) is widely used for solving real-world Job-shop scheduling problem (JSSP). However, the design of effective PDRs is a tedious task, requiring a myriad of specialized knowledge and often delivering limited performance. In this paper, we propose to automatically learn PDRs via an end-to-end deep reinforcement learning agent. We exploit the disjunctive graph representation of JSSP, and propose a Graph Neural Network based scheme to embed the states encountered during solving. The resulting policy network is size-agnostic, effectively enabling generalization on large-scale instances. Experiments show that the agent can learn high-quality PDRs from scratch with elementary raw features, and demonstrates strong performance against the best existing PDRs. The learned policies also perform well on much larger instances that are unseen in training.

deep reinforcement learning, job shop scheduling, learning, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing SystemsOct-2-2025, 02:53:05 GMT

11958dfee29b6709f48a9ba0387a2431-Paper.pdf

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: Asia (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Robots (0.68)

Neural Information Processing SystemsOct-2-2025, 02:52:54 GMT

11958dfee29b6709f48a9ba0387a2431-AuthorFeedback.pdf

artificial intelligence, final version, machine learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.73)

arXiv.org Artificial IntelligenceOct-2-2025

Rethinking Thinking Tokens: LLMs as Improvement Operators

Madaan, Lovish, Didolkar, Aniket, Gururangan, Suchin, Quan, John, Silva, Ruan, Salakhutdinov, Ruslan, Zaheer, Manzil, Arora, Sanjeev, Goyal, Anirudh

Reasoning training incentivizes LLMs to produce long chains of thought (long CoT), which among other things, allows them to explore solution strategies with self-checking. This results in higher accuracy, but inflates context length, token/compute cost, and answer latency. We ask: Can current models leverage their metacognition to provide other combinations on this Pareto frontier, e.g., better accuracy with lower context length and/or latency? Abstractly, we view the model as an improvement operator on its own "thoughts" with a continuum of possible strategies. We identify an interesting inference family Parallel-Distill-Refine (PDR), which performs the following: (i) generate diverse drafts in parallel; (ii) distill them into a bounded, textual workspace; and (iii) refine conditioned on this workspace, producing an output that seeds the next round. Importantly, context length (hence compute cost) is controllable via degree of parallelism, and is no longer conflated with the total number of generated tokens. We report PDR instantiations of current models that give better accuracy than long CoT while incurring lower latency. Setting degree of parallelism to 1 yields an interesting subcase, Sequential Refinement (SR) (iteratively improve a single candidate answer) which provides performance superior to long CoT. Success of such model orchestrations raises the question whether further training could shift the Pareto frontier. To this end, we train an 8B thinking model with Reinforcement Learning (RL) to make it consistent with PDR as the inference method. On math tasks with verifiable answers, iterative pipelines surpass single-pass baselines at matched sequential budgets, with PDR delivering the largest gains (e.g., +11% on AIME 2024 and +9% on AIME 2025).

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2510.01123

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

arXiv.org Artificial IntelligenceSep-10-2025

An efficient deep reinforcement learning environment for flexible job-shop scheduling

Wu, Xinquan, Yan, Xuefeng, Wei, Mingqiang, Guan, Donghai

The Flexible Job-shop Scheduling Problem (FJSP) is a classical combinatorial optimization problem that has a wide-range of applications in the real world. In order to generate fast and accurate scheduling solutions for FJSP, various deep reinforcement learning (DRL) scheduling methods have been developed. However, these methods are mainly focused on the design of DRL scheduling Agent, overlooking the modeling of DRL environment. This paper presents a simple chronological DRL environment for FJSP based on discrete event simulation and an end-to-end DRL scheduling model is proposed based on the proximal policy optimization (PPO). Furthermore, a short novel state representation of FJSP is proposed based on two state variables in the scheduling environment and a novel comprehensible reward function is designed based on the scheduling area of machines. Experimental results on public benchmark instances show that the performance of simple priority dispatching rules (PDR) is improved in our scheduling environment and our DRL scheduling model obtains competing performance compared with OR-Tools, meta-heuristic, DRL and PDR scheduling methods.

artificial intelligence, machine learning, reinforcement learning, (21 more...)

2509.07019

Country: Asia > China (0.15)

Genre: Research Report (0.64)

Industry: Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceMar-11-2025

Automated Retinal Layer and Fluid Segmentation and Cross-sectional Analysis using Spectral Domain Optical Coherence Tomography Images for Diabetic Retinopathy

Chen, S., Ma, D., Raviselvan, M., Sundaramoorthy, S., Popuri, K., Ju, M. J., Sarunic, M. V., Ratra, D., Beg, M. F.

This study presents an AI-driven pipeline for automated retinal segmentation and thickness analysis in diabetic retinopathy (DR) using SD-OCT imaging. A deep neural network was trained to segment ten retinal layers, intra-retinal fluid, and hyperreflective foci (HRF), with performance evaluated across multiple architectures. SwinUNETR achieved the highest segmentation accuracy, while VM-Unet excelled in specific layers. Analysis revealed distinct thickness variations between NPDR and PDR, with correlations between layer thickness and visual acuity. The proposed method enhances DR assessment by reducing manual annotation effort and providing clinically relevant thickness maps for disease monitoring and treatment planning.

segmentation, thickness, vm-unet, (13 more...)

2503.01248

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > North Carolina > Forsyth County > Winston-Salem (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)

arXiv.org Artificial IntelligenceFeb-1-2025

Evolutionary Power-Aware Routing in VANETs using Monte-Carlo Simulation

Toutouh, J., Nesmachnow, S., Alba, E.

This work addresses the reduction of power consumption of the AODV routing protocol in vehicular networks as an optimization problem. Nowadays, network designers focus on energy-aware communication protocols, specially to deploy wireless networks. Here, we introduce an automatic method to search for energy-efficient AODV configurations by using an evolutionary algorithm and parallel Monte-Carlo simulations to improve the accuracy of the evaluation of tentative solutions. The experimental results demonstrate that significant power consumption improvements over the standard configuration can be attained, with no noteworthy loss in the quality of service.

artificial intelligence, evolutionary algorithm, machine learning, (19 more...)

doi: 10.1109/HPCSim.2012.6266900

2502.10417

Country:

South America > Uruguay (0.05)
Europe > Spain > Andalusia > Málaga Province > Málaga (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Telecommunications > Networks (0.68)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Boveroux, Laurie, Ernst, Damien, Louveaux, Quentin

Investigating the Monte-Carlo Tree Search Approach for the Job Shop Scheduling Problem

arXiv.org Artificial IntelligenceJan-29-2025

The Job Shop Scheduling Problem (JSSP) is a well-known optimization problem in manufacturing, where the goal is to determine the optimal sequence of jobs across different machines to minimize a given objective. In this work, we focus on minimising the weighted sum of job completion times. We explore the potential of Monte Carlo Tree Search (MCTS), a heuristic-based reinforcement learning technique, to solve large-scale JSSPs, especially those with recirculation. We propose several Markov Decision Process (MDP) formulations to model the JSSP for the MCTS algorithm. In addition, we introduce a new synthetic benchmark derived from real manufacturing data, which captures the complexity of large, non-rectangular instances often encountered in practice. Our experimental results show that MCTS effectively produces good-quality solutions for large-scale JSSP instances, outperforming our constraint programming approach.

artificial intelligence, machine learning, opération, (20 more...)

2501.17991

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Neural Information Processing SystemsOct-9-2024, 13:35:53 GMT

Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning

Priority dispatching rule (PDR) is widely used for solving real-world Job-shop scheduling problem (JSSP). However, the design of effective PDRs is a tedious task, requiring a myriad of specialized knowledge and often delivering limited performance. In this paper, we propose to automatically learn PDRs via an end-to-end deep reinforcement learning agent. We exploit the disjunctive graph representation of JSSP, and propose a Graph Neural Network based scheme to embed the states encountered during solving. The resulting policy network is size-agnostic, effectively enabling generalization on large-scale instances.

artificial intelligence, machine learning, reinforcement learning, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)