

TextDiffuser: Diffusion Models as Text Painters

Neural Information Processing Systems

TextDiffuser consists of two stages: first, a Transformer model generates the layout of keywords extracted from text prompts, and then diffusion models generate images conditioned on the text prompt and the generated layout.
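The two-stage pipeline summarised above can be sketched as a minimal skeleton. Everything below is a hypothetical stand-in, not the authors' models: the left-to-right tiling heuristic plays the role of the layout Transformer, and the dictionary return plays the role of the diffusion model's conditioned output.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class KeywordBox:
    """A bounding box for one keyword in the generated layout."""
    text: str
    x: int
    y: int
    w: int
    h: int

def generate_layout(prompt: str, keywords: List[str]) -> List[KeywordBox]:
    """Stage 1 (stand-in): a Transformer would predict a box per keyword.
    Here we simply tile boxes left to right as a placeholder."""
    return [KeywordBox(k, x=10 + 120 * i, y=20, w=110, h=40)
            for i, k in enumerate(keywords)]

def generate_image(prompt: str, layout: List[KeywordBox]) -> dict:
    """Stage 2 (stand-in): a diffusion model would denoise an image
    conditioned on the prompt and the layout. We just echo the inputs."""
    return {"prompt": prompt,
            "boxes": [(b.text, b.x, b.y, b.w, b.h) for b in layout]}

layout = generate_layout("a poster that says SALE TODAY", ["SALE", "TODAY"])
image = generate_image("a poster that says SALE TODAY", layout)
```

The point of the skeleton is the interface: stage 2 consumes both the original prompt and the layout produced by stage 1.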





Position-based Scaled Gradient for Model Quantization and Pruning - Appendix

Neural Information Processing Systems

In this experiment, we quantize only the weights, not the activations, to compare the performance degradation as the weight bit-width decreases. The mean squared errors (MSE) of the weights across different bit-widths are also reported. Each column shows the layer name with its parameter count in parentheses. All numbers are results of the last epoch. Table A3: ResNet-32 trained with Adam on the CIFAR-100 dataset.
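The weight-only setup can be illustrated with a small sketch. Uniform symmetric per-tensor quantization is an assumption here (the paper's scheme may differ), applied to a random weight tensor while reporting MSE as the bit-width drops:

```python
import numpy as np

def quantize_weights(w, bits):
    """Uniform symmetric quantization of a weight tensor to the given bit-width."""
    qmax = 2 ** (bits - 1) - 1           # e.g. 127 for 8-bit
    scale = np.max(np.abs(w)) / qmax     # per-tensor scale (an assumption)
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale                     # dequantized weights

# Hypothetical layer weights; report MSE as the bit-width decreases.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(64, 64))
for bits in (8, 6, 4, 2):
    mse = np.mean((w - quantize_weights(w, bits)) ** 2)
    print(f"{bits}-bit MSE: {mse:.2e}")
```

As expected, the MSE grows as the bit-width shrinks, which is the degradation trend the table tracks.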


4aa13186c795a52ba88f5b822f4b77eb-Paper-Conference.pdf

Neural Information Processing Systems

Therefore, estimating how well a given model might perform on the new data is an important step toward reliable ML applications. This is very challenging, however, as the data distribution can change in flexible ways, and we may not have any labels on the new data, which is often the case in monitoring settings. In this paper, we propose a new distribution shift model, Sparse Joint Shift (SJS), which considers the joint shift of both labels and a few features.
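As a toy illustration of the kind of shift SJS describes (the numbers and the rejection-sampling construction below are invented for illustration), one can build a target set where only the joint distribution of the label and a single feature changes, while the conditional distribution of the remaining features is shared between source and target:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
y = rng.integers(0, 2, size=n)            # binary label
x_shift = rng.normal(y, 1.0)              # the one feature involved in the shift
x_other = rng.normal(0.0, 1.0, size=n)    # feature whose conditional stays fixed

# Illustrative sparse joint shift: subsample so that the joint distribution
# of (y, x_shift) changes, tilting the target toward y = 1.
keep_prob = np.where(y == 1, 0.9, 0.3)    # hypothetical tilt
mask = rng.random(n) < keep_prob

print("source P(y=1):", y.mean())
print("target P(y=1):", y[mask].mean())
```

The "sparse" part is that only one feature (plus the label) participates in the shift; everything else is unchanged conditionally.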


45c166d697d65080d54501403b433256-AuthorFeedback.pdf

Neural Information Processing Systems

The reviewers acknowledge that the ideas presented in the paper are compelling, sound, and appear to be effective (R3), offering a great addition to the GP literature (R1), which is also supported by a solid and interesting theoretical foundation (R2, R4). Existing multi-output GP models are not applicable to our setting (see lines 79-83) and are thus not comparable to the DAG-GP model. We have further clarified this point in Section 1.2.


Generalised Mutual Information for Discriminative Clustering

Neural Information Processing Systems

All GEMINIs are summarised in Table 1 (see Appendix D for derivations). Figure 2: Clustering of a mixture of 3 Gaussian distributions with MI (left) and a GEMINI (right) using categorical distributions.
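The mutual-information objective that GEMINIs generalise can, for soft cluster assignments p(y|x), be written as I(X;Y) = H(Y) - H(Y|X). A minimal sketch of this standard decomposition (illustrative only, not the paper's GEMINI estimators):

```python
import numpy as np

def mutual_information(p_y_given_x):
    """MI objective for discriminative clustering from soft assignments.
    Rows are samples, columns are clusters; I(X;Y) = H(Y) - H(Y|X)."""
    eps = 1e-12
    p_y = p_y_given_x.mean(axis=0)                   # cluster marginal
    h_marginal = -np.sum(p_y * np.log(p_y + eps))    # H(Y)
    h_cond = -np.mean(                               # H(Y|X)
        np.sum(p_y_given_x * np.log(p_y_given_x + eps), axis=1))
    return h_marginal - h_cond

# Confident, balanced assignments give high MI; uniform ones give ~0.
confident = np.eye(3)[np.array([0, 1, 2, 0, 1, 2])]
uniform = np.full((6, 3), 1.0 / 3.0)
print(mutual_information(confident))   # approx. log 3
print(mutual_information(uniform))     # approx. 0
```

Maximising this objective pushes the model toward confident per-sample assignments while keeping clusters balanced overall.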


TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning

Ni, Hang, Liu, Fan, Ma, Xinyu, Su, Lixin, Wang, Shuaiqiang, Yin, Dawei, Xiong, Hui, Liu, Hao

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown promise in automating travel planning, yet they often fall short in addressing nuanced spatiotemporal rationality. While existing benchmarks focus on basic plan validity, they neglect critical aspects such as route efficiency, POI appeal, and real-time adaptability. This paper introduces TP-RAG, the first benchmark tailored for retrieval-augmented, spatiotemporal-aware travel planning. Our dataset includes 2,348 real-world travel queries, 85,575 fine-grained annotated POIs, and 18,784 high-quality travel trajectory references sourced from online tourist documents, enabling dynamic and context-aware planning. Through extensive experiments, we reveal that integrating reference trajectories significantly improves the spatial efficiency and POI rationality of the travel plan, while challenges persist in universality and robustness due to conflicting references and noisy data. To address these issues, we propose EvoRAG, an evolutionary framework that potently synergizes diverse retrieved trajectories with LLMs' intrinsic reasoning. EvoRAG achieves state-of-the-art performance, improving spatiotemporal compliance and reducing commonsense violations compared to ground-up and retrieval-augmented baselines. Our work underscores the potential of hybridizing Web knowledge with LLM-driven optimization, paving the way for more reliable and adaptive travel planning agents.


LLM-Upgraded Graph Reinforcement Learning for Carbon-Aware Job Scheduling in Smart Manufacturing

Yang, Zhiying, Liu, Fang, Zhang, Wei, Lou, Xin, Low, Malcolm Yoke Hean, Gan, Boon Ping

arXiv.org Artificial Intelligence

This paper presents Luca, a large language model (LLM)-upgraded graph reinforcement learning framework for carbon-aware flexible job shop scheduling. Luca addresses the challenges of dynamic and sustainable scheduling in smart manufacturing systems by integrating a graph neural network and an LLM, guided by a carefully designed in-house prompting strategy, to produce a fused embedding that captures both structural characteristics and contextual semantics of the latest scheduling state. This expressive embedding is then processed by a deep reinforcement learning policy network, which generates real-time scheduling decisions optimized for both makespan and carbon emission objectives. To support sustainability goals, Luca incorporates a dual-objective reward function that encourages both energy efficiency and scheduling timeliness. Experimental results on both synthetic and public datasets demonstrate that Luca consistently outperforms comparison algorithms. For instance, on the synthetic dataset, it achieves an average of 4.1% and up to 12.2% lower makespan compared to the best-performing comparison algorithm while maintaining the same emission level. On public datasets, additional gains are observed for both makespan and emission. These results demonstrate that Luca is effective and practical for carbon-aware scheduling in smart manufacturing.
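A dual-objective reward of this kind could take many forms. One simple sketch (the weights and reference scales below are invented, not Luca's actual formulation) is a weighted sum of normalized makespan and carbon penalties:

```python
def dual_objective_reward(makespan, carbon,
                          w_time=0.5, w_carbon=0.5,
                          makespan_ref=100.0, carbon_ref=50.0):
    """Reward is higher (less negative) for schedules that finish sooner
    and emit less carbon; the weights trade off the two objectives."""
    return -(w_time * makespan / makespan_ref
             + w_carbon * carbon / carbon_ref)

# A schedule that is both faster and cleaner receives the better reward.
print(dual_objective_reward(makespan=90, carbon=40))
print(dual_objective_reward(makespan=110, carbon=55))
```

Shifting the weights toward `w_carbon` would steer the policy toward lower-emission schedules at the possible cost of a longer makespan.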