AITopics

Country: North America > United States > Illinois (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Neural Information Processing SystemsFeb-17-2026, 13:13:18 GMT

ded98d28f82342a39f371c013dfb3058-Paper-Conference.pdf

machine learning, natural language, unet, (18 more...)

Country:

Asia > Middle East > Israel (0.04)
Asia > China > Guangdong Province (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(2 more...)

Neural Information Processing SystemsFeb-11-2026, 14:53:23 GMT

Least Square Calibrationfor Peer Reviews

Our procedurepaper in include 22] in demonstrates beanon-ne 29,24].

artificial intelligence, lsc, theorem 3, (10 more...)

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.15)
Asia > Singapore (0.05)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)

Neural Information Processing SystemsDec-26-2025, 23:35:15 GMT

ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection

In diffusion models, UNet is the most popular network backbone, since its long skip connects (LSCs) to connect distant network blocks can aggregate long-distant information and alleviate vanishing gradient. Unfortunately, UNet often suffers from unstable training in diffusion models which can be alleviated by scaling its LSC coefficients smaller. However, theoretical understandings of the instability of UNet in diffusion models and also the performance improvement of LSC scaling remain absent yet. To solve this issue, we theoretically show that the coefficients of LSCs in UNet have big effects on the stableness of the forward and backward propagation and robustness of UNet. Specifically, the hidden feature and gradient of UNet at any layer can oscillate and their oscillation ranges are actually large which explains the instability of UNet training. Moreover, UNet is also provably sensitive to perturbed input, and predicts an output distant from the desired output, yielding oscillatory loss and thus oscillatory gradient. Besides, we also observe the theoretical benefits of the LSC coefficient scaling of UNet in the stableness of hidden features and gradient and also robustness. Finally, inspired by our theory, we propose an effective coefficient scaling framework ScaleLong that scales the coefficients of LSC in UNet and better improve the training stability of UNet. Experimental results on CIFAR10, CelebA, ImageNet and COCO show that our methods are superior to stabilize training, and yield about 1.5x training acceleration on different diffusion models with UNet or UViT backbones.

diffusion model, scalelong, unet, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Formica, Federico, Gregis, Stefano, Zanenga, Aurora Francesca, Rota, Andrea, Lawford, Mark, Menghi, Claudio

Feature-Guided Analysis of Neural Networks: A Replication Study

arXiv.org Artificial IntelligenceNov-4-2025

Understanding why neural networks make certain decisions is pivotal for their use in safety-critical applications. Feature-Guided Analysis (FGA) extracts slices of neural networks relevant to their tasks. Existing feature-guided approaches typically monitor the activation of the neural network neurons to extract the relevant rules. Preliminary results are encouraging and demonstrate the feasibility of this solution by assessing the precision and recall of Feature-Guided Analysis on two pilot case studies. However, the applicability in industrial contexts needs additional empirical evidence. To mitigate this need, this paper assesses the applicability of FGA on a benchmark made by the MNIST and LSC datasets. We assessed the effectiveness of FGA in computing rules that explain the behavior of the neural network. Our results show that FGA has a higher precision on our benchmark than the results from the literature. We also evaluated how the selection of the neural network architecture, training, and feature selection affect the effectiveness of FGA. Our results show that the selection significantly affects the recall of FGA, while it has a negligible impact on its precision.

artificial intelligence, deep learning, machine learning, (19 more...)

2511.00052

Country:

North America > United States (0.28)
North America > Canada (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Pyati, Prajyot, Kaur, Navjot, Jana, Saswata, Bhattacharya, Adri, Mandal, Partha Sarathi

Separation of Unconscious Robots with Obstructed Visibility

arXiv.org Artificial IntelligenceOct-28-2025

We study a recently introduced \textit{unconscious} mobile robot model, where each robot is associated with a \textit{color}, which is visible to other robots but not to itself. The robots are autonomous, anonymous, oblivious and silent, operating in the Euclidean plane under the conventional \textit{Look-Compute-Move} cycle. A primary task in this model is the \textit{separation problem}, where unconscious robots sharing the same color must separate from others, forming recognizable geometric shapes such as circles, points, or lines. All prior works model the robots as \textit{transparent}, enabling each to know the positions and colors of all other robots. In contrast, we model the robots as \textit{opaque}, where a robot can obstruct the visibility of two other robots, if it lies on the line segment between them. Under this obstructed visibility, we consider a variant of the separation problem in which robots, starting from any arbitrary initial configuration, are required to separate into concentric semicircles. We present a collision-free algorithm that solves the separation problem under a semi-synchronous scheduler in $O(n)$ epochs, where $n$ is the number of robots. The robots agree on one coordinate axis but have no knowledge of $n$.

artificial intelligence, robot, semicircle, (17 more...)

2510.22434

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Neural Information Processing SystemsOct-9-2025, 09:30:51 GMT

ded98d28f82342a39f371c013dfb3058-Paper-Conference.pdf

machine learning, natural language, unet, (18 more...)

Country:

Asia > Middle East > Israel (0.04)
Asia > China > Guangdong Province (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(2 more...)

arXiv.org Artificial IntelligenceJun-3-2025

Efficient Latent Semantic Clustering for Scaling Test-Time Computation of LLMs

Lee, Sungjae, Kim, Hoyoung, Hwang, Jeongyeon, Park, Eunhyeok, Ok, Jungseul

Scaling test-time computation--generating and analyzing multiple or sequential outputs for a single input--has become a promising strategy for improving the reliability and quality of large language models (LLMs), as evidenced by advances in uncertainty quantification and multi-step reasoning. A key shared component is semantic clustering, which groups outputs that differ in form but convey the same meaning. Semantic clustering enables estimation of the distribution over the semantics of outputs and helps avoid redundant exploration of reasoning paths. However, existing approaches typically rely on external models, which introduce substantial computational overhead and often fail to capture context-aware semantics. We propose Latent Semantic Clustering (LSC), a lightweight and context-sensitive method that leverages the generator LLM's internal hidden states for clustering, eliminating the need for external models. Our extensive experiment across various LLMs and datasets shows that LSC significantly improves the computational efficiency of test-time scaling while maintaining or exceeding the performance of existing methods.

large language model, machine learning, natural language, (17 more...)

2506.00344

Country: Asia (0.93)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Chulev, Joanikij, Mladenovska, Angela

Line Space Clustering (LSC): Feature-Based Clustering using K-medians and Dynamic Time Warping for Versatility

arXiv.org Artificial IntelligenceMar-19-2025

Clustering high-dimensional data is a critical challenge in machine learning due to the curse of dimensionality and the presence of noise. Traditional clustering algorithms often fail to capture the intrinsic structures in such data. This paper explores a combination of clustering methods, which we called Line Space Clustering (LSC), a representation that transforms data points into lines in a newly defined feature space, enabling clustering based on the similarity of feature value patterns, essentially treating features as sequences. LSC employs a combined distance metric that uses Euclidean and Dynamic Time Warping (DTW) distances, weighted by a parameter {\alpha}, allowing flexibility in emphasizing shape or magnitude similarities. We delve deeply into the mechanics of DTW and the Savitzky Golay filter, explaining their roles in the algorithm. Extensive experiments demonstrate the efficacy of LSC on synthetic and real-world datasets, showing that randomly experimenting with time-series optimized methods sometimes might surprisingly work on a complex dataset, particularly in noisy environments. Source code and experiments are available at: https://github.com/JoanikijChulev/LSC.

artificial intelligence, data mining, machine learning, (14 more...)

2503.15777

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)