AITopics | allocation strategy

Collaborating Authors

allocation strategy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

VLM-SFD: VLM-Assisted Siamese Flow Diffusion Framework for Dual-Arm Cooperative Manipulation

Chen, Jiaming, Jiang, Yiyu, Huang, Aoshen, Li, Yang, Pan, Wei

arXiv.org Artificial IntelligenceNov-24-2025

Dual-arm cooperative manipulation holds great promise for tackling complex real-world tasks that demand seamless coordination and adaptive dynamics. Despite substantial progress in learning-based motion planning, most approaches struggle to generalize across diverse manipulation tasks and adapt to dynamic, unstructured environments, particularly in scenarios involving interactions between two objects such as assembly, tool use, and bimanual grasping. To address these challenges, we introduce a novel VLM-Assisted Siamese Flow Diffusion (VLM-SFD) framework for efficient imitation learning in dual-arm cooperative manipulation. The proposed VLM-SFD framework exhibits outstanding adaptability, significantly enhancing the ability to rapidly adapt and generalize to diverse real-world tasks from only a minimal number of human demonstrations. Specifically, we propose a Siamese Flow Diffusion Network (SFDNet) employs a dual-encoder-decoder Siamese architecture to embed two target objects into a shared latent space, while a diffusion-based conditioning process - conditioned by task instructions - generates two-stream object-centric motion flows that guide dual-arm coordination. We further design a dynamic task assignment strategy that seamlessly maps the predicted 2D motion flows into 3D space and incorporates a pre-trained vision-language model (VLM) to adaptively assign the optimal motion to each robotic arm over time. Experiments validate the effectiveness of the proposed method, demonstrating its ability to generalize to diverse manipulation tasks while maintaining high efficiency and adaptability. The code and demo videos are publicly available on our project website https://sites.google.com/view/vlm-sfd/.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2025.3627381

2506.13428

Country:

Europe > United Kingdom (0.14)
Asia > China (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Non-Uniform Class-Wise Coreset Selection for Vision Model Fine-tuning

Zhang, Hanyu, Xing, Zhen, He, Ruian, Yang, Wenxuan, Ma, Chenxi, Tan, Weimin, Yan, Bo

arXiv.org Artificial IntelligenceNov-19-2025

Coreset selection aims to identify a small yet highly informative subset of data, thereby enabling more efficient model training while reducing storage overhead. Recently, this capability has been leveraged to tackle the challenges of fine-tuning large foundation models, offering a direct pathway to their efficient and practical deployment. However, most existing methods are class-agnostic, causing them to overlook significant difficulty variations among classes. This leads them to disproportionately prune samples from either overly easy or hard classes, resulting in a suboptimal allocation of the data budget that ultimately degrades the final coreset performance. T o address this limitation, we propose Non-Uniform Class-Wise Coreset Selection (NUCS), a novel framework that both integrates class-level and sample-level difficulty. W e propose a robust metric for global class difficulty, quantified as the winsorized average of per-sample difficulty scores. Guided by this metric, our method performs a theoretically-grounded, nonuniform allocation of data selection budgets inter-class, while adaptively selecting samples intra-class with optimal difficulty ranges. Extensive experiments on a wide range of visual classification tasks demonstrate that NUCS consistently outperforms state-of-the-art methods across 10 diverse datasets and pre-trained models, achieving both superior accuracy and computational efficiency, highlighting the promise of non-uniform class-wise selection strategy for advancing the efficient fine-tuning of large foundation models.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2504.13234

Country:

Asia > China > Shanghai > Shanghai (0.40)
North America > United States > Florida > Miami-Dade County > Miami (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Key Decision-Makers in Multi-Agent Debates: Who Holds the Power?

Zhang, Qian, Zheng, Yan, Liu, Jinyi, Liang, Hebin, Wang, Lanjun

arXiv.org Artificial IntelligenceNov-17-2025

Recent studies on LLM agent scaling have highlighted the potential of Multi-Agent Debate (MAD) to enhance reasoning abilities. However, the critical aspect of role allocation strategies remains underexplored. In this study, we demonstrate that allocating roles with differing viewpoints to specific positions significantly impacts MAD's performance in reasoning tasks. Specifically, we find a novel role allocation strategy, "Truth Last", which can improve MAD performance by up to 22% in reasoning tasks. To address the issue of unknown truth in practical applications, we propose the Multi-Agent Debate Consistency (MADC) strategy, which systematically simulates and optimizes its core mechanisms. MADC incorporates path consistency to assess agreement among independent roles, simulating the role with the highest consistency score as the truth. We validated MADC across a range of LLMs (9 models), including the DeepSeek-R1 Distilled Models, on challenging reasoning tasks. MADC consistently demonstrated advanced performance, effectively overcoming MAD's performance bottlenecks and providing a crucial pathway for further improvements in LLM agent scaling.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.1104

Country:

Europe > Austria > Vienna (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Add feedback

SDQ-LLM: Sigma-Delta Quantization for 1-bit LLMs of any size

Xia, Junhao, Zhao, Ming, Xiao, Limin, Zhang, Xiujun

arXiv.org Artificial IntelligenceOct-7-2025

Large language models (LLMs) face significant computational and memory challenges, making extremely low-bit quantization crucial for their efficient deployment. In this work, we introduce SDQ-LLM: Sigma-Delta Quantization for 1-bit LLMs of any size, a novel framework that enables extremely low-bit quantization of LLMs while preserving their linguistic reasoning capabilities. A distinctive feature of SDQ-LLM is the continuous adjustability of the Over-Sampling Ratio (OSR), enabling dynamic adaptation to memory or VRAM constraints by selecting fractional OSR (e.g. 2.5 times) for an optimal trade-off between model size and accuracy. SDQ-LLM uses upsampling combined with Sigma-Delta Quantizer to binarize or ternarize LLMs weights, encoding high-precision parameters into 1-bit or 1.58-bit representations, replacing the multiplication operations within linear layers with addition. This approach significantly enhances inference efficiency under extremely low-bit quantization. To further reduce the loss of quantization precision, we incorporate Hadamard-based weight smoothing prior to quantization, improving the stability and robustness of the weight representations. Furthermore, to fully leverage the continuity of the OSR and reduce precision loss, recognizing the correlation between quantization sensitivity and weight variance, we propose a fine-grained, layer- and linear-wise OSR allocation strategy, MultiOSR. This strategy distributes OSR both across layers and within each layer, based on weight variance and parameter scale. Finally, extensive experiments on OPT and LLaMA model families demonstrate that SDQ-LLM achieves a more efficient and high-precision performance even under highly aggressive low-OSR settings. Our code is available at https://github.com/Dreamlittlecat/LLM-Quant-Factory.

large language model, natural language, quantization, (15 more...)

arXiv.org Artificial Intelligence

2510.03275

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing SystemsOct-3-2025, 05:56:49 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. In this problem there is a hidden unknown parameter \theta*\in R^d and a finite set of arms X\subseteq R^d. When an arm is pulled you observe a reward x^T\theta*+epsilon where epsilon is a zero mean i.i.d noise with bounded range. The goal is to identify argmax_{x\in X} x^T\theta* with the least number of samples. Results: The paper characterizes the sample complexity of static and dynamic allocation strategies to identify the best arm.

algorithm, complexity, identification, (13 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

Add feedback

Learning-Based Resource Management in Integrated Sensing and Communication Systems

Lu, Ziyang, Gursoy, M. Cenk, Mohan, Chilukuri K., Varshney, Pramod K.

arXiv.org Artificial IntelligenceJun-27-2025

-- In this paper, we tackle the task of adaptive time allocation in integrated sensing and communication systems equipped with radar and communication units. The dual-functional radar-communication system's task involves allocating dwell times for tracking multiple targets and utilizing the remaining time for data transmission towards estimated target locations. We introduce a novel constrained deep reinforcement learning (CDRL) approach, designed to optimize resource allocation between tracking and communication under time budget constraints, thereby enhancing target communication quality. Our numerical results demonstrate the efficiency of our proposed CDRL framework, confirming its ability to maximize communication quality in highly dynamic environments while adhering to time constraints. A. Background 1) Cognitive Radar: Radar technology, integral to various applications in environmental sensing, space exploration, navigation, and traffic control, has become increasingly important with the emergence of autonomous vehicles and drones.

communication, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2506.20849

Country: North America > United States > New York > Onondaga County > Syracuse (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

ROS 2 Agnocast: Supporting Unsized Message Types for True Zero-Copy Publish/Subscribe IPC

Ishikawa-Aso, Takahiro, Kato, Shinpei

arXiv.org Artificial IntelligenceJun-23-2025

Robot applications, comprising independent components that mutually publish/subscribe messages, are built on inter-process communication (IPC) middleware such as Robot Operating System 2 (ROS 2). In large-scale ROS 2 systems like autonomous driving platforms, true zero-copy communication -- eliminating serialization and deserialization -- is crucial for efficiency and real-time performance. However, existing true zero-copy middleware solutions lack widespread adoption as they fail to meet three essential requirements: 1) Support for all ROS 2 message types including unsized ones; 2) Minimal modifications to existing application code; 3) Selective implementation of zero-copy communication between specific nodes while maintaining conventional communication mechanisms for other inter-node communications including inter-host node communications. This first requirement is critical, as production-grade ROS 2 projects like Autoware rely heavily on unsized message types throughout their codebase to handle diverse use cases (e.g., various sensors), and depend on the broader ROS 2 ecosystem, where unsized message types are pervasive in libraries. The remaining requirements facilitate seamless integration with existing projects. While IceOryx middleware, a practical true zero-copy solution, meets all but the first requirement, other studies achieving the first requirement fail to satisfy the remaining criteria. This paper presents Agnocast, a true zero-copy IPC framework applicable to ROS 2 C++ on Linux that fulfills all these requirements. Our evaluation demonstrates that Agnocast maintains constant IPC overhead regardless of message size, even for unsized message types. In Autoware PointCloud Preprocessing, Agnocast achieves a 16% improvement in average response time and a 25% improvement in worst-case response time.

artificial intelligence, message type, ro 2, (18 more...)

arXiv.org Artificial Intelligence

2506.16882

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.40)
Europe (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.49)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.35)

Add feedback

Optimizing Latent Dimension Allocation in Hierarchical VAEs: Balancing Attenuation and Information Retention for OOD Detection

Williamson, Dane, Ji, Yangfeng, Dwyer, Matthew

arXiv.org Artificial IntelligenceJun-13-2025

Out-of-distribution (OOD) detection is a critical task in machine learning, particularly for safety-critical applications where unexpected inputs must be reliably flagged. While hierarchical variational autoencoders (HVAEs) offer improved representational capacity over traditional VAEs, their performance is highly sensitive to how latent dimensions are distributed across layers. Existing approaches often allocate latent capacity arbitrarily, leading to ineffective representations or posterior collapse. In this work, we introduce a theoretically grounded framework for optimizing latent dimension allocation in HVAEs, drawing on principles from information theory to formalize the trade-off between information loss and representational attenuation. We prove the existence of an optimal allocation ratio $r^{\ast}$ under a fixed latent budget, and empirically show that tuning this ratio consistently improves OOD detection performance across datasets and architectures. Our approach outperforms baseline HVAE configurations and provides practical guidance for principled latent structure design, leading to more robust OOD detection with deep generative models.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.10089

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Virginia (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Optimized Cloud Resource Allocation Using Genetic Algorithms for Energy Efficiency and QoS Assurance

Panggabean, Caroline, C, Devaraj Verma, Gogoi, Bhagyashree, Limbu, Ranju, Sarker, Rhythm

arXiv.org Artificial IntelligenceApr-25-2025

Caroline Panggabean Department of CSE (AI) JAIN (Deemed - to - be University) Bangalore, Karnataka carolinepgabean@gmail.com ORCID: https://orcid.org/0009 - 0004 - 9964 - 7986 Ranju Limbu Department of CSE (AIM) JAIN (Deemed - to - be University) Bangalore, Karnataka 21btlca002 @jainuniversity.ac.in Dr. Devaraj Verma C Department of CSE (AI) JAIN (Deemed - to - be University) Bangalore, Karnataka c.devaraj@jainuniversity.ac.in ORCID: https://orcid.org/0000 - 0002 - 1504 - 4263 Rhythm Sarker Department of CSE (AIML) JAIN (Deemed - to - be University) Bangalore, Karnataka 21btrca065 @jainuniversity.ac.in Bhagyashree Gogoi Department of CSE (AI) JAIN (Deemed - to - be University) Bangalore, Karnataka 21btlca001 @ jainuniver s ity.ac.in Abstract -- Cloud computing environments demand dynamic and efficient resource management to ensure optimal performance, reduced energy consumption, and adherence to Service Level Agreements (SLAs). This paper presents a Genetic Algorithm (GA) - based approach for Virtual Machine (VM) placement and consol idation, aiming to minimize power usage while maintaining QoS constraints. The proposed method dynamically adjusts VM allocation based on real - time workload variations, outperforming traditional heuristics such as First Fit Decreasing (FFD) and Best Fit De creasing (BFD). Experimental results show notable reductions in energy consumption, VM migrations, SLA violation rates, and execution time.

cloud computing, evolutionary algorithm, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.17675

Country: Asia > India > Karnataka > Bengaluru (1.00)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (1.00)
Information Technology > Services (0.95)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

CAKE: Cascading and Adaptive KV Cache Eviction with Layer Preferences

Qin, Ziran, Cao, Yuchen, Lin, Mingbao, Hu, Wen, Fan, Shixuan, Cheng, Ke, Lin, Weiyao, Li, Jianguo

arXiv.org Artificial IntelligenceMar-16-2025

Large language models (LLMs) excel at processing long sequences, boosting demand for key-value (KV) caching. While recent efforts to evict KV cache have alleviated the inference burden, they often fail to allocate resources rationally across layers with different attention patterns. In this paper, we introduce Cascading and Adaptive KV cache Eviction (CAKE), a novel approach that frames KV cache eviction as a "cake-slicing problem." CAKE assesses layer-specific preferences by considering attention dynamics in both spatial and temporal dimensions, allocates rational cache size for layers accordingly, and manages memory constraints in a cascading manner. This approach enables a global view of cache allocation, adaptively distributing resources across diverse attention mechanisms while maintaining memory budgets. CAKE also employs a new eviction indicator that considers the shifting importance of tokens over time, addressing limitations in existing methods that overlook temporal dynamics. Comprehensive experiments on LongBench and NeedleBench show that CAKE maintains model performance with only 3.2% of the KV cache and consistently outperforms current baselines across various models and memory constraints, particularly in low-memory settings. Additionally, CAKE achieves over 10 speedup in decoding latency compared to full cache when processing contexts of 128K tokens with FlashAttention-2. New models such as GPT-4 (Achiam et al., 2023), Claude 3.5 (Anthropic, 2024), LLaMA 3.1 (Dubey et al., 2024) and Mistral Large 2 (AI, 2024) have extended token processing capacities beyond 128K. Shazeer (2019); Ainslie et al. (2023) partially address this issue by merging key-value heads during the training phase. However, optimizing key-value cache without additional training is crucial for efficient inference of long contexts under memory constraints, particularly in typical deployment scenarios where the model structure is fixed. One way to maintain a manageable KV cache size on the fly is to remove some KV pairs (Xiao et al., 2023; Zhang et al., 2024b; Li et al., 2024b). The idea is to eliminate less important KV pairs based on certain rules. Although recent methods have enhanced pair selection for removal, they typically assign uniform cache sizes across layers, disregarding layer-specific requirements.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.12491

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback