AITopics | dcp

dcp

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Conformalized Percentile Interval: Finite Sample Validity and Improved Conditional Performance

Zou, Ran, Zhu, Wanrong, Nan, Bin

arXiv.org Machine Learning, May-6-2026

Conformal prediction provides distribution-free predictive intervals with finite-sample marginal coverage. However, achieving conditional validity and interval efficiency (in terms of short interval length) remains challenging, particularly in complex settings with heteroskedasticity, skewed responses, or estimation errors. We propose a conformal-style calibration method for responses obtained by the probability integral transform (PIT) of the conditional cumulative distribution function (CDF) estimated via neural networks to construct a finite-sample-adjusted percentile interval with the shortest length determined by the estimated conditional CDF. Calibrating in PIT space is effective because PIT values are asymptotically feature-independent when the CDF estimator is accurate, which mitigates feature-dependent miscoverage and improves conditional calibration. On the other hand, our percentile calibration adapts to the empirical PIT distribution, which is robust against a possibly imperfect estimation of the conditional CDF. We prove the finite-sample marginal coverage property of the proposed method and show its asymptotic conditional coverage under mild consistency conditions. Experiments on diverse synthetic and real-world benchmarks demonstrate better conditional calibration and substantially shorter intervals than existing methods.
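The calibration idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it uses fixed equal-tailed order statistics of the calibration PIT values rather than the shortest-interval selection the abstract mentions, and a hypothetical Gaussian model (`estimated_cdf`) stands in for the neural-network CDF estimator. The toy data generator `sample_data` and all other names are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def estimated_cdf(x):
    """Hypothetical stand-in for a fitted conditional CDF model F_hat(. | x)."""
    return NormalDist(mu=2.0 * x, sigma=1.0 + x)


def sample_data(n, rng):
    """Toy heteroskedastic data; the true noise scale is 1.5x the model's."""
    xs = [rng.uniform(0.0, 2.0) for _ in range(n)]
    ys = [2.0 * x + 1.5 * (1.0 + x) * rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def calibrate_pit(cal_x, cal_y, alpha):
    """Return PIT-space cutoffs (lo, hi) from held-out calibration pairs."""
    pit = sorted(estimated_cdf(x).cdf(y) for x, y in zip(cal_x, cal_y))
    n = len(pit)
    # Finite-sample adjustment: order-statistic ranks are taken out of n + 1.
    l = math.floor((alpha / 2) * (n + 1))
    r = math.ceil((1 - alpha / 2) * (n + 1))
    lo = pit[l - 1] if l >= 1 else 0.0   # rank 0 means "no lower cutoff"
    hi = pit[r - 1] if r <= n else 1.0   # rank n + 1 means "no upper cutoff"
    return lo, hi


def predict_interval(x, lo, hi):
    """Map the PIT cutoffs back to the response scale through F_hat^{-1}."""
    dist = estimated_cdf(x)
    lower = dist.inv_cdf(lo) if lo > 0.0 else -math.inf
    upper = dist.inv_cdf(hi) if hi < 1.0 else math.inf
    return lower, upper


def empirical_coverage(test_x, test_y, lo, hi):
    """Fraction of test responses that fall inside their predicted interval."""
    hits = 0
    for x, y in zip(test_x, test_y):
        lower, upper = predict_interval(x, lo, hi)
        hits += lower <= y <= upper
    return hits / len(test_x)


if __name__ == "__main__":
    rng = random.Random(0)
    cal_x, cal_y = sample_data(500, rng)
    test_x, test_y = sample_data(2000, rng)
    lo, hi = calibrate_pit(cal_x, cal_y, alpha=0.1)
    print("PIT cutoffs:", lo, hi)
    print("coverage:", empirical_coverage(test_x, test_y, lo, hi))
```

Because the cutoffs come from the empirical PIT distribution rather than from the nominal levels 0.05 and 0.95, the interval widens automatically when the assumed model understates the noise, which is the robustness to imperfect CDF estimation that the abstract describes.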

artificial intelligence, machine learning, prediction, (17 more...)

2605.03233

Country: North America > United States (0.68)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Discrimination-aware Channel Pruning for Deep Neural Networks

Zhuangwei Zhuang, Mingkui Tan, Bohan Zhuang, Jing Liu, Yong Guo, Qingyao Wu, Junzhou Huang, Jinhui Zhu

Neural Information Processing Systems, Feb-12-2026, 20:48:58 GMT

Both strategies suffer from some limitations: the former kind is computationally expensive and difficult to converge, whilst the latter kind optimizes the reconstruction error but ignores the discriminative power of channels.

artificial intelligence, arxivpreprintarxiv, machine learning, (17 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > China (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism

Jiang, Chenyu, Cai, Zhenkun, Tian, Ye, Jia, Zhen, Wang, Yida, Wu, Chuan

arXiv.org Artificial Intelligence, Oct-14-2025

Context parallelism has emerged as a key technique to support long-context training, a growing trend in generative AI for modern large models. However, existing context parallel methods rely on static parallelization configurations that overlook the dynamic nature of training data, specifically, the variability in sequence lengths and token relationships (i.e., attention patterns) across samples. As a result, these methods often suffer from unnecessary communication overhead and imbalanced computation. In this paper, we present DCP, a dynamic context parallel training framework that introduces fine-grained blockwise partitioning of both data and computation. By enabling flexible mapping of data and computation blocks to devices, DCP can adapt to varying sequence characteristics, effectively reducing communication and improving memory and computation balance. Micro-benchmarks demonstrate that DCP accelerates attention by 1.19x~2.45x under causal masks and 2.15x~3.77x under sparse attention patterns. Additionally, we observe 0.94x~1.16x end-to-end training speed-up for causal masks, and 1.00x~1.46x for sparse masks.

large language model, machine learning, natural language, (22 more...)

doi: 10.1145/3731569.3764849

2510.1062

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Discrete Compositional Generation via General Soft Operators and Robust Reinforcement Learning

Jiralerspong, Marco, Derman, Esther, Vucetic, Danilo, Malkin, Nikolay, Sun, Bilun, Zhang, Tianyu, Bacon, Pierre-Luc, Gidel, Gauthier

arXiv.org Artificial Intelligence, Oct-13-2025

A major bottleneck in scientific discovery consists of narrowing an exponentially large set of objects, such as proteins or molecules, to a small set of promising candidates with desirable properties. While this process can rely on expert knowledge, recent methods leverage reinforcement learning (RL) guided by a proxy reward function to enable this filtering. By employing various forms of entropy regularization, these methods aim to learn samplers that generate diverse candidates that are highly rated by the proxy function. In this work, we make two main contributions. First, we show that these methods are liable to generate overly diverse, suboptimal candidates in large search spaces. To address this issue, we introduce a novel unified operator that combines several regularized RL operators into a general framework that better targets peakier sampling distributions. Secondly, we offer a novel, robust RL perspective of this filtering process. The regularization can be interpreted as robustness to a compositional form of uncertainty in the proxy function (i.e., the true evaluation of a candidate differs from the proxy's evaluation). Our analysis leads us to a novel, easy-to-use algorithm we name trajectory general mellowmax (TGM): we show it identifies higher quality, diverse candidates than baselines in both synthetic and real-world tasks. Code: https://github.com/marcojira/tgm.
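The abstract does not define trajectory general mellowmax, so the sketch below shows only the standard mellowmax operator from the RL literature, on the assumption that it is the building block the name refers to. The function name and the example values are illustrative, not taken from the paper or its repository.

```python
import math


def mellowmax(values, omega):
    """Standard mellowmax: log(mean(exp(omega * v))) / omega.

    omega -> 0 recovers the mean, large positive omega approaches the max.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if omega == 0:
        return sum(values) / len(values)
    # Shift by the largest exponent so exp() cannot overflow (log-sum-exp trick).
    shift = max(omega * v for v in values)
    total = sum(math.exp(omega * v - shift) for v in values)
    return (shift + math.log(total / len(values))) / omega


if __name__ == "__main__":
    rewards = [1.0, 2.0, 3.0]
    for omega in (0.0, 1.0, 10.0, 1000.0):
        print(omega, mellowmax(rewards, omega))
```

A larger `omega` pushes the operator toward the hard maximum, which is one way to read the abstract's goal of targeting peakier sampling distributions while still retaining some smoothing.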

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2506.17007

Country:

North America > Canada (0.28)
Europe (0.28)

Genre: Research Report (0.81)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

that our implementation will be a widely used tool for embedding convex optimization problems in end-to-end learning

Neural Information Processing Systems, Oct-3-2025, 07:46:23 GMT

We thank the reviewers for their constructive feedback on our paper. We especially appreciate our reviewers' conviction that our implementation will be a widely used tool for embedding convex optimization problems in end-to-end learning. Reviewers 1 and 2 found some of our explanations of ASA form and DPP difficult to follow. We will also explain the motivation for our ruleset (reviewer 1's guess is essentially correct). This is what we meant by our vague phrasing "jointly DCP ... [with] one [...]". We will separately explain how to reduce certain expressions in which parameters are multiplied together (e.g., [...]). We will clarify this point. In the revision, we will make sure to clearly explain this.

convex optimization problem, dpp, implementation, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.41)
Information Technology > Artificial Intelligence > Machine Learning (0.31)

Adversarial Text Generation with Dynamic Contextual Perturbation

Waghela, Hetvi, Sen, Jaydip, Rakshit, Sneha, Dasgupta, Subhasis

arXiv.org Artificial Intelligence, Jun-12-2025

Adversarial attacks on Natural Language Processing (NLP) models expose vulnerabilities by introducing subtle perturbations to input text, often leading to misclassification while maintaining human readability. Existing methods typically focus on word-level or local text segment alterations, overlooking the broader context, which results in detectable or semantically inconsistent perturbations. We propose a novel adversarial text attack scheme named Dynamic Contextual Perturbation (DCP). DCP dynamically generates context-aware perturbations across sentences, paragraphs, and documents, ensuring semantic fidelity and fluency. Leveraging the capabilities of pre-trained language models, DCP iteratively refines perturbations through an adversarial objective function that balances the dual objectives of inducing model misclassification and preserving the naturalness of the text. This comprehensive approach allows DCP to produce more sophisticated and effective adversarial examples that better mimic natural language patterns. Our experimental results, conducted on various NLP models and datasets, demonstrate the efficacy of DCP in challenging the robustness of state-of-the-art NLP systems. By integrating dynamic contextual analysis, DCP significantly enhances the subtlety and impact of adversarial attacks. This study highlights the critical role of context in adversarial attacks and lays the groundwork for creating more robust NLP systems capable of withstanding sophisticated adversarial strategies.

bert, machine learning, natural language, (18 more...)

doi: 10.1109/CALCON63337.2024.10914111

2506.09148

Country: Asia > India (0.48)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (0.78)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DCP: Learning Accelerator Dataflow for Neural Network via Propagation

Xu, Peng, Shao, Wenqi, Ding, Mingyu, Luo, Ping

arXiv.org Artificial Intelligence, Oct-9-2024

Deep neural network (DNN) hardware (HW) accelerators have achieved great success in improving DNNs' performance and efficiency. One key reason is the dataflow used in executing a DNN layer, including on-chip data partitioning, computation parallelism, and scheduling policy, which have large impacts on latency and energy consumption. Unlike prior works that required considerable effort from HW engineers to design suitable dataflows for different DNNs, this work proposes an efficient data-centric approach, named Dataflow Code Propagation (DCP), to automatically find the optimal dataflow for DNN layers in seconds without human effort. It has several attractive benefits that prior art does not have. (i) We translate the HW dataflow configuration into a code representation in a unified dataflow coding space, which can be optimized by backpropagating gradients given a DNN layer or network. (ii) DCP learns a neural predictor to efficiently update the dataflow codes towards the desired gradient directions to minimize various optimization objectives, e.g., latency and energy. (iii) It can be easily generalized to unseen HW configurations in a zero-shot or few-shot learning manner. For example, without using additional training data, DCP surpasses the GAMMA method that performs a full search using thousands of samples. Extensive experiments on several representative models such as MobileNet, ResNet, and ViT show that DCP outperforms its counterparts in various settings.

accelerator, dataflow, dnn accelerator, (13 more...)

2410.06553

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Stochastic COLREGs Evaluation for Safe Navigation under Uncertainty

Hansen, Peter Nicholas, Papageorgiou, Dimitrios, Galeazzi, Roberto, Blanke, Mogens

arXiv.org Artificial Intelligence, Feb-8-2024

The encounter situation between marine vessels determines how they should navigate to obey COLREGs, but time-varying and stochastic uncertainty in the estimation of angles of encounter, and of the closest point of approach, easily gives rise to different assessments of the situation on two approaching vessels. This may lead to high-risk conditions and could cause a collision. This article considers decision making under uncertainty and suggests a novel method for probabilistic interpretation of vessel encounters that is explainable and provides a measure of uncertainty in the evaluation. The method is equally useful for decision support on a manned bridge and on Marine Autonomous Surface Ships (MASS), where it provides input for automated navigation. The method makes formal safety assessment and validation feasible. We obtain a resilient algorithm for machine interpretation of COLREGs under uncertainty and show its efficacy by simulations.

collision, probability, scenario, (16 more...)

2402.05662

Country: Europe > Denmark > Capital Region > Kongens Lyngby (0.04)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.46)

Industry:

Transportation (1.00)
Government > Military (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

Programming Distributed Collective Processes in the eXchange Calculus

Audrito, Giorgio, Casadei, Roberto, Damiani, Ferruccio, Torta, Gianluca, Viroli, Mirko

arXiv.org Artificial Intelligence, Jan-20-2024

Recent trends like the Internet of Things (IoT) suggest a vision of dense and multi-scale deployments of computing devices in nearly all kinds of environments. A prominent engineering challenge revolves around programming the collective adaptive behaviour of such computational ecosystems. This requires abstractions able to capture concepts like ensembles (dynamic groups of cooperating devices) and collective tasks (joint activities carried out by ensembles). In this work, we consider collections of devices that interact with neighbours and execute in nearly-synchronised sense-compute-interact rounds, where the computation is given by a single program mapping sensing values and incoming messages to output and outgoing messages. To support programming whole computational collectives, we propose the abstraction of a distributed collective process, which can be used to define at once the ensemble formation logic and its collective task. We formalise the abstraction in the eXchange Calculus (XC), a core functional language based on neighbouring values (maps from neighbours to values) where state and interaction are handled through a single primitive, exchange, and provide a corresponding implementation in the FCPP language. Then, we exercise distributed collective processes using two case studies: multi-hop message propagation and distributed monitoring of spatial properties. Finally, we discuss the features of the abstraction and its suitability for different kinds of distributed computing applications.

computation, dcp, neighbour, (12 more...)

2401.11212

Country:

North America > United States > California > San Francisco County > San Francisco (0.28)
Europe > Denmark > Capital Region > Kongens Lyngby (0.14)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
(18 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Smart Houses & Appliances (0.34)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Internet of Things (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(3 more...)

Dynamically Conservative Self-Driving Planner for Long-Tail Cases

Zhou, Weitao, Cao, Zhong, Deng, Nanshan, Liu, Xiaoyu, Jiang, Kun, Yang, Diange

arXiv.org Artificial Intelligence, May-12-2023

Self-driving vehicles (SDVs) are becoming reality but still suffer from "long-tail" challenges during natural driving: the SDVs will continually encounter rare, safety-critical cases that may not be included in the dataset they were trained on. Some safety-assurance planners solve this problem by being conservative in all possible cases, which may significantly affect driving mobility. To this end, this work proposes a method to automatically adjust the conservative level according to each case's "long-tail" rate, named dynamically conservative planner (DCP). We first define the "long-tail" rate as an SDV's confidence to pass a driving case. The rate indicates the probability of safety-critical events and is estimated using the statistical bootstrap method with historical data. Then, a reinforcement learning-based planner is designed to contain candidate policies with different conservative levels. The final policy is optimized based on the estimated "long-tail" rate. In this way, the DCP is designed to automatically adjust to be more conservative in low-confidence "long-tail" cases while remaining efficient otherwise. The DCP is evaluated in the CARLA simulator using driving cases with "long-tail" distributed training data. The results show that the DCP can accurately estimate the "long-tail" rate to identify potential risks. Based on the rate, the DCP automatically avoids potential collisions in "long-tail" cases using conservative decisions while not affecting the average velocity in other typical cases. Thus, the DCP is safer and more efficient than the baselines with fixed conservative levels, e.g., an always conservative planner. This work provides a technique to guarantee an SDV's performance in unexpected driving cases without resorting to a global conservative setting, which contributes to solving the "long-tail" problem practically.
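The abstract's idea of estimating confidence from historical data with a bootstrap, then switching to a conservative policy when confidence is low, can be sketched as follows. This is an illustrative assumption about how such a rule might look, not the paper's planner: the binary pass/fail outcomes, the function names, the `threshold` of 0.9, and the `min_samples` guard are all hypothetical.

```python
import random


def bootstrap_lower_bound(outcomes, level=0.95, n_resamples=2000, seed=0):
    """Percentile-bootstrap lower confidence bound on the mean pass rate.

    outcomes: list of 1 (case passed safely) or 0 (it did not).
    """
    rng = random.Random(seed)
    n = len(outcomes)
    # Resample with replacement and record the pass rate of each resample.
    means = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    return means[int((1 - level) * n_resamples)]


def choose_policy(outcomes, threshold=0.9, min_samples=20):
    """Pick a hypothetical policy label from historical outcomes of similar cases."""
    # Rarely seen ("long-tail") cases have too little history to trust.
    if len(outcomes) < min_samples:
        return "conservative"
    bound = bootstrap_lower_bound(outcomes)
    return "efficient" if bound >= threshold else "conservative"


if __name__ == "__main__":
    print(choose_policy([1] * 200))              # frequent case, always passed
    print(choose_policy([1] * 150 + [0] * 50))   # frequent case, often failed
    print(choose_policy([1] * 5))                # rarely seen case
```

The `min_samples` guard is there because a percentile bootstrap over a handful of all-success outcomes returns a bound of 1.0, which would overstate confidence in exactly the rare cases the abstract is concerned with.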

artificial intelligence, baseline, machine learning, (16 more...)

2305.07497

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > France > Hauts-de-France > Oise > Compiègne (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.94)
Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
