AITopics | deepset

Country: North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Neural Information Processing Systems, Feb-10-2026, 14:13:11 GMT

Regularizing Towards Permutation Invariance in Recurrent Models

Such "permutation invariant" functions have been studied extensively recently. Here we argue that temporal architectures such as RNNs are highly relevant for such problems, despite the inherent dependence of RNNs on order.

artificial intelligence, machine learning, permutation invariant, (18 more...)

Country:

Asia > Middle East > Israel (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Russ R. Salakhutdinov, Alexander J. Smola

Deep Sets

Neural Information Processing Systems, Nov-21-2025, 13:46:38 GMT

Neural Information Processing Systems http://nips.cc/

deepset, machine learning, natural language, (17 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.94)
(2 more...)

Barthe-Gold, Baptiste, Nguyen, Nhat-Minh, Thiele, Leander

Reconstructing the local density field with combined convolutional and point cloud architecture

arXiv.org Machine Learning, Oct-10-2025

We construct a neural network to perform regression on the local dark-matter density field given line-of-sight peculiar velocities of dark-matter halos, biased tracers of the dark matter field. Our architecture combines a convolutional U-Net with a point-cloud DeepSets. This combination enables efficient use of small-scale information and improves reconstruction quality relative to a U-Net-only approach. Specifically, our hybrid network recovers both clustering amplitudes and phases better than the U-Net on small scales.

density field, peculiar velocity, reconstruction, (15 more...)

2510.08573

Country:

North America > United States (0.05)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Europe > France (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)

Neural Information Processing Systems, Aug-17-2025, 09:24:01 GMT

Below, we address the main questions and concerns that were raised in the reviews

We thank the reviewers for their thoughtful comments and suggestions. We will incorporate them in our revised version. Below, we address the main questions and concerns that were raised in the reviews. This is a great suggestion. Table 1 compares the training time for all of the models on the particle physics experiment.

artificial intelligence, machine learning, main question and concern, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.38)

arXiv.org Artificial Intelligence, Aug-11-2025

Enhancing Retrieval-Augmented Generation for Electric Power Industry Customer Support

Chan, Hei Yu, Ho, Kuok Tou, Ma, Chenglong, Si, Yujing, Lin, Hok Lai, Lam, Sa Lei

Many AI customer service systems use standard NLP pipelines or finetuned language models, which often fall short on ambiguous, multi-intent, or detail-specific queries. This case study evaluates recent techniques: query rewriting, RAG Fusion, keyword augmentation, intent recognition, and context reranking, for building a robust customer support system in the electric power domain. We compare vector-store and graph-based RAG frameworks, ultimately selecting the graph-based RAG for its superior performance in handling complex queries. We find that query rewriting improves retrieval for queries using non-standard terminology or requiring precise detail. RAG Fusion boosts performance on vague or multifaceted queries by merging multiple retrievals. Reranking reduces hallucinations by filtering irrelevant contexts. Intent recognition supports the decomposition of complex questions into more targeted sub-queries, increasing both relevance and efficiency. In contrast, keyword augmentation negatively impacts results due to biased keyword selection. Our final system combines intent recognition, RAG Fusion, and reranking to handle disambiguation and multi-source queries. Evaluated on both a GPT-4-generated dataset and a real-world electricity provider FAQ dataset, it achieves 97.9% and 89.6% accuracy respectively, substantially outperforming baseline RAG models.
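The abstract above says RAG Fusion merges multiple retrievals but does not say how the ranked lists are combined. One widely used option is reciprocal rank fusion (RRF), sketched below under that assumption; the function name, the document ids, and the smoothing constant k = 60 are illustrative choices, not details taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k is a smoothing constant (assumed value).
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retrievals for three rewrites of one customer query.
retrievals = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
fused = reciprocal_rank_fusion(retrievals)
```

A document that ranks well in several retrievals rises to the top even if no single retrieval ranks it first, which is why this kind of merging tends to help with vague or multifaceted queries.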

large language model, machine learning, natural language, (17 more...)

2508.05664

Country:

Asia > Macao (0.05)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
Asia > China (0.04)

Genre: Research Report (0.67)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Kholkar, Gauri, Ahuja, Ratinder

CAPTURE: Context-Aware Prompt Injection Testing and Robustness Enhancement

arXiv.org Artificial Intelligence, Jun-18-2025

Prompt injection remains a major security risk for large language models. However, the efficacy of existing guardrail models in context-aware settings remains underexplored, as they often rely on static attack benchmarks. Additionally, they have over-defense tendencies. We introduce CAPTURE, a novel context-aware benchmark assessing both attack detection and over-defense tendencies with minimal in-domain examples. Our experiments reveal that current prompt injection guardrail models suffer from high false negatives in adversarial cases and excessive false positives in benign scenarios, highlighting critical limitations. To demonstrate our framework's utility, we train CaptureGuard on our generated data. This new model drastically reduces both false negative and false positive rates on our context-aware datasets while also generalizing effectively to external benchmarks, establishing a path toward more robust and practical prompt injection defenses.

large language model, machine learning, natural language, (16 more...)

2505.12368

Country:

Europe > Switzerland > Basel-City > Basel (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Machine Learning, May-30-2025

On Transferring Transferability: Towards a Theory for Size Generalization

Levin, Eitan, Ma, Yuxin, Díaz, Mateo, Villar, Soledad

Many modern learning tasks require models that can take inputs of varying sizes. Consequently, dimension-independent architectures have been proposed for domains where the inputs are graphs, sets, and point clouds. Recent work on graph neural networks has explored whether a model trained on low-dimensional data can transfer its performance to higher-dimensional inputs. We extend this body of work by introducing a general framework for transferability across dimensions. We show that transferability corresponds precisely to continuity in a limit space formed by identifying small problem instances with equivalent large ones. This identification is driven by the data and the learning task. We instantiate our framework on existing architectures, and implement the necessary changes to ensure their transferability. Finally, we provide design principles for designing new transferable models. Numerical experiments support our findings.

artificial intelligence, machine learning, sequence, (17 more...)

2505.23599

Country:

North America > United States > New York (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Xie, Jiahao, Tong, Guangmo

Advances in Set Function Learning: A Survey of Techniques and Applications

arXiv.org Artificial Intelligence, Jan-24-2025

Set function learning has emerged as a crucial area in machine learning, addressing the challenge of modeling functions that take sets as inputs. Unlike traditional machine learning that involves fixed-size input vectors where the order of features matters, set function learning demands methods that are invariant to permutations of the input set, presenting a unique and complex problem. This survey provides a comprehensive overview of the current development in set function learning, covering foundational theories, key methodologies, and diverse applications. We categorize and discuss existing approaches, focusing on deep learning approaches, such as DeepSets and Set Transformer based methods, as well as other notable alternative methods beyond deep learning, offering a complete view of current models. We also introduce various applications and relevant datasets, such as point cloud processing and multi-label classification, highlighting the significant progress achieved by set function learning methods in these domains. Finally, we conclude by summarizing the current state of set function learning approaches and identifying promising future research directions, aiming to guide and inspire further advancements in this promising field.
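As a minimal sketch of the permutation invariance described above, the block below uses the sum-decomposition form f(X) = rho(sum of phi(x) over x in X) that DeepSets-style models are built on. The encoder phi and decoder rho here are hand-written, hypothetical stand-ins for learned networks, chosen so that the set function computes the population variance of a non-empty set of numbers.

```python
import math
from typing import List, Sequence


def phi(x: float) -> List[float]:
    """Per-element encoder (hypothetical stand-in for a learned network)."""
    return [x, x * x, 1.0]


def rho(pooled: List[float]) -> float:
    """Decoder applied to the pooled representation (hypothetical stand-in)."""
    total, total_sq, count = pooled
    mean = total / count
    return total_sq / count - mean * mean  # population variance of the set


def set_function(elements: Sequence[float]) -> float:
    """Evaluate f(X) = rho(sum of phi(x) for x in X) on a non-empty input."""
    features = [phi(x) for x in elements]
    # Sum pooling over elements. math.fsum is exactly rounded, so the pooled
    # vector does not depend on the order in which elements are listed.
    pooled = [math.fsum(column) for column in zip(*features)]
    return rho(pooled)


original = set_function([1.0, 2.0, 3.0, 4.0])
shuffled = set_function([4.0, 1.0, 3.0, 2.0])
```

Because the only interaction between elements is the sum, reordering the input cannot change the result, and the same function accepts sets of any size.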

artificial intelligence, machine learning, representation, (14 more...)

2501.14991

Country:

North America > United States > Delaware > New Castle County > Newark (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Nov-1-2024

Attention Tracker: Detecting Prompt Injection Attacks in LLMs

Hung, Kuo-Han, Ko, Ching-Yun, Rawat, Ambrish, Chung, I-Hsin, Hsu, Winston H., Chen, Pin-Yu

Large Language Models (LLMs) have revolutionized various domains but remain vulnerable to prompt injection attacks, where malicious inputs manipulate the model into ignoring original instructions and executing a designated action. In this paper, we investigate the underlying mechanisms of these attacks by analyzing the attention patterns within LLMs. We introduce the concept of the distraction effect, where specific attention heads, termed important heads, shift focus from the original instruction to the injected instruction. Building on this discovery, we propose Attention Tracker, a training-free detection method that tracks attention patterns on instruction to detect prompt injection attacks without the need for additional LLM inference. Our method generalizes effectively across diverse models, datasets, and attack types, showing an AUROC improvement of up to 10.0% over existing methods, and performs well even on small LLMs. We demonstrate the robustness of our approach through extensive evaluations and provide insights into safeguarding LLM-integrated systems from prompt injection vulnerabilities.

large language model, machine learning, natural language, (15 more...)

2411.00348

Country:

North America > United States (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Spain (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)