- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > Middle East > Israel (0.04)
Supplementary Material
This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST)), a National Research Foundation of Korea (NRF) grant (NRF-2020H1D3A2A03100945), and a Data Voucher grant (2021-DV-I-P-00114), funded by the Korea government (MSIT). The dataset contains question-SQL pairs if the question is answerable. Are relationships between individual instances made explicit (e.g., users' movie ratings, social network links)? N/A. Are there any errors, sources of noise, or redundancies in the dataset? Question templates are created to have slots that are later filled with pre-defined values and records from the database. EHRSQL is based on patients in MIMIC-III and eICU.
Towards Small Language Models for Security Query Generation in SOC Workflows
Muzammil, Saleha, Reddy, Rahul, Kamalakrishnan, Vishal, Ahmadi, Hadi, Hassan, Wajih Ul
Analysts in Security Operations Centers routinely query massive telemetry streams using Kusto Query Language (KQL). Writing correct KQL requires specialized expertise, and this dependency creates a bottleneck as security teams scale. This paper investigates whether Small Language Models (SLMs) can enable accurate, cost-effective natural-language-to-KQL translation for enterprise security. We propose a three-knob framework targeting prompting, fine-tuning, and architecture design. First, we adapt the existing NL2KQL framework for SLMs with lightweight retrieval and introduce error-aware prompting that addresses common parser failures without increasing token count. Second, we apply LoRA fine-tuning with rationale distillation, augmenting each NLQ-KQL pair with a brief chain-of-thought explanation to transfer reasoning from a teacher model while keeping the SLM compact. Third, we propose a two-stage architecture that uses an SLM for candidate generation and a low-cost LLM judge for schema-aware refinement and selection. We evaluate nine models (five SLMs and four LLMs) across syntax correctness, semantic accuracy, table selection, and filter precision, alongside latency and token cost. On Microsoft's NL2KQL Defender Evaluation dataset, our two-stage approach achieves 0.987 syntax and 0.906 semantic accuracy. We further demonstrate generalizability on Microsoft Sentinel data, reaching 0.964 syntax and 0.831 semantic accuracy. These results come at up to 10x lower token cost than GPT-5, establishing SLMs as a practical, scalable foundation for natural-language querying in security operations.
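The two-stage architecture described in the abstract can be sketched as follows: a small model proposes candidate KQL queries, and a cheap judge scores them against the table schema before one is selected. All names here (`generate_candidates`, `judge`, the toy scoring rule) are illustrative assumptions, not the paper's actual implementation; real systems would replace both stubs with model calls.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    kql: str
    score: float = 0.0

def generate_candidates(nlq: str, n: int = 3) -> list[Candidate]:
    """Stage 1: the fine-tuned SLM would return n candidate KQL strings (stubbed)."""
    return [Candidate(kql=f"SecurityEvent | where Message has '{nlq}' // v{i}")
            for i in range(n)]

def judge(candidate: Candidate, schema_tables: set[str]) -> float:
    """Stage 2: a low-cost LLM judge; here, a schema-aware heuristic stand-in."""
    first_table = candidate.kql.split("|")[0].strip()
    return 1.0 if first_table in schema_tables else 0.0

def translate(nlq: str, schema_tables: set[str]) -> str:
    """Generate candidates, score each against the schema, return the best."""
    cands = generate_candidates(nlq)
    for c in cands:
        c.score = judge(c, schema_tables)
    return max(cands, key=lambda c: c.score).kql
```

The point of the split is that the expensive judgment step sees only a handful of short candidates, which is where the reported token-cost savings would come from.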
Agentic-KGR: Co-evolutionary Knowledge Graph Construction through Multi-Agent Reinforcement Learning
Li, Jing, Sun, Zhijie, Zhou, Zhicheng, Qiu, Suming, Huang, Junjie, Sun, Haijia, Qiu, Linyuan
Current knowledge-enhanced large language models (LLMs) rely on static, pre-constructed knowledge bases that suffer from coverage gaps and temporal obsolescence, limiting their effectiveness in dynamic information environments. We present Agentic-KGR, a novel framework enabling co-evolution between LLMs and knowledge graphs (KGs) through multi-round reinforcement learning (RL). Our approach introduces three key innovations: (1) a dynamic schema expansion mechanism that systematically extends graph ontologies beyond pre-defined boundaries during training; (2) a retrieval-augmented memory system enabling synergistic co-evolution between model parameters and knowledge structures through continuous optimization; (3) a learnable multi-scale prompt compression approach that preserves critical information while reducing computational complexity through adaptive sequence optimization. Experimental results demonstrate substantial improvements over supervised baselines and single-round RL approaches in knowledge extraction tasks. When integrated with GraphRAG, our method achieves superior performance in downstream QA tasks, with significant gains in both accuracy and knowledge coverage compared to existing methods.
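The "dynamic schema expansion" idea from the abstract (innovation 1) can be illustrated with a minimal sketch: the graph ontology starts from a seed set of relation types and admits new ones observed during extraction instead of rejecting them, so schema and data co-evolve. The class names and toy triples are assumptions for illustration, not Agentic-KGR's actual design.

```python
class DynamicSchema:
    """An ontology that grows beyond its pre-defined boundaries."""
    def __init__(self, seed_relations: set[str]):
        self.relations = set(seed_relations)

    def admit(self, relation: str) -> bool:
        """Extend the ontology when an unseen relation type appears."""
        is_new = relation not in self.relations
        self.relations.add(relation)
        return is_new

class KnowledgeGraph:
    def __init__(self, schema: DynamicSchema):
        self.schema = schema
        self.triples: list[tuple[str, str, str]] = []

    def add_triple(self, head: str, relation: str, tail: str) -> None:
        self.schema.admit(relation)  # schema co-evolves with the data
        self.triples.append((head, relation, tail))

kg = KnowledgeGraph(DynamicSchema({"works_at", "located_in"}))
kg.add_triple("Alice", "works_at", "Acme")
kg.add_triple("Acme", "acquired_by", "Globex")  # new relation type admitted
```

In the full framework, whether to admit a new relation would be a learned decision shaped by the RL reward rather than the unconditional rule used here.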
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > Middle East > Israel (0.04)
The following section answers the questions listed in Datasheets for Datasets.
A.1 Motivation
For what purpose was the dataset created? Who created the dataset (e.g., which team, research group) and on behalf of which entity? Who funded the creation of the dataset? This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST)), a National Research Foundation of Korea (NRF) grant (NRF-2020H1D3A2A03100945), and a Data Voucher grant (2021-DV-I-P-00114), funded by the Korea government (MSIT).
A.2 Composition
What do the instances that comprise the dataset represent (e.g., documents, photos, people, countries)? EHRSQL contains natural questions and their corresponding SQL queries (text).
How many instances are there in total (of each type, if appropriate)? There are about 24.4K instances (22.5K answerable; 1.9K unanswerable). We conducted a poll at a university hospital and collected a wide range of questions frequently asked on the structured EHR data.
What data does each instance consist of? The dataset contains question-SQL pairs if the question is answerable.
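The datasheet mentions that question templates have slots that are later filled with pre-defined values and database records. A minimal sketch of that generation step, using a hypothetical template and toy records (not EHRSQL's actual templates or schema):

```python
# One template, many questions: each record's fields fill the slots.
TEMPLATE = "What was the last {measurement} of patient {patient_id}?"

records = [
    {"patient_id": "10006", "measurement": "heart rate"},
    {"patient_id": "10011", "measurement": "blood pressure"},
]

questions = [TEMPLATE.format(**r) for r in records]
```

Pairing each filled template with a correspondingly parameterized SQL query is what yields the question-SQL pairs the datasheet describes.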
- North America > United States (0.04)
- Asia > Middle East > Israel (0.04)
Recursive Question Understanding for Complex Question Answering over Heterogeneous Personal Data
Christmann, Philipp, Weikum, Gerhard
Question answering over mixed sources, like text and tables, has been advanced by verbalizing all contents and encoding them with a language model. A prominent case of such heterogeneous data is personal information: user devices log vast amounts of data every day, such as calendar entries, workout statistics, shopping records, streaming history, and more. Information needs range from simple look-ups to queries of an analytical nature. The challenge is to provide humans with convenient access with a small footprint, so that all personal data stays on the user devices. We present ReQAP, a novel method that creates an executable operator tree for a given question, via recursive decomposition. Operators are designed to enable seamless integration of structured and unstructured sources, and the execution of the operator tree yields a traceable answer. We further release the PerQA benchmark, with persona-based data and questions, covering a diverse spectrum of realistic user needs.
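The executable operator tree can be sketched in miniature: leaf operators retrieve from a source, inner operators transform or combine their children's results, and executing the root answers the question with a traceable chain of intermediate results. The operator set, the toy data model, and the example query below are illustrative assumptions, not ReQAP's actual design.

```python
from typing import Callable

class Operator:
    """A node in an executable operator tree; children run first."""
    def __init__(self, fn: Callable[[list], list], children: list["Operator"] = None):
        self.fn = fn
        self.children = children or []

    def execute(self) -> list:
        child_results = [c.execute() for c in self.children]
        flat = [x for r in child_results for x in r]  # merge child outputs
        return self.fn(flat)

# Toy personal-data source: a workout log on the device.
workouts = [{"type": "run", "km": 5.0}, {"type": "run", "km": 8.0},
            {"type": "swim", "km": 1.0}]

# "How many kilometres did I run?" decomposed as RETRIEVE -> FILTER -> SUM
retrieve = Operator(lambda _: list(workouts))
filt = Operator(lambda rows: [r for r in rows if r["type"] == "run"], [retrieve])
total = Operator(lambda rows: [sum(r["km"] for r in rows)], [filt])
```

Because each node's output is an ordinary list, every intermediate step can be inspected, which is what makes the final answer traceable.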
- Research Report > Experimental Study (0.67)
- Research Report > Promising Solution (0.48)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Distill-C: Enhanced NL2SQL via Distilled Customization with LLMs
Hoang, Cong Duy Vu, Tangari, Gioacchino, Lanfranchi, Clemence, Guo, Dalu, Cayet, Paul, Siu, Steve, Dharmasiri, Don, Li, Yuan-Fang, Duong, Long, Hilloulin, Damien, Patra, Rhicheek, Hong, Sungpack, Chafi, Hassan
The growing adoption of large language models (LLMs) in business applications has amplified interest in Natural Language to SQL (NL2SQL) solutions, in which high performance and efficiency are competing demands. Domain- and customer-specific requirements further complicate the problem. To address this conundrum, we introduce Distill-C, a distilled customization framework tailored for NL2SQL tasks. Distill-C utilizes large teacher LLMs to produce high-quality synthetic data through a robust and scalable pipeline. Finetuning smaller and open-source LLMs on this synthesized data enables them to rival or outperform teacher models an order of magnitude larger. Evaluated on multiple challenging benchmarks, Distill-C achieves an average improvement of 36% in execution accuracy compared to the base models from three distinct LLM families. Additionally, on three internal customer benchmarks, Distill-C demonstrates a 22.6% performance improvement over the base models. Our results demonstrate that Distill-C is an effective, high-performing and generalizable approach for deploying lightweight yet powerful NL2SQL models, delivering exceptional accuracy while maintaining low computational cost.
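The distillation pipeline can be sketched in a few lines: a teacher model synthesizes (question, SQL) pairs, a quality gate keeps only queries that actually execute against the target schema, and the surviving pairs form the student's fine-tuning set. The `teacher_generate` stub and the executability filter below are assumptions standing in for Distill-C's real pipeline, which the abstract does not detail.

```python
import sqlite3

def teacher_generate(schema_sql: str) -> list[tuple[str, str]]:
    """Stand-in for a teacher LLM producing (question, SQL) pairs."""
    return [
        ("How many employees are there?", "SELECT COUNT(*) FROM employees"),
        ("List all employee names.", "SELECT name FROM employes"),  # typo: filtered out
    ]

def executes(sql: str, schema_sql: str) -> bool:
    """Quality gate: keep only SQL that runs against the schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        conn.execute(sql)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

schema = "CREATE TABLE employees (id INTEGER, name TEXT);"
pairs = [(q, s) for q, s in teacher_generate(schema) if executes(s, schema)]
```

Executability filtering of synthetic SQL is a common choice because it is cheap and catches the most damaging teacher mistakes before they reach the student.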
- Asia > Singapore (0.15)
- Oceania > Australia (0.04)
- North America > United States (0.04)
Beyond Sample-Level Feedback: Using Reference-Level Feedback to Guide Data Synthesis
Mehri, Shuhaib, Chen, Xiusi, Ji, Heng, Hakkani-Tür, Dilek
LLMs demonstrate remarkable capabilities in following natural language instructions, largely due to instruction-tuning on high-quality datasets. While synthetic data generation has emerged as a scalable approach for creating such datasets, maintaining consistent quality standards remains challenging. Recent approaches incorporate feedback to improve data quality, but typically operate at the sample level, generating and applying feedback for each response individually. In this work, we propose Reference-Level Feedback, a novel methodology that instead collects feedback based on high-quality reference samples from carefully curated seed data. We use this feedback to capture rich signals of desirable characteristics and propagate it throughout the data synthesis process. We present REFED, a dataset of 10K instruction-response pairs synthesized using such feedback. We demonstrate the effectiveness of our approach by showing that Llama-3.1-8B-Instruct finetuned on REFED achieves state-of-the-art performance among similar-sized SFT-based models on AlpacaEval 2.0 and strong results on Arena-Hard. Through extensive experiments, we show that our approach consistently outperforms traditional sample-level feedback methods with significantly fewer feedback collections and improves performance across different model architectures.
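The contrast with sample-level feedback can be made concrete: instead of critiquing every generated response, desirable traits are extracted once from a few high-quality reference responses and reused across the entire synthesis run. The keyword heuristic below stands in for an LLM-written critique, and all names are illustrative assumptions rather than the REFED pipeline's actual components.

```python
def extract_feedback(references: list[str]) -> list[str]:
    """Collect traits shared by the reference responses (heuristic stub)."""
    traits = []
    if all("example" in r.lower() for r in references):
        traits.append("include a concrete example")
    if all(len(r.split()) > 40 for r in references):
        traits.append("answer in depth, not one line")
    return traits

def synthesis_prompt(instruction: str, traits: list[str]) -> str:
    """One feedback collection, propagated to every synthesized sample."""
    return f"{instruction}\nFollow these quality traits: {'; '.join(traits)}"

refs = [
    "Recursion means a function calls itself. Example: factorial(n) is "
    "n * factorial(n - 1), stopping when n reaches zero.",
    "A stack is last-in first-out. Example: undo history, where the most "
    "recent edit is reverted first.",
]
traits = extract_feedback(refs)
prompt = synthesis_prompt("Explain what a hash map is.", traits)
```

The efficiency claim in the abstract follows from this structure: feedback is collected once per reference set, not once per generated sample.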
- Europe > Monaco (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
Application-oriented automatic hyperparameter optimization for spiking neural network prototyping
Hyperparameter optimization (HPO) is of paramount importance in the development of high-performance, specialized artificial intelligence (AI) models, ranging from well-established machine learning (ML) solutions to the deep learning (DL) domain and the field of spiking neural networks (SNNs). The latter introduce further complexity due to the neuronal computational units and their additional hyperparameters, whose inadequate setting can dramatically impact the final model performance. At the cost of possibly reduced generalization capabilities, the most suitable strategy to fully exploit the power of SNNs is to adopt an application-oriented approach and perform extensive HPO experiments. To facilitate these operations, automatic pipelines are fundamental, and their configuration is crucial. In this document, the Neural Network Intelligence (NNI) toolkit is used as the reference framework to present one such solution, with a use case example providing evidence of the corresponding results. In addition, a summary of published works employing the presented pipeline is reported as a possible source of insights into application-oriented HPO experiments for SNN prototyping.
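The kind of automatic HPO loop the abstract describes can be sketched as follows: sample SNN-specific hyperparameters (here, a membrane time constant and a firing threshold, alongside a learning rate) from a search space and keep the best trial. In the pipeline discussed above the search would be driven by NNI's tuners; the random search and toy objective below are illustrative stand-ins, not the presented pipeline.

```python
import random

# Ranges are illustrative; an SNN's neuronal units add hyperparameters
# (tau_mem, v_threshold) on top of the usual training ones (lr).
SEARCH_SPACE = {
    "tau_mem": (5.0, 50.0),      # membrane time constant (ms)
    "v_threshold": (0.5, 2.0),   # firing threshold
    "lr": (1e-4, 1e-1),          # learning rate
}

def sample(space: dict) -> dict:
    return {k: random.uniform(lo, hi) for k, (lo, hi) in space.items()}

def objective(params: dict) -> float:
    """Toy stand-in for the validation accuracy of a trained SNN."""
    # Pretend accuracy peaks at tau_mem = 20 ms and v_threshold = 1.0.
    return (1.0 - abs(params["tau_mem"] - 20.0) / 50.0
                - abs(params["v_threshold"] - 1.0))

def run_hpo(n_trials: int = 50, seed: int = 0) -> dict:
    random.seed(seed)
    trials = [sample(SEARCH_SPACE) for _ in range(n_trials)]
    return max(trials, key=objective)

best = run_hpo()
```

With NNI, the same structure splits into a declarative search-space file plus a trial script that reports the objective back to the tuner, so smarter strategies than random search can be swapped in without touching the training code.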
- Europe > Switzerland (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)