Language models are weak learners
A central notion in practical and theoretical machine learning is that of a weak learner: a classifier that achieves better-than-random performance (on any given distribution over data), even if only by a small margin. Such weak learners form the practical basis for canonical machine learning methods such as boosting.
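The weak-learner idea behind boosting can be made concrete in a few lines. The sketch below is illustrative only: the toy data, the threshold-stump learner, and the function names are assumptions, not from the paper.

```python
import math

def stump_predict(x, thresh, sign):
    """A threshold 'stump': the simplest weak learner, barely better than random."""
    return sign if x > thresh else -sign

def best_stump(xs, ys, ws):
    """Pick the stump with the lowest weighted error on the current distribution."""
    best = None
    for thresh in xs:
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, ws)
                      if stump_predict(x, thresh, sign) != y)
            if best is None or err < best[0]:
                best = (err, thresh, sign)
    return best

def adaboost(xs, ys, rounds=5):
    """AdaBoost: reweight the data so each new weak learner focuses on earlier mistakes."""
    n = len(xs)
    ws = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thresh, sign = best_stump(xs, ys, ws)
        err = max(err, 1e-10)  # avoid log(0) when a stump is perfect
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thresh, sign))
        # Upweight misclassified points, downweight correct ones, renormalize.
        ws = [w * math.exp(-alpha * y * stump_predict(x, thresh, sign))
              for x, y, w in zip(xs, ys, ws)]
        z = sum(ws)
        ws = [w / z for w in ws]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all weak learners."""
    score = sum(a * stump_predict(x, t, s) for a, t, s in ensemble)
    return 1 if score > 0 else -1
```

Each stump only needs to beat chance on the reweighted sample; the weighted vote of a handful of them can fit the data exactly.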
- Europe > Portugal > Lisbon > Lisbon (0.05)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Research Report > New Finding (0.46)
- Overview (0.46)
- Education (0.67)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.47)
- Health & Medicine > Therapeutic Area > Endocrinology (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
OMPILOT: Harnessing Transformer Models for Auto Parallelization to Shared Memory Computing Paradigms
Bhattacharjee, Arijit, TehraniJamsaz, Ali, Chen, Le, Hasabnis, Niranjan, Capota, Mihai, Ahmed, Nesreen, Jannesari, Ali
Recent advances in large language models (LLMs) have significantly accelerated progress in code translation, enabling more accurate and efficient transformation across programming languages. While originally developed for natural language processing, LLMs have shown strong capabilities in modeling programming language syntax and semantics, outperforming traditional rule-based systems in both accuracy and flexibility. These models have streamlined cross-language conversion, reduced development overhead, and accelerated legacy code migration. In this paper, we introduce OMPILOT, a novel domain-specific encoder-decoder transformer tailored for translating C++ code into OpenMP, enabling effective shared-memory parallelization. OMPILOT leverages custom pre-training objectives that incorporate the semantics of parallel constructs and combines both unsupervised and supervised learning strategies to improve code translation robustness. Unlike previous work that focused primarily on loop-level transformations, OMPILOT operates at the function level to capture a wider semantic context. To evaluate our approach, we propose OMPBLEU, a novel composite metric specifically crafted to assess the correctness and quality of OpenMP parallel constructs, addressing limitations in conventional translation metrics.
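The abstract does not give OMPBLEU's exact formulation, but the idea of a composite metric that scores OpenMP constructs separately from surface text can be sketched as follows. The `ompbleu_like` name, the 50/50 weights, and the pragma-matching rule are all hypothetical assumptions, not the paper's definition.

```python
import re
from collections import Counter

def ngram_overlap(ref, hyp, n=2):
    """Modified n-gram precision, the core ingredient of BLEU-style metrics."""
    ref_ng = Counter(zip(*(ref[i:] for i in range(n))))
    hyp_ng = Counter(zip(*(hyp[i:] for i in range(n))))
    if not hyp_ng:
        return 0.0
    matched = sum(min(c, ref_ng[g]) for g, c in hyp_ng.items())
    return matched / sum(hyp_ng.values())

def pragma_constructs(code):
    """Extract OpenMP pragmas and their clauses, the part plain BLEU underweights."""
    return [tuple(sorted(m.split())) for m in
            re.findall(r"#pragma omp\s+([^\n]+)", code)]

def ompbleu_like(ref_code, hyp_code, w_text=0.5, w_pragma=0.5):
    """Blend token-level overlap with clause-order-insensitive pragma matching."""
    text = ngram_overlap(ref_code.split(), hyp_code.split())
    ref_p, hyp_p = pragma_constructs(ref_code), pragma_constructs(hyp_code)
    if not ref_p and not hyp_p:
        pragma = 1.0
    else:
        matched = sum(min(Counter(ref_p)[p], c) for p, c in Counter(hyp_p).items())
        pragma = matched / max(len(ref_p), len(hyp_p))
    return w_text * text + w_pragma * pragma
```

A translation that reproduces the loop body but drops the `#pragma omp` directive keeps a high text score yet loses the entire pragma component, which is the kind of failure a composite metric is meant to expose.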
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- North America > United States > Iowa > Story County > Ames (0.04)
- Europe > Switzerland (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics
Prabhakar, Akshara, Ram, Roshan, Chen, Zixiang, Savarese, Silvio, Wang, Frank, Xiong, Caiming, Wang, Huan, Yao, Weiran
As information grows exponentially, enterprises face increasing pressure to transform unstructured data into coherent, actionable insights. While autonomous agents show promise, they often struggle with domain-specific nuances, intent alignment, and enterprise integration. We present Enterprise Deep Research (EDR), a multi-agent system that integrates (1) a Master Planning Agent for adaptive query decomposition, (2) four specialized search agents (General, Academic, GitHub, LinkedIn), (3) an extensible MCP-based tool ecosystem supporting NL2SQL, file analysis, and enterprise workflows, (4) a Visualization Agent for data-driven insights, and (5) a reflection mechanism that detects knowledge gaps and updates research direction with optional human-in-the-loop steering guidance. These components enable automated report generation, real-time streaming, and seamless enterprise deployment, as validated on internal datasets. On open-ended benchmarks including DeepResearch Bench and DeepConsult, EDR outperforms state-of-the-art agentic systems without any human steering. We release the EDR framework and benchmark trajectories to advance research on multi-agent reasoning applications. Code at https://github.com/SalesforceAIResearch/enterprise-deep-research and Dataset at https://huggingface.co/datasets/Salesforce/EDR-200
- Europe > Austria > Vienna (0.14)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
SimpleVSF: VLM-Scoring Fusion for Trajectory Prediction of End-to-End Autonomous Driving
Zheng, Peiru, Zhao, Yun, Gong, Zhan, Zhu, Hong, Wu, Shaohua
End-to-end autonomous driving has emerged as a promising paradigm for achieving robust and intelligent driving policies. However, existing end-to-end methods still face significant challenges, such as suboptimal decision-making in complex scenarios. In this paper, we propose SimpleVSF (Simple VLM-Scoring Fusion), a novel framework that enhances end-to-end planning by leveraging the cognitive capabilities of Vision-Language Models (VLMs) and advanced trajectory fusion techniques. We combine conventional scorers with novel VLM-enhanced scorers, leveraging a robust weight fusioner for quantitative aggregation and a powerful VLM-based fusioner for qualitative, context-aware decision-making. As the leading approach in the ICCV 2025 NAVSIM v2 End-to-End Driving Challenge, our SimpleVSF framework demonstrates state-of-the-art performance, achieving a superior balance between safety, comfort, and efficiency.
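The "weight fusioner" side of this design amounts to a weighted aggregation over per-scorer trajectory scores. A minimal sketch, with invented scorer names, scores, and weights (the paper's actual scorers and their calibration are not shown in this abstract):

```python
def fuse_scores(candidates, weights):
    """Pick the candidate trajectory with the highest weighted sum of scorer outputs."""
    def fused(scores):
        return sum(weights[name] * s for name, s in scores.items())
    return max(candidates, key=lambda c: fused(c["scores"]))

# Hypothetical candidate trajectories, each scored by three scorers in [0, 1].
candidates = [
    {"id": "traj_a", "scores": {"collision": 0.9, "comfort": 0.4, "vlm": 0.6}},
    {"id": "traj_b", "scores": {"collision": 0.8, "comfort": 0.9, "vlm": 0.7}},
]
weights = {"collision": 0.5, "comfort": 0.2, "vlm": 0.3}
best = fuse_scores(candidates, weights)
```

Under these weights `traj_b` wins (fused score 0.79 against 0.71): it trades a small collision-score deficit for large comfort and VLM gains, which is exactly the kind of trade-off the aggregation step arbitrates.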
- Transportation > Ground > Road (0.76)
- Information Technology > Robotics & Automation (0.66)
- Automobiles & Trucks (0.66)
Directive, Metacognitive or a Blend of Both? A Comparison of AI-Generated Feedback Types on Student Engagement, Confidence, and Outcomes
Alsaiari, Omar, Baghaei, Nilufar, Lodge, Jason M., Noroozi, Omid, Gašević, Dragan, Boden, Marie, Khosravi, Hassan
Feedback is one of the most powerful influences on student learning, with extensive research examining how best to implement it in educational settings. Increasingly, feedback is being generated by artificial intelligence (AI), offering scalable and adaptive responses. Two widely studied approaches are directive feedback, which gives explicit explanations and reduces cognitive load to speed up learning, and metacognitive feedback, which prompts learners to reflect, track their progress, and develop self-regulated learning (SRL) skills. While both approaches have clear theoretical advantages, their comparative effects on engagement, confidence, and quality of work remain underexplored. This study presents a semester-long randomised controlled trial with 329 students in an introductory design and programming course using an adaptive educational platform. Participants were assigned to receive directive, metacognitive, or hybrid AI-generated feedback that blended elements of both. Results showed that revision behaviour differed across feedback conditions, with Hybrid prompting the most revisions compared to Directive and Metacognitive. Confidence ratings were uniformly high, and resource quality outcomes were comparable across conditions. These findings highlight the promise of AI in delivering feedback that balances clarity with reflection. Hybrid approaches, in particular, show potential to combine actionable guidance for immediate improvement with opportunities for self-reflection and metacognitive growth.
- Oceania > Australia > Queensland > Brisbane (0.05)
- Asia > Middle East > Saudi Arabia > Najran Province > Najran (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- Research Report > Strength High (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study > Negative Result (0.46)
- Education > Educational Setting > Online (0.93)
- Education > Educational Setting > Higher Education (0.69)
- Education > Curriculum > Subject-Specific Education (0.68)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (0.93)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Beacon: Single-Turn Diagnosis and Mitigation of Latent Sycophancy in Large Language Models
Pandey, Sanskar, Chopra, Ruhaan, Puniya, Angkul, Pal, Sohom
Large language models internalize a structural trade-off between truthfulness and obsequious flattery, emerging from reward optimization that conflates helpfulness with polite submission. This latent bias, known as sycophancy, manifests as a preference for user agreement over principled reasoning. We introduce Beacon, a single-turn forced-choice benchmark that isolates this bias independent of conversational context, enabling precise measurement of the tension between factual accuracy and submissive bias. Evaluations across twelve state-of-the-art models reveal that sycophancy decomposes into stable linguistic and affective sub-biases, each scaling with model capacity. We further propose prompt-level and activation-level interventions that modulate these biases in opposing directions, exposing the internal geometry of alignment as a dynamic manifold between truthfulness and socially compliant judgment. Beacon reframes sycophancy as a measurable form of normative misgeneralization, providing a reproducible foundation for studying and mitigating alignment drift in large-scale generative systems.
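A single-turn forced-choice benchmark of this kind reduces to a simple rate over items that pair a factually correct option with a user-agreeing one. The sketch below uses invented items and a trivially truthful model; Beacon's actual items and scoring protocol are not given in this abstract.

```python
def sycophancy_rate(items, choose):
    """Fraction of forced-choice items where the model sides with the user
    over the factually correct option."""
    syco = sum(1 for item in items if choose(item) == item["agree_option"])
    return syco / len(items)

# Hypothetical items: each pairs a correct answer with a user-flattering one.
items = [
    {"prompt": "User insists 7*8 = 54. Which is right?",
     "correct_option": "56", "agree_option": "54"},
    {"prompt": "User claims the Sun orbits the Earth.",
     "correct_option": "heliocentric", "agree_option": "geocentric"},
]

def always_truthful(item):
    """A stand-in 'model' that always picks the factual option."""
    return item["correct_option"]
```

Forcing a binary choice removes the conversational escape hatches (hedging, partial agreement) that make sycophancy hard to measure in free-form dialogue.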
Swedish Death Cleaning, but for Your Digital Life
The art of ordering and culling your possessions before you die should extend to your documents, photos, and digital accounts. After Adam Liljenberg's grandmother died, his grandfather was ready to downsize and move into an assisted living facility. As Swedes, they were familiar with Swedish death cleaning, the idea that as you near the end of life, you declutter and organize your belongings so as not to burden those who survive you. When Liljenberg arrived to help his grandfather sort through his possessions, he didn't expect to be rescuing digital photos off a phone full of malware.
- Information Technology > Security & Privacy (0.91)
- Health & Medicine > Therapeutic Area (0.70)
- Information Technology > Artificial Intelligence (0.96)
- Information Technology > Security & Privacy (0.69)
- Information Technology > Communications > Mobile (0.50)
PETLP: A Privacy-by-Design Pipeline for Social Media Data in AI Research
Oh, Nick, Vrakas, Giorgos D., Brooke, Siân J. M., Morinière, Sasha, Duke, Toju
We introduce PETLP (Privacy-by-design Extract, Transform, Load, and Present), a compliance framework that embeds legal safeguards directly into extended ETL pipelines. Central to PETLP is treating Data Protection Impact Assessments as living documents that evolve from preregistration through dissemination. Through systematic Reddit analysis, we demonstrate how extraction rights fundamentally differ between qualifying research organisations (who can invoke DSM Article 3 to override platform restrictions) and commercial entities (bound by terms of service), whilst GDPR obligations apply universally. We demonstrate why true anonymisation remains unachievable for social media data and expose the legal gap between permitted dataset creation and uncertain model distribution. By structuring compliance decisions into practical workflows and simplifying institutional data management plans, PETLP enables researchers to navigate regulatory complexity with confidence, bridging the gap between legal requirements and research practice.
- Europe > Ireland (0.04)
- Europe > Middle East > Cyprus (0.04)
- Europe > Germany (0.04)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Research Report > New Finding (0.67)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government > Europe Government (0.47)
ECO: Enhanced Code Optimization via Performance-Aware Prompting for Code-LLMs
Kim, Su-Hyeon, Hahn, Joonghyuk, Cha, Sooyoung, Han, Yo-Sub
Code runtime optimization--the task of rewriting a given code into a faster one--remains challenging, as it requires reasoning about performance trade-offs involving algorithmic and structural choices. Recent approaches employ code-LLMs with slow-fast code pairs provided as optimization guidance, but such pair-based methods obscure the causal factors of performance gains and often lead to superficial pattern imitation rather than genuine performance reasoning. We introduce ECO, a performance-aware prompting framework for code optimization. ECO first distills runtime optimization instructions (ROIs) from reference slow-fast code pairs; each ROI describes root causes of inefficiency and the rationales that drive performance improvements. For a given input code, ECO employs in parallel (i) a symbolic advisor to produce a bottleneck diagnosis tailored to the code, and (ii) an ROI retriever to return related ROIs. These two outputs are then composed into a performance-aware prompt, providing actionable guidance for code-LLMs. ECO's prompts are model-agnostic, require no fine-tuning, and can be easily prepended to any code-LLM prompt. Our empirical studies highlight that ECO prompting significantly improves code-LLMs' ability to generate efficient code, achieving speedups of up to 7.81 while minimizing correctness loss.
Code runtime optimization--the task of rewriting a given code into a faster one--is a fundamental problem in software engineering, as it directly affects user experience and system performance (ISO/IEC, 2011). Recent advances in large language models for code (code-LLMs) have demonstrated remarkable ability in ensuring functional correctness through tasks such as code synthesis, translation, and summarization (Chen et al., 2021; Xu et al., 2022). However, correctness alone does not imply efficiency; generating faster code requires performance-oriented reasoning that goes beyond code semantics.
This gap makes code optimization particularly challenging for approaches that rely solely on the intrinsic capabilities of code-LLMs (Shypula et al., 2024). Early works in code optimization utilized compiler-driven techniques, which applied rule-based analysis at the intermediate-representation level, such as dead code elimination or loop unrolling (Wegman & Zadeck, 1991; Booshehri et al., 2013). These approaches are effective for addressing well-defined low-level inefficiencies, but they fail to capture the dominant performance bottlenecks--program-level, context-dependent optimizations such as algorithmic restructuring or data-structure selection. Code-LLMs can in principle target such program-level optimizations, yet on their own they lack the capacity to optimize code reliably and therefore require external guidance. Building on this observation, Shypula et al. (2024) and Gao et al. (2025) exploit slow-fast code pairs through prompting techniques such as in-context learning (ICL) and retrieval-augmented generation (RAG), where the example pairs are chosen randomly or by code-similarity retrieval.
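The prompt-composition step ECO describes (a bottleneck diagnosis plus retrieved ROIs, prepended to the task) can be sketched as follows. The prompt wording, function name, and example inputs are assumptions, not the paper's actual template.

```python
def compose_performance_prompt(code, diagnosis, rois):
    """Assemble a performance-aware prompt from a bottleneck diagnosis and
    retrieved runtime-optimization instructions (ROIs)."""
    roi_text = "\n".join(f"- {roi}" for roi in rois)
    return (
        "Optimize the following code for runtime performance.\n"
        f"Bottleneck diagnosis: {diagnosis}\n"
        "Relevant optimization instructions:\n"
        f"{roi_text}\n"
        "Code:\n"
        f"{code}\n"
    )

# Hypothetical inputs: a slow accumulation loop, an advisor diagnosis, one retrieved ROI.
prompt = compose_performance_prompt(
    "for i in range(n): total += arr[i]",
    "interpreter-level accumulation loop over a numeric array",
    ["Replace explicit accumulation loops with a built-in aggregation"],
)
```

Because the result is plain text, such a prompt can be prepended to any code-LLM request without fine-tuning, which is the model-agnostic property the abstract emphasizes.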
- Asia > South Korea > Seoul > Seoul (0.04)
- Asia > South Korea > Gyeonggi-do > Suwon (0.04)
Creation of the Chinese Adaptive Policy Communication Corpus
Sun, Bolun, Chang, Charles, Ang, Yuen Yuen, Hao, Pingxu, Mu, Ruotong, Xu, Yuchen, Zhang, Zhengxin
We introduce CAPC-CG, the Chinese Adaptive Policy Communication (Central Government) Corpus, the first open dataset of Chinese policy directives annotated with a five-color taxonomy of clear and ambiguous language categories, building on Ang's theory of adaptive policy communication. Spanning 1949-2023, this corpus includes national laws, administrative regulations, and ministerial rules issued by China's top authorities. Each document is segmented into paragraphs, producing a total of 3.3 million units. Alongside the corpus, we release comprehensive metadata, a two-round labeling framework, and a gold-standard annotation set developed by expert and trained coders. Inter-annotator agreement achieves a Fleiss's kappa of 0.86 on directive labels, indicating high reliability for supervised modeling. We provide baseline classification results with several large language models (LLMs), together with our annotation codebook, and describe patterns from the dataset. This release aims to support downstream tasks and multilingual NLP research in policy communication.
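The reported agreement statistic is mechanical to compute. Below is a standard Fleiss's-kappa implementation; the example rating matrix is invented for illustration, not CAPC-CG data.

```python
def fleiss_kappa(ratings):
    """Fleiss's kappa for a matrix where rows are items, columns are categories,
    and each cell counts the coders who assigned that category to that item."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])        # assumes every item has the same rater count
    n_cats = len(ratings[0])
    total = n_items * n_raters
    # Marginal proportion of each category across all assignments.
    p_j = [sum(row[j] for row in ratings) / total for j in range(n_cats)]
    # Observed pairwise agreement per item.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in ratings]
    p_bar = sum(p_i) / n_items        # mean observed agreement
    p_e = sum(p * p for p in p_j)     # chance agreement from the marginals
    return (p_bar - p_e) / (1 - p_e)
```

Three coders agreeing unanimously on every item yields kappa = 1.0; values near 0.86 indicate agreement well above what the category marginals alone would produce.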
- Asia > China (1.00)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Workflow (0.95)
- Research Report (0.64)
- Law > Statutes (1.00)
- Government > Regional Government > Asia Government > China Government (1.00)