AITopics | prompt template

Collaborating Authors

prompt template

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs

Neural Information Processing SystemsJun-16-2026, 09:25:50 GMT

The selection of hyperparameters, such as prompt templates in large language models (LLMs), must often strike a balance between reliability and cost. In many cases, structural relationships between the expected reliability levels of the hyperparameters can be inferred from prior information and held-out data - e.g., longer prompt templates may be more detailed and thus more reliable. However, existing hyperparameter selection methods either do not provide formal reliability guarantees or are unable to incorporate structured knowledge in the hyperparameter space. This paper introduces reliability graph-based Pareto testing (RG-PT), a novel multi-objective hyperparameter selection framework that maintains formal reliability guarantees in terms of false discovery rate (FDR), while accounting for known relationships among hyperparameters via a directed acyclic graph. Edges in the graph reflect expected reliability and cost trade-offs among hyperparameters, which are inferred via the Bradley-Terry (BT) ranking model from prior information and held-out data. Experimental evaluations demonstrate that RG-PT significantly outperforms existing methods such as learn-then-test (LTT) and Pareto testing (PT) through a more efficient exploration of the hyperparameter space.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Realistic Doctor-Patient Interactions

Neural Information Processing SystemsJun-15-2026, 16:52:44 GMT

Doctor-patient consultations require multi-turn, context-aware communication tailored to diverse patient personas. Training or evaluating doctor LLMs in such settings requires realistic patient interaction systems. However, existing simulators often fail to reflect the full range of personas seen in clinical practice. To address this, we introduce PATIENTSIM, a patient simulator that generates realistic and diverse patient personas for clinical scenarios, grounded in medical expertise. PATIENTSIM operates using: 1) clinical profiles, including symptoms and medical history, derived from real-world data in the MIMIC-ED and MIMIC-IV datasets, and 2) personas defined by four axes: personality, language proficiency, medical history recall level, and cognitive confusion level, resulting in 37 unique combinations. We evaluate eight LLMs for factual accuracy and persona consistency. The top-performing open-source model, Llama 3.3 70B, is validated by four clinicians to confirm the robustness of our framework. As an open-source, customizable platform, PATIENTSIM provides a reproducible and scalable solution that can be customized for specific training needs. Offering a privacy-compliant environment, it serves as a robust testbed for evaluating medical dialogue systems across diverse patient presentations and shows promise as an educational tool for healthcare.

information, large language model, machine learning, (21 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs

Neural Information Processing SystemsJun-11-2026, 16:06:10 GMT

The selection of hyperparameters, such as prompt templates in large language models (LLMs), must often strike a balance between reliability and cost. In many cases, structural relationships between the expected reliability levels of the hyperparameters can be inferred from prior information and held-out data -- e.g., longer prompt templates may be more detailed and thus more reliable. However, existing hyperparameter selection methods either do not provide formal reliability guarantees or are unable to incorporate structured knowledge in the hyperparameter space. This paper introduces reliability graph-based Pareto testing (RG-PT), a novel multi-objective hyperparameter selection framework that maintains formal reliability guarantees in terms of false discovery rate (FDR), while accounting for known relationships among hyperparameters via a directed acyclic graph. Edges in the graph reflect expected reliability and cost trade-offs among hyperparameters, which are inferred via the Bradley-Terry (BT) ranking model from prior information and held-out data. Experimental evaluations demonstrate that RG-PT significantly outperforms existing methods such as learn-then-test (LTT) and Pareto testing (PT) through a more efficient exploration of the hyperparameter space.

artificial intelligence, large language model, natural language, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.60)

Add feedback

Efficient multi-prompt evaluation of LLMs

Neural Information Processing SystemsMar-19-2026, 06:20:17 GMT

Most popular benchmarks for comparing LLMs rely on a limited set of prompt templates, which may not fully capture the LLMs' abilities and can affect the reproducibility of results on leaderboards. Many recent works empirically verify prompt sensitivity and advocate for changes in LLM evaluation. In this paper, we consider the problem of estimating the performance distribution across many prompt variants instead of finding a single prompt to evaluate with. We introduce PromptEval, a method for estimating performance across a large set of prompts borrowing strength across prompts and examples to produce accurate estimates under practical evaluation budgets. The resulting distribution can be used to obtain performance quantiles to construct various robust performance metrics (e.g., top 95% quantile or median). We prove that PromptEval consistently estimates the performance distribution and demonstrate its efficacy empirically on three prominent LLM benchmarks: MMLU, BIG-bench Hard, and LMentry; for example, PromptEval can accurately estimate performance quantiles across 100 prompt templates on MMLU with a budget equivalent to two single-prompt evaluations. Moreover, we show how PromptEval can be useful in LLM-as-a-judge and best prompt identification applications.

artificial intelligence, large language model, natural language, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications Dingkang Y ang 1,3 Jinjie Wei

Neural Information Processing SystemsFeb-18-2026, 18:53:06 GMT

In the continuous pre-training phase, we introduce a hybrid instruction pre-training mechanism to mitigate the internal-injected knowledge inconsistency of LLMs for medical domain adaptation.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Jilin Province > Changchun (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

d8efbb5dd415974eb095c3f06bff1f48-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 08:24:46 GMT

large language model, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > Michigan (0.04)
Asia > Singapore (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education > Educational Setting > Online (0.68)
Information Technology > Services (0.46)

Technology:

Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Add feedback

d6f034bb216b472fc7d32ec7aff20342-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 07:43:27 GMT

large language model, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Florida > Miami-Dade County > Miami (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.92)
Health & Medicine (0.68)
Law (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

StreamBench: Towards Benchmarking Continuous Improvement of Language Agents

Neural Information Processing SystemsFeb-17-2026, 23:01:39 GMT

To address this gap, we introduce StreamBench, a pioneering benchmark designed to evaluate the continuous improvement of LLM agents over an input-feedback sequence.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Asia > Taiwan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.47)

Technology: