AITopics | ape

This is the most underrated sci-fi film franchise of the 21st century

New Scientist | May-20-2026, 18:00:00 GMT

As a sci-fi fan, you learn not to dwell on the films that could have been. Whether it's Alejandro Jodorowsky's unmade Dune, Guillermo del Toro's cancelled take on At the Mountains of Madness, or the versions of Return of the Jedi that Davids Lynch and Cronenberg could have made, it's best not to torture yourself over cinematic what-ifs. That's why I had given up hope of there being a new instalment of the most underrated sci-fi film franchise of the 21st century so far. Though well received by critics and audiences alike, none of the four films have won Oscars or seem to have made much of an impact on pop culture. But then, earlier this month, we got confirmation that a fifth movie was on the way.

artificial intelligence, science fiction, social media, (14 more...)

New Scientist

Industry:

Leisure & Entertainment (0.71)
Media > Film (0.52)
Health & Medicine > Therapeutic Area (0.34)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Science Fiction (0.73)

Adaptive Algorithms for Relaxed Pareto Set Identification

Neural Information Processing Systems | Feb-13-2026, 22:20:33 GMT

In this paper we revisit the fixed-confidence identification of the Pareto optimal set in a multi-objective multi-armed bandit model.
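As a rough illustration of the object being identified, the sketch below computes a Pareto optimal set from known mean vectors. It assumes that higher is better on every objective and that the means are given exactly; in the bandit setting they would have to be estimated from samples, and this is not the paper's adaptive algorithm. The function name `pareto_set` and the example values are hypothetical.

```python
def pareto_set(means):
    """Return indices of arms whose mean vectors are not dominated.

    Assumption: higher is better on every objective.
    """
    optimal = []
    for i, mi in enumerate(means):
        dominated = False
        for j, mj in enumerate(means):
            if i == j:
                continue
            # Arm j dominates arm i if it is at least as good on every
            # objective and strictly better on at least one.
            at_least_as_good = all(b >= a for a, b in zip(mi, mj))
            strictly_better = any(b > a for a, b in zip(mi, mj))
            if at_least_as_good and strictly_better:
                dominated = True
                break
        if not dominated:
            optimal.append(i)
    return optimal


# Made-up two-objective means for four arms.
example_means = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.1, 0.95)]
print(pareto_set(example_means))  # [0, 1, 3]
```

Arm 2 is dropped because arm 1 is better on both objectives; the other three arms each win on some trade-off.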

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Nouvelle-Aquitaine > Gironde > Bordeaux (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Data Science > Data Mining (0.67)

Algebraic Positional Encodings

Neural Information Processing Systems | Feb-11-2026, 14:53:47 GMT

This design preserves the structural properties of the source domain, thereby ensuring that the end-model upholds them.

experiment, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

LangMark: A Multilingual Dataset for Automatic Post-Editing

Velazquez, Diego, Grace, Mikaela, Karageorgos, Konstantinos, Carin, Lawrence, Schliem, Aaron, Zaikis, Dimitrios, Wechsler, Roger

arXiv.org Artificial Intelligence | Nov-24-2025

Automatic post-editing (APE) aims to correct errors in machine-translated text, enhancing translation quality while reducing the need for human intervention. Despite advances in neural machine translation (NMT), the development of effective APE systems has been hindered by the lack of large-scale multilingual datasets specifically tailored to NMT outputs. To address this gap, we present and release LangMark, a new human-annotated multilingual APE dataset for translation from English into seven languages: Brazilian Portuguese, French, German, Italian, Japanese, Russian, and Spanish. The dataset has 206,983 triplets, with each triplet consisting of a source segment, its NMT output, and a human post-edited translation. Annotated by expert human linguists, our dataset offers both linguistic diversity and scale. Leveraging this dataset, we empirically show that Large Language Models (LLMs) with few-shot prompting can effectively perform APE, improving upon leading commercial and even proprietary machine translation systems. We believe that this new resource will facilitate the future development and evaluation of APE systems.
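The triplet structure described in the abstract could be represented roughly as follows. This is a minimal sketch: the class name `APETriplet`, its field names, the `target_lang` field, and the example sentences are all hypothetical and are not taken from the LangMark release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class APETriplet:
    source: str       # English source segment
    mt_output: str    # NMT system output
    post_edit: str    # human post-edited translation
    target_lang: str  # hypothetical field, e.g. "fr"

    @property
    def was_edited(self) -> bool:
        # True when the linguist changed the machine translation.
        return self.mt_output.strip() != self.post_edit.strip()


def edited_fraction(triplets):
    """Share of triplets whose MT output needed a human correction."""
    if not triplets:
        return 0.0
    return sum(t.was_edited for t in triplets) / len(triplets)


# Invented examples for illustration only.
samples = [
    APETriplet("Save the file.", "Sauvez le fichier.", "Enregistrez le fichier.", "fr"),
    APETriplet("Open settings.", "Ouvrir les paramètres.", "Ouvrir les paramètres.", "fr"),
]
print(edited_fraction(samples))  # 0.5
```

A statistic like `edited_fraction` is one simple way to see how often the NMT output was already acceptable before post-editing.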

large language model, machine learning, translation, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2025.acl-long.1569

arXiv: 2511.17153

Country:

Europe (1.00)
North America > Mexico (0.28)
North America > United States (0.28)
Asia > Middle East > UAE (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Auto-Prompt Ensemble for LLM Judge

Li, Jiajie, Zhang, Huayi, Lin, Peng, Xiong, Jinjun, Xu, Wei

arXiv.org Artificial Intelligence | Oct-9-2025

We present a novel framework that improves the reliability of LLM judges by selectively augmenting the LLM with auxiliary evaluation dimensions. Existing LLM judges often miss crucial evaluation dimensions because they fail to recognize the implicit standards underlying human assessments. To address this challenge, we propose the Auto-Prompt Ensemble (APE), an adaptive framework that automatically learns evaluation dimensions from its failure cases. APE incorporates a confidence-based ensemble mechanism to decide when to adopt the judgments from additional evaluation dimensions through a novel confidence estimation approach called Collective Confidence. Extensive experiments demonstrate that APE improves the reliability of LLM judges across diverse standard benchmarks. For instance, APE raises GPT-4o's agreement rate on Reward Bench from 87.2% to 90.5% in the zero-shot setting. Overall, APE provides a principled approach for LLM judges to leverage test-time computation and bridge the evaluation gap between human and LLM judges.
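The general idea of a confidence-gated ensemble can be sketched as below. This is a generic illustration and not the paper's Collective Confidence estimator, whose details the abstract does not give. The function `gated_verdict`, its parameters, the choice to gate on the base judge's confidence, the 0.8 threshold, and the strict-majority override rule are all assumptions.

```python
from collections import Counter


def gated_verdict(base_verdict, base_confidence, auxiliary_verdicts, threshold=0.8):
    """Keep the base judge's verdict unless it is unsure and the
    auxiliary evaluation dimensions clearly agree on something else.

    Assumptions: confidence is a number in [0, 1]; verdicts are labels
    such as "A" or "B" for a pairwise comparison.
    """
    if base_confidence >= threshold or not auxiliary_verdicts:
        return base_verdict
    top_label, count = Counter(auxiliary_verdicts).most_common(1)[0]
    # Override only on a strict majority among the auxiliary dimensions.
    if count * 2 > len(auxiliary_verdicts):
        return top_label
    return base_verdict


print(gated_verdict("A", 0.95, ["B", "B", "B"]))  # A (base judge is confident)
print(gated_verdict("A", 0.40, ["B", "B", "A"]))  # B (unsure, auxiliaries agree)
```

Gating on confidence means the extra evaluation dimensions only spend test-time computation, and only change the outcome, on cases where the base judgment is shaky.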

dimension, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

arXiv: 2510.06538

Country: Asia (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Language Modeling with Learned Meta-Tokens

Shah, Alok N., Gupta, Khush, Ramji, Keshav, Chaudhari, Pratik

arXiv.org Artificial Intelligence | Sep-23-2025

While modern Transformer-based language models (LMs) have achieved major success in multi-task generalization, they often struggle to capture long-range dependencies within their context window. This work introduces a novel approach using meta-tokens, special tokens injected during pre-training, along with a dedicated meta-attention mechanism to guide LMs to use these tokens. We pre-train a language model with a modified GPT-2 architecture equipped with meta-attention in addition to causal multi-head attention, and study the impact of these tokens on a suite of synthetic tasks. We find that data-efficient language model pre-training on fewer than 100B tokens utilizing meta-tokens and our meta-attention mechanism achieves strong performance on these tasks after fine-tuning. We suggest that these gains arise due to the meta-tokens sharpening the positional encoding. This enables them to operate as trainable, content-based landmarks, implicitly compressing preceding context and "caching" it in the meta-token. At inference time, the meta-token points to relevant context, facilitating length generalization up to 2× its context window, even after extension with YaRN. We provide further evidence of these behaviors by visualizing model internals to study the residual stream, and assessing the compression quality by information-theoretic analysis on the rate-distortion tradeoff. Our findings suggest that pre-training LMs with meta-tokens offers a simple, data-efficient method to enhance long-context language modeling performance, while introducing new insights into the nature of their behavior towards length generalization.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

arXiv: 2509.16278

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

RadiomicsRetrieval: A Customizable Framework for Medical Image Retrieval Using Radiomics Features

Na, Inye, Rue, Nejung, Chung, Jiwon, Park, Hyunjin

arXiv.org Artificial Intelligence | Jul-14-2025

Medical image retrieval is a valuable field for supporting clinical decision-making, yet current methods primarily support 2D images and require fully annotated queries, limiting clinical flexibility. To address this, we propose RadiomicsRetrieval, a 3D content-based retrieval framework bridging handcrafted radiomics descriptors with deep learning-based embeddings at the tumor level. Unlike existing 2D approaches, RadiomicsRetrieval fully exploits volumetric data to leverage richer spatial context in medical images. We employ a promptable segmentation model (e.g., SAM) to derive tumor-specific image embeddings, which are aligned with radiomics features extracted from the same tumor via contrastive learning. These representations are further enriched by anatomical positional embedding (APE). As a result, RadiomicsRetrieval enables flexible querying based on shape, location, or partial feature sets. Extensive experiments on both lung CT and brain MRI public datasets demonstrate that radiomics features significantly enhance retrieval specificity, while APE provides global anatomical context essential for location-based searches. Notably, our framework requires only minimal user prompts (e.g., a single point), minimizing segmentation overhead and supporting diverse clinical scenarios. The capability to query using either image embeddings or selected radiomics attributes highlights its adaptability, potentially benefiting diagnosis, treatment planning, and research on large-scale medical imaging repositories.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

arXiv: 2507.08546

Country: Asia > South Korea (0.14)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

APE: Selective Fine-tuning with Acceptance Criteria for Language Model Adaptation

Marín, Javier

arXiv.org Artificial Intelligence | Jun-10-2025

Adapting large pre-trained language models to specific tasks requires balancing performance improvement with preservation of learned capabilities. Standard fine-tuning approaches optimize a single objective function through gradient descent, often leading to catastrophic forgetting [16] or instability in learned representations. Parameter-efficient methods like LoRA [11] constrain modifications to low-dimensional subspaces but limit adaptation scope. We propose Adjacent Possible Exploration (APE), a selective fine-tuning approach that explores multiple parameter modification directions while implementing acceptance criteria to maintain model stability. The method draws conceptual inspiration from evolutionary optimization principles, particularly the biological constraint that viable changes must preserve essential system properties while enabling incremental improvement. APE operates by generating multiple candidate parameter updates through fine-tuning on randomly sampled data subsets, then selecting only those updates that exceed a performance improvement threshold. This creates a filtered optimization process that systematically explores beneficial parameter modifications while rejecting changes that fall within noise levels or potentially destabilize learned representations. Our key contributions include: (1) A practical algorithm for selective fine-tuning that balances exploration and stability, (2) Empirical validation showing superior performance compared to standard adaptation methods, and (3) Analysis of why selective acceptance of parameter modifications leads to more robust model adaptation. The approach demonstrates that systematic exploration of parameter space through filtered selection can achieve better adaptation results than unconstrained optimization, providing a principled framework for controlled model modification that maintains stability while enabling significant performance improvements.
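The accept-or-reject loop described in the abstract can be sketched on a toy problem. The example below is a minimal illustration, assuming a one-parameter linear model fitted by gradient descent in place of a language model; the function names, learning rate, subset size, and threshold are hypothetical and are not the paper's settings.

```python
import random


def loss(w, data):
    """Mean squared error of the toy model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def fine_tune(w, subset, lr=0.05, steps=5):
    """A few gradient steps on one sampled subset (stand-in for fine-tuning)."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in subset) / len(subset)
        w -= lr * grad
    return w


def selective_fine_tune(w, train, val, rounds=20, subset_size=8,
                        threshold=1e-3, seed=0):
    rng = random.Random(seed)
    best = loss(w, val)
    accepted = 0
    for _ in range(rounds):
        subset = rng.sample(train, subset_size)   # random data subset
        candidate = fine_tune(w, subset)          # candidate update
        candidate_loss = loss(candidate, val)
        # Acceptance criterion: keep the update only if validation loss
        # improves by more than the threshold; otherwise discard it.
        if best - candidate_loss > threshold:
            w, best = candidate, candidate_loss
            accepted += 1
    return w, best, accepted


# Toy data generated from y = 3x, so the ideal parameter is 3.
train = [(i / 10, 3 * i / 10) for i in range(1, 21)]
val = [(0.25, 0.75), (0.75, 2.25), (1.5, 4.5)]
w, final_loss, accepted = selective_fine_tune(0.0, train, val)
print(w, final_loss, accepted)
```

Because rejected candidates are thrown away, the validation loss of the kept parameters can never increase from one round to the next, which is the stability property the acceptance criterion is meant to provide.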

machine learning, modification, natural language, (18 more...)

arXiv.org Artificial Intelligence

arXiv: 2505.19912

Country: Europe > United Kingdom > England (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding

Yang, Xinyu, Chen, Tianqi, Chen, Beidi

arXiv.org Artificial Intelligence | Feb-12-2025

Recent advances in context-augmented generation (CAG) techniques, particularly retrieval-augmented generation (RAG) (Gupta et al., 2024; Gao et al., 2023) and in-context learning (ICL) (Dong et al., 2022; Wei et al., 2022), have been widely adopted in large language models (LLMs) (Dubey et al., 2024; Achiam et al., 2023), improving their ability to generalize to unseen tasks with contextual information, as demonstrated in Figure 1 (top). These techniques employ a sequential encoding process to ground LLM inputs with knowledge from external sources: concatenating the retrieved texts into one sequence, and encoding the sequence into key-value (KV) states as the context for subsequent queries. While this new, significantly longer input improves performance, the increased latency in context prefilling becomes a bottleneck in tasks that require long inputs but generate short outputs (Bai et al., 2023; Agarwal et al., 2024; Jiang et al., 2024b). For example, prefilling a 128K context takes 17 seconds, whereas generating 256 tokens requires only 6 seconds. This discrepancy leaves significant room to improve the practical efficiency of CAG systems in real-world deployments (Liu, 2022; Chase, 2022).
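The latency figures quoted above can be turned into a quick back-of-the-envelope calculation. The sketch below uses only the 17-second prefill and 6-second generation numbers from the text; the helper names and the 2x prefill speedup are hypothetical and do not describe a measured result.

```python
def prefill_share(prefill_s, decode_s):
    """Fraction of end-to-end latency spent on context prefilling."""
    return prefill_s / (prefill_s + decode_s)


def end_to_end_speedup(prefill_s, decode_s, prefill_speedup):
    """Overall speedup if only the prefill stage gets faster."""
    before = prefill_s + decode_s
    after = prefill_s / prefill_speedup + decode_s
    return before / after


# Figures from the text: 17 s to prefill, 6 s to generate 256 tokens.
print(round(prefill_share(17, 6), 3))          # 0.739
print(round(end_to_end_speedup(17, 6, 2), 3))  # 1.586
```

With these numbers, prefilling accounts for roughly 74% of the total time, so speeding up that stage alone has a large effect on end-to-end latency.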

arxiv preprint arxiv, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

arXiv: 2502.05431

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
