AITopics | supervisor

Collaborating Authors

supervisor

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reverse Engineering Human Preferences with Reinforcement Learning

Neural Information Processing SystemsJun-19-2026, 20:43:57 GMT

The capabilities of Large Language Models (LLMs) are routinely evaluated by other LLMs trained to predict human preferences. This framework--known as LLM-as-a-judge--is highly scalable and relatively low cost. However, it is also vulnerable to malicious exploitation, as LLM responses can be tuned to overfit the preferences of the judge. Previous work shows that the answers generated by a candidate-LLM can be edited post hoc to maximise the score assigned to them by a judge-LLM. In this study, we adopt a different approach and use the signal provided by judge-LLMs as a reward to adversarially tune models that generate text preambles designed to boost downstream performance.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.68)
Asia (0.68)
Europe (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)

Add feedback

Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective

Neural Information Processing SystemsJun-14-2026, 05:49:18 GMT

Existing machine-generated text (MGT) detection methods implicitly assume labels as the golden standard. However, we reveal boundary ambiguity in MGT detection, implying that traditional training paradigms are inexact. Moreover, limitations of human cognition and the superintelligence of detectors make inexact learning widespread and inevitable. To this end, we propose an easy-to-hard enhancement framework to provide reliable supervision under such inexact conditions. Distinct from knowledge distillation, our framework employs an easy supervisor targeting relatively simple longer-text detection tasks (despite weaker capabilities), to enhance the more challenging target detector. Firstly, longer texts targeted by supervisors theoretically alleviate the impact of inexact labels, laying the foundation for reliable supervision. Secondly, by structurally incorporating the detector into the supervisor, we theoretically model the supervisor as a lower performance bound for the detector.

artificial intelligence, name change, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.57)

Add feedback

Teachable Reinforcement Learning via Advice Distillation

Neural Information Processing SystemsApr-25-2026, 11:38:53 GMT

Training automated agents to complete complex tasks in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human expert, and learning from intermediate forms of supervision (like binary preferences) is time-consuming and extracts little information from each human intervention. Can we overcome these challenges by building agents that learn from rich, interactive feedback instead? We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher. We begin by formalizing a class of human-in-the-loop decision making problems in which multiple forms of teacher-provided advice are available to a learner. We then describe a simple learning algorithm for these problems that first learns to interpret advice, then learns from advice to complete tasks even in the absence of human supervision. In puzzle-solving, navigation, and locomotion domains, we show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms and often less than imitation learning.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Industry: Education > Educational Setting > Online (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Studying multiplicity: an interview with Prakhar Ganesh

AIHubMar-12-2026, 23:37:34 GMT

In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. We sat down with Prakhar Ganesh to learn about his work on responsible AI, which is focussed on the concept of multiplicity. We found out more about some of the projects he's been involved in, his future plans, and how he got into the field. Could you start with a quick introduction to yourself, where you're studying, and the broad topic of your research? My name is Prakhar Ganesh. I'm also affiliated with Mila, which is a research institute in Montreal. My supervisor is Professor Golnoosh Farnadi.

artificial intelligence, multiplicity, natural language, (18 more...)

AIHub

Country:

North America > Canada > Quebec > Montreal (0.35)
Asia > Singapore (0.05)
Asia > India (0.05)

Genre: Personal > Interview (1.00)

Industry: Education > Educational Setting (0.47)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.90)
Information Technology > Artificial Intelligence > Natural Language (0.69)
Information Technology > Communications > Social Media (0.69)

Add feedback

Coding for underwater robotics

RobohubMar-12-2026, 23:31:58 GMT

During a summer internship at MIT Lincoln Laboratory, Ivy Mahncke, an undergraduate student of robotics engineering at Olin College of Engineering, took a hands-on approach to testing algorithms for underwater navigation. She first discovered her love for working with underwater robotics as an intern at the Woods Hole Oceanographic Institution in 2024. Drawn by the chance to tackle new problems and cutting-edge algorithm development, Mahncke began an internship with Lincoln Laboratory's Advanced Undersea Systems and Technology Group in 2025. Mahncke spent the summer developing and troubleshooting an algorithm that would help a human diver and robotic vehicle collaboratively navigate underwater. The lack of traditional localization aids -- such as the Global Positioning System, or GPS -- in an underwater environment posed challenges for navigation that Mahncke and her mentors sought to overcome.

artificial intelligence, lincoln laboratory, mahncke, (13 more...)

Robohub

Country: Atlantic Ocean (0.05)

Industry: Health & Medicine > Therapeutic Area (0.57)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.51)

Add feedback

Jeffrey Epstein's Ties to CBP Agents Sparked a DOJ Probe

WIREDFeb-20-2026, 03:29:17 GMT

Documents say customs officers in the US Virgin Islands had friendly relationships with Epstein years after his 2008 conviction, showing how the infamous sex offender tried to cultivate allies. United States prosecutors and federal law enforcement spent over a year examining ties between Jeffrey Epstein and Customs and Border Protection officers stationed in the US Virgin Islands (USVI), according to documents recently released by the Department of Justice. As The Guardian and New York Times have reported, emails, text messages, and investigative records show that Epstein cultivated friendships with several officers, entertaining them on his island and offering to take them for whale-watching trips in his helicopter. He even brought one cannolis for Christmas Eve. In turn, Epstein would bring certain officers his complaints about his treatment at the hands of other CBP and federal agents.

artificial intelligence, epstein, routch, (15 more...)

WIRED

Country:

North America > United States > Minnesota (0.15)
North America > United States > California (0.14)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Immigration & Customs (1.00)

Technology:

Information Technology > Artificial Intelligence (0.47)
Information Technology > Communications > Mobile (0.35)

Add feedback

Comparing AI Agents to Cybersecurity Professionals in Real-World Penetration Testing

Lin, Justin W., Jones, Eliot Krzysztof, Jasper, Donovan Julian, Ho, Ethan Jun-shen, Wu, Anna, Yang, Arnold Tianyi, Perry, Neil, Zou, Andy, Fredrikson, Matt, Kolter, J. Zico, Liang, Percy, Boneh, Dan, Ho, Daniel E.

arXiv.org Artificial IntelligenceDec-11-2025

We present the first comprehensive evaluation of AI agents against human cybersecurity professionals in a live enterprise environment. We evaluate ten cybersecurity professionals alongside six existing AI agents and ARTEMIS, our new agent scaffold, on a large university network consisting of ~8,000 hosts across 12 subnets. ARTEMIS is a multi-agent framework featuring dynamic prompt generation, arbitrary sub-agents, and automatic vulnerability triaging. In our comparative study, ARTEMIS placed second overall, discovering 9 valid vulnerabilities with an 82% valid submission rate and outperforming 9 of 10 human participants. While existing scaffolds such as Codex and CyAgent underperformed relative to most human participants, ARTEMIS demonstrated technical sophistication and submission quality comparable to the strongest participants. We observe that AI agents offer advantages in systematic enumeration, parallel exploitation, and cost -- certain ARTEMIS variants cost $18/hour versus $60/hour for professional penetration testers. We also identify key capability gaps: AI agents exhibit higher false-positive rates and struggle with GUI-based tasks.

artificial intelligence, machine learning, vulnerability, (15 more...)

arXiv.org Artificial Intelligence

2512.09882

Country: North America > United States (0.67)

Genre: Research Report (0.84)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Add feedback

FPC-VLA: A Vision-Language-Action Framework with a Supervisor for Failure Prediction and Correction

Yang, Yifan, Duan, Zhixiang, Xie, Tianshi, Cao, Fuyu, Shen, Pinxi, Song, Peili, Jin, Piaopiao, Sun, Guokang, Xu, Shaoqing, You, Yangwei, Liu, Jingtai

arXiv.org Artificial IntelligenceDec-4-2025

Robotic manipulation is a fundamental component of automation. However, traditional perception-planning pipelines often fall short in open-ended tasks due to limited flexibility, while the architecture of a single end-to-end Vision-Language-Action (VLA) offers promising capabilities but lacks crucial mechanisms for anticipating and recovering from failure. To address these challenges, we propose FPC-VLA, a dual-model framework that integrates VLA with a supervisor for failure prediction and correction. The supervisor evaluates action viability through vision-language queries and generates corrective strategies when risks arise, trained efficiently without manual labeling. A dual-stream fusion module further refines actions by leveraging past predictions. Evaluation results on multiple simulation platforms (SIMPLER and LIBERO) and robot embodiments (WidowX, Google Robot, Franka) show that FPC-VLA outperforms state-of-the-art models in both zero-shot and fine-tuned settings. Successful real-world deployments on diverse, long-horizon tasks confirm FPC-VLA's strong generalization and practical utility for building more reliable autonomous systems.

artificial intelligence, large language model, natural language, (12 more...)

arXiv.org Artificial Intelligence

2509.04018

Country: Asia > China (0.46)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

CLIPPan: Adapting CLIP as A Supervisor for Unsupervised Pansharpening

Jian, Lihua, Liu, Jiabo, Wu, Shaowu, Chen, Lihui

arXiv.org Artificial IntelligenceNov-17-2025

Despite remarkable advancements in supervised pansharpening neural networks, these methods face domain adaptation challenges of resolution due to the intrinsic disparity between simulated reduced-resolution training data and real-world full-resolution scenarios.To bridge this gap, we propose an unsupervised pansharpening framework, CLIPPan, that enables model training at full resolution directly by taking CLIP, a visual-language model, as a supervisor. However, directly applying CLIP to supervise pansharpening remains challenging due to its inherent bias toward natural images and limited understanding of pansharpening tasks. Therefore, we first introduce a lightweight fine-tuning pipeline that adapts CLIP to recognize low-resolution multispectral, panchromatic, and high-resolution multispectral images, as well as to understand the pansharpening process. Then, building on the adapted CLIP, we formulate a novel \textit{loss integrating semantic language constraints}, which aligns image-level fusion transitions with protocol-aligned textual prompts (e.g., Wald's or Khan's descriptions), thus enabling CLIPPan to use language as a powerful supervisory signal and guide fusion learning without ground truth. Extensive experiments demonstrate that CLIPPan consistently improves spectral and spatial fidelity across various pansharpening backbones on real-world datasets, setting a new state of the art for unsupervised full-resolution pansharpening.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.10896

Country: Asia > China (0.47)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

ProST: Progressive Sub-task Training for Pareto-Optimal Multi-agent Systems Using Small Language Models

Bijoy, Biddut Sarker, Hasan, Mohammad Saqib, Alipoormolabashi, Pegah, Sil, Avirup, Balasubramanian, Aruna, Balasubramanian, Niranjan

arXiv.org Artificial IntelligenceNov-12-2025

Multi-agent systems with smaller language models (SLMs) present a viable alternative to single agent systems powered by large language models (LLMs) for addressing complex problems. In this work, we study how these alternatives compare in terms of both effectiveness and efficiency. To study this trade-off, we instantiate single and multi-agent systems for the complex problems in the AppWorld environment using different sized language models. We find that difficulties with long-trajectory learning in smaller language models (SLMs) limit their performance. Even when trained for specialized roles, SLMs fail to learn all subtasks effectively. To address this issue, we introduce a simple progressive sub-task training strategy, which introduces new sub-tasks progressively in each training epoch. We find that this novel strategy, analogous to instance level curriculum learning, consistently improves the effectiveness of multi-agents at all configurations. Our Pareto analysis shows that fine-tuned multi-agent systems yield better effectiveness-efficiency trade-offs. Additional ablations and analyses shows the importance of our progressive training strategy and its ability to reduce subtask error rates.

artificial intelligence, subtask, trajectory, (16 more...)

arXiv.org Artificial Intelligence

2509.04508

Country:

North America > United States (0.67)
Asia (0.46)

Genre: Workflow (1.00)

Industry: Information Technology (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback