AITopics | multiple-choice question

multiple-choice question

VMDT: Decoding the Trustworthiness of Video Foundation Models

Neural Information Processing Systems, Jun-23-2026, 01:31:42 GMT

As foundation models become more sophisticated, ensuring their trustworthiness becomes increasingly critical; yet, unlike text and image, the video modality still lacks comprehensive trustworthiness benchmarks. We introduce VMDT (VideoModal DecodingTrust), the first unified platform for evaluating text-to-video (T2V) and video-to-text (V2T) models across five key trustworthiness dimensions: safety, hallucination, fairness, privacy, and adversarial robustness. Through our extensive evaluation of 7 T2V models and 19 V2T models using VMDT, we uncover several significant insights. For instance, all open-source T2V models evaluated fail to recognize harmful queries and often generate harmful videos, while exhibiting higher levels of unfairness compared to image modality models. In V2T models, unfairness and privacy risks rise with scale, whereas hallucination and adversarial robustness improve, though overall performance remains low. Uniquely, safety shows no correlation with model size, implying that factors other than scale govern current safety levels. Our findings highlight the urgent need for developing more robust and trustworthy video foundation models, and VMDT provides a systematic framework for measuring and tracking progress toward this goal.

large language model, machine learning, natural language, (19 more...)

Country: North America > United States > California (0.45)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.65)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(5 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Results on FAVOR Bench

Neural Information Processing Systems, Jun-22-2026, 20:58:18 GMT

Prompt Template: Generating QA Pairs for Camera Motion (CM) Task

You are a professional question designer focusing on temporal dynamics in videos, including camera movements, motions, activities, and interactions, rather than static content. You will receive detailed annotations about the temporal details of the entire video, with duration markers in parentheses after "camera_motion" and "motion_list". Based on these annotations, design 3 multiple-choice questions around the "Camera Motion" theme to test models' fine-grained video motion understanding, particularly: understanding camera movement direction and focus changes in the video. Additionally, follow these question design guidelines:

1. If a video's "camera_motion" has only one element, such as "camera_motion": "static", or "camera_motion": "camera shaking (0-22)", skip this video and don't generate any content.

large language model, machine learning, natural language, (23 more...)

Genre: Questionnaire & Opinion Survey (0.36)

Industry:

Media > Television (1.00)
Media > Photography (1.00)
Media > Film (1.00)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Vision (0.94)
(2 more...)

Appendix A Task Definitions

Neural Information Processing Systems, Jun-18-2026, 07:47:26 GMT

Table 3 outlines the and reasoning tasks included in the MMPerspective benchmark. Sample cases and representative questions are included to illustrate the task format and input style. We also show examples of perspective-invariant image operations for robustness evaluation in Figure 17, including cropping, masking, flipping, and rotation.

Sample question: Where is the vanishing point in this image?

Critical Line Perception (CLP), Figure 9: Determine which of the highlighted lines is the horizon line. Sample question: Which line highlighted in the image is the Horizon Line?

large language model, machine learning, natural language, (19 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving

Neural Information Processing Systems, Jun-14-2026, 05:28:13 GMT

We introduce STSBench, a scenario-based framework to benchmark the holistic understanding of vision-language models (VLMs) for autonomous driving. The framework automatically mines predefined traffic scenarios from any dataset using ground-truth annotations, provides an intuitive user interface for efficient human verification, and generates multiple-choice questions for model evaluation. Applied to the nuScenes dataset, we present STSnu, the first benchmark that evaluates the spatio-temporal reasoning capabilities of VLMs based on comprehensive 3D perception. Existing benchmarks typically target off-the-shelf or fine-tuned VLMs for images or videos from a single viewpoint, focusing on semantic tasks such as object recognition, dense captioning, risk assessment, or scene understanding. In contrast, STSnu evaluates driving expert VLMs for end-to-end driving, operating on videos from multi-view cameras or LiDAR. It specifically assesses their ability to reason about both ego-vehicle actions and complex interactions among traffic participants, a crucial capability for autonomous vehicles.

artificial intelligence, name change, proceedings, (10 more...)

Industry: Information Technology (0.65)

Technology:

Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.65)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.62)

My Son's Math Homework Is Essentially Just Pokémon

The Atlantic - Technology, May-16-2026, 11:30:00 GMT

Education games are taking over American classrooms. One afternoon earlier this year, my 11-year-old son was sitting at his laptop and working quietly on his math homework. At least, that's what he was supposed to be doing. When I glanced at his screen, equations were nowhere to be seen. He was controlling a monster in the midst of battle, casting magic spells to outduel an opposing player.

artificial intelligence, prodigy, student, (11 more...)

Country: North America > United States > California (0.29)

Industry:

Education > Curriculum > Subject-Specific Education (0.96)
Leisure & Entertainment > Games > Computer Games (0.86)
Education > Educational Setting > K-12 Education (0.72)

Technology: Information Technology > Artificial Intelligence > Games (0.41)

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data Only (The Falcon LLM Team)

Neural Information Processing Systems, Apr-30-2026, 09:16:27 GMT

This curation process is believed to be necessary to produce performant models with broad zero-shot generalization abilities. However, as larger models requiring pretraining on trillions of tokens are considered, it is unclear how scalable is curation, and whether we will run out of unique high-quality data soon.

large language model, machine learning, natural language, (19 more...)

Country: