AITopics | Law

Law

Supplementary Material - WikiDO: A New Benchmark Evaluating Cross-Modal Retrieval for Vision-Language Models

Neural Information Processing Systems, Jun-2-2025, 09:37:51 GMT

Q1 For what purpose was the dataset created? Was there a specific task in mind? Q2 Who created the dataset (e.g., which team, research group) and on behalf of which Q3 Who funded the creation of the dataset? Q1 What do the instances that comprise the dataset represent (e.g., documents, photos, Are there multiple types of instances (e.g., movies, users, and ratings; Q2 How many instances are there in total (of each type, if appropriate)? Is the sample representative of the larger set (e.g., geographic coverage)?

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Industry:

Law (0.68)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

VHELM: A Holistic Evaluation of Vision Language Models Chi Heem Wong

Neural Information Processing Systems, Jun-2-2025, 09:33:14 GMT

Current benchmarks for assessing vision-language models (VLMs) often focus on their perception or problem-solving capabilities and neglect other critical aspects such as fairness, multilinguality, or toxicity. Furthermore, they differ in their evaluation procedures and the scope of the evaluation, making it difficult to compare models. To address these issues, we extend the HELM framework to VLMs to present the Holistic Evaluation of Vision Language Models (VHELM). VHELM aggregates various datasets to cover one or more of the 9 aspects: visual perception, knowledge, reasoning, bias, fairness, multilinguality, robustness, toxicity, and safety. In doing so, we produce a comprehensive, multi-dimensional view of the capabilities of the VLMs across these important factors.

benchmark, large language model, machine learning, (22 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Law (0.67)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

case, please provide a description

Neural Information Processing Systems, Jun-2-2025, 09:25:20 GMT

This document is based on Datasheets for Datasets by and edges)? Please see the most updated version The instances of this graph-based dataset comprise here. Link prediction on this dataset is a multi-instance prediction task [3]. For what purpose was the dataset created? Was there a specific task in mind? Was there a specific gap that How many instances are there in total (of each type, needed to be filled?

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.14)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Law (0.94)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.68)
Information Technology > Data Science > Data Mining (0.67)
Information Technology > Information Management (0.67)

TGB 2.0: A Benchmark for Learning on Temporal Knowledge Graphs and Heterogeneous Graphs Julia Gastinger 1,2,6 Shenyang Huang 1,4 Mikhail Galkin 3

Neural Information Processing Systems, Jun-2-2025, 09:23:51 GMT

Multi-relational temporal graphs are powerful tools for modeling real-world data, capturing the evolving and interconnected nature of entities over time. Recently, many novel models have been proposed for ML on such graphs, intensifying the need for robust evaluation and standardized benchmark datasets. However, such resources remain scarce, and evaluation faces added complexity due to reproducibility issues in experimental protocols. To address these challenges, we introduce Temporal Graph Benchmark 2.0 (TGB 2.0), a novel benchmarking framework tailored for evaluating methods for predicting future links on Temporal Knowledge Graphs and Temporal Heterogeneous Graphs with a focus on large-scale datasets, extending the Temporal Graph Benchmark.

artificial intelligence, machine learning, temporal reasoning, (20 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia > China > Liaoning Province > Shenyang (0.40)
North America > Canada > Quebec (0.28)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Government (1.00)
Information Technology > Security & Privacy (0.93)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Temporal Reasoning (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs

Neural Information Processing Systems, Jun-2-2025, 09:13:12 GMT

Research in auditory, visual, and audiovisual speech recognition (ASR, VSR, and AVSR, respectively) has traditionally been conducted independently. Even recent self-supervised studies addressing two or all three tasks simultaneously tend to yield separate models, leading to disjoint inference pipelines with increased memory requirements and redundancies. This paper proposes unified training strategies for these systems. We demonstrate that training a single model for all three tasks enhances VSR and AVSR performance, overcoming typical optimisation challenges when training from scratch. Moreover, we introduce a greedy pseudo-labelling approach to more effectively leverage unlabelled samples, addressing shortcomings in related self-supervised methods. Finally, we develop a self-supervised pretraining method within our framework, proving its effectiveness alongside our semi-supervised approach. Despite using a single model for all tasks, our unified approach achieves state-of-the-art performance compared to recent methods on LRS3 and LRS2 for ASR, VSR, and AVSR, as well as on the newly released WildVSR dataset. Code and models are available at https://github.com/

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > Netherlands (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Education (0.68)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Boyi Wei 1 Yangsibo Huang 1 Noah A. Smith

Neural Information Processing Systems, Jun-2-2025, 08:58:11 GMT

These models can memorize and generate content similar to their training data, posing potential concerns. Therefore, model creators are motivated to develop mitigation methods that prevent generating protected content.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.67)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.93)
Leisure & Entertainment > Sports > Baseball (0.93)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition

Neural Information Processing Systems, Jun-2-2025, 08:27:26 GMT

Our world is full of varied actions and moves across specialized domains that we, as humans, strive to identify and understand. Within any single domain, actions can often appear quite similar, making it challenging for deep models to distinguish them accurately. To evaluate the effectiveness of multimodal foundation models in helping us recognize such actions, we present ActionAtlas v1.0, a multiple-choice video question-answering benchmark featuring short videos across various sports. Each video in the dataset is paired with a question and four or five choices. The question pinpoints specific individuals, asking which choice "best" describes their action within a certain temporal context.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Sports > Tennis (1.00)
Information Technology (1.00)
Government (1.00)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

The Fragility of Fairness: Causal Sensitivity Analysis for Fair Machine Learning (Department of Statistics, University of Oxford)

Neural Information Processing Systems, Jun-2-2025, 08:22:22 GMT

Fairness metrics are a core tool in the fair machine learning literature (FairML), used to determine that ML models are, in some sense, "fair." Real-world data, however, are typically plagued by various measurement biases and other violated assumptions, which can render fairness assessments meaningless. We adapt tools from causal sensitivity analysis to the FairML context, providing a general framework which (1) accommodates effectively any combination of fairness metric and bias that can be posed in the "oblivious setting"; (2) allows researchers to investigate combinations of biases, resulting in non-linear sensitivity; and (3) enables flexible encoding of domain-specific constraints and assumptions. Employing this framework, we analyze the sensitivity of the most common parity metrics under 3 varieties of classifier across 14 canonical fairness datasets. Our analysis reveals the striking fragility of fairness assessments to even minor dataset biases. We show that causal sensitivity analysis provides a powerful and necessary toolkit for gauging the informativeness of parity metric evaluations. Our repository is available here.
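As a rough illustration of what a parity metric measures and why it can be fragile, the sketch below computes a demographic parity difference for binary predictions over two groups. This is not the paper's framework: the function names, the 0/1 group encoding, and the toy data are assumptions made for illustration only.

```python
from typing import Sequence


def positive_rate(preds: Sequence[int]) -> float:
    """Fraction of predictions equal to 1."""
    return sum(preds) / len(preds)


def demographic_parity_difference(
    preds: Sequence[int], groups: Sequence[int]
) -> float:
    """Absolute gap in positive-prediction rate between group 0 and group 1.

    Hypothetical helper: assumes binary predictions and a binary
    group attribute encoded as 0 or 1.
    """
    g0 = [p for p, g in zip(preds, groups) if g == 0]
    g1 = [p for p, g in zip(preds, groups) if g == 1]
    if not g0 or not g1:
        raise ValueError("both groups must contain at least one instance")
    return abs(positive_rate(g0) - positive_rate(g1))


# Toy data (made up): four instances per group.
groups = [0, 0, 0, 0, 1, 1, 1, 1]
preds = [1, 0, 1, 1, 0, 0, 1, 0]
gap = demographic_parity_difference(preds, groups)

# Changing a single prediction in group 1 shifts the metric noticeably.
perturbed = [1, 0, 1, 1, 1, 0, 1, 0]
gap_perturbed = demographic_parity_difference(perturbed, groups)
```

On small groups, a single changed record moves the gap by a large amount, which is the kind of sensitivity the abstract refers to in general terms.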

artificial intelligence, dataset, machine learning, (15 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.40)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Banking & Finance > Credit (0.67)
Education > Curriculum > Subject-Specific Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Aligning LLM Agents by Learning Latent Preference from User Edits

Neural Information Processing Systems, Jun-2-2025, 08:17:13 GMT

We study interactive learning of LLM-based language agents based on user edits made to the agent's output. In a typical setting such as writing assistants, the user interacts with a language agent to generate a response given a context, and may optionally edit the agent's response to personalize it based on their latent preference, in addition to improving its correctness. The edit feedback is naturally generated, making it a suitable candidate for improving the agent's alignment with the user's preference, and for reducing the cost of user edits over time. We propose a learning framework, PRELUDE, that infers a description of the user's latent preference based on historic edit data. The inferred user preference descriptions are used to define prompts for generating responses in the future.
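One way to make "the cost of user edits" concrete is a token-level Levenshtein distance between the agent's response and the user's edited version, summed over interaction rounds. The abstract does not state which cost measure is used, so treat the metric choice, the whitespace tokenization, and the function names below as assumptions in a minimal sketch.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(
                min(
                    prev[j] + 1,             # delete x
                    curr[j - 1] + 1,         # insert y
                    prev[j - 1] + (x != y),  # substitute, or match at no cost
                )
            )
        prev = curr
    return prev[-1]


def cumulative_edit_cost(rounds: Sequence[tuple]) -> int:
    """Sum token-level edit distances over (agent_response, user_edit) pairs.

    Hypothetical helper: tokens are obtained by whitespace splitting.
    """
    return sum(edit_distance(r.split(), e.split()) for r, e in rounds)


# Made-up interaction history: the second round needs no edit.
history = [
    ("the cat sat on the mat", "the dog sat on the mat"),
    ("thanks for your email", "thanks for your email"),
]
total_cost = cumulative_edit_cost(history)
```

Under this measure, an agent that has picked up the user's preference would show a falling per-round cost as the history grows.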

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Media (0.68)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)
Education (0.66)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Prompt-Agnostic Adversarial Perturbation for Customized Diffusion Models

Neural Information Processing Systems, Jun-2-2025, 08:11:36 GMT

Diffusion models have revolutionized customized text-to-image generation, allowing for efficient synthesis of photos from personal data with textual descriptions. However, these advancements bring risks, including privacy breaches and unauthorized replication of artworks. Previous research primarily centers on using "prompt-specific methods" to generate adversarial examples to protect personal images, yet the effectiveness of existing methods is hindered by limited adaptability to different prompts. In this paper, we introduce a Prompt-Agnostic Adversarial Perturbation (PAP) method for customized diffusion models. PAP first models the prompt distribution using a Laplace Approximation, and then produces prompt-agnostic perturbations by maximizing a disturbance expectation based on the modeled distribution. This approach effectively tackles prompt-agnostic attacks, leading to improved defense stability. Extensive experiments in face privacy and artistic style protection demonstrate the superior generalization of PAP in comparison to existing techniques. Our code will be available at https://github.com/vancyland/PAP.

artificial intelligence, arxiv preprint arxiv, machine learning, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: