
 Law



Supplementary Material - WikiDO: A New Benchmark Evaluating Cross-Modal Retrieval for Vision-Language Models. A Datasheet for WikiDO dataset. A.1 Motivation

Neural Information Processing Systems

Q1 For what purpose was the dataset created?
Q2 Who created the dataset (e.g., which team, research group) and on behalf of which
Q3 Who funded the creation of the dataset?
Q1 What do the instances that comprise the dataset represent (e.g., documents, photos,
Are there multiple types of instances (e.g., movies, users, and ratings;
Is the sample representative of the larger set (e.g., geographic coverage)?
Q4 What data does each instance consist of? In either case, please provide a description.


VHELM: A Holistic Evaluation of Vision Language Models. Tony Lee, Haoqin Tu, Chi Heem Wong

Neural Information Processing Systems

Our framework is designed to be lightweight and automatic so that evaluation runs are cheap and fast. Our initial run evaluates 22 VLMs on 21 existing datasets to provide a holistic snapshot of the models. We uncover new key findings, such as the fact that efficiency-focused models (e.g., Claude 3 Haiku or Gemini 1.5 Flash) perform significantly




California's landmark frontier AI law to bring transparency

Al Jazeera

Late last month, California became the first state in the United States to pass a law to regulate cutting-edge AI technologies. Now experts are divided over its impact. They agree that the law, the Transparency in Frontier Artificial Intelligence Act, is a modest step forward, but it is still far from actual regulation. It mandates reporting of incidents such as large-scale cyber-attacks, deaths of 50 or more people, large monetary losses and other safety-related events caused by AI models. It also puts in place whistleblower protections.


Probing Social Bias in Labor Market Text Generation by ChatGPT: A Masked Language Model Approach. Lei Ding

Neural Information Processing Systems

The complexity of automating bias evaluation in textual content poses significant challenges. Traditional approaches in social sciences, such as content analysis, often rely on manual word counts from static lists [Gaucher et al., 2011], which may miss the subtleties and unlisted language cues that advanced NLP technologies can detect.
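
To illustrate the static-list word counting described above, here is a minimal sketch in Python. The two word lists and the sample job ad are hypothetical and invented for illustration. They are not the lists published by Gaucher et al. (2011), and the function name `count_listed_words` is an assumption, not part of any cited tool.

```python
import re
from collections import Counter

# Hypothetical static word lists, for illustration only; these are NOT the
# lists published by Gaucher et al. (2011).
MASCULINE_CODED = {"competitive", "dominant", "assertive"}
FEMININE_CODED = {"supportive", "collaborative", "nurturing"}


def count_listed_words(text: str, word_list: set[str]) -> Counter:
    """Count exact, case-insensitive matches of listed words in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(token for token in tokens if token in word_list)


# Hypothetical job-ad sentence used as sample input.
ad = "We seek a competitive, assertive leader who thrives in a driven, cutthroat team."
masculine_hits = count_listed_words(ad, MASCULINE_CODED)
feminine_hits = count_listed_words(ad, FEMININE_CODED)
print(dict(masculine_hits))
print(dict(feminine_hits))
```

In this sketch, words such as "driven" and "cutthroat" go uncounted because they are not on the list, which is the kind of unlisted cue the passage says a static-list count may miss.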