CARE-MI: Chinese Benchmark for Misinformation Evaluation in Maternity and Infant Care

Neural Information Processing Systems

Recent advances in natural language processing (NLP) have led to a new trend of applying large language models (LLMs) to real-world scenarios. While the latest LLMs are astonishingly fluent when interacting with humans, they suffer from the misinformation problem by unintentionally generating factually false statements. This can lead to harmful consequences, especially when produced within sensitive contexts, such as healthcare. Yet few previous works have focused on evaluating misinformation in the long-form (LF) generation of LLMs, especially for knowledge-intensive topics. Moreover, although LLMs have been shown to perform well in different languages, misinformation evaluation has been mostly conducted in English.


Japan to quadruple spending support for chips and AI in budget

The Japan Times

A prototype of a Rapidus 300mm wafer displayed at the Semicon Japan exhibition. The industry ministry has earmarked ¥150 billion for state-backed chip venture Rapidus, bringing the cumulative government investment in the venture to ¥250 billion. The industry ministry is set to nearly quadruple its budgeted support for cutting-edge semiconductors and artificial-intelligence development to about ¥1.23 trillion ($7.9 billion) for the fiscal year starting in April. Overall the Ministry of Economy, Trade and Industry's budget rose by about 50% from the previous year to ¥3.07 trillion, largely due to the jump in chips and AI spending. After Prime Minister Sanae Takaichi's Cabinet signed off on it Friday, the government's initial budget plan will be debated in parliament in the new year.


CNN^{2}: Viewpoint Generalization via a Binocular Vision

Neural Information Processing Systems

Convolutional Neural Networks (CNNs) have laid the foundation for many techniques in various applications. Despite achieving remarkable performance in some tasks, the 3D viewpoint generalizability of CNNs is still far behind human visual capabilities. Although recent efforts, such as the Capsule Networks, have been made to address this issue, these new models are either hard to train or incompatible with existing CNN-based techniques specialized for different applications. Observing that humans use binocular vision to understand the world, we study in this paper whether the 3D viewpoint generalizability of CNNs can be achieved via a binocular vision. We propose CNN^{2}, a CNN that takes two images as input, which resembles the process of an object being viewed from the left eye and the right eye. CNN^{2} uses novel augmentation, pooling, and convolutional layers to learn a sense of three-dimensionality in a recursive manner. Empirical evaluation shows that CNN^{2} has improved viewpoint generalizability compared to vanilla CNNs. Furthermore, CNN^{2} is easy to implement and train, and is compatible with existing CNN-based specialized techniques for different applications.


Breaking Bad: A Dataset for Geometric Fracture and Reassembly

Neural Information Processing Systems

We introduce Breaking Bad, a large-scale dataset of fractured objects. Our dataset consists of over one million fractured objects simulated from ten thousand base models. The fracture simulation is powered by a recent physically based algorithm that efficiently generates a variety of fracture modes of an object.


MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

Neural Information Processing Systems

Text-guided image editing is widely needed in daily life, ranging from personal use to professional applications such as Photoshop. However, existing methods are either zero-shot or trained on an automatically synthesized dataset, which contains a high volume of noise. Thus, they still require lots of manual tuning to produce desirable outcomes in practice. To address this issue, we introduce MagicBrush, the first large-scale, manually annotated dataset for instruction-guided real image editing that covers diverse scenarios: single-turn, multi-turn, mask-provided, and mask-free editing. MagicBrush comprises over 10K manually annotated triplets (source image, instruction, target image), which supports training large-scale text-guided image editing models. We fine-tune InstructPix2Pix on MagicBrush and show that the new model can produce much better images according to human evaluation. We further conduct extensive experiments to evaluate current image editing baselines from multiple dimensions including quantitative, qualitative, and human evaluations. The results reveal the challenging nature of our dataset and the gap between current baselines and real-world editing needs.


12 award-winning photos of our beautiful world

Popular Science

In Onyx Tempest, I wanted to capture the intensity of a Friesian skidding into a sharp turn. The backlit dust, flying mane, and sudden shift in momentum revealed his power and control. I framed the moment to emphasize energy, contrast, and the precision of the movement. The reFocus Awards has announced the stunning winners of the 2025 Photographers of the Year at the World Photo Annual.


Consensus and Subjectivity of Skin Tone Annotation for ML Fairness

Neural Information Processing Systems

Understanding different human attributes and how they affect model behavior may become a standard need for all model creation and usage, from traditional computer vision tasks to the newest multimodal generative AI systems. In computer vision specifically, we have relied on datasets augmented with perceived attribute signals (e.g., gender presentation, skin tone, and age) and benchmarks enabled by these datasets. Typically, labels for these tasks come from human annotators. However, annotating attribute signals, especially skin tone, is a difficult and subjective task. Perceived skin tone is affected by technical factors, like lighting conditions, and social factors that shape an annotator's lived experience. This paper examines the subjectivity of skin tone annotation through a series of annotation experiments using the Monk Skin Tone (MST) scale \cite{Monk2022Monk}, a small pool of professional photographers, and a much larger pool of trained crowdsourced annotators. Along with this study we release the Monk Skin Tone Examples (MST-E) dataset, containing 1515 images and 31 videos spread across the full MST scale. MST-E is designed to help train human annotators to annotate MST effectively. Our study shows that annotators can reliably annotate skin tone in a way that aligns with an expert in the MST scale, even under challenging environmental conditions. We also find evidence that annotators from different geographic regions rely on different mental models of MST categories, resulting in annotations that systematically vary across regions. Given this, we advise practitioners to use a diverse set of annotators and a higher replication count for each image when annotating skin tone for fairness research.


Amazon adds controversial AI facial recognition to Ring

FOX News

Amazon Ring introduces AI-powered facial recognition to identify friends and delivery drivers, while privacy advocates warn of surveillance risks despite convenience benefits.


Video Timeline Modeling For News Story Understanding

Neural Information Processing Systems

In this paper, we present a novel problem, namely video timeline modeling. Our objective is to create a video-associated timeline from a set of videos related to a specific topic, thereby facilitating the content and structure understanding of the story being told. This problem has significant potential in various real-world applications, for instance, news story summarization.


What Kind of New World Is Being Born?

The New Yorker

According to the Gospel of Luke, the Virgin Mary first learns that she'll soon give birth to Christ when she gets an unsolicited visit from an angel. Nice messenger service if you can get it. But before trusty Gabriel can dispense the good news upon which Christmas depends he has to calm the girl down. "Fear not," he says, and, in a way, this sombre reassurance is the Yuletide message in drastic miniature.