Appendix (LAION-5B: An open large-scale dataset for training next generation image-text models)

Feb-11-2026, 02:26:53 GMT–Neural Information Processing Systems

A Datasheet for LAION-5B dataset

A.1 Motivation

Q1 For what purpose was the dataset created? Was there a specific task in mind?
[...] YFCC with 100 million image/videos and associated metadata.

Who created the dataset (e.g., which team, research group) and on behalf of which [...]

Who funded the creation of the dataset?
This work was sponsored by Hugging Face and Stability AI.

What do the instances that comprise the dataset represent (e.g., documents, photos, [...]

Are there multiple types of instances (e.g., movies, users, and ratings; [...]
We provide 5.8 billion image-text pairs.




