Between Randomness and Arbitrariness: Some Lessons for Reliable Machine Learning at Scale

Jun-13-2024, arXiv.org Machine Learning

To develop rigorous knowledge about ML models, and the systems in which they are embedded, we need reliable measurements. But reliable measurement is fundamentally challenging, and touches on issues of reproducibility, scalability, uncertainty quantification, epistemology, and more. This dissertation addresses the criteria needed to take reliability seriously: criteria for designing meaningful metrics, and criteria for methodologies that ensure we can dependably and efficiently measure these metrics at scale and in practice. In doing so, this dissertation articulates a research vision for a new field of scholarship at the intersection of machine learning, law, and policy. Within this frame, we cover topics that fall under three themes: (1) quantifying and mitigating sources of arbitrariness in ML, (2) taming randomness in uncertainty estimation and optimization algorithms, in order to achieve scalability without sacrificing reliability, and (3) providing methods for evaluating generative-AI systems, with specific focuses on quantifying memorization in language models and training latent diffusion models on open-licensed data. By making contributions in these three themes, this dissertation serves as an empirical proof by example that research on reliable measurement for machine learning is intimately and inescapably bound up with research in law and policy. These disciplines pose similar research questions about reliable measurement in machine learning. They are, in fact, two complementary sides of the same research vision, which, broadly construed, aims to construct machine-learning systems that cohere with broader societal values.


Country:
- Asia (1.00)
- Europe > United Kingdom (0.67)
- North America > United States
  - California (1.00)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Education (1.00)
- Transportation > Ground
  - Road (1.00)
- Health & Medicine > Therapeutic Area
  - Infections and Infectious Diseases (0.92)
  - Immunology (0.92)
- Government > Regional Government
  - North America Government > United States Government (1.00)
- Energy > Oil & Gas
  - Upstream (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Issues > Social & Ethical Issues (1.00)
  - Representation & Reasoning
    - Uncertainty (1.00)
    - Optimization (1.00)
    - Logic & Formal Reasoning (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Performance Analysis > Accuracy (1.00)
    - Neural Networks > Deep Learning
      - Generative AI (0.67)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.92)
