Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms

Avadhanula, Vashist; Baki, Omar Abdul; Bastani, Hamsa; Bastani, Osbert; Gocmen, Caner; Haimovich, Daniel; Hwang, Darren; Karamshuk, Dima; Leeper, Thomas; Ma, Jiayuan; Macnamara, Gregory; Mullett, Jake; Palow, Christopher; Park, Sung; Rajagopal, Varun S; Schaeffer, Kevin; Shah, Parikshit; Sinha, Deeksha; Stier-Moses, Nicolas; Xu, Peng

Nov-11-2022, arXiv.org Artificial Intelligence

We describe the current content moderation strategy employed by Meta to remove policy-violating content from its platforms. Meta relies on both handcrafted and learned risk models to flag potentially violating content for human review. Our approach aggregates these risk models into a single ranking score, calibrating them to prioritize more reliable risk models. A key challenge is that violation trends change over time, affecting which risk models are most reliable. Our system additionally handles production challenges such as changing risk models and novel risk models. We use a contextual bandit to update the calibration in response to such trends. Our approach increases Meta's top-line metric for measuring the effectiveness of its content moderation strategy by 13%.
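
The calibrate-then-rank loop described in the abstract can be sketched as follows. This is a minimal illustration, not Meta's system: the abstract does not name the bandit algorithm, so a LinUCB-style linear bandit is swapped in here as a stand-in. The class `LinUCBCalibrator`, the helper `rank_for_review`, the parameters `alpha` and `ridge`, and the reward convention (1 if a reviewer confirms a violation, 0 otherwise) are all assumptions made for the example.

```python
import math


class LinUCBCalibrator:
    """Illustrative LinUCB-style scorer over a vector of risk-model outputs.

    Hypothetical stand-in: the source does not specify the bandit algorithm.
    """

    def __init__(self, n_models, alpha=1.0, ridge=1.0):
        self.n = n_models
        self.alpha = alpha  # exploration strength (assumed default)
        # Inverse of the ridge-regularised design matrix; starts as I / ridge.
        self.a_inv = [
            [(1.0 / ridge if i == j else 0.0) for j in range(n_models)]
            for i in range(n_models)
        ]
        self.b = [0.0] * n_models  # running sum of reward * context

    def _a_inv_times(self, x):
        return [sum(row[j] * x[j] for j in range(self.n)) for row in self.a_inv]

    def weights(self):
        """Current calibration weights, theta = A^{-1} b."""
        return self._a_inv_times(self.b)

    def score(self, x):
        """Single ranking score: estimated reward plus an uncertainty bonus."""
        theta = self.weights()
        mean = sum(t * xi for t, xi in zip(theta, x))
        a_inv_x = self._a_inv_times(x)
        var = sum(xi * v for xi, v in zip(x, a_inv_x))
        return mean + self.alpha * math.sqrt(max(var, 0.0))

    def update(self, x, reward):
        """Fold in one reviewed item via a Sherman-Morrison rank-one update."""
        a_inv_x = self._a_inv_times(x)  # A^{-1} is symmetric
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(self.n):
            for j in range(self.n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(self.n):
            self.b[i] += reward * x[i]


def rank_for_review(calibrator, items):
    """Return item ids ordered from highest to lowest ranking score.

    `items` maps an item id to its vector of risk-model scores.
    """
    return sorted(items, key=lambda k: calibrator.score(items[k]), reverse=True)
```

In this sketch, each human review decision is fed back through `update`, so the weights drift toward whichever risk models have recently agreed with reviewers. Setting `alpha` above zero adds a bonus for score patterns that have been reviewed less often.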

artificial intelligence, data mining, machine learning

Country:
- Oceania > Australia (0.04)
- North America > United States
  - Pennsylvania (0.04)

Genre:
- Research Report (0.50)

Industry:
- Information Technology (0.69)
- Leisure & Entertainment > Sports
  - Soccer (0.46)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence > Machine Learning (1.00)
  - Data Science > Data Mining
    - Big Data (0.50)
