PCS Workflow for Veridical Data Science in the Age of AI
Rewolinski, Zachary T., Yu, Bin
arXiv.org Artificial Intelligence
Data science is a pillar of artificial intelligence (AI), which is transforming nearly every domain of human activity, from the social and physical sciences to engineering and medicine. While data-driven findings in AI offer unprecedented power to extract insights and guide decision-making, many are difficult or impossible to replicate. A key reason for this challenge is the uncertainty introduced by the many choices made throughout the data science life cycle (DSLC). Traditional statistical frameworks often fail to account for this uncertainty. The Predictability-Computability-Stability (PCS) framework for veridical (truthful) data science offers a principled approach to addressing this challenge throughout the DSLC. This paper presents an updated and streamlined PCS workflow, tailored for practitioners and enhanced with guided use of generative AI. We include a running example to demonstrate the PCS framework in action and conduct a related case study that shows the uncertainty in downstream predictions caused by judgment calls in the data cleaning stage.
Dec-4-2025
- Country:
- Asia > Japan (0.04)
- Europe > France (0.04)
- North America > United States
- California (0.04)
- South America > Uruguay
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.68)
- Strength High (0.68)
- Industry:
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning
- Neural Networks > Deep Learning (0.67)
- Performance Analysis > Accuracy (0.47)
- Statistical Learning > Clustering (0.46)
- Natural Language (1.00)
- Data Science (1.00)