Workflow Provenance in the Lifecycle of Scientific Machine Learning
Souza, Renan, Azevedo, Leonardo G., Lourenço, Vítor, Soares, Elton, Thiago, Raphael, Brandão, Rafael, Civitarese, Daniel, Brazil, Emilio Vital, Moreno, Marcio, Valduriez, Patrick, Mattoso, Marta, Cerqueira, Renato, Netto, Marco A. S.
arXiv.org Artificial Intelligence
Machine Learning (ML) has been transforming many industries and businesses. More recently, it has also been impacting computational science and engineering domains, such as geoscience, climate science, materials science, and health science. Scientific ML, i.e., ML applied to these domains, is characterized by the combination of data-driven techniques with domain-specific data and knowledge to obtain models of physical phenomena [1], [2], [3], [4], [5]. Obtaining models in scientific ML works similarly to conducting traditional large-scale computational experiments [6], in which a team of scientists and engineers formulates hypotheses, designs the experiment, predefines parameters and input datasets, analyzes the experiment data, makes observations, and calibrates initial assumptions in a cycle until the team is satisfied with the results. Scientific ML is naturally large-scale because multiple people collaborate on a project, using their multidisciplinary domain-specific knowledge to design and perform data-intensive tasks that curate (i.e., understand, clean, and enrich with observations) datasets and prepare them for learning algorithms. They then plan and execute compute-intensive tasks for computational simulations or for training ML models affected by the scientific domain's constraints. They use specialized scientific software tools running on their desktops, on cloud clusters (e.g., Docker-based), or on large HPC machines.
Sep-30-2020
- Country:
  - Europe (0.46)
  - South America > Brazil
    - Rio de Janeiro (0.14)
- Genre:
  - Research Report
    - Experimental Study (0.48)
    - New Finding (0.46)
  - Workflow (0.91)
- Technology: