AITopics | mlmodelscope

mlmodelscope

MLHarness: A Scalable Benchmarking System for MLCommons

Chang, Yen-Hsiang, Pu, Jianhao, Hwu, Wen-mei, Xiong, Jinjun

arXiv.org Artificial Intelligence, Apr-24-2025

With society's growing adoption of machine learning (ML) and deep learning (DL) for various intelligent solutions, it becomes increasingly imperative to standardize a common set of measures for ML/DL models with large-scale open datasets under common development practices and resources, so that people can benchmark and compare model quality and performance on common ground. MLCommons has emerged recently as a driving force from both industry and academia to orchestrate such an effort. Despite its wide adoption as a standardized benchmark, MLCommons Inference has included only a limited number of ML/DL models (seven in total). This significantly limits the generality of MLCommons Inference's benchmarking results, because the research community offers many more novel ML/DL models that solve a wide range of problems with different input and output modalities. To address this limitation, we propose MLHarness, a scalable benchmarking harness system for MLCommons Inference with three distinctive features: (1) it codifies the standard benchmark process as defined by MLCommons Inference, including the models, datasets, DL frameworks, and software and hardware systems; (2) it provides an easy and declarative approach for model developers to contribute their models and datasets to MLCommons Inference; and (3) it supports a wide range of models with varying input/output modalities, so that these models can be benchmarked scalably across different datasets, frameworks, and hardware systems. This harness system is developed on top of the MLModelScope system and will be open sourced to the community. Our experimental results demonstrate the superior flexibility and scalability of this harness system for MLCommons Inference benchmarking.
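
The "declarative approach" for contributing a model can be pictured roughly as follows. This is a minimal sketch only, not MLHarness's actual manifest format: the `ModelManifest` class, its field names, the `validate` helper, the modality list, and every value in the example entry are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModelManifest:
    """Hypothetical declarative description of one benchmark entry."""
    name: str
    framework: str
    dataset: str
    input_modality: str
    output_modality: str
    preprocess: list = field(default_factory=list)   # ordered step names
    postprocess: list = field(default_factory=list)  # ordered step names


def validate(manifest, known_modalities=("image", "text", "audio", "class", "boxes")):
    """Return a list of problems; an empty list means the entry looks usable."""
    problems = []
    for attr in ("name", "framework", "dataset"):
        if not getattr(manifest, attr):
            problems.append(f"missing {attr}")
    for attr in ("input_modality", "output_modality"):
        value = getattr(manifest, attr)
        if value not in known_modalities:
            problems.append(f"unknown {attr}: {value}")
    return problems


# Placeholder values; none of these identifiers come from the paper.
entry = ModelManifest(
    name="example-model",
    framework="example-framework",
    dataset="example-dataset",
    input_modality="image",
    output_modality="class",
    preprocess=["resize", "normalize"],
    postprocess=["argmax"],
)
print(validate(entry))
```

The point of the sketch is that the contributor states what the model needs (dataset, modalities, processing steps) as data, and a harness can check and act on that description without model-specific glue code.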

artificial intelligence, machine learning, natural language, (18 more...)

doi: 10.1016/j.tbench.2021.100002

arXiv: 2111.05231

Country: North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale

Dakkak, Abdul, Li, Cheng, Xiong, Jinjun, Hwu, Wen-mei

arXiv.org Machine Learning, Feb-19-2020

Machine Learning (ML) and Deep Learning (DL) innovations are being introduced at such a rapid pace that researchers are hard-pressed to analyze and study them. The complicated procedures for evaluating innovations, along with the lack of standard and efficient ways of specifying and provisioning ML/DL evaluation, are a major "pain point" for the community. This paper proposes MLModelScope, an open-source, framework- and hardware-agnostic, extensible, and customizable design that enables repeatable, fair, and scalable model evaluation and benchmarking. We implement the distributed design with support for all major frameworks and hardware, and equip it with web, command-line, and library interfaces. To demonstrate MLModelScope's capabilities, we perform parallel evaluation and show how subtle changes to the model evaluation pipeline affect accuracy and how HW/SW stack choices affect performance.

evaluation, mlmodelscope, model evaluation, (16 more...)

arXiv: 2002.08295

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.40)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

The Design and Implementation of a Scalable DL Benchmarking Platform

Li, Cheng, Dakkak, Abdul, Xiong, Jinjun, Hwu, Wen-mei

arXiv.org Machine Learning, Nov-18-2019

The current Deep Learning (DL) landscape is fast-paced and rife with non-uniform models and hardware/software (HW/SW) stacks, but it lacks a DL benchmarking platform to facilitate evaluation and comparison of DL innovations, be they models, frameworks, libraries, or hardware. Due to the lack of a benchmarking platform, the current practice of evaluating the benefits of proposed DL innovations is both arduous and error-prone, stifling the adoption of the innovations. In this work, we first identify 10 design features that are desirable within a DL benchmarking platform. These features include performing the evaluation in a consistent, reproducible, and scalable manner; being framework and hardware agnostic; supporting real-world benchmarking workloads; and providing in-depth model execution inspection across the HW/SW stack levels. We then propose MLModelScope, a DL benchmarking platform design that realizes the 10 objectives. MLModelScope proposes a specification to define DL model evaluations and techniques to provision the evaluation workflow using the user-specified HW/SW stack. MLModelScope defines abstractions for frameworks and supports a broad range of DL models and evaluation scenarios. We implement MLModelScope as an open-source project with support for all major frameworks and hardware architectures. Through MLModelScope's evaluation and automated analysis workflows, we perform case-study analyses of 37 models across 4 systems and show how model, hardware, and framework selection affects model accuracy and performance under different benchmarking scenarios. We further demonstrate how MLModelScope's tracing capability gives a holistic view of model execution and helps pinpoint bottlenecks.
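
The idea of tracing execution at several stack levels to locate a bottleneck can be illustrated with nested timing spans. This is a sketch under stated assumptions, not MLModelScope's tracing API: the `Tracer` class, its `span` and `slowest` methods, the span names, and the sleep calls standing in for real work are all hypothetical.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Hypothetical tracer that records (level, name, seconds) per span."""

    def __init__(self):
        self.spans = []
        self._depth = 0

    @contextmanager
    def span(self, name):
        level = self._depth          # 0 = outermost, larger = deeper in the stack
        self._depth += 1
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self._depth -= 1
            self.spans.append((level, name, elapsed))

    def slowest(self, level):
        """Return the longest span recorded at the given level, or None."""
        candidates = [s for s in self.spans if s[0] == level]
        return max(candidates, key=lambda s: s[2]) if candidates else None


tracer = Tracer()
with tracer.span("model"):           # level 0: whole-model prediction
    with tracer.span("layer_a"):     # level 1: stand-in for a fast layer
        time.sleep(0.002)
    with tracer.span("layer_b"):     # level 1: stand-in for a slow layer
        time.sleep(0.02)

print(tracer.slowest(1)[1])
```

Grouping spans by level lets an analysis step ask "which layer dominates this model?" separately from "which model dominates this run?", which is the kind of per-level view the abstract describes.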

deep learning, mlmodelscope, neural network, (20 more...)

arXiv: 1911.08031

Country: North America > United States > Illinois > Champaign County > Urbana (0.14)

Genre: Workflow (0.70)

Industry:

Information Technology (0.47)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Software > Programming Languages (0.93)
Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Frustrated with Replicating Claims of a Shared Model? A Solution

Dakkak, Abdul, Li, Cheng, Xiong, Jinjun, Hwu, Wen-Mei

arXiv.org Machine Learning, Jun-25-2019

Machine Learning (ML) and Deep Learning (DL) innovations are being introduced at such a rapid pace that model owners and evaluators are hard-pressed to analyze and study them. This is exacerbated by the complicated procedures for evaluation. The lack of standard systems and efficient techniques for specifying and provisioning ML/DL evaluation is the main cause of this "pain point". This work discusses common pitfalls in replicating DL model evaluation and shows that these subtle pitfalls can affect both accuracy and performance. It then proposes MLModelScope as a remedy: a specification for repeatable model evaluation and a runtime to provision and measure experiments. We show that by easing the model specification and evaluation process, MLModelScope facilitates rapid adoption of ML/DL innovations.
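
To make "subtle pitfalls" concrete, the sketch below shows one kind of pre-processing discrepancy, chosen purely for illustration and not taken from the paper's experiments: two pipelines that differ only in when an integer truncation happens feed different values to a model. The function names and the mean/scale constants are assumptions.

```python
def normalize_in_float(pixel, mean=127.5, scale=127.5):
    """Subtract the mean and scale entirely in floating point."""
    return (float(pixel) - mean) / scale


def truncate_then_scale(pixel, mean=127.5, scale=127.5):
    """Subtract the mean, truncate to an integer, then scale."""
    return float(int(pixel - mean)) / scale


# Compare the two variants over every possible 8-bit pixel value.
diffs = [abs(normalize_in_float(p) - truncate_then_scale(p)) for p in range(256)]
max_diff = max(diffs)
print(f"largest per-pixel difference: {max_diff:.5f}")
```

Each difference is tiny, yet it applies to every pixel of every input, which is why an evaluation specification that pins down the exact order and data types of such steps matters for repeatability.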

artificial intelligence, machine learning, mlmodelscope, (19 more...)

arXiv: 1811.09737

Country: North America > United States > Illinois > Champaign County (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Challenges and Pitfalls of Reproducing Machine Learning Artifacts

Li, Cheng, Dakkak, Abdul, Xiong, Jinjun, Hwu, Wen-mei

arXiv.org Artificial Intelligence, Apr-28-2019

An increasingly complex and diverse collection of Machine Learning (ML) models and hardware/software stacks, collectively referred to as "ML artifacts", is being proposed, leading to a diverse ML landscape. These proposed ML innovations have outpaced researchers' ability to analyze, study, and adapt them. This is exacerbated by the complicated and sometimes non-reproducible procedures for ML evaluation. The current practice of sharing ML artifacts is through repositories where artifact authors post ad-hoc code and some documentation. The authors often fail to reveal information that is critical for others to reproduce their results. As a result, one often fails to reproduce artifact authors' claims, let alone adapt the model to one's own use. This article discusses the common challenges and pitfalls of reproducing ML artifacts, and can serve as a guideline for ML researchers when sharing or reproducing artifacts.

artifact, evaluation, library, (15 more...)

arXiv: 1904.12437

Country: North America > United States > Illinois > Champaign County > Urbana (0.05)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)