AITopics | von Laszewski, Gregor

von Laszewski, Gregor

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

MLCommons Cloud Masking Benchmark with Early Stopping

Chennamsetti, Varshitha, von Laszewski, Gregor, Gu, Ruochen, Mehnaz, Laiba, Papay, Juri, Jackson, Samuel, Thiyagalingam, Jeyan, Samsonau, Sergey V., Fox, Geoffrey C.

arXiv.org Artificial Intelligence, Dec-11-2023

In this paper, we report on work performed for the MLCommons Science Working Group on the cloud masking benchmark. MLCommons is a consortium that develops and maintains several scientific benchmarks that aim to benefit developments in AI. The benchmarks are conducted on the High Performance Computing (HPC) clusters of New York University and the University of Virginia, as well as on a commodity desktop. We provide a description of the cloud masking benchmark, as well as a summary of our submission to MLCommons on the benchmark experiment we conducted. The submission includes a modification to the reference implementation of the cloud masking benchmark that enables early stopping. The benchmark is executed on the NYU HPC through a custom batch script that runs the various experiments through the batch queuing system while allowing the number of training epochs to be varied. Our submission includes the modified code, a custom batch script to modify epochs, documentation, and the benchmark results. We report the highest accuracy (scientific metric) and the average time taken (performance metric) for training and inference that was achieved on NYU HPC Greene. We also provide a comparison of the compute capabilities of different systems by running the benchmark for one epoch. Our submission can be found in a Globus repository that is accessible to the MLCommons Science Working Group.
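The abstract does not say which early stopping criterion the modified benchmark uses. As a rough illustration only, the following sketch shows one common patience-based rule applied to a list of per-epoch validation losses. The function name, the `patience` and `min_delta` parameters, and the loss values are all assumptions made for this example and are not taken from the paper or from the MLCommons reference implementation.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch, best_loss) for a patience-based rule.

    Hypothetical helper: training is considered stopped once the validation
    loss has failed to improve by more than min_delta for `patience`
    consecutive epochs. Epochs are counted from 0.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # Patience exhausted: stop here instead of running all epochs.
                return epoch, best_epoch, best_loss

    # The loss kept improving often enough; all epochs were used.
    return len(val_losses) - 1, best_epoch, best_loss


if __name__ == "__main__":
    # Made-up validation losses, one per epoch.
    losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
    print(early_stopping_epoch(losses, patience=3))
```

With a rule of this kind, the number of epochs actually trained depends on the validation curve rather than on a fixed setting, which is one reason a benchmark might report both the best accuracy and the time taken.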

artificial intelligence, benchmark, machine learning, (12 more...)

arXiv: 2401.08636

Country:

Europe (0.93)
North America > United States > New York (0.35)
North America > United States > Virginia > Albemarle County > Charlottesville (0.14)

Genre: Research Report (0.50)

Industry:

Information Technology (0.47)
Energy (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Scientific Computing (0.87)

An Overview of MLCommons Cloud Mask Benchmark: Related Research and Data

von Laszewski, Gregor, Gu, Ruochen

arXiv.org Artificial Intelligence, Dec-7-2023

Cloud masking is a crucial task that is well motivated for meteorology and its applications in environmental and atmospheric sciences. Its goal is, given satellite images, to accurately generate cloud masks that classify each pixel in an image as either cloud or clear sky. In this paper, we summarize some of the ongoing research activities in cloud masking, with a focus on the research and benchmark currently conducted in the MLCommons Science Working Group. This overview is produced with the hope that others will have an easier time getting started with, and collaborating on, the activities related to the MLCommons Cloud Mask Benchmark.
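To make the per-pixel formulation concrete, here is a minimal sketch that turns a grid of cloud probabilities into a binary mask and scores it against a reference mask. It assumes a model that outputs one probability per pixel and a cutoff of 0.5. Both function names, the threshold, and the sample values are illustrative assumptions and do not come from the benchmark code.

```python
def to_cloud_mask(probabilities, threshold=0.5):
    """Label each pixel 1 (cloud) or 0 (clear sky) from a cloud probability.

    `probabilities` is a list of rows; the 0.5 default is an assumed cutoff.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in probabilities]


def pixel_accuracy(predicted, reference):
    """Fraction of pixels whose predicted label matches the reference mask."""
    total = 0
    correct = 0
    for pred_row, ref_row in zip(predicted, reference):
        for pred, ref in zip(pred_row, ref_row):
            total += 1
            correct += int(pred == ref)
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Made-up 2x2 example: model probabilities and a hand-labelled reference.
    probs = [[0.9, 0.2], [0.4, 0.7]]
    reference = [[1, 0], [1, 1]]
    mask = to_cloud_mask(probs)
    print(mask, pixel_accuracy(mask, reference))
```

Real satellite scenes are far larger and are usually handled as arrays, but the labelling and scoring logic is the same per pixel.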

artificial intelligence, machine learning, mlcommon cloud mask, (11 more...)

arXiv: 2312.04799

Country:

Europe (0.93)
North America > United States > Virginia > Albemarle County > Charlottesville (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

In-depth Analysis On Parallel Processing Patterns for High-Performance Dataframes

Perera, Niranda, Sarker, Arup Kumar, Staylor, Mills, von Laszewski, Gregor, Shan, Kaiying, Kamburugamuve, Supun, Widanage, Chathura, Abeykoon, Vibhatha, Kanewela, Thejaka Amila, Fox, Geoffrey

arXiv.org Artificial Intelligence, Jul-3-2023

The Data Science domain has expanded monumentally in both research and industry communities during the past decade, predominantly owing to the Big Data revolution. Artificial Intelligence (AI) and Machine Learning (ML) are bringing more complexities to data engineering applications, which are now integrated into data processing pipelines to process terabytes of data. Typically, a significant amount of time is spent on data preprocessing in these pipelines, and hence improving its efficiency directly impacts the overall pipeline performance. The community has recently embraced the concept of Dataframes as the de facto data structure for data representation and manipulation. However, the most widely used serial Dataframes today (R, pandas) experience performance limitations while working on even moderately large data sets. We believe that there is plenty of room for improvement by taking a look at this problem from a high-performance computing point of view. In a prior publication, we presented a set of parallel processing patterns for distributed dataframe operators and the reference runtime implementation, Cylon [1]. In this paper, we are expanding on the initial concept by introducing a cost model for evaluating the said patterns. Furthermore, we evaluate the performance of Cylon on the ORNL Summit supercomputer.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv: 2307.01394

Country: North America > United States > Virginia > Albemarle County > Charlottesville (0.14)

Genre: Research Report (0.82)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Architecture > Distributed Systems (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.49)

HPTMT: Operator-Based Architecture for Scalable High-Performance Data-Intensive Frameworks

Kamburugamuve, Supun, Widanage, Chathura, Perera, Niranda, Abeykoon, Vibhatha, Uyar, Ahmet, Kanewala, Thejaka Amila, von Laszewski, Gregor, Fox, Geoffrey

arXiv.org Artificial Intelligence, Jul-29-2021

Data-intensive applications impact many domains, and their steadily increasing size and complexity demand high-performance, highly usable environments. We integrate a set of ideas developed in various data science and data engineering frameworks. They employ a set of operators on specific data abstractions that include vectors, matrices, tensors, graphs, and tables. Our key concepts are inspired by systems like MPI, HPF (High-Performance Fortran), NumPy, Pandas, Spark, Modin, PyTorch, TensorFlow, RAPIDS (NVIDIA), and OneAPI (Intel). Further, it is crucial to support the different languages in everyday use in the Big Data arena, including Python, R, C++, and Java. We note the importance of Apache Arrow and Parquet for enabling language-agnostic high performance and interoperability. In this paper, we propose High-Performance Tensors, Matrices and Tables (HPTMT), an operator-based architecture for data-intensive applications, and identify the fundamental principles needed for performance and usability success. We illustrate these principles by discussing examples that use our software environments, Cylon and Twister2, which embody HPTMT.

deep learning, neural network, operator, (23 more...)

arXiv: 2107.12807

Country: North America > United States > Indiana (0.14)

Genre: Research Report (0.50)

Industry: Information Technology (0.48)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.67)

AICov: An Integrative Deep Learning Framework for COVID-19 Forecasting with Population Covariates

Fox, Geoffrey C., von Laszewski, Gregor, Wang, Fugang, Pyne, Saumyadipta

arXiv.org Machine Learning, Oct-8-2020

The COVID-19 pandemic has had profound global consequences for health, economic, social, political, and almost every other major aspect of human life. Therefore, it is of great importance to model COVID-19 and other pandemics in terms of the broader social contexts in which they take place. We present the architecture of AICov, which provides an integrative deep learning framework for COVID-19 forecasting with population covariates, some of which may serve as putative risk factors. We have integrated multiple different strategies into AICov, including the ability to use deep learning strategies based on LSTM and even modeling. To demonstrate our approach, we have conducted a pilot that integrates population covariates from multiple sources. Thus, AICov not only includes data on COVID-19 cases and deaths but, more importantly, the population's socioeconomic, health, and behavioral risk factors at a local level. The compiled data are fed into AICov, and we obtain improved prediction by integrating these data into our model, as compared to a model that uses only case and death data.

deep learning, immunology, risk factor, (22 more...)

arXiv: 2010.03757

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
