Kepner, Jeremy
Are ChatGPT and Other Similar Systems the Modern Lernaean Hydras of AI?
Ioannidis, Dimitrios, Kepner, Jeremy, Bowne, Andrew, Bryant, Harriet S.
The rise of Generative Artificial Intelligence systems ("AI systems") has created unprecedented social engagement. AI code generation systems provide responses (output) to questions or requests by accessing the vast library of open-source code created by developers over the past few decades. However, they do so by allegedly stealing the open-source code stored in virtual libraries, known as repositories. This Article focuses on how this happens and whether there is a solution that protects innovation and avoids years of litigation. We also touch upon the array of issues raised by the relationship between AI and copyright. Looking ahead, we propose the following: (a) immediate changes to the licenses for open-source code so that access to and/or use of that code is limited to humans only; (b) revisions to the Massachusetts Institute of Technology ("MIT") license so that AI systems are required to procure appropriate licenses from open-source code developers, which we believe would harmonize standards and build social consensus for the benefit of all of humanity rather than promote profit-driven centers of innovation; (c) urgent legislative action to protect the future of AI systems while also promoting innovation; and (d) a shift of the burden of proof to AI systems in obfuscation cases.
Testing RadiX-Nets: Advances in Viable Sparse Topologies
Kwak, Kevin, West, Zack, Jananthan, Hayden, Kepner, Jeremy
The exponential growth of data has increased the computational demands of machine learning (ML) research and industry applications. Sparsification of over-parameterized deep neural networks (DNNs) creates simpler representations of complex data. Past research has shown that some sparse networks achieve performance comparable to dense ones while reducing runtime and storage. RadiX-Nets, a subclass of sparse DNNs, maintain a uniform structure that counteracts their reduced number of neural connections. Because they are generated independently of any dense network, they offer faster asymptotic training and remove the need for costly pruning. However, little work has been done on RadiX-Nets, making testing challenging. This paper presents a testing suite for RadiX-Nets in TensorFlow. We test RadiX-Net performance to streamline processing in scalable models, revealing relationships between network topology, initialization, and training behavior. We also encounter "strange models" that train inconsistently and to lower accuracy, while models of similar sparsity train well.
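To make the sparsity mechanism above concrete, the following is a minimal TensorFlow sketch of imposing a fixed sparse topology on a dense layer by masking its kernel; the mask here is random and purely illustrative, standing in for a generated RadiX-Net pattern, and none of this reproduces the paper's actual testing suite.

    import numpy as np
    import tensorflow as tf

    class MaskedDense(tf.keras.layers.Layer):
        """Dense layer whose kernel is element-wise masked by a fixed binary
        sparsity pattern (a stand-in for a RadiX-Net layer topology)."""

        def __init__(self, units, mask, **kwargs):
            super().__init__(**kwargs)
            self.units = units
            self.mask = tf.constant(mask, dtype=tf.float32)  # shape: (in_dim, units)

        def build(self, input_shape):
            self.kernel = self.add_weight(
                shape=(int(input_shape[-1]), self.units),
                initializer="glorot_uniform", trainable=True, name="kernel")
            self.bias = self.add_weight(
                shape=(self.units,), initializer="zeros", trainable=True, name="bias")

        def call(self, inputs):
            # Zeroed mask entries remove connections from the topology.
            return tf.nn.relu(tf.matmul(inputs, self.kernel * self.mask) + self.bias)

    # Example: a random 50%-sparse mask standing in for a generated pattern.
    rng = np.random.default_rng(0)
    mask = (rng.random((784, 128)) < 0.5).astype("float32")
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(784,)),
        MaskedDense(128, mask),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

Because the mask is constant, the connection pattern stays fixed throughout training, which is the property that distinguishes pre-generated sparse topologies from post-hoc pruning.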
Lincoln AI Computing Survey (LAICS) Update
Reuther, Albert, Michaleas, Peter, Jones, Michael, Gadepally, Vijay, Samsi, Siddharth, Kepner, Jeremy
This paper is an update of the survey of AI accelerators and processors from the past four years, which is now called the Lincoln AI Computing Survey (LAICS, pronounced "lace"). As in past years, this paper collects and summarizes the current commercial accelerators that have been publicly announced with peak performance and peak power consumption numbers. The performance and power values are plotted on a scatter graph, and a number of dimensions and observations from the trends on this plot are again discussed and analyzed. Market segments are highlighted on the scatter plot, and zoomed plots of each segment are also included. Finally, a brief description of each of the new accelerators added to the survey this year is included.
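For readers who want to reproduce the style of analysis described above, the snippet below is a minimal matplotlib sketch of a peak-performance versus peak-power scatter plot on log-log axes; the data points and labels are invented placeholders, not values from the survey.

    import matplotlib.pyplot as plt

    # Hypothetical (performance [GOPS/s], power [W], label) tuples standing in
    # for publicly announced accelerator data points; not actual LAICS values.
    accelerators = [
        (1e3, 5, "embedded A"),
        (1e5, 75, "autonomous B"),
        (1e6, 300, "data center C"),
    ]

    fig, ax = plt.subplots()
    for perf, power, label in accelerators:
        ax.scatter(power, perf)
        ax.annotate(label, (power, perf))

    # Log-log axes make the orders-of-magnitude spread across segments visible.
    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.set_xlabel("Peak power (W)")
    ax.set_ylabel("Peak performance (GOPS/s)")
    ax.set_title("Accelerator peak performance vs. peak power (illustrative)")
    plt.show()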
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference
Samsi, Siddharth, Zhao, Dan, McDonald, Joseph, Li, Baolin, Michaleas, Adam, Jones, Michael, Bergeron, William, Kepner, Jeremy, Tiwari, Devesh, Gadepally, Vijay
Large language models (LLMs) have exploded in popularity due to their new generative capabilities that go far beyond prior state-of-the-art. These technologies are increasingly being leveraged in domains such as law, finance, and medicine. However, these models carry significant computational challenges, especially the compute and energy costs required for inference. Inference energy costs receive less attention than the energy costs of training LLMs, despite how often these large models are called upon to conduct inference in practice (e.g., ChatGPT). As these state-of-the-art LLMs see increasing usage and deployment in various domains, a better understanding of their resource utilization is crucial for cost savings, scaling performance, efficient hardware usage, and optimal inference strategies. In this paper, we describe experiments conducted to study the computational and energy utilization of inference with LLMs. We benchmark and conduct a preliminary analysis of the inference performance and inference energy costs of different sizes of LLaMA, a recent state-of-the-art LLM developed by Meta AI, on two generations of popular GPUs (NVIDIA V100 and A100) and two datasets (Alpaca and GSM8K) to reflect the diverse set of tasks/benchmarks for LLMs in research and practice. We present the results of multi-node, multi-GPU inference using model sharding across up to 32 GPUs. To our knowledge, our work is one of the first to study LLM inference performance from the perspective of computational and energy resources at this scale.
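As an illustration of how GPU energy for a single inference call can be estimated, the sketch below samples instantaneous power with NVIDIA's NVML bindings (pynvml) while a user-supplied inference function runs; the model call is a hypothetical placeholder, and this is not the instrumentation used in the paper.

    import time
    import threading
    import pynvml

    def sample_power(handle, samples, stop_event, interval_s=0.1):
        """Poll instantaneous GPU power draw (milliwatts) until asked to stop."""
        while not stop_event.is_set():
            samples.append(pynvml.nvmlDeviceGetPowerUsage(handle))
            time.sleep(interval_s)

    def measure_energy(run_inference, gpu_index=0):
        """Estimate energy (joules) used by gpu_index while run_inference() executes."""
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        samples, stop_event = [], threading.Event()
        sampler = threading.Thread(target=sample_power, args=(handle, samples, stop_event))
        start = time.time()
        sampler.start()
        result = run_inference()          # e.g., a batched generate() call on a sharded model
        stop_event.set()
        sampler.join()
        elapsed = time.time() - start
        pynvml.nvmlShutdown()
        mean_watts = (sum(samples) / len(samples)) / 1000.0 if samples else 0.0
        return result, mean_watts * elapsed   # energy ~ average power * elapsed time

    # Usage (hypothetical model object):
    # _, joules = measure_energy(lambda: model.generate(prompt_batch))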
AI Enabled Maneuver Identification via the Maneuver Identification Challenge
Samuel, Kaira, LaRosa, Matthew, McAlpin, Kyle, Schaefer, Morgan, Swenson, Brandon, Wasilefsky, Devin, Wu, Yan, Zhao, Dan, Kepner, Jeremy
Artificial intelligence (AI) has enormous potential to improve Air Force pilot training by providing actionable feedback to pilot trainees on the quality of their maneuvers and enabling instructor-less flying familiarization for early-stage trainees in low-cost simulators. Historically, AI challenges consisting of data, problem descriptions, and example code have been critical to fueling AI breakthroughs. The Department of the Air Force-Massachusetts Institute of Technology AI Accelerator (DAF-MIT AI Accelerator) developed such an AI challenge using real-world Air Force flight simulator data. The Maneuver ID challenge assembled thousands of virtual reality simulator flight recordings collected by actual Air Force student pilots at Pilot Training Next (PTN). This dataset has been publicly released at Maneuver-ID.mit.edu and represents the first public release of its kind of USAF flight training data. Using this dataset, we have applied a variety of AI methods to separate "good" from "bad" simulator data and to categorize and characterize maneuvers. These data, algorithms, and software are being released as baselines of model performance for others to build upon to enable the AI ecosystem for flight simulator training.
Maneuver Identification Challenge
Samuel, Kaira, Gadepally, Vijay, Jacobs, David, Jones, Michael, McAlpin, Kyle, Palko, Kyle, Paulk, Ben, Samsi, Sid, Siu, Ho Chit, Yee, Charles, Kepner, Jeremy
AI algorithms that identify maneuvers from trajectory data could play an important role in improving flight safety and pilot training. AI challenges allow diverse teams to work together to solve hard problems and are an effective tool for developing AI solutions. AI challenges are also a key driver of AI computational requirements. The Maneuver Identification Challenge, hosted at maneuver-id.mit.edu, provides thousands of trajectories collected from pilots practicing in flight simulators, descriptions of maneuvers, and examples of these maneuvers performed by experienced pilots. Each trajectory consists of positions, velocities, and aircraft orientations normalized to a common coordinate system. Construction of the data set required significant data architecture to transform flight simulator logs into AI-ready data, including the use of a supercomputer for deduplication and data conditioning. There are three proposed challenges. The first challenge is separating physically plausible (good) trajectories from physically implausible (bad) trajectories; human-labeled good and bad trajectories are provided to aid in this task. Subsequent challenges are to label trajectories with their intended maneuvers and to assess the quality of those maneuvers.
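As a sketch of the first (good vs. bad) challenge, the function below flags a trajectory as physically plausible when finite-differenced positions roughly agree with the recorded velocities; the column names, sample period, and tolerance are assumptions for illustration, not the challenge's actual schema or baseline.

    import numpy as np
    import pandas as pd

    def plausibility_check(traj: pd.DataFrame, dt: float = 0.1, tol: float = 50.0) -> bool:
        """Return True if velocities obtained by finite-differencing the recorded
        positions roughly match the recorded velocities. The column names and
        sample period dt are hypothetical, not the dataset's actual schema."""
        pos = traj[["x", "y", "z"]].to_numpy()
        vel = traj[["vx", "vy", "vz"]].to_numpy()
        est_vel = np.diff(pos, axis=0) / dt          # finite-difference estimate
        err = np.linalg.norm(est_vel - vel[:-1], axis=1)
        return bool(np.median(err) < tol)            # crude threshold on disagreement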
The MIT Supercloud Dataset
Samsi, Siddharth, Weiss, Matthew L, Bestor, David, Li, Baolin, Jones, Michael, Reuther, Albert, Edelman, Daniel, Arcand, William, Byun, Chansup, Holodnack, John, Hubbell, Matthew, Kepner, Jeremy, Klein, Anna, McDonald, Joseph, Michaleas, Adam, Michaleas, Peter, Milechin, Lauren, Mullen, Julia, Yee, Charles, Price, Benjamin, Prout, Andrew, Rosa, Antonio, Vanterpool, Allan, McEvoy, Lindsey, Cheng, Anson, Tiwari, Devesh, Gadepally, Vijay
Artificial intelligence (AI) and machine learning (ML) workloads are an increasingly large share of the compute workloads in traditional High-Performance Computing (HPC) centers and commercial cloud systems. This has led to changes in the deployment approaches of HPC clusters and the commercial cloud, as well as a new focus on optimizing resource usage, allocations, and deployment of new AI frameworks, and on capabilities such as Jupyter notebooks that enable rapid prototyping and deployment. With these changes, there is a need to better understand cluster/datacenter operations with the goal of developing improved scheduling policies, identifying inefficiencies in resource utilization and energy/power consumption, predicting failures, and identifying policy violations. In this paper we introduce the MIT Supercloud Dataset, which aims to foster innovative AI/ML approaches to the analysis of large-scale HPC and datacenter/cloud operations. We provide detailed monitoring logs from the MIT Supercloud system, which include CPU and GPU usage by jobs, memory usage, file system logs, and physical monitoring data. This paper describes the details of the dataset, the collection methodology, data availability, and potential challenge problems being developed using this data. Datasets and future challenge announcements will be available via https://dcc.mit.edu.
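As an illustration of the kind of analysis the dataset is meant to enable, the sketch below loads hypothetical CPU and GPU monitoring logs with pandas and summarizes per-job utilization; the file and column names are assumptions, and the actual dataset layout documented at https://dcc.mit.edu may differ.

    import pandas as pd

    # Hypothetical file and column names standing in for the released logs.
    gpu_log = pd.read_csv("gpu_usage.csv")   # e.g., columns: job_id, timestamp, gpu_util
    cpu_log = pd.read_csv("cpu_usage.csv")   # e.g., columns: job_id, timestamp, cpu_util

    # Average GPU utilization per job, a typical starting point for spotting
    # under-utilized allocations.
    per_job_gpu = gpu_log.groupby("job_id")["gpu_util"].mean().rename("mean_gpu_util")

    # Join against CPU-side statistics to compare resource usage across jobs.
    per_job_cpu = cpu_log.groupby("job_id")["cpu_util"].mean().rename("mean_cpu_util")
    summary = pd.concat([per_job_gpu, per_job_cpu], axis=1).sort_values("mean_gpu_util")
    print(summary.head())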
Layer-Parallel Training with GPU Concurrency of Deep Residual Neural Networks via Nonlinear Multigrid
Kirby, Andrew C., Samsi, Siddharth, Jones, Michael, Reuther, Albert, Kepner, Jeremy, Gadepally, Vijay
A Multigrid Full Approximation Storage algorithm for solving deep residual networks is developed to enable parallelized layer-wise training of neural networks and concurrent computational kernel execution on GPUs. This work demonstrates a 10.2x speedup over traditional layer-wise model parallelism techniques using the same number of compute units.
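The approach builds on viewing a residual network as a forward-Euler discretization of an ODE, so a coarser network with fewer layers and a larger effective step size can approximate the fine one; multigrid methods iterate between such levels rather than sweeping layers strictly in sequence. The sketch below illustrates only this fine/coarse relation with toy residual maps; it is not the paper's Full Approximation Storage algorithm.

    import numpy as np

    def step(x, w, h):
        """One residual layer viewed as a forward-Euler step: x + h * F(x),
        with F a fixed tanh map parameterized by w (a toy stand-in for a block)."""
        return x + h * np.tanh(w @ x)

    def forward_fine(x, weights):
        """Sequential (fine-level) propagation through every layer."""
        for w in weights:
            x = step(x, w, h=1.0)
        return x

    def forward_coarse(x, weights, c=2):
        """Coarse-level propagation: keep every c-th layer but scale the step size,
        approximating the fine trajectory with fewer sequential steps. Multigrid
        layer-parallel training iterates between such levels to correct the fine
        solution instead of sweeping all layers one after another."""
        for w in weights[::c]:
            x = step(x, w, h=float(c))
        return x

    rng = np.random.default_rng(0)
    weights = [0.01 * rng.standard_normal((8, 8)) for _ in range(16)]
    x0 = rng.standard_normal(8)
    print(np.linalg.norm(forward_fine(x0, weights) - forward_coarse(x0, weights)))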
Sparse Deep Neural Network Graph Challenge
Kepner, Jeremy, Alford, Simon, Gadepally, Vijay, Jones, Michael, Milechin, Lauren, Robinett, Ryan, Samsi, Sid
The MIT/IEEE/Amazon GraphChallenge.org encourages community approaches to developing new solutions for analyzing graphs and sparse data. Sparse AI analytics present unique scalability difficulties. The proposed Sparse Deep Neural Network (DNN) Challenge draws upon prior challenges from machine learning, high performance computing, and visual analytics to create a challenge that is reflective of emerging sparse AI systems. The Sparse DNN Challenge is based on a mathematically well-defined DNN inference computation and can be implemented in any programming environment. Sparse DNN inference is amenable to both vertex-centric implementations and array-based implementations (e.g., using the GraphBLAS.org standard). The computations are simple enough that performance predictions can be made based on simple computing hardware models. The input data sets are derived from the MNIST handwritten digits. The surrounding I/O and verification provide the context for each sparse DNN inference, allowing rigorous definition of both the input and the output. Furthermore, since the proposed Sparse DNN Challenge is scalable in both problem size and hardware, it can be used to measure and quantitatively compare a wide range of present-day and future systems. Reference implementations have been developed, and their serial and parallel performance has been measured. Specifications, data, and software are publicly available at GraphChallenge.org.
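The core computation of the challenge, layer-by-layer sparse inference of the form Y_{k+1} = ReLU(Y_k W_k + b_k), can be sketched in a few lines of SciPy; the random inputs below stand in for the MNIST-derived challenge data, and the official reference implementations at GraphChallenge.org remain the authoritative definition.

    import numpy as np
    import scipy.sparse as sp

    def sparse_dnn_inference(Y0, weights, bias, cap=32.0):
        """Layer-by-layer sparse DNN inference, Y_{k+1} = ReLU(Y_k @ W_k + b_k).
        Following a simplified reading of the challenge formulation, the bias is
        added only to the nonzero products, and large activations are clamped."""
        Y = Y0
        for W in weights:
            Z = Y @ W                          # sparse-sparse matrix multiply
            Z.data += bias                     # bias on stored (nonzero) entries
            Z.data = np.maximum(Z.data, 0.0)   # ReLU on the nonzeros
            Z.data = np.minimum(Z.data, cap)   # clamp runaway activations
            Z.eliminate_zeros()
            Y = Z
        return Y

    # Tiny random example standing in for the MNIST-derived challenge inputs.
    Y0 = sp.random(64, 256, density=0.10, format="csr", random_state=0)
    weights = [sp.random(256, 256, density=0.05, format="csr", random_state=k)
               for k in range(1, 5)]
    print(sparse_dnn_inference(Y0, weights, bias=-0.3).nnz)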
AI Enabling Technologies: A Survey
Gadepally, Vijay, Goodwin, Justin, Kepner, Jeremy, Reuther, Albert, Reynolds, Hayley, Samsi, Siddharth, Su, Jonathan, Martinez, David
Artificial Intelligence (AI) has the opportunity to revolutionize the way the United States Department of Defense (DoD) and Intelligence Community (IC) address the challenges of evolving threats, data deluge, and rapid courses of action. Developing an end-to-end artificial intelligence system involves parallel development of different pieces that must work together in order to provide capabilities that can be used by decision makers, warfighters, and analysts. These pieces include data collection, data conditioning, algorithms, computing, robust artificial intelligence, and human-machine teaming. While much of the popular press today surrounds advances in algorithms and computing, most modern AI systems leverage advances across numerous different fields. Further, while certain components may not be as visible to end-users as others, our experience has shown that each of these interrelated components plays a major role in the success or failure of an AI system. This article is meant to highlight many of the technologies involved in an end-to-end AI system. The goal of this article is to provide readers with an overview of terminology, technical details, and recent highlights from academia, industry, and government. Where possible, we indicate relevant resources that can be used for further reading and understanding.