AITopics | Yu, Sixing

Collaborating Authors

Yu, Sixing

SuperSAM: Crafting a SAM Supernetwork via Structured Pruning and Unstructured Parameter Prioritization

Abebe, Waqwoya, Jafari, Sadegh, Yu, Sixing, Dutta, Akash, Strube, Jan, Tallent, Nathan R., Guo, Luanzheng, Munoz, Pablo, Jannesari, Ali

arXiv.org Artificial Intelligence, Jan-14-2025

Neural Architecture Search (NAS) is a powerful approach for automating the design of efficient neural architectures. In contrast to traditional NAS methods, recently proposed one-shot NAS methods have proven more efficient. One-shot NAS works by generating a single weight-sharing supernetwork that acts as a search space (container) of subnetworks. Despite its achievements, designing the one-shot search space remains a major challenge. In this work, we propose a search space design strategy for Vision Transformer (ViT)-based architectures. In particular, we convert the Segment Anything Model (SAM) into a weight-sharing supernetwork called SuperSAM. Our approach automates the search space design via layer-wise structured pruning and parameter prioritization. While the structured pruning applies probabilistic removal of certain transformer layers, parameter prioritization performs weight reordering and slicing of MLP blocks in the remaining layers. We train supernetworks on several datasets using the sandwich rule. For deployment, we enhance subnetwork discovery by using a program autotuner to identify efficient subnetworks within the search space. The resulting subnetworks are 30-70% smaller than the original pre-trained SAM ViT-B, yet outperform it. Our work introduces a new and effective method for ViT NAS search-space design.
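As a rough illustration of the "weight reordering and slicing of MLP blocks" idea mentioned in the abstract, the sketch below reorders the hidden units of a two-layer MLP by an importance score and then keeps only the first few. This is not the SuperSAM implementation: the importance score (L1 norm of each hidden unit's incoming weights) and the function names `reorder_hidden_units` and `slice_mlp` are assumptions made for the example.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(w, x):
    # w is a list of rows; returns w @ x
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def mlp_forward(w1, w2, x):
    # y = W2 * relu(W1 * x); W1 is (hidden x d_in), W2 is (d_out x hidden)
    return matvec(w2, relu(matvec(w1, x)))

def reorder_hidden_units(w1, w2):
    # Assumed importance score: L1 norm of each hidden unit's incoming weights.
    scores = [sum(abs(v) for v in row) for row in w1]
    order = sorted(range(len(w1)), key=lambda j: -scores[j])
    # Permute rows of W1 and the matching columns of W2 together,
    # so the full-width network computes exactly the same function.
    w1_sorted = [w1[j] for j in order]
    w2_sorted = [[row[j] for j in order] for row in w2]
    return w1_sorted, w2_sorted

def slice_mlp(w1, w2, keep):
    # A narrower subnetwork keeps only the first `keep` (most important) units.
    return w1[:keep], [row[:keep] for row in w2]

w1 = [[0.1, 0.0], [2.0, -1.0], [0.5, 0.5]]
w2 = [[1.0, 2.0, 3.0]]
x = [1.0, 2.0]

s1, s2 = reorder_hidden_units(w1, w2)
k1, k2 = slice_mlp(s1, s2, keep=2)
print(mlp_forward(w1, w2, x), mlp_forward(s1, s2, x), mlp_forward(k1, k2, x))
```

Because the reordering is a consistent permutation, the full network's output is unchanged; only slicing changes it, and after reordering every narrower subnetwork shares the leading weights of the wider ones.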

artificial intelligence, machine learning, subnetwork, (15 more...)

arXiv:2501.08504

Country:

North America > United States (0.46)
North America > Canada > Quebec (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Industry:

Energy (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

Butler, Branden, Yu, Sixing, Mazaheri, Arya, Jannesari, Ali

arXiv.org Artificial Intelligence, Jul-16-2024

Inference of Large Language Models (LLMs) across computer clusters has become a focal point of research in recent times, with many acceleration techniques taking inspiration from CPU speculative execution. These techniques reduce bottlenecks associated with memory bandwidth, but also increase end-to-end latency per inference run, requiring high speculation acceptance rates to improve performance. Combined with a variable rate of acceptance across tasks, speculative inference techniques can result in reduced performance. Additionally, pipeline-parallel designs require many user requests to maintain maximum utilization. As a remedy, we propose PipeInfer, a pipelined speculative acceleration technique to reduce inter-token latency and improve system utilization for single-request scenarios while also improving tolerance to low speculation acceptance rates and low-bandwidth interconnects. PipeInfer exhibits up to a 2.15× improvement in generation speed over standard speculative inference. PipeInfer achieves its improvement through Continuous Asynchronous Speculation and Early Inference Cancellation, the former improving latency and generation speed by running single-token inference simultaneously with several speculative runs, while the latter improves speed and latency by skipping the computation of invalidated runs, even in the middle of inference.
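To make the notion of a "speculation acceptance rate" concrete, here is a minimal sketch of the verification step in standard greedy speculative inference, the baseline the abstract compares against. It does not model PipeInfer's asynchronous pipelining or early cancellation. The function name `verify_speculation` and the toy target model are assumptions for illustration; real systems typically verify all draft tokens in one batched forward pass rather than one at a time.

```python
def verify_speculation(context, draft_tokens, target_next):
    """Accept draft tokens while they match the target model's greedy choice.

    `target_next(tokens)` is a stand-in for the large model's greedy
    next-token prediction (an assumption of this sketch).
    """
    accepted = []
    for tok in draft_tokens:
        expected = target_next(context + accepted)
        if tok != expected:
            # First mismatch: keep the target model's token, drop the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
    # Every draft token matched, so the target model adds one more token.
    accepted.append(target_next(context + accepted))
    return accepted

def toy_target(tokens):
    # Toy "model": the next token is the previous token plus one, modulo 10.
    return (tokens[-1] + 1) % 10

print(verify_speculation([1, 2, 3], [4, 5, 7, 8], toy_target))
print(verify_speculation([1, 2, 3], [4, 5], toy_target))
```

In the first call only two of four draft tokens are accepted, which is the low-acceptance case where the work spent on the rejected tokens is wasted.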

large language model, machine learning, natural language, (18 more...)

arXiv:2407.11798

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

The Landscape and Challenges of HPC Research and LLMs

Chen, Le, Ahmed, Nesreen K., Dutta, Akash, Bhattacharjee, Arijit, Yu, Sixing, Mahmud, Quazi Ishtiaque, Abebe, Waqwoya, Phan, Hung, Sarkar, Aishwarya, Butler, Branden, Hasabnis, Niranjan, Oren, Gal, Vo, Vy A., Munoz, Juan Pablo, Willke, Theodore L., Mattson, Tim, Jannesari, Ali

arXiv.org Artificial Intelligence, Feb-6-2024

Recently, language models (LMs), especially large language models (LLMs), have revolutionized the field of deep learning. Both encoder-decoder models and prompt-based techniques have shown immense potential for natural language processing and code-based tasks. Over the past several years, many research labs and institutions have invested heavily in high-performance computing, approaching or breaching exascale performance levels. In this paper, we posit that adapting and utilizing such language model-based techniques for tasks in high-performance computing (HPC) would be very beneficial. This study presents our reasoning behind the aforementioned position and highlights how existing ideas can be improved and adapted for HPC tasks.

large language model, machine learning, natural language, (17 more...)

arXiv:2402.02018

Country: North America > United States > New York (0.14)

Genre: Research Report > Promising Solution (0.46)

Industry:

Information Technology > Services (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Federated Foundation Models: Privacy-Preserving and Collaborative Learning for Large Models

Yu, Sixing, Muñoz, J. Pablo, Jannesari, Ali

arXiv.org Artificial Intelligence, Nov-7-2023

Foundation Models (FMs), such as LLaMA, BERT, GPT, ViT, and CLIP, have demonstrated remarkable success in a wide range of applications, driven by their ability to leverage vast amounts of data for pre-training. However, optimizing FMs often requires access to sensitive data, raising privacy concerns and limiting their applicability in many domains. In this paper, we propose the Federated Foundation Models (FFMs) paradigm, which combines the benefits of FMs and Federated Learning (FL) to enable privacy-preserving and collaborative learning across multiple end-users. We discuss the potential benefits and challenges of integrating FL into the lifespan of FMs, covering pre-training, fine-tuning, and application. We further outline potential future research avenues in FFM, including FFM pre-training, FFM fine-tuning, and federated prompt tuning, which allow the development of more personalized and context-aware models while ensuring data privacy. Moreover, we explore the possibility of continual/lifelong learning in FFMs, as increased computational power at the edge may unlock the potential for optimizing FMs using newly generated private data close to the data source. The proposed FFM concepts offer a flexible and scalable framework for training large language models in a privacy-preserving manner, setting the stage for subsequent advancements in both FM training and federated learning.

large language model, machine learning, natural language, (3 more...)

arXiv:2305.11414

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.80)

Bridging the Gap Between Foundation Models and Heterogeneous Federated Learning

Yu, Sixing, Muñoz, J. Pablo, Jannesari, Ali

arXiv.org Artificial Intelligence, Oct-4-2023

Federated learning (FL) offers privacy-preserving decentralized machine learning, optimizing models at edge clients without sharing private data. Simultaneously, foundation models (FMs) have gained traction in the artificial intelligence (AI) community due to their exceptional performance across various tasks. However, integrating FMs into FL presents challenges, primarily due to their substantial size and intensive resource requirements. This is especially true when considering the resource heterogeneity in edge FL systems. We present an adaptive framework for Resource-aware Federated Foundation Models (RaFFM) to address these challenges. RaFFM introduces specialized model compression algorithms tailored for FL scenarios, such as salient parameter prioritization and high-performance subnetwork extraction. These algorithms enable dynamic scaling of given transformer-based FMs to fit heterogeneous resource constraints at the network edge during both FL's optimization and deployment stages. Experimental results demonstrate that RaFFM shows significant superiority in resource utilization efficiency and uses fewer resources to deploy FMs to FL. Despite the lower resource consumption, target models optimized by RaFFM achieve performance on par with traditional FL methods applied to full-sized FMs. This is evident across tasks in both natural language processing and computer vision domains.

artificial intelligence, machine learning, natural language, (4 more...)

arXiv:2310.00247

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Enhancing Heterogeneous Federated Learning with Knowledge Extraction and Multi-Model Fusion

Nguyen, Duy Phuong, Yu, Sixing, Muñoz, J. Pablo, Jannesari, Ali

arXiv.org Artificial Intelligence, Sep-30-2023

Concerned with user data privacy, this paper presents a new federated learning (FL) method that trains machine learning models on edge devices without accessing sensitive data. Traditional FL methods, although privacy-protective, fail to manage model heterogeneity and incur high communication costs due to their reliance on aggregation methods. To address this limitation, we propose a resource-aware FL method that aggregates local knowledge from edge models and distills it into robust global knowledge through knowledge distillation. This method allows efficient multi-model knowledge fusion and the deployment of resource-aware models while preserving model heterogeneity. Our method improves communication cost and performance in heterogeneous data and models compared to existing FL algorithms. Notably, it reduces the communication cost of ResNet-32 by up to 50% and VGG-11 by up to 10× while delivering superior performance.
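The abstract describes fusing knowledge from several edge models through knowledge distillation. A generic ensemble-distillation loss, offered only as a sketch of that family of techniques and not as the paper's exact objective, averages the teachers' softened predictions and measures the KL divergence to the student's prediction. The averaging rule, the temperature value, and all function names here are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_teachers(teacher_logits, temperature=1.0):
    # Assumed fusion rule: plain average of the teachers' softened probabilities.
    probs = [softmax(l, temperature) for l in teacher_logits]
    n = len(probs)
    return [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]

def kl_divergence(p, q, eps=1e-12):
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    target = fuse_teachers(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return kl_divergence(target, student)

# Two edge models (teachers) and one global student, on a 3-class example.
teachers = [[2.0, 0.0, -1.0], [1.5, 0.5, -1.0]]
print(distillation_loss(teachers, [-1.0, 0.0, 2.0]))  # student disagrees: larger loss
print(distillation_loss(teachers, [1.8, 0.2, -1.0]))  # student agrees: smaller loss
```

Because only predictions on shared inputs are exchanged in this style of fusion, the teachers and the student may have different architectures, which is what makes distillation attractive for heterogeneous models.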

artificial intelligence, knowledge network, machine learning, (16 more...)

arXiv:2208.07978

Country: North America > United States > Iowa (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Heterogeneous Federated Learning using Dynamic Model Pruning and Adaptive Gradient

Yu, Sixing, Nguyen, Phuong, Anwar, Ali, Jannesari, Ali

arXiv.org Artificial Intelligence, Feb-9-2023

Federated Learning (FL) has emerged as a new paradigm for training machine learning models in a distributed manner without sacrificing data security and privacy. Training models on edge devices such as mobile phones is one of the most common use cases for FL. However, non-independent and identically distributed (non-IID) data on edge devices easily leads to training failures. In particular, over-parameterized machine learning models can easily overfit such data, resulting in inefficient federated learning and poor model performance. To overcome the overfitting issue, we propose an adaptive dynamic pruning approach for FL, which dynamically slims the model by dropping unimportant parameters, thereby preventing overfitting. Since a machine learning model's parameters react differently to different training samples, adaptive dynamic pruning evaluates the salience of the model's parameters according to the input training sample and retains only the salient parameters' gradients during back-propagation. We performed comprehensive experiments to evaluate our approach. The results show that, by removing redundant parameters in neural networks, our approach significantly reduces overfitting and greatly improves training efficiency. In particular, when training ResNet-32 on CIFAR-10, our approach reduces the communication cost by 57%. We further demonstrate the inference acceleration capability of the proposed algorithm: it reduces the inference FLOPs of DNNs on edge devices by up to 50% while maintaining the model's quality.
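The core mechanism in the abstract, keeping only the gradients of salient parameters during back-propagation, can be sketched as a gradient mask. The salience score used here (the magnitude of weight times gradient), the `keep_ratio` parameter, and the function names are assumptions for illustration; the paper's actual salience criterion may differ.

```python
def mask_gradients(weights, grads, keep_ratio):
    # Assumed salience score: |weight * gradient| for each parameter.
    salience = [abs(w * g) for w, g in zip(weights, grads)]
    keep = max(1, int(keep_ratio * len(weights)))
    ranked = sorted(range(len(weights)), key=lambda i: -salience[i])
    top = set(ranked[:keep])
    # Gradients of non-salient parameters are zeroed, so they are not updated
    # and (in a federated setting) would not need to be communicated.
    return [g if i in top else 0.0 for i, g in enumerate(grads)]

def sgd_step(weights, grads, lr):
    return [w - lr * g for w, g in zip(weights, grads)]

weights = [0.5, -2.0, 0.1, 1.0]
grads = [0.2, 0.1, -0.9, 0.05]

masked = mask_gradients(weights, grads, keep_ratio=0.5)
print(masked)
print(sgd_step(weights, masked, lr=0.1))
```

Since salience is recomputed from the current gradients, the set of retained parameters can change from one training sample or batch to the next, which is the "dynamic" aspect the abstract refers to.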

artificial intelligence, machine learning, pruning, (14 more...)

arXiv:2106.06921

Country:

North America > United States (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Resource-Aware Heterogeneous Federated Learning using Neural Architecture Search

Yu, Sixing, Nguyen, Phuong, Abebe, Waqwoya, Stanley, Justin, Munoz, Pablo, Jannesari, Ali

arXiv.org Artificial Intelligence, Nov-9-2022

Federated Learning (FL) is extensively used to train AI/ML models in distributed and privacy-preserving settings. Participant edge devices in FL systems typically contain non-independent and identically distributed (non-IID) private data and unevenly distributed computational resources. Preserving user data privacy while optimizing AI/ML models in a heterogeneous federated network requires us to address data heterogeneity and system/resource heterogeneity. Hence, we propose Resource-aware Federated Learning (RaFL) to address these challenges. RaFL allocates resource-aware models to edge devices using Neural Architecture Search (NAS) and allows heterogeneous model architecture deployment by knowledge extraction and fusion. Integrating NAS into FL enables on-demand customized model deployment for resource-diverse edge devices. Furthermore, we propose a multi-model architecture fusion scheme allowing the aggregation of the distributed learning results. Results demonstrate RaFL's superior resource efficiency compared to state-of-the-art (SoTA) methods.
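The idea of allocating resource-aware models to heterogeneous devices can be illustrated with a simple selection rule: give each device the largest candidate architecture that fits its budget. This sketch assumes the candidates have already been produced (for example by a NAS run) and replaces the search itself with a lookup. The candidate names, cost units, and the `allocate_models` function are hypothetical.

```python
def allocate_models(candidates, device_budgets):
    """Pick, for each device, the most expensive candidate within its budget.

    candidates: list of (name, cost) pairs, with cost in arbitrary units
    device_budgets: dict mapping device name to its resource budget
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    allocation = {}
    for device, budget in device_budgets.items():
        fitting = [name for name, cost in ranked if cost <= budget]
        # None means no candidate fits and the device cannot participate.
        allocation[device] = fitting[0] if fitting else None
    return allocation

# Hypothetical candidate architectures and device budgets.
candidates = [("small", 1.0), ("medium", 4.0), ("large", 10.0)]
budgets = {"phone": 5.0, "sensor": 0.5, "server": 50.0}

print(allocate_models(candidates, budgets))
```

Devices end up with different architectures under this rule, which is why the abstract pairs allocation with a fusion scheme for aggregating results across heterogeneous models.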

artificial intelligence, machine learning, rafl, (15 more...)

arXiv:2211.05716

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.63)
