AITopics | Liu, Junyi

Collaborating Authors

Liu, Junyi

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Good things come in small packages: Should we adopt Lite-GPUs in AI infrastructure?

Canakci, Burcu, Liu, Junyi, Wu, Xingbo, Cheriere, Nathanaël, Costa, Paolo, Legtchenko, Sergey, Narayanan, Dushyanth, Rowstron, Ant

arXiv.org Artificial IntelligenceJan-17-2025

To match the blooming demand of generative AI workloads, GPU designers have so far been trying to pack more and more compute and memory into single complex and expensive packages. However, there is growing uncertainty about the scalability of individual GPUs and thus AI clusters, as state-of-the-art GPUs are already displaying packaging, yield, and cooling limitations. We propose to rethink the design and scaling of AI clusters through efficiently-connected large clusters of Lite-GPUs, GPUs with single, small dies and a fraction of the capabilities of larger GPUs. We think recent advances in co-packaged optics can be key in overcoming the communication challenges of distributing AI workloads onto more Lite-GPUs. In this paper, we present the key benefits of Lite-GPUs on manufacturing cost, blast radius, yield, and power efficiency; and discuss systems opportunities and challenges around resource, workload, memory, and network management.

lite-gpus, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.10187

Country:

North America > United States (0.70)
Europe (0.46)

Genre: Research Report (0.66)

Industry: Information Technology (0.98)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Managed-Retention Memory: A New Class of Memory for the AI Era

Legtchenko, Sergey, Stefanovici, Ioan, Black, Richard, Rowstron, Antony, Liu, Junyi, Costa, Paolo, Canakci, Burcu, Narayanan, Dushyanth, Wu, Xingbo

arXiv.org Artificial IntelligenceJan-16-2025

AI clusters today are one of the major uses of High Bandwidth Memory (HBM). However, HBM is suboptimal for AI workloads for several reasons. Analysis shows HBM is overprovisioned on write performance, but underprovisioned on density and read bandwidth, and also has significant energy per bit overheads. It is also expensive, with lower yield than DRAM due to manufacturing complexity. We propose a new memory class: Managed-Retention Memory (MRM), which is more optimized to store key data structures for AI inference workloads. We believe that MRM may finally provide a path to viability for technologies that were originally proposed to support Storage Class Memory (SCM). These technologies traditionally offered long-term persistence (10+ years) but provided poor IO performance and/or endurance. MRM makes different trade-offs, and by understanding the workload IO patterns, MRM foregoes long-term data retention and write performance for better potential performance on the metrics important for these workloads.

large language model, natural language, workload, (18 more...)

arXiv.org Artificial Intelligence

2501.09605

Country: North America > United States > California (0.14)

Genre: Research Report (0.40)

Industry:

Information Technology (0.47)
Semiconductors & Electronics (0.46)

Technology:

Information Technology > Hardware > Memory (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)

Add feedback

ALPS: An Auto-Labeling and Pre-training Scheme for Remote Sensing Segmentation With Segment Anything Model

Zhang, Song, Wang, Qingzhong, Liu, Junyi, Xiong, Haoyi

arXiv.org Artificial IntelligenceJun-16-2024

In the fast-growing field of Remote Sensing (RS) image analysis, the gap between massive unlabeled datasets and the ability to fully utilize these datasets for advanced RS analytics presents a significant challenge. To fill the gap, our work introduces an innovative auto-labeling framework named ALPS (Automatic Labeling for Pre-training in Segmentation), leveraging the Segment Anything Model (SAM) to predict precise pseudo-labels for RS images without necessitating prior annotations or additional prompts. The proposed pipeline significantly reduces the labor and resource demands traditionally associated with annotating RS datasets. By constructing two comprehensive pseudo-labeled RS datasets via ALPS for pre-training purposes, our approach enhances the performance of downstream tasks across various benchmarks, including iSAID and ISPRS Potsdam. Experiments demonstrate the effectiveness of our framework, showcasing its ability to generalize well across multiple tasks even under the scarcity of extensively annotated datasets, offering a scalable solution to automatic segmentation and annotation challenges in the field. In addition, the proposed a pipeline is flexible and can be applied to medical image segmentation, remarkably boosting the performance. Note that ALPS utilizes pre-trained SAM to semi-automatically annotate RS images without additional manual annotations. Though every component in the pipeline has bee well explored, integrating clustering algorithms with SAM and novel pseudo-label alignment significantly enhances RS segmentation, as an off-the-shelf tool for pre-training data preparation. Our source code is available at: https://github.com/StriveZs/ALPS.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2406.10855

Country:

Europe > Germany > Brandenburg > Potsdam (0.26)
Europe > Switzerland > Zürich (0.14)

Genre: Research Report > Promising Solution (0.46)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.62)
Health & Medicine > Diagnostic Medicine > Imaging (0.48)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Data-driven Piecewise Affine Decision Rules for Stochastic Programming with Covariate Information

Zhang, Yiyang, Liu, Junyi, Zhao, Xiaobo

arXiv.org Machine LearningDec-20-2023

Focusing on stochastic programming (SP) with covariate information, this paper proposes an empirical risk minimization (ERM) method embedded within a nonconvex piecewise affine decision rule (PADR), which aims to learn the direct mapping from features to optimal decisions. We establish the nonasymptotic consistency result of our PADR-based ERM model for unconstrained problems and asymptotic consistency result for constrained ones. To solve the nonconvex and nondifferentiable ERM problem, we develop an enhanced stochastic majorization-minimization algorithm and establish the asymptotic convergence to (composite strong) directional stationarity along with complexity analysis. We show that the proposed PADR-based ERM method applies to a broad class of nonconvex SP problems with theoretical consistency guarantees and computational tractability. Our numerical study demonstrates the superior performance of PADR-based ERM methods compared to state-of-the-art approaches under various settings, with significantly lower costs, less computation time, and robustness to feature dimensions and nonlinearity of the underlying dependency.

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Machine Learning

2304.13646

Country:

North America > United States > California (0.14)
North America > United States > Massachusetts (0.14)
North America > United States > Florida (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.84)

Add feedback

TCRA-LLM: Token Compression Retrieval Augmented Large Language Model for Inference Cost Reduction

Liu, Junyi, Li, Liangzhi, Xiang, Tong, Wang, Bowen, Qian, Yiming

arXiv.org Artificial IntelligenceOct-25-2023

Since ChatGPT released its API for public use, the number of applications built on top of commercial large language models (LLMs) increase exponentially. One popular usage of such models is leveraging its in-context learning ability and generating responses given user queries leveraging knowledge obtained by retrieval augmentation. One problem of deploying commercial retrieval-augmented LLMs is the cost due to the additionally retrieved context that largely increases the input token size of the LLMs. To mitigate this, we propose a token compression scheme that includes two methods: summarization compression and semantic compression. The first method applies a T5-based model that is fine-tuned by datasets generated using self-instruct containing samples with varying lengths and reduce token size by doing summarization. The second method further compresses the token size by removing words with lower impact on the semantic. In order to adequately evaluate the effectiveness of the proposed methods, we propose and utilize a dataset called Food-Recommendation DB (FRDB) focusing on food recommendation for women around pregnancy period or infants. Our summarization compression can reduce 65% of the retrieval token size with further 0.3% improvement on the accuracy; semantic compression provides a more flexible way to trade-off the token size with performance, for which we can reduce the token size by 20% with only 1.6% of accuracy drop.

large language model, natural language, token compression retrieval augmented, (3 more...)

arXiv.org Artificial Intelligence

2310.15556

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback