
 Yu, Xiaodong


Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors

arXiv.org Artificial Intelligence

Transformer models have achieved remarkable success in various machine learning tasks but suffer from high computational complexity and resource requirements. The quadratic complexity of the self-attention mechanism further exacerbates these challenges when dealing with long sequences and large datasets. Specialized AI hardware accelerators, such as the Habana GAUDI architecture, offer a promising solution to tackle these issues. GAUDI features a Matrix Multiplication Engine (MME) and a cluster of fully programmable Tensor Processing Cores (TPC). This paper explores the untapped potential of using GAUDI processors to accelerate Transformer-based models, addressing key challenges in the process. First, we provide a comprehensive performance comparison between the MME and TPC components, illuminating their relative strengths and weaknesses. Second, we explore strategies to optimize MME and TPC utilization, offering practical insights to enhance computational efficiency. Third, we evaluate the performance of Transformers on GAUDI, particularly in handling long sequences, and uncover performance bottlenecks. Lastly, we evaluate the end-to-end performance of two Transformer-based large language models (LLMs) on GAUDI. The contributions of this work encompass practical insights for practitioners and researchers alike. We delve into GAUDI's capabilities for Transformers through systematic profiling, analysis, and optimization exploration. Our study bridges a research gap and offers a roadmap for optimizing Transformer-based model training on the GAUDI architecture.
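As a minimal illustration of the quadratic scaling the paper studies, the sketch below times a plain PyTorch self-attention forward pass at increasing sequence lengths on CPU. The paper's actual profiling targets GAUDI's MME/TPC and its software stack; the model size, batch size, and timing loop here are illustrative assumptions only.

```python
# Minimal sketch: observe how self-attention forward latency grows with
# sequence length. Plain PyTorch on CPU; not the paper's GAUDI (MME/TPC)
# profiling setup. Hidden size, heads, and iteration count are assumptions.
import time
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=512, num_heads=8, batch_first=True)

for seq_len in (128, 256, 512, 1024):
    x = torch.randn(1, seq_len, 512)  # (batch, seq_len, hidden)
    with torch.no_grad():
        attn(x, x, x)                 # warm-up
        start = time.perf_counter()
        for _ in range(10):
            attn(x, x, x)
        elapsed = (time.perf_counter() - start) / 10
    print(f"seq_len={seq_len:5d}  avg forward time: {elapsed * 1e3:.2f} ms")
```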


Building Interpretable and Reliable Open Information Retriever for New Domains Overnight

arXiv.org Artificial Intelligence

Information retrieval (IR), or knowledge retrieval, is a critical component of many downstream tasks such as open-domain question answering (QA). It is also very challenging, as it requires succinctness, completeness, and correctness. In recent works, dense retrieval models have achieved state-of-the-art (SOTA) performance on in-domain IR and QA benchmarks by representing queries and knowledge passages with dense vectors and learning lexical and semantic similarity. However, using single dense vectors and end-to-end supervision is not always optimal, because queries may require attention to multiple aspects and even implicit knowledge. In this work, we propose an information retrieval pipeline that uses an entity/event linking model and a query decomposition model to focus more accurately on different information units of the query. We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverage and denotation accuracy across five IR and QA benchmarks. Because of its superior interpretability and cross-domain performance, it is a go-to system for applications that need to perform IR in a new domain without much dedicated effort.
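The sketch below shows only the overall pipeline shape described above: decompose the query, ground its information units, retrieve per unit, and merge. All components are hypothetical string-level placeholders standing in for the paper's trained linking, decomposition, and retrieval models.

```python
# Sketch of a decompose-link-retrieve pipeline. Every component below is a
# hypothetical placeholder; the paper uses trained entity/event linking and
# query decomposition models rather than these string heuristics.
from typing import Dict, List

def decompose_query(query: str) -> List[str]:
    # Placeholder: split a multi-aspect question into sub-queries.
    return [part.strip() for part in query.split(" and ") if part.strip()]

def link_entities_and_events(sub_query: str) -> List[str]:
    # Placeholder: return surface forms a linking model would ground.
    return [tok for tok in sub_query.split() if tok[0].isupper()]

def retrieve(unit: str, corpus: Dict[str, str], k: int = 2) -> List[str]:
    # Placeholder lexical-overlap retriever; a dense or hybrid retriever
    # would be used in practice.
    scored = sorted(corpus, key=lambda pid: -sum(w in corpus[pid] for w in unit.split()))
    return scored[:k]

def pipeline(query: str, corpus: Dict[str, str]) -> List[str]:
    passages = []
    for sub in decompose_query(query):
        units = link_entities_and_events(sub) or [sub]
        for unit in units:
            passages.extend(retrieve(unit, corpus))
    return list(dict.fromkeys(passages))  # deduplicate, keep order

corpus = {"p1": "Marie Curie won the Nobel Prize in Physics",
          "p2": "The Nobel Prize ceremony is held in Stockholm"}
print(pipeline("Who is Marie Curie and where is the Nobel ceremony held", corpus))
```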


HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs

arXiv.org Artificial Intelligence

Collaborative filtering (CF) has been proven to be one of the most effective techniques for recommendation. Among all CF approaches, SimpleX is the state-of-the-art method that adopts a novel loss function and a proper number of negative samples. However, no prior work optimizes SimpleX on multi-core CPUs, leading to limited performance. To this end, we perform an in-depth profiling and analysis of existing SimpleX implementations and identify their performance bottlenecks, including (1) irregular memory accesses, (2) unnecessary memory copies, and (3) redundant computations. To address these issues, we propose an efficient CF training system (called HEAT) that fully exploits the multi-level caching and multi-threading capabilities of modern CPUs. Specifically, the optimization of HEAT is threefold: (1) it tiles the embedding matrix to increase data locality and reduce cache misses (and thus read latency); (2) it optimizes stochastic gradient descent (SGD) with sampling by parallelizing vector products instead of matrix-matrix multiplications, in particular for the similarity computation, to avoid memory copies for matrix data preparation; and (3) it aggressively reuses intermediate results from the forward phase in the backward phase to alleviate redundant computation. Evaluation on five widely used datasets with both x86- and ARM-architecture processors shows that HEAT achieves up to a 45.2X speedup over the existing CPU solution, and a 4.5X speedup and 7.9X cost reduction in the cloud over the existing GPU solution on an NVIDIA V100 GPU.
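The toy NumPy sketch below illustrates two of the three ideas in isolation: computing similarities with per-sample vector products over only the sampled items instead of materializing a matrix-matrix product, and caching forward-phase intermediates (dot products and norms) for reuse in the backward phase. Tiling, multi-threading, and the actual SimpleX/CCL loss are omitted; shapes and names are assumptions.

```python
# Toy sketch of two HEAT ideas (not the actual implementation):
#   (1) per-sample vector products for sampled items only,
#   (2) reuse of forward-phase intermediates (dots, norms) in backward.
import numpy as np

rng = np.random.default_rng(0)
dim, num_items = 64, 1000
user = rng.normal(size=dim)
item_emb = rng.normal(size=(num_items, dim))

def forward(user, items):
    # One dot product per sampled item; cache what backward will need.
    dots = items @ user
    u_norm = np.linalg.norm(user)
    i_norms = np.linalg.norm(items, axis=1)
    cos = dots / (u_norm * i_norms + 1e-12)
    return cos, (u_norm, i_norms, cos)

def backward(user, items, grad_cos, cache):
    # Reuse cached norms and cosines instead of recomputing them.
    u_norm, i_norms, cos = cache
    denom = (u_norm * i_norms + 1e-12)[:, None]
    d_user = (grad_cos[:, None]
              * (items / denom - cos[:, None] * user / (u_norm ** 2 + 1e-12))).sum(0)
    return d_user

sampled = item_emb[rng.choice(num_items, size=8, replace=False)]
cos, cache = forward(user, sampled)
grad = backward(user, sampled, np.ones_like(cos), cache)
print(cos.shape, grad.shape)
```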


Pairwise Representation Learning for Event Coreference

arXiv.org Artificial Intelligence

Natural language processing tasks such as resolving the coreference of events require understanding the relations between two text snippets. These tasks are typically formulated as (binary) classification problems over independently induced representations of the text snippets. In this work, we develop a Pairwise Representation Learning (PairwiseRL) scheme for event mention pairs, in which we jointly encode a pair of text snippets so that the representation of each mention in the pair is induced in the context of the other. Furthermore, our representation supports a finer, structured representation of the text snippet to facilitate encoding events and their arguments. We show that PairwiseRL, despite its simplicity, outperforms the prior state-of-the-art event coreference systems on both cross-document and within-document event coreference benchmarks. We also conduct an in-depth analysis of the improvements and limitations of pairwise representations to provide insights for future work.
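A minimal sketch of the joint (pairwise) encoding idea: both snippets are fed to one cross-encoder sequence so each is contextualized by the other, then a binary head scores coreference. The paper's model additionally builds structured span representations for triggers and arguments; the [CLS] pooling, model choice, and untrained classifier here are simplifying assumptions.

```python
# Minimal sketch of jointly encoding a pair of event mentions with a
# cross-encoder; not the paper's full PairwiseRL architecture.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
classifier = torch.nn.Linear(encoder.config.hidden_size, 1)  # coreferent or not

snippet_a = "The company announced the acquisition of the startup on Monday."
snippet_b = "The purchase of the small firm was confirmed earlier this week."

# Encode both snippets in a single sequence: [CLS] a [SEP] b [SEP]
inputs = tokenizer(snippet_a, snippet_b, return_tensors="pt", truncation=True)
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state   # (1, seq_len, hidden)
    pair_repr = hidden[:, 0]                       # [CLS] as the pair vector
    score = torch.sigmoid(classifier(pair_repr))
print(f"coreference score (untrained head): {score.item():.3f}")
```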


Event Linking: Grounding Event Mentions to Wikipedia

arXiv.org Artificial Intelligence

Comprehending an article requires understanding its constituent events. However, the context where an event is mentioned often lacks the details of that event. A question arises: how can the reader obtain more knowledge about this particular event beyond what is provided by the local context in the article? This work defines Event Linking, a new natural language understanding task at the event level. Event linking tries to link an event mention appearing in an article to the most appropriate Wikipedia page. This page is expected to provide rich knowledge about what the event mention refers to. To standardize research in this new direction, our contributions are fourfold. First, this is the first work in the community that formally defines the Event Linking task. Second, we collect a dataset for this new task. Specifically, we automatically gather a training set from Wikipedia and then create two evaluation sets: one from the Wikipedia domain, reporting in-domain performance, and a second from the real-world news domain, evaluating out-of-domain performance. Third, we retrain and evaluate two state-of-the-art (SOTA) entity linking models, showing the challenges of event linking, and we propose an event-specific linking system, EVELINK, to set a competitive result for the new task. Fourth, we conduct a detailed and insightful analysis to help understand the task and the limitations of the current model. Overall, as our analysis shows, Event Linking is a considerably challenging and essential task requiring more effort from the community. Data and code are available here: https://github.com/CogComp/event-linking.
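To make the task shape concrete, here is an illustrative bi-encoder baseline for event linking, not EVELINK itself: encode the event mention in context and short Wikipedia page descriptions, then rank pages by cosine similarity. The model choice, pooling, and page texts are assumptions for the sketch.

```python
# Illustrative bi-encoder baseline for event linking (not EVELINK):
# rank candidate Wikipedia pages by similarity to the mention in context.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state
    return hidden[:, 0]  # [CLS] pooling, a simplification

mention = "Troops crossed the border on 24 February, starting the invasion."
pages = {
    "2022 Russian invasion of Ukraine": "Military invasion that began in February 2022.",
    "Invasion of Normandy": "Allied invasion of Normandy in June 1944.",
}
m = embed([mention])
p = embed(list(pages.values()))
scores = torch.nn.functional.cosine_similarity(m, p)
print("top candidate:", list(pages)[int(scores.argmax())])
```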


SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates

arXiv.org Artificial Intelligence

CNN-based surrogates have become prevalent in scientific applications as replacements for conventional, time-consuming physical approaches. Although these surrogates can yield satisfactory results with significantly lower computation costs over small training datasets, our benchmarking results show that data-loading overhead becomes the major performance bottleneck when training surrogates with large datasets. In practice, surrogates are usually trained with high-resolution scientific data, which can easily reach the terabyte scale. Several state-of-the-art data loaders have been proposed to improve loading throughput in general CNN training; however, they are sub-optimal when applied to surrogate training. In this work, we propose SOLAR, a surrogate data loader that can substantially increase loading throughput during training. It leverages three key observations from our benchmarking and contains three novel designs. Specifically, SOLAR first generates a pre-determined shuffled index list and accordingly optimizes the global access order and the buffer eviction scheme to maximize data reuse and the buffer hit rate. It then trades a lightweight computational imbalance for a reduction in the heavyweight loading-workload imbalance to speed up the overall training. It finally optimizes its data access pattern with HDF5 to achieve better parallel I/O throughput. Our evaluation with three scientific surrogates and 32 GPUs shows that SOLAR achieves up to a 24.4X speedup over the PyTorch Data Loader and a 3.52X speedup over state-of-the-art data loaders.
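The toy sketch below illustrates only the first idea: because the shuffled index list is fixed before training, the future access order is known and a bounded in-memory buffer can evict the sample whose next use is farthest away. The eviction rule shown is a Belady-style stand-in, and the HDF5 chunked I/O, imbalance tradeoff, and multi-GPU scheduling of the real system are omitted; sizes and the `load_sample` stub are assumptions.

```python
# Toy sketch of planning buffer reuse from a pre-determined shuffled index
# list (SOLAR's first design), not the actual SOLAR implementation.
import random

num_samples, num_epochs, buffer_size = 20, 3, 6

# Pre-determined shuffled index list for all epochs, known before training.
rng = random.Random(0)
schedule = [rng.sample(range(num_samples), num_samples) for _ in range(num_epochs)]

buffer, hits, misses = {}, 0, 0

def load_sample(idx):
    return f"sample-{idx}"  # stands in for an HDF5 read

for epoch, order in enumerate(schedule):
    for pos, idx in enumerate(order):
        if idx in buffer:
            hits += 1
            continue
        misses += 1
        if len(buffer) >= buffer_size:
            # Evict the buffered sample whose next use is farthest away,
            # which is only possible because the full future order is known.
            future = order[pos + 1:] + [i for ep in schedule[epoch + 1:] for i in ep]
            victim = max(buffer,
                         key=lambda b: future.index(b) if b in future else len(future) + 1)
            del buffer[victim]
        buffer[idx] = load_sample(idx)

print(f"buffer hits={hits}, misses={misses}, hit rate={hits / (hits + misses):.2%}")
```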


Visual Scene Interpretation as a Dialogue between Vision and Language

AAAI Conferences

We present a framework for semantic visual scene interpretation in a system with vision and language. In this framework the system consists of two modules, a language module and a vision module, that communicate with each other in the form of a dialogue to actively interpret the scene. The language module is responsible for obtaining domain knowledge from linguistic resources and reasoning on the basis of this knowledge and the visual input. It iteratively creates questions that amount to an attention mechanism for the vision module, which in turn shifts its focus to selected parts of the scene and applies selective segmentation and feature extraction. As the formalism for optimizing this dialogue, we use information theory. We demonstrate the framework on the problem of recognizing a static scene from its objects and show preliminary results for the problem of human activity recognition from video. Experiments demonstrate the effectiveness of the active paradigm in introducing attention and additional constraints into the sensing process.
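A toy sketch of the information-theoretic dialogue step: the language module picks the next question (which object detector to run) as the one that maximizes expected reduction in entropy over scene hypotheses. The scenes, candidate objects, and likelihoods below are made-up illustrative numbers, not the paper's model.

```python
# Toy sketch: choose the next question by expected information gain over
# scene hypotheses. All probabilities below are illustrative assumptions.
import math

# P(object present | scene) for a few candidate questions.
likelihood = {
    "kettle":  {"kitchen": 0.9,  "office": 0.05, "bathroom": 0.05},
    "monitor": {"kitchen": 0.05, "office": 0.9,  "bathroom": 0.02},
    "towel":   {"kitchen": 0.3,  "office": 0.05, "bathroom": 0.9},
}
prior = {"kitchen": 1 / 3, "office": 1 / 3, "bathroom": 1 / 3}

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def posterior(prior, obj, present):
    unnorm = {s: prior[s] * (likelihood[obj][s] if present else 1 - likelihood[obj][s])
              for s in prior}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

def expected_info_gain(prior, obj):
    p_present = sum(prior[s] * likelihood[obj][s] for s in prior)
    gain = entropy(prior)
    for present, p_ans in ((True, p_present), (False, 1 - p_present)):
        gain -= p_ans * entropy(posterior(prior, obj, present))
    return gain

best = max(likelihood, key=lambda obj: expected_info_gain(prior, obj))
print("next question: is there a", best, "in the scene?")
```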