MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference
Wan, Zhongwei, Shen, Hui, Wang, Xin, Liu, Che, Mai, Zheda, Zhang, Mi
Long-context Multimodal Large Language Models (MLLMs), which incorporate long text-image and text-video inputs, demand substantial resources as their multimodal Key-Value (KV) caches grow with increasing input lengths, challenging inference efficiency. Existing KV cache compression methods, for both text-only and multimodal LLMs, have neglected attention density variations across layers and thus often adopt uniform or progressive reduction strategies for layer-wise cache allocation. In this work, we propose MEDA, a dynamic layer-wise KV cache allocation method for efficient multimodal long-context inference. At its core, MEDA utilizes cross-modal attention entropy to determine the KV cache size at each MLLM layer. Given the dynamically allocated KV cache size at each layer, MEDA also employs a KV pair selection scheme to identify which KV pairs to retain and a KV pair merging strategy that merges the selected and non-selected pairs to preserve information from the entire context. MEDA achieves up to 72% KV cache memory reduction and 2.82 times faster decoding, while maintaining or enhancing performance on various multimodal tasks in long-context settings, including multi-image and long-video scenarios. Our code is released at https://github.com/AIoT-MLSys-Lab/MEDA.
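To make the abstract's three-step pipeline concrete, here is a minimal sketch of entropy-based budget allocation, KV pair selection, and merging. Every function name, the attention-score selection criterion, and the 0.5 merge weight are illustrative assumptions, not the released implementation; see the linked repository for the actual code.

```python
# Hedged sketch of MEDA-style layer-wise KV cache allocation (assumed details).
import torch

def attention_entropy(attn: torch.Tensor) -> torch.Tensor:
    """Mean entropy of attention distributions; attn: (heads, q_len, kv_len)."""
    p = attn.clamp_min(1e-9)
    return -(p * p.log()).sum(dim=-1).mean()

def allocate_kv_budget(layer_attns, total_budget: int):
    """Split a total KV budget across layers in proportion to each layer's
    cross-modal attention entropy (denser attention -> larger cache)."""
    ent = torch.stack([attention_entropy(a) for a in layer_attns])
    weights = ent / ent.sum()
    return [max(1, int(w.item() * total_budget)) for w in weights]

def select_and_merge(keys, values, attn, budget: int):
    """Keep the `budget` KV pairs with the highest accumulated attention, then
    merge each evicted pair into its most similar kept pair (0.5/0.5 averaging
    is an assumption) so information from the full context is preserved."""
    scores = attn.sum(dim=(0, 1))                      # (kv_len,) accumulated attention
    keep = scores.topk(min(budget, scores.numel())).indices
    kept = set(keep.tolist())
    for d in [i for i in range(keys.size(0)) if i not in kept]:
        t = (keys[keep] @ keys[d]).argmax()            # most similar kept key
        keys[keep[t]] = 0.5 * (keys[keep[t]] + keys[d])
        values[keep[t]] = 0.5 * (values[keep[t]] + values[d])
    return keys[keep], values[keep]

if __name__ == "__main__":
    heads, q, kv, d = 4, 8, 32, 16
    attn = torch.softmax(torch.randn(heads, q, kv), dim=-1)
    k2, v2 = select_and_merge(torch.randn(kv, d), torch.randn(kv, d), attn, budget=8)
    print(k2.shape, v2.shape)  # torch.Size([8, 16]) torch.Size([8, 16])
```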
Multi-Epoch learning with Data Augmentation for Deep Click-Through Rate Prediction
Fan, Zhongxiang, Liu, Zhaocheng, Liang, Jian, Kong, Dongying, Li, Han, Jiang, Peng, Li, Shuang, Gai, Kun
This paper investigates the one-epoch overfitting phenomenon in Click-Through Rate (CTR) models, where performance notably declines at the start of the second epoch. Despite extensive research, whether multi-epoch training can outperform the conventional one-epoch approach remains unclear. We identify overfitting of the embedding layer, caused by high-dimensional data sparsity, as the primary issue. To address it, we introduce a novel and simple Multi-Epoch learning with Data Augmentation (MEDA) framework, suitable for both non-continual and continual learning scenarios, which can be seamlessly integrated into existing deep CTR models and may also help address the "forgetting or overfitting" dilemma in retraining as well as the well-known catastrophic forgetting problem. MEDA minimizes overfitting by reducing the dependency of the embedding layer on subsequent training data and on the Multi-Layer Perceptron (MLP) layers, and achieves data augmentation by training the MLP with varied embedding spaces. Our findings confirm that pre-trained MLP layers can adapt to new embedding spaces, enhancing performance without overfitting. This adaptability underscores the MLP layers' role in learning a matching function that focuses on the relative relationships among embeddings rather than their absolute positions. To our knowledge, MEDA represents the first multi-epoch training strategy tailored for deep CTR prediction models. We conduct extensive experiments on several public and business datasets, fully demonstrating the effectiveness of the data augmentation and its superiority over conventional single-epoch training. MEDA has also exhibited significant benefits in a real-world online advertising system.
- Asia > China > Shandong Province > Dongying (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy (0.04)
- Asia > China > Beijing > Beijing (0.04)
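To make the core mechanism of the abstract above concrete, here is a minimal PyTorch sketch of the non-continual MEDA idea: the embedding table can be reinitialized while the MLP keeps its weights, so each pass over the data presents the MLP with a fresh embedding space. The model shape, dimensions, and the `reinit_embedding` name are all illustrative assumptions, not the authors' code.

```python
# Hedged sketch of a deep CTR model with a MEDA-style reinitializable embedding.
import torch
import torch.nn as nn

class CTRModel(nn.Module):
    def __init__(self, n_features: int, emb_dim: int = 16, n_fields: int = 10):
        super().__init__()
        self.embedding = nn.Embedding(n_features, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(n_fields * emb_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def reinit_embedding(self):
        # MEDA step: discard the learned (overfitting-prone) embeddings
        # while the pre-trained MLP layers keep their weights.
        nn.init.normal_(self.embedding.weight, std=0.01)

    def forward(self, x):  # x: (batch, n_fields) of integer feature ids
        e = self.embedding(x).flatten(1)
        return self.mlp(e).squeeze(-1)  # logits for BCE-with-logits loss
```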
Multi-Epoch Learning for Deep Click-Through Rate Prediction Models
Liu, Zhaocheng, Fan, Zhongxiang, Liang, Jian, Kong, Dongying, Li, Han
The one-epoch overfitting phenomenon has been widely observed in industrial Click-Through Rate (CTR) applications, where model performance degrades significantly at the beginning of the second epoch. Recent work has tried to understand the underlying factors behind this phenomenon through extensive experiments. However, it remains unknown whether a multi-epoch training paradigm could achieve better results, as the best performance is usually obtained with one-epoch training. In this paper, we hypothesize that this phenomenon may be attributed to the susceptibility of the embedding layer to overfitting, which can stem from the high-dimensional sparsity of the data. To maintain feature sparsity while avoiding embedding overfitting, we propose a novel Multi-Epoch learning with Data Augmentation (MEDA) scheme that can be directly applied to most deep CTR models. MEDA achieves data augmentation by reinitializing the embedding layer in each epoch, thereby avoiding embedding overfitting while improving convergence. To the best of our knowledge, MEDA is the first multi-epoch training paradigm designed for deep CTR prediction models. We conduct extensive experiments on several public datasets, fully verifying the effectiveness of the proposed MEDA. Notably, the results show that MEDA can significantly outperform conventional one-epoch training. MEDA has also exhibited significant benefits in a real-world scenario at Kuaishou.
- Asia > China > Beijing > Beijing (0.05)
- Asia > China > Shandong Province > Dongying (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy (0.04)
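This second MEDA abstract emphasizes per-epoch reinitialization of the embedding layer. A hedged sketch of that multi-epoch loop, driving the CTRModel sketched after the previous abstract, might look like the following; resetting the optimizer state each epoch is an extra assumption the abstract does not specify.

```python
# Hedged sketch of a MEDA multi-epoch training loop (assumed details).
import torch

def train_meda(model, loader, n_epochs: int = 3, lr: float = 1e-3):
    loss_fn = torch.nn.BCEWithLogitsLoss()
    for epoch in range(n_epochs):
        model.reinit_embedding()  # fresh embedding space per epoch = data augmentation
        # Assumption: optimizer state is also reset alongside the embedding.
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y.float())
            loss.backward()
            opt.step()

if __name__ == "__main__":
    # Synthetic usage with the CTRModel from the earlier sketch.
    xs = torch.randint(0, 1000, (512, 10))
    ys = torch.randint(0, 2, (512,))
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(xs, ys), batch_size=64)
    train_meda(CTRModel(n_features=1000), loader)
```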
NASA's Perseverance rover snaps selfies of its 'head' and 'face'
NASA's Perseverance rover has sent back two selfies of its camera-laden 'face' and 'head' from the Jezero Crater on the surface of Mars. The two snaps show Perseverance's remote sensing mast, which hosts many of the rover's cameras and scientific instruments. They were taken with the SHERLOC WATSON camera, located on the turret at the end of the rover's robotic arm. Perseverance touched down on the Red Planet on February 18 after a nearly seven-month journey through space. It is tasked with seeking traces of fossilised microbial life from Mars' ancient past and with collecting rock specimens for return to Earth on future missions to the Red Planet.
- Government > Space Agency (0.96)
- Government > Regional Government > North America Government > United States Government (0.96)