AITopics | Liang, Dong

Collaborating Authors

Liang, Dong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The day-ahead scenario generation method for new energy based on an improved conditional generative diffusion model

Wang, Changgang, Liu, Wei, Cao, Yu, Liang, Dong, Li, Yang, Mo, Jingshan

arXiv.org Artificial IntelligenceMar-6-2025

In the context of the rising share of new energy generation, accurately generating new energy output scenarios is crucial for day-ahead power system scheduling. Deep learning-based scenario generation methods can address this need, but their black-box nature raises concerns about interpretability. To tackle this issue, this paper introduces a method for day-ahead new energy scenario generation based on an improved conditional generative diffusion model. This method is built on the theoretical framework of Markov chains and variational inference. It first transforms historical data into pure noise through a diffusion process, then uses conditional information to guide the denoising process, ultimately generating scenarios that satisfy the conditional distribution. Additionally, the noise table is improved to a cosine form, enhancing the quality of the generated scenarios. When applied to actual wind and solar output data, the results demonstrate that this method effectively generates new energy output scenarios with good adaptability.

artificial intelligence, diffusion model, machine learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.13335/j.1000-3673.pst.2024.1399

2503.07648

Country: Asia > China (0.29)

Genre: Research Report (0.71)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Renewable > Wind (0.91)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

Gu, Shuhao, Zhang, Jialing, Zhou, Siyuan, Yu, Kevin, Xing, Zhaohu, Wang, Liangdong, Cao, Zhou, Jia, Jintao, Zhang, Zhuoyi, Wang, Yixuan, Hu, Zhenchong, Zhang, Bo-Wen, Li, Jijie, Liang, Dong, Zhao, Yingli, Wang, Songjing, Ao, Yulong, Ju, Yiming, Ma, Huanhuan, Li, Xiaotong, Diao, Haiwen, Cui, Yufeng, Wang, Xinlong, Liu, Yaoqi, Feng, Fangxiang, Liu, Guang

arXiv.org Artificial IntelligenceJan-6-2025

Recently, Vision-Language Models (VLMs) have achieved remarkable progress in multimodal tasks, and multimodal instruction data serves as the foundation for enhancing VLM capabilities. Despite the availability of several open-source multimodal datasets, limitations in the scale and quality of open-source instruction data hinder the performance of VLMs trained on these datasets, leading to a significant gap compared to models trained on closed-source data. To address this challenge, we introduce Infinity-MM, a large-scale multimodal instruction dataset. We collected the available multimodal instruction datasets and performed unified preprocessing, resulting in a dataset with over 40 million samples that ensures diversity and accuracy. Furthermore, to enable large-scale expansion of instruction data and support the continuous acquisition of high-quality data, we propose a synthetic instruction generation method based on a tagging system and open-source VLMs. By establishing correspondences between different types of images and associated instruction types, this method can provide essential guidance during data synthesis. Leveraging this high-quality data, we have trained a 2-billion-parameter Vision-Language Model, Aquila-VL-2B, which achieves state-of-the-art (SOTA) performance among models of similar scale. The data is available at: https://huggingface.co/datasets/BAAI/Infinity-MM.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.18558

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (0.64)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Group & Reweight: A Novel Cost-Sensitive Approach to Mitigating Class Imbalance in Network Traffic Classification

Du, Wumei, Liang, Dong, Lv, Yiqin, Liang, Xingxing, Wu, Guanlin, Wang, Qi, Xie, Zheng

arXiv.org Machine LearningDec-11-2024

Internet services have led to the eruption of network traffic, and machine learning on these Internet data has become an indispensable tool, especially when the application is risk-sensitive. This paper focuses on network traffic classification in the presence of severe class imbalance. Such a distributional trait mostly drifts the optimal decision boundary and results in an unsatisfactory solution. This raises safety concerns in the network traffic field when previous class imbalance methods hardly deal with numerous minority malicious classes. To alleviate these effects, we design a \textit{group \& reweight} strategy for alleviating class imbalance. Inspired by the group distributionally optimization framework, our approach heuristically clusters classes into groups, iteratively updates the non-parametric weights for separate classes, and optimizes the learning model by minimizing reweighted losses. We theoretically interpret the optimization process from a Stackelberg game and perform extensive experiments on typical benchmarks. Results show that our approach can not only suppress the negative effect of class imbalance but also improve the comprehensive performance in prediction.

artificial intelligence, auc, machine learning, (16 more...)

arXiv.org Machine Learning

2409.19214

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Add feedback

Sub-DM:Subspace Diffusion Model with Orthogonal Decomposition for MRI Reconstruction

Guan, Yu, Cai, Qinrong, Li, Wei, Fan, Qiuyun, Liang, Dong, Liu, Qiegen

arXiv.org Artificial IntelligenceNov-6-2024

Diffusion model-based approaches recently achieved re-markable success in MRI reconstruction, but integration into clinical routine remains challenging due to its time-consuming convergence. This phenomenon is partic-ularly notable when directly apply conventional diffusion process to k-space data without considering the inherent properties of k-space sampling, limiting k-space learning efficiency and image reconstruction quality. To tackle these challenges, we introduce subspace diffusion model with orthogonal decomposition, a method (referred to as Sub-DM) that restrict the diffusion process via projections onto subspace as the k-space data distribution evolves toward noise. Particularly, the subspace diffusion model circumvents the inference challenges posed by the com-plex and high-dimensional characteristics of k-space data, so the highly compact subspace ensures that diffusion process requires only a few simple iterations to produce accurate prior information. Furthermore, the orthogonal decomposition strategy based on wavelet transform hin-ders the information loss during the migration of the vanilla diffusion process to the subspace. Considering the strate-gy is approximately reversible, such that the entire pro-cess can be reversed. As a result, it allows the diffusion processes in different spaces to refine models through a mutual feedback mechanism, enabling the learning of ac-curate prior even when dealing with complex k-space data. Comprehensive experiments on different datasets clearly demonstrate that the superiority of Sub-DM against state of-the-art methods in terms of reconstruction speed and quality.

artificial intelligence, data quality, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.03758

Country: Asia > China (0.47)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.69)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Theoretical Investigations and Practical Enhancements on Tail Task Risk Minimization in Meta Learning

Lv, Yiqin, Wang, Qi, Liang, Dong, Xie, Zheng

arXiv.org Artificial IntelligenceOct-30-2024

Meta learning is a promising paradigm in the era of large models and task distributional robustness has become an indispensable consideration in real-world scenarios. Recent advances have examined the effectiveness of tail task risk minimization in fast adaptation robustness improvement \citep{wang2023simple}. This work contributes to more theoretical investigations and practical enhancements in the field. Specifically, we reduce the distributionally robust strategy to a max-min optimization problem, constitute the Stackelberg equilibrium as the solution concept, and estimate the convergence rate. In the presence of tail risk, we further derive the generalization bound, establish connections with estimated quantiles, and practically improve the studied strategy. Accordingly, extensive evaluations demonstrate the significance of our proposal and its scalability to multimodal large models in boosting robustness.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2410.22788

Country:

Asia > China (0.14)
North America > Canada (0.14)
Europe > France (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

RS-MOCO: A deep learning-based topology-preserving image registration method for cardiac T1 mapping

Huang, Chiyi, Sun, Longwei, Liang, Dong, Liang, Haifeng, Zeng, Hongwu, Zhu, Yanjie

arXiv.org Artificial IntelligenceOct-15-2024

Cardiac T1 mapping can evaluate various clinical symptoms of myocardial tissue. However, there is currently a lack of effective, robust, and efficient methods for motion correction in cardiac T1 mapping. In this paper, we propose a deep learning-based and topology-preserving image registration framework for motion correction in cardiac T1 mapping. Notably, our proposed implicit consistency constraint dubbed BLOC, to some extent preserves the image topology in registration by bidirectional consistency constraint and local anti-folding constraint. To address the contrast variation issue, we introduce a weighted image similarity metric for multimodal registration of cardiac T1-weighted images. Besides, a semi-supervised myocardium segmentation network and a dual-domain attention module are integrated into the framework to further improve the performance of the registration. Numerous comparative experiments, as well as ablation studies, demonstrated the effectiveness and high robustness of our method. The results also indicate that the proposed weighted image similarity metric, specifically crafted for our network, contributes a lot to the enhancement of the motion correction efficacy, while the bidirectional consistency constraint combined with the local anti-folding constraint ensures a more desirable topology-preserving registration mapping.

artificial intelligence, machine learning, registration, (17 more...)

arXiv.org Artificial Intelligence

2410.11651

Country:

Europe (1.00)
Asia > China (0.29)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.96)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ERCache: An Efficient and Reliable Caching Framework for Large-Scale User Representations in Meta's Ads System

Zhou, Fang, Huang, Yaning, Liang, Dong, Li, Dai, Zhang, Zhongke, Wang, Kai, Xin, Xiao, Aboelela, Abdallah, Jiang, Zheliang, Wang, Yang, Song, Jeff, Zhang, Wei, Liang, Chen, Li, Huayu, Sun, ChongLin, Yang, Hang, Qu, Lei, Shu, Zhan, Yuan, Mindi, Maccherani, Emanuele, Hayat, Taha, Guo, John, Puvvada, Varna, Pashkevich, Uladzimir

arXiv.org Artificial IntelligenceOct-8-2024

The increasing complexity of deep learning models used for calculating user representations presents significant challenges, particularly with limited computational resources and strict service-level agreements (SLAs). Previous research efforts have focused on optimizing model inference but have overlooked a critical question: is it necessary to perform user model inference for every ad request in large-scale social networks? To address this question and these challenges, we first analyze user access patterns at Meta and find that most user model inferences occur within a short timeframe. T his observation reveals a triangular relationship among model complexity, embedding freshness, and service SLAs. Building on this insight, we designed, implemented, and evaluated ERCache, an efficient and robust caching framework for large-scale user representations in ads recommendation systems on social networks. ERCache categorizes cache into direct and failover types and applies customized settings and eviction policies for each model, effectively balancing model complexity, embedding freshness, and service SLAs, even considering the staleness introduced by caching. ERCache has been deployed at Meta for over six months, supporting more than 30 ranking models while efficiently conserving computational resources and complying with service SLA requirements.

artificial intelligence, machine learning, social media, (16 more...)

arXiv.org Artificial Intelligence

2410.06497

Country:

Europe (1.00)
North America > United States > California (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Services (0.87)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DP-MDM: Detail-Preserving MR Reconstruction via Multiple Diffusion Models

Geng, Mengxiao, Zhu, Jiahao, Zhu, Xiaolin, Liu, Qiqing, Liang, Dong, Liu, Qiegen

arXiv.org Artificial IntelligenceMay-9-2024

Detail features of magnetic resonance images play a cru-cial role in accurate medical diagnosis and treatment, as they capture subtle changes that pose challenges for doc-tors when performing precise judgments. However, the widely utilized naive diffusion model has limitations, as it fails to accurately capture more intricate details. To en-hance the quality of MRI reconstruction, we propose a comprehensive detail-preserving reconstruction method using multiple diffusion models to extract structure and detail features in k-space domain instead of image do-main. Moreover, virtual binary modal masks are utilized to refine the range of values in k-space data through highly adaptive center windows, which allows the model to focus its attention more efficiently. Last but not least, an inverted pyramid structure is employed, where the top-down image information gradually decreases, ena-bling a cascade representation. The framework effective-ly represents multi-scale sampled data, taking into ac-count the sparsity of the inverted pyramid architecture, and utilizes cascade training data distribution to repre-sent multi-scale data. Through a step-by-step refinement approach, the method refines the approximation of de-tails. Finally, the proposed method was evaluated by con-ducting experiments on clinical and public datasets. The results demonstrate that the proposed method outper-forms other methods.

artificial intelligence, machine learning, reconstruction, (20 more...)

arXiv.org Artificial Intelligence

2405.05763

Country: Asia > China > Jiangxi Province (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Pathology-genomic fusion via biologically informed cross-modality graph learning for survival analysis

Zhang, Zeyu, Zhao, Yuanshen, Duan, Jingxian, Liu, Yaou, Zheng, Hairong, Liang, Dong, Zhang, Zhenyu, Li, Zhi-Cheng

arXiv.org Artificial IntelligenceApr-11-2024

The diagnosis and prognosis of cancer are typically based on multi-modal clinical data, including histology images and genomic data, due to the complex pathogenesis and high heterogeneity. Despite the advancements in digital pathology and high-throughput genome sequencing, establishing effective multi-modal fusion models for survival prediction and revealing the potential association between histopathology and transcriptomics remains challenging. In this paper, we propose Pathology-Genome Heterogeneous Graph (PGHG) that integrates whole slide images (WSI) and bulk RNA-Seq expression data with heterogeneous graph neural network for cancer survival analysis. The PGHG consists of biological knowledge-guided representation learning network and pathology-genome heterogeneous graph. The representation learning network utilizes the biological prior knowledge of intra-modal and inter-modal data associations to guide the feature extraction. The node features of each modality are updated through attention-based graph learning strategy. Unimodal features and bi-modal fused features are extracted via attention pooling module and then used for survival prediction. We evaluate the model on low-grade gliomas, glioblastoma, and kidney renal papillary cell carcinoma datasets from the Cancer Genome Atlas (TCGA) and the First Affiliated Hospital of Zhengzhou University (FAHZU). Extensive experimental results demonstrate that the proposed method outperforms both unimodal and other multi-modal fusion models. For demonstrating the model interpretability, we also visualize the attention heatmap of pathological images and utilize integrated gradient algorithm to identify important tissue structure, biological pathways and key genes.

artificial intelligence, machine learning, pathway, (16 more...)

arXiv.org Artificial Intelligence

2404.08023

Country: Asia > China > Henan Province > Zhengzhou (0.24)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Scaling User Modeling: Large-scale Online User Representations for Ads Personalization in Meta

Zhang, Wei, Li, Dai, Liang, Chen, Zhou, Fang, Zhang, Zhongke, Wang, Xuewei, Li, Ru, Zhou, Yi, Huang, Yaning, Liang, Dong, Wang, Kai, Wang, Zhangyuan, Chen, Zhengxing, Li, Min, Wu, Fenggang, Chen, Minghai, Li, Huayu, Wu, Yunnan, Shu, Zhan, Yuan, Mindi, Reddy, Sri

arXiv.org Artificial IntelligenceNov-15-2023

Effective user representations are pivotal in personalized advertising. However, stringent constraints on training throughput, serving latency, and memory, often limit the complexity and input feature set of online ads ranking models. This challenge is magnified in extensive systems like Meta's, which encompass hundreds of models with diverse specifications, rendering the tailoring of user representation learning for each model impractical. To address these challenges, we present Scaling User Modeling (SUM), a framework widely deployed in Meta's ads ranking system, designed to facilitate efficient and scalable sharing of online user representation across hundreds of ads models. SUM leverages a few designated upstream user models to synthesize user embeddings from massive amounts of user features with advanced modeling techniques. These embeddings then serve as inputs to downstream online ads ranking models, promoting efficient representation sharing. To adapt to the dynamic nature of user features and ensure embedding freshness, we designed SUM Online Asynchronous Platform (SOAP), a latency free online serving system complemented with model freshness and embedding stabilization, which enables frequent user model updates and online inference of user embeddings upon each user request. We share our hands-on deployment experiences for the SUM framework and validate its superiority through comprehensive experiments. To date, SUM has been launched to hundreds of ads ranking models in Meta, processing hundreds of billions of user requests daily, yielding significant online metric gains and infrastructure cost savings.

artificial intelligence, machine learning, proceedings, (16 more...)

arXiv.org Artificial Intelligence

2311.09544

Country:

Europe (0.94)
North America > United States > New York > New York County > New York City (0.17)
North America > United States > Massachusetts > Suffolk County > Boston (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.82)

Industry:

Marketing (0.86)
Information Technology > Services (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Human Computer Interaction (0.72)

Add feedback