He, Jingrui
Tensor Convolutional Network for Higher-Order Interaction Prediction in Sparse Tensors
Jang, Jun-Gi, He, Jingrui, Margenot, Andrew, Tong, Hanghang
Many kinds of real-world data, such as recommendation data and temporal graphs, can be represented as incomplete sparse tensors in which most entries are unobserved. For such sparse tensors, identifying the top-k unobserved higher-order interactions that are most likely to occur is crucial. Tensor factorization (TF) has gained significant attention in various tensor-based applications and serves as an effective method for finding these top-k potential interactions. However, existing TF methods focus primarily on fusing the latent vectors of entities, which limits their expressiveness. Since most entities in sparse tensors have only a few interactions, their latent representations are often insufficiently trained. In this paper, we propose TCN, an accurate and compatible tensor convolutional network that integrates seamlessly with existing TF methods for predicting higher-order interactions. We design a highly effective encoder that generates expressive latent vectors of entities. To achieve this, we (1) construct a graph structure derived from the sparse tensor and (2) develop a relation-aware encoder, TCN, that learns latent representations of entities by leveraging this graph structure. Since TCN complements traditional TF methods, we integrate it seamlessly with existing TF methods, enhancing the performance of top-k interaction prediction. Extensive experiments show that TCN integrated with a TF method outperforms competitors, including TF methods and a hyperedge prediction method. Moreover, TCN is broadly compatible with various TF methods and graph neural networks (GNNs), making it a versatile solution.
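To make the graph-construction step concrete, here is a minimal Python sketch under an assumption of ours (not necessarily the paper's exact construction): each observed tensor entry becomes an interaction node linked to the entities it involves, so sparsely observed entities can still receive signal from multi-hop neighbors during message passing. The name tensor_to_graph and the node-block layout are illustrative.

```python
import numpy as np

def tensor_to_graph(indices, shape):
    """indices: (nnz, 3) array of observed (i, j, k) coordinates."""
    offsets = np.concatenate([[0], np.cumsum(shape)])   # one node block per mode
    n_entities = offsets[-1]
    edges = []
    for e, entry in enumerate(indices):
        inter = n_entities + e                          # interaction node id
        for mode, idx in enumerate(entry):
            edges.append((offsets[mode] + idx, inter))  # entity <-> interaction
    return np.array(edges), n_entities + len(indices)

# toy tensor: 4 users x 3 items x 2 contexts, three observed entries
obs = np.array([[0, 1, 0], [0, 2, 1], [3, 1, 0]])
edges, n_nodes = tensor_to_graph(obs, shape=(4, 3, 2))
print(n_nodes, len(edges))                              # 12 nodes, 9 edges
```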
ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration
Ai, Mengting, Wei, Tianxin, Chen, Yifan, Zeng, Zhichen, Zhao, Ritchie, Varatkar, Girish, Rouhani, Bita Darvish, Tang, Xianfeng, Tong, Hanghang, He, Jingrui
Mixture-of-Experts (MoE) Transformer, the backbone architecture of multiple phenomenal language models, leverages sparsity by activating only a fraction of model parameters for each input token. The sparse structure, while allowing constant time costs, results in space inefficiency: we still need to load all the model parameters during inference. We introduce ResMoE, an innovative MoE approximation framework that utilizes Wasserstein barycenter to extract a common expert (barycenter expert) and approximate the residuals between this barycenter expert and the original ones. ResMoE enhances the space efficiency for inference of large-scale MoE Transformers in a one-shot and data-agnostic manner without retraining while maintaining minimal accuracy loss, thereby paving the way for broader accessibility to large language models.
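As a rough illustration of the residual idea, the sketch below uses a plain parameter average as a stand-in for the Wasserstein barycenter and truncated SVD to compress each expert's residual; both choices are our simplifications, not the paper's actual procedure.

```python
import numpy as np

def compress_experts(experts, rank):
    """experts: list of (d_out, d_in) weight matrices from one MoE layer."""
    barycenter = np.mean(experts, axis=0)          # shared "barycenter" expert
    factors = []
    for W in experts:
        U, s, Vt = np.linalg.svd(W - barycenter, full_matrices=False)
        factors.append((U[:, :rank] * s[:rank], Vt[:rank]))  # low-rank residual
    return barycenter, factors

def restore_expert(barycenter, factor):
    A, B = factor
    return barycenter + A @ B                      # approximate original weights

rng = np.random.default_rng(0)
experts = [rng.standard_normal((64, 32)) for _ in range(8)]
bc, facs = compress_experts(experts, rank=4)
err = np.linalg.norm(restore_expert(bc, facs[0]) - experts[0]) / np.linalg.norm(experts[0])
print(f"relative reconstruction error: {err:.3f}")
```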
Language in the Flow of Time: Time-Series-Paired Texts Weaved into a Unified Temporal Narrative
Li, Zihao, Lin, Xiao, Liu, Zhining, Zou, Jiaru, Wu, Ziwei, Zheng, Lecheng, Fu, Dongqi, Zhu, Yada, Hamann, Hendrik, Tong, Hanghang, He, Jingrui
While many advances in time series models focus exclusively on numerical data, research on multimodal time series, particularly those involving the contextual textual information commonly encountered in real-world scenarios, is still in its infancy. Consequently, effectively integrating the text modality remains challenging. In this work, we highlight an intuitive yet significant observation that has been overlooked by existing works: time-series-paired texts exhibit periodic properties that closely mirror those of the original time series. Building on this insight, we propose a novel framework, Texts as Time Series (TaTS), which treats the time-series-paired texts as auxiliary variables of the time series. TaTS can be plugged into any existing numerical-only time series model and enables it to handle time series data with paired texts effectively. Through extensive experiments on both multimodal time series forecasting and imputation tasks across benchmark datasets with various existing time series models, we demonstrate that TaTS enhances predictive performance and outperforms existing approaches without modifying model architectures.
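The plug-in idea can be pictured with a short sketch: each paired text is mapped to a small numeric vector and concatenated to the series as extra channels, so an unmodified numerical-only model consumes both. The hashing embedder below is a toy placeholder for a real text encoder, and weave is an illustrative name, not the TaTS API.

```python
import numpy as np

def embed_text(text, dim=4):
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0              # toy bag-of-words hashing
    n = np.linalg.norm(vec)
    return vec / n if n > 0 else vec

def weave(series, texts, dim=4):
    """series: (T, C) numeric array; texts: length-T list of paired strings."""
    aux = np.stack([embed_text(t, dim) for t in texts])
    return np.concatenate([series, aux], axis=1)   # (T, C + dim) for any model

series = np.random.default_rng(1).standard_normal((5, 2))
texts = ["demand spikes before holidays"] * 5
print(weave(series, texts).shape)                  # (5, 6)
```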
LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation
He, Xinrui, Ban, Yikun, Zou, Jiaru, Wei, Tianxin, Cook, Curtiss B., He, Jingrui
Missing data imputation is a critical challenge in various domains, such as healthcare and finance, where data completeness is vital for accurate analysis. Large language models (LLMs), trained on vast corpora, have shown strong potential in data generation, making them a promising tool for data imputation. However, challenges persist in designing effective prompts for a fine-tuning-free process and in mitigating the risk of LLM hallucinations. To address these issues, we propose a novel framework, LLM-Forest, which introduces a "forest" of few-shot-learning LLM "trees" with confidence-based weighted voting, inspired by ensemble learning (Random Forest). The framework is built on a new concept of bipartite information graphs that identify high-quality, relevant neighboring entries at both the feature and value granularity. Extensive experiments on 9 real-world datasets demonstrate the effectiveness and efficiency of LLM-Forest.
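A minimal sketch of the confidence-weighted voting step, with mocked tree outputs; in LLM-Forest these would come from few-shot prompts built over the bipartite information graphs.

```python
from collections import defaultdict

def forest_vote(tree_outputs):
    """tree_outputs: list of (predicted_value, confidence in [0, 1])."""
    scores = defaultdict(float)
    for value, conf in tree_outputs:
        scores[value] += conf                      # weight each vote by confidence
    return max(scores, key=scores.get)

# three mock trees imputing a patient's smoking status
outputs = [("never", 0.9), ("former", 0.6), ("never", 0.7)]
print(forest_vote(outputs))                        # "never"
```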
PyG-SSL: A Graph Self-Supervised Learning Toolkit
Zheng, Lecheng, Jing, Baoyu, Li, Zihao, Zeng, Zhichen, Wei, Tianxin, Ai, Mengting, He, Xinrui, Liu, Lihui, Fu, Dongqi, You, Jiaxuan, Tong, Hanghang, He, Jingrui
Graph Self-Supervised Learning (SSL) has emerged as a pivotal area of research in recent years. By engaging in pretext tasks that learn the intricate topological structures and properties of graphs from unlabeled data, graph SSL models achieve enhanced performance, improved generalization, and heightened robustness. Despite these remarkable achievements, current implementations pose significant challenges for beginners and practitioners: the complex nature of graph structures, inconsistent evaluation metrics, and concerns regarding reproducibility hinder further progress in this field. Given the growing interest within the research community, there is an urgent need for a comprehensive, beginner-friendly, and accessible toolkit comprising the most representative graph SSL algorithms. To address these challenges, we present PyG-SSL, a graph SSL toolkit built upon PyTorch and compatible with various deep learning and scientific computing backends. Within the toolkit, we offer a unified framework encompassing dataset loading, hyper-parameter configuration, model training, and comprehensive performance evaluation for diverse downstream tasks. Moreover, we provide beginner-friendly tutorials and the best hyper-parameters of each graph SSL algorithm on different graph datasets, facilitating the reproduction of results.
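As a flavor of what such toolkits train, here is a minimal NumPy sketch of the InfoNCE-style contrastive objective that many graph SSL methods (e.g., GraphCL-like approaches) optimize; it is a generic illustration, not PyG-SSL's actual code.

```python
import numpy as np

def info_nce(z1, z2, tau=0.5):
    """z1, z2: (n, d) embeddings of two augmented views; rows are paired positives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                          # pairwise cosine similarities
    logits = sim - sim.max(axis=1, keepdims=True)  # stabilize the softmax
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))             # positives sit on the diagonal

rng = np.random.default_rng(2)
z = rng.standard_normal((8, 16))
print(info_nce(z, z + 0.01 * rng.standard_normal((8, 16))))  # near-aligned views
```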
APEX$^2$: Adaptive and Extreme Summarization for Personalized Knowledge Graphs
Li, Zihao, Fu, Dongqi, Ai, Mengting, He, Jingrui
Knowledge graphs (KGs), which store an extensive number of relational facts, serve various applications. Recently, personalized knowledge graphs (PKGs) have emerged as a solution to optimize storage costs by customizing their content to align with users' specific interests within particular domains. In the real world, on one hand, user queries and their underlying interests are inherently evolving, requiring PKGs to adapt continuously; on the other hand, the summary is expected to remain as small as possible in storage cost. However, existing PKG summarization methods implicitly assume that the user's interests are constant and do not shift. Furthermore, when the size constraint of the PKG is extremely small, existing methods can neither distinguish which facts are of more immediate interest nor guarantee the utility of the summarized PKG. To address these limitations, we propose APEX$^2$, a highly scalable PKG summarization framework designed with robust theoretical guarantees to excel at adaptive summarization tasks under extremely small size constraints. Specifically, after constructing an initial PKG, APEX$^2$ continuously tracks the interest shift and adjusts the previous summary. We evaluate APEX$^2$ under an evolving query setting on benchmark KGs containing up to 12 million triples, summarizing with compression ratios $\leq 0.1\%$. The experiments show that APEX$^2$ outperforms state-of-the-art baselines in terms of both query-answering accuracy and efficiency.
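The adaptive flavor can be illustrated with a deliberately simplified sketch (not APEX$^2$'s actual algorithm): interest scores decay as queries arrive, and only the triples touching the highest-interest entities are kept under a hard budget.

```python
from collections import defaultdict

def summarize(triples, queries, budget, decay=0.8):
    """triples: (head, relation, tail) facts; queries: chronological entity list."""
    interest = defaultdict(float)
    for q in queries:
        for e in interest:
            interest[e] *= decay                   # older interests fade
        interest[q] += 1.0                         # fresh interest on queried entity
    scored = sorted(triples, key=lambda t: interest[t[0]] + interest[t[2]],
                    reverse=True)
    return scored[:budget]                         # extreme size constraint

kg = [("a", "r1", "b"), ("b", "r2", "c"), ("c", "r3", "d")]
print(summarize(kg, queries=["a", "b", "b"], budget=1))  # keeps ("a", "r1", "b")
```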
Trustworthy Transfer Learning: A Survey
Wu, Jun, He, Jingrui
Transfer learning aims to transfer knowledge or information from a source domain to a relevant target domain. In this paper, we understand transfer learning from the perspectives of knowledge transferability and trustworthiness. This involves two research questions: How is knowledge transferability quantitatively measured and enhanced across domains? Can we trust the transferred knowledge in the transfer learning process? To answer these questions, this paper provides a comprehensive review of trustworthy transfer learning from various aspects, including problem definitions, theoretical analysis, empirical algorithms, and real-world applications. Specifically, we summarize recent theories and algorithms for understanding knowledge transferability under (within-domain) IID and non-IID assumptions. In addition to knowledge transferability, we review the impact of trustworthiness on transfer learning, e.g., whether the transferred knowledge is adversarially robust or algorithmically fair, how to transfer the knowledge under privacy-preserving constraints, etc. Beyond discussing the current advancements, we highlight the open questions and future directions for understanding transfer learning in a reliable and trustworthy manner.
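As one concrete example of quantitatively measuring cross-domain discrepancy, the sketch below estimates the Gaussian-kernel maximum mean discrepancy (MMD) between source and target samples, one measure among many that such analyses use; lower MMD loosely suggests an easier transfer.

```python
import numpy as np

def mmd(x, y, gamma=1.0):
    """Biased MMD^2 estimate with an RBF kernel between samples x and y."""
    def k(a, b):
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(3)
src = rng.standard_normal((100, 5))
tgt_near = src + 0.1 * rng.standard_normal((100, 5))
tgt_far = src + 2.0                                 # strongly shifted target domain
print(mmd(src, tgt_near) < mmd(src, tgt_far))       # True: nearer domain, lower MMD
```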
Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision?
Li, Zihao, Zheng, Lecheng, Jin, Bowen, Fu, Dongqi, Jing, Baoyu, Ban, Yikun, He, Jingrui, Han, Jiawei
While great success has been achieved in building vision models with Contrastive Language-Image Pre-training (CLIP) over internet-scale image-text pairs, building transferable Graph Neural Networks (GNNs) with the CLIP pipeline is challenging because of three fundamental issues: the scarcity of labeled data and text supervision, the different levels of downstream tasks, and the conceptual gaps between domains. In this work, to address these issues, we leverage multi-modal prompt learning to effectively adapt a pre-trained GNN to downstream tasks and data, given only a few semantically labeled samples, each with extremely weak text supervision. Our new paradigm embeds graphs directly in the same space as Large Language Models (LLMs) by learning both graph prompts and text prompts simultaneously. To accomplish this, we improve the state-of-the-art graph prompt method and then propose the first graph-language multi-modal prompt learning approach for exploiting the knowledge in pre-trained models. Notably, because the supervision is insufficient for fine-tuning, the pre-trained GNN and the LLM are kept frozen in our paradigm, so the learnable parameters are far fewer than in fine-tuning any pre-trained model. Through extensive experiments on real-world datasets, we demonstrate the superior performance of our paradigm in few-shot, multi-task-level, and cross-domain settings. Moreover, we build the first CLIP-style zero-shot classification prototype that can generalize GNNs to unseen classes with extremely weak text supervision.
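The frozen-backbone point can be made concrete with a small PyTorch sketch in which the encoders are stand-in linear maps rather than an actual pre-trained GNN or LLM, and only two prompt vectors receive gradients.

```python
import torch

torch.manual_seed(0)
graph_enc = torch.nn.Linear(32, 64)                # stand-in pre-trained GNN
text_enc = torch.nn.Linear(300, 64)                # stand-in pre-trained LLM head
for p in list(graph_enc.parameters()) + list(text_enc.parameters()):
    p.requires_grad_(False)                        # both backbones kept frozen

graph_prompt = torch.zeros(32, requires_grad=True)   # learnable graph prompt
text_prompt = torch.zeros(300, requires_grad=True)   # learnable text prompt
opt = torch.optim.Adam([graph_prompt, text_prompt], lr=1e-2)

g_feat, t_feat = torch.randn(8, 32), torch.randn(8, 300)   # paired toy inputs
for _ in range(100):
    zg = graph_enc(g_feat + graph_prompt)          # prompts perturb frozen encoders
    zt = text_enc(t_feat + text_prompt)
    loss = -torch.nn.functional.cosine_similarity(zg, zt).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(round(loss.item(), 3))                       # only 332 parameters trained
```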
Provably Extending PageRank-based Local Clustering Algorithm to Weighted Directed Graphs with Self-Loops and to Hypergraphs
Li, Zihao, Fu, Dongqi, Liu, Hengyu, He, Jingrui
Local clustering aims to find a compact cluster near given starting instances. This work focuses on graph local clustering, which has broad applications beyond graphs because of the internal connectivities within various modalities. While most existing studies on local graph clustering adopt the discrete graph setting (i.e., unweighted graphs without self-loops), real-world graphs can be more complex. In this paper, we extend the non-approximating Andersen-Chung-Lang ("ACL") algorithm beyond discrete graphs and generalize its quadratic optimality to a wider range of graphs, including weighted, directed, and self-looped graphs as well as hypergraphs. Specifically, leveraging PageRank, we propose two algorithms: GeneralACL for graphs and HyperACL for hypergraphs. We theoretically prove that, under two mild conditions, both algorithms can identify a quadratically optimal local cluster in terms of conductance with probability at least 1/2. Regarding hypergraph properties, we address a fundamental gap in the literature by defining conductance for hypergraphs from the perspective of hypergraph random walks. Additionally, we provide experiments to validate our theoretical findings.
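For reference, a compact sketch of the classic approximate-PPR ACL pipeline that this work generalizes, in the plain undirected, unweighted setting: run the push procedure from a seed, then sweep nodes by degree-normalized PageRank mass and return the prefix with the lowest conductance.

```python
import numpy as np

def ppr_sweep(adj, seed, alpha=0.15, eps=1e-6):
    """adj: dense 0/1 symmetric adjacency matrix with zero diagonal."""
    n = len(adj)
    deg = adj.sum(axis=1)
    p, r = np.zeros(n), np.zeros(n)
    r[seed] = 1.0
    while (r / deg).max() >= eps:                 # ACL push loop
        u = int(np.argmax(r / deg))
        p[u] += alpha * r[u]
        push = (1 - alpha) * r[u] / 2
        r[u] = push                               # keep half the residual at u
        r += push * adj[u] / deg[u]               # spread the other half to neighbors
    order = np.argsort(-p / deg)                  # sweep by normalized PPR mass
    in_set = np.zeros(n, dtype=bool)
    vol_total, cut, vol = deg.sum(), 0.0, 0.0
    best, best_phi = None, np.inf
    for i, u in enumerate(order[:-1]):
        links = adj[u, in_set].sum()              # edges from u into current set
        in_set[u] = True
        vol += deg[u]
        cut += deg[u] - 2 * links
        phi = cut / min(vol, vol_total - vol)     # conductance of the prefix
        if phi < best_phi:
            best_phi, best = phi, order[:i + 1].copy()
    return best, best_phi

# two triangles joined by a single bridge edge; seed in the left triangle
A = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
cluster, phi = ppr_sweep(A, seed=0)
print(sorted(cluster.tolist()), round(phi, 3))    # [0, 1, 2] 0.143
```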
Transforming the Hybrid Cloud for Emerging AI Workloads
Chen, Deming, Youssef, Alaa, Pendse, Ruchi, Schleife, André, Clark, Bryan K., Hamann, Hendrik, He, Jingrui, Laino, Teodoro, Varshney, Lav, Wang, Yuxiong, Sil, Avirup, Jabbarvand, Reyhaneh, Xu, Tianyin, Kindratenko, Volodymyr, Costa, Carlos, Adve, Sarita, Mendis, Charith, Zhang, Minjia, Núñez-Corrales, Santiago, Ganti, Raghu, Srivatsa, Mudhakar, Kim, Nam Sung, Torrellas, Josep, Huang, Jian, Seelam, Seetharami, Nahrstedt, Klara, Abdelzaher, Tarek, Eilam, Tamar, Zhao, Huimin, Manica, Matteo, Iyer, Ravishankar, Hirzel, Martin, Adve, Vikram, Marinov, Darko, Franke, Hubertus, Tong, Hanghang, Ainsworth, Elizabeth, Zhao, Han, Vasisht, Deepak, Do, Minh, Oliveira, Fabio, Pacifici, Giovanni, Puri, Ruchir, Nagpurkar, Priya
This white paper, developed through close collaboration between IBM Research and UIUC researchers within the IIDAI Institute, envisions transforming hybrid cloud systems to meet the growing complexity of AI workloads through innovative, full-stack co-design approaches, emphasizing usability, manageability, affordability, adaptability, efficiency, and scalability. By integrating cutting-edge technologies such as generative and agentic AI, cross-layer automation and optimization, unified control plane, and composable and adaptive system architecture, the proposed framework addresses critical challenges in energy efficiency, performance, and cost-effectiveness. Incorporating quantum computing as it matures will enable quantum-accelerated simulations for materials science, climate modeling, and other high-impact domains. Collaborative efforts between academia and industry are central to this vision, driving advancements in foundation models for material design and climate solutions, scalable multimodal data processing, and enhanced physics-based AI emulators for applications like weather forecasting and carbon sequestration. Research priorities include advancing AI agentic systems, LLM as an Abstraction (LLMaaA), AI model optimization and unified abstractions across heterogeneous infrastructure, end-to-end edge-cloud transformation, efficient programming model, middleware and platform, secure infrastructure, application-adaptive cloud systems, and new quantum-classical collaborative workflows. These ideas and solutions encompass both theoretical and practical research questions, requiring coordinated input and support from the research community. This joint initiative aims to establish hybrid clouds as secure, efficient, and sustainable platforms, fostering breakthroughs in AI-driven applications and scientific discovery across academia, industry, and society.