AITopics | Ji, Shihao

Collaborating Authors

Ji, Shihao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

HyperGCL: Multi-Modal Graph Contrastive Learning via Learnable Hypergraph Views

Saifuddin, Khaled Mohammed, Ji, Shihao, Akbas, Esra

arXiv.org Artificial IntelligenceFeb-26-2025

--Recent advancements in Graph Contrastive Learning (GCL) have demonstrated remarkable effectiveness in improving graph representations. However, relying on predefined augmentations (e.g., node dropping, edge perturbation, attribute masking) may result in the loss of task-relevant information and a lack of adaptability to diverse input data. Furthermore, the selection of negative samples remains rarely explored. In this paper, we introduce HyperGCL, a novel multimodal GCL framework from a hypergraph perspective. HyperGCL constructs three distinct hypergraph views by jointly utilizing the input graph's structure and attributes, enabling a comprehensive integration of multiple modalities in contrastive learning. A learnable adaptive topology augmentation technique enhances these views by preserving important relations and filtering out noise. View-specific encoders capture essential characteristics from each view, while a network-aware contrastive loss leverages the underlying topology to define positive and negative samples effectively. Extensive experiments on benchmark datasets demonstrate that HyperGCL achieves state-of-the-art node classification performance. Building on the success of contrastive learning (CL) in computer vision and natural language processing [1], [2], CL approaches have been extended to graph data--known as Graph Contrastive Learning (GCL)--where Graph Neural Networks (GNNs) learn robust representations by maximizing agreement between augmented graph views [3]-[6]. First, they often depend on handcrafted augmentations such as node dropping, edge perturbation, and attribute masking.

artificial intelligence, machine learning, node, (13 more...)

arXiv.org Artificial Intelligence

2502.13277

Country: North America > United States > Connecticut > Tolland County > Storrs (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Add feedback

OpenGrok: Enhancing SNS Data Processing with Distilled Knowledge and Mask-like Mechanisms

AI, Lumen, School, Zaozhuang No. 28 Middle, Ji, Shihao, Song, Zihui, Zhong, Fucheng, Jia, Jisen, Wu, Zhaobo, Cao, Zheyi, Xu, Tianhao

arXiv.org Artificial IntelligenceFeb-11-2025

This report details Lumen Labs' novel approach to processing Social Networking Service (SNS) data. We leverage knowledge distillation, specifically a simple distillation method inspired by DeepSeek-R1's CoT acquisition, combined with prompt hacking, to extract valuable training data from the Grok model. This data is then used to fine-tune a Phi-3-mini model, augmented with a mask-like mechanism specifically designed for handling the nuances of SNS data. Our method demonstrates state-of-the-art (SOTA) performance on several SNS data processing tasks, outperforming existing models like Grok, Phi-3, and GPT-4. We provide a comprehensive analysis of our approach, including mathematical formulations, engineering details, ablation studies, and comparative evaluations.

large language model, machine learning, mechanism, (14 more...)

arXiv.org Artificial Intelligence

2502.07312

Genre: Research Report (1.00)

Industry: Information Technology > Software (0.62)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Add feedback

Enhancing Large Language Model Efficiencyvia Symbolic Compression: A Formal Approach Towards Interpretability

AI, Lumen, School, Tengzhou No. 1 Middle, Ji, Shihao, Song, Zihui, Zhong, Fucheng, Jia, Jisen, Wu, Zhaobo, Cao, Zheyi, Xu, Tianhao

arXiv.org Artificial IntelligenceJan-30-2025

This paper proposes a formal framework based on symbolic compression, integrating combinatory logic, information-theoretic optimal encoding, and context-aware inference techniques to achieve a step-change improvement in token efficiency while preserving semantic integrity. We establish a mathematical framework within a functional programming paradigm, derive the quantitative relationship between symbolic density and model interpretability, and propose a differentiable compression factor metric to evaluate encoding efficiency. Furthermore, we leverage parameter-efficient fine-tuning (PEFT) techniques to achieve a low-cost application of the GAEL language. Experimental results show that this method achieves a 78.3% token compression rate in code generation tasks while improving logical traceability by 62% through structural explicitness. This research provides new theoretical tools for efficient inference in LLMs and opens a symbolic path for model interpretability research.

interpretability, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.18657

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.75)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)

Add feedback

Chinese Stock Prediction Based on a Multi-Modal Transformer Framework: Macro-Micro Information Fusion

AI, Lumen, School, Tengzhou No. 1 Middle, Ji, Shihao, Song, Zihui, Zhong, Fucheng, Jia, Jisen, Wu, Zhaobo, Cao, Zheyi, Tianhao, Xu

arXiv.org Artificial IntelligenceJan-27-2025

This paper proposes an innovative Multi-Modal Transformer framework (MMF-Trans) designed to significantly improve the prediction accuracy of the Chinese stock market by integrating multi-source heterogeneous information including macroeconomy, micro-market, financial text, and event knowledge. The framework consists of four core modules: (1) A four-channel parallel encoder that processes technical indicators, financial text, macro data, and event knowledge graph respectively for independent feature extraction of multi-modal data; (2) A dynamic gated cross-modal fusion mechanism that adaptively learns the importance of different modalities through differentiable weight allocation for effective information integration; (3) A time-aligned mixed-frequency processing layer that uses an innovative position encoding method to effectively fuse data of different time frequencies and solves the time alignment problem of heterogeneous data; (4) A graph attention-based event impact quantification module that captures the dynamic impact of events on the market through event knowledge graph and quantifies the event impact coefficient. We introduce a hybrid-frequency Transformer and Event2Vec algorithm to effectively fuse data of different frequencies and quantify the event impact. Experimental results show that in the prediction task of CSI 300 constituent stocks, the root mean square error (RMSE) of the MMF-Trans framework is reduced by 23.7% compared to the baseline model, the event response prediction accuracy is improved by 41.2%, and the Sharpe ratio is improved by 32.6%.

artificial intelligence, information fusion, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.16621

Genre: Research Report (0.71)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

Transformer^-1: Input-Adaptive Computation for Resource-Constrained Deployment

AI, Lumen, School, Tengzhou No. 1 Middle, Ji, Shihao, Song, Zihui, Zhong, Fucheng, Jia, Jisen, Wu, Zhaobo, Cao, Zheyi, Tianhao, Xu

arXiv.org Artificial IntelligenceJan-26-2025

Addressing the resource waste caused by fixed computation paradigms in deep learning models under dynamic scenarios, this paper proposes a Transformer$^{-1}$ architecture based on the principle of deep adaptivity. This architecture achieves dynamic matching between input features and computational resources by establishing a joint optimization model for complexity and computation. Our core contributions include: (1) designing a two-layer control mechanism, composed of a complexity predictor and a reinforcement learning policy network, enabling end-to-end optimization of computation paths; (2) deriving a lower bound theory for dynamic computation, proving the system's theoretical reach to optimal efficiency; and (3) proposing a layer folding technique and a CUDA Graph pre-compilation scheme, overcoming the engineering bottlenecks of dynamic architectures. In the ImageNet-1K benchmark test, our method reduces FLOPs by 42.7\% and peak memory usage by 34.1\% compared to the standard Transformer, while maintaining comparable accuracy ($\pm$0.3\%). Furthermore, we conducted practical deployment on the Jetson AGX Xavier platform, verifying the effectiveness and practical value of this method in resource-constrained environments. To further validate the generality of the method, we also conducted experiments on several natural language processing tasks and achieved significant improvements in resource efficiency.

deep learning, machine learning, resource-constrained deployment, (3 more...)

arXiv.org Artificial Intelligence

2501.16394

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback

MyGO Multiplex CoT: A Method for Self-Reflection in Large Language Models via Double Chain of Thought Thinking

Ji, Shihao, Song, Zihui, Zhong, Fucheng, Jia, Jisen, Wu, Zhaobo, Cao, Zheyi, Xu, Tianhao

arXiv.org Artificial IntelligenceJan-20-2025

Recent advancements in large language models (LLMs) have demonstrated their impressive abilities in various reasoning and decision-making tasks. However, the quality and coherence of the reasoning process can still benefit from enhanced introspection and self-reflection. In this paper, we introduce Multiplex CoT (Chain of Thought), a method that enables LLMs to simulate a form of self-review while reasoning, by initiating double Chain of Thought (CoT) thinking. Multiplex CoT leverages the power of iterative reasoning, where the model generates an initial chain of thought and subsequently critiques and refines this reasoning with a second round of thought generation. This recursive approach allows for more coherent, logical, and robust answers, improving the overall decision-making process. We demonstrate how this method can be effectively implemented using simple prompt engineering in existing LLM architectures, achieving an effect similar to that of the Learning-Refinement Model (LRM) without the need for additional training. Additionally, we present a practical guide for implementing the method in Google Colab, enabling easy integration into real-world applications.

artificial intelligence, large language model, natural language, (14 more...)

arXiv.org Artificial Intelligence

2501.13117

Country: Europe (0.16)

Genre: Research Report (0.83)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks

Li, Yang, Han, Shaobo, Ji, Shihao

arXiv.org Artificial IntelligenceMay-27-2024

As the adoption of large language models increases and the need for per-user or per-task model customization grows, the parameter-efficient fine-tuning (PEFT) methods, such as low-rank adaptation (LoRA) and its variants, incur substantial storage and transmission costs. To further reduce stored parameters, we introduce a "divide-and-share" paradigm that breaks the barriers of low-rank decomposition across matrix dimensions, modules and layers by sharing parameters globally via a vector bank. As an instantiation of the paradigm to LoRA, our proposed VB-LoRA composites all the low-rank matrices of LoRA from a shared vector bank with a differentiable top-$k$ admixture module. VB-LoRA achieves extreme parameter efficiency while maintaining comparable or better performance compared to state-of-the-art PEFT methods. Extensive experiments demonstrate the effectiveness of VB-LoRA on natural language understanding, natural language generation, and instruction tuning tasks. When fine-tuning the Llama2-13B model, VB-LoRA only uses 0.4% of LoRA's stored parameters, yet achieves superior results. Our source code is available at https://github.com/leo-yangli/VB-LoRA.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2405.15179

Country:

Europe (0.93)
North America > United States > Hawaii > Honolulu County (0.14)

Genre:

Research Report (0.82)
Overview (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

FLSL: Feature-level Self-supervised Learning

Su, Qing, Netchaev, Anton, Li, Hai, Ji, Shihao

arXiv.org Artificial IntelligenceNov-6-2023

Current self-supervised learning (SSL) methods (e.g., SimCLR, DINO, VICReg,MOCOv3) target primarily on representations at instance level and do not generalize well to dense prediction tasks, such as object detection and segmentation.Towards aligning SSL with dense predictions, this paper demonstrates for the first time the underlying mean-shift clustering process of Vision Transformers (ViT), which aligns well with natural image semantics (e.g., a world of objects and stuffs). By employing transformer for joint embedding and clustering, we propose a two-level feature clustering SSL method, coined Feature-Level Self-supervised Learning (FLSL). We present the formal definition of the FLSL problem and construct the objectives from the mean-shift and k-means perspectives. We show that FLSL promotes remarkable semantic cluster representations and learns an embedding scheme amenable to intra-view and inter-view feature clustering. Experiments show that FLSL yields significant improvements in dense prediction tasks, achieving 44.9 (+2.8)% AP and 46.5% AP in object detection, as well as 40.8 (+2.3)% AP and 42.1% AP in instance segmentation on MS-COCO, using Mask R-CNN with ViT-S/16 and ViT-S/8 as backbone, respectively. FLSL consistently outperforms existing SSL methods across additional benchmarks, including UAV17 object detection on UAVDT, and video instance segmentation on DAVIS 2017.We conclude by presenting visualization and various ablation studies to better understand the success of FLSL. The source code is available at https://github.com/ISL-CV/FLSL.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Artificial Intelligence

2306.06203

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Add feedback

MatchXML: An Efficient Text-label Matching Framework for Extreme Multi-label Text Classification

Ye, Hui, Sunderraman, Rajshekhar, Ji, Shihao

arXiv.org Artificial IntelligenceAug-24-2023

The eXtreme Multi-label text Classification(XMC) refers to training a classifier that assigns a text sample with relevant labels from an extremely large-scale label set (e.g., millions of labels). We propose MatchXML, an efficient text-label matching framework for XMC. We observe that the label embeddings generated from the sparse Term Frequency-Inverse Document Frequency(TF-IDF) features have several limitations. We thus propose label2vec to effectively train the semantic dense label embeddings by the Skip-gram model. The dense label embeddings are then used to build a Hierarchical Label Tree by clustering. In fine-tuning the pre-trained encoder Transformer, we formulate the multi-label text classification as a text-label matching problem in a bipartite graph. We then extract the dense text representations from the fine-tuned Transformer. Besides the fine-tuned dense text embeddings, we also extract the static dense sentence embeddings from a pre-trained Sentence Transformer. Finally, a linear ranker is trained by utilizing the sparse TF-IDF features, the fine-tuned dense text representations and static dense sentence features. Experimental results demonstrate that MatchXML achieves state-of-the-art accuracy on five out of six datasets. As for the speed, MatchXML outperforms the competing methods on all the six datasets. Our source code is publicly available at https://github.com/huiyegit/MatchXML.

classification, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2308.13139

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

Add feedback

Accounting for Temporal Variability in Functional Magnetic Resonance Imaging Improves Prediction of Intelligence

Li, Yang, Ma, Xin, Sunderraman, Raj, Ji, Shihao, Kundu, Suprateek

arXiv.org Artificial IntelligenceDec-14-2022

Neuroimaging-based prediction methods for intelligence and cognitive abilities have seen a rapid development in literature. Among different neuroimaging modalities, prediction based on functional connectivity (FC) has shown great promise. Most literature has focused on prediction using static FC, but there are limited investigations on the merits of such analysis compared to prediction based on dynamic FC or region level functional magnetic resonance imaging (fMRI) times series that encode temporal variability. To account for the temporal dynamics in fMRI data, we propose a deep neural network involving bi-directional long short-term memory (bi-LSTM) approach that also incorporates feature selection mechanism. The proposed pipeline is implemented via an efficient GPU computation framework and applied to predict intelligence scores based on region level fMRI time series as well as dynamic FC. We compare the prediction performance for different intelligence measures based on static FC, dynamic FC, and region level time series acquired from the Adolescent Brain Cognitive Development (ABCD) study involving close to 7000 individuals. Our detailed analysis illustrates that static FC consistently has inferior prediction performance compared to region level time series or dynamic FC for unimodal rest and task fMRI experiments, and in almost all cases using a combination of task and rest features. In addition, the proposed bi-LSTM pipeline based on region level time series identifies several shared and differential important brain regions across task and rest fMRI experiments that drive intelligence prediction. A test-retest analysis of the selected features shows strong reliability across cross-validation folds. Given the large sample size from ABCD study, our results provide strong evidence that superior prediction of intelligence can be achieved by accounting for temporal variations in fMRI.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1002/hbm.26415

2211.07429

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback