AITopics | Ao, Xiang

Collaborating Authors

Ao, Xiang

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Panoramic Interests: Stylistic-Content Aware Personalized Headline Generation

Lian, Junhong, Ao, Xiang, Liu, Xinyu, Liu, Yang, He, Qing

arXiv.org Artificial IntelligenceJan-27-2025

Personalized news headline generation aims to provide users with attention-grabbing headlines that are tailored to their preferences. Prevailing methods focus on user-oriented content preferences, but most of them overlook the fact that diverse stylistic preferences are integral to users' panoramic interests, leading to suboptimal personalization. In view of this, we propose a novel Stylistic-Content Aware Personalized Headline Generation (SCAPE) framework. SCAPE extracts both content and stylistic features from headlines with the aid of large language model (LLM) collaboration. It further adaptively integrates users' long- and short-term interests through a contrastive learning-based hierarchical fusion network. By incorporating the panoramic interests into the headline generator, SCAPE reflects users' stylistic-content preferences during the generation process. Extensive experiments on the real-world dataset PENS demonstrate the superiority of SCAPE over baselines.

headline generation, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3701716.3715539

2501.119

Country:

North America > United States (0.28)
Asia > China (0.17)

Genre: Research Report (0.50)

Industry: Government > Voting & Elections (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Fact-Preserved Personalized News Headline Generation

Yang, Zhao, Lian, Junhong, Ao, Xiang

arXiv.org Artificial IntelligenceJan-20-2025

Personalized news headline generation, aiming at generating user-specific headlines based on readers' preferences, burgeons a recent flourishing research direction. Existing studies generally inject a user interest embedding into an encoderdecoder headline generator to make the output personalized, while the factual consistency of headlines is inadequate to be verified. In this paper, we propose a framework Fact-Preserved Personalized News Headline Generation (short for FPG), to prompt a tradeoff between personalization and consistency. In FPG, the similarity between the candidate news to be exposed and the historical clicked news is used to give different levels of attention to key facts in the candidate news, and the similarity scores help to learn a fact-aware global user embedding. Besides, an additional training procedure based on contrastive learning is devised to further enhance the factual consistency of generated headlines. Extensive experiments conducted on a real-world benchmark PENS validate the superiority of FPG, especially on the tradeoff between personalization and factual consistency.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICDM58522.2023.00197

2501.11828

Country: Asia > China (0.15)

Genre: Research Report (0.82)

Industry:

Media (0.68)
Leisure & Entertainment > Sports > Golf (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Controlling Large Language Models Through Concept Activation Vectors

Zhang, Hanyu, Wang, Xiting, Li, Chengao, Ao, Xiang, He, Qing

arXiv.org Artificial IntelligenceJan-10-2025

As large language models (LLMs) are widely deployed across various domains, the ability to control their generated outputs has become more critical. This control involves aligning LLMs outputs with human values and ethical principles or customizing LLMs on specific topics or styles for individual users. Existing controlled generation methods either require significant computational resources and extensive trial-and-error or provide coarse-grained control. In this paper, we propose Generation with Concept Activation Vector (GCAV), a lightweight model control framework that ensures accurate control without requiring resource-extensive fine-tuning. Specifically, GCAV first trains a concept activation vector for specified concepts to be controlled, such as toxicity. During inference, GCAV steers the concept vector in LLMs, for example, by removing the toxicity concept vector from the activation layers. Control experiments from different perspectives, including toxicity reduction, sentiment control, linguistic style, and topic control, demonstrate that our framework achieves state-of-the-art performance with granular control, allowing for fine-grained adjustments of both the steering layers and the steering magnitudes for individual samples.

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2501.05764

Country:

Asia > China (0.15)
North America > United States > Louisiana (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors

Shi, Hao, Luo, Cuicui, Song, Weili, Zhang, Xinting, Ao, Xiang

arXiv.org Artificial IntelligenceJun-26-2024

The variability and low signal-to-noise ratio in financial data, combined with the necessity for interpretability, make the alpha factor mining workflow a crucial component of quantitative investment. Transitioning from early manual extraction to genetic programming, the most advanced approach in this domain currently employs reinforcement learning to mine a set of combination factors with fixed weights. However, the performance of resultant alpha factors exhibits inconsistency, and the inflexibility of fixed factor weights proves insufficient in adapting to the dynamic nature of financial markets. To address this issue, this paper proposes a two-stage formulaic alpha generating framework AlphaForge, for alpha factor mining and factor combination. This framework employs a generative-predictive neural network to generate factors, leveraging the robust spatial exploration capabilities inherent in deep learning while concurrently preserving diversity. The combination model within the framework incorporates the temporal performance of factors for selection and dynamically adjusts the weights assigned to each component alpha factor. Experiments conducted on real-world datasets demonstrate that our proposed model outperforms contemporary benchmarks in formulaic alpha factor mining. Furthermore, our model exhibits a notable enhancement in portfolio returns within the realm of quantitative investment.

alpha factor, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2406.18394

Country:

Asia > China (0.30)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

LOGIN: A Large Language Model Consulted Graph Neural Network Training Framework

Qiao, Yiran, Ao, Xiang, Liu, Yang, Xu, Jiarong, Sun, Xiaoqian, He, Qing

arXiv.org Artificial IntelligenceJun-6-2024

Recent prevailing works on graph machine learning typically follow a similar methodology that involves designing advanced variants of graph neural networks (GNNs) to maintain the superior performance of GNNs on different graphs. In this paper, we aim to streamline the GNN design process and leverage the advantages of Large Language Models (LLMs) to improve the performance of GNNs on downstream tasks. We formulate a new paradigm, coined "LLMs-as-Consultants," which integrates LLMs with GNNs in an interactive manner. A framework named LOGIN (LLM Consulted GNN training) is instantiated, empowering the interactive utilization of LLMs within the GNN training process. First, we attentively craft concise prompts for spotted nodes, carrying comprehensive semantic and topological information, and serving as input to LLMs. Second, we refine GNNs by devising a complementary coping mechanism that utilizes the responses from LLMs, depending on their correctness. We empirically evaluate the effectiveness of LOGIN on node classification tasks across both homophilic and heterophilic graphs. The results illustrate that even basic GNN architectures, when employed within the proposed LLMs-as-Consultants paradigm, can achieve comparable performance to advanced GNNs with intricate designs. Our codes are available at https://github.com/QiaoYRan/LOGIN.

large language model, machine learning, node, (18 more...)

arXiv.org Artificial Intelligence

2405.13902

Country:

North America > United States (0.47)
Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.93)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

EFSA: Towards Event-Level Financial Sentiment Analysis

Chen, Tianyu, Zhang, Yiming, Yu, Guoxin, Zhang, Dapeng, Zeng, Li, He, Qing, Ao, Xiang

arXiv.org Artificial IntelligenceApr-8-2024

In this paper, we extend financial sentiment analysis~(FSA) to event-level since events usually serve as the subject of the sentiment in financial text. Though extracting events from the financial text may be conducive to accurate sentiment predictions, it has specialized challenges due to the lengthy and discontinuity of events in a financial text. To this end, we reconceptualize the event extraction as a classification task by designing a categorization comprising coarse-grained and fine-grained event categories. Under this setting, we formulate the \textbf{E}vent-Level \textbf{F}inancial \textbf{S}entiment \textbf{A}nalysis~(\textbf{EFSA} for short) task that outputs quintuples consisting of (company, industry, coarse-grained event, fine-grained event, sentiment) from financial text. A large-scale Chinese dataset containing $12,160$ news articles and $13,725$ quintuples is publicized as a brand new testbed for our task. A four-hop Chain-of-Thought LLM-based approach is devised for this task. Systematically investigations are conducted on our dataset, and the empirical results demonstrate the benchmarking scores of existing methods and our proposed method can reach the current state-of-the-art. Our dataset and framework implementation are available at https://anonymous.4open.science/r/EFSA-645E

large language model, machine learning, sentiment, (20 more...)

arXiv.org Artificial Intelligence

2404.08681

Country: Asia > China (0.46)

Genre: Research Report (1.00)

Industry:

Banking & Finance > Trading (1.00)
Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.87)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Defending Against Backdoor Attacks in Natural Language Generation

Sun, Xiaofei, Li, Xiaoya, Meng, Yuxian, Ao, Xiang, Lyu, Lingjuan, Li, Jiwei, Zhang, Tianwei

arXiv.org Artificial IntelligenceOct-9-2023

The frustratingly fragile nature of neural network models make current natural language generation (NLG) systems prone to backdoor attacks and generate malicious sequences that could be sexist or offensive. Unfortunately, little effort has been invested to how backdoor attacks can affect current NLG models and how to defend against these attacks. In this work, by giving a formal definition of backdoor attack and defense, we investigate this problem on two important NLG tasks, machine translation and dialog generation. Tailored to the inherent nature of NLG models (e.g., producing a sequence of coherent words given contexts), we design defending strategies against attacks. We find that testing the backward probability of generating sources given targets yields effective defense performance against all different types of attacks, and is able to handle the {\it one-to-many} issue in many NLG tasks such as dialog generation. We hope that this work can raise the awareness of backdoor risks concealed in deep NLG systems and inspire more future work (both attack and defense) towards this direction.

arxiv preprint arxiv, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2106.0181

Country:

North America > United States (0.28)
Europe (0.28)
Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning

Yu, Shuo, Xue, Hongyan, Ao, Xiang, Pan, Feiyang, He, Jia, Tu, Dandan, He, Qing

arXiv.org Artificial IntelligenceMay-25-2023

In the field of quantitative trading, it is common practice to transform raw historical stock data into indicative signals for the market trend. Such signals are called alpha factors. Alphas in formula forms are more interpretable and thus favored by practitioners concerned with risk. In practice, a set of formulaic alphas is often used together for better modeling precision, so we need to find synergistic formulaic alpha sets that work well together. However, most traditional alpha generators mine alphas one by one separately, overlooking the fact that the alphas would be combined later. In this paper, we propose a new alpha-mining framework that prioritizes mining a synergistic set of alphas, i.e., it directly uses the performance of the downstream combination model to optimize the alpha generator. Our framework also leverages the strong exploratory capabilities of reinforcement learning~(RL) to better explore the vast search space of formulaic alphas. The contribution to the combination models' performance is assigned to be the return used in the RL process, driving the alpha generator to find better alphas that improve upon the current set. Experimental evaluations on real-world stock market data demonstrate both the effectiveness and the efficiency of our framework for stock trend forecasting. The investment simulation results show that our framework is able to achieve higher returns compared to previous approaches.

data mining, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3580305.3599831

2306.12964

Country:

Europe (0.67)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Reliable Representations Make A Stronger Defender: Unsupervised Structure Refinement for Robust GNN

Li, Kuan, Liu, Yang, Ao, Xiang, Chi, Jianfeng, Feng, Jinghua, Yang, Hao, He, Qing

arXiv.org Artificial IntelligenceApr-21-2023

Benefiting from the message passing mechanism, Graph Neural Networks (GNNs) have been successful on flourish tasks over graph data. However, recent studies have shown that attackers can catastrophically degrade the performance of GNNs by maliciously modifying the graph structure. A straightforward solution to remedy this issue is to model the edge weights by learning a metric function between pairwise representations of two end nodes, which attempts to assign low weights to adversarial edges. The existing methods use either raw features or representations learned by supervised GNNs to model the edge weights. However, both strategies are faced with some immediate problems: raw features cannot represent various properties of nodes (e.g., structure information), and representations learned by supervised GNN may suffer from the poor performance of the classifier on the poisoned graph. We need representations that carry both feature information and as mush correct structure information as possible and are insensitive to structural perturbations. To this end, we propose an unsupervised pipeline, named STABLE, to optimize the graph structure. Finally, we input the well-refined graph into a downstream classifier. For this part, we design an advanced GCN that significantly enhances the robustness of vanilla GCN without increasing the time complexity. Extensive experiments on four real-world graph benchmarks demonstrate that STABLE outperforms the state-of-the-art methods and successfully defends against various attacks.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3534678.3539484

2207.00012

Country:

North America > United States (0.30)
Asia > China (0.29)

Genre: Research Report > Promising Solution (0.66)

Industry: Information Technology > Security & Privacy (0.94)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Security & Privacy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)

Add feedback

Layer-wise Model Pruning based on Mutual Information

Fan, Chun, Li, Jiwei, Ao, Xiang, Wu, Fei, Meng, Yuxian, Sun, Xiaofei

arXiv.org Artificial IntelligenceAug-28-2021

The proposed pruning strategy offers merits over weight-based pruning techniques: (1) it avoids irregular memory access since representations and matrices can be squeezed into their smaller but dense counterparts, leading to greater speedup; (2) in a manner of top-down pruning, the proposed method operates from a more global perspective based on training signals in the top layer, and prunes each layer by propagating the effect of global signals through layers, leading to better performances at the same sparsity level. Extensive experiments show that at the same sparsity level, the proposed strategy offers both greater speedup and higher performances than weight-based pruning methods (e.g., magnitude pruning, movement pruning).

artificial intelligence, neural network, pruning, (14 more...)

arXiv.org Artificial Intelligence

2108.12594

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback