Collaborating Authors

 Huang, Xiaocheng


Exploring applications of topological data analysis in stock index movement prediction

arXiv.org Artificial Intelligence

Topological Data Analysis (TDA) has recently gained significant attention in the field of financial prediction. However, the choice of point cloud construction methods, topological feature representations, and classification models has a substantial impact on prediction results. This paper addresses the classification problem of stock index movement. First, we construct point clouds for stock indices using three different methods. Next, we apply TDA to extract topological structures from the point clouds. Four distinct topological features are computed to represent the patterns in the data, and 15 combinations of these features are enumerated and input into six different machine learning models. We evaluate the predictive performance of various TDA configurations by conducting index movement classification tasks on datasets such as CSI, DAX, HSI, and FTSE, providing insights into the efficiency of different TDA setups.
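
The count of 15 feature combinations matches the number of non-empty subsets of four features (2^4 - 1 = 15). A minimal sketch of that enumeration follows. The feature names are placeholders, since the abstract does not name the four features, and the pairing of every combination with every model is an assumption made here for illustration.

```python
from itertools import combinations

# Placeholder names: the abstract does not say which four topological
# features are used, so these identifiers are purely illustrative.
FEATURES = ["feature_a", "feature_b", "feature_c", "feature_d"]

# The abstract states six machine learning models but does not list them.
N_MODELS = 6


def nonempty_subsets(items):
    """Return every non-empty subset of items, smallest subsets first."""
    return [
        subset
        for size in range(1, len(items) + 1)
        for subset in combinations(items, size)
    ]


feature_sets = nonempty_subsets(FEATURES)
print(len(feature_sets))  # 15

# Assumption: each feature combination is tried with each model.
print(len(feature_sets) * N_MODELS)  # 90
```

Under that assumption, each point cloud construction method would yield 90 feature-model configurations to compare.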


Enabling Public Access to Non-Open Access Biomedical Literature via Idea-Expression Dichotomy and Fact Extraction

AAAI Conferences

The general public shows great potential for utilizing scientific research. For example, a singer discovered her ectopic pregnancy by looking up clinical case reports. However, exorbitant paywalls impede the public’s access to scientific literature. Our case study on a social network demonstrates a growing need for non-open access publications, especially for biomedical literature. The challenge is that non-open access papers are protected by copyright licenses that bar free distribution. In this paper, we propose a technical framework that leverages the doctrine of "idea-expression dichotomy" to bring ideas across paywalls. Idea-expression dichotomy prevents copyright holders from monopolizing ideas, theories, facts, and concepts. Therefore, facts may pass through paywalls unencumbered by copyright license restrictions. Existing fact extraction methods (such as information extraction) require either large training sets or domain knowledge, which is intractable for the diverse biomedical scope spanning from clinical findings to genomics. We therefore develop a rule-based system to represent and extract facts. Social network users and academics validated the effectiveness of our approach. Seven of nine users rated the information about a paper conveyed by its extracted facts as above average (≥6/10). Only 7% of the extracted facts were rated misleading.