Optimal Text-Based Time-Series Indices

May-16-2024–arXiv.org Artificial Intelligence

This integration is typically done by (i) selecting, (ii) transforming, and (iii) aggregating textual content into a time-series representation (see Ardia et al., 2019; Algaba et al., 2020, for a general overview of these steps). While many studies have focused on steps (ii) and (iii), that is, transforming and aggregating textual data into a quantitative measure such as sentiment (see, e.g., Loughran and McDonald, 2014; Jegadeesh and Wu, 2013; Manela and Moreira, 2017), the essential selection step (i), which usually relies on subjective ad hoc rules, has received little attention. We aim to fill this gap by proposing an approach to construct text-based time-series indices optimally. Specifically, our algorithm determines which set of texts, among a large corpus, leads to a text-based index that is optimal for a specific objective: typically, an index that maximizes the contemporaneous relation or the predictive performance with respect to a target variable, such as inflation. Our methodology relies on binary selection matrices that, applied to the vocabulary of tokens, select the relevant texts in the corpus.
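To make the selection idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a toy corpus with made-up sentiment scores, represents the binary selection as a 0/1 tuple over a six-token vocabulary, keeps any text that contains at least one selected token, and averages scores per period to form the index. The names `CORPUS`, `VOCAB`, `select_texts`, `build_index`, `correlation`, and `best_selection` are hypothetical, and the exhaustive search is only a stand-in for whatever optimizer the paper actually uses.

```python
from collections import defaultdict
from itertools import product
from statistics import mean

# Toy corpus: (period, tokens, sentiment score). All values are made up.
CORPUS = [
    (0, {"inflation", "prices"}, -0.5),
    (0, {"football", "match"}, 0.9),
    (1, {"prices", "wages"}, -0.2),
    (1, {"weather"}, 0.1),
    (2, {"inflation", "wages"}, 0.4),
    (2, {"football"}, -0.8),
]
VOCAB = ["inflation", "prices", "wages", "football", "match", "weather"]


def select_texts(selection, corpus=CORPUS, vocab=VOCAB):
    """Keep texts containing at least one token switched on in `selection`."""
    active = {tok for tok, bit in zip(vocab, selection) if bit}
    return [doc for doc in corpus if doc[1] & active]


def build_index(selection, n_periods=3):
    """Average the scores of the selected texts within each period."""
    by_period = defaultdict(list)
    for period, _tokens, score in select_texts(selection):
        by_period[period].append(score)
    # Periods with no selected text get 0.0 (an arbitrary choice for this sketch).
    return [mean(by_period[t]) if by_period[t] else 0.0 for t in range(n_periods)]


def correlation(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def best_selection(target):
    """Exhaustive search over all binary selections (toy vocabularies only)."""
    return max(
        product([0, 1], repeat=len(VOCAB)),
        key=lambda s: correlation(build_index(s), target),
    )


if __name__ == "__main__":
    target = [-0.5, -0.2, 0.4]  # hypothetical target series, e.g. inflation surprises
    best = best_selection(target)
    print(best, build_index(best), correlation(build_index(best), target))
```

Exhaustive search scales as 2^V in the vocabulary size V, so it is only usable for illustration; a realistic vocabulary would need a heuristic or combinatorial optimizer over the same objective.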

Country:
- North America
  - United States > Michigan (0.04)
  - Canada > Quebec
    - Montreal (0.04)
    - Estrie Region > Sherbrooke (0.04)

Genre:
- Research Report (1.00)

Industry:
- Government (1.00)
- Banking & Finance > Economy (1.00)
- Media > News (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Evolutionary Systems (0.47)
  - Representation & Reasoning > Search (0.46)
