Overview
Time Series Language Model for Descriptive Caption Generation
Trabelsi, Mohamed, Boyd, Aidan, Cao, Jin, Uzunalioglu, Huseyin
The automatic generation of representative natural language descriptions for observable patterns in time series data enhances interpretability, simplifies analysis and increases cross-domain utility of temporal data. While pre-trained foundation models have made considerable progress in natural language processing (NLP) and computer vision (CV), their application to time series analysis has been hindered by data scarcity. Although several large language model (LLM)-based methods have been proposed for time series forecasting, time series captioning is under-explored in the context of LLMs. In this paper, we introduce TSLM, a novel time series language model designed specifically for time series captioning. TSLM operates as an encoder-decoder model, leveraging both text prompts and time series data representations to capture subtle temporal patterns across multiple phases and generate precise textual descriptions of time series inputs. TSLM addresses the data scarcity problem in time series captioning in two steps: it first generates synthetic data through in-context prompting, and it then denoises the generated data via a novel cross-modal dense retrieval scoring applied to time series-caption pairs. Experimental findings on various time series captioning datasets demonstrate that TSLM outperforms existing state-of-the-art approaches from multiple data modalities by a significant margin.
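The denoising step described in this abstract can be pictured as scoring each synthetic time series-caption pair and discarding low-scoring pairs. The sketch below is only an illustration under stated assumptions: the abstract does not give the scoring function, so cosine similarity between already-computed embeddings stands in for it, and the embeddings, the threshold, and the names `cosine` and `filter_pairs` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def filter_pairs(pairs, threshold=0.5):
    """Return the captions whose embedding scores at least `threshold` against the series embedding.

    `pairs` holds (series_embedding, caption_embedding, caption) triples.
    The threshold is an assumed hyperparameter, not a value from the paper.
    """
    return [
        caption
        for series_emb, caption_emb, caption in pairs
        if cosine(series_emb, caption_emb) >= threshold
    ]

# Hypothetical 2-d embeddings standing in for encoder outputs.
pairs = [
    ([1.0, 0.0], [0.9, 0.1], "steady upward trend"),
    ([1.0, 0.0], [-0.2, 1.0], "sharp seasonal dip"),
]
kept = filter_pairs(pairs, threshold=0.5)
print(kept)
```

In this toy input the first caption is well aligned with its series embedding and survives, while the second is nearly orthogonal to it and is dropped.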
BERT4MIMO: A Foundation Model using BERT Architecture for Massive MIMO Channel State Information Prediction
Catak, Ferhat Ozgur, Kuzlu, Murat, Cali, Umit
Massive MIMO (Multiple-Input Multiple-Output) is an advanced wireless communication technology that uses a large number of antennas to improve the overall performance of the communication system in terms of capacity, spectral efficiency, and energy efficiency. The performance of MIMO systems is highly dependent on the quality of channel state information (CSI). Predicting CSI is, therefore, essential for improving communication system performance, particularly in MIMO systems, since it represents key characteristics of a wireless channel, including propagation, fading, scattering, and path loss. This study proposes a foundation model inspired by BERT, called BERT4MIMO, which is specifically designed to process high-dimensional CSI data from massive MIMO systems. BERT4MIMO offers superior performance in reconstructing CSI under varying mobility scenarios and channel conditions through deep learning and attention mechanisms. The experimental results demonstrate the effectiveness of BERT4MIMO in a variety of wireless environments.
Quantifying A Firm's AI Engagement: Constructing Objective, Data-Driven, AI Stock Indices Using 10-K Filings
Following an analysis of existing AI-related exchange-traded funds (ETFs), we find that the selection criteria for determining which stocks qualify as AI-related are often opaque and rely on vague phrases and subjective judgments. This paper proposes a new, objective, data-driven approach using natural language processing (NLP) techniques to classify AI stocks by analyzing annual 10-K filings from 3,395 NASDAQ-listed firms between 2011 and 2023. This analysis quantifies each company's engagement with AI through binary indicators and weighted AI scores based on the frequency and context of AI-related terms. Using these metrics, we construct four AI stock indices: the Equally Weighted AI Index (AII), the Size-Weighted AI Index (SAII), and two Time-Discounted AI Indices (TAII05 and TAII5X), each offering a different perspective on AI investment. We validate our methodology through an event study on the launch of OpenAI's ChatGPT, demonstrating that companies with higher AI engagement saw significantly greater positive abnormal returns, with analyses supporting the predictive power of our AI measures. Our indices perform on par with or surpass 14 existing AI-themed ETFs and the Nasdaq Composite Index in risk-return profiles, market responsiveness, and overall performance, achieving higher average daily returns and risk-adjusted metrics without increased volatility. These results suggest our NLP-based approach offers a reliable, market-responsive, and cost-effective alternative to existing AI-related ETF products. Our innovative methodology can also guide investors, asset managers, and policymakers in using corporate data to construct other thematic portfolios, contributing to a more transparent, data-driven, and competitive approach.
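To make the idea of a term-frequency AI score and of equal versus size weighting concrete, here is a minimal sketch. It is not the paper's method: the abstract does not state the term list, the scoring formula, or the index construction, so the term list, the toy filings, the market capitalizations, the returns, and the helper names `ai_term_score` and `index_return` are all hypothetical.

```python
def ai_term_score(text, terms):
    """Count case-insensitive occurrences of each AI-related term in a filing's text."""
    lowered = text.lower()
    return sum(lowered.count(term) for term in terms)

def index_return(returns, weights):
    """Weighted average return; weights are normalized to sum to one."""
    total = sum(weights.values())
    return sum(returns[name] * weights[name] / total for name in returns)

# Assumed term list and toy filings (not taken from the paper).
terms = ["artificial intelligence", "machine learning", "deep learning"]
firms = {
    "AAA": {"text": "We deploy machine learning and artificial intelligence. "
                    "Machine learning drives growth.", "cap": 300.0, "ret": 0.02},
    "BBB": {"text": "We sell furniture.", "cap": 500.0, "ret": 0.01},
    "CCC": {"text": "Our deep learning platform serves clients.", "cap": 100.0, "ret": 0.04},
}

scores = {name: ai_term_score(f["text"], terms) for name, f in firms.items()}
# Binary indicator: a firm counts as AI-related if any term appears at all.
ai_firms = [name for name, score in scores.items() if score > 0]

returns = {name: firms[name]["ret"] for name in ai_firms}
equal_weighted = index_return(returns, {name: 1.0 for name in ai_firms})
size_weighted = index_return(returns, {name: firms[name]["cap"] for name in ai_firms})
print(scores, ai_firms, equal_weighted, size_weighted)
```

With these toy numbers the equally weighted return is the plain mean of the two AI firms' returns, while the size-weighted return is pulled toward the larger firm.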
Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning
Humanoid robots must master numerous tasks with sparse rewards, posing a challenge for reinforcement learning (RL). We propose a method combining RL and automated planning to address this. Our approach uses short goal-conditioned policies (GCPs) organized hierarchically, with Monte Carlo Tree Search (MCTS) planning using high-level actions (HLAs). Instead of primitive actions, the planning process generates HLAs. A single plan-tree, maintained during the agent's lifetime, holds knowledge about goal achievement. This hierarchy enhances sample efficiency and speeds up reasoning by reusing HLAs and anticipating future actions. Our Hierarchical Goal-Conditioned Policy Planning (HGCPP) framework uniquely integrates GCPs, MCTS, and hierarchical RL, potentially improving exploration and planning in complex tasks.
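For readers unfamiliar with how MCTS chooses among candidate actions, the sketch below shows the standard UCB1 selection rule applied to high-level actions. This is an assumption for illustration: the abstract does not say which selection rule HGCPP uses, and the action names, statistics, and the helpers `ucb1` and `select_hla` are hypothetical.

```python
import math

def ucb1(total_value, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: mean value plus an exploration bonus; unvisited nodes score infinity."""
    if visits == 0:
        return math.inf
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_hla(children, c=math.sqrt(2)):
    """Pick the child high-level action with the highest UCB1 score.

    `children` maps an action name to {"value": summed return, "visits": count}.
    """
    parent_visits = sum(stats["visits"] for stats in children.values())
    return max(
        children,
        key=lambda name: ucb1(children[name]["value"], children[name]["visits"], parent_visits, c),
    )

# Hypothetical high-level actions with made-up statistics.
children = {
    "reach_door": {"value": 3.0, "visits": 5},
    "pick_up_cup": {"value": 1.0, "visits": 1},
    "climb_stairs": {"value": 0.0, "visits": 0},
}
print(select_hla(children))
```

An unvisited action is always tried first; once every action has been visited, the exploration constant `c` trades off mean value against how rarely an action has been tried.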
Enhancing Large Vision Model in Street Scene Semantic Understanding through Leveraging Posterior Optimization Trajectory
Kou, Wei-Bin, Lin, Qingfeng, Tang, Ming, Wang, Shuai, Ye, Rongguang, Zhu, Guangxu, Wu, Yik-Chung
To improve the generalization of the autonomous driving (AD) perception model, vehicles need to update the model over time based on the continuously collected data. As time progresses, the amount of data fitted by the AD model expands, which helps to improve the AD model generalization substantially. However, such ever-expanding data is a double-edged sword for the AD model. Specifically, as the fitted data volume grows to exceed the AD model's fitting capacity, the AD model is prone to under-fitting. To address this issue, we propose to use pretrained Large Vision Models (LVMs) as the backbone, coupled with a downstream perception head, to understand AD semantic information. This design can not only surmount the aforementioned under-fitting problem due to LVMs' powerful fitting capabilities, but also enhance the perception generalization thanks to LVMs' vast and diverse training data. On the other hand, to mitigate vehicles' computational burden of training the perception head while running the LVM backbone, we introduce a Posterior Optimization Trajectory (POT)-Guided optimization scheme (POTGui) to accelerate the convergence. Concretely, we propose a POT Generator (POTGen) to generate posterior (future) optimization direction in advance to guide the current optimization iteration, through which the model can generally converge within 10 epochs. Extensive experiments demonstrate that the proposed method improves the performance by over 66.48% and converges more than 6 times faster, compared to the existing state-of-the-art approach.
BARTPredict: Empowering IoT Security with LLM-Driven Cyber Threat Prediction
Diaf, Alaeddine, Korba, Abdelaziz Amara, Karabadji, Nour Elislem, Ghamri-Doudane, Yacine
The integration of Internet of Things (IoT) technology in various domains has led to operational advancements, but it has also introduced new vulnerabilities to cybersecurity threats, as evidenced by recent widespread cyberattacks on IoT devices. Intrusion detection systems are often reactive, triggered by specific patterns or anomalies observed within the network. To address this challenge, this work proposes a proactive approach to anticipate and preemptively mitigate malicious activities, aiming to prevent potential damage before it occurs. This paper proposes an innovative intrusion prediction framework empowered by Pre-trained Large Language Models (LLMs). The framework incorporates two LLMs: a fine-tuned Bidirectional and AutoRegressive Transformers (BART) model for predicting network traffic and a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model for evaluating the predicted traffic. By harnessing the bidirectional capabilities of BART, the framework then identifies malicious packets among these predictions. Evaluated on the CICIoT2023 IoT attack dataset, our framework shows a notable improvement in predictive performance, attaining 98% overall accuracy and providing a powerful response to the cybersecurity challenges that confront IoT networks.
Large Language Model Based Multi-Agent System Augmented Complex Event Processing Pipeline for Internet of Multimedia Things
Zeeshan, Talha, Kumar, Abhishek, Pirttikangas, Susanna, Tarkoma, Sasu
The rapid advancement of artificial intelligence (AI) technologies has revolutionized the way we process and analyze data, particularly in the field of complex event processing (CEP), such as video query analysis. Traditional CEP systems often struggle with the dynamic demands of modern applications, such as real-time or near-real-time video analytics that require the integration of diverse data sources (for example, thousands of surveillance cameras deployed in a city), leading to limitations in their performance and applicability. Modern CEP pipelines are domain-specific and often struggle to adapt to dynamic changes in the environment in a timely manner. State-of-the-art applications (such as live video streaming on TikTok, YouTube, etc.) generate an increasing volume of diverse, complex data that needs to be handled in the appropriate manner depending on the use case. Large Language Models (LLMs), also known as foundation models, inherently possess the ability to handle and analyze dynamic forms of data and therefore provide the necessary foundation upon which a dynamic CEP pipeline can be created that supports a diverse range of domains.
Hype-Adjusted Probability Measure for NLP Stock Return Forecasting
This manuscript introduces the Hype-Adjusted Probability Measure developed in the context of a new Natural Language Processing (NLP) approach for stock return and volatility forecasting. A novel sentiment score equation is presented to capture component and memory effects and assign dynamic parameters, enhancing the impact of intraday news data on forecasting next-period volatility for selected U.S. semiconductor tickers. This approach integrates machine learning techniques to analyze and improve the predictive value of news. Building on the research of Geman et al. [6], this work improves forecast accuracy by addressing news bias, memory, and weight, and incorporating shifts in sentiment direction. Finally, we propose the Hype-Adjusted Probability Measure, proving its existence and uniqueness, and discuss its theoretical applications in finance for NLP-based stock return forecasting, outlining future research pathways inspired by its concepts.
Recommender systems and reinforcement learning for human-building interaction and context-aware support: A text mining-driven review of scientific literature
Zhang, Wenhao, Quintana, Matias, Miller, Clayton
The indoor environment significantly impacts human health and well-being; enhancing health and reducing energy consumption in these settings is a central research focus. With the advancement of Information and Communication Technology (ICT), recommendation systems and reinforcement learning (RL) have emerged as promising approaches to induce behavioral changes to improve the indoor environment and energy efficiency of buildings. This study aims to employ text mining and Natural Language Processing (NLP) techniques to thoroughly examine the connections among these approaches in the context of human-building interaction and occupant context-aware support. The study analyzed 27,595 articles from the ScienceDirect database, revealing extensive use of recommendation systems and RL for space optimization, location recommendations, and personalized control suggestions. Furthermore, this review underscores the vast potential for expanding recommender systems and RL applications in buildings and indoor environments. Fields ripe for innovation include predictive maintenance, building-related product recommendation, and optimization of environments tailored for specific needs, such as sleep and productivity enhancements based on user feedback. The study also notes the limitations of the method in capturing subtle academic nuances. Future improvements could involve integrating and fine-tuning pre-trained language models to better interpret complex texts.
Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry
Zimmermann, Yoel, Bazgir, Adib, Afzal, Zartashia, Agbere, Fariha, Ai, Qianxiang, Alampara, Nawaf, Al-Feghali, Alexander, Ansari, Mehrad, Antypov, Dmytro, Aswad, Amro, Bai, Jiaru, Baibakova, Viktoriia, Biswajeet, Devi Dutta, Bitzek, Erik, Bocarsly, Joshua D., Borisova, Anna, Bran, Andres M, Brinson, L. Catherine, Calderon, Marcel Moran, Canalicchio, Alessandro, Chen, Victor, Chiang, Yuan, Circi, Defne, Charmes, Benjamin, Chaudhary, Vikrant, Chen, Zizhang, Chiu, Min-Hsueh, Clymo, Judith, Dabhadkar, Kedar, Daelman, Nathan, Datar, Archit, de Jong, Wibe A., Evans, Matthew L., Fard, Maryam Ghazizade, Fisicaro, Giuseppe, Gangan, Abhijeet Sadashiv, George, Janine, Gonzalez, Jose D. Cojal, Götte, Michael, Gupta, Ankur K., Harb, Hassan, Hong, Pengyu, Ibrahim, Abdelrahman, Ilyas, Ahmed, Imran, Alishba, Ishimwe, Kevin, Issa, Ramsey, Jablonka, Kevin Maik, Jones, Colin, Josephson, Tyler R., Juhasz, Greg, Kapoor, Sarthak, Kang, Rongda, Khalighinejad, Ghazal, Khan, Sartaaj, Klawohn, Sascha, Kuman, Suneel, Ladines, Alvin Noe, Leang, Sarom, Lederbauer, Magdalena, Liao, Sheng-Lun, Liu, Hao, Liu, Xuefeng, Lo, Stanley, Madireddy, Sandeep, Maharana, Piyush Ranjan, Maheshwari, Shagun, Mahjoubi, Soroush, Márquez, José A., Mills, Rob, Mohanty, Trupti, Mohr, Bernadette, Moosavi, Seyed Mohamad, Moßhammer, Alexander, Naghdi, Amirhossein D., Naik, Aakash, Narykov, Oleksandr, Näsström, Hampus, Nguyen, Xuan Vu, Ni, Xinyi, O'Connor, Dana, Olayiwola, Teslim, Ottomano, Federico, Ozhan, Aleyna Beste, Pagel, Sebastian, Parida, Chiku, Park, Jaehee, Patel, Vraj, Patyukova, Elena, Petersen, Martin Hoffmann, Pinto, Luis, Pizarro, José M., Plessers, Dieter, Pradhan, Tapashree, Pratiush, Utkarsh, Puli, Charishma, Qin, Andrew, Rajabi, Mahyar, Ricci, Francesco, Risch, Elliot, Ríos-García, Martiño, Roy, Aritra, Rug, Tehseen, Sayeed, Hasan M, Scheidgen, Markus, Schilling-Wilhelmi, Mara, Schloz, Marcel, Schöppach, Fabian, Schumann, Julia, Schwaller, Philippe, Schwarting, Marcus, Sharlin, Samiha, Shen, Kevin, Shi, Jiale, Si, Pradip, D'Souza, Jennifer, Sparks, Taylor, Sudhakar, Suraj, Talirz, Leopold, Tang, Dandan, Taran, Olga, Terboven, Carla, Tropin, Mark, Tsymbal, Anastasiia, Ueltzen, Katharina, Unzueta, Pablo Andres, Vasan, Archit, Vinchurkar, Tirtha, Vo, Trung, Vogel, Gabriel, Völker, Christoph, Weinreich, Jan, Yang, Faradawn, Zaki, Mohd, Zhang, Chi, Zhang, Sylvester, Zhang, Weijie, Zhu, Ruijie, Zhu, Shang, Janssen, Jan, Li, Calvin, Foster, Ian, Blaiszik, Ben
Here, we present the outcomes from the second Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry, which engaged participants across global hybrid locations, resulting in 34 team submissions. The submissions spanned seven key application areas and demonstrated the diverse utility of LLMs for applications in (1) molecular and material property prediction; (2) molecular and material design; (3) automation and novel interfaces; (4) scientific communication and education; (5) research data management and automation; (6) hypothesis generation and evaluation; and (7) knowledge extraction and reasoning from scientific literature. Each team submission is presented in a summary table with links to the code and as brief papers in the appendix. Beyond team results, we discuss the hackathon event and its hybrid format, which included physical hubs in Toronto, Montreal, San Francisco, Berlin, Lausanne, and Tokyo, alongside a global online hub to enable local and virtual collaboration. Overall, the event highlighted significant improvements in LLM capabilities since the previous year's hackathon, suggesting continued expansion of LLMs for applications in materials science and chemistry research. These outcomes demonstrate the dual utility of LLMs as both multipurpose models for diverse machine learning tasks and platforms for rapidly prototyping custom applications in scientific research.