Panoramic Interests: Stylistic-Content Aware Personalized Headline Generation
Lian, Junhong, Ao, Xiang, Liu, Xinyu, Liu, Yang, He, Qing
Personalized news headline generation aims to provide users with attention-grabbing headlines that are tailored to their preferences. Prevailing methods focus on user-oriented content preferences, but most of them overlook the fact that diverse stylistic preferences are integral to users' panoramic interests, leading to suboptimal personalization. In view of this, we propose a novel Stylistic-Content Aware Personalized Headline Generation (SCAPE) framework. SCAPE extracts both content and stylistic features from headlines with the aid of large language model (LLM) collaboration. It further adaptively integrates users' long- and short-term interests through a contrastive learning-based hierarchical fusion network. By incorporating the panoramic interests into the headline generator, SCAPE reflects users' stylistic-content preferences during the generation process. Extensive experiments on the real-world dataset PENS demonstrate the superiority of SCAPE over baselines.
Data Mining in Transportation Networks with Graph Neural Networks: A Review and Outlook
Xue, Jiawei, Tan, Ruichen, Ma, Jianzhu, Ukkusuri, Satish V.
Data mining in transportation networks (DMTNs) refers to using diverse types of spatio-temporal data for various transportation tasks, including pattern analysis, traffic prediction, and traffic controls. Graph neural networks (GNNs) are essential in many DMTN problems due to their capability to represent spatial correlations between entities. Between 2016 and 2024, the notable applications of GNNs in DMTNs have extended to multiple fields such as traffic prediction and operation. However, existing reviews have primarily focused on traffic prediction tasks. To fill this gap, this study provides a timely and insightful summary of GNNs in DMTNs, highlighting new progress in prediction and operation from academic and industry perspectives since 2023. First, we present and analyze various DMTN problems, followed by classical and recent GNN models. Second, we delve into key works in three areas: (1) traffic prediction, (2) traffic operation, and (3) industry involvement, such as Google Maps, Amap, and Baidu Maps. Along these directions, we discuss new research opportunities based on the significance of transportation problems and data availability. Finally, we compile resources such as data, code, and other learning materials to foster interdisciplinary communication. This review, driven by recent trends in GNNs in DMTN studies since 2023, could democratize abundant datasets and efficient GNN methods for various transportation problems including prediction and operation.
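As a rough illustration of how a GNN can represent spatial correlations between entities, the sketch below performs one round of neighborhood averaging over a tiny road graph. The graph, the sensor names, the speed values, and the function `mean_aggregate` are all hypothetical; an actual GNN layer would additionally apply learned weights and a nonlinearity, which are omitted here.

```python
def mean_aggregate(adjacency, features):
    """One round of neighborhood averaging, counting each node as its own neighbor.

    adjacency: dict mapping a node to the list of its neighbors
    features:  dict mapping a node to its feature vector (list of floats)
    """
    updated = {}
    for node, neighbors in adjacency.items():
        group = [node] + list(neighbors)
        dim = len(features[node])
        # Average each feature dimension over the node and its neighbors.
        updated[node] = [
            sum(features[n][i] for n in group) / len(group) for i in range(dim)
        ]
    return updated


# Hypothetical road graph: three sensors in a line, A - B - C.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
# Hypothetical average speeds (km/h) observed at each sensor.
speeds = {"A": [60.0], "B": [30.0], "C": [90.0]}

smoothed = mean_aggregate(adjacency, speeds)
print(smoothed)
```

After one round, each sensor's value mixes in information from adjacent road segments, which is the basic mechanism that stacked GNN layers build on.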
Consistent support recovery for high-dimensional diffusions
Marushkevych, Dmytro, Pina, Francisco, Podolskij, Mark
Over the past decades, statistical inference for stochastic processes has garnered increasing attention, driven by their extensive applications across diverse scientific fields. In particular, stochastic differential equations (SDEs) have proven fundamental in disciplines such as biology [38], epidemiology [6], physics [37], economics [5], neurology [25], and mathematical finance [29]. This wide applicability has spurred significant advancements in both parametric and non-parametric inference methods under various frameworks. Simultaneously, the growing importance of high-dimensional data has introduced new complexities to statistical modeling. Researchers have explored scenarios where the number of model parameters far exceeds the available observations or where most parameters exhibit specific asymptotic behavior, departing from the classical approach that assumes only the number of observations grows asymptotically. While substantial progress has been made in understanding high-dimensional frameworks for simpler models [10, 28, 42, 24], the study of high-dimensional stochastic processes remains relatively scarce. Existing work on high-dimensional diffusions has predominantly focused on particle interaction systems within mean field theory, with notable parametric and non-parametric results explored in [2, 3, 7, 8, 9, 11, 15, 23, 24, 32, 41], among others. However, most studies have restricted parameter spaces to finite dimensions, leaving the case of infinite-dimensional parameter spaces underexplored. Expanding both theoretical and methodological knowledge at the intersection of high-dimensional frameworks and stochastic processes has thus become a topic of significant scientific interest.
FuzzyLight: A Robust Two-Stage Fuzzy Approach for Traffic Signal Control Works in Real Cities
Li, Mingyuan, Wang, Jiahao, Du, Bo, Shen, Jun, Wu, Qiang
Effective traffic signal control (TSC) is crucial in mitigating urban congestion and reducing emissions. Recently, reinforcement learning (RL) has been the research trend for TSC. However, existing RL algorithms face several real-world challenges that hinder their practical deployment in TSC: (1) Sensor accuracy deteriorates with increased sensor detection range, and data transmission is prone to noise, potentially resulting in unsafe TSC decisions. (2) During the training of online RL, interactions with the environment could be unstable, potentially leading to inappropriate traffic signal phase (TSP) selection and traffic congestion. (3) Most current TSC algorithms focus only on TSP decisions, overlooking the critical aspect of phase duration, affecting safety and efficiency. To overcome these challenges, we propose a robust two-stage fuzzy approach called FuzzyLight, which integrates compressed sensing and RL for TSC deployment. FuzzyLight offers several key contributions: (1) It employs fuzzy logic and compressed sensing to address sensor noise and enhances the efficiency of TSP decisions. (2) It maintains stable performance during training and combines fuzzy logic with RL to generate precise phases. (3) It works in real cities across 22 intersections and demonstrates superior performance in both real-world and simulated environments. Experimental results indicate that FuzzyLight enhances traffic efficiency by 48% compared to expert-designed timings in the real world. Furthermore, it achieves state-of-the-art (SOTA) performance in simulated environments using six real-world datasets with transmission noise. The code and deployment video are available at the URL given in the paper's footnote.
FUNU: Boosting Machine Unlearning Efficiency by Filtering Unnecessary Unlearning
Li, Zitong, Ye, Qingqing, Hu, Haibo
Machine unlearning is an emerging field that selectively removes specific data samples from a trained model. This capability is crucial for addressing privacy concerns, complying with data protection regulations, and correcting errors or biases introduced by certain data. Unlike traditional machine learning, where models are typically static once trained, machine unlearning facilitates dynamic updates that enable the model to "forget" information without requiring complete retraining from scratch. There are various machine unlearning methods, some of which are more time-efficient when data removal requests are fewer. To decrease the execution time of such machine unlearning methods, we aim to reduce the size of data removal requests based on the fundamental assumption that the removal of certain data would not result in a distinguishable retrained model. We first propose the concept of unnecessary unlearning, which indicates that the model would not alter noticeably after removing some data points. Subsequently, we review existing solutions that can be used to solve our problem. We highlight their limitations in adaptability to different unlearning scenarios and their reliance on manually selected parameters. We consequently put forward FUNU, a method to identify data points that lead to unnecessary unlearning. FUNU circumvents the limitations of existing solutions. The idea is to discover data points within the removal requests that have similar neighbors in the remaining dataset. We utilize a reference model to set parameters for finding neighbors, inspired by the area of model memorization. We provide a theoretical analysis of the privacy guarantee offered by FUNU and conduct extensive experiments to validate its efficacy.
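The core idea of finding removal requests that have similar neighbors in the remaining dataset might be sketched as follows. This is only an illustrative sketch, not the FUNU algorithm itself: the function `filter_removal_requests`, the Euclidean distance, the same-label requirement, and the hand-picked `threshold` are all assumptions made for the example, whereas the abstract states that FUNU sets its neighbor-finding parameters with a reference model.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filter_removal_requests(requests, remaining, threshold):
    """Split removal requests into (necessary, unnecessary).

    A request is treated as "unnecessary" when the remaining data contains
    a point with the same label within `threshold` of it (assumed criterion).
    """
    necessary, unnecessary = [], []
    for features, label in requests:
        has_neighbor = any(
            other_label == label and euclidean(features, other) <= threshold
            for other, other_label in remaining
        )
        (unnecessary if has_neighbor else necessary).append((features, label))
    return necessary, unnecessary


# Hypothetical toy data: (feature vector, label) pairs.
remaining = [((0.0, 0.0), "a"), ((1.0, 1.0), "a"), ((5.0, 5.0), "b")]
requests = [((0.1, 0.0), "a"), ((9.0, 9.0), "b"), ((1.0, 1.1), "b")]

necessary, unnecessary = filter_removal_requests(requests, remaining, threshold=0.5)
print("still to unlearn:", necessary)
print("filtered out:", unnecessary)
```

Only the requests in `necessary` would then be passed to the downstream unlearning method, which is where the time savings would come from under this sketch.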
Long-Term Interest Clock: Fine-Grained Time Perception in Streaming Recommendation System
Zhu, Yongchun, Jiang, Guanyu, Chen, Jingwu, Zhang, Feng, Yang, Xiao, Liu, Zuotao
User interests manifest a dynamic pattern within the course of a day, e.g., a user usually favors soft music at 8 a.m. but may turn to ambient music at 10 p.m. To model dynamic interests in a day, hour embedding is widely used in traditional daily-trained industrial recommendation systems. However, its discreteness can cause periodical online patterns and instability in recent streaming recommendation systems. Recently, Interest Clock has achieved remarkable performance in streaming recommendation systems. Nevertheless, it models users' dynamic interests in a coarse-grained manner, merely encoding users' discrete interests of 24 hours from short-term behaviors. In this paper, we propose a fine-grained method for perceiving time information for streaming recommendation systems, named Long-term Interest Clock (LIC). The key idea of LIC is adaptively calculating current user interests by taking into consideration the relevance of long-term behaviors around current time (e.g., 8 a.m.) given a candidate item. LIC consists of two modules: (1) Clock-GSU retrieves a sub-sequence by searching through long-term behaviors, using query information from a candidate item and current time, (2) Clock-ESU employs a time-gap-aware attention mechanism to aggregate sub-sequence with the candidate item. With Clock-GSU and Clock-ESU, LIC is capable of capturing users' dynamic fine-grained interests from long-term behaviors. We conduct online A/B tests, obtaining +0.122% improvements on user active days. Besides, the extended offline experiments show improvements as well. Long-term Interest Clock has been integrated into Douyin Music App's recommendation system.
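The two-module pipeline described above can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the deployed LIC: the retrieval rule (same category, then closest on the 24-hour clock), the linear `gap_penalty` on attention scores, the dictionary layout of behaviors, and the toy embeddings are all hypothetical choices made for illustration.

```python
import math


def hour_gap(h1, h2):
    """Circular distance between two hours on a 24-hour clock."""
    d = abs(h1 - h2) % 24
    return min(d, 24 - d)


def clock_gsu(behaviors, candidate, current_hour, top_k=2):
    """Retrieve a sub-sequence: behaviors in the candidate's category,
    ordered by closeness of their hour to the current hour (assumed rule)."""
    same = [b for b in behaviors if b["category"] == candidate["category"]]
    same.sort(key=lambda b: hour_gap(b["hour"], current_hour))
    return same[:top_k]


def clock_esu(sub_sequence, candidate, current_hour, gap_penalty=0.5):
    """Aggregate the sub-sequence with attention whose scores are reduced
    in proportion to the time gap (assumed form of time-gap awareness)."""
    dim = len(candidate["embedding"])
    if not sub_sequence:
        return [0.0] * dim, []
    scores = [
        sum(x * y for x, y in zip(b["embedding"], candidate["embedding"]))
        - gap_penalty * hour_gap(b["hour"], current_hour)
        for b in sub_sequence
    ]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    interest = [
        sum(w * b["embedding"][i] for w, b in zip(weights, sub_sequence))
        for i in range(dim)
    ]
    return interest, weights


# Hypothetical long-term behavior log and candidate item.
behaviors = [
    {"hour": 22, "category": "soft", "embedding": [0.0, 1.0]},
    {"hour": 9, "category": "soft", "embedding": [0.8, 0.2]},
    {"hour": 8, "category": "rock", "embedding": [0.5, 0.5]},
    {"hour": 8, "category": "soft", "embedding": [1.0, 0.0]},
]
candidate = {"category": "soft", "embedding": [1.0, 0.0]}

sub = clock_gsu(behaviors, candidate, current_hour=8)
interest, weights = clock_esu(sub, candidate, current_hour=8)
print([b["hour"] for b in sub], weights, interest)
```

In this toy run, behaviors near 8 a.m. dominate the aggregated interest vector, while the 10 p.m. behavior is not retrieved at all.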
Open Problems in Mechanistic Interpretability
Sharkey, Lee, Chughtai, Bilal, Batson, Joshua, Lindsey, Jack, Wu, Jeff, Bushnaq, Lucius, Goldowsky-Dill, Nicholas, Heimersheim, Stefan, Ortega, Alejandro, Bloom, Joseph, Biderman, Stella, Garriga-Alonso, Adria, Conmy, Arthur, Nanda, Neel, Rumbelow, Jessica, Wattenberg, Martin, Schoots, Nandi, Miller, Joseph, Michaud, Eric J., Casper, Stephen, Tegmark, Max, Saunders, William, Bau, David, Todd, Eric, Geiger, Atticus, Geva, Mor, Hoogland, Jesse, Murfet, Daniel, McGrath, Tom
Mechanistic interpretability aims to understand the computational mechanisms underlying neural networks' capabilities in order to accomplish concrete scientific and engineering goals. Progress in this field thus promises to provide greater assurance over AI system behavior and shed light on exciting scientific questions about the nature of intelligence. Despite recent progress toward these goals, there are many open problems in the field that require solutions before many scientific and practical benefits can be realized: Our methods require both conceptual and practical improvements to reveal deeper insights; we must figure out how best to apply our methods in pursuit of specific goals; and the field must grapple with socio-technical challenges that influence and are influenced by our work. This forward-facing review discusses the current frontier of mechanistic interpretability and the open problems that the field may benefit from prioritizing. This review collects the perspectives of its various authors and represents a synthesis of their views by Apollo Research on behalf of Schmidt Sciences. The perspectives presented here do not necessarily reflect the views of any individual author or the institutions with which they are affiliated.
DialUp! Modeling the Language Continuum by Adapting Models to Dialects and Dialects to Models
Bafna, Niyati, Chang, Emily, Robinson, Nathaniel R., Mortensen, David R., Murray, Kenton, Yarowsky, David, Sirin, Hale
Most of the world's languages and dialects are low-resource, and lack support in mainstream machine translation (MT) models. However, many of them have a closely-related high-resource language (HRL) neighbor, and differ in linguistically regular ways from it. This underscores the importance of model robustness to dialectal variation and cross-lingual generalization to the HRL dialect continuum. We present DialUp, consisting of a training-time technique for adapting a pretrained model to dialectal data (M->D), and an inference-time intervention adapting dialectal data to the model expertise (D->M). M->D induces model robustness to potentially unseen and unknown dialects by exposure to synthetic data exemplifying linguistic mechanisms of dialectal variation, whereas D->M treats dialectal divergence for known target dialects. These methods show considerable performance gains for several dialects from four language families, and modest gains for two other language families. We also conduct feature and error analyses, which show that language varieties with low baseline MT performance are more likely to benefit from these approaches.
An FPGA-Based Neuro-Fuzzy Sensor for Personalized Driving Assistance
Mata-Carballeira, Óscar, Gutiérrez-Zaballa, Jon, del Campo, Inés, Martínez, Victoria
Depending on their sophistication level, sensors range from simple sensors that directly measure single physical parameters (e.g., ambient light sensors and temperature sensors) to complex intelligent sensors, which determine parameters of the surrounding environment through wide-spectrum signals (e.g., radio frequency/radar and light/video); besides measuring, the latter perform data processing and can carry out actuations. Intelligent sensors make use of underlying data of a different nature, in which complex and nonlinear behaviors are codified; data-mining techniques used jointly with machine learning (ML) algorithms have shown adequate performance for modeling this hidden information. As intelligent sensors often rely on complex sensors and sensor fusion techniques, the data processing power they need can only be provided by high-performance computational platforms such as microprocessors, graphics-processing units (GPUs), or field-programmable gate arrays (FPGAs). In particular, FPGA-based implementations stand out due to the extremely high operational frequencies and low power consumption they can achieve, even for complex, multilayered algorithms [1]. In the context of the automotive field, intelligent sensors are key components of current assistance systems.
Can Molecular Evolution Mechanism Enhance Molecular Representation?
Li, Kun, Hu, Longtao, Cai, Xiantao, Wu, Jia, Hu, Wenbin
Molecular evolution is the process of simulating the natural evolution of molecules in chemical space to explore potential molecular structures and properties. The relationships between similar molecules are often described through transformations such as adding, deleting, and modifying atoms and chemical bonds, reflecting specific evolutionary paths. Existing molecular representation methods mainly focus on mining data, such as atomic-level structures and chemical bonds directly from the molecules, often overlooking their evolutionary history. Consequently, we aim to explore the possibility of enhancing molecular representations by simulating the evolutionary process. We extract and analyze the changes in the evolutionary pathway and explore combining it with existing molecular representations. Therefore, this paper proposes the molecular evolutionary network (MEvoN) for molecular representations. First, we construct the MEvoN using molecules with a small number of atoms and generate evolutionary paths utilizing similarity calculations. Then, by modeling the atomic-level changes, MEvoN reveals their impact on molecular properties. Experimental results show that the MEvoN-based molecular property prediction method significantly improves the performance of traditional end-to-end algorithms on several molecular datasets. The code is available at https://anonymous.4open.science/r/MEvoN-7416/.