Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting
Time series forecasting is an important problem across many domains, including prediction of solar plant energy output, electricity consumption, and traffic congestion. In this paper, we propose to tackle such forecasting problems with the Transformer. Although we were impressed by its performance in our preliminary study, we found two major weaknesses: (1) locality-agnosticism: the point-wise dot-product self-attention in the canonical Transformer architecture is insensitive to local context, which can make the model prone to anomalies in time series; (2) memory bottleneck: the space complexity of the canonical Transformer grows quadratically with the sequence length L, making it infeasible to model long time series directly. To solve these two issues, we first propose convolutional self-attention, which produces queries and keys with causal convolution so that local context is better incorporated into the attention mechanism. We then propose the LogSparse Transformer, with only O(L (log L)^2) memory cost, which improves forecasting accuracy for time series with fine granularity and strong long-term dependencies under a constrained memory budget. Our experiments on both synthetic data and real-world datasets show that it compares favorably to the state-of-the-art.
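The convolutional self-attention mechanism described above is straightforward to prototype. Below is a minimal single-head PyTorch sketch under our own assumptions (kernel size, module names), not the authors' reference code: queries and keys come from a causal 1-D convolution rather than a point-wise linear map, so each attention score reflects a local window of context.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ConvSelfAttention(nn.Module):
        """Single-head self-attention with causal-convolution queries and keys.
        A sketch of the idea in the abstract, not the reference implementation."""
        def __init__(self, d_model: int, kernel_size: int = 3):
            super().__init__()
            self.pad = kernel_size - 1  # left-pad so the convolution is causal
            self.q_conv = nn.Conv1d(d_model, d_model, kernel_size)
            self.k_conv = nn.Conv1d(d_model, d_model, kernel_size)
            self.v_proj = nn.Linear(d_model, d_model)

        def forward(self, x):  # x: (batch, seq_len, d_model)
            h = F.pad(x.transpose(1, 2), (self.pad, 0))   # (batch, d_model, seq_len)
            q = self.q_conv(h).transpose(1, 2)            # local-context queries
            k = self.k_conv(h).transpose(1, 2)            # local-context keys
            v = self.v_proj(x)
            scores = q @ k.transpose(1, 2) / x.size(-1) ** 0.5
            mask = torch.triu(torch.ones(x.size(1), x.size(1),
                                         dtype=torch.bool, device=x.device), 1)
            scores = scores.masked_fill(mask, float("-inf"))  # no attending ahead
            return F.softmax(scores, dim=-1) @ v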
CoreSPECT: Enhancing Clustering Algorithms via an Interplay of Density and Geometry
Mukherjee, Chandra Sekhar, Bae, Joonyoung, Zhang, Jiapeng
In this paper, we provide a novel perspective on the underlying structure of real-world data with ground-truth clusters by characterizing an abundantly observed yet often overlooked density-geometry correlation, which manifests itself as a multi-layered manifold structure. We leverage this correlation to design CoreSPECT (Core Space Projection based Enhancement of Clustering Techniques), a general framework that improves the performance of generic clustering algorithms. Our framework boosts clustering algorithms by applying them to strategically selected regions, then extending the resulting partial partition to a complete partition of the dataset using a novel neighborhood-graph-based multi-layer propagation procedure. We provide initial theoretical support for our framework under our model's assumptions, and then present large-scale experiments on 19 real-world datasets, including standard image datasets as well as genomics datasets. We observe two notable improvements. First, CoreSPECT improves the NMI of K-Means by 20% on average, making it competitive with (and in some cases surpassing) state-of-the-art manifold-based clustering algorithms while being orders of magnitude faster. Second, our framework boosts the NMI of HDBSCAN by more than 100% on average, making it competitive with the state-of-the-art in several cases without requiring the true number of clusters or hyper-parameter tuning. The overall ARI improvements are higher.
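As a rough illustration of the core-then-propagate idea, the following sketch clusters only the densest points and then spreads labels outward over a k-nearest-neighbor graph. It is a simplified stand-in under our own assumptions (a k-NN density proxy and majority-vote propagation), not the CoreSPECT implementation.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neighbors import NearestNeighbors

    def core_then_propagate(X, n_clusters, core_frac=0.3, k=10):
        """Cluster a dense 'core', then propagate labels layer by layer.
        A simplified sketch of the core-space idea, not CoreSPECT itself."""
        knn = NearestNeighbors(n_neighbors=k).fit(X)
        dist, idx = knn.kneighbors(X)
        density = 1.0 / (dist.mean(axis=1) + 1e-12)    # crude k-NN density proxy
        core = np.argsort(-density)[: int(core_frac * len(X))]

        labels = np.full(len(X), -1)
        labels[core] = KMeans(n_clusters, n_init=10).fit_predict(X[core])

        while (labels == -1).any():                    # multi-layer propagation
            frontier = np.where(labels == -1)[0]
            changed = False
            for i in frontier:
                neigh = labels[idx[i]]
                votes = neigh[neigh >= 0]
                if votes.size:                         # majority vote of labeled neighbors
                    labels[i] = np.bincount(votes).argmax()
                    changed = True
            if not changed:  # disconnected points: fall back to nearest core point
                tree = NearestNeighbors(n_neighbors=1).fit(X[core])
                _, j = tree.kneighbors(X[frontier])
                labels[frontier] = labels[core[j[:, 0]]]
        return labels

The two-stage assignment is what lets a simple algorithm such as K-Means operate only where, per the density-geometry correlation, clusters are expected to be well separated.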
Enhancing the Accuracy and Fairness of Human Decision Making
Societies often rely on human experts to take a wide variety of decisions affecting their members, from jail-or-release decisions taken by judges and stop-and-frisk decisions taken by police officers to accept-or-reject decisions taken by academics. In this context, each decision is taken by an expert who is typically chosen uniformly at random from a pool of experts. However, these decisions may be imperfect due to limited experience, implicit biases, or faulty probabilistic reasoning. Can we improve the accuracy and fairness of the overall decision making process by optimizing the assignment between experts and decisions? In this paper, we address the above problem from the perspective of sequential decision making and show that, for different fairness notions from the literature, it reduces to a sequence of (constrained) weighted bipartite matchings, which can be solved efficiently using algorithms with approximation guarantees. Moreover, these algorithms also benefit from posterior sampling to actively trade off exploitation---selecting expert assignments which lead to accurate and fair decisions---and exploration---selecting expert assignments to learn about the experts' preferences and biases. We demonstrate the effectiveness of our algorithms on both synthetic and real-world data and show that they can significantly improve both the accuracy and fairness of the decisions taken by pools of experts.
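The reduction to weighted bipartite matching is easy to make concrete. The toy sketch below is our illustration, not the authors' code: the utility matrix is assumed, and in the paper it would be derived from posterior samples over expert behavior. Experts are assigned to cases with the Hungarian algorithm via SciPy.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # utility[i, j]: estimated quality (accuracy/fairness score) of expert i
    # deciding case j. Toy random numbers stand in for learned estimates.
    rng = np.random.default_rng(0)
    utility = rng.uniform(size=(5, 5))   # 5 experts, 5 cases

    experts, cases = linear_sum_assignment(utility, maximize=True)
    for e, c in zip(experts, cases):
        print(f"expert {e} -> case {c} (utility {utility[e, c]:.2f})")
    print("total utility:", utility[experts, cases].sum())

Posterior sampling would replace the fixed utility matrix with a draw from the current posterior over each expert's accuracy and biases, trading off exploitation and exploration across successive rounds of matchings.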
Enhancing the development of Cherenkov Telescope Array control software with Large Language Models
Kostunin, Dmitriy, Jones, Elisa, Sotnikov, Vladimir, Sotnikov, Valery, Golovachev, Sergo, Strube, Alexandre
We develop AI agents based on instruction-finetuned large language models (LLMs) to assist in the engineering and operation of the Cherenkov Telescope Array Observatory (CTAO) Control and Data Acquisition Software (ACADA). These agents align with project-specific documentation and codebases, understand contextual information, interact with external APIs, and communicate with users in natural language.
LM-HT SNN: Enhancing the Performance of SNN to ANN Counterpart through Learnable Multi-hierarchical Threshold Model
Compared to traditional Artificial Neural Networks (ANNs), Spiking Neural Networks (SNNs) have garnered widespread academic interest for their intrinsic ability to transmit information in a more energy-efficient manner. The recently proposed multi-threshold model provides further possibilities for enhancing the learning capability of SNNs. In this paper, we rigorously analyze the relationship among the multi-threshold model, the vanilla spiking model, and quantized ANNs from a mathematical perspective, and then propose the novel LM-HT model, an equidistant multi-threshold model that can dynamically regulate the global input current and membrane potential leakage along the time dimension. The LM-HT model can also be transformed into a vanilla single-threshold model through reparameterization, enabling more flexible hardware deployment. In addition, we note that the LM-HT model can seamlessly integrate with the ANN-SNN conversion framework under special initialization.
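To make the equidistant multi-threshold mechanism concrete, here is a toy sketch based on our reading of the abstract, not the authors' LM-HT code: a neuron with several equally spaced thresholds emits a graded spike count equal to the number of thresholds its membrane potential crosses, which is exactly a clipped uniform quantizer, the link to quantized ANNs that the paper analyzes.

    import torch

    def multi_threshold_fire(v: torch.Tensor, theta: float = 1.0, levels: int = 4):
        """Equidistant multi-threshold firing: emit n spikes if the membrane
        potential v crosses n of the thresholds theta, 2*theta, ..., levels*theta.
        Equivalent to clipped uniform quantization. A sketch, not LM-HT itself."""
        spikes = torch.clamp(torch.floor(v / theta), 0, levels)
        v_reset = v - spikes * theta   # soft reset: subtract the emitted charge
        return spikes, v_reset

    v = torch.tensor([-0.3, 0.7, 1.9, 5.2])
    s, v_new = multi_threshold_fire(v)
    print(s)      # tensor([0., 0., 1., 4.])
    print(v_new)  # tensor([-0.3000, 0.7000, 0.9000, 1.2000])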
EnOF-SNN: Training Accurate Spiking Neural Networks via Enhancing the Output Feature
Spiking neural networks (SNNs) have gained increasing interest as an energy-efficient alternative to conventional artificial neural networks (ANNs). They exchange binary (0/1) spikes to process information, so most of the multiplications in the network can be replaced by additions. However, binary spike feature maps limit the expressiveness of an SNN and result in unsatisfactory performance compared with ANNs. It has been shown that a rich output feature representation (i.e., the feature vector before the classifier) is beneficial for training an accurate ANN classifier. We ask whether the same holds for SNNs, and how the feature representation of an SNN can be improved. To this end, we materialize this idea in two methods specially designed for SNNs. First, inspired by ANN-SNN methods showing that copying the weight parameters from a trained ANN, with light modification, to a homogeneous SNN can yield a well-performing SNN, we use the rich information in the weight parameters of the trained ANN counterpart to guide the feature representation learning of the SNN. In particular, we feed the SNN's and the ANN's feature representations from the same input into the ANN's classifier to produce the SNN's and the ANN's outputs, respectively, and then align the features with a KL-divergence loss, as in knowledge distillation methods; we call this the L_AF loss. It can be seen as a novel and effective knowledge distillation method specially designed for SNNs, drawing on both knowledge distillation and ANN-SNN methods.
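The feature-alignment loss is essentially distillation through the ANN's frozen classifier. Below is a minimal sketch of what we understand the L_AF loss to be; the function names and the temperature are our assumptions, not the paper's code.

    import torch
    import torch.nn.functional as F

    def alignment_loss(snn_feat, ann_feat, ann_classifier, tau: float = 2.0):
        """Pass both feature vectors through the frozen ANN classifier and align
        the resulting distributions with KL divergence (a sketch of L_AF)."""
        with torch.no_grad():                        # teacher side: no gradients
            t = F.softmax(ann_classifier(ann_feat) / tau, dim=-1)
        s = F.log_softmax(ann_classifier(snn_feat) / tau, dim=-1)
        return F.kl_div(s, t, reduction="batchmean") * tau * tau

    # usage (names hypothetical):
    # total_loss = ce_loss + lambda_af * alignment_loss(f_snn, f_ann, ann.classifier)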
Enhancing the Comprehensibility of Text Explanations via Unsupervised Concept Discovery
Sun, Yifan, Wang, Danding, Sheng, Qiang, Cao, Juan, Li, Jintao
Concept-based explainable approaches have emerged as a promising method in explainable AI because they can interpret models in a way that aligns with human reasoning. However, their adoption in the text domain remains limited. Most existing methods rely on predefined concept annotations and cannot discover unseen concepts, while other methods that extract concepts without supervision often produce explanations that are not intuitively comprehensible to humans, potentially diminishing user trust. These methods fall short of discovering comprehensible concepts automatically. To address this issue, we propose ECO-Concept, an intrinsically interpretable framework that discovers comprehensible concepts with no concept annotations. ECO-Concept first uses an object-centric architecture to extract semantic concepts automatically. The comprehensibility of the extracted concepts is then evaluated by large language models, and the evaluation result guides subsequent model fine-tuning to obtain more understandable explanations. Experiments show that our method achieves superior performance across diverse tasks. Further concept evaluations validate that the concepts learned by ECO-Concept surpass current counterparts in comprehensibility.
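A compressed, hypothetical sketch of the evaluate-then-guide step follows; llm_rate stands in for an LLM call returning a comprehensibility score in [0, 1] and is not an API from the paper.

    def comprehensibility_penalty(concept_examples, llm_rate):
        """Average penalty steering fine-tuning toward concepts the LLM judge
        finds easy to describe. Our illustration only, not ECO-Concept code.
        concept_examples: one list of top-activating text snippets per concept.
        llm_rate: placeholder LLM scorer, returns a float in [0, 1]."""
        scores = [llm_rate(examples) for examples in concept_examples]
        return sum(1.0 - s for s in scores) / len(scores)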
Enhancing tutoring systems by leveraging tailored promptings and domain knowledge with Large Language Models
Balavar, Mohsen, Yang, Wenli, Herbert, David, Yeom, Soonja
Recent advancements in artificial intelligence (AI) and machine learning have reignited interest in their impact on Computer-based Learning (CBL). AI-driven tools like ChatGPT and Intelligent Tutoring Systems (ITSs) have enhanced learning experiences through personalisation and flexibility. ITSs can adapt to individual learning needs and provide customised feedback based on a student's performance, cognitive state, and learning path. Despite these advances, challenges remain in accommodating diverse learning styles and delivering real-time, context-aware feedback. Our research aims to address these gaps by integrating skill-aligned feedback via Retrieval Augmented Generation (RAG) into prompt engineering for Large Language Models (LLMs), and by developing an application that enhances learning through personalised tutoring in a computer science programming context. A pilot study evaluated the proposed system using three quantitative metrics: readability score, response time, and feedback depth, across three programming tasks of varying complexity. The system successfully sorted simulated students into three skill-level categories and provided context-aware feedback. This targeted approach demonstrated better effectiveness and adaptability than generic methods.
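As a rough sketch of how skill-aligned RAG feedback can be wired into a prompt (our illustration with hypothetical helper names, not the study's actual system):

    def build_tutor_prompt(task, student_code, skill_level, retrieve):
        """Compose a skill-aware tutoring prompt. `retrieve` is any function
        returning course snippets relevant to the task (hypothetical here)."""
        snippets = retrieve(task, k=3)   # RAG: ground feedback in course material
        context = "\n".join(f"- {s}" for s in snippets)
        return (
            f"You are a programming tutor. Student level: {skill_level}.\n"
            f"Relevant course material:\n{context}\n\n"
            f"Task: {task}\nStudent code:\n{student_code}\n\n"
            f"Give feedback matched to a {skill_level} student: hints only, "
            f"no full solutions for beginners; deeper critique for advanced."
        )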
CuPID: Leveraging Masked Single-Lead ECG Modelling for Enhancing the Representations
Atienza, Adrian, Manimaran, Gouthamaan, Bardram, Jakob E., Puthusserypady, Sadasivan
Wearable sensing devices, such as Electrocardiogram (ECG) heart-rate monitors, will play a crucial role in the future of digital health. Continuous monitoring yields massive amounts of unlabeled data, incentivizing the development of unsupervised learning frameworks. While Masked Data Modelling (MDM) techniques have enjoyed wide use, their direct application to single-lead ECG data is suboptimal because the decoder struggles with irregular heartbeat intervals when no contextual information is provided. In this paper, we present Cueing the Predictor Increments the Detailing (CuPID), a novel MDM method tailored to single-lead ECGs. CuPID enhances existing MDM techniques by cueing spectrogram-derived context to the decoder, thus incentivizing the encoder to produce more detailed representations. This has a significant impact on the encoder's performance across a wide range of configurations, leading CuPID to outperform state-of-the-art methods on a variety of downstream tasks.
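Here is a compressed sketch of the cueing idea as we read it (shapes and module names are our assumptions, not CuPID's code): the decoder that reconstructs masked ECG patches is conditioned on spectrogram features of the same window, so the encoder no longer has to spend capacity on coarse rhythm information.

    import torch
    import torch.nn as nn

    class CuedDecoder(nn.Module):
        """Reconstruct masked ECG patches from encoder tokens plus a
        spectrogram 'cue'. A sketch of the conditioning idea, not CuPID."""
        def __init__(self, d_model=128, d_spec=64, patch_len=32):
            super().__init__()
            self.spec_proj = nn.Linear(d_spec, d_model)
            self.mix = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
                num_layers=2,
            )
            self.head = nn.Linear(d_model, patch_len)

        def forward(self, tokens, spec_feats):
            # tokens:     (B, N, d_model)  encoder output incl. mask tokens
            # spec_feats: (B, N, d_spec)   per-patch spectrogram descriptors
            cued = tokens + self.spec_proj(spec_feats)  # inject the cue
            return self.head(self.mix(cued))            # (B, N, patch_len)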
Enhancing controlled query evaluation through epistemic policies
In an era where data privacy is paramount, the challenge of sharing information without compromising sensitive details has become more relevant than ever. Here, we consider the framework known as Controlled Query Evaluation (CQE), an approach that safeguards confidentiality while still providing maximally informative query answers. We present an extension of this framework that enhances its expressivity through rich forms of data protection rules. We explore the practical importance of these rules and some of the technical underpinnings that make the system effective. We then study the computational properties of the framework when data are managed through ontologies specified in DL-Lite_R, a popular description logic designed for efficient reasoning in data-intensive applications.
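To illustrate the kind of protection rule involved, here is a toy epistemic policy of our own construction (not an example from the paper), written with the epistemic operator K ("known to hold"): it forbids the answers from ever revealing that the system knows a patient is treated in oncology.

    % Toy epistemic denial: query answers must never entail that the system
    % knows both facts about the same individual x.
    \forall x \, \bigl( \mathbf{K}\,\mathit{Patient}(x) \wedge
                        \mathbf{K}\,\mathit{TreatedFor}(x, \mathit{oncology})
                        \rightarrow \bot \bigr)

Compared with a plain denial over the data, the K operator lets the policy target what is disclosed by the certain answers rather than what merely holds in the ontology, which is the added expressivity the abstract refers to.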