AITopics | data segment


data segment


Trustworthy Prediction with Gaussian Process Knowledge Scores

Butler, Kurt, Feng, Guanchao, Chen, Tong, Djuric, Petar

arXiv.org Machine Learning, Jun-24-2025

Probabilistic models are often used to make predictions in regions of the data space where no observations are available, but it is not always clear whether such predictions are well-informed by previously seen data. In this paper, we propose a knowledge score for predictions from Gaussian process regression (GPR) models that quantifies the extent to which observing data have reduced our uncertainty about a prediction. The knowledge score is interpretable and naturally bounded between 0 and 1. We demonstrate in several experiments that the knowledge score can anticipate when predictions from a GPR model are accurate, and that this anticipation improves performance in tasks such as anomaly detection, extrapolation, and missing data imputation. Index Terms: anomaly detection, Gaussian processes, regression models, trustworthy machine learning, predictive distributions. The task of prediction is of fundamental importance in many domains.
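The abstract does not give the formula for the knowledge score. As a rough illustration only, the sketch below assumes a score defined as the fractional reduction in predictive variance, 1 - (posterior variance / prior variance), which is one quantity that is naturally bounded between 0 and 1. The RBF kernel, the noise level, and every function name here are assumptions made for this example, not the authors' definitions.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def variance_reduction_score(x_train, x_test, noise_var=1e-2,
                             length_scale=1.0, signal_var=1.0):
    """Hypothetical score: 1 - posterior_var / prior_var of the latent function.

    This is an assumed stand-in for the paper's knowledge score.
    """
    k_tt = rbf_kernel(x_train, x_train, length_scale, signal_var)
    k_st = rbf_kernel(x_test, x_train, length_scale, signal_var)
    gram = k_tt + noise_var * np.eye(len(x_train))
    # Solve (K + noise * I) A = k_*^T instead of forming an explicit inverse.
    solved = np.linalg.solve(gram, k_st.T)
    # Variance explained by the data: k_*^T (K + noise * I)^{-1} k_*
    explained = np.sum(k_st.T * solved, axis=0)
    posterior_var = signal_var - explained
    # Clip to guard against small numerical excursions outside [0, 1].
    return np.clip(1.0 - posterior_var / signal_var, 0.0, 1.0)

x_train = np.array([0.0, 0.5, 1.0])
x_test = np.array([0.5, 10.0])   # one point inside the data, one far away
scores = variance_reduction_score(x_train, x_test)
```

Under these assumptions the score is close to 1 at a test point surrounded by training data and close to 0 far from any observation, which matches the intuition described in the abstract.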

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

2506.1863

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
Europe > Italy (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.49)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)


SCFNet: A Transferable IIIC EEG Classification Network

Xu, Weijin

arXiv.org Artificial Intelligence, Dec-15-2024

Epilepsy and epileptiform discharges are common harmful brain activities, and electroencephalogram (EEG) signals are widely used to monitor the onset status of patients. However, due to the lack of unified EEG signal acquisition standards, there are many obstacles in practical applications, especially the difficulty of transferring and using models trained on different numbers of channels. To address this issue, we propose a neural network architecture with a single-channel feature extraction (Single Channel Feature) model and backend fusion (SCFNet). The feature extractor of the model is an RCNN network with single-channel input, which does not depend on other channels, thereby enabling easier migration to data with different numbers of channels. Experimental results show that on the IIIC-Seizure dataset, the accuracy of EEG-SCFNet improved by 4% compared to the baseline model and by 1.3% compared to the original RCNN neural network model. Even when only the classification head is fine-tuned, its performance remains comparable to the baseline. In addition, in cross-dataset transfer, EEG-SCFNet still maintains a certain level of performance even if the channel leads are different.
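The idea of a per-channel extractor followed by backend fusion can be sketched as follows. This is a toy illustration, not the SCFNet architecture: the RCNN extractor is replaced by three summary statistics, and fusion by averaging across channels is an assumption chosen only to show why the output size does not depend on the number of channels.

```python
import numpy as np

def channel_features(signal):
    """Stand-in for a single-channel extractor (assumed; not an RCNN).

    Returns mean, standard deviation, and mean absolute first difference.
    """
    return np.array([
        signal.mean(),
        signal.std(),
        np.abs(np.diff(signal)).mean(),
    ])

def fused_features(eeg):
    """Apply the same extractor to every channel, then average the results.

    eeg has shape (n_channels, n_samples). The fused vector has a fixed
    length regardless of n_channels.
    """
    per_channel = np.stack([channel_features(ch) for ch in eeg])
    return per_channel.mean(axis=0)

rng = np.random.default_rng(0)
feat_16 = fused_features(rng.standard_normal((16, 256)))  # 16-channel recording
feat_4 = fused_features(rng.standard_normal((4, 256)))    # 4-channel recording
```

Because each channel is processed independently, a classification head that consumes the fused vector sees the same input shape for both recordings.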

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2412.17835

Country:

Asia > India (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Therapeutic Area > Neurology > Epilepsy (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


A Scalable Approach to Covariate and Concept Drift Management via Adaptive Data Segmentation

Yarabolu, Vennela, Waghmare, Govind, Gupta, Sonia, Asthana, Siddhartha

arXiv.org Artificial Intelligence, Nov-23-2024

In many real-world applications, continuous machine learning (ML) systems are crucial but prone to data drift, a phenomenon where discrepancies between historical training data and future test data lead to significant performance degradation and operational inefficiencies. Traditional drift adaptation methods typically update models using ensemble techniques, often discarding drifted historical data, and focus primarily on either covariate drift or concept drift. These methods face issues such as high resource demands, inability to manage all types of drift effectively, and neglect of the valuable context that historical data can provide. We contend that explicitly incorporating drifted data into the model training process significantly enhances model accuracy and robustness. This paper introduces an advanced framework that integrates the strengths of data-centric approaches with adaptive management of both covariate and concept drift in a scalable and efficient manner. Our framework employs sophisticated data segmentation techniques to identify optimal data batches that accurately reflect test data patterns. These data batches are then utilized for training on test data, ensuring that the models remain relevant and accurate over time. By leveraging the advantages of both data segmentation and scalable drift management, our solution ensures robust model accuracy and operational efficiency in large-scale ML deployments. It also minimizes resource consumption and computational overhead by selecting and utilizing relevant data subsets, leading to significant cost savings. Experimental results on classification tasks with real-world and synthetic datasets show that our approach improves model accuracy while reducing operational costs and latency. This practical solution overcomes inefficiencies in current methods, providing a robust, adaptable, and scalable approach.

artificial intelligence, concept drift, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3703323.3703337

2411.15616

Country:

South America > Brazil > Maranhão (0.04)
Oceania > Australia > New South Wales (0.04)
North America > United States > Nebraska > Sarpy County > Bellevue (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)


Applying Fine-Tuned LLMs for Reducing Data Needs in Load Profile Analysis

Hu, Yi, Kim, Hyeonjin, Ye, Kai, Lu, Ning

arXiv.org Artificial Intelligence, Jun-2-2024

This paper presents a novel method for utilizing fine-tuned Large Language Models (LLMs) to minimize data requirements in load profile analysis, demonstrated through the restoration of missing data in power system load profiles. A two-stage fine-tuning strategy is proposed to adapt a pre-trained LLM, i.e., GPT-3.5, for missing data restoration tasks. Through empirical evaluation, we demonstrate the effectiveness of the fine-tuned model in accurately restoring missing data, achieving comparable performance to state-of-the-art specifically designed models such as BERT-PIN. Key findings include the importance of prompt engineering and the optimal utilization of fine-tuning samples, highlighting the efficiency of few-shot learning in transferring knowledge from general user cases to specific target users. Furthermore, the proposed approach demonstrates notable cost-effectiveness and time efficiency compared to training models from scratch, making it a practical solution for scenarios with limited data availability and computing resources. This research has significant potential for application to other power system load profile analysis tasks. Consequently, it advances the use of LLMs in power system analytics, offering promising implications for enhancing the resilience and efficiency of power distribution systems.

chatgpt, fine-tuning, load profile, (12 more...)

arXiv.org Artificial Intelligence

2406.02479

Country:

Asia (0.04)
North America > United States > Washington > Benton County > Richland (0.04)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Energy > Renewable (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


A novel multi-layer modular approach for real-time fuzzy-identification of gravitational-wave signals

Barone, Francesco Pio, Dell'Aquila, Daniele, Russo, Marco

arXiv.org Artificial Intelligence, Dec-16-2023

Advanced LIGO and Advanced Virgo ground-based interferometers are instruments capable of detecting gravitational wave signals by exploiting advanced laser interferometry techniques. The underlying data analysis task consists of identifying specific patterns in noisy time series, but it is made extremely complex by the incredibly small amplitude of the target signals. In this scenario, the development of effective gravitational wave detection algorithms is crucial. We propose a novel layered framework for real-time detection of gravitational waves inspired by speech processing techniques and, in the present implementation, based on a state-of-the-art machine learning approach involving a hybridization of genetic programming and neural networks. The key aspects of the newly proposed framework are the well-structured, layered approach and the low computational complexity. The paper describes the basic concepts of the framework and the derivation of the first three layers. Even if the layers are based on models derived using a machine learning approach, the proposed layered structure has a universal nature. Compared to more complex approaches, such as convolutional neural networks, which comprise a parameter set of several tens of MB and were tested exclusively for fixed-length data samples, our framework has lower accuracy (e.g., it identifies 45% of low signal-to-noise-ratio gravitational wave signals, against 65% for the state of the art, at a false alarm probability of $10^{-2}$), but has a much lower computational complexity and a higher degree of modularity. Furthermore, the exploitation of short-term features makes the results of the new framework virtually independent of the time position of gravitational wave signals, simplifying its future exploitation in real-time multi-layer pipelines for gravitational-wave detection with new generation interferometers.

artificial intelligence, evolutionary algorithm, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1088/2632-2153/ad1200

2206.06004

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)


Quilt: Robust Data Segment Selection against Concept Drifts

Kim, Minsu, Hwang, Seong-Hyeon, Whang, Steven Euijong

arXiv.org Artificial Intelligence, Dec-15-2023

Continuous machine learning pipelines are common in industrial settings where models are periodically trained on data streams. Unfortunately, concept drifts may occur in data streams where the joint distribution of the data X and label y, P(X, y), changes over time, possibly degrading model accuracy. Existing concept drift adaptation approaches mostly focus on updating the model to the new data, possibly using ensemble techniques of previous models, and tend to discard the drifted historical data. However, we contend that explicitly utilizing the drifted data together leads to much better model accuracy and propose Quilt, a data-centric framework for identifying and selecting data segments that maximize model accuracy. To address the potential downside of efficiency, Quilt extends existing data subset selection techniques, which can be used to reduce the training data without compromising model accuracy. These techniques cannot be used as is because they only assume virtual drifts where the posterior probabilities P(y|X) are assumed not to change. In contrast, a key challenge in our setup is to also discard undesirable data segments with concept drifts. Quilt thus discards drifted data segments and selects data segment subsets holistically for accurate and efficient model training. The two operations use gradient-based scores, which have little computation overhead. In our experiments, we show that Quilt outperforms state-of-the-art drift adaptation and data selection baselines on synthetic and real datasets.
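The distinction between a virtual drift (P(X) changes while P(y|X) stays fixed) and a drift in P(y|X) can be made concrete with a toy example. The sketch below is not Quilt and does not use gradient-based scores; the labeling rule, the fixed threshold model, and the sampling ranges are all assumptions invented for illustration.

```python
import numpy as np

def label(x, flipped=False):
    """Toy labeling rule: y = 1 when x > 0. 'flipped' simulates a change in P(y|X)."""
    y = (x > 0).astype(int)
    return 1 - y if flipped else y

def accuracy(x, y):
    """Accuracy of a fixed model that predicts 1 when x > 0."""
    return float(np.mean((x > 0).astype(int) == y))

rng = np.random.default_rng(0)
x_old = rng.uniform(-1.0, 1.0, 1000)      # original input distribution
x_shifted = rng.uniform(0.0, 2.0, 1000)   # inputs move, labeling rule unchanged

# Virtual drift: only P(X) changes, so the old model is still correct.
acc_virtual = accuracy(x_shifted, label(x_shifted))
# Drift in P(y|X): same inputs, but the labeling rule is inverted.
acc_real = accuracy(x_old, label(x_old, flipped=True))
```

In this toy setting the fixed model stays fully accurate under the virtual drift and fails completely once the labeling rule changes, which is why a segment with changed P(y|X) can hurt training if it is kept.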

concept drift, data segment, dataset, (15 more...)

arXiv.org Artificial Intelligence

2312.09691

Country:

Oceania > Australia > New South Wales (0.04)
North America > United States > Nebraska > Sarpy County > Bellevue (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


BERT-PIN: A BERT-based Framework for Recovering Missing Data Segments in Time-series Load Profiles

Hu, Yi, Ye, Kai, Kim, Hyeonjin, Lu, Ning

arXiv.org Artificial Intelligence, Oct-26-2023

Inspired by the success of the Transformer model in natural language processing and computer vision, this paper introduces BERT-PIN, a Bidirectional Encoder Representations from Transformers (BERT) powered Profile Inpainting Network. BERT-PIN recovers multiple missing data segments (MDSs) using load and temperature time-series profiles as inputs. To adopt a standard Transformer model structure for profile inpainting, we segment the load and temperature profiles into line segments, treating each segment as a word and the entire profile as a sentence. We incorporate a top candidates selection process in BERT-PIN, enabling it to produce a sequence of probability distributions, based on which users can generate multiple plausible imputed data sets, each reflecting different confidence levels. We develop and evaluate BERT-PIN using a real-world dataset for two applications: multiple MDSs recovery and demand response baseline estimation. Simulation results show that BERT-PIN outperforms the existing methods in accuracy while being capable of restoring multiple MDSs within a longer window. BERT-PIN, serving as a pre-trained model, can be fine-tuned for many downstream tasks, such as classification and super resolution.
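The "segment as word, profile as sentence" framing can be sketched with a few lines of array handling. The segment length, the use of NaN as a mask marker, and the function names are assumptions for this illustration; the abstract does not specify how BERT-PIN tokenizes or masks its inputs.

```python
import numpy as np

def segment_profile(profile, segment_len):
    """Split a 1-D profile into consecutive, non-overlapping segments ("words").

    Trailing samples that do not fill a whole segment are dropped.
    """
    n_segments = len(profile) // segment_len
    return profile[: n_segments * segment_len].reshape(n_segments, segment_len)

def mask_segments(segments, missing_idx, mask_value=np.nan):
    """Return a copy of the segment matrix with the given rows marked as missing."""
    masked = segments.astype(float).copy()
    masked[missing_idx] = mask_value
    return masked

# A made-up daily profile with 96 readings (e.g. one every 15 minutes).
profile = np.arange(96.0)
segments = segment_profile(profile, segment_len=4)   # 24 "words" of 4 readings
masked = mask_segments(segments, missing_idx=[3, 4])  # two missing data segments
```

A model trained for inpainting would then be asked to fill in the masked rows from the surrounding ones, in the same way a masked language model predicts hidden words from context.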

bert-pin, data segment, load profile, (14 more...)

arXiv.org Artificial Intelligence

2310.17742

Country:

North America > Canada > Quebec (0.04)
Europe > France (0.04)
Africa > Cameroon (0.04)
(5 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Power Industry (1.00)
Government > Regional Government > North America Government > United States Government (0.68)
Energy > Renewable (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)


Joint Microseismic Event Detection and Location with a Detection Transformer

Yang, Yuanyuan, Birnie, Claire, Alkhalifah, Tariq

arXiv.org Artificial Intelligence, Jul-16-2023

During reservoir stimulation, fluids are injected into a specific area underground. The high-pressure condition created by the fluid injection causes rocks to crack to release the built-up stress, resulting in small earthquakes called microseismic events. Detecting these events in seismic recordings and locating them back to their subsurface locations are important for understanding subsurface conditions such as fracture networks and fluid flow pathways. This knowledge is critical for applications like carbon storage, geothermal energy extraction, and oil/gas production. Traditional approaches for microseismic event detection and location often suffer from manual intervention and/or heavy computation, while current machine learning-assisted approaches typically address detection and location separately. These limitations prevent real-time microseismic monitoring, which is crucial for scientists and engineers to make instant, informed decisions, such as optimizing injection strategies. Here, we propose a machine learning-based procedure for simultaneously detecting and locating microseismic events within a single framework, using a conventional Convolutional Neural Network and an encoder-decoder Transformer. Tests on synthetically generated and field-collected passive seismic data illustrate the accuracy, efficiency, and potential of the proposed method, which could pave the way for real-time monitoring of microseismic events in the future.

artificial intelligence, machine learning, microseismic event, (19 more...)

arXiv.org Artificial Intelligence

2307.09207

Country:

North America > United States > Texas (0.28)
North America > United States > Oklahoma (0.14)
North America > United States > Illinois (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.82)

Industry:

Energy > Renewable > Geothermal (1.00)
Energy > Oil & Gas > Upstream (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


MSED: a multi-modal sleep event detection model for clinical sleep analysis

Olesen, Alexander Neergaard, Jennum, Poul, Mignot, Emmanuel, Sorensen, Helge B. D.

arXiv.org Machine Learning, Jan-7-2021

Study objective: Clinical sleep analysis requires manual analysis of sleep patterns for correct diagnosis of sleep disorders. Several studies show significant variability in scoring discrete sleep events. We wished to investigate whether an automatic method could be used for detection of arousals (Ar), leg movements (LM) and sleep disordered breathing (SDB) events, and whether the joint detection of these events performed better than having three separate models. Methods: We designed a single deep neural network architecture to jointly detect sleep events in a polysomnogram. We trained the model on 1653 recordings of individuals, and tested the optimized model on 1000 separate recordings. The performance of the model was quantified by F1, precision, and recall scores, and by correlating index values to clinical values using Pearson's correlation coefficient. Results: F1 scores for the optimized model were 0.70, 0.63, and 0.62 for Ar, LM, and SDB, respectively. The performance was higher when detecting events jointly compared to corresponding single-event models. Index values computed from detected events correlated well with manual annotations ($r^2$ = 0.73, $r^2$ = 0.77, $r^2$ = 0.78, respectively). Conclusion: Detecting arousals, leg movements and sleep disordered breathing events jointly is possible, and the computed index values correlate well with human annotations.
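For readers unfamiliar with the metrics named above, the sketch below shows the standard definitions of precision, recall, and F1 computed from event counts. The counts are invented for illustration and are not taken from the study; how detected events are matched to annotated events is also not specified in the abstract.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts: 70 correctly detected events, 30 false alarms, 30 misses.
precision, recall, f1 = precision_recall_f1(tp=70, fp=30, fn=30)
```

F1 is the harmonic mean of precision and recall, so it is high only when both false alarms and missed events are rare.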

arousal, event window, prediction, (15 more...)

arXiv.org Machine Learning

2101.0253

Country:

North America > United States > Illinois > DuPage County > Darien (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > District of Columbia > Washington (0.04)
(7 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Sleep (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Improving The Performance Of The K-means Algorithm

Nguyen, Tien-Dung

arXiv.org Machine Learning, May-10-2020

The Incremental K-means (IKM), an improved version of K-means (KM), was introduced to improve the clustering quality of KM significantly. However, IKM is slower than KM. My thesis proposes two algorithms to speed up IKM while approximately preserving the quality of its clustering result. The first algorithm, called Divisive K-means, improves the speed of IKM by speeding up its cluster-splitting process. In tests with UCI Machine Learning data sets, the new algorithm achieves the same empirical global optimum as IKM and has lower complexity, $O(k \log_{2}(k)\, n)$, than IKM, $O(k^{2} n)$. The second algorithm, called Parallel Two-Phase K-means (Par2PK-means), parallelizes IKM by employing the model of Two-Phase K-means. In tests with large data sets, this algorithm attains a good speedup ratio, close to linear speedup.
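To get a feel for the gap between the two complexity bounds quoted above, the sketch below evaluates the leading terms for a given number of clusters k and data points n. It ignores constant factors and lower-order terms, so the ratio is only indicative; the function names and the example values of k and n are assumptions.

```python
import math

def ikm_cost(k, n):
    """Leading term of the O(k^2 * n) bound quoted for IKM."""
    return k ** 2 * n

def divisive_cost(k, n):
    """Leading term of the O(k * log2(k) * n) bound quoted for Divisive K-means."""
    return k * math.log2(k) * n

def cost_ratio(k, n):
    """Ratio of the two leading terms; simplifies to k / log2(k) for k >= 2."""
    return ikm_cost(k, n) / divisive_cost(k, n)

ratio = cost_ratio(k=64, n=10_000)   # 64 / log2(64) = 64 / 6
```

The ratio does not depend on n and grows with k, so the advantage suggested by the bounds is largest when many clusters are requested.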

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2005.04689

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Orange County > Irvine (0.14)
(4 more...)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.72)
