Goto

Collaborating Authors

 rfc


Integrated Pipeline for Coronary Angiography With Automated Lesion Profiling, Virtual Stenting, and 100-Vessel FFR Validation

Kopanitsa, Georgy, Metsker, Oleg, Yakovlev, Alexey

arXiv.org Artificial Intelligence

Coronary angiography is the main tool for assessing coronary artery disease, but visual grading of stenosis is variable and only moderately related to ischaemia. Wire based fractional flow reserve (FFR) improves lesion selection but is not used systematically. Angiography derived indices such as quantitative flow ratio (QFR) offer wire free physiology, yet many tools are workflow intensive and separate from automated anatomy analysis and virtual PCI planning. We developed AngioAI-QFR, an end to end angiography only pipeline combining deep learning stenosis detection, lumen segmentation, centreline and diameter extraction, per millimetre Relative Flow Capacity profiling, and virtual stenting with automatic recomputation of angiography derived QFR. The system was evaluated in 100 consecutive vessels with invasive FFR as reference. Primary endpoints were agreement with FFR (correlation, mean absolute error) and diagnostic performance for FFR <= 0.80. On held out frames, stenosis detection achieved precision 0.97 and lumen segmentation Dice 0.78. Across 100 vessels, AngioAI-QFR correlated strongly with FFR (r = 0.89, MAE 0.045). The AUC for detecting FFR <= 0.80 was 0.93, with sensitivity 0.88 and specificity 0.86. The pipeline completed fully automatically in 93 percent of vessels, with median time to result 41 s. RFC profiling distinguished focal from diffuse capacity loss, and virtual stenting predicted larger QFR gain in focal than in diffuse disease. AngioAI-QFR provides a practical, near real time pipeline that unifies computer vision, functional profiling, and virtual PCI with automated angiography derived physiology.


RFSeek and Ye Shall Find

Rotman, Noga H., Ferreira, Tiago, Peleg, Hila, Silberstein, Mark, Silva, Alexandra

arXiv.org Artificial Intelligence

Requests for Comments (RFCs) are extensive specification documents for network protocols, but their prose-based format and their considerable length often impede precise operational understanding. We present RFSeek, an interactive tool that automatically extracts visual summaries of protocol logic from RFCs. RFSeek leverages large language models (LLMs) to generate provenance-linked, explorable diagrams, surfacing both official state machines and additional logic found only in the RFC text. Compared to existing RFC visualizations, RFSeek's visual summaries are more transparent and easier to audit against their textual source. We showcase the tool's potential through a series of use cases, including guided knowledge extraction and semantic diffing, applied to protocols such as TCP, QUIC, PPTP, and DCCP. In practice, RFSeek not only reconstructs the RFC diagrams included in some specifications, but, more interestingly, also uncovers important logic such as nodes or edges described in the text but missing from those diagrams. RFSeek further derives new visualization diagrams for complex RFCs, with QUIC as a representative case. Our approach, which we term \emph{Summary Visualization}, highlights a promising direction: combining LLMs with formal, user-customized visualizations to enhance protocol comprehension and support robust implementations.


Aligning Tutor Discourse Supporting Rigorous Thinking with Tutee Content Mastery for Predicting Math Achievement

Abdelshiheed, Mark, Jacobs, Jennifer K., D'Mello, Sidney K.

arXiv.org Artificial Intelligence

This work investigates how tutoring discourse interacts with students' proximal knowledge to explain and predict students' learning outcomes. Our work is conducted in the context of high-dosage human tutoring where 9th-grade students (N = 1080) attended small group tutorials and individually practiced problems on an Intelligent Tutoring System (ITS). We analyzed whether tutors' talk moves and students' performance on the ITS predicted scores on math learning assessments. We trained Random Forest Classifiers (RFCs) to distinguish high and low assessment scores based on tutor talk moves, student's ITS performance metrics, and their combination. A decision tree was extracted from each RFC to yield an interpretable model. We found AUCs of 0.63 for talk moves, 0.66 for ITS, and 0.77 for their combination, suggesting interactivity among the two feature sources. Specifically, the best decision tree emerged from combining the tutor talk moves that encouraged rigorous thinking and students' ITS mastery. In essence, tutor talk that encouraged mathematical reasoning predicted achievement for students who demonstrated high mastery on the ITS, whereas tutors' revoicing of students' mathematical ideas and contributions was predictive for students with low ITS mastery. Implications for practice are discussed.


Light Curve Classification with DistClassiPy: a new distance-based classifier

Chaini, Siddharth, Mahabal, Ashish, Kembhavi, Ajit, Bianco, Federica B.

arXiv.org Artificial Intelligence

The rise of synoptic sky surveys has ushered in an era of big data in time-domain astronomy, making data science and machine learning essential tools for studying celestial objects. Tree-based (e.g. Random Forests) and deep learning models represent the current standard in the field. We explore the use of different distance metrics to aid in the classification of objects. For this, we developed a new distance metric based classifier called DistClassiPy. The direct use of distance metrics is an approach that has not been explored in time-domain astronomy, but distance-based methods can aid in increasing the interpretability of the classification result and decrease the computational costs. In particular, we classify light curves of variable stars by comparing the distances between objects of different classes. Using 18 distance metrics applied to a catalog of 6,000 variable stars in 10 classes, we demonstrate classification and dimensionality reduction. We show that this classifier meets state-of-the-art performance but has lower computational requirements and improved interpretability. We have made DistClassiPy open-source and accessible at https://pypi.org/project/distclassipy/ with the goal of broadening its applications to other classification scenarios within and beyond astronomy.


IT Intrusion Detection Using Statistical Learning and Testbed Measurements

Wang, Xiaoxuan, Stadler, Rolf

arXiv.org Artificial Intelligence

We study automated intrusion detection in an IT infrastructure, specifically the problem of identifying the start of an attack, the type of attack, and the sequence of actions an attacker takes, based on continuous measurements from the infrastructure. We apply statistical learning methods, including Hidden Markov Model (HMM), Long Short-Term Memory (LSTM), and Random Forest Classifier (RFC) to map sequences of observations to sequences of predicted attack actions. In contrast to most related research, we have abundant data to train the models and evaluate their predictive power. The data comes from traces we generate on an in-house testbed where we run attacks against an emulated IT infrastructure. Central to our work is a machine-learning pipeline that maps measurements from a high-dimensional observation space to a space of low dimensionality or to a small set of observation symbols. Investigating intrusions in offline as well as online scenarios, we find that both HMM and LSTM can be effective in predicting attack start time, attack type, and attack actions. If sufficient training data is available, LSTM achieves higher prediction accuracy than HMM. HMM, on the other hand, requires less computational resources and less training data for effective prediction. Also, we find that the methods we study benefit from data produced by traditional intrusion detection systems like SNORT.


Dependence of Physiochemical Features on Marine Chlorophyll Analysis with Learning Techniques

Adhikary, Subhrangshu, Chaturvedi, Sudhir Kumar, Banerjee, Saikat, Basu, Sourav

arXiv.org Artificial Intelligence

Marine chlorophyll which is present within phytoplankton are the basis of photosynthesis and they have a high significance in sustaining ecological balance as they highly contribute toward global primary productivity and comes under the food chain of many marine organisms. Imbalance in the concentrations of phytoplankton can disrupt the ecological balance. The growth of phytoplankton depends upon the optimum concentrations of physiochemical constituents like iron, nitrates, phosphates, pH level, salinity, etc. and deviations from an ideal concentration can affect the growth of phytoplankton which can ultimately disrupt the ecosystem at a large scale. Thus the analysis of such constituents has high significance to estimate the probable growth of marine phytoplankton. The advancements of remote sensing technologies have improved the scope to remotely study the physiochemical constituents on a global scale. The machine learning techniques have made it possible to predict the marine chlorophyll levels based on physiochemical properties and deep learning helped to do the same but in a more advanced manner simulating the working principle of a human brain. In this study, we have used machine learning and deep learning for the Bay of Bengal to establish a regression model of chlorophyll levels based on physiochemical features and discussed its reliability and performance for different regression models. This could help to estimate the amount of chlorophyll present in water bodies based on physiochemical features so we can plan early in case there arises a possibility of disruption in the ecosystem due to imbalance in marine phytoplankton.


Towards Fair Classification against Poisoning Attacks

Xu, Han, Liu, Xiaorui, Wan, Yuxuan, Tang, Jiliang

arXiv.org Artificial Intelligence

Fair classification aims to stress the classification models to achieve the equality (treatment or prediction quality) among different sensitive groups. However, fair classification can be under the risk of poisoning attacks that deliberately insert malicious training samples to manipulate the trained classifiers' performance. In this work, we study the poisoning scenario where the attacker can insert a small fraction of samples into training data, with arbitrary sensitive attributes as well as other predictive features. We demonstrate that the fairly trained classifiers can be greatly vulnerable to such poisoning attacks, with much worse accuracy & fairness trade-off, even when we apply some of the most effective defenses (originally proposed to defend traditional classification tasks). As countermeasures to defend fair classification tasks, we propose a general and theoretically guaranteed framework which accommodates traditional defense methods to fair classification against poisoning attacks. Through extensive experiments, the results validate that the proposed defense framework obtains better robustness in terms of accuracy and fairness than representative baseline methods.


Machine learning approach in the development of building occupant personas

Anik, Sheik Murad Hassan, Gao, Xinghua, Meng, Na

arXiv.org Artificial Intelligence

The user persona is a communication tool for designers to generate a mental model that describes the archetype of users. Developing building occupant personas is proven to be an effective method for human-centered smart building design, which considers occupant comfort, behavior, and energy consumption. Optimization of building energy consumption also requires a deep understanding of occupants' preferences and behaviors. The current approaches to developing building occupant personas face a major obstruction of manual data processing and analysis. In this study, we propose and evaluate a machine learning-based semi-automated approach to generate building occupant personas. We investigate the 2015 Residential Energy Consumption Dataset with five machine learning techniques - Linear Discriminant Analysis, K Nearest Neighbors, Decision Tree (Random Forest), Support Vector Machine, and AdaBoost classifier - for the prediction of 16 occupant characteristics, such as age, education, and, thermal comfort. The models achieve an average accuracy of 61% and accuracy over 90% for attributes including the number of occupants in the household, their age group, and preferred usage of heating or cooling equipment. The results of the study show the feasibility of using machine learning techniques for the development of building occupant persona to minimize human effort.


Predicting Electricity Outages Caused by Convective Storms

Tervo, Roope, Karjalainen, Joonas, Jung, Alexander

arXiv.org Artificial Intelligence

We consider the problem of predicting power outages in an electrical power grid due to hazards produced by convective storms. These storms produce extreme weather phenomena such as intense wind, tornadoes and lightning over a small area. In this paper, we discuss the application of state-of-the-art machine learning techniques, such as random forest classifiers and deep neural networks, to predict the amount of damage caused by storms. We cast this application as a classification problem where the goal is to classify storm cells into a finite number of classes, each corresponding to a certain amount of expected damage. The classification method use as input features estimates for storm cell location and movement which has to be extracted from the raw data. A main challenge of this application is that the training data is heavily imbalanced as the occurrence of extreme weather events is rare. In order to address this issue, we applied SMOTE technique.