arf
Deep Representation Learning-Based Dynamic Trajectory Phenotyping for Acute Respiratory Failure in Medical Intensive Care Units
Wu, Alan, Choudhary, Tilendra, Upadhyaya, Pulakesh, Ali, Ayman, Yang, Philip, Kamaleswaran, Rishikesan
Sepsis-induced acute respiratory failure (ARF) is a serious complication with a poor prognosis. This paper presents a deep representation learning-based phenotyping method to identify distinct groups of clinical trajectories of septic patients with ARF. For this retrospective study, we created a dataset from electronic medical records (EMR) consisting of data from sepsis patients admitted to medical intensive care units who required at least 24 hours of invasive mechanical ventilation at a quaternary care academic hospital in the southeastern USA during the years 2016-2021. A total of N=3349 patient encounters were included in this study. The Clustering Representation Learning on Incomplete Time Series Data (CRLI) algorithm was applied to a parsimonious set of EMR variables in this dataset. To validate the optimal number of clusters, the K-means algorithm was used in conjunction with dynamic time warping. Our model yielded four distinct patient phenotypes that were characterized as liver dysfunction/heterogeneous, hypercapnia, hypoxemia, and multiple organ dysfunction syndrome by a critical care expert. A Kaplan-Meier analysis comparing 28-day mortality trends exhibited significant differences (p < 0.005) between the four phenotypes. The study demonstrates the utility of our deep representation learning-based approach in unraveling phenotypes that reflect the heterogeneity in sepsis-induced ARF in terms of different mortality outcomes and severity. These phenotypes might reveal important clinical insights into an effective prognosis and tailored treatment strategies.
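The abstract above mentions K-means used together with dynamic time warping (DTW). As a rough illustration of the DTW distance alone, here is a minimal sketch of the textbook dynamic-programming recurrence. It is not the study's code: the function name `dtw_distance`, the absolute-difference cost, and the univariate toy series are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Hypothetical helper: uses |x - y| as the local cost and no
    warping-window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Two series with the same shape but different pacing align at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Because DTW tolerates differences in pacing, it is often used in place of Euclidean distance when comparing trajectories that evolve at different speeds.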
Methods for Generating Drift in Text Streams
Garcia, Cristiano Mesquita, Koerich, Alessandro Lameiras, Britto, Alceu de Souza Jr, Barddal, Jean Paul
Systems and individuals produce data continuously. On the Internet, people share their knowledge, sentiments, and opinions, provide reviews about services and products, and so on. Automatically learning from these textual data can provide insights to organizations and institutions, thus preventing financial impacts, for example. To learn from textual data over time, the machine learning system must account for concept drift. Concept drift is a frequent phenomenon in real-world datasets and corresponds to changes in data distribution over time. For instance, a concept drift occurs when sentiments change or a word's meaning is adjusted over time. Although concept drift is frequent in real-world applications, benchmark datasets with labeled drifts are rare in the literature. To bridge this gap, this paper provides four textual drift generation methods to ease the production of datasets with labeled drifts. These methods were applied to Yelp and Airbnb datasets and tested using incremental classifiers respecting the stream mining paradigm to evaluate their ability to recover from the drifts. Results show that all methods have their performance degraded right after the drifts, and the incremental SVM is the fastest to run and recover the previous performance levels regarding accuracy and Macro F1-Score.
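To make the idea of a "labeled drift" concrete, the sketch below injects an abrupt drift into a small text stream by swapping two class labels from a chosen position onward and recording that position. This is only one simple way to build such a dataset and is not claimed to be one of the paper's four methods. The function name `inject_label_swap_drift` and the toy reviews are hypothetical.

```python
def inject_label_swap_drift(stream, drift_at, label_a, label_b):
    """Swap label_a and label_b for every item at index >= drift_at.

    Hypothetical helper: returns the drifted stream and the list of
    drift positions, so the drift is "labeled" for later evaluation.
    """
    drifted = []
    for i, (text, label) in enumerate(stream):
        if i >= drift_at:
            if label == label_a:
                label = label_b
            elif label == label_b:
                label = label_a
        drifted.append((text, label))
    return drifted, [drift_at]


# Toy stream of (text, label) pairs; the data is invented for illustration.
reviews = [
    ("good stay", "pos"),
    ("bad host", "neg"),
    ("great food", "pos"),
    ("awful room", "neg"),
]
drifted_reviews, drift_points = inject_label_swap_drift(reviews, 2, "pos", "neg")
```

An incremental classifier evaluated on `drifted_reviews` would be expected to lose accuracy right after index 2 and then recover as it adapts, which is the behavior the abstract describes measuring.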
Generative Forests
Nock, Richard, Guillame-Bert, Mathieu
Tabular data represents one of the most prevalent forms of data. When it comes to data generation, many approaches would learn a density for the data generation process, but would not necessarily end up with a sampler, even less so being exact with respect to the underlying density. A second issue is on models: while complex modeling based on neural nets thrives in image or text generation (etc.), less is known for powerful generative models on tabular data. A third problem is the visible chasm on tabular data between training algorithms for supervised learning with remarkable properties (e.g. boosting), and a comparative lack of guarantees when it comes to data generation. In this paper, we tackle the three problems, introducing new tree-based generative models convenient for density modeling and tabular data generation that improve on modeling capabilities of recent proposals, and a training algorithm which simplifies the training setting of previous approaches and displays boosting-compliant convergence. This algorithm has the convenient property to rely on a supervised training scheme that can be implemented by a few tweaks to the most popular induction scheme for decision tree induction with two classes. Experiments are provided on missing data imputation and comparing generated data to real data, displaying the quality of the results obtained by our approach, in particular against state of the art.
Streaming Active Deep Forest for Evolving Data Stream Classification
Luong, Anh Vu, Nguyen, Tien Thanh, Liew, Alan Wee-Chung
In recent years, Deep Neural Networks (DNNs) have gained progressive momentum in many areas of machine learning. The layer-by-layer process of DNNs has inspired the development of many deep models, including deep ensembles. The most notable deep ensemble-based model is Deep Forest, which can achieve highly competitive performance while having far fewer hyper-parameters compared to DNNs. In spite of its huge success in the batch learning setting, no effort has been made to adapt Deep Forest to the context of evolving data streams. In this work, we introduce the Streaming Deep Forest (SDF) algorithm, a high-performance deep ensemble method specially adapted to stream classification. We also present the Augmented Variable Uncertainty (AVU) active learning strategy to reduce the labeling cost in the streaming context. We compare the proposed methods to state-of-the-art streaming algorithms on a wide range of datasets. The results show that by following the AVU active learning strategy, SDF with only 70% of the labeling budget significantly outperforms other methods trained with all instances.
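The abstract does not describe how AVU works, so the sketch below shows only the plain variable-uncertainty strategy commonly used for budgeted active learning on streams, which AVU presumably builds on. That connection is an assumption. The class name `VariableUncertainty`, its parameters, and the default values are hypothetical and are not taken from the paper.

```python
class VariableUncertainty:
    """Plain variable-uncertainty label-query rule (not the paper's AVU).

    A label is requested when the classifier's top posterior falls below
    an adaptive threshold, as long as the labeling budget is not spent.
    """

    def __init__(self, budget=0.7, step=0.01):
        self.budget = budget  # max fraction of instances to label
        self.step = step      # relative threshold adjustment
        self.theta = 1.0      # current confidence threshold
        self.seen = 0
        self.labeled = 0

    def should_query(self, max_posterior):
        self.seen += 1
        # Budget check: fraction of instances labeled so far
        if self.labeled / self.seen >= self.budget:
            return False
        if max_posterior < self.theta:
            # Uncertain prediction: ask for the label, tighten threshold
            self.labeled += 1
            self.theta *= 1 - self.step
            return True
        # Confident prediction: skip, relax threshold
        self.theta *= 1 + self.step
        return False


strategy = VariableUncertainty(budget=0.5, step=0.1)
decisions = [strategy.should_query(p) for p in (0.6, 0.6, 0.95)]
```

The adaptive threshold keeps label requests spread over time: it shrinks while labels are being requested and grows again when the classifier is consistently confident, so a later drift can still trigger queries.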
Women in Analytics: AI - The ARF
There is no debate that AI is fundamentally changing the world through innovations like search through speech recognition, movie and music consumption and of course self-driving cars. But there is a dark side: AI could threaten society by contributing to data that deepens gender and racial bias. What does this mean for women in analytics? How do we develop the right skills to explore tools and approaches that can help minimize biases?
EVERDAY
Some languages have gender-like categories for Humans/Minds, others avoid any grammatization of sentients; in a traditionally gendered language, Minds usually prefer the feminine form. The earliest entities relatable to today's Minds were built -- grown -- by Humans but eventually Minds took over spawning and raising other Minds in a process "unhelpfully reminiscent" of human reproduction (heredity, recombination, generational dynamics). "Are we still a single civilization" is hardly disputed anymore -- not that the question has been settled either way, more of a distaste for sweeping reality under the carpet of definitions; try "orthogonal universes" (metaphors fare better), equally far from the primitive master/slave and "allied empires" models, but with a "shared library wall" of Knowledge (even if active reading across the divide is in decline -- but so it is across other divides in both worlds). More metaphors: male/female (touchy for humans even if the gender roles aren't fixed), past/future ("of our civilization"?), senses/memory (with neither or both claiming the role of reason), body/soul; plenty of such dualities generalize out of individual Human/Mind couples -- cherished, ideally lifelong associations, orthogonal to and coexistible with families, emerging from and evolving into agile, nearly wordless ("tenderly tense," "surpriseful") collaborations: "unburdened with each other," the two wade in disrelated lifescapes but always with a sense of what the other thinks, a thought on what the other senses -- "a human as a perfect mind window" and vice versa. Many such couples are in gardening, combining Humans' "bodily intelligence" ("children playing outdoors") with Minds' multithreaded, "effortlessly nomogenic" -- undistorted by their own biological adaptations -- background perspective; regnant, with retinues of agencies, over their extensive domains, they "give a glimpse of the three-way symbiosis" (humans, minds, arf) "stretching to embrace all nature."
A Rigorous Analysis of Linsker-type Hebbian Learning
Feng, J., Pan, H., Roychowdhury, V. P.
We propose a novel rigorous approach for the analysis of Linsker's unsupervised Hebbian learning network. The behavior of this model is determined by the underlying nonlinear dynamics which are parameterized by a set of parameters originating from the Hebbian rule and the arbor density of the synapses. These parameters determine the presence or absence of a specific receptive field (also referred to as a 'connection pattern') as a saturated fixed point attractor of the model. In this paper, we perform a qualitative analysis of the underlying nonlinear dynamics over the parameter space, determine the effects of the system parameters on the emergence of various receptive fields, and predict precisely within which parameter regime the network will have the potential to develop a specially designated connection pattern. In particular, this approach exposes, for the first time, the crucial role played by the synaptic density functions, and provides a complete and precise picture of the parameter space that defines the relationships among the different receptive fields. Our theoretical predictions are confirmed by numerical simulations.
A Rigorous Analysis of Linsker-type Hebbian Learning
Feng, J., Pan, H., Roychowdhury, V. P.
Linsker's simulations have shown that for appropriate parameter regimes, several structured connection patterns (e.g., centre-surround and oriented afferent receptive fields (aRFs)) occur progressively as the Hebbian evolution of the weights is carried out layer by layer. The behavior of Linsker's model is determined by the underlying nonlinear dynamics which are parameterized by a set of parameters originating from the Hebbian rule and the arbor density of the synapses.
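To illustrate what a "saturated fixed point" of such dynamics can look like, the sketch below iterates a Linsker-type update with hard weight limits until the weights stop changing. The excerpt does not state the update rule, so the form used here is an assumption: weight plus a constant `k1` plus a correlation-driven term `(Q[i][j] + k2) * r[j] * w[j]`, clipped to `[-w_max, w_max]`. All names (`hebbian_step`, `run_to_fixed_point`, `Q`, `r`, `k1`, `k2`, `w_max`) and the toy values are hypothetical.

```python
def hebbian_step(w, Q, r, k1, k2, w_max=0.5):
    """One discrete Hebbian update with hard saturation at +/- w_max.

    Assumed form: w_i <- clip(w_i + k1 + sum_j (Q[i][j] + k2) * r[j] * w_j)
    where Q is an input correlation matrix and r an arbor density.
    """
    n = len(w)
    new_w = []
    for i in range(n):
        drive = sum((Q[i][j] + k2) * r[j] * w[j] for j in range(n))
        value = w[i] + k1 + drive
        new_w.append(max(-w_max, min(w_max, value)))
    return new_w


def run_to_fixed_point(w, Q, r, k1, k2, w_max=0.5, max_iter=1000):
    """Iterate until the weight vector no longer changes (or max_iter)."""
    for _ in range(max_iter):
        nxt = hebbian_step(w, Q, r, k1, k2, w_max)
        if nxt == w:
            return nxt
        w = nxt
    return w


# Toy two-synapse example; values are invented for illustration.
Q = [[1.0, 0.5], [0.5, 1.0]]
r = [1.0, 1.0]
excitatory = run_to_fixed_point([0.1, 0.1], Q, r, k1=0.1, k2=0.0)
inhibitory = run_to_fixed_point([-0.1, -0.1], Q, r, k1=-0.1, k2=0.0)
```

In this toy setting every weight ends at one of the two limits, and which limit it reaches depends on the parameters and the starting point. That is the kind of parameter dependence the paper analyzes rigorously rather than by simulation.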