associative pattern
MARS: A neurosymbolic approach for interpretable drug discovery
DeLong, Lauren Nicole, Gadiya, Yojana, Galdi, Paola, Fleuriot, Jacques D., Domingo-Fernández, Daniel
Neurosymbolic (NeSy) artificial intelligence describes the combination of logic or rule-based techniques with neural networks. Compared to neural approaches, NeSy methods often possess enhanced interpretability, which is particularly promising for biomedical applications like drug discovery. However, since interpretability is broadly defined, there are no clear guidelines for assessing the biological plausibility of model interpretations. To assess interpretability in the context of drug discovery, we devise a novel prediction task, called drug mechanism-of-action (MoA) deconvolution, with an associated, tailored knowledge graph (KG), MoA-net. We then develop the MoA Retrieval System (MARS), a NeSy approach for drug discovery which leverages logical rules with learned rule weights. Using this interpretable feature alongside domain knowledge, we find that MARS and other NeSy approaches on KGs are susceptible to reasoning shortcuts, in which the prediction of true labels is driven by "degree-bias" rather than the domain-based rules. Subsequently, we demonstrate ways to identify and mitigate this. Thereafter, MARS achieves performance on par with current state-of-the-art models while producing model interpretations aligned with known MoAs. Drug discovery (DD), the search for novel drugs or chemical compounds to treat ailments, often involves the screening of thousands of small compounds (Lin et al., 2020). Many computational approaches have been developed to accelerate and streamline this screening process (Gottlieb et al., 2011; Gan et al., 2023). Specifically, hundreds of such approaches operate upon knowledge graphs (KGs), in which nodes representing drugs, proteins, or medical conditions are connected by edges, representing the relationships between them (Chen et al., 2020).
Extreme-K categorical samples problem
Chou, Elizabeth, McVey, Catie, Hsieh, Yin-Chen, Enriquez, Sabrina, Hsieh, Fushing
With histograms as its foundation, we develop Categorical Exploratory Data Analysis (CEDA) under the extreme-$K$ sample problem, and illustrate its universal applicability through four 1D categorical datasets. Given a sizable $K$, CEDA's ultimate goal amounts to discovering the data's information content by carrying out two data-driven computational tasks: 1) establish a tree geometry upon $K$ populations as a platform for discovering a wide spectrum of patterns among populations; 2) evaluate each geometric pattern's reliability. In CEDA developments, each population gives rise to a row vector of category proportions. Upon the data matrix's row-axis, we discuss the pros and cons of Euclidean distance against its weighted version for building a binary clustering tree geometry. The criterion of choice rests on degrees of uniformness in column-blocks framed by this binary clustering tree. Each tree-leaf (population) is then encoded with a binary code sequence, as is each tree-based pattern. For evaluating reliability, we adopt row-wise multinomial randomness to generate an ensemble of matrix mimicries, and hence an ensemble of mimicked binary trees. The reliability of any observed pattern is its recurrence rate within the tree ensemble. A high reliability value means a deterministic pattern. Our four applications of CEDA illuminate four significant aspects of extreme-$K$ sample problems.
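The reliability step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the full binary clustering tree is replaced here by a single, much simpler pattern (which pair of populations is closest in Euclidean distance, i.e. the first merge of an agglomerative tree), and the function names, the toy count matrix, and the ensemble size are all assumptions made for illustration.

```python
import random
from itertools import combinations


def proportions(counts):
    """Turn one population's category counts into a row of proportions."""
    total = sum(counts)
    return [c / total for c in counts]


def closest_pair(matrix):
    """Index pair of the two rows with the smallest Euclidean distance.

    Stand-in for a tree pattern: this is the first merge an agglomerative
    clustering tree would make (an assumption, not the paper's tree).
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    pairs = combinations(range(len(matrix)), 2)
    return min(pairs, key=lambda p: dist2(matrix[p[0]], matrix[p[1]]))


def mimic_row(counts, rng):
    """Resample one row from a multinomial with the observed proportions."""
    draws = rng.choices(range(len(counts)), weights=counts, k=sum(counts))
    mimic = [0] * len(counts)
    for category in draws:
        mimic[category] += 1
    return mimic


def reliability(count_matrix, n_mimics=200, seed=0):
    """Recurrence rate of the observed pattern within the mimicked ensemble."""
    rng = random.Random(seed)
    observed = closest_pair([proportions(row) for row in count_matrix])
    hits = 0
    for _ in range(n_mimics):
        mimic = [mimic_row(row, rng) for row in count_matrix]
        if closest_pair([proportions(row) for row in mimic]) == observed:
            hits += 1
    return observed, hits / n_mimics


# Hypothetical toy data: 4 populations (rows) x 3 categories (columns).
toy_counts = [[50, 30, 20], [48, 32, 20], [10, 10, 80], [20, 60, 20]]
pattern, rate = reliability(toy_counts)
print(pattern, rate)
```

A recurrence rate near 1 would indicate that the observed pattern survives multinomial resampling; a full treatment would compare entire tree structures rather than a single merge.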
Associative Patterns of Web Browsing Behavior
Abramson, Myriam (US Naval Research Laboratory) | Gore, Shantanu (Thomas Jefferson Science and Technology)
Recognizing Web browsing signatures can complement other behavioral biometrics such as keystroke authentication to verify a claim of identity and/or identify persons of interest. The deluge of available digital traces enables the cognitive analysis of behavioral traits that differentiate between users and predict their online behavior. Recommendation systems have long capitalized on this capability to personalize search queries but have not exploited the temporal structure of preferences. This paper claims that spatio-temporal patterns of the categories of websites visited, by time of access, can uniquely characterize and identify users. We present some exploratory approaches to user identification based on recurrent neural networks, and empirical results based on clickstream data obtained through a user study and through an internet data provider.
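As a rough illustration of the kind of signal this abstract refers to, the sketch below builds a per-user profile of website category by hour of access from clickstream events. It covers only a simple feature-extraction step, not the recurrent neural network models the paper explores; the event format, category labels, and function name are hypothetical.

```python
from collections import Counter
from datetime import datetime


def browsing_profile(events):
    """Normalized (category, hour-of-day) histogram for one user.

    `events` is an iterable of (ISO 8601 timestamp string, category) pairs;
    this input format is an assumption made for the sketch.
    """
    counts = Counter(
        (category, datetime.fromisoformat(timestamp).hour)
        for timestamp, category in events
    )
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


# Hypothetical clickstream for a single user.
events = [
    ("2024-01-01T08:15:00", "news"),
    ("2024-01-01T08:40:00", "news"),
    ("2024-01-01T21:05:00", "video"),
    ("2024-01-02T08:05:00", "news"),
]
profile = browsing_profile(events)
print(profile)
```

Profiles of this kind could be compared across users or fed to a sequence model; which representation actually works best is an empirical question the sketch does not address.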
Constant for associative patterns ensemble
Makarov, Leonid, Komarov, Peter
A procedure for creating an ensemble of associative patterns is formulated in terms of formal logic using a neural network (NN) model. It is shown that the set of associative patterns is created by a unique procedure of NN operation that has individual parameters for transforming the input stimulus. It is ascertained that the number of selected associative patterns is a constant.