Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space


In this experiment, we follow the experimental setup proposed by You et al. (2018). We optimize the penalized logP score of 800 low-scoring molecules from the ZINC data set. Our genetic algorithm is initialized with a molecule from the data set, and we run each experiment for 20 generations with a population size of 500 and without the discriminator. For each run, we report the molecule m that increases the penalized logP the most while possessing a similarity sim(m, m′) > δ with the respective reference molecule m′. We calculate molecular similarity based on Morgan fingerprints of radius 2. To ensure the generation of molecules possessing a certain similarity, we modify the fitness of molecule m by subtracting a penalty term, SimilarityPenalty(m). Here, SimilarityPenalty(m) is 0 if sim(m, m′) > δ and 10^6 otherwise.
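The similarity constraint described above can be sketched in a few lines. This is a minimal illustration, not the authors' code. It assumes fingerprints are given as sets of on-bit indices (in practice they would come from a cheminformatics toolkit), that similarity is the Tanimoto coefficient, that the threshold comparison is a strict ">", and that the penalty is subtracted from a score being maximized. All function names are hypothetical.

```python
PENALTY = 10**6  # penalty applied when the similarity constraint is violated


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices.

    Assumption: the source only says similarity is based on Morgan
    fingerprints; Tanimoto is a common choice, used here for illustration.
    """
    if not fp_a and not fp_b:
        return 1.0  # two empty fingerprints are treated as identical
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_penalty(fp_m, fp_ref, delta):
    """Return 0 if sim(m, m') > delta, and 10^6 otherwise."""
    return 0 if tanimoto(fp_m, fp_ref) > delta else PENALTY


def constrained_fitness(score, fp_m, fp_ref, delta):
    """Fitness with the penalty subtracted (assumes fitness is maximized)."""
    return score - similarity_penalty(fp_m, fp_ref, delta)


if __name__ == "__main__":
    candidate = {1, 2, 3}   # toy fingerprint bits, not real molecules
    reference = {2, 3, 4}
    print(constrained_fitness(2.5, candidate, reference, delta=0.4))
    print(constrained_fitness(2.5, candidate, reference, delta=0.6))
```

With a penalty this large, any candidate that falls below the threshold ranks under every candidate that satisfies it, so the constraint acts as a hard filter inside an otherwise unconstrained selection step.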

Feature-based factorized Bilinear Similarity Model for Cold-Start Top-n Item Recommendation

Machine Learning

Recommending new items to existing users remains a challenging problem because users' past preferences for these items are unavailable. User-personalized, non-collaborative methods based on item features can be used to address this item cold-start problem. These methods rely on similarities between the target item and the user's previously preferred items. When computing similarities from item features, however, these methods treat the features independently and overlook the interactions among them. Modeling interactions among features can be helpful because some features, when considered together, provide a stronger signal of an item's relevance than when they are considered independently. To address this issue, we introduce the Feature-based factorized Bilinear Similarity Model (FBSM), which learns a factorized bilinear similarity model for top-n recommendation of new items, given information about the items users preferred in the past as well as the features of these items. We carry out extensive empirical evaluations on benchmark datasets and find that the proposed FBSM approach improves upon traditional non-collaborative methods in terms of recommendation performance. Moreover, the proposed approach learns insightful interactions among item features from data, which leads to a deeper understanding of how these interactions contribute to personalized recommendation.
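To make the idea of a factorized bilinear similarity concrete, here is a minimal sketch. The abstract does not give the exact parameterization, so the form below is an assumption: sim(i, j) = f_i^T W f_j, where the diagonal of W is a weight vector d (same-feature matches) and each off-diagonal entry W[k][p] is the dot product of low-dimensional latent vectors V[k] and V[p] (cross-feature interactions). The names `d`, `V`, and both functions are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def bilinear_similarity(f_i, f_j, d, V):
    """Bilinear similarity f_i^T W f_j with a factorized off-diagonal.

    Assumed form: W[k][k] = d[k], and W[k][p] = V[k] . V[p] for k != p.
    V holds one latent vector per feature.
    """
    n_features, n_factors = len(V), len(V[0])
    # Same-feature contribution: sum_k d[k] * f_i[k] * f_j[k]
    diagonal = sum(d[k] * f_i[k] * f_j[k] for k in range(n_features))
    # Project each item into latent space: sum_k f[k] * V[k]
    proj_i = [sum(f_i[k] * V[k][c] for k in range(n_features))
              for c in range(n_factors)]
    proj_j = [sum(f_j[k] * V[k][c] for k in range(n_features))
              for c in range(n_factors)]
    # proj_i . proj_j covers all (k, p) pairs; remove the k == p terms
    same_feature = sum(f_i[k] * f_j[k] * dot(V[k], V[k])
                       for k in range(n_features))
    return diagonal + dot(proj_i, proj_j) - same_feature


def user_item_score(new_item, preferred_items, d, V):
    """Score a new item by summing its similarity to the user's past items."""
    return sum(bilinear_similarity(new_item, f, d, V) for f in preferred_items)
```

In this sketch, two items that share no features can still have nonzero similarity when their features have aligned latent vectors, which is the kind of cross-feature signal that a purely per-feature similarity cannot express. Factorizing the off-diagonal also keeps the parameter count linear in the number of features rather than quadratic.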

Representation Similarity Analysis for Efficient Task taxonomy & Transfer Learning

Artificial Intelligence

Transfer learning is widely used in deep neural network models when few labeled examples are available. The common approach is to take a network pre-trained on a similar task and fine-tune the model parameters. This is usually done blindly, without pre-selecting from a set of pre-trained models, or by fine-tuning a set of models trained on different tasks and selecting the best-performing one by cross-validation. We address this problem by proposing an approach to assess the relationship between visual tasks and their task-specific models. Our method uses Representation Similarity Analysis (RSA), which is commonly used to find a correlation between neuronal responses from brain data and models. With RSA we obtain a similarity score among tasks by computing correlations between models trained on different tasks. Our method is efficient because it requires only pre-trained models and a few images, with no further training. We demonstrate the effectiveness and efficiency of our method for generating a task taxonomy on the Taskonomy dataset. We next evaluate the relationship between RSA and transfer learning performance on Taskonomy tasks and a new task: Pascal VOC semantic segmentation. Our results reveal that models trained on tasks with higher similarity scores show higher transfer learning performance. Surprisingly, the best transfer learning result for Pascal VOC semantic segmentation is not obtained from the model pre-trained on semantic segmentation, probably due to domain differences, and our method successfully selects the high-performing models.
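The RSA comparison between two task-specific models can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each model is summarized by one feature vector per image, the dissimilarity between two images is 1 minus the Pearson correlation of their feature vectors, and the two resulting dissimilarity matrices are compared with a Spearman (rank) correlation. All function names are hypothetical, and constant feature vectors (zero variance) are not handled.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rdm_upper(features):
    """Upper triangle of the dissimilarity matrix, using 1 - Pearson."""
    n = len(features)
    return [1.0 - pearson(features[i], features[j])
            for i in range(n) for j in range(i + 1, n)]


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average
        pos = end + 1
    return result


def rsa_similarity(features_a, features_b):
    """Spearman correlation between the two models' dissimilarity matrices."""
    return pearson(ranks(rdm_upper(features_a)), ranks(rdm_upper(features_b)))
```

Because only the pairwise structure over a shared set of images is compared, the two models may have feature vectors of different lengths, and no fine-tuning is needed to obtain a score.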

Predicting Rodent Carcinogenicity By Learning Bayesian Classifiers

AAAI Conferences

The National Toxicology Program (NTP) uses the results of various experiments to determine whether test agents are carcinogenic. Because these experiments are costly and time-consuming, the rate at which test agents can be tested is limited. The ability to predict the outcome of the analysis at various points in the process would facilitate informed decisions about the allocation of testing resources. In addition, it is hoped that the models resulting from such predictions will augment expert insight into the biological pathways associated with cancers. This paper describes an approach to making such predictions that is based on learning Bayesian classifiers. This method builds predictive models by gleaning information from a training set containing data from test agents that have previously been classified by NTP. We focus on the structure of the training sets and on the intuitiveness and cross-validated accuracy of the learned models. As the data is available, these models are being used to predict the classifications of a set of thirty test agents currently being bioassayed by NTP.
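As a rough illustration of learning a Bayesian classifier from previously classified agents, the sketch below trains a naive Bayes model over categorical features with add-one smoothing. The abstract does not say which kind of Bayesian classifier is learned, so naive Bayes is an assumption made here for brevity. The function names are hypothetical, and any feature values or labels passed in are the caller's own, not NTP data.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for idx, value in enumerate(row):
            value_counts[(idx, label)][value] += 1
            feature_values[idx].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the label with the highest smoothed log-posterior."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_log_prob = None, -math.inf
    for label, count in class_counts.items():
        log_prob = math.log(count / total)  # class prior
        for idx, value in enumerate(row):
            # Add-one (Laplace) smoothing avoids zero probabilities
            numerator = value_counts[(idx, label)][value] + 1
            denominator = count + len(feature_values[idx])
            log_prob += math.log(numerator / denominator)
        if log_prob > best_log_prob:
            best_label, best_log_prob = label, log_prob
    return best_label
```

A model of this kind is easy to inspect, since each conditional probability table can be read directly, which fits the abstract's emphasis on the intuitiveness of the learned models.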