Commodity Chemicals

Prediction of amino acid side chain conformation using a deep neural network

Machine Learning

A deep neural network-based architecture was constructed to predict amino acid side chain conformation with unprecedented accuracy. Side chain conformation prediction is essential for protein homology modeling and protein design. Current widely adopted methods use physics-based energy functions to evaluate side chain conformation. Here, using a deep neural network architecture without physics-based assumptions, we demonstrate that side chain conformation prediction accuracy can be improved by more than 25% compared with current standard methods, especially for aromatic residues. More strikingly, the prediction method presented here is robust enough to identify individual conformational outliers in high-resolution structures from a protein data bank without being given their structure factors. We envisage that our amino acid side chain predictor could be used as a quality check step in future protein structure model validation, and in many other potential applications such as side chain assignment in cryo-electron microscopy, crystallography model auto-building, protein folding and small molecule ligand docking.
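Side chain conformation is conventionally described by chi dihedral angles, each defined by four consecutive atoms. The abstract does not say how accuracy was scored, so the following is only a minimal sketch of one common way to compare a predicted rotamer with a reference one. The function names and the 40-degree tolerance are assumptions made for illustration, not details taken from the paper.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points.

    For chi1 the four points would be the N, CA, CB and gamma atoms.
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1n = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    k0 = _dot(b0, b1n)
    k2 = _dot(b2, b1n)
    v = (b0[0] - k0 * b1n[0], b0[1] - k0 * b1n[1], b0[2] - k0 * b1n[2])
    w = (b2[0] - k2 * b1n[0], b2[1] - k2 * b1n[1], b2[2] - k2 * b1n[2])
    x = _dot(v, w)
    y = _dot(_cross(b1n, v), w)
    return math.degrees(math.atan2(y, x))


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def chi_within_tolerance(predicted, reference, tolerance=40.0):
    """True if a predicted chi angle lies within `tolerance` degrees of the
    reference angle. The 40-degree default is an assumed cutoff."""
    return angular_difference(predicted, reference) <= tolerance
```

The wrap-around in `angular_difference` matters because chi angles are periodic: 170 and -170 degrees differ by 20 degrees, not 340.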

The Care and Feeding of Machine Learning - Carbon Black


The output of this task is a series of predictions about binaries' potential maliciousness and relationships to known malware families. These predictions are validated against outside intelligence.

Report 81 12 Stanford KSL

Classics (Collection 2)

Problems related to an inadequate data base of interpretation rules. The same set of production rules can suggest possible structural interpretations of 13C spectral features. Any individual 13C feature permits a great variety of structural interpretations. This paper presents an "expert system" devised to aid organic chemists in determining the structure (i.e. the arrangement of atoms and bonds) of newly isolated, naturally occurring compounds. The system exploits a data base of rules for analyzing.013

The Prediction of the Degree of Exposure to Solvent of Amino Acid Residues via Genetic Programming Simon Handley

AAAI Conferences

One of the most fundamental problems in molecular biology is the prediction of tertiary structure from primary structure: the protein folding problem. The goal of protein folding is the prediction of one feature of a folded protein (the 3D coordinates of its backbone atoms) from another feature (the sequence of amino acid residues that make up the protein). The protein folding problem is of enormous practical importance because the latter feature (the primary structure) is much easier to establish than the former (the tertiary structure). A related problem is the buriedness problem: the prediction of the degree of exposure to the solvent (the buriedness) of each amino acid residue in a folded protein. Some amino acid residues will have a buriedness of 0%: these are in the core of the protein and are likely hydrophobic. Other residues will have a buriedness of 100%: these are on the surface of the protein and are probably hydrophilic. The buriedness problem is interesting because it is a simplified version of the protein folding problem. In this paper I will show that genetic programming (Koza 1992; Koza 1994) does find programs that predict the buriedness of residues. These programs work better than would be expected of randomly generated programs and there is very little externally imposed bias towards any particular sizes, shapes/architectures or compositions.
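To make the setup concrete, here is a minimal sketch of how a candidate program could be represented and scored in a genetic-programming style. It is not the author's function set, terminal set or fitness measure, none of which are given in the abstract. The expression-tree encoding, the use of Kyte-Doolittle hydropathy values as inputs, the mean-absolute-error fitness and all names are assumptions for illustration. The abstract's convention is kept: 0% means core, 100% means surface.

```python
import random

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def evaluate(tree, seq, i):
    """Evaluate an expression tree (nested tuples) at residue i of seq."""
    op = tree[0]
    if op == "const":
        return tree[1]
    if op == "hyd":
        # Hydropathy of the residue at a relative offset; 0.0 past the ends.
        j = i + tree[1]
        return HYDROPATHY[seq[j]] if 0 <= j < len(seq) else 0.0
    a = evaluate(tree[1], seq, i)
    b = evaluate(tree[2], seq, i)
    if op == "add":
        return a + b
    if op == "sub":
        return a - b
    if op == "mul":
        return a * b
    raise ValueError(f"unknown operator: {op}")


def predict_exposure(tree, seq):
    """Per-residue predicted exposure, clamped to the range 0..100 percent."""
    return [min(100.0, max(0.0, evaluate(tree, seq, i)))
            for i in range(len(seq))]


def fitness(tree, seq, observed):
    """Mean absolute error against observed exposure; lower is better."""
    predicted = predict_exposure(tree, seq)
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(seq)


def random_tree(rng, depth=3):
    """Randomly generated program, usable as a baseline for comparison."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return ("const", rng.uniform(-10.0, 10.0))
        return ("hyd", rng.randint(-2, 2))
    op = rng.choice(["add", "sub", "mul"])
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


# Hand-written candidate: exposure = 50 - 10 * hydropathy(residue).
HAND_WRITTEN = ("sub", ("const", 50.0),
                ("mul", ("const", 10.0), ("hyd", 0)))
```

Comparing `fitness(HAND_WRITTEN, seq, observed)` with the fitness of many `random_tree(random.Random(seed))` programs on the same data mirrors the kind of "better than random programs" comparison the abstract describes. Any `observed` values would need to come from solved structures.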

QSPR and QSAR Models Derived with CODESSA Multipurpose Statistical Analysis Software

AAAI Conferences

An overview of the development of QSPR/QSAR equations using various descriptor mining techniques and multilinear regression analysis within the CODESSA program (Comprehensive Descriptors for Structural and Statistical Analysis) is given. The description of the methodologies applied in CODESSA is followed by a presentation of the QSAR and QSPR models derived for eighteen molecular activities and properties. The properties cover single molecular species, interactions between different molecular species, properties of surfactants, complex properties and properties of polymers.
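As an illustration of the multilinear regression step only, the sketch below fits a property as a linear combination of molecular descriptors by ordinary least squares. It is not CODESSA's implementation and does not cover its descriptor selection procedures. The function names are hypothetical, and the descriptors would in practice be computed from molecular structure.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: descriptors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_multilinear(descriptors, y):
    """Least-squares fit of y = c0 + c1*d1 + ... + ck*dk.

    Solves the normal equations (X^T X) c = X^T y and returns
    [c0, c1, ..., ck], where c0 is the intercept.
    """
    rows = [[1.0] + [float(v) for v in d] for d in descriptors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, descriptor):
    """Predicted property value for one molecule's descriptor vector."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], descriptor))


def r_squared(descriptors, y, coef):
    """Coefficient of determination of the fitted equation."""
    mean = sum(y) / len(y)
    ss_res = sum((t - predict(coef, d)) ** 2 for d, t in zip(descriptors, y))
    ss_tot = sum((t - mean) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot
```

Solving the normal equations directly is fine for a handful of well-conditioned descriptors. With many or strongly correlated descriptors, a QR- or SVD-based least-squares routine would be the more robust choice.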