Neural Edit Operations for Biological Sequences

Neural Information Processing Systems

The evolution of biological sequences, such as proteins or DNA, is driven by three basic edit operations: substitution, insertion, and deletion. Motivated by the recent progress of neural network models for biological tasks, we implement two neural network architectures that can handle such edit operations. The first proposal is the edit invariant neural network (EINN), based on a differentiable Needleman-Wunsch algorithm. The second is the use of deep CNNs with concatenations. Our analysis shows that CNNs can recognize star-free regular expressions, and that deeper CNNs can recognize more complex regular expressions, including those involving the insertion/deletion of characters. The experimental results for the protein secondary structure prediction task suggest the importance of insertion/deletion. The test accuracy on the widely used CB513 dataset is 71.5%, which is 1.2 points better than the current best result among non-ensemble models.
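To make the idea of a differentiable Needleman-Wunsch score concrete, here is a minimal sketch. It is not the authors' implementation: the function names, the scoring values (`match`, `mismatch`, `gap`), and the choice of a log-sum-exp smoothed maximum controlled by a hypothetical temperature `gamma` are all assumptions made for illustration. With `gamma=None` the function computes the classic global alignment score; with a positive `gamma` the hard `max` over the three edit operations is replaced by a smooth approximation, which is what would let gradients flow through the recursion.

```python
import math


def soft_max(values, gamma):
    """Smooth maximum: gamma * log(sum(exp(v / gamma))).

    Approaches max(values) from above as gamma -> 0.
    """
    m = max(values)  # subtract the max for numerical stability
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))


def needleman_wunsch(a, b, match=1.0, mismatch=-1.0, gap=-1.0, gamma=None):
    """Global alignment score of sequences a and b.

    gamma=None gives the classic (hard max) score; gamma > 0 gives a
    smoothed score (an illustrative assumption, not the paper's exact form).
    """
    n, m = len(a), len(b)
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against the empty sequence costs one gap per character.
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = F[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            deletion = F[i - 1][j] + gap   # a[i-1] aligned to a gap
            insertion = F[i][j - 1] + gap  # b[j-1] aligned to a gap
            candidates = (substitution, deletion, insertion)
            if gamma is None:
                F[i][j] = max(candidates)
            else:
                F[i][j] = soft_max(candidates, gamma)
    return F[n][m]


hard_score = needleman_wunsch("ACGT", "AGT")
soft_score = needleman_wunsch("ACGT", "AGT", gamma=0.01)
```

In this sketch the smoothed score is never below the hard score and converges to it as `gamma` shrinks, so `gamma` trades off fidelity to the exact alignment score against smoothness of the gradient.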


Reviews: Neural Edit Operations for Biological Sequences

Neural Information Processing Systems

It is a bit unfortunate that the authors could not find a more scalable variant of EINN: for the larger datasets EINN was not even used because it was too slow to run, so it just came down to running a plain CNN. It's also unfortunate that the improvements due to EINN are so minor, which leads one to wonder whether this idea is useful in practice. However, the novelty of introducing a new idea like this makes it, in my opinion, worth publishing, despite not having stunning results. The EINN could have been described slightly better. Rather than include Algorithms 2 and 3, which are barely referenced in the text and seem to be just the result of applying the chain rule to Alg. 1 (unless I'm missing something), it would be good to have an algorithm that explicitly spells out the function of an EINN layer.


Neural Edit Operations for Biological Sequences

Koide, Satoshi, Kawano, Keisuke, Kutsuna, Takuro

Neural Information Processing Systems
