 Materials


Chemical Structure Elucidation from Mass Spectrometry by Matching Substructures

arXiv.org Machine Learning

Chemical structure elucidation is a serious bottleneck in analytical chemistry today. We address the problem of identifying an unknown chemical threat given its mass spectrum and its chemical formula, a task that might take well-trained chemists several days to complete. Given a chemical formula, there could be over a million possible candidate structures. We take a data-driven approach to rank these structures by using neural networks to predict the presence of substructures given the mass spectrum, and matching these substructures to the candidate structures. Empirically, we evaluate our approach on a data set of chemical agents built for unknown chemical threat identification. We show that our substructure classifiers can attain over 90% micro F1-score, and we can find the correct structure among the top 20 candidates in 88% and 71% of test cases for two compound classes.
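The ranking idea in this abstract can be illustrated with a minimal sketch. The abstract does not state how predicted substructures are matched to candidates, so the scoring rule below (an independent log-likelihood over substructures) is an assumption, and the substructure labels, candidate names, and probabilities are hypothetical.

```python
import math


def score_candidate(predicted, candidate_substructures, eps=1e-6):
    """Score one candidate by how well its substructures agree with the predictions.

    ASSUMPTION: substructures are treated as independent, and the score is the
    log-likelihood of the candidate's presence/absence pattern. The paper's
    actual matching rule is not given in the abstract.
    """
    total = 0.0
    for name, prob in predicted.items():
        prob = min(max(prob, eps), 1.0 - eps)  # keep log() finite
        if name in candidate_substructures:
            total += math.log(prob)
        else:
            total += math.log(1.0 - prob)
    return total


def rank_candidates(predicted, candidates):
    """Return candidate names sorted from best to worst score."""
    return sorted(
        candidates,
        key=lambda name: score_candidate(predicted, candidates[name]),
        reverse=True,
    )


# Hypothetical classifier outputs: probability that each substructure is present.
predicted = {"P=O": 0.95, "C-F": 0.10, "S-C": 0.80}

# Hypothetical candidate structures, each reduced to its set of substructures.
candidates = {
    "cand_A": {"P=O", "S-C"},
    "cand_B": {"P=O", "C-F"},
    "cand_C": {"C-F"},
}

ranking = rank_candidates(predicted, candidates)
print(ranking)
```

In a real pipeline the substructure sets would come from a cheminformatics toolkit applied to each candidate structure; here they are written by hand to keep the sketch self-contained.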


Life support system that recycles breathable air is being installed on the ISS

Daily Mail - Science & tech

Oxygen on board future space missions will be made from the recycled breath of astronauts. The Advanced Closed Loop System (ACLS) has been built by the European Space Agency (ESA) and is now being installed on board the orbiting spacecraft. The apparatus recycles half the carbon dioxide (CO2) exhaled by astronauts and converts it into oxygen. Scientists have heralded the invention as an important step towards long-term missions to Mars and beyond. ESA astronaut Alexander Gerst poses with the ACLS life-support rack, newly installed on the International Space Station.


Analysis of Atomistic Representations Using Weighted Skip-Connections

arXiv.org Machine Learning

In this work, we extend the SchNet architecture by using weighted skip connections to assemble the final representation. This enables us to study the relative importance of each interaction block for property prediction. We demonstrate on both the QM9 and MD17 datasets that their relative weighting depends strongly on the chemical composition and configurational degrees of freedom of the molecules, which opens the path towards a more detailed understanding of machine learning models for molecules.
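A minimal sketch of assembling a final representation from weighted skip connections follows. It is not the SchNet extension itself: normalizing the weights with a softmax is an assumption, and the block outputs and logits are hypothetical values rather than learned ones.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_blocks(block_outputs, logits):
    """Weighted sum of per-block feature vectors.

    ASSUMPTION: one scalar weight per interaction block, normalized by softmax.
    The weights can then be read as the relative importance of each block.
    """
    weights = softmax(logits)
    dim = len(block_outputs[0])
    combined = [0.0] * dim
    for weight, output in zip(weights, block_outputs):
        for j in range(dim):
            combined[j] += weight * output[j]
    return combined, weights


# Hypothetical outputs of three interaction blocks for one atom (2 features each).
block_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
logits = [0.0, 0.0, 0.0]  # equal logits give equal weights

combined, weights = combine_blocks(block_outputs, logits)
print(weights, combined)
```

Inspecting `weights` after training is what would make the contribution of each block comparable across molecules in a setup like this.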


Arm Leads Project to Develop an Armpit-Sniffing Plastic AI Chip

IEEE Spectrum Robotics

Body odor is a stubborn problem. Sensors and the computing attached to them struggle to perceive armpit odors in the way humans do, because B.O. is really a complex mix of dozens of gaseous chemicals. The UK's PlasticArmPit project is designing the first machine learning–enabled flexible plastic sensor chip. Its target audience: those who think they might stink. The prototype chip will be manufactured and tested in 2019.


Chemicals found in Martian soil releases oxygen that could be used to make the red planet habitable

Daily Mail - Science & tech

Salts found in Martian soil could be transformed into breathable air thanks to a bacterium created in the lab by a group of students. Their genetically engineered organism can convert perchlorate - a chemical compound that covers over one per cent of Mars - into oxygen. They did so by placing a solution containing the salt into a bioreactor with the engineered bacteria. It then broke the compound down into its building blocks, chloride and oxygen, the latter of which was then harvested. NASA aims to have humans on Mars by 2030 and, if successful, a colony may be built in the future.


Bees Are Dying Off. Tiny QR Code Backpacks May Help Save Them

WIRED

Science hasn't been giving us a tremendous amount of good news these days. We've screwed up the environment so badly, it's hard to even call it an environment anymore. And that's coming back to bite (or sting) us: Bee populations, which we rely on to pollinate our crops, are plummeting. But science is also coming to the rescue, by gluing QR codes to bumblebees' backs and tracking their movements with a robotic camera. Researchers have created a system that tracks individual bees as well as the dynamics of whole colonies exposed to imidacloprid, a neurotoxin that belongs to the infamous neonicotinoid group of pesticides.


CrystalGAN: Learning to Discover Crystallographic Structures with Generative Adversarial Networks

arXiv.org Machine Learning

Our main motivation is to propose an efficient approach to generate novel multi-element stable chemical compounds that can be used in real world applications. This task can be formulated as a combinatorial problem, and it takes human experts many hours to construct and evaluate new data. Unsupervised learning methods such as Generative Adversarial Networks (GANs) can be efficiently used to produce new data. Cross-domain Generative Adversarial Networks were reported to achieve exciting results in image processing applications. However, in the domain of materials science, there is a need to synthesize data with higher order complexity compared to observed samples, and the state-of-the-art cross-domain GANs cannot be adapted directly. In this contribution, we propose a novel GAN called CrystalGAN which generates new chemically stable crystallographic structures with increased domain complexity. We introduce an original architecture, we provide the corresponding loss functions, and we show that the CrystalGAN generates very reasonable data. We illustrate the efficiency of the proposed method on a real original problem of novel hydrides discovery that can be further used in development of hydrogen storage materials.


The Morning After: 5G iPhones

Engadget

Your 5G iPhone is unlikely to appear until 2020, an asteroid mining company gets some help from a new Blockchain owner, and drones get smarter at search and rescue. It's a match made in 2018. Planetary Resources just took an unusual turn on its path to asteroid mining -- selling itself to a blockchain company founded by Ethereum's Joe Lubin. Planetary Resources' Brian Israel said that Blockchain was a "natural solution" for commerce in space and an ideal way for people from various countries to coordinate efforts. It also adds some crucial funding to the space mining company, which had recently laid off employees.


Towards a Near Universal Time Series Data Mining Tool: Introducing the Matrix Profile

arXiv.org Artificial Intelligence

Towards a Near Universal Time Series Data Mining Tool: Introducing the Matrix Profile, by Chin-Chia Michael Yeh. Doctor of Philosophy, Graduate Program in Computer Science, University of California, Riverside, September 2018. Dr. Eamonn Keogh, Chairperson. The last decade has seen a flurry of research on all-pairs-similarity-search (or, self-join) for text, DNA, and a handful of other datatypes, and these systems have been applied to many diverse data mining problems. Surprisingly, however, little progress has been made on addressing this problem for time series subsequences. In this thesis, we have introduced a near universal time series data mining tool called matrix profile which solves the all-pairs-similarity-search problem and caches the output in an easy-to-access fashion. The proposed algorithm is not only parameter-free, exact and scalable, but also applicable for both single and multidimensional time series. By building time series data mining methods on top of matrix profile, many time series data mining tasks (e.g., motif discovery, discord discovery, shapelet discovery, semantic segmentation, and clustering) can be efficiently solved. Because the same matrix profile can be shared by a diverse set of time series data mining methods, matrix profile is a versatile, computed-once-use-many-times data structure. We demonstrate the utility of matrix profile for many time series data mining problems, including motif discovery, discord discovery, weakly labeled time series classification, and representation learning on domains as diverse as seismology, entomology, music processing, bioinformatics, human activity monitoring, electrical power-demand monitoring, and medicine. We hope the matrix profile is not the end but the beginning of many more time series data mining projects.
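To make the data structure concrete, here is a brute-force sketch of a matrix profile: for every subsequence of length m, the z-normalized Euclidean distance to its nearest non-trivial neighbor, plus that neighbor's index. This is not the scalable algorithm the thesis proposes; it is a quadratic reference version. The exclusion zone of m // 2 and the example series are assumptions chosen for illustration.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in seq]


def matrix_profile(series, m):
    """Brute-force self-join: O(n^2 * m) time, for illustration only."""
    n_sub = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n_sub)]
    exclusion = max(1, m // 2)  # ASSUMPTION: skip trivially overlapping matches
    profile = [math.inf] * n_sub
    index = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            if dist < profile[i]:
                profile[i] = dist
                index[i] = j
    return profile, index


# Hypothetical series: the pattern 0, 1, 3, 1, 0 occurs at positions 0 and 8.
series = [0, 1, 3, 1, 0, 5, 4, 6, 0, 1, 3, 1, 0, 2, 7, 3]
profile, index = matrix_profile(series, m=5)

motif_start = min(range(len(profile)), key=profile.__getitem__)
print(motif_start, index[motif_start], profile[motif_start])
```

Low values in `profile` point at repeated patterns (motifs) and high values at unusual subsequences (discords), which is why one cached profile can serve several downstream tasks.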


Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)

arXiv.org Machine Learning

Recently, locality sensitive hashing (LSH) was shown to be effective for MIPS, and several algorithms including $L_2$-ALSH, Sign-ALSH and Simple-LSH have been proposed. In this paper, we introduce the norm-range partition technique, which partitions the original dataset into sub-datasets containing items with similar 2-norms and builds a hash index independently for each sub-dataset. We prove that norm-range partition reduces the query processing complexity for all existing LSH based MIPS algorithms under mild conditions. The key to the performance improvement is that norm-range partition allows a smaller normalization factor to be used for most sub-datasets. For efficient query processing, we also formulate a unified framework to rank the buckets from the hash indexes of different sub-datasets. Experiments on real datasets show that norm-range partition significantly reduces the number of buckets probed for LSH based MIPS algorithms when achieving the same recall.
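The partitioning step can be sketched as follows. Splitting into equal-size groups after sorting by norm is an assumption about how the ranges are chosen, and the item vectors are hypothetical. The transform shown is the Simple-LSH style augmentation (scale by a normalization factor, then append one coordinate so the vector has unit norm), applied with the local maximum norm of a sub-dataset instead of the global one.

```python
import math


def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))


def norm_range_partition(items, num_parts):
    """Sort items by 2-norm and split them into sub-datasets of similar norm.

    ASSUMPTION: equal-size partitions. Each sub-dataset records its own maximum
    norm, which serves as its local normalization factor.
    """
    ordered = sorted(items, key=l2_norm)
    size = math.ceil(len(ordered) / num_parts)
    parts = []
    for start in range(0, len(ordered), size):
        chunk = ordered[start:start + size]
        parts.append({"items": chunk, "max_norm": l2_norm(chunk[-1])})
    return parts


def simple_lsh_transform(vec, max_norm):
    """Scale by the normalization factor and append a coordinate for unit norm."""
    scaled = [v / max_norm for v in vec]
    residual = max(0.0, 1.0 - sum(v * v for v in scaled))
    return scaled + [math.sqrt(residual)]


# Hypothetical dataset whose norms span 0.5 to 10.
items = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0], [6.0, 8.0], [0.0, 0.5], [0.0, 1.5]]
parts = norm_range_partition(items, num_parts=3)

local_factors = [part["max_norm"] for part in parts]
global_factor = max(l2_norm(v) for v in items)
transformed = simple_lsh_transform([0.0, 0.5], local_factors[0])
print(local_factors, global_factor, transformed)
```

With a single global factor of 10, the item [0.0, 0.5] would be scaled down to norm 0.05 and almost all of its transformed length would sit in the appended coordinate. With its local factor of 1 it keeps norm 0.5, which is the effect the abstract attributes to using a smaller normalization factor for most sub-datasets.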