Accelerating the screening of amorphous polymer electrolytes by learning to reduce random and systematic errors in molecular dynamics simulations

arXiv.org Artificial Intelligence

Machine learning has been widely adopted to accelerate the screening of materials. Most existing studies implicitly assume that the training data are generated through a deterministic, unbiased process, but this assumption might not hold for the simulation of some complex materials. In this work, we aim to screen amorphous polymer electrolytes, which are promising candidates for next-generation lithium-ion battery technology but are extremely expensive to simulate due to their structural complexity. We demonstrate that a multi-task graph neural network can learn from a large amount of noisy, biased data and a small amount of unbiased data and reduce both random and systematic errors in predicting the transport properties of polymer electrolytes. This observation allows us to achieve accurate predictions of the properties of complex materials by learning to reduce errors in the training data, instead of running the repeated, expensive simulations that are conventionally used to reduce simulation errors. With this approach, we screen a space of 6247 polymer electrolytes, orders of magnitude larger than previous computational studies. We also find good extrapolation performance to the top polymers from a larger space of 53362 polymers and to 31 experimentally realized polymers. The strategy employed in this work may be applicable to a broad class of material discovery problems that involve the simulation of complex, amorphous materials.


Automated Synthesis of Steady-State Continuous Processes using Reinforcement Learning

arXiv.org Artificial Intelligence

Computer-aided process synthesis has been an important field of chemical engineering for decades [2]. A vast number of methods exist in computer-aided process synthesis, and the roles of human and computer differ considerably among them. On one end of the spectrum, humans invent flowsheets, provide mechanistic models of apparatus and physicochemical properties, and employ computers solely in simulations to evaluate and check the invented designs. On the other end of the spectrum, there is automated flowsheet synthesis, which we would rather call human-aided process synthesis by a computer. Therein, the structure of the process and the operating levels are chosen autonomously by the computer based on input from the human (typically a problem statement and the physicochemical property data). Siirola [3] classified automated flowsheet synthesis into three categories: superstructure optimization, evolutionary modification, and systematic generation. In superstructure optimization, a large flowsheet structure (the superstructure) is set up so that a large set of process alternatives can be obtained by removing parts of that structure [4,5]. An objective function or cost function is defined, and the optimal configuration for the flowsheet is determined by an optimization algorithm that uses decision variables to remove parts of the superstructure. Evolutionary modification works as follows: a process flowsheet is devised (by any method at hand), then repeatedly analyzed and changed in one or more ways to improve it.
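The superstructure idea described above can be illustrated with a toy sketch. It is not the method of the cited works: the unit names, the cost and benefit numbers, and the brute-force enumeration are all assumptions made for illustration. Real superstructure problems are typically solved with mixed-integer optimization rather than enumeration.

```python
from itertools import product

# Hypothetical optional units in a toy superstructure: name -> (cost, benefit).
# All names and numbers are made up for illustration only.
OPTIONAL_UNITS = {
    "preheater": (2.0, 3.0),
    "recycle_loop": (4.0, 7.0),
    "second_column": (6.0, 5.0),
}


def objective(decisions):
    """Net cost of a configuration; `decisions` maps unit name -> kept (bool)."""
    return sum(
        cost - benefit
        for name, (cost, benefit) in OPTIONAL_UNITS.items()
        if decisions[name]
    )


def best_configuration():
    """Enumerate every keep/remove combination and return the cheapest one."""
    names = list(OPTIONAL_UNITS)
    candidates = (
        dict(zip(names, bits))
        for bits in product([False, True], repeat=len(names))
    )
    return min(candidates, key=objective)


best = best_configuration()
print(best, objective(best))
```

Each binary decision variable removes or keeps one part of the superstructure, and the objective function ranks the resulting flowsheet alternatives.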


Data augmentation and feature selection for automatic model recommendation in computational physics

arXiv.org Machine Learning

Classification algorithms have recently found applications in computational physics for the selection of numerical methods or models adapted to the environment and the state of the physical system. For such classification tasks, labeled training data come from numerical simulations and generally correspond to physical fields discretized on a mesh. Three challenging difficulties arise: the lack of training data, their high dimensionality, and the non-applicability of common data augmentation techniques to physics data. This article introduces two algorithms to address these issues, one for dimensionality reduction via feature selection, and one for data augmentation. These algorithms are combined with a wide variety of classifiers for their evaluation. When combined with a stacking ensemble made of six multilayer perceptrons and a ridge logistic regression, they enable reaching an accuracy of 90% on our classification problem for nonlinear structural mechanics.


IIRC: Incremental Implicitly-Refined Classification

arXiv.org Artificial Intelligence

We introduce the "Incremental Implicitly-Refined Classification (IIRC)" setup, an extension to the class incremental learning setup where the incoming batches of classes have two granularity levels, i.e., each sample could have a high-level (coarse) label like "bear" and a low-level (fine) label like "polar bear". Only one label is provided at a time, and the model has to figure out the other label if it has already learned it. This setup is more aligned with real-life scenarios, where a learner usually interacts with the same family of entities multiple times and discovers more granularity about them, while still trying not to forget previous knowledge. Moreover, this setup enables evaluating models on some important lifelong learning challenges that cannot be easily addressed under the existing setups. These challenges can be motivated by the example "if a model was trained on the class bear in one task and on polar bear in another task, will it forget the concept of bear, will it rightfully infer that a polar bear is still a bear, and will it wrongfully associate the label of polar bear with other breeds of bear?". We develop a standardized benchmark that enables evaluating models on the IIRC setup. We evaluate several state-of-the-art lifelong learning algorithms and highlight their strengths and limitations. For example, distillation-based methods perform relatively well but are prone to incorrectly predicting too many labels per image. We hope that the proposed setup, along with the benchmark, will provide a meaningful problem setting to the practitioners.


New Products

Science

Eppendorf now offers a further building block in supporting scientists with tailored solutions for the daily lab routine. Controlled, reliable cell thawing is mandatory for further downstream experiments in every cell-handling lab. Although water-bath based or even manual thawing of cells is still commonly practiced, these methods are not as desirable due to their limited reproducibility. The Eppendorf ThermoMixer C now features an exchangeable thermoblock, the cryo-thaw SmartBlock, which provides a dedicated thawing program for reproducible, reliable thawing of cells from frozen storage conditions up to 37°C.

In response to the community's need for highly specific and reproducible antibodies for SARS-CoV-1/2 research, MilliporeSigma has designed ZooMAb recombinant monoclonal antibodies against various SARS-CoV-1/2 targets. ZooMAb antibodies are all recombinantly produced, lyophilized, and free of animal components. Explore our offering of ZooMAb recombinant monoclonal antibodies that are suitable for COVID-19 research: Anti-SARS-CoV-1/2 NP, clone 1C7C7 ZooMAb Mouse Monoclonal Nucleoprotein; Anti-SARS-CoV-1/2 S Protein, clone 2B3E5 ZooMAb Mouse Monoclonal SARS-CoV-1/2 Spike Glycoprotein; and Anti-SARS-CoV-1/2 S Protein clone hu2B3E5 ZooMAb Chimeric Monoclonal.

The BioScaffolder Prime from Analytik is an affordable, high-performance 3D bioprinter that delivers precision engineering in an advanced, customizable platform. The rapidly expanding field of 3D bioprinting for tissue engineering and regenerative medicine combines biocompatible/biodegradable polymers with living cells. This bioprinter package offers researchers the ability to create bioscaffolds for cell growth and to deposit layers of bioinks on implants or microfluidic objects. The unit can be equipped with multiple dispensing tools, including unique core/shell tools for simultaneous dispensing of different materials. Decentralized units for printing, media control, and computing save precious space in your biosafety cabinet and ensure superb heat dissipation. Silent but smart XYZ-drives deliver micrometer precision. In addition, the system comes with a Peltier heater/cooler cartridge for temperature-controlled bioprinting and a built-in UV-source UV-LED pen. Designed to fit and operate in a standard biosafety cabinet, BioScaffolder Prime enables you to undertake your 3D printing applications quickly, safely, and in a sterile environment.

Ziath reports strong uptake of its Mohawk semiautomated tube picker in smaller biobanks and biorepositories, which need to select tubes from cold racks straight from the freezer but cannot afford the huge investment in robotics required to automatically pick and place tubes. The small, compact Mohawk can pick up 16 tubes simultaneously from a 96-position tube rack. By elevating sample tubes in racks using solenoids, the Mohawk enables biobank operators to quickly retrieve the correct tubes and put them in the destination racks. Additionally, because the Mohawk can seamlessly connect with Ziath rack scanners, biobank users can read a picking list, select tubes, and verify that the correct tubes are picked, making the process of finding and selecting the right tubes in your biobank more efficient and economical.

The Cold Coil II Flow Reactor Module from Uniqsis is a flexible, entry-level solution for low temperature flow chemistry applications. Used in conjunction with an external thermoregulation circulator, the unit can maintain stable temperatures between −78°C and 150°C for extended periods of time. It is compatible with all Uniqsis coil reactors, from 2.0 mL up to 60 mL capacity. A proprietary clamping mechanism holds the coil reactor firmly in place and ensures optimal thermal contact while allowing easy interchange of coil reactors. The Cold Coil II can be easily converted into a photoreactor by coupling it with a Uniqsis PhotoSyn high-power LED light module. It is also compatible with the Uniqsis HotColumn multiple-column reactor adaptor for packed-bed applications. To ensure accurate remote measurement of the Cold Coil II reactor temperature, an optional internal temperature probe can be connected directly via RS232C.

The RAPID EPS (Easy Piercing Seal) from BioChromato is designed for scientists looking to prevent contamination issues and autosampler-needle clogging when accessing samples stored in 96-well microplates ready for LC/MS analysis. For LC/MS users, a key criterion for an effective microplate seal is its resistance to solvents such as acetonitrile, methanol, and DMSO, which are commonly used in experiments and analysis. The RAPID EPS uses a synthetic rubber adhesive to create a high-integrity, airtight seal with microplates, and shows no contamination in the eluents. In addition, the unique construction of BioChromato's RAPID EPS does not leave particulate material when pierced, further safeguarding your samples from contamination and eliminating potentially harmful effects to your LC/MS autosampler. The RAPID EPS is proven to offer dependable microplate sealing over a working temperature range of −80°C to 80°C.


A Novel Regression Loss for Non-Parametric Uncertainty Optimization

arXiv.org Machine Learning

Quantification of uncertainty is one of the most promising approaches to establish safe machine learning. Despite its importance, it is far from being generally solved, especially for neural networks. One of the most commonly used approaches so far is Monte Carlo dropout, which is computationally cheap and easy to apply in practice. However, it can underestimate the uncertainty. We propose a new objective, referred to as second-moment loss (SML), to address this issue. While the full network is encouraged to model the mean, the dropout networks are explicitly used to optimize the model variance. We intensively study the performance of the new objective on various UCI regression datasets. Compared to state-of-the-art deep ensembles, SML leads to comparable prediction accuracies and uncertainty estimates while only requiring a single model. Under distribution shift, we observe moderate improvements. As a side result, we introduce an intuitive Wasserstein distance-based uncertainty measure that is non-saturating and thus allows resolving quality differences between any two uncertainty estimates.
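For readers unfamiliar with the Monte Carlo dropout baseline mentioned above, the following is a minimal sketch of the general technique, not of the proposed second-moment loss. The toy one-layer model, its weights, the keep probability, and the function names are all assumptions chosen for illustration.

```python
import random
import statistics


def forward(x, weights, keep_mask, keep_prob):
    """One stochastic pass of a toy linear model with inverted dropout."""
    return sum(w * x * m / keep_prob for w, m in zip(weights, keep_mask))


def mc_dropout_predict(x, weights, keep_prob=0.8, n_samples=200, seed=0):
    """Run repeated passes with fresh dropout masks; return (mean, variance)."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Each weight is kept independently with probability keep_prob.
        mask = [1.0 if rng.random() < keep_prob else 0.0 for _ in weights]
        outputs.append(forward(x, weights, mask, keep_prob))
    return statistics.mean(outputs), statistics.pvariance(outputs)


toy_weights = [0.5, -0.2, 0.3, 0.4]  # hypothetical, untrained weights
mean, variance = mc_dropout_predict(1.0, toy_weights)
print(mean, variance)
```

The spread of the sampled outputs serves as the uncertainty estimate; with `keep_prob=1.0` no units are dropped and the variance collapses to zero.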


Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness

arXiv.org Artificial Intelligence

Open-domain dialogue agents have vastly improved, but still confidently hallucinate knowledge or express doubt when asked straightforward questions. In this work, we analyze whether state-of-the-art chit-chat models can express metacognition capabilities through their responses: does a verbalized expression of doubt (or confidence) match the likelihood that the model's answer is incorrect (or correct)? We find that these models are poorly calibrated in this sense, yet we show that the representations within the models can be used to accurately predict likelihood of correctness. By incorporating these correctness predictions into the training of a controllable generation model, we obtain a dialogue agent with greatly improved linguistic calibration.


Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning

arXiv.org Machine Learning

We formally study how ensembles of deep learning models can improve test accuracy, and how the superior performance of an ensemble can be distilled into a single model using knowledge distillation. We consider the challenging case where the ensemble is simply an average of the outputs of a few independently trained neural networks with the SAME architecture, trained using the SAME algorithm on the SAME data set, and they only differ by the random seeds used in the initialization. We empirically show that ensemble/knowledge distillation in deep learning works very differently from traditional learning theory, especially differently from ensembles of random feature mappings or the neural-tangent-kernel feature mappings, and is potentially out of the scope of existing theorems. Thus, to properly understand ensemble and knowledge distillation in deep learning, we develop a theory showing that when data has a structure we refer to as "multi-view", an ensemble of independently trained neural networks can provably improve test accuracy, and such superior test accuracy can also be provably distilled into a single model by training a single model to match the output of the ensemble instead of the true label. Our result sheds light on how ensemble works in deep learning in a way that is completely different from traditional theorems, and how the "dark knowledge" is hidden in the outputs of the ensemble -- which can be used in knowledge distillation -- compared to the true data labels. In the end, we prove that self-distillation can also be viewed as implicitly combining ensemble and knowledge distillation to improve test accuracy.
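The two operations the abstract refers to, averaging the outputs of ensemble members and training a student to match that average, can be sketched as follows. This is a generic illustration under stated assumptions: the logits are made-up numbers, outputs are averaged at the logit level, and the distillation loss is a plain cross-entropy against the softened ensemble output with temperature 1. None of these choices is confirmed as the paper's exact formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_average(member_logits):
    """Average the outputs of several models, class by class."""
    n = len(member_logits)
    return [sum(column) / n for column in zip(*member_logits)]


def distillation_loss(student_logits, teacher_logits):
    """Cross-entropy of the student's distribution against the teacher's."""
    teacher = softmax(teacher_logits)
    student = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))


# Hypothetical logits from two independently seeded models on one input.
teacher_logits = ensemble_average([[2.0, 0.0, -1.0], [1.0, 1.0, 0.0]])
print(distillation_loss([0.0, 0.0, 0.0], teacher_logits))
```

The loss is smallest when the student reproduces the ensemble's full output distribution, which is how the soft targets carry more information than a one-hot true label.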


Knowledge Graphs in Manufacturing and Production: A Systematic Literature Review

arXiv.org Artificial Intelligence

Knowledge graphs in manufacturing and production aim to make production lines more efficient and flexible with higher quality output. This makes knowledge graphs attractive for companies to reach Industry 4.0 goals. However, existing research in the field is quite preliminary, and more research effort on analyzing how knowledge graphs can be applied in the field of manufacturing and production is needed. Therefore, we have conducted a systematic literature review as an attempt to characterize the state-of-the-art in this field, i.e., by identifying existing research and by identifying gaps and opportunities for further research. To do that, we have focused on finding the primary studies in the existing literature, which were classified and analyzed according to four criteria: bibliometric key facts, research type facets, knowledge graph characteristics, and application scenarios. In addition, an evaluation of the primary studies has been carried out to gain deeper insights in terms of methodology, empirical evidence, and relevance. As a result, we can offer a complete picture of the domain, which includes such interesting aspects as the fact that knowledge fusion is currently the main use case for knowledge graphs, that empirical research and industrial application are still missing to a large extent, that graph embeddings are not fully exploited, and that technical literature is fast-growing but seems to be still far from its peak.


Active Learning: Problem Settings and Recent Developments

arXiv.org Machine Learning

Supervised learning is a typical problem setting for machine learning that approximates the relationship between the input and output based on given sets of input and output data. The accuracy of the approximation can be increased by using more input and output data to build the model; however, obtaining the appropriate output for the input can be costly. A classic example is the crossbreeding of plants. The environmental conditions (e.g., average monthly temperature, type and amount of fertilizer used, watering conditions, weather) are the input, and the specific properties of the crops are the output. In this case, the controllable variables are related to the fertilizer and watering conditions, but it would take several months to years to perform experiments under various conditions and determine the optimal fertilizer composition and watering conditions.