 Materials


Structured information extraction from complex scientific text with fine-tuned large language models

arXiv.org Artificial Intelligence

Large language models (LLMs) such as GPT-3 [12], PaLM [25], Megatron [26], OPT [27], Gopher [28], and FLAN [29] have been shown to have remarkable ability to leverage semantic information between tokens in natural language sequences of varying length. They are particularly adept at sequence-to-sequence (seq2seq) tasks, where a text input is used to seed a text response from the model. In this paper ...

This completion can be formatted as either English sentences or a more structured schema such as a list of JSON documents. To use this method, one only has to define the desired output structure--for example, a list of JSON objects with a predefined set of keys--and annotate 100-500 text passages using this format. GPT-3 is then fine-tuned on these examples, and the resulting model is able to accurately extract desired information from text and output information in the same structured representation as shown in Figure 1.
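As a rough illustration of the annotation format described in this excerpt, the sketch below pairs a text passage with a completion that is a list of JSON objects sharing a fixed set of keys. The key names (`material`, `property`, `value`), the example passage, and the helper names `make_example` and `parse_completion` are hypothetical and are not taken from the paper; no fine-tuning API is called.

```python
import json

# Hypothetical schema: every extracted record must carry exactly these keys.
SCHEMA_KEYS = ("material", "property", "value")

def make_example(passage, records):
    """Pair a text passage with its annotation as one training record."""
    for rec in records:
        if set(rec) != set(SCHEMA_KEYS):
            raise ValueError(f"record keys must be {SCHEMA_KEYS}, got {sorted(rec)}")
    # The completion is the JSON-serialized list of annotated records.
    return {"prompt": passage, "completion": json.dumps(records)}

def parse_completion(completion):
    """Decode a completion string, keeping only schema-conforming objects."""
    records = json.loads(completion)
    return [r for r in records if isinstance(r, dict) and set(r) == set(SCHEMA_KEYS)]

# Made-up passage and annotation, for illustration only.
example = make_example(
    "The alloy X showed a yield strength of 900 MPa.",
    [{"material": "alloy X", "property": "yield strength", "value": "900 MPa"}],
)
jsonl_line = json.dumps(example)  # one line of a JSONL training file
```

Validating the keys at annotation time and again when parsing model output keeps malformed records from silently entering a downstream database.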


Self-driving electric tractor promises eco-friendly, hands-off farming

Engadget

The autonomous tractor world is heating up, apparently. CNH Industrial has unveiled what it says is the "first" electric light tractor prototype with self-driving features, the New Holland T4 Electric Power. The machine promises zero emissions, quieter operation than diesel models and (according to CNH) lower running costs while reducing the amount of time farmers spend behind the wheel. Sensors and cameras on the roof help the vehicle complete tasks, dodge obstacles and work in harmony with other equipment. You can even activate it from your phone.


BASF taps LSU to help optimize its operations using artificial intelligence

#artificialintelligence

BASF, the largest chemical producer in the world, has been collaborating with LSU chemical engineers to better understand and predict its own production ebbs and flows using artificial intelligence, or AI. The project adds to an ongoing partnership between LSU and BASF to develop emerging STEM talent across disciplines in Louisiana. BASF's chemical manufacturing plant in Geismar in Ascension Parish is one of the company's six largest integrated production sites across 80 countries. It supplies products to a wide variety of industries, including agriculture, construction, energy and health. Chemicals such as solvents, amines, resins, glues, electronic-grade chemicals, industrial gases, basic petrochemicals and inorganic chemicals are produced at Geismar in about 30 interconnected production units, each containing its own subunits.


Terminator-style robot can survive being STABBED

Daily Mail - Science & tech

Sci-fi fans will know the Terminator was only a ruthless killing machine because of its effortless ability to heal itself after damage. Now, engineers at Cornell University in New York may be well on their way to recreating this remarkable self-healing ability. The experts have created a robot capable of detecting when and where it has been damaged and then restoring itself on the spot. The small soft robot, which resembles a four-legged starfish, uses light to detect changes on its surface that are created by cuts. For self-healing to work, the robot must be able to identify that there is something that needs to be fixed.


Phys. Rev. Materials 6, 123603 (2022) - Highly interpretable machine learning framework for prediction of mechanical properties of nickel based superalloys

#artificialintelligence

Superalloys are a special class of heavy-duty materials with excellent strength retention and chemical stability at very high temperatures. Nickel-based superalloys are used commercially in aircraft turbines, power plants, and space launch vehicles. The optimization of mechanical properties of alloys has been traditionally carried out using experimental approaches, which demand massive costs in terms of time and infrastructure for testing. In this paper, we propose a method for mechanical property prediction of Ni-based superalloys by learning from past experimental results using machine learning (ML). Five highly accurate ML models are developed to predict yield strength (YS), ultimate tensile strength (UTS), creep rupture life, fatigue life with stress, and strain values. We have developed an extensive database containing mechanical properties of over 1500 Ni-based superalloys. Basic material parameters such as the composition of the alloy, annealing conditions, and testing conditions are also collected and used as features for developing the ML models. The prediction root mean squared errors for the YS, UTS, creep, and fatigue life models are 0.11, 0.06, 0.19, and 0.22, respectively, which are low enough to give a highly accurate estimation of the target values. These ML models are highly transferable and require a minimum number of input features. In addition, feature analysis performed by SHapley Additive exPlanations (SHAP) for individual properties reveals the relative significance of each descriptor in deciding the target property. We demonstrate that a unified and highly accurate ML framework can be developed using common features for all mechanical properties. The models are developed on experimental data, making them directly applicable for industries.
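For readers unfamiliar with the error metric quoted in this abstract, the following is a minimal sketch of how a root mean squared error is computed from measured and predicted values. The function name `rmse` is illustrative, and the abstract does not state the scale or units on which its errors were evaluated, so nothing here reproduces the reported figures.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of squared residuals, then the square root.
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))
```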


Stochastic Optimization for Spectral Risk Measures

arXiv.org Artificial Intelligence

At first glance, this is a natural summary, inheriting both the statistical amenability of the sample mean (Shalev-Shwartz and Ben-David, 2014) and the wide arsenal of optimization algorithms designed specifically for finite sum objectives (Le Roux et al., 2012; Defazio et al., 2014; Johnson and Zhang, 2013; Reddi et al., 2016). However, as modern learning systems are deployed in critical domain applications such as energy planning (Guigues and Sagastizábal, 2013), materials engineering (Yeh, 2006), and financial regulation (He et al., 2022), safe and reliable performance in "worst-case" scenarios is paramount. This imperative can be modeled by alternate risk measures (statistical functionals of the loss distribution), particularly those that encapsulate the behavior of the distribution's upper tail.
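To make the contrast between the sample mean and a tail-focused risk measure concrete, here is a small sketch. It assumes the common discrete form of a spectral risk measure, a weighted sum of the sorted losses with non-negative, non-decreasing weights that sum to one, and uses the superquantile (CVaR) as the example. This form is an assumption and is not quoted from the excerpt; the function names are illustrative.

```python
def superquantile_weights(n, q):
    """Weights that average the worst (1 - q) fraction of n sorted losses."""
    if n <= 0 or not 0 <= q < 1:
        raise ValueError("need n > 0 and 0 <= q < 1")
    weights = []
    for i in range(1, n + 1):
        # Overlap of the interval [(i-1)/n, i/n] with the tail [q, 1].
        overlap = i / n - max((i - 1) / n, q)
        weights.append(max(0.0, overlap) / (1 - q))
    return weights

def spectral_risk(losses, weights):
    """Weighted sum of the losses sorted in ascending order."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have equal length")
    return sum(w * l for w, l in zip(weights, sorted(losses)))

losses = [4.0, 1.0, 3.0, 2.0]
mean_risk = spectral_risk(losses, superquantile_weights(4, 0.0))  # uniform weights
tail_risk = spectral_risk(losses, superquantile_weights(4, 0.5))  # worst half only
```

With `q = 0` the weights are uniform and the measure reduces to the sample mean; raising `q` shifts all weight onto the largest losses.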


DialogCC: Large-Scale Multi-Modal Dialogue Dataset

arXiv.org Artificial Intelligence

As sharing images in an instant message is a crucial factor, there has been active research on learning an image-text multi-modal dialogue model. However, training a well-generalized multi-modal dialogue model is challenging because existing multi-modal dialogue datasets contain little data, limited topics, and a restricted variety of images per dialogue. In this paper, we present a multi-modal dialogue dataset creation pipeline that involves matching large-scale images to dialogues based on CLIP similarity. Using this automatic pipeline, we propose a large-scale multi-modal dialogue dataset, DialogCC, which covers diverse real-world topics and various images per dialogue. With extensive experiments, we demonstrate that training a multi-modal dialogue model with our dataset can improve generalization performance. Additionally, existing models trained with our dataset achieve state-of-the-art performance on image and text retrieval tasks. The source code and the dataset will be released after publication.
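The similarity-based matching step mentioned in this abstract can be sketched as follows, assuming utterances and images have already been embedded into a shared vector space (for example by CLIP). The embeddings, the `threshold` and `top_k` parameters, and the function names are hypothetical; the abstract does not describe the actual filtering rules of the DialogCC pipeline.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def match_images(utterance_vec, image_vecs, threshold=0.5, top_k=2):
    """Return up to top_k image names whose similarity meets the threshold."""
    scored = [(cosine(utterance_vec, vec), name) for name, vec in image_vecs.items()]
    kept = sorted((s for s in scored if s[0] >= threshold), reverse=True)
    return [name for _, name in kept[:top_k]]

# Toy 2-D embeddings standing in for real model outputs.
images = {"dog": [0.9, 0.1], "car": [0.0, 1.0], "puppy": [1.0, 0.0]}
matches = match_images([1.0, 0.0], images)
```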


How automation and artificial intelligence could impact the packaging industry

#artificialintelligence

AMP Robotics uses automation and artificial intelligence (AI) to sort materials within waste streams. We asked CEO Matanya Horowitz how this works, how data can be utilised and the ways automated sorting could impact the packaging industry. AMP Robotics developed an AI platform (AMP Neuron) to distinguish recyclable materials from waste. How did you come up with the idea? Ever since I was a child I've been interested in robotics and the origins of intelligence.


About

#artificialintelligence

The dramatic increase in the use of Artificial Intelligence (AI) and traditional machine learning methods in different fields of science has become an essential asset in the development of the chemical industry, including pharmaceutical, agro-biotech, and other chemical companies. However, the application of AI in these fields is not straightforward and requires excellent knowledge of chemistry. Thus, there is a strong need to train and prepare a new generation of scientists who have skills in both machine learning and chemistry and can advance medicinal chemistry, which is the prime goal of the AIDD proposal. Research work packages (WPs) include sixteen topics selected to cover the key innovative directions in machine learning in chemistry. Fellows will be supervised by academics who have excellent complementary expertise and have contributed some of the fundamental AI algorithms that are used billions of times per day around the world, and by leading EU pharma companies who are in charge of new medicine and public health.


Tensor-reduced atomic density representations

arXiv.org Artificial Intelligence

Density based representations of atomic environments that are invariant under Euclidean symmetries have become a widely used tool in the machine learning of interatomic potentials, broader data-driven atomistic modelling and the visualisation and analysis of materials datasets. The standard mechanism used to incorporate chemical element information is to create separate densities for each element and form tensor products between them. This leads to a steep scaling in the size of the representation as the number of elements increases. Graph neural networks, which do not explicitly use density representations, escape this scaling by mapping the chemical element information into a fixed dimensional space in a learnable way. We recast this approach as tensor factorisation by exploiting the tensor structure of standard neighbour density based descriptors. In doing so, we form compact tensor-reduced representations whose size does not depend on the number of chemical elements, but remain systematically convergeable and are therefore applicable to a wide range of data analysis and regression tasks.