AITopics

2308.1357

Country:

Oceania > Australia (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Materials (0.46)
Education > Educational Technology > Educational Software > Computer Based Training (0.34)

arXiv.org Artificial IntelligenceOct-5-2023

Spatial-temporal associations representation and application for process monitoring using graph convolution neural network

Ren, Hao, Liang, Xiaojun, Yang, Chunhua, Chen, Zhiwen, Gui, Weihua

Thank you very much for the attention and concern of colleagues and scholars in this work. With the comments and guidance of experts, editors, and reviewers, this work has been accepted for publishing in the journal "Process Safety and Environmental Protection". The theme of this paper relies on the Spatial-temporal associations of numerous variables in the same industrial processes, which refers to numerous variables obtained in dynamic industrial processes with Spatial-temporal correlation characteristics, i.e., these variables are not only highly correlated in time but also interrelated in space. To handle this problem, three key issues need to be well addressed: variable characteristics modeling and representation, graph network construction (temporal information), and graph characteristics perception. The first issue is implemented by assuming the data follows one improved Gaussian distribution, while the graph network can be defined by the monitoring variables and their edges which are calculated by their characteristics in time. Finally, these networks corresponding to process states at different times are fed into a graph convolutional neural network to implement graph classification to achieve process monitoring. A benchmark experiment (Tennessee Eastman chemical process) and one application study (cobalt purification from zinc solution) are employed to demonstrate the feasibility and applicability of this paper.

industrial process, neural network, spatial-temporal association representation and application, (10 more...)

2205.0525

Country:

North America > United States > Tennessee (0.25)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.82)

Industry: Materials > Chemicals (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

arXiv.org Artificial IntelligenceOct-3-2023

Contrastive Post-training Large Language Models on Data Curriculum

Xu, Canwen, Rosset, Corby, Del Corro, Luciano, Mahajan, Shweti, McAuley, Julian, Neville, Jennifer, Awadallah, Ahmed Hassan, Rao, Nikhil

Alignment serves as an important step to steer large language models (LLMs) towards human preferences. In this paper, we explore contrastive post-training techniques for alignment by automatically constructing preference pairs from multiple models of varying strengths (e.g., InstructGPT, ChatGPT and GPT-4). We carefully compare the contrastive techniques of SLiC and DPO to SFT baselines and find that DPO provides a step-function improvement even after continueing SFT saturates. We also explore a data curriculum learning scheme for contrastive posttraining, which starts by learning from "easier" pairs and transitioning to "harder" ones, which further improves alignment. Finally, we scale up our experiments to train with more data and larger models like Orca. Remarkably, contrastive post-training further improves the performance of Orca, already a state-of-the-art instruction learning model tuned with GPT-4 outputs, to exceed that of ChatGPT. The rapid evolution of Large Language Models (LLMs) has ushered in a new era of natural language processing capabilities. These models, when scaled to billions of parameters and pretrained over trillions of text tokens, demonstrate unprecedented proficiency in a wide array of tasks (Brown et al., 2020; Chowdhery et al., 2022). Various post-training procedures like supervised instruction tuning and Reinforcement Learning from Human Feedback (RLHF) fine-tune pretrained LLMs to better align with human expectations and preferences (Ouyang et al., 2022; OpenAI, 2023; Touvron et al., 2023a). This additional alignment procedure is crucial, because the pretraining objective of essentially predicting the next token in a text sequence is known to produce LLMs whose outputs are at times incorrect, irrelevant, or unsafe (Bai et al., 2022a). Traditionally, these post-training techniques rely on human preference annotations to inform an LLM which behaviors it ought to adopt in the scenario at hand. For instance, RLHF fits a reward model on these preference pairs, against which a LLM policy is then optimized (Ziegler et al., 2019; Bai et al., 2022a; Touvron et al., 2023b). However, such human feedback is expensive to obtain and often noisy (Stiennon et al., 2020; Ouyang et al., 2022; Bai et al., 2022a). To align an LLM without human feedback, other methods such as Reinforcement Learning from AI Feedback (RLAIF) harvest preference signals via automatic feedback from another LLM (Lee et al., 2023; Bai et al., 2022b). However, studies have found AI feedback has a low agreement rate with humans (Perez et al., 2022; Casper et al., 2023b; Lee et al., 2021). Also, these methods suffer from the same drawbacks as RLHF, such as reward hacking (Skalse et al., 2022).

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2310.02263

Country:

Europe > United Kingdom (0.28)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Consumer Health (1.00)
Energy > Oil & Gas > Downstream (0.93)
Materials (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Eghbalpoor, Roozbeh, Sheidaei, Azadeh

A peridynamic-informed deep learning model for brittle damage prediction

In this study, a novel approach that combines the principles of peridynamic (PD) theory with PINN is presented to predict quasi-static damage and crack propagation in brittle materials. To achieve high prediction accuracy and convergence rate, the linearized PD governing equation is enforced in the PINN's residual-based loss function. The proposed PD-INN is able to learn and capture intricate displacement patterns associated with different geometrical parameters, such as pre-crack position and length. Several enhancements like cyclical annealing schedule and deformation gradient aware optimization technique are proposed to ensure the model would not get stuck in its trivial solution. The model's performance assessment is conducted by monitoring the behavior of loss function throughout the training process. The PD-INN predictions are also validated through several benchmark cases with the results obtained from high-fidelity techniques such as PD direct numerical method and Extended-Finite Element Method. Our results show the ability of the nonlocal PD-INN to predict damage and crack propagation accurately and efficiently.

deformation, equation, loss function, (14 more...)

2310.0135

Country: North America > United States > Iowa > Story County > Ames (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Materials > Construction Materials (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Aykol, Muratahan, Merchant, Amil, Batzner, Simon, Wei, Jennifer N., Cubuk, Ekin Dogus

Predicting emergence of crystals from amorphous matter with deep learning

Crystallization of the amorphous phases into metastable crystals plays a fundamental role in the formation of new matter, from geological to biological processes in nature to synthesis and development of new materials in the laboratory. Predicting the outcome of such phase transitions reliably would enable new research directions in these areas, but has remained beyond reach with molecular modeling or ab-initio methods. Here, we show that crystallization products of amorphous phases can be predicted in any inorganic chemistry by sampling the crystallization pathways of their local structural motifs at the atomistic level using universal deep learning potentials. We show that this approach identifies the crystal structures of polymorphs that initially nucleate from amorphous precursors with high accuracy across a diverse set of material systems, including polymorphic oxides, nitrides, carbides, fluorides, chlorides, chalcogenides, and metal alloys. Our results demonstrate that Ostwald's rule of stages can be exploited mechanistically at the molecular level to predictably access new metastable crystals from the amorphous phase in material synthesis.

amorphous phase, crystallization, subcell, (17 more...)

2310.01117

Country: North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Materials > Chemicals (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Michańków, Jakub, Kwiatkowski, Łukasz, Morajda, Janusz

Combining Deep Learning and GARCH Models for Financial Volatility and Risk Forecasting

In this paper, we develop a hybrid approach to forecasting the volatility and risk of financial instruments by combining common econometric GARCH time series models with deep learning neural networks. For the latter, we employ Gated Recurrent Unit (GRU) networks, whereas four different specifications are used as the GARCH component: standard GARCH, EGARCH, GJR-GARCH and APARCH. Models are tested using daily logarithmic returns on the S&P 500 index as well as gold price Bitcoin prices, with the three assets representing quite distinct volatility dynamics. As the main volatility estimator, also underlying the target function of our hybrid models, we use the price-range-based Garman-Klass estimator, modified to incorporate the opening and closing prices. Volatility forecasts resulting from the hybrid models are employed to evaluate the assets' risk using the Value-at-Risk (VaR) and Expected Shortfall (ES) at two different tolerance levels of 5% and 1%. Gains from combining the GARCH and GRU approaches are discussed in the contexts of both the volatility and risk forecasts. In general, it can be concluded that the hybrid solutions produce more accurate point volatility forecasts, although it does not necessarily translate into superior VaR and ES forecasts.

forecast, garch model, sstd, (14 more...)

2310.01063

Country:

Europe > Poland > Lesser Poland Province > Kraków (0.05)
Europe > Poland > Łódź Province > Łódź (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.31)

Industry:

Banking & Finance > Trading (1.00)
Materials > Metals & Mining > Gold (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Resolving Knowledge Conflicts in Large Language Models

Wang, Yike, Feng, Shangbin, Wang, Heng, Shi, Weijia, Balachandran, Vidhisha, He, Tianxing, Tsvetkov, Yulia

Large language models (LLMs) often encounter knowledge conflicts, scenarios where discrepancy arises between the internal parametric knowledge of LLMs and non-parametric information provided in the prompt context. In this work we ask what are the desiderata for LLMs when a knowledge conflict arises and whether existing LLMs fulfill them. We posit that LLMs should 1) identify knowledge conflicts, 2) pinpoint conflicting information segments, and 3) provide distinct answers or viewpoints in conflicting scenarios. To this end, we introduce KNOWLEDGE CONFLICT, an evaluation framework for simulating contextual knowledge conflicts and quantitatively evaluating to what extent LLMs achieve these goals. KNOWLEDGE CONFLICT includes diverse and complex situations of knowledge conflict, knowledge from diverse entities and domains, two synthetic conflict creation methods, and settings with progressively increasing difficulty to reflect realistic knowledge conflicts. Extensive experiments with the KNOWLEDGE CONFLICT framework reveal that while LLMs perform well in identifying the existence of knowledge conflicts, they struggle to determine the specific conflicting knowledge and produce a response with distinct answers amidst conflicting information. To address these challenges, we propose new instruction-based approaches that augment LLMs to better achieve the three goals. Further analysis shows that abilities to tackle knowledge conflicts are greatly impacted by factors such as knowledge domain and prompt text, while generating robust responses to knowledge conflict scenarios remains an open research question.

conflict, knowledge, knowledge conflict, (13 more...)

2310.00935

Country:

South America > Brazil (0.04)
Africa (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.33)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Machine LearningOct-2-2023

ChemCrow: Augmenting large-language models with chemistry tools

Bran, Andres M, Cox, Sam, Schilter, Oliver, Baldassari, Carlo, White, Andrew D, Schwaller, Philippe

Over the last decades, excellent computational chemistry tools have been developed. Integrating them into a single platform with enhanced accessibility could help reaching their full potential by overcoming steep learning curves. Recently, large-language models (LLMs) have shown strong performance in tasks across domains, but struggle with chemistry-related problems. Moreover, these models lack access to external knowledge sources, limiting their usefulness in scientific applications. In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design. By integrating 18 expert-designed tools, ChemCrow augments the LLM performance in chemistry, and new capabilities emerge. Our agent autonomously planned and executed the syntheses of an insect repellent, three organocatalysts, and guided the discovery of a novel chromophore. Our evaluation, including both LLM and expert assessments, demonstrates ChemCrow's effectiveness in automating a diverse set of chemical tasks. Surprisingly, we find that GPT-4 as an evaluator cannot distinguish between clearly wrong GPT-4 completions and Chemcrow's performance. Our work not only aids expert chemists and lowers barriers for non-experts, but also fosters scientific advancement by bridging the gap between experimental and computational chemistry.

chemcrow, large language model, machine learning, (18 more...)

arXiv.org Machine Learning

2304.05376

Country:

North America > United States (0.14)
Oceania > Australia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceOct-1-2023

Twin Neural Network Improved k-Nearest Neighbor Regression

Wetzel, Sebastian J.

Twin neural network regression is trained to predict differences between regression targets rather than the targets themselves. A solution to the original regression problem can be obtained by ensembling predicted differences between the targets of an unknown data point and multiple known anchor data points. Choosing the anchors to be the nearest neighbors of the unknown data point leads to a neural network-based improvement of k-nearest neighbor regression. This algorithm is shown to outperform both neural networks and k-nearest neighbor regression on small to medium-sized data sets.

artificial intelligence, machine learning, neighbor, (14 more...)

2310.00664

Country: North America > Canada > Ontario (0.15)

Genre: Research Report (0.64)

Industry:

Materials > Chemicals > Industrial Gases > Liquified Gas (0.68)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.68)
Energy > Oil & Gas > Midstream (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Mendoza, Mijaíl Jaén, Naclerio, Nicholas D., Hawkes, Elliot W.

High-curvature, high-force, vine robot for inspection

arXiv.org Artificial IntelligenceOct-1-2023

Robot performance has advanced considerably both in and out of the factory, however in tightly constrained, unknown environments such as inside a jet engine or the human heart, current robots are less adept. In such cases where a borescope or endoscope can't reach, disassembly or surgery are costly. One promising inspection device inspired by plant growth are "vine robots" that can navigate cluttered environments by extending from their tip. Yet, these vine robots are currently limited in their ability to simultaneously steer into tight curvatures and apply substantial forces to the environment. Here, we propose a plant-inspired method of steering by asymmetrically lengthening one side of the vine robot to enable high curvature and large force application. Our key development is the introduction of an extremely anisotropic, composite, wrinkled film with elastic moduli 400x different in orthogonal directions. The film is used as the vine robot body, oriented such that it can stretch over 120% axially, but only 3% circumferentially. With the addition of controlled layer jamming, this film enables a steering method inspired by plants in which the circumference of the robot is inextensible, but the sides can stretch to allow turns. This steering method and body pressure do not work against each other, allowing the robot to exhibit higher forces and tighter curvatures than previous vine robot architectures. This work advances the abilities of vine robots--and robots more generally--to not only access tightly constrained environments, but perform useful work once accessed.

curvature, jamming structure, robot, (14 more...)

2310.00851

Country: North America > United States > California (0.04)

Genre: Research Report (0.40)

Industry:

Energy (0.48)
Materials (0.47)
Aerospace & Defense (0.34)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)