 Materials


Symbol tuning improves in-context learning in language models

arXiv.org Artificial Intelligence

We present symbol tuning - finetuning language models on in-context input-label pairs where natural language labels (e.g., "positive/negative sentiment") are replaced with arbitrary symbols (e.g., "foo/bar"). Symbol tuning leverages the intuition that when a model cannot use instructions or natural language labels to figure out a task, it must instead do so by learning the input-label mappings. We experiment with symbol tuning across Flan-PaLM models up to 540B parameters and observe benefits across various settings. First, symbol tuning boosts performance on unseen in-context learning tasks and is much more robust to underspecified prompts, such as those without instructions or without natural language labels. Second, symbol-tuned models are much stronger at algorithmic reasoning tasks, with up to 18.2% better performance on the List Functions benchmark and up to 15.3% better performance on the Simple Turing Concepts benchmark. Finally, symbol-tuned models show large improvements in following flipped-labels presented in-context, meaning that they are more capable of using in-context information to override prior semantic knowledge.
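As a rough illustration of the data transformation described above, the sketch below replaces natural language labels with arbitrary symbols and formats the result as an in-context prompt with no instruction. The function names, the "Input:/Output:" template, and the symbol pool are assumptions made for this example; they are not taken from the paper.

```python
import random


def symbolize(examples, symbols, seed=0):
    """Replace each natural language label with an arbitrary symbol.

    examples: list of (text, label) pairs.
    symbols: pool of arbitrary strings (hypothetical, e.g. "foo", "bar").
    Returns the relabeled pairs and the label-to-symbol mapping.
    """
    rng = random.Random(seed)
    labels = sorted({label for _, label in examples})
    if len(symbols) < len(labels):
        raise ValueError("need at least one symbol per distinct label")
    # Draw distinct symbols so the mapping stays one-to-one.
    chosen = rng.sample(symbols, len(labels))
    mapping = dict(zip(labels, chosen))
    return [(text, mapping[label]) for text, label in examples], mapping


def build_prompt(pairs, query):
    """Format input-symbol pairs as in-context exemplars, without an instruction."""
    blocks = [f"Input: {text}\nOutput: {symbol}" for text, symbol in pairs]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = [("great film", "positive"), ("dull plot", "negative")]
    relabeled, _ = symbolize(demo, ["foo", "bar", "baz"])
    print(build_prompt(relabeled, "fine acting"))
```

Because the prompt carries neither an instruction nor a meaningful label, the only way to answer the query is to infer the input-symbol mapping from the exemplars, which is the intuition the abstract describes.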


Accelerating Process Development for 3D Printing of New Metal Alloys

arXiv.org Artificial Intelligence

Additive manufacturing (AM) can be considered one of the pillars of the fourth industrial revolution. The industry has the potential to play a major role in innovation processes and in the US and global economy (1). Metal AM is becoming essential in many industries, including healthcare, aerospace, and defense, due to the benefits of lead time reduction, enhanced production efficiency, part consolidation, and design freedom. Laser powder bed fusion (L-PBF) is the most widely used technology for printing metal alloys. The technology uses a high-power laser as an energy source to melt and fuse powder at specific locations to form the required shapes. A recoater then spreads a new layer of powder, and the process repeats until the 3D object is formed. Variability is the main obstacle to reliable printed-part quality and thus to the technology's potential for full production. The mechanical properties and dimensional accuracy of printed parts vary depending on the powder and machine used, the scanning strategy, and the printing conditions (2-4).


Embedded feature selection in LSTM networks with multi-objective evolutionary ensemble learning for time series forecasting

arXiv.org Artificial Intelligence

Time series forecasting plays a crucial role in diverse fields, necessitating the development of robust models that can effectively handle complex temporal patterns. In this article, we present a novel feature selection method embedded in Long Short-Term Memory networks, leveraging a multi-objective evolutionary algorithm. Our approach optimizes the weights and biases of the LSTM in a partitioned manner, with each objective function of the evolutionary algorithm targeting the root mean square error in a specific data partition. The set of non-dominated forecast models identified by the algorithm is then utilized to construct a meta-model through stacking-based ensemble learning. Furthermore, our proposed method provides an avenue for attribute importance determination, as the frequency of selection for each attribute in the set of non-dominated forecasting models reflects their significance. This attribute importance insight adds an interpretable dimension to the forecasting process. Experimental evaluations on air quality time series data from Italy and southeast Spain demonstrate that our method substantially improves the generalization ability of conventional LSTMs, effectively reducing overfitting. Comparative analyses against state-of-the-art CancelOut and EAR-FS methods highlight the superior performance of our approach.
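The attribute-importance idea above can be sketched in a few lines: keep the models whose per-partition errors are not dominated by any other model, then count how often each attribute is selected among them. The dictionary layout (`"rmse"`, `"selected"`) and the function names are hypothetical, chosen for this example only; the evolutionary search and the LSTM training themselves are not shown.

```python
from collections import Counter


def dominates(a, b):
    """True if error vector a is no worse than b in every partition
    and strictly better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(models):
    """Keep the models whose RMSE vector no other model dominates."""
    return [
        m for m in models
        if not any(dominates(o["rmse"], m["rmse"]) for o in models if o is not m)
    ]


def attribute_importance(front, n_attributes):
    """Fraction of non-dominated models that select each attribute."""
    if not front:
        return [0.0] * n_attributes
    counts = Counter(i for m in front for i in m["selected"])
    return [counts[i] / len(front) for i in range(n_attributes)]


if __name__ == "__main__":
    candidates = [
        {"rmse": (0.5, 0.9), "selected": {0, 1}},
        {"rmse": (0.8, 0.4), "selected": {1}},
        {"rmse": (0.9, 0.95), "selected": {2}},  # dominated by the first model
    ]
    print(attribute_importance(non_dominated(candidates), n_attributes=3))
```

An attribute chosen by every model on the front scores 1.0, and one that appears only in dominated models scores 0.0.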


Russia accuses US of threatening global energy security

Al Jazeera

Russia has claimed that US sanctions levied against the Arctic LNG 2 project undermine global energy security. The Russian foreign ministry's spokeswoman hit out on Wednesday at Washington's "unacceptable" move to clamp down on the massive Arctic LNG 2. The sanctions are just the latest measure implemented as the West seeks to limit Moscow's financial ability to wage war in Ukraine. The remarks came after Washington announced sanctions last month against the new liquefied natural gas plant, which is under development on the Gydan Peninsula in the Arctic. "We consider such actions unacceptable, especially in relation to such large international commercial projects as Arctic LNG 2, which affect the energy balance of many states," said foreign ministry spokesperson Maria Zakharova. "The situation around Arctic LNG 2 once again confirms the destructive role for global economic security played by Washington, which speaks of the need to maintain this security but in fact, by pursuing its own selfish interests, tries to oust competitors and destroy global energy security."


Sorting of Smartphone Components for Recycling Through Convolutional Neural Networks

arXiv.org Artificial Intelligence

In a report released by the United Nations University (UNU) in 2020, the global generation of waste electrical and electronic equipment (WEEE) was estimated at 53.6 million tons annually, or 7.3 kg per capita, with WEEE being the fastest-growing solid waste stream in recent years (from 9.2 million tons in 2014 to a projected 74.7 million tons annually by 2030) [1]. WEEE generation is also marked by a high degree of informality in end-of-life management: only 17.4% is properly documented and disposed of through formal means, primarily because of the technological challenges in collection and recycling faced by the actors involved in this process [1]. Given this scenario, the report emphasizes that recycling is a fundamental strategy for minimizing the environmental and societal impacts of WEEE generation, as it is an essential component of the 2030 Agenda for Sustainable Development under the following United Nations Sustainable Development Goals: Goal 3 (Good Health and Well-being), Goal 6 (Clean Water and Sanitation), Goal 8 (Decent Work and Economic Growth), Goal 11 (Sustainable Cities and Communities), Goal 12 (Responsible Consumption and Production), and Goal 14 (Life Below Water). Over the past decade, scientific efforts have concentrated on finding recycling solutions for WEEE. Typically, methods established in the metallurgical industry are adapted for WEEE processing. This is the case for Umicore, a company considered a global benchmark in the field, whose processes are based on copper and lead metallurgy, adding only 15% of WEEE to the primary ores and recovering only the most precious metals, such as gold and silver [2, 3].


What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

arXiv.org Artificial Intelligence

Large Language Models (LLMs) with strong abilities in natural language processing tasks have emerged and have been applied in various kinds of areas such as science, finance and software engineering. However, the capability of LLMs to advance the field of chemistry remains unclear. In this paper, rather than pursuing state-of-the-art performance, we aim to evaluate capabilities of LLMs in a wide range of tasks across the chemistry domain. We identify three key chemistry-related capabilities including understanding, reasoning and explaining to explore in LLMs and establish a benchmark containing eight chemistry tasks. Our analysis draws on widely recognized datasets facilitating a broad exploration of the capacities of LLMs within the context of practical chemistry. Five LLMs (GPT-4, GPT-3.5, Davinci-003, Llama and Galactica) are evaluated for each chemistry task in zero-shot and few-shot in-context learning settings with carefully selected demonstration examples and specially crafted prompts. Our investigation found that GPT-4 outperformed other models and LLMs exhibit different competitive levels in eight chemistry tasks. In addition to the key findings from the comprehensive benchmark analysis, our work provides insights into the limitation of current LLMs and the impact of in-context learning settings on LLMs' performance across various chemistry tasks. The code and datasets used in this study are available at https://github.com/ChemFoundationModels/ChemLLMBench.


Task Contamination: Language Models May Not Be Few-Shot Anymore

arXiv.org Artificial Intelligence

Large language models (LLMs) offer impressive performance in various zero-shot and few-shot tasks. However, their success in zero-shot and few-shot settings may be affected by task contamination, a potential limitation that has not been thoroughly examined. This paper investigates how the zero-shot and few-shot performance of LLMs has changed over time. Utilizing GPT-3 series models and several other recent open-sourced LLMs, and controlling for dataset difficulty, we find that LLMs perform surprisingly better on datasets released before the LLM training data creation date than on datasets released after it. This strongly indicates that, for many LLMs, there exists task contamination on zero-shot and few-shot evaluation for datasets released prior to the LLMs' training data creation date. Additionally, we utilize training data inspection, task example extraction, and a membership inference attack, which reveal further evidence of task contamination. Importantly, we find that for classification tasks with no possibility of task contamination, LLMs rarely demonstrate statistically significant improvements over simple majority baselines, in both zero and few-shot settings.
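For readers unfamiliar with the comparison in the last sentence, the sketch below computes a majority-class baseline and checks whether a model's accuracy exceeds it using a one-sided exact binomial test. The abstract does not say which significance test the authors used, so the binomial test here is an assumption, and the function names are hypothetical.

```python
from collections import Counter
from math import comb


def majority_baseline_accuracy(reference_labels, test_labels):
    """Accuracy obtained by always predicting the most frequent reference label."""
    majority = Counter(reference_labels).most_common(1)[0][0]
    return sum(y == majority for y in test_labels) / len(test_labels)


def binomial_p_value(n_correct, n_total, baseline_acc):
    """One-sided exact binomial tail: probability of at least n_correct
    successes in n_total trials if the true accuracy were baseline_acc."""
    return sum(
        comb(n_total, k) * baseline_acc**k * (1 - baseline_acc) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )


if __name__ == "__main__":
    base = majority_baseline_accuracy(["a", "a", "b"], ["a", "b", "a", "a"])
    # Hypothetical model result: 4 of 4 test items correct.
    print(base, binomial_p_value(4, 4, base))
```

A small p-value would suggest the model's accuracy is unlikely under the baseline rate; with few test items, even a perfect score may not reach significance.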


Data-driven decision-focused surrogate modeling

arXiv.org Artificial Intelligence

We introduce the concept of decision-focused surrogate modeling for solving computationally challenging nonlinear optimization problems in real-time settings. The proposed data-driven framework seeks to learn a simpler, e.g. convex, surrogate optimization model that is trained to minimize the decision prediction error, which is defined as the difference between the optimal solutions of the original and the surrogate optimization models. The learning problem, formulated as a bilevel program, can be viewed as a data-driven inverse optimization problem to which we apply a decomposition-based solution algorithm from previous work. We validate our framework through numerical experiments involving the optimization of common nonlinear chemical processes such as chemical reactors, heat exchanger networks, and material blending systems. We also present a detailed comparison of decision-focused surrogate modeling with standard data-driven surrogate modeling methods and demonstrate that our approach is significantly more data-efficient while producing simple surrogate models with high decision prediction accuracy.
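One way to write down the bilevel learning problem described above is sketched here. The notation is assumed for illustration and is not taken from the paper: u_i denotes the input parameters of the i-th of N data points, x_i^* the corresponding optimal solution of the original model, \hat{f} and \hat{g} the surrogate objective and constraints with learnable parameters \theta, and the norm is left unspecified.

```latex
\begin{aligned}
\min_{\theta} \quad & \frac{1}{N} \sum_{i=1}^{N} \bigl\| x_i^{*} - \hat{x}_i \bigr\| \\
\text{s.t.} \quad & \hat{x}_i \in \arg\min_{x} \bigl\{ \hat{f}(x; u_i, \theta) \;:\; \hat{g}(x; u_i, \theta) \le 0 \bigr\},
\qquad i = 1, \dots, N
\end{aligned}
```

The outer problem minimizes the decision prediction error, while each inner problem is the surrogate optimization model itself. Choosing \hat{f} and \hat{g} to be convex is what would make the inner problem simpler than the original.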


Predicting Confinement Effect of Carbon Fiber Reinforced Polymers on Strength of Concrete using Metaheuristics-based Artificial Neural Networks

arXiv.org Artificial Intelligence

Keywords: carbon fiber reinforced polymer, concrete, confinement effect, strength, particle swarm optimization, grey wolf optimizer, bat algorithm

Abstract

This article deals with predicting the confinement effect of carbon fiber reinforced polymers (CFRPs) on concrete cylinder strength using metaheuristics-based artificial neural networks. Three metaheuristic algorithms are implemented: particle swarm optimization (PSO), grey wolf optimizer (GWO), and bat algorithm (BA). These algorithms are trained on the data using mean square error as the objective function, and their predicted results are validated against experimental studies and finite element analysis. The study shows that the hybrid PSO model predicted the strength of CFRP-confined concrete cylinders with a maximum accuracy of 99.13%, and GWO predicted the results with an accuracy of 98.17%. The high accuracy of the axial compressive strength predictions demonstrates that these prediction models are a reliable alternative to empirical methods. The prediction models are especially suitable for avoiding full-scale, time-consuming experimental tests, which makes the process quick and economical.

1 Introduction

Fiber-reinforced polymer (FRP) is a composite material comprising fibers of either glass, aramid, or carbon and a polymer matrix. These fibers improve the mechanical properties of the polymer matrix, including its stiffness and strength. The popularity of these composites has increased significantly in civil engineering due to their ability to strengthen concrete structural members. FRPs can be used as bars or plates embedded in concrete as internal reinforcement, or as external reinforcement by wrapping FRP sheets around existing structural members. FRP bars have significantly higher strength than steel reinforcement bars. They are highly durable and resistant to chemicals, corrosion (Cousin et al. 2019, Ananthkumar et al. 2020, Zhang et al. 2020), and radiation. Their higher strength-to-weight ratio (Zhou et al. 2019) makes them ideal for structures that require high strength but must not be heavy. They can be molded into any required shape, which provides higher design flexibility. Moreover, they have a lower environmental impact (Lee and Jain 2009), unlike concrete and timber.


3D Programming of Patterned Heterogeneous Interface for 4D Smart Robotics

arXiv.org Artificial Intelligence

Shape memory structures are playing an important role in many cutting-edge intelligent fields. However, the existing technologies can only realize 4D printing of a single polymer or metal, which limits practical applications. Here, we report a construction strategy for a TSMP/M heterointerface, which uses a Pd2+-containing shape memory polymer (AP-SMR) to induce an electroless plating reaction and relies on molecular dynamics. The resulting interface has shape memory properties, metal activity and information processing power. Through multi-material DLP 3D printing technology, the interface can be selectively programmed in 3D on functional substrate parts of arbitrary shapes to become 4D electronic smart devices (Robotics). Microscopically, this type of interface appears as a composite structure with a nanometer-micrometer interface height, which is composed of a pure substrate layer (smart materials), an intermediate layer (a composite structure in which metal particles are embedded in a polymer cross-linked network) and a pure metal layer. The structure programmed by the TSMP/M heterointerface exhibits both SMA characteristics and metal properties, thus having more intelligent functions (electroactive, electrothermal deformation, electronically controlled denaturation) and higher performance (selective control, remote control, inline control and low-voltage control of shape memory structures can be realized). This is expected to provide a more flexible manufacturing process as a platform technology for designing, manufacturing and applying smart devices with new concepts, and to promote the development of cutting-edge industries such as smart robots and smart electronics.