Materials
Counterfactual Explanations of Neural Network-Generated Response Curves
Morales, Giorgio, Sheppard, John
Response curves exhibit the magnitude of the response of a sensitive system to a varying stimulus. However, the response of such systems may be sensitive to multiple stimuli (i.e., input features) that are not necessarily independent. As a consequence, the shape of response curves generated for a selected input feature (referred to as the "active feature") might depend on the values of the other input features (referred to as "passive features"). In this work, we consider the case of systems whose response is approximated using regression neural networks. We propose to use counterfactual explanations (CFEs) to identify the features with the highest relevance to the shape of response curves generated by neural network black boxes. CFEs are generated by a genetic algorithm-based approach that solves a multi-objective optimization problem. In particular, given a response curve generated for an active feature, a CFE finds the minimum combination of passive features that need to be modified to alter the shape of the response curve. We tested our method on a synthetic dataset with 1-D inputs and two crop yield prediction datasets with 2-D inputs. The relevance ranking of features and feature combinations obtained on the synthetic dataset coincided with the analysis of the equation that was used to generate the problem. Results obtained on the yield prediction datasets revealed that the impact of passive features on fertilizer responsivity depends on the terrain characteristics of each field.
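The dependence of a response curve on passive features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a trained regression network, and `shape_distance` (a comparison of mean-centered curves) is one possible shape measure, not necessarily the one used by the authors.

```python
def toy_model(x):
    """Hypothetical stand-in for a trained regression network."""
    active, passive_1, passive_2 = x
    # passive_1 scales the effect of the active feature; passive_2 only shifts the output.
    return active * passive_1 + passive_2


def response_curve(model, sample, active_idx, grid):
    """Sweep the active feature over `grid` while holding the passive features fixed."""
    curve = []
    for value in grid:
        probe = list(sample)
        probe[active_idx] = value
        curve.append(model(probe))
    return curve


def shape_distance(curve_a, curve_b):
    """Largest gap between mean-centered curves, so a pure vertical shift counts as zero."""
    mean_a = sum(curve_a) / len(curve_a)
    mean_b = sum(curve_b) / len(curve_b)
    return max(abs((a - mean_a) - (b - mean_b)) for a, b in zip(curve_a, curve_b))


grid = [i / 10 for i in range(11)]
original = response_curve(toy_model, [0.0, 1.0, 0.5], 0, grid)
shifted = response_curve(toy_model, [0.0, 1.0, 2.0], 0, grid)  # only passive_2 changed
tilted = response_curve(toy_model, [0.0, 3.0, 0.5], 0, grid)   # only passive_1 changed

print(shape_distance(original, shifted))  # close to zero: the curve moved but kept its shape
print(shape_distance(original, tilted))   # clearly non-zero: the slope changed
```

In this toy setting, a counterfactual that alters the curve's shape has to modify `passive_1`; changing `passive_2` alone does not, which is the kind of relevance ranking the abstract describes.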
Quantifying and Explaining Machine Learning Uncertainty in Predictive Process Monitoring: An Operations Research Perspective
Mehdiyev, Nijat, Majlatow, Maxim, Fettke, Peter
In today's highly competitive and complex business environment, organizations are under constant pressure to optimize their performance and decision-making processes. According to Herbert Simon, enhancing organizational performance relies on effectively channeling finite human attention towards critical data for decision-making, necessitating the integration of information systems (IS), artificial intelligence (AI) and operations research (OR) insights [1]. Recent OR research provides evidence in support of this proposition, as the discipline has witnessed a transformation due to the abundant availability of rich and voluminous data from various sources coupled with advances in machine learning [2]. As of late, heightened academic attention has been devoted to prescriptive analytics, a discipline that suggests combining the results of predictive analytics with optimization techniques in a probabilistic framework to generate responsive, automated, restricted, time-sensitive, and ideal decisions [3]. The confluence of AI and OR is evident due to their interdependent and complementary nature, as both disciplines strive to augment decision-making processes through computational and mathematical methodologies [4].
A Palm-Shape Variable-Stiffness Gripper based on 3D-Printed Fabric Jamming
Soft grippers have excellent adaptability for a variety of objects and tasks. Jamming-based variable stiffness materials can further increase soft grippers' gripping force and capacity. Previous universal grippers enabled by granular jamming have shown great capability of handling objects with various shapes and weights. However, they require a large pushing force on the object during gripping, which is not suitable for very soft or free-hanging objects. In this paper, we create a novel palm-shape anthropomorphic variable-stiffness gripper enabled by jamming of 3D printed fabrics. This gripper is conformable and gentle to objects with different shapes, requires little pushing force, and increases gripping strength only when necessary. We present the design, fabrication and performance of this gripper and test its conformability and gripping capacity. Our design utilizes soft pneumatic actuators to drive two wide palms to enclose objects, thanks to the excellent conformability of the structured fabrics. While the pinch force is low, the palm can significantly increase stiffness to lift heavy objects with a maximum gripping force of $17\,$N and grip-to-pinch force ratio of $42$. We also explore different variable-stiffness materials in the gripper, including sheets for layer jamming, to compare their performances. We conduct gripping tests on standard objects and daily items to show the great capacity of our gripper design.
Large Language Models as Master Key: Unlocking the Secrets of Materials Science with GPT
Xie, Tong, Wan, Yuwei, Huang, Wei, Zhou, Yufei, Liu, Yixuan, Linghu, Qingyuan, Wang, Shaozhou, Kit, Chunyu, Grazian, Clara, Zhang, Wenjie, Hoex, Bram
The amount of data has growing significance in exploring cutting-edge materials and a number of datasets have been generated either by hand or automated approaches. However, the materials science field struggles to effectively utilize the abundance of data, especially in applied disciplines where materials are evaluated based on device performance rather than their properties. This article presents a new natural language processing (NLP) task called structured information inference (SII) to address the complexities of information extraction at the device level in materials science. We accomplished this task by tuning GPT-3 on an existing perovskite solar cell FAIR (Findable, Accessible, Interoperable, Reusable) dataset with 91.8% F1-score and extended the dataset with data published since its release. The produced data is formatted and normalized, enabling its direct utilization as input in subsequent data analysis. This feature empowers materials scientists to develop models by selecting high-quality review articles within their domain. Additionally, we designed experiments to predict the electrical performance of solar cells and design materials or devices with targeted parameters using large language models (LLMs). Our results demonstrate comparable performance to traditional machine learning methods without feature selection, highlighting the potential of LLMs to acquire scientific knowledge and design new materials akin to materials scientists.
Emergent autonomous scientific research capabilities of large language models
Boiko, Daniil A., MacKnight, Robert, Gomes, Gabe
Transformer-based large language models are rapidly advancing in the field of machine learning research, with applications spanning natural language, biology, chemistry, and computer programming. Extreme scaling and reinforcement learning from human feedback have significantly improved the quality of generated text, enabling these models to perform various tasks and reason about their choices. In this paper, we present an Intelligent Agent system that combines multiple large language models for autonomous design, planning, and execution of scientific experiments. We showcase the Agent's scientific research capabilities with three distinct examples, with the most complex being the successful performance of catalyzed cross-coupling reactions. Finally, we discuss the safety implications of such systems and propose measures to prevent their misuse.
A Survey of Resources and Methods for Natural Language Processing of Serbian Language
Marovac, Ulfeta A., Avdić, Aldina R., Milošević, Nikola Lj.
The Serbian language is a Slavic language spoken by over 12 million speakers and well understood by over 15 million people. In the area of natural language processing, it can be considered a low-resourced language. Serbian is also considered a highly inflected language. The combination of many word inflections and low availability of language resources makes natural language processing of Serbian challenging. Nevertheless, over the past three decades, there have been a number of initiatives to develop resources and methods for natural language processing of Serbian, ranging from corpora of free text from books and the internet and annotated corpora for classification and named entity recognition tasks to various methods and models performing these tasks. In this paper, we review the initiatives, resources, methods, and their availability.
A Predictive Model using Machine Learning Algorithm in Identifying Students Probability on Passing Semestral Course
This study aims to build a predictive model that estimates students' probability of passing their courses at the earliest stage of the semester. To obtain a model with high acceptability, accuracy, and precision that delivers useful outcomes for decision making in education systems, improves the processes of conveying knowledge, and uplifts students' academic performance, the proponent applied and strictly followed the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology. The study uses classification as the data mining technique and a decision tree as the algorithm. With the resulting model, the prediction of students' probability of passing their current courses achieves 0.7619 accuracy, 0.8333 precision, 0.8823 recall, and 0.8571 F1 score, which shows that the model is reliable, accurate, and recommendable. Considering the indicators and the results, the prediction model used in this study is highly acceptable. Data mining techniques provide effective and efficient tools for analyzing and predicting student performance. The model will greatly affect the way educators understand and identify the weaknesses of their students, the way they improve the effectiveness of learning processes geared to their students and bring down academic failure rates, and the way institution administrators modify their learning system outcomes. Further study is highly needed on including students' demographic information, using a larger dataset, and providing automated and manual predictive criteria indicators that show students, as early as the midterm period, which criteria they must improve to pass their courses by the end of the semester.
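As a quick consistency check on the reported metrics, the F1 score is the harmonic mean of precision and recall. The sketch below uses only the precision and recall values quoted in the abstract; the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


reported_precision = 0.8333
reported_recall = 0.8823

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 4))  # 0.8571, matching the reported F1 score
```

The reported F1 score is therefore consistent with the reported precision and recall.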
Bayesian Optimization of Catalysts With In-context Learning
Ramos, Mayk Caldas, Michtavy, Shane S., Porosoff, Marc D., White, Andrew D.
Large language models (LLMs) are able to do accurate classification with zero or only a few examples (in-context learning). We show a prompting system that enables regression with uncertainty for in-context learning with frozen LLM (GPT-3, GPT-3.5, and GPT-4) models, allowing predictions without features or architecture tuning. By incorporating uncertainty, our approach enables Bayesian optimization for catalyst or molecule optimization using natural language, eliminating the need for training or simulation. Here, we performed the optimization using the synthesis procedure of catalysts to predict properties. Working with natural language mitigates synthesizability difficulties, since the literal synthesis procedure is the model's input. We showed that in-context learning could improve past a model's context window (the maximum number of tokens the model can process at once) as data is gathered via example selection, allowing the model to scale better. Although our method does not outperform all baselines, it requires zero training, no feature selection, and minimal computing while maintaining satisfactory performance. We also find that Gaussian Process Regression on text embeddings is strong at Bayesian optimization. The code is available in our GitHub repository: https://github.com/ur-whitelab/BO-LIFT
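To illustrate why a prediction with uncertainty is what makes Bayesian optimization possible, the following minimal sketch ranks candidates with an upper confidence bound. The acquisition function, the candidate names, and the numbers are all assumptions made for illustration; the abstract does not state which acquisition function is used, and this is not the BO-LIFT API.

```python
def upper_confidence_bound(mean, std, beta=1.0):
    """Score a candidate by its predicted value plus a bonus for uncertainty."""
    return mean + beta * std


def select_next(predictions, beta=1.0):
    """Pick the candidate with the highest acquisition score.

    `predictions` maps a candidate description to a (mean, std) pair.
    """
    return max(
        predictions,
        key=lambda name: upper_confidence_bound(*predictions[name], beta=beta),
    )


# Hypothetical (mean, std) predictions for three synthesis procedures.
predictions = {
    "procedure A": (0.60, 0.05),
    "procedure B": (0.55, 0.20),
    "procedure C": (0.40, 0.10),
}

print(select_next(predictions, beta=1.0))  # procedure B: lower mean, but more uncertain
print(select_next(predictions, beta=0.0))  # procedure A: pure exploitation of the mean
```

With `beta=0` the uncertainty is ignored and the search only exploits the current best guess; a positive `beta` lets an uncertain candidate be tried, which is the exploration behavior that a point prediction alone cannot provide.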
A Self-attention Knowledge Domain Adaptation Network for Commercial Lithium-ion Batteries State-of-health Estimation under Shallow Cycles
Chen, Xin, Qin, Yuwen, Zhao, Weidong, Yang, Qiming, Cai, Ningbo, Wu, Kai
Accurate state-of-health (SOH) estimation is critical to guarantee the safety, efficiency and reliability of battery-powered applications. Most SOH estimation methods focus on the 0-100% full state-of-charge (SOC) range that has similar distributions. However, the batteries in real-world applications usually work in the partial SOC range under shallow-cycle conditions and follow different degradation profiles with no labeled data available, thus making SOH estimation challenging. To estimate shallow-cycle battery SOH, a novel unsupervised deep transfer learning method is proposed to bridge different domains using a self-attention distillation module and a multi-kernel maximum mean discrepancy technique. The proposed method automatically extracts domain-variant features from charge curves to transfer knowledge from the large-scale labeled full cycles to the unlabeled shallow cycles. The CALCE and SNL battery datasets are employed to verify the effectiveness of the proposed method to estimate the battery SOH for different SOC ranges, temperatures, and discharge rates. The proposed method achieves a root-mean-square error within 2% and outperforms other transfer learning methods for different SOC ranges. When applied to batteries with different operating conditions and from different manufacturers, the proposed method still exhibits superior SOH estimation performance. The proposed method is the first attempt at accurately estimating battery SOH under shallow-cycle conditions without needing a full-cycle characteristic test.
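The multi-kernel maximum mean discrepancy mentioned above measures how far apart two feature distributions are, which is what a domain adaptation loss tries to reduce. The sketch below is a minimal pure-Python version under stated assumptions: Gaussian kernels with equal weights, hypothetical bandwidths, and the biased (V-statistic) estimator. The abstract does not specify the authors' kernel weights, bandwidths, or estimator.

```python
import math


def multi_kernel(x, y, bandwidths):
    """Average of Gaussian kernels with several bandwidths (equal weights assumed)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return sum(math.exp(-sq_dist / (2 * s ** 2)) for s in bandwidths) / len(bandwidths)


def mk_mmd_squared(source, target, bandwidths=(0.5, 1.0, 2.0)):
    """Biased estimate of the squared MMD between two sets of feature vectors."""

    def mean_kernel(xs, ys):
        total = sum(multi_kernel(x, y, bandwidths) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2 * mean_kernel(source, target)
    )


# Hypothetical 2-D feature vectors standing in for features extracted from charge curves.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near = [[x + 0.1, y + 0.1] for x, y in source]
far = [[x + 3.0, y + 3.0] for x, y in source]

print(mk_mmd_squared(source, source))  # zero for identical sets
print(mk_mmd_squared(source, near))    # small for a slight shift
print(mk_mmd_squared(source, far))     # larger for a large shift
```

Using several bandwidths avoids committing to a single length scale, so the discrepancy stays informative whether the two domains differ slightly or substantially.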
Incorporating Structured Sentences with Time-enhanced BERT for Fully-inductive Temporal Relation Prediction
Chen, Zhongwu, Xu, Chengjin, Su, Fenglong, Huang, Zhen, Dou, Yong
Temporal relation prediction in incomplete temporal knowledge graphs (TKGs) is a popular temporal knowledge graph completion (TKGC) problem in both transductive and inductive settings. Traditional embedding-based TKGC models (TKGE) rely on structured connections and can only handle a fixed set of entities, i.e., the transductive setting. In the inductive setting, where test TKGs contain emerging entities, the latest methods are based on symbolic rules or pre-trained language models (PLMs). However, they suffer from being inflexible and not time-specific, respectively. In this work, we extend the fully-inductive setting, where entities in the training and test sets are totally disjoint, to TKGs and take a further step towards a more flexible and time-sensitive temporal relation prediction approach, SST-BERT, which incorporates Structured Sentences with Time-enhanced BERT. Our model can obtain the entity history and implicitly learn rules in the semantic space by encoding structured sentences, solving the problem of inflexibility. We propose to use a time masking MLM task to pre-train BERT on a corpus rich in temporal tokens specially generated for TKGs, enhancing the time sensitivity of SST-BERT. To compute the probability of occurrence of a target quadruple, we aggregate all its structured sentences from both temporal and semantic perspectives into a score. Experiments on the transductive datasets and newly generated fully-inductive benchmarks show that SST-BERT successfully improves over state-of-the-art baselines.