Materials
DALLE-2 is Seeing Double: Flaws in Word-to-Concept Mapping in Text2Image Models
Rassin, Royi, Ravfogel, Shauli, Goldberg, Yoav
We study the way DALLE-2 maps symbols (words) in the prompt to their references (entities or properties of entities in the generated image). We show that, in stark contrast to the way humans process language, DALLE-2 does not follow the constraint that each word has a single role in the interpretation, and sometimes re-uses the same symbol for different purposes. We collect a set of stimuli that reflect the phenomenon: we show that DALLE-2 depicts both senses of nouns with multiple senses at once; and that a given word can modify the properties of two distinct entities in the image, or can be depicted as one object and also modify the properties of another object, creating a semantic leakage of properties between entities. Taken together, our study highlights the differences between DALLE-2 and human language processing and opens an avenue for future study on the inductive biases of text-to-image models.
Self-learning locally-optimal hypertuning using maximum entropy, and comparison of machine learning approaches for estimating fatigue life in composite materials
Ben-Yelun, Ismael, Diaz-Lago, Miguel, Saucedo-Mora, Luis, Sanz, Miguel Angel, Callado, Ricardo, Montans, Francisco Javier
Applications of Structural Health Monitoring (SHM) combined with Machine Learning (ML) techniques enhance real-time performance tracking and increase structural integrity awareness of civil, aerospace and automotive infrastructures. This SHM-ML synergy has gained popularity in recent years thanks to the anticipation of maintenance provided by emerging ML algorithms and their ability to handle large quantities of data and account for their influence on the problem. In this paper we develop a novel nearest-neighbors-like ML algorithm based on the principle of maximum entropy to predict fatigue damage (Palmgren-Miner index) in composite materials by processing the signals of Lamb Waves -- a non-destructive SHM technique -- with other meaningful features such as layup parameters and stiffness matrices calculated from the Classical Laminate Theory (CLT). The full data analysis cycle is applied to a dataset of delamination experiments in composites. The predictions achieve a good level of accuracy, similar to other ML algorithms, e.g. Neural Networks or Gradient-Boosted Trees, and computation times are of the same order of magnitude. The key advantages of our proposal are: (1) The automatic determination of all the parameters involved in the prediction, so no hyperparameters have to be set beforehand, which saves time devoted to hypertuning the model and also represents an advantage for autonomous, self-supervised SHM. (2) No training is required, which, in an online learning context where streams of data are fed continuously to the model, avoids repeated training -- essential for reliable real-time, continuous monitoring.
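The Palmgren-Miner index named in this abstract is the standard linear cumulative-damage sum, D = sum(n_i / N_i), where n_i is the number of cycles applied at stress level i and N_i is the number of cycles to failure at that level; failure is conventionally predicted when D reaches 1. The sketch below only illustrates that definition. The function name and the load blocks are made up for illustration and are not taken from the paper or its dataset.

```python
def miner_damage(cycles_applied, cycles_to_failure):
    """Palmgren-Miner cumulative damage index D = sum(n_i / N_i)."""
    if len(cycles_applied) != len(cycles_to_failure):
        raise ValueError("need exactly one N_i for every n_i")
    return sum(n / N for n, N in zip(cycles_applied, cycles_to_failure))


# Hypothetical load history: cycles applied at three stress levels,
# and the cycles to failure at each of those levels.
applied = [20_000, 5_000, 1_000]
to_failure = [100_000, 20_000, 10_000]

damage = miner_damage(applied, to_failure)  # 0.2 + 0.25 + 0.1
print(f"Miner damage index: {damage:.2f}")
```

A value below 1 is read as remaining fatigue life under the linear-accumulation assumption; this is the quantity the abstract's model is said to predict.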
Adaptive Neural Network Ensemble Using Frequency Distribution
Neural network (NN) ensembles can reduce the large prediction variance of individual NNs and improve prediction accuracy. For highly nonlinear problems with insufficient data, the prediction accuracy of NN models becomes unstable, resulting in a decrease in the accuracy of ensembles. Therefore, this study proposes a frequency distribution-based ensemble that identifies core prediction values, which are expected to be concentrated near the true prediction value. The frequency distribution-based ensemble classifies core prediction values supported by multiple prediction values by conducting statistical analysis with a frequency distribution, which is based on various prediction values obtained from a given prediction point. The frequency distribution-based ensemble can improve predictive performance by excluding prediction values with low accuracy and coping with the uncertainty of the most frequent value. An adaptive sampling strategy that sequentially adds samples based on the core prediction variance calculated as the variance of the core prediction values is proposed to improve the predictive performance of the frequency distribution-based ensemble efficiently. Results of various case studies show that the prediction accuracy of the frequency distribution-based ensemble is higher than that of Kriging and other existing ensemble methods. In addition, the proposed adaptive sampling strategy effectively improves the predictive performance of the frequency distribution-based ensemble compared with the previously developed space-filling and prediction variance-based strategies.
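One possible reading of the core-value idea in this abstract is sketched below: the predictions of several networks at a single point are binned into a histogram, the most populated bin is taken as the set of core prediction values, and the ensemble output and core prediction variance are the mean and variance of that set. The number of bins, the tie-breaking rule, and the function name `core_prediction` are assumptions made here for illustration; the paper's actual statistical procedure may differ.

```python
from statistics import mean, pvariance


def core_prediction(predictions, n_bins=5):
    """Return (ensemble value, core variance, core values) for one point."""
    lo, hi = min(predictions), max(predictions)
    if hi == lo:  # all models agree exactly
        return lo, 0.0, list(predictions)
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for p in predictions:
        # Clamp so that the maximum value lands in the last bin.
        idx = min(int((p - lo) / width), n_bins - 1)
        bins[idx].append(p)
    core = max(bins, key=len)  # on ties, the lowest-valued bin wins (assumption)
    return mean(core), pvariance(core), core


# Hypothetical outputs of seven networks at one prediction point;
# two of them are far from the rest.
preds = [1.02, 0.98, 1.01, 0.99, 1.00, 3.50, -2.00]
value, variance, core = core_prediction(preds)
print(value, variance, sorted(core))
```

In an adaptive sampling loop of the kind the abstract describes, the next sample would be added where this core variance is largest; that loop is not shown here.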
Machine-Learning-Optimized Perovskite Nanoplatelet Synthesis
Lampe, Carola, Kouroudis, Ioannis, Harth, Milan, Martin, Stefan, Gagliardi, Alessio, Urban, Alexander S.
With the demand for renewable energy and efficient devices rapidly increasing, a need arises to find and optimize novel (nano)materials. This can be an extremely tedious process, often relying significantly on trial and error. Machine learning has emerged recently as a powerful alternative; however, most approaches require a substantial amount of data points, i.e., syntheses. Here, we merge three machine-learning models with Bayesian Optimization and are able to dramatically improve the quality of CsPbBr3 nanoplatelets (NPLs) using only approximately 200 total syntheses. The algorithm can predict the resulting PL emission maxima of the NPL dispersions based on the precursor ratios, which lead to previously unobtainable 7 and 8 ML NPLs. Aided by heuristic knowledge, the algorithm should be easily applicable to other nanocrystal syntheses and significantly help to identify interesting compositions and rapidly improve their quality.
EU sanctions Iran for human rights abuses after 22-year-old woman dies in custody of so-called morality police
Petrochemical workers strike as demonstrations continue across Iran in defiance of the regime. The European Union sanctioned Iran on Monday for the death of a 22-year-old woman while in custody of the regime's so-called morality police and the subsequent violent crackdown on protests. Numerous Iranian law enforcement officials were added to the sanctions list, including two leaders of the morality police, Mohammad Rostami and Hajahmad Mirzaei. Iran's Minister of Information and Communications Technology, Issa Zarepour, was also sanctioned for his role in censoring the internet and social media during widespread protests over the death of Mahsa Amini. Mahsa Amini, a 22-year-old Iranian woman, was reportedly murdered by Iran's morality police.
Tiny particles work together to do big things
MIT chemical engineers have shown that specialized particles can oscillate together, demonstrating a phenomenon known as emergent behavior. Taking advantage of a phenomenon known as emergent behavior in the microscale, MIT engineers have designed simple microparticles that can collectively generate complex behavior, much the same way that a colony of ants can dig tunnels or collect food. Working together, the microparticles can generate a beating clock that oscillates at a very low frequency. These oscillations can then be harnessed to power tiny robotic devices, the researchers showed. "In addition to being interesting from a physics point of view, this behavior can also be translated into an on-board oscillatory electrical signal, which can be very powerful in microrobotic autonomy. There are a lot of electrical components that require such an oscillatory input," says Jingfan Yang, a recent MIT PhD recipient and one of the lead authors of the new study.
Comparing Synthetic Tabular Data Generation Between a Probabilistic Model and a Deep Learning Model for Education Use Cases
Combrink, Herkulaas MvE, Marivate, Vukosi, Rosman, Benjamin
The ability to generate synthetic data has a variety of use cases across different domains. In education research, there is a growing need to have access to synthetic data to test certain concepts and ideas. In recent years, several deep learning architectures have been used to aid in the generation of synthetic data, but with varying results. In the education context, the sophistication of implementing different models requiring large datasets is becoming very important. This study aims to compare the application of synthetic tabular data generation between a probabilistic model, specifically a Bayesian Network, and a deep learning model, specifically a Generative Adversarial Network, using a classification task. The results of this study indicate that synthetic tabular data generation is better suited for the education context using probabilistic models (overall accuracy of 75%) than deep learning architecture (overall accuracy of 38%) because of probabilistic interdependence. Lastly, we recommend that other data types should be explored and evaluated for their application in generating synthetic data for education use cases.
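To make the probabilistic side of this comparison concrete, the sketch below draws synthetic tabular rows from a two-node discrete Bayesian network by ancestral sampling: each variable is sampled after its parent, using a conditional probability table. The variables (`attends`, `passes`) and all probabilities are hypothetical and chosen only for illustration; they are not from the study, and the study's network structure is not described in the abstract.

```python
import random

# Hypothetical conditional probability tables for a network attends -> passes.
P_ATTENDS = 0.7
P_PASS_GIVEN_ATTENDS = {True: 0.8, False: 0.3}


def sample_student(rng):
    """Ancestral sampling: draw the parent first, then the child given it."""
    attends = rng.random() < P_ATTENDS
    passes = rng.random() < P_PASS_GIVEN_ATTENDS[attends]
    return {"attends": attends, "passes": passes}


def generate(n, seed=0):
    """Generate n synthetic rows with a reproducible random stream."""
    rng = random.Random(seed)
    return [sample_student(rng) for _ in range(n)]


rows = generate(10_000)
pass_rate = sum(r["passes"] for r in rows) / len(rows)
# The exact marginal is 0.7 * 0.8 + 0.3 * 0.3 = 0.65; the sample should be close.
print(f"synthetic pass rate: {pass_rate:.3f}")
```

Because the dependence between columns is written into the tables, the synthetic rows preserve it by construction, which is one plausible reading of the "probabilistic interdependence" the abstract credits.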
Modular machine learning-based elastoplasticity: generalization in the context of limited data
Fuhg, Jan N., Hamel, Craig M., Johnson, Kyle, Jones, Reese, Bouklas, Nikolaos
The development of accurate constitutive models for materials that undergo path-dependent processes continues to be a complex challenge in computational solid mechanics. Challenges arise both in considering the appropriate model assumptions and from the viewpoint of data availability, verification, and validation. Recently, data-driven modeling approaches have been proposed that aim to establish stress-evolution laws that avoid user-chosen functional forms by relying on machine learning representations and algorithms. However, these approaches not only require a significant amount of data but also need data that probes the full stress space with a variety of complex loading paths. Furthermore, they rarely enforce all necessary thermodynamic principles as hard constraints. Hence, they are in particular not suitable for low-data or limited-data regimes, where the former arises from the cost of obtaining the data and the latter from the experimental limitations of obtaining labeled data, which is commonly the case in engineering applications. In this work, we discuss a hybrid framework that can work on a variable amount of data by relying on the modularity of the elastoplasticity formulation, where each component of the model can be chosen to be either a classical phenomenological or a data-driven model depending on the amount of available information and the complexity of the response. The method is tested on synthetic uniaxial data coming from simulations as well as cyclic experimental data for structural materials. The discovered material models are found to not only interpolate well but also allow for accurate extrapolation in a thermodynamically consistent manner far outside the domain of the training data. Training aspects and details of the implementation of these models into Finite Element simulations are discussed and analyzed.
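The modularity described in this abstract can be illustrated with a textbook one-dimensional return-mapping update in which the hardening law is a swappable callable. This is a minimal sketch under stated assumptions, not the authors' implementation: the material constants are hypothetical (units of MPa), the hardening law is assumed to be non-decreasing, and the plastic increment is found by bisection, a choice made here for robustness. A trained data-driven model could in principle be passed in place of `linear_hardening`.

```python
def linear_hardening(alpha, sigma_y0=250.0, H=1000.0):
    """Classical phenomenological component: yield stress vs. accumulated plastic strain."""
    return sigma_y0 + H * alpha


def stress_update(strain, eps_p, alpha, E=200_000.0, yield_stress=linear_hardening):
    """One 1D elastoplastic step: returns (stress, plastic strain, hardening variable)."""
    sigma_trial = E * (strain - eps_p)  # elastic predictor
    f_trial = abs(sigma_trial) - yield_stress(alpha)
    if f_trial <= 0.0:
        return sigma_trial, eps_p, alpha  # step is purely elastic
    sign = 1.0 if sigma_trial > 0 else -1.0
    # Plastic corrector: solve |sigma_trial| - E*dg - yield_stress(alpha + dg) = 0.
    # For non-decreasing hardening the root lies in [0, f_trial / E].
    lo, hi = 0.0, f_trial / E
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if abs(sigma_trial) - E * mid - yield_stress(alpha + mid) > 0.0:
            lo = mid
        else:
            hi = mid
    dgamma = 0.5 * (lo + hi)
    return sigma_trial - E * dgamma * sign, eps_p + dgamma * sign, alpha + dgamma


# Load a virgin material point to 0.2 % strain.
sigma, eps_p, alpha = stress_update(0.002, 0.0, 0.0)
print(sigma, eps_p, alpha)
```

For linear hardening the result can be checked against the closed form dgamma = f_trial / (E + H); the bisection is only needed once the hardening component is replaced by something without an analytical inverse.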
Graph Machine Learning for Design of High-Octane Fuels
Rittig, Jan G., Ritzert, Martin, Schweidtmann, Artur M., Winkler, Stefanie, Weber, Jana M., Morsch, Philipp, Heufer, K. Alexander, Grohe, Martin, Mitsos, Alexander, Dahmen, Manuel
Fuels with high knock resistance enable modern spark-ignition engines to achieve high efficiency and thus low CO2 emissions. Identification of molecules with desired autoignition properties indicated by a high research octane number and a high octane sensitivity is therefore of great practical relevance and can be supported by computer-aided molecular design (CAMD). Recent developments in the field of graph machine learning (graph-ML) provide novel, promising tools for CAMD. We propose a modular graph-ML CAMD framework that integrates generative graph-ML models with graph neural networks and optimization, enabling the design of molecules with desired ignition properties in a continuous molecular space. In particular, we explore the potential of Bayesian optimization and genetic algorithms in combination with generative graph-ML models. The graph-ML CAMD framework successfully identifies well-established high-octane components. It also suggests new candidates, one of which we experimentally investigate and use to illustrate the need for further auto-ignition training data.
Mechanical features based object recognition
Uttayopas, Pakorn, Cheng, Xiaoxiao, Eden, Jonathan, Burdet, Etienne
Current robotic haptic object recognition relies on statistical measures derived from movement-dependent interaction signals such as force, vibration or position. Mechanical properties that can be identified from these signals are intrinsic object properties that may yield a more robust object representation. Therefore, this paper proposes an object recognition framework using multiple representative mechanical properties: the coefficient of restitution, stiffness, viscosity and friction coefficient. These mechanical properties are identified in real-time using a dual Kalman filter, then used to classify objects. The proposed framework was tested with a robot identifying 20 objects through haptic exploration. The results demonstrate the technique's effectiveness and efficiency, and that all four mechanical properties are required for best recognition, yielding a rate of 98.18 ± 0.424%. Clustering with Gaussian mixture models further shows that using these mechanical properties results in superior recognition as compared to using statistical parameters of the interaction signals.
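As a simplified illustration of identifying a mechanical property online, the sketch below runs a scalar Kalman filter that estimates a single stiffness k from paired displacement and force samples, assuming the measurement model force = k * displacement + noise. This is not the dual Kalman filter of the paper, which estimates states and several parameters together; the function name, noise levels, and the simulated data are all assumptions made for illustration.

```python
import random


def estimate_stiffness(displacements, forces, k0=0.0, p0=1e6, q=0.0, r=0.01):
    """Scalar Kalman filter for k in f = k * x + v, with Var(v) = r.

    k0, p0: initial estimate and its variance; q: random-walk process noise.
    Returns the final estimate and its variance.
    """
    k, p = k0, p0
    for x, f in zip(displacements, forces):
        p += q                    # predict: parameter modelled as a random walk
        s = x * p * x + r         # innovation variance
        gain = p * x / s          # Kalman gain
        k += gain * (f - k * x)   # correct with the force residual
        p *= 1.0 - gain * x       # posterior variance
    return k, p


# Simulated probing of a hypothetical object with true stiffness 500 N/m.
rng = random.Random(0)
k_true = 500.0
xs = [0.001 * i for i in range(1, 51)]                # displacements in m
fs = [k_true * x + rng.gauss(0.0, 0.1) for x in xs]   # noisy forces in N
k_hat, p_hat = estimate_stiffness(xs, fs)
print(f"estimated stiffness: {k_hat:.1f} N/m (variance {p_hat:.3f})")
```

An estimate of this kind, produced per property, would form the feature vector passed to a classifier; the classification stage is not shown here.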