Materials
Unveiling Secrets of Brain Function With Generative Modeling: Motion Perception in Primates & Cortical Network Organization in Mice
This dissertation comprises two main projects, addressing questions in neuroscience through applications of generative modeling. Project #1 (Chapter 4) explores how neurons encode features of the external world. I combine Helmholtz's "Perception as Unconscious Inference" -- paralleled by modern generative models like variational autoencoders (VAE) -- with the hierarchical structure of the visual cortex. This combination leads to the development of a hierarchical VAE model, which I test for its ability to mimic neurons from the primate visual cortex in response to motion stimuli. Results show that the hierarchical VAE perceives motion similarly to the primate brain. Additionally, the model identifies causal factors of retinal motion inputs, such as object- and self-motion, in a completely unsupervised manner. Collectively, these results suggest that hierarchical inference underlies the brain's understanding of the world, and hierarchical VAEs can effectively model this understanding. Project #2 (Chapter 5) investigates the spatiotemporal structure of spontaneous brain activity and its reflection of brain states like rest. Using simultaneous fMRI and wide-field Ca2+ imaging data, this project demonstrates that the mouse cortex can be decomposed into overlapping communities, with around half of the cortical regions belonging to multiple communities. Comparisons reveal similarities and differences between networks inferred from fMRI and Ca2+ signals. The introduction (Chapter 1) is divided similarly to this abstract: sections 1.1 to 1.8 provide background information about Project #1, and sections 1.9 to 1.13 are related to Project #2. Chapter 2 includes historical background, Chapter 3 provides the necessary mathematical background, and finally, Chapter 6 contains concluding remarks and future directions.
Automated Materials Discovery Platform Realized: Scanning Probe Microscopy of Combinatorial Libraries
Liu, Yu, Pant, Rohit, Takeuchi, Ichiro, Spurling, R. Jackson, Maria, Jon-Paul, Ziatdinov, Maxim, Kalinin, Sergei V.
These libraries typically contain binary or ternary isothermal cross-sections of multicomponent phase diagrams, and more advanced synthesis methods can generate spatially encoded 4D and 5D compositional spaces [1]. This versatility makes them well-suited for both optimizing materials through direct exploration of compositional spaces and advancing physics discovery by exploring property and microstructure evolution [2-10]. Additionally, temperature gradients during synthesis can help reveal the effects of synthesis variables, while localized ion- or laser-based annealing enables broader exploration of the processing and chemical spaces within the selected material systems [8, 11, 12]. The first experiments in combinatorial research date back to the 1960s [13, 14], with renewed interest in the 1990s following the discovery of high-temperature superconductors [3, 4, 11, 15-17]. However, it quickly became apparent that successful combinatorial research requires not only synthesis but also detailed characterization, along with the ability to derive insights from characterization results and use these for subsequent experiment planning or transition towards different fabrication routes.
Adaptive Signal Analysis for Automated Subsurface Defect Detection Using Impact Echo in Concrete Slabs
Pavurala, Deepthi, Liao, Duoduo, Pasunuru, Chaithra Reddy
This pilot study presents a novel, automated, and scalable methodology for detecting and evaluating subsurface defect-prone regions in concrete slabs using Impact Echo (IE) signal analysis. The approach integrates advanced signal processing, clustering, and visual analytics to identify subsurface anomalies. A unique adaptive thresholding method tailors frequency-based defect identification to the distinct material properties of each slab. The methodology generates frequency maps, binary masks, and k-means cluster maps to automatically classify defect and non-defect regions. Key visualizations, including 3D surface plots, cluster maps, and contour plots, are employed to analyze spatial frequency distributions and highlight structural anomalies. The study utilizes a labeled dataset constructed at the Federal Highway Administration (FHWA) Advanced Sensing Technology Nondestructive Evaluation Laboratory. Evaluations involve ground-truth masking, comparing the generated defect maps with top-view binary masks derived from the information provided by the FHWA. The performance metrics, specifically F1-scores and AUC-ROC, achieve values of up to 0.95 and 0.83, respectively. The results demonstrate the robustness of the methodology, consistently identifying defect-prone areas with minimal false positives and few missed defects. Adaptive frequency thresholding ensures flexibility in addressing variations across slabs, providing a scalable framework for detecting structural anomalies. Additionally, the methodology is adaptable to other frequency-based signals due to its generalizable thresholding mechanism and holds potential for integrating multimodal sensor fusion. This automated and scalable pipeline minimizes manual intervention, ensuring accurate and efficient defect detection, further advancing Non-Destructive Evaluation (NDE) techniques.
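The thresholding and clustering steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the thresholding rule, so the median/MAD rule, the multiplier `k`, the synthetic 20x20 peak-frequency map, and all function names below are assumptions made for illustration.

```python
import numpy as np

def adaptive_defect_mask(freq_map, k=4.0):
    """Flag points whose peak frequency deviates strongly from the slab's own median.

    Assumed rule (not from the paper): |f - median| > k * robust spread,
    so the threshold adapts to each slab's frequency distribution.
    """
    med = np.median(freq_map)
    mad = np.median(np.abs(freq_map - med))
    spread = 1.4826 * mad if mad > 0 else freq_map.std()
    return np.abs(freq_map - med) > k * spread

def kmeans_1d(values, n_clusters=2, n_iter=50):
    """Plain Lloyd's k-means on scalar values; returns a label map and the centers."""
    flat = values.ravel().astype(float)
    # Deterministic initialisation: evenly spaced quantiles of the data.
    centers = np.quantile(flat, np.linspace(0.0, 1.0, n_clusters))
    for _ in range(n_iter):
        labels = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([
            flat[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels.reshape(values.shape), centers

def f1_score(pred, truth):
    """F1 between a predicted binary mask and a ground-truth binary mask."""
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Synthetic slab (hypothetical values): sound concrete near 10 kHz,
# one 5x5 region with a shifted peak frequency near 4 kHz.
rng = np.random.default_rng(0)
freq = 10.0 + 0.2 * rng.standard_normal((20, 20))
freq[5:10, 5:10] = 4.0 + 0.2 * rng.standard_normal((5, 5))
truth = np.zeros((20, 20), dtype=bool)
truth[5:10, 5:10] = True

mask = adaptive_defect_mask(freq)
labels, centers = kmeans_1d(freq, n_clusters=2)
score = f1_score(mask, truth)
```

Comparing `mask` with `truth` mirrors the ground-truth masking evaluation mentioned in the abstract, while `labels` plays the role of a two-class cluster map.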
MineAgent: Towards Remote-Sensing Mineral Exploration with Multimodal Large Language Models
Yu, Beibei, Shen, Tao, Na, Hongbin, Chen, Ling, Li, Denqi
Remote-sensing mineral exploration is critical for identifying economically viable mineral deposits, yet it poses significant challenges for multimodal large language models (MLLMs). These include limitations in domain-specific geological knowledge and difficulties in reasoning across multiple remote-sensing images, further exacerbating long-context issues. To address these, we present MineAgent, a modular framework leveraging hierarchical judging and decision-making modules to improve multi-image reasoning and spatial-spectral integration. Complementing this, we propose MineBench, a benchmark designed specifically for evaluating MLLMs in domain-specific mineral exploration tasks using geological and hyperspectral data. Extensive experiments demonstrate the effectiveness of MineAgent, highlighting its potential to advance MLLMs in remote-sensing mineral exploration.
Inverse design of potential metastructures inspired from Indian medieval architectural elements
Bhattacharya, Bishakh, Gupta, Tanuj, Sharma, Arun Kumar, Dwivedi, Ankur, Gupta, Vivek, Sahana, Subhadeep, Pathak, Suryansh, Awasthi, Ashish
In this study, we immerse ourselves in the intricate world of patterns, examining the structural details of Indian medieval architecture to discover motifs with great application potential from the mechanical metastructure perspective. The motifs that specifically engrossed us are derived from the tomb of I'timad-ud-Daula, situated in the city of Agra, close to the Taj Mahal. In an exploratory study, we designed nine interlaced metastructures inspired by the tomb's motifs. We fabricated the metastructures using additive manufacturing and studied their vibration characteristics experimentally and numerically. We also investigated bandgap modulation with metallic inserts in honeycomb interlaced metastructures. The comprehensive study of these metastructure panels reveals their high performance in controlling elastic wave propagation and generating suitable frequency bandgaps, hence having potential applications as waveguides for noise and vibration control. Finally, we developed a novel AI-based model trained on numerical datasets for the inverse design of metastructures with a desired bandgap.
Modeling the Dynamics of Sub-Millisecond Electroadhesive Engagement and Release Times
Electroadhesion is an electrically controllable switchable adhesive commonly used in soft robots and haptic user interfaces. It can form strong bonds to a wide variety of surfaces at low power consumption. However, electroadhesive clutches in the literature engage to and release from substrates several orders of magnitude slower than a traditional electrostatic model would predict, limiting their usefulness in high-bandwidth applications. We develop a novel electromechanical model for electroadhesion, factoring in polarization dynamics and contact mechanics between the dielectric and substrate. We show in simulation and experimentally how different design parameters affect the engagement and release times of electroadhesive clutches to metallic substrates. In particular, we find that higher drive frequencies and narrower substrate aspect ratios enable significantly faster dynamics. We demonstrate designs with engagement times under 15 µs and release times as low as 875 µs, which are 10x and 17.1x faster, respectively, than the best times found in prior literature.
Transformer-based toxin-protein interaction analysis prioritizes airborne particulate matter components with potential adverse health effects
Zhu, Yan, Wang, Shihao, Han, Yong, Lu, Yao, Qiu, Shulan, Jin, Ling, Li, Xiangdong, Zhang, Weixiong
Air pollution, particularly airborne particulate matter (PM), poses a significant threat to public health globally. It is crucial to comprehend the association between PM-associated toxic components and their cellular targets in humans to understand the mechanisms by which air pollution impacts health and to establish causal relationships between air pollution and public health consequences. Current methods for modeling and analyzing these interactions are rudimentary, with experimental approaches offering limited throughput and comprehensiveness. Leveraging cutting-edge deep learning technologies, we developed tipFormer (toxin-protein interaction prediction based on transformer), a novel machine-learning approach for identifying toxic components capable of penetrating human cells and instigating pathogenic biological activities and signaling cascades. It incorporates dual pre-trained language models to derive encodings for protein sequences and chemicals. It employs a convolutional encoder to assimilate the sequential attributes of proteins and chemicals. It then introduces a novel learning module with a cross-attention mechanism to decode and elucidate the multifaceted interactions pivotal for the hotspots binding proteins and chemicals. Through thorough experimentation, tipFormer was shown to be proficient in capturing interactions between proteins and toxic components. This approach offers significant value to the air quality and toxicology research communities by enabling high-throughput, high-content identification and prioritization of hazards.
Keywords: Air pollution, toxin-protein interaction, computational modeling, attention mechanisms
1. Introduction
Air pollution has emerged as a critical global health concern, primarily driven by rapid economic, industrial and population growth and further exacerbated by climate change and other non-anthropogenic factors [1]. The World Health Organization estimates that approximately 7 million premature deaths occur every year due to air pollution exposure. The consequences of air pollution extend far beyond individual health implications and exacerbate the strain on societal and healthcare systems in numerous ways [2]. The health risks associated with airborne particulate matter (PM) are particularly concerning for public health [3].
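The cross-attention mechanism mentioned in the tipFormer abstract can be pictured with a generic scaled dot-product sketch. This is not tipFormer's architecture: the choice of protein tokens as queries and chemical tokens as keys and values, the embedding sizes, and the random projection matrices are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, w_q, w_k, w_v):
    """Single-head cross-attention: one sequence attends over another.

    queries:      (n_q, d_model) embeddings, e.g. protein tokens (assumed role)
    keys_values:  (n_kv, d_model) embeddings, e.g. chemical tokens (assumed role)
    Returns the attended representation and the attention weights.
    """
    q = queries @ w_q
    k = keys_values @ w_k
    v = keys_values @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each query row sums to 1
    return weights @ v, weights

# Hypothetical sizes: 6 protein tokens, 4 chemical tokens, 16-dim embeddings.
rng = np.random.default_rng(0)
protein = rng.standard_normal((6, 16))
chemical = rng.standard_normal((4, 16))
w_q, w_k, w_v = (rng.standard_normal((16, 8)) for _ in range(3))

attended, weights = cross_attention(protein, chemical, w_q, w_k, w_v)
```

Each row of `weights` shows how strongly one protein position attends to each chemical token, which is the kind of pairwise signal a model could use to highlight candidate interaction hotspots.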
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
Chan, Jun Shern, Chowdhury, Neil, Jaffe, Oliver, Aung, James, Sherburn, Dane, Mays, Evan, Starace, Giulio, Liu, Kevin, Maksin, Leon, Patwardhan, Tejal, Weng, Lilian, Mądry, Aleksander
We introduce MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering. To this end, we curate 75 ML engineering-related competitions from Kaggle, creating a diverse set of challenging tasks that test real-world ML engineering skills such as training models, preparing datasets, and running experiments. We establish human baselines for each competition using Kaggle's publicly available leaderboards. We use open-source agent scaffolds to evaluate several frontier language models on our benchmark, finding that the best-performing setup--OpenAI's o1-preview with AIDE scaffolding--achieves at least the level of a Kaggle bronze medal in 16.9% of competitions. In addition to our main results, we investigate various forms of resource scaling for AI agents and the impact of contamination from pre-training. We open-source our benchmark code (github.com/openai/mle-bench/) to facilitate future research in understanding the ML engineering capabilities of AI agents.
Adaptable and Precise: Enterprise-Scenario LLM Function-Calling Capability Training Pipeline
Zeng, Guancheng, Ding, Wentao, Xu, Beining, Zhang, Chi, Han, Wenqiang, Li, Gang, Mo, Jingjing, Qiu, Pengxu, Tao, Xinran, Tao, Wang, Hu, Haowen
Enterprises possess a vast array of API assets scattered across various functions, forming the backbone of existing business processes. By leveraging these APIs as functional tools, enterprises can design diverse, scenario-specific agent applications, driven by on-premise function-calling models as the core engine. However, generic models often fail to meet enterprise requirements in terms of computational efficiency, output accuracy, and stability, necessitating scenario-specific adaptation. In this paper, we propose a training pipeline for function-calling capabilities tailored to real-world business scenarios. This pipeline includes the synthesis and augmentation of scenario-specific function-calling data, model fine-tuning, and performance evaluation and analysis. Using this pipeline, we generated 1,260 fully AI-generated samples and 1,035 augmented manually-labeled samples in a digital HR agent scenario. The Qwen2.5-Coder-7B-Instruct model was employed as the base model and fine-tuned using the LoRA method on four GPUs with 24GB VRAM. Our fine-tuned model demonstrated outstanding performance in evaluations and practical applications, surpassing GPT-4 and GPT-4o in accuracy on the test set. These results validate the reliability of the proposed pipeline for training scenario-specific function-calling models.
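The LoRA method named in this abstract adds a trainable low-rank update to a frozen weight matrix. The sketch below shows that idea on a single linear layer in NumPy. It is not the paper's training code: the layer size, rank, scaling factor `alpha`, and the `LoRALinear` class name are illustrative assumptions.

```python
import numpy as np

class LoRALinear:
    """Linear layer with a frozen weight W and a low-rank update (alpha / r) * B @ A."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                  # frozen base weight
        self.a = 0.01 * rng.standard_normal((rank, in_dim))   # trainable, small random init
        self.b = np.zeros((out_dim, rank))                    # trainable, zero init
        self.scale = alpha / rank

    def forward(self, x):
        base = x @ self.weight.T
        update = (x @ self.a.T) @ self.b.T
        return base + self.scale * update

    def trainable_params(self):
        return self.a.size + self.b.size

# Hypothetical 64x64 layer: 4096 frozen weights, 2 * 4 * 64 = 512 trainable ones.
rng = np.random.default_rng(1)
w = rng.standard_normal((64, 64))
layer = LoRALinear(w, rank=4, alpha=8.0)
x = rng.standard_normal((3, 64))
y = layer.forward(x)
```

Because `b` starts at zero, the adapted layer initially reproduces the base model's output, and only the small `a` and `b` matrices need gradients and optimizer state, which is consistent with fine-tuning a 7B model on GPUs with 24GB VRAM.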
MetaScientist: A Human-AI Synergistic Framework for Automated Mechanical Metamaterial Design
Qi, Jingyuan, Jia, Zian, Liu, Minqian, Zhan, Wangzhi, Zhang, Junkai, Wen, Xiaofei, Gan, Jingru, Chen, Jianpeng, Liu, Qin, Ma, Mingyu Derek, Li, Bangzheng, Wang, Haohui, Kulkarni, Adithya, Chen, Muhao, Zhou, Dawei, Li, Ling, Wang, Wei, Huang, Lifu
The discovery of novel mechanical metamaterials, whose properties are dominated by their engineered structures rather than chemical composition, is a knowledge-intensive and resource-demanding process. To accelerate the design of novel metamaterials, we present MetaScientist, a human-in-the-loop system that integrates advanced AI capabilities with expert oversight in two primary phases: (1) hypothesis generation, where the system performs complex reasoning to generate novel and scientifically sound hypotheses, supported with domain-specific foundation models and inductive biases retrieved from existing literature; (2) 3D structure synthesis, where a 3D structure is synthesized with a novel 3D diffusion model based on the textual hypothesis and then refined with an LLM-based refinement model to achieve better structural properties. At each phase, domain experts iteratively validate the system outputs and provide feedback and supplementary materials to ensure the alignment of the outputs with scientific principles and human preferences. Through extensive evaluation from human scientists, MetaScientist is able to deliver novel and valid mechanical metamaterial designs that have the potential to be highly impactful in the metamaterial field.