Materials
The Download: unlocking lithium and controlling Ebola
Plus: Anthropic is now valued higher than OpenAI. How a new extraction process could unlock the world's lithium A new method for extracting lithium could cut costs and emissions from one of the world's most important materials for EVs and energy storage. The technique uses a weak acid to dissolve silicate minerals. That frees not only the lithium but also other useful materials, including alumina and silica. "At scale, we believe this will be the lowest-cost way of sourcing lithium in the world," says Yet-Ming Chiang, an MIT professor who co-authored a study of the process published yesterday in . Startup Rock Zero is already working to commercialize the research.
Statistical Embeddings for Similarity, Retrieval, and Interpretable Alignment of Numeric Tabular Datasets
Kunz, M. Ross, Merickel, John, Wilson, Keith
Numeric tabular datasets are the dominant data format in scientific practice, yet large language models lack native mechanisms for representing numeric datasets in a meaningful way across heterogeneous feature spaces. Existing approaches either target predictive modeling over individual datasets, which requires a shared set of variable definitions, or lack mechanisms for interpretable cross-dataset alignment. The proposed methodology characterizes numeric tabular datasets through structured exploratory data analysis descriptors, embeds those descriptors into a shared vector space using a pretrained sentence transformer, and quantifies cross-dataset similarity via Canonical Correlation Analysis (CCA). Furthermore, a penalized formulation of CCA is applied to recover sparse, interpretable variable-level correspondences between datasets, identifying which statistical descriptors or variable-level quantities drive cross-dataset alignment without requiring shared variable names or feature conventions. Differential privacy is optionally applied to the descriptor set prior to embedding, supporting deployment in sensitive data contexts without requiring access to raw observations at time of comparison. The methodology is evaluated across 15 datasets spanning general-purpose benchmarks, materials informatics, and nuclear-grade graphite characterization. Results demonstrate a total P@1 score of 0.9, with known nearest-neighbor retrieval and cluster structure remaining robust across embedding ablations and differential privacy budgets. The proposed framework provides a principled pathway for integrating heterogeneous numeric data into retrieval-augmented generation pipelines while preserving statistical context, with direct applications to data-driven algorithm selection and simulation model initialization for unknown datasets.
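The CCA step in the abstract above can be illustrated with a minimal NumPy sketch. It assumes each dataset has already been reduced to a matrix whose rows are paired descriptor embeddings. The function name, the synthetic data, and the use of the mean canonical correlation as a similarity score are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between two views with paired rows.

    Assumes more rows than columns and full column rank in both views.
    """
    # Center each view so correlations are measured around the mean.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered views.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic stand-ins for descriptor embeddings (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.5, 0.0, 1.0]])
Y_related = X @ mixing + 0.1 * rng.normal(size=(200, 3))  # noisy linear map of X
Y_unrelated = rng.normal(size=(200, 3))                    # independent of X

related = canonical_correlations(X, Y_related)
unrelated = canonical_correlations(X, Y_unrelated)
print("related:  ", related.round(3))
print("unrelated:", unrelated.round(3))
```

The QR-then-SVD route avoids inverting covariance matrices directly. A penalized (sparse) CCA, as mentioned in the abstract, would need a different solver and is not shown here.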
Integrating Bayesian Spectral Deconvolution and Expert Scientific Reasoning for Robust Peak Estimation
Okubo, Hayato, Amamoto, Yoshifumi, Aritake, Toshimitsu, Kumazoe, Hiroyuki, Nakano, Shiryu, Jamison, Evan, Tanaka, Satoshi, Mototake, Yoh-ichi
Spectral deconvolution is essential for extracting peak structures that encode material properties and chemical structures, but conventional automated methods often fail when spectra contain high-intensity noise or unknown background components. In practice, scientists rarely interpret spectra in isolation. Instead, they identify physically meaningful peaks by relating spectral structures to auxiliary information such as physical-property values, chemical structures, and trends across related measurements. Here, we propose a Bayesian framework that integrates spectral deconvolution with a model of expert scientific reasoning. In this work, expert scientific reasoning refers to the practice of evaluating candidate spectral structures by their consistency with independently measured physical-property values, rather than to manual expert intervention during inference. We formalize this reasoning as a physical-property regression layer, implemented using Gaussian process regression, and couple it with Bayesian spectral deconvolution. By averaging the physical-property likelihood over posterior predictive spectra inferred from Bayesian spectral deconvolution, the proposed method selects spectral models according to the consistency between inferred spectral structures and physical-property information. We validate the framework using synthetic spectra with high-intensity noise or unknown backgrounds and infrared spectra of poly(lactic acid). The method recovers physically meaningful peak structures that conventional Bayesian spectral deconvolution misses or misidentifies from spectra alone, including weak peaks in poly(lactic acid) IR spectra related to measured degradation rates. These results demonstrate that integrating expert scientific reasoning with Bayesian spectral deconvolution enables robust peak estimation under conditions where spectrum-only inference is unreliable.
How to remove bamboo from your yard
If bamboo appears unexpectedly in your yard, don't panic. Bamboo may feel like an easy landscaping win because it's a fast-growing privacy screen that can turn a plain yard into a lush retreat. But then a few shoots start popping up in random places all over your yard.
TabPFN-3: Technical Report
Grinsztajn, Léo, Flöge, Klemens, Key, Oscar, Birkel, Felix, Jund, Philipp, Roof, Brendan, Manium, Mihir, Hoo, Shi Bin, Bühler, Magnus, Garg, Anurag, Safaric, Dominik, Robertson, Jake, Jäger, Benjamin, Alessi, Simone, Hayler, Adrian, Moroshan, Vladyslav, Purucker, Lennart, Singer, Philipp, Arazi, Alan, Siems, Julien, Metzen, Jan Hendrik, Grab, Georg, Erickson, Nick, Guo, Siyuan, Kalfon, Eliott, Bing, Simon, Salinas, David, Cornu, Clara, Wehrhahn, Lilly Charlotte, Kriuchkova, Diana, Kaya, Kursat, Sidhoum, Lydia, Salmon, Marie, Chen, Jerry, Hulsebos, Madelon, LeCun, Yann, Müller, Samuel, Schölkopf, Bernhard, Gambhir, Sauraj, Hollmann, Noah, Hutter, Frank
Tabular data underpins most high-value prediction problems in science and industry, and TabPFN has driven the foundation model revolution for this modality. Designed with feedback from our users, TabPFN-3 builds on this foundation to scale state-of-the-art performance to datasets with 1M training rows and substantially reduce training and inference time. Pretrained exclusively on synthetic data from our prior, TabPFN-3 dramatically pushes the frontier of tabular prediction and brings substantial gains on time series, relational, and tabular-text data. On the standard tabular benchmark TabArena, a forward pass of TabPFN-3 outperforms all other models, including tuned and ensembled baselines, by a significant margin, and Pareto-dominates the speed/performance frontier. On more diverse datasets, TabPFN-3 ranks first on datasets with many classes, and beats 8-hour-tuned gradient-boosted-tree baselines on datasets up to 1M training rows and 200 features. TabPFN-3 introduces test-time compute scaling to tabular foundation models. Our API offering TabPFN-3-Plus (Thinking) exploits this to beat all non-TabPFN models by over 200 Elo on TabArena, rising to 420 Elo on the largest data subset, and outperforms AutoGluon 1.5 extreme while being 10x faster, without using LLMs, real data, internet search or any other model besides TabPFN. TabPFN-3 extends the capabilities of our models, enabling SOTA prediction on relational data (new SOTA foundation model on RelBenchV1) and tabular-text data (SOTA on TabSTAR via TabPFN-3-Plus); and improves existing integrations: a specialized checkpoint, TabPFN-TS-3, ranks 2nd on the time-series benchmark fev-bench, and SHAP-value computation is up to 120x faster. TabPFN-3 achieves this performance while being up to 20x faster than TabPFN-2.5. In addition, a reduced KV cache and row-chunking scale to 1M rows on one H100 with fast inference speed.
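To give the Elo gaps quoted above some intuition, the sketch below converts a rating difference into an expected head-to-head score using the standard logistic Elo formula. Whether TabArena calibrates its ratings on the conventional 400-point scale is an assumption here, so the resulting figures are indicative only.

```python
def elo_expected_score(elo_gap, scale=400.0):
    """Expected score of the higher-rated side under the standard logistic Elo model.

    `scale=400` is the conventional chess setting and is assumed, not taken
    from the report.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / scale))

# Gaps quoted in the abstract: 200 Elo overall, 420 Elo on the largest subset.
for gap in (200, 420):
    print(f"{gap} Elo gap -> expected score {elo_expected_score(gap):.3f}")
```

Under that assumption, a 200-point gap corresponds to an expected score of roughly 0.76 and a 420-point gap to roughly 0.92.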
From Data to Action: Accelerating Refinery Optimization with AI
Pfeifer, Dániel, Papp, Ábrahám, Bernáth, Tibor, Varga, Tamás Zoltán, Czifra, Márk, Szilágyi, Botond, Kovács, Edith Alice
Refinery optimization now draws on vast amounts of data, which modern Linear Programming (LP) software can handle, but interpreting and applying the results remains challenging. Large petrochemical companies use massive models, with hundreds of thousands of input matrix elements. The LP solution is mathematically correct, but simplifications are made in the model, and data supply errors may occur. Therefore, further insight is needed to trust the results. The LP solver has no memory, so additional understanding can be gained by analyzing historical data and comparing it to the current plan. Machine learning approaches are therefore suggested to support decision making based on the LP solution. Among these, Anomaly Detection tools are proposed for use in tandem with the LP output. A transformed version of the popular ECOD methodology is applied. A new method is proposed to handle high-dimensional data by choosing the most informative pairs. This selection is then used alongside two 2D Anomaly Detection algorithms, revealing several business opportunities and data supply errors in the MOL refinery scheduling and planning architecture.
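For readers unfamiliar with ECOD, the following is a simplified sketch of the baseline method (empirical-CDF tail probabilities aggregated per sample), not the transformed variant or the pair-selection step the abstract refers to. The function name and synthetic data are illustrative assumptions, and the sketch assumes no column is constant.

```python
import numpy as np

def ecod_scores(X):
    """Simplified ECOD outlier scores; a higher score means more anomalous."""
    n, d = X.shape
    left = np.empty((n, d))
    right = np.empty((n, d))
    for j in range(d):
        col = X[:, j]
        sorted_col = np.sort(col)
        # Empirical left tail: fraction of samples <= x.
        left[:, j] = np.searchsorted(sorted_col, col, side="right") / n
        # Empirical right tail: fraction of samples >= x.
        right[:, j] = (n - np.searchsorted(sorted_col, col, side="left")) / n
    # Negative log tail probabilities: rare values get large scores.
    o_left = -np.log(left)
    o_right = -np.log(right)
    # The sign of each column's skewness picks which tail to trust.
    centered = X - X.mean(axis=0)
    skew = (centered ** 3).mean(axis=0) / (centered ** 2).mean(axis=0) ** 1.5
    o_auto = np.where(skew < 0, o_left, o_right)
    # Final score: the largest of the three aggregated views per sample.
    return np.maximum.reduce(
        [o_left.sum(axis=1), o_right.sum(axis=1), o_auto.sum(axis=1)]
    )

# Hypothetical data: 500 ordinary rows with one planted anomaly at index 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[0] = 8.0
scores = ecod_scores(X)
print("most anomalous row:", int(np.argmax(scores)))
```

Because the method relies only on per-column ranks, it needs no hyperparameter tuning, which is one reason it is a common baseline.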
Open-Ended Task Discovery via Bayesian Optimization
Adachi, Masaki, Suzuki, Yuta, Ziomek, Juliusz
When applying Bayesian optimization (BO) to scientific workflows, a major yet often overlooked source of uncertainty is the task itself -- namely, what to optimize and how to evaluate it -- which can evolve as evidence accumulates. We introduce Generate-Select-Refine (GSR), an open-ended BO framework that alternates between task generation and task optimization. Starting from a user-provided seed task, GSR generates new tasks in a coarse-to-fine manner while a task-acquisition function schedules optimization. Asymptotically, it concentrates evaluations on the best task, incurring only logarithmic regret overhead relative to single-task BO. We apply GSR to new product development, chemical synthesis scaling, algorithm analysis, and patent repurposing, where it outperforms existing LLM-based optimizers.
Trump's Team Wants Him to Accept an Iran Deal He's Already Rejected
As chaotic negotiations over the end of the Iran war continue, US negotiators think they have the framework for a deal in place. Now they just have to sell the president on it. President Donald Trump's negotiators face the arduous task of trying to convince the president that a deal he previously rejected is their best option in Iran. Last month, Trump initially gave his blessing for a so-called "cash for uranium" deal, under which the US would release around $20 billion in frozen funds in exchange for Iran handing over its stockpile of highly enriched uranium, sources familiar with the matter tell WIRED. Trump's negotiators, vice president JD Vance, special envoy Steve Witkoff, and Jared Kushner, Trump's son-in-law, received repeated approvals from the president while they were in Islamabad, giving them confidence a deal was close.
Graph Convolutional Support Vector Regression for Robust Spatiotemporal Forecasting of Urban Air Pollution
Jahan, Nourin, Panja, Madhurima, T, Muhammed Navas, Chakraborty, Tanujit
Urban air quality forecasting is challenging because pollutant concentrations are nonlinear, nonstationary, spatiotemporally dependent, and often affected by anomalous observations caused by traffic congestion, industrial emissions, and seasonal meteorological variability. This study proposes a Graph Convolutional Support Vector Regression (GCSVR) framework for robust spatiotemporal forecasting of urban air pollution. The model combines graph convolutional learning to capture inter-station spatial dependence with support vector regression to model nonlinear temporal dynamics while reducing sensitivity to outlier observations. The proposed framework is evaluated using air quality records from 37 monitoring stations in Delhi and 18 stations in Mumbai, representing inland and coastal metropolitan environments in India. Forecasting performance is assessed across multiple horizons and compared with established temporal and spatiotemporal benchmarks. The results show that GCSVR consistently improves predictive accuracy and maintains stable performance across seasons and outlier-prone pollution episodes. Statistical testing further confirms the reliability of the proposed approach across the two cities. Finally, conformal prediction is integrated with GCSVR to generate calibrated prediction intervals, enhancing its practical value for uncertainty-aware air quality monitoring and public health decision-making.
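The abstract does not say which conformal variant is used, so the sketch below shows split conformal prediction, the simplest common form, wrapped around an arbitrary point forecaster. The function name and the synthetic residuals are illustrative assumptions. The coverage guarantee also relies on exchangeability, which time-series data may violate.

```python
import math
import numpy as np

def split_conformal_halfwidth(cal_true, cal_pred, alpha=0.1):
    """Half-width q so that [pred - q, pred + q] targets (1 - alpha) coverage."""
    residuals = np.sort(np.abs(np.asarray(cal_true) - np.asarray(cal_pred)))
    n = residuals.size
    # Finite-sample correction: use the ceil((n + 1)(1 - alpha))-th smallest residual.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return float(residuals[k - 1])

# Hypothetical forecaster output: predictions plus unit-variance Gaussian error.
rng = np.random.default_rng(0)
cal_pred = rng.uniform(20, 200, size=500)
cal_true = cal_pred + rng.normal(size=500)
test_pred = rng.uniform(20, 200, size=2000)
test_true = test_pred + rng.normal(size=2000)

q = split_conformal_halfwidth(cal_true, cal_pred, alpha=0.1)
covered = np.abs(test_true - test_pred) <= q
coverage = float(covered.mean())
print(f"half-width {q:.2f}, empirical coverage {coverage:.3f}")
```

The interval width comes entirely from held-out residuals, so the same wrapper applies to any point forecaster without retraining it.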
Robotically assembled building blocks could make construction more efficient and sustainable
Robotically assembled building blocks could be a more environmentally friendly method for erecting large-scale structures than some existing construction techniques, according to a new study by MIT researchers. The team conducted a feasibility study to evaluate the efficiency of constructing a simple building using "voxels," which are modular 3D subunits that assemble into complex, durable structures. After studying the performance of multiple voxels, the researchers developed three new designs intended to streamline building construction. They also produced a robotic assembler and a user-friendly interface for generating voxel-based building layouts and feeding instructions to the robots. Their results indicate this voxel-based robotic assembly system could reduce embodied carbon -- all of the carbon emitted during the lifecycle of building materials -- by as much as 82 percent, compared with popular techniques like 3D concrete printing, precast modular concrete, and steel framing.