spectroscopy
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > Canada (0.04)
Deep imitation learning for molecular inverse problems
Many measurement modalities arise from well-understood physical processes and result in information-rich but difficult-to-interpret data. Much of this data still requires laborious human interpretation. This is the case in nuclear magnetic resonance (NMR) spectroscopy, where the observed spectrum of a molecule provides a distinguishing fingerprint of its bond structure. Here we solve the resulting inverse problem: given a molecular formula and a spectrum, can we infer the chemical structure? We show for a wide variety of molecules we can quickly compute the correct molecular structure, and can detect with reasonable certainty when our method cannot. We treat this as a problem of graph-structured prediction, where armed with per-vertex information on a subset of the vertices, we infer the edges and edge types. We frame the problem as a Markov decision process (MDP) and incrementally construct molecules one bond at a time, training a deep neural network via imitation learning, where we learn to imitate a subisomorphic oracle which knows which remaining bonds are correct. Our method is fast, accurate, and is the first among recent chemical-graph generation approaches to exploit per-vertex information and generate graphs with vertex constraints. Our method points the way towards automation of molecular structure identification and potentially active learning for spectroscopy.
Water Quality Estimation Through Machine Learning Multivariate Analysis
Cardia, Marco, Chessa, Stefano, Micheli, Alessio, Luminare, Antonella Giuliana, Gambineri, Francesca
The quality of water is key for the quality of agrifood sector. Water is used in agriculture for fertigation, for animal husbandry, and in the agrifood processing industry. In the context of the progressive digitalization of this sector, the automatic assessment of the quality of water is thus becoming an important asset. In this work, we present the integration of Ultraviolet-Visible (UV-Vis) spectroscopy with Machine Learning in the context of water quality assessment aiming at ensuring water safety and the compliance of water regulation. Furthermore, we emphasize the importance of model inter-pretability by employing SHapley Additive exPlanations (SHAP) to understand the contribution of absorbance at different wavelengths to the predictions. Our approach demonstrates the potential for rapid, accurate, and interpretable assessment of key water quality parameters.
- Europe > Italy > Tuscany (0.05)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Europe > Switzerland (0.04)
Self-supervised and Multi-fidelity Learning for Extended Predictive Soil Spectroscopy
Sun, Luning, Safanelli, José L., Sanderman, Jonathan, Georgiou, Katerina, Brungard, Colby, Grover, Kanchan, Hopkins, Bryan G., Liu, Shusen, Bremer, Timo
We propose a self-supervised machine learning (SSML) framework for multi-fidelity learning and extended predictive soil spectroscopy based on latent space embeddings. A self-supervised representation was pretrained with the large MIR spectral library and the Variational Autoencoder algorithm to obtain a compressed latent space for generating spectral embeddings. At this stage, only unlabeled spectral data were used, allowing us to leverage the full spectral database and the availability of scan repeats for augmented training. We also leveraged and froze the trained MIR decoder for a spectrum conversion task by plugging it into a NIR encoder to learn the mapping between NIR and MIR spectra in an attempt to leverage the predictive capabilities contained in the large MIR library with a low cost portable NIR scanner. This was achieved by using a smaller subset of the KSSL library with paired NIR and MIR spectra. Downstream machine learning models were then trained to map between original spectra, predicted spectra, and latent space embeddings for nine soil properties. The performance of was evaluated independently of the KSSL training data using a gold-standard test set, along with regression goodness-of-fit metrics. Compared to baseline models, the proposed SSML and its embeddings yielded similar or better accuracy in all soil properties prediction tasks. Predictions derived from the spectrum conversion (NIR to MIR) task did not match the performance of the original MIR spectra but were similar or superior to predictive performance of NIR-only models, suggesting the unified spectral latent space can effectively leverage the larger and more diverse MIR dataset for prediction of soil properties not well represented in current NIR libraries.
- North America > United States > Washington > King County > Bellevue (0.04)
- North America > United States > Utah > Utah County > Provo (0.04)
- North America > United States > Oregon > Benton County > Corvallis (0.04)
- (4 more...)
- Food & Agriculture > Agriculture (1.00)
- Energy (0.93)
- Government > Regional Government > North America Government > United States Government (0.68)
GAMMA_FLOW: Guided Analysis of Multi-label spectra by MAtrix Factorization for Lightweight Operational Workflows
Rädle, Viola, Hartwig, Tilman, Oesen, Benjamin, Kröger, Emily Alice, Vogt, Julius, Gericke, Eike, Baron, Martin
GAMMA_FLOW is an open-source Python package for real-time analysis of spectral data. It supports classification, denoising, decomposition, and outlier detection of both single- and multi-component spectra. Instead of relying on large, computationally intensive models, it employs a supervised approach to non-negative matrix factorization (NMF) for dimensionality reduction. This ensures a fast, efficient, and adaptable analysis while reducing computational costs. gamma_flow achieves classification accuracies above 90% and enables reliable automated spectral interpretation. Originally developed for gamma-ray spectra, it is applicable to any type of one-dimensional spectral data. As an open and flexible alternative to proprietary software, it supports various applications in research and industry.
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > Canada > British Columbia (0.04)
- Europe > Hungary > Budapest > Budapest (0.04)
- (4 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Law (0.68)
MolSpectLLM: A Molecular Foundation Model Bridging Spectroscopy, Molecule Elucidation, and 3D Structure Generation
Shen, Shuaike, Xie, Jiaqing, Yang, Zhuo, Zhang, Antong, Sun, Shuzhou, Gao, Ben, Fu, Tianfan, Qi, Biqing, Li, Yuqiang
Recent advances in molecular foundation models have shown impressive performance in molecular property prediction and de novo molecular design, with promising applications in areas such as drug discovery and reaction prediction. Nevertheless, most existing approaches rely exclusively on SMILES representations and overlook both experimental spectra and 3D structural information-two indispensable sources for capturing molecular behavior in real-world scenarios. This limitation reduces their effectiveness in tasks where stereochemistry, spatial conformation, and experimental validation are critical. To overcome these challenges, we propose MolSpectLLM, a molecular foundation model pretrained on Qwen2.5-7B that unifies experimental spectroscopy with molecular 3D structure. By explicitly modeling molecular spectra, MolSpectLLM achieves state-of-the-art performance on spectrum-related tasks, with an average accuracy of 0.53 across NMR, IR, and MS benchmarks. MolSpectLLM also shows strong performance on the spectra analysis task, obtaining 15.5% sequence accuracy and 41.7% token accuracy on Spectra-to-SMILES, substantially outperforming large general-purpose LLMs. More importantly, MolSpectLLM not only achieves strong performance on molecular elucidation tasks, but also generates accurate 3D molecular structures directly from SMILES or spectral inputs, bridging spectral analysis, molecular elucidation, and molecular design. Code are available at \href{https://github.com/Eurekashen/MolSpectLLM}{https://github.com/Eurekashen/MolSpectLLM}.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Finland > Northern Ostrobothnia > Oulu (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- (2 more...)
SCANS: A Soft Gripper with Curvature and Spectroscopy Sensors for In-Hand Material Differentiation
Hanson, Nathaniel, Allison, Austin, DiMarzio, Charles, Padır, Taşkın, Dorsey, Kristen L.
We introduce the soft curvature and spectroscopy (SCANS) system: a versatile, electronics-free, fluidically actuated soft manipulator capable of assessing the spectral properties of objects either in hand or through pre-touch caging. This platform offers a wider spectral sensing capability than previous soft robotic counterparts. We perform a material analysis to explore optimal soft substrates for spectral sensing, and evaluate both pre-touch and in-hand performance. Experiments demonstrate explainable, statistical separation across diverse object classes and sizes (metal, wood, plastic, organic, paper, foam), with large spectral angle differences between items. Through linear discriminant analysis, we show that sensitivity in the near-infrared wavelengths is critical to distinguishing visually similar objects. These capabilities advance the potential of optics as a multi-functional sensory modality for soft robots. The complete parts list, assembly guidelines, and processing code for the SCANS gripper are accessible at: https://parses-lab.github.io/scans/.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Lexington (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)
SpectrumWorld: Artificial Intelligence Foundation for Spectroscopy
Yang, Zhuo, Xie, Jiaqing, Shen, Shuaike, Wang, Daolang, Chen, Yeyun, Gao, Ben, Sun, Shuzhou, Qi, Biqing, Zhou, Dongzhan, Bai, Lei, Chen, Linjiang, Zhang, Shufei, Gu, Qinying, Jiang, Jun, Fu, Tianfan, Li, Yuqiang
Deep learning holds immense promise for spectroscopy, yet research and evaluation in this emerging field often lack standardized formulations. To address this issue, we introduce SpectrumLab, a pioneering unified platform designed to systematize and accelerate deep learning research in spectroscopy. SpectrumLab integrates three core components: a comprehensive Python library featuring essential data processing and evaluation tools, along with leaderboards; an innovative SpectrumAnnotator module that generates high-quality benchmarks from limited seed data; and SpectrumBench, a multi-layered benchmark suite covering 14 spectroscopic tasks and over 10 spectrum types, featuring spectra curated from over 1.2 million distinct chemical substances. Thorough empirical studies on SpectrumBench with 18 cutting-edge multimodal LLMs reveal critical limitations of current approaches. We hope SpectrumLab will serve as a crucial foundation for future advancements in deep learning-driven spectroscopy.
- Asia > China > Shanghai > Shanghai (0.04)
- North America > Canada > British Columbia > Vancouver (0.04)
- Europe > Finland > Northern Ostrobothnia > Oulu (0.04)
- (3 more...)
Network-Based Detection of Autism Spectrum Disorder Using Sustainable and Non-invasive Salivary Biomarkers
Fernandes, Janayna M., Sabino-Silva, Robinson, Carneiro, Murillo G.
Autism Spectrum Disorder (ASD) lacks reliable biological markers, delaying early diagnosis. Using 159 salivary samples analyzed by ATR-FTIR spectroscopy, we developed GANet, a genetic algorithm-based network optimization framework leveraging PageRank and Degree for importance-based feature characterization. GANet systematically optimizes network structure to extract meaningful patterns from high-dimensional spectral data. It achieved superior performance compared to linear discriminant analysis, support vector machines, and deep learning models, reaching 0.78 accuracy, 0.61 sensitivity, 0.90 specificity, and a 0.74 harmonic mean. These results demonstrate GANet's potential as a robust, bio-inspired, non-invasive tool for precise ASD detection and broader spectral-based health applications.
- South America > Brazil (0.04)
- North America > United States (0.04)
- North America > Central America (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- (2 more...)
CNN-BiLSTM for sustainable and non-invasive COVID-19 detection via salivary ATR-FTIR spectroscopy
Junior, Anisio P. Santos, Sabino-Silva, Robinson, Martins, Mário Machado, Cunha, Thulio Marquez, Carneiro, Murillo G.
The COVID-19 pandemic has placed unprecedented strain on healthcare systems and remains a global health concern, especially with the emergence of new variants. Although real-time polymerase chain reaction (RT-PCR) is considered the gold standard for COVID-19 detection, it is expensive, time-consuming, labor-intensive, and sensitive to issues with RNA extraction. In this context, ATR-FTIR spectroscopy analysis of biofluids offers a reagent-free, cost-effective alternative for COVID-19 detection. We propose a novel architecture that combines Convolutional Neural Networks (CNN) with Bidirectional Long Short-Term Memory (BiLSTM) networks, referred to as CNN-BiLSTM, to process spectra generated by ATR-FTIR spectroscopy and diagnose COVID-19 from spectral samples. We compare the performance of this architecture against a standalone CNN and other state-of-the-art machine learning techniques. Experimental results demonstrate that our CNN-BiLSTM model outperforms all other models, achieving an average accuracy and F1-score of 0.80 on a challenging real-world COVID-19 dataset. The addition of the BiLSTM layer to the CNN architecture significantly enhances model performance, making CNN-BiLSTM a more accurate and reliable choice for detecting COVID-19 using ATR-FTIR spectra of non-invasive saliva samples.
- South America > Brazil (0.04)
- North America > United States (0.04)
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)