contaminant
Machine learning powers new approach to detecting soil contaminants
A team of researchers at Rice University and Baylor College of Medicine has developed a new strategy for identifying hazardous pollutants in soil, even ones that have never been isolated or studied in a lab. The new approach, described in a study published in Proceedings of the National Academy of Sciences, uses light-based imaging, theoretical predictions of compounds' light signatures and machine learning (ML) algorithms to detect toxic compounds like polycyclic aromatic hydrocarbons (PAHs) and their derivative compounds (PACs) in soil. Common by-products of combustion, PAHs and PACs have been linked to cancer, developmental issues and other serious health problems. Identifying pollutants in soil usually requires advanced laboratories and standard physical reference samples of the suspected contaminants. However, for many environmental pollutants that pose a public health risk, no experimental reference data are available for detecting them.
- Energy > Oil & Gas (0.91)
- Health & Medicine > Consumer Health (0.56)
- Government > Regional Government > North America Government > United States Government (0.36)
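To make the spectra-plus-ML idea concrete, here is a minimal, hypothetical Python sketch: a classifier is trained on simulated optical signatures (toy Gaussian-peak spectra standing in for theoretical predictions) and then scores a measured soil spectrum. The compound names, wavelength grid, and spectra are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
wavelengths = np.linspace(300, 800, 200)   # nm; assumed measurement grid

def simulated_spectrum(peaks, noise=0.02):
    """Toy spectrum: a sum of Gaussian peaks plus measurement noise."""
    signal = sum(np.exp(-0.5 * ((wavelengths - mu) / 15.0) ** 2) for mu in peaks)
    return signal + rng.normal(0.0, noise, wavelengths.size)

# Illustrative "theoretical" peak positions per compound class (not real values).
compound_peaks = {
    "pyrene": [335, 370],
    "benzo[a]pyrene": [385, 405],
    "clean_soil": [500],
}

X, y = [], []
for label, peaks in compound_peaks.items():
    for _ in range(200):                   # simulated training spectra per class
        X.append(simulated_spectrum(peaks))
        y.append(label)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(np.array(X), y)

# Score a "measured" soil spectrum against the simulation-trained model.
measured = simulated_spectrum(compound_peaks["pyrene"], noise=0.05)
print(dict(zip(clf.classes_, clf.predict_proba(measured.reshape(1, -1))[0])))
```

The key point the sketch captures is that no physical reference sample is needed: the training set is built entirely from predicted signatures.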
Hyperspectral Imaging for Identifying Foreign Objects on Pork Belly
Ghimpeteanu, Gabriela, Rajani, Hayat, Quintana, Josep, Garcia, Rafael
Ensuring food safety and quality is critical in the food processing industry, where the detection of contaminants remains a persistent challenge. This study presents an automated solution for detecting foreign objects on pork belly meat using hyperspectral imaging (HSI). A hyperspectral camera was used to capture data across various bands in the near-infrared (NIR) spectrum (900-1700 nm), enabling accurate identification of contaminants that are often undetectable through traditional visual inspection methods. The proposed solution combines pre-processing techniques with a segmentation approach based on a lightweight Vision Transformer (ViT) to distinguish contaminants from meat, fat, and conveyor belt materials. The adopted strategy demonstrates high detection accuracy and training efficiency, while also addressing key industrial challenges such as inherent noise, temperature variations, and spectral similarity between contaminants and pork belly. Experimental results validate the effectiveness of hyperspectral imaging in enhancing food safety, highlighting its potential for broad real-time applications in automated quality control processes.
- North America > Canada > Alberta > Census Division No. 13 > Athabasca County (0.07)
- Europe > Spain (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.73)
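A hedged sketch of the data flow in the hyperspectral entry above: per-pixel NIR spectra are normalized with Standard Normal Variate (SNV) and classified into material classes. A simple logistic-regression classifier stands in for the paper's lightweight ViT; the cube shape, band count, and class set are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

H, W, B = 64, 64, 224                  # cube: height x width x NIR bands (assumed)
cube = np.random.rand(H, W, B)         # placeholder for a real HSI capture

def snv(spectra):
    """Standard Normal Variate: per-spectrum zero mean, unit variance."""
    mu = spectra.mean(axis=-1, keepdims=True)
    sd = spectra.std(axis=-1, keepdims=True)
    return (spectra - mu) / (sd + 1e-8)

pixels = snv(cube.reshape(-1, B))      # one spectrum per pixel

# Assume a labeled training capture exists (0=meat, 1=fat, 2=belt, 3=contaminant).
labels = np.random.randint(0, 4, size=pixels.shape[0])   # placeholder labels
clf = LogisticRegression(max_iter=500).fit(pixels, labels)

segmentation = clf.predict(pixels).reshape(H, W)         # per-pixel class map
```

SNV-style per-spectrum normalization is one standard way to reduce the noise and temperature sensitivity the abstract mentions; the real system's ViT operates on the same normalized cube.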
Testing and Improving the Robustness of Amortized Bayesian Inference for Cognitive Models
Wu, Yufei, Radev, Stefan, Tuerlinckx, Francis
Contaminant observations and outliers often cause problems when estimating the parameters of cognitive models, which are statistical models representing cognitive processes. In this study, we test and improve the robustness of parameter estimation using amortized Bayesian inference (ABI) with neural networks. To this end, we conduct systematic analyses on a toy example and analyze both synthetic and real data using a popular cognitive model, the Drift Diffusion Model (DDM). First, we study the sensitivity of ABI to contaminants with tools from robust statistics: the empirical influence function and the breakdown point. Next, we propose a data augmentation (noise injection) approach that incorporates a contamination distribution into the data-generating process during training. We examine several candidate distributions and evaluate their performance and cost in terms of accuracy and efficiency loss relative to a standard estimator. Introducing contaminants from a Cauchy distribution during training considerably increases the robustness of the neural density estimator, as measured by bounded influence functions and a much higher breakdown point. Overall, the proposed method is straightforward and practical to implement and has broad applicability in fields where outlier detection or removal is challenging.
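The noise-injection idea lends itself to a compact illustration: while simulating training data for an amortized estimator, a random fraction of observations is replaced with draws from a Cauchy contamination distribution. In this sketch a toy Gaussian simulator stands in for the DDM, and the contamination rate and scale are assumed values.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_contaminated(theta, n_obs=100, contam_rate=0.05, scale=5.0):
    """Simulate data, then corrupt a random fraction with Cauchy outliers."""
    x = rng.normal(loc=theta, scale=1.0, size=n_obs)   # stand-in simulator
    mask = rng.random(n_obs) < contam_rate             # which points to corrupt
    x[mask] = rng.standard_cauchy(mask.sum()) * scale  # heavy-tailed outliers
    return x

# Training batches for the neural density estimator then mix clean structure
# with heavy-tailed outliers, so the learned posterior downweights contaminants.
thetas = rng.normal(size=32)
batch = np.stack([simulate_contaminated(t) for t in thetas])
```

Because ABI learns the inverse mapping from data to parameters, contaminating the forward simulator is enough to make the trained estimator robust at inference time, with no change to the network itself.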
TrustEMG-Net: Using Representation-Masking Transformer with U-Net for Surface Electromyography Enhancement
Wang, Kuan-Chen, Liu, Kai-Chun, Yeh, Ping-Cheng, Peng, Sheng-Yu, Tsao, Yu
Surface electromyography (sEMG) is a widely employed bio-signal that captures human muscle activity via electrodes placed on the skin. Because non-invasive measurement renders sEMG susceptible to various contaminants, several studies have proposed methods to remove them. However, these approaches often rely on heuristic-based optimization and are sensitive to the contaminant type. A more potent, robust, and generalized sEMG denoising approach should be developed for various healthcare and human-computer interaction applications. This paper proposes a novel neural network (NN)-based sEMG denoising method called TrustEMG-Net. It leverages the potent nonlinear mapping capability and data-driven nature of NNs. TrustEMG-Net adopts a denoising autoencoder structure, combining U-Net with a Transformer encoder through a representation-masking approach. The proposed approach is evaluated using the Ninapro sEMG database with five common contamination types and signal-to-noise ratio (SNR) conditions. Compared with existing sEMG denoising methods, TrustEMG-Net achieves exceptional performance across the five evaluation metrics, exhibiting a minimum improvement of 20%. Its superiority is consistent under various conditions, including SNRs ranging from -14 to 2 dB and five contaminant types. An ablation study further confirms that TrustEMG-Net's design choices contribute to its performance, making it an effective, robust, and generalized denoising solution for sEMG applications.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- Asia > Taiwan (0.05)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.85)
- Health & Medicine > Diagnostic Medicine > Imaging (0.71)
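A compact, hedged sketch of the architecture shape described in the TrustEMG-Net entry above: a 1-D convolutional encoder/decoder with a Transformer bottleneck mapping noisy sEMG to a denoised estimate. Channel sizes and layer counts are simplified assumptions, and the representation-masking mechanism is omitted.

```python
import torch
import torch.nn as nn

class DenoiserSketch(nn.Module):
    """Toy denoising autoencoder: conv encoder -> Transformer -> conv decoder."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv1d(1, ch, 15, stride=2, padding=7), nn.ReLU(),
            nn.Conv1d(ch, ch, 15, stride=2, padding=7), nn.ReLU())
        layer = nn.TransformerEncoderLayer(d_model=ch, nhead=4, batch_first=True)
        self.bottleneck = nn.TransformerEncoder(layer, num_layers=2)
        self.dec = nn.Sequential(
            nn.ConvTranspose1d(ch, ch, 16, stride=2, padding=7), nn.ReLU(),
            nn.ConvTranspose1d(ch, 1, 16, stride=2, padding=7))

    def forward(self, x):                      # x: (batch, 1, samples)
        h = self.enc(x)                        # (batch, ch, samples / 4)
        h = self.bottleneck(h.transpose(1, 2)).transpose(1, 2)
        return self.dec(h)                     # denoised signal, same length

noisy = torch.randn(8, 1, 2048)                # placeholder noisy sEMG batch
clean_hat = DenoiserSketch()(noisy)            # train with e.g. MSE vs. clean sEMG
```

Training such a model requires paired noisy/clean segments, which is why the paper synthesizes contamination at controlled SNRs over a clean database.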
WasteGAN: Data Augmentation for Robotic Waste Sorting through Generative Adversarial Networks
Bacchin, Alberto, Barcellona, Leonardo, Terreran, Matteo, Ghidoni, Stefano, Menegatti, Emanuele, Kiyokawa, Takuya
Robotic waste sorting poses significant challenges in both perception and manipulation, given the extreme variability of objects that must be recognized on a cluttered conveyor belt. While deep learning has proven effective in solving complex tasks, the need for extensive data collection and labeling limits its applicability in real-world scenarios like waste sorting. To tackle this issue, we introduce a data augmentation method based on a novel GAN architecture called wasteGAN. The proposed method increases the performance of semantic segmentation models starting from a very limited pool of labeled examples, as few as 100. The key innovations of wasteGAN include a novel loss function, a novel activation function, and a larger generator block. Together, these innovations help the network learn from a limited number of examples and synthesize data that better mirrors real-world distributions. We then leverage the higher-quality segmentation masks predicted by models trained on the wasteGAN synthetic data to compute semantic-aware grasp poses, enabling a robotic arm to effectively recognize contaminants and separate waste in a real-world scenario. Through comprehensive evaluation encompassing dataset-based assessments and real-world experiments, our methodology demonstrated promising potential for robotic waste sorting, yielding performance gains of up to 5.8% in picking contaminants. The project page is available at https://github.com/bach05/wasteGAN.git
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- North America > United States (0.04)
- Europe > Italy (0.04)
- Asia > Japan > Shikoku > Kagawa Prefecture > Takamatsu (0.04)
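The augmentation strategy in the wasteGAN entry can be sketched without the GAN itself: a segmentation model trains on a small pool of real labeled images mixed with generator-synthesized image/mask pairs. In this hypothetical sketch the generator is an untrained placeholder, not wasteGAN, and all shapes are assumptions.

```python
import torch
from torch.utils.data import Dataset, ConcatDataset, DataLoader

class RealSet(Dataset):
    def __init__(self, n=100):                 # "as few as 100" labeled examples
        self.imgs = torch.rand(n, 3, 128, 128)
        self.masks = torch.randint(0, 2, (n, 128, 128))
    def __len__(self): return len(self.imgs)
    def __getitem__(self, i): return self.imgs[i], self.masks[i]

class SyntheticSet(Dataset):
    def __init__(self, generator, n=400):      # GAN-made image/mask pairs
        z = torch.randn(n, 64)
        with torch.no_grad():
            self.imgs, self.masks = generator(z)
    def __len__(self): return len(self.imgs)
    def __getitem__(self, i): return self.imgs[i], self.masks[i]

def toy_generator(z):                          # placeholder for trained wasteGAN
    n = z.shape[0]
    return torch.rand(n, 3, 128, 128), torch.randint(0, 2, (n, 128, 128))

# A segmentation model would iterate over this mixed real+synthetic loader.
loader = DataLoader(ConcatDataset([RealSet(), SyntheticSet(toy_generator)]),
                    batch_size=16, shuffle=True)
```

The design choice worth noting is the ratio: a few hundred synthetic pairs stretch 100 real labels without letting the generator's artifacts dominate training.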
NotPlaNET: Removing False Positives from Planet Hunters TESS with Machine Learning
Poleo, Valentina Tardugno, Eisner, Nora, Hogg, David W.
Differentiating between real transit events and false positive signals in photometric time series data is a bottleneck in the identification of transiting exoplanets, particularly long-period planets. This differentiation typically requires visual inspection of a large number of transit-like signals to rule out instrumental and astrophysical false positives that mimic planetary transit signals. We build a one-dimensional convolutional neural network (CNN) to separate eclipsing binaries and other false positives from potential planet candidates, reducing the number of light curves that require human vetting. Our CNN is trained using the TESS light curves that were identified by Planet Hunters citizen scientists as likely containing a transit. We also include the background flux and centroid information. The light curves are visually inspected and labeled by project scientists and are minimally pre-processed, with only normalization and data augmentation taking place before training. The median percentage of contaminants flagged across the test sectors is 18% with a maximum of 37% and a minimum of 10%. Our model keeps 100% of the planets for 16 of the 18 test sectors, while incorrectly flagging one planet candidate (0.3%) for one sector and two (0.6%) for the remaining sector. Our method shows potential to reduce the number of light curves requiring manual vetting by up to a third with minimal misclassification of planet candidates.
- North America > United States > New York > New York County > New York City (0.14)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
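A minimal 1-D CNN in the spirit of the vetting network from the NotPlaNET entry: a multi-channel light curve (flux plus background and centroid, as the abstract mentions) is mapped to a false-positive score. Layer sizes and channel choices are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TransitVetter(nn.Module):
    """Toy 1-D CNN: light-curve channels in, false-positive probability out."""
    def __init__(self, in_channels=3):         # flux + background + centroid
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 16, 7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, 7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.AdaptiveAvgPool1d(1))           # collapse the time axis
        self.head = nn.Linear(32, 1)

    def forward(self, x):                      # x: (batch, channels, time)
        return torch.sigmoid(self.head(self.features(x).squeeze(-1)))

lc = torch.randn(4, 3, 4000)                   # normalized light-curve batch
scores = TransitVetter()(lc)                   # high score => likely false positive
```

The adaptive pooling makes the model length-agnostic, which suits light curves whose usable duration varies from sector to sector.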
Extracting chemical food safety hazards from the scientific literature automatically using large language models
Özen, Neris, Mu, Wenjuan, van Asselt, Esther D., Bulk, Leonieke M. van den
The number of scientific articles published in the domain of food safety has consistently increased over the last few decades. It has therefore become unfeasible for food safety experts to read all relevant literature related to food safety and the occurrence of hazards in the food chain. However, it is important that food safety experts are aware of the newest findings and can access this information in an easy and concise way. In this study, an approach is presented to automate the extraction of chemical hazards from the scientific literature using large language models. The large language model was used out-of-the-box and applied to scientific abstracts; no extra training of the models or a large computing cluster was required. Three different styles of prompting the model were tested to assess which was best suited to the task at hand. The prompts were optimized with two validation foods (leafy greens and shellfish), and the final performance of the best prompt was evaluated using three test foods (dairy, maize and salmon). The specific wording of the prompt was found to have a considerable effect on the results. A prompt breaking the task down into smaller steps performed best overall. This prompt reached an average accuracy of 93%, and the extracted hazards included many chemical contaminants already covered by food monitoring programs, validating the successful retrieval of relevant hazards for the food safety domain. The results showcase how valuable large language models can be for automatic information extraction from the scientific literature.
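The step-by-step prompting style the study found best can be sketched as a template plus a thin extraction wrapper. The wording below is illustrative, not the paper's exact prompt, and call_llm is a hypothetical stand-in for whichever chat API is used.

```python
# Hypothetical sketch of stepwise prompting for hazard extraction.
PROMPT_TEMPLATE = """You are a food safety expert.
Step 1: Read the abstract below.
Step 2: Decide whether it concerns chemical hazards in {food}.
Step 3: If yes, list each chemical hazard mentioned, one per line.
Step 4: Answer 'NONE' if no chemical hazard is present.

Abstract: {abstract}"""

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM provider and return its text reply."""
    raise NotImplementedError("wire up your LLM provider here")

def extract_hazards(abstract: str, food: str) -> list[str]:
    """Parse the model's line-per-hazard reply into a clean list."""
    reply = call_llm(PROMPT_TEMPLATE.format(food=food, abstract=abstract))
    return [line.strip() for line in reply.splitlines()
            if line.strip() and line.strip() != "NONE"]
```

Decomposing the task into explicit steps mirrors the paper's finding that small changes in prompt structure had a considerable effect on extraction accuracy.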
Monitoring water contaminants in coastal areas through ML algorithms leveraging atmospherically corrected Sentinel-2 data
Razzano, Francesca, Mauro, Francesco, Di Stasio, Pietro, Meoni, Gabriele, Esposito, Marco, Schirinzi, Gilda, Ullo, Silvia Liberata
Monitoring water contaminants is of paramount importance for ensuring public health and environmental well-being. Turbidity, a key water-quality parameter, must be assessed accurately to safeguard ecosystems and human consumption. To this end, our study pioneers a novel approach to turbidity monitoring, integrating CatBoost machine learning (ML) with high-resolution Sentinel-2 Level-2A data. Traditional methods are labor-intensive, whereas CatBoost offers an efficient solution with strong predictive accuracy. Leveraging atmospherically corrected Sentinel-2 data through the Google Earth Engine (GEE), our study contributes to scalable and precise turbidity monitoring. A tabular dataset derived from Hong Kong contaminant monitoring stations enriches the study with region-specific insights. The results demonstrate the viability of this integrated approach, laying the foundation for adopting advanced techniques in global water quality management.
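The tabular setup maps naturally onto CatBoost's documented regression API: per-station Sentinel-2 L2A band reflectances as features, in-situ turbidity readings as the target. The band choice, hyperparameters, and placeholder data below are assumptions, not the study's configuration.

```python
import numpy as np
from catboost import CatBoostRegressor

# Rows: matched (satellite overpass, monitoring-station reading) pairs.
X = np.random.rand(500, 6)          # e.g. B2, B3, B4, B5, B8, B11 reflectance
y = np.random.rand(500) * 20        # turbidity target (NTU), placeholder values

model = CatBoostRegressor(iterations=500, depth=6, learning_rate=0.05,
                          loss_function="RMSE", verbose=False)
model.fit(X, y)

turbidity_pred = model.predict(X)   # apply per station (or per pixel) to monitor
```

Gradient-boosted trees like CatBoost suit this problem because the feature set is small, tabular, and nonlinear in its relationship to turbidity, and no GPU cluster is needed.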
Selection functions of strong lens finding neural networks
Herle, A., O'Riordan, C. M., Vegetti, S.
Convolutional Neural Networks trained for the task of lens finding, with architectures and training data similar to those commonly found in the literature, are biased classifiers. An understanding of the selection function of lens-finding neural networks will be key to fully realising the potential of the large samples of strong gravitational lens systems that will be found in upcoming wide-field surveys. We use three training datasets, representative of those used to train galaxy-galaxy and galaxy-quasar lens-finding neural networks. The networks preferentially select systems with larger Einstein radii and larger sources with more concentrated source-light distributions. Increasing the detection significance threshold from 8$\sigma$ to 12$\sigma$ shifts the medians such that 50 per cent of the selected strong lens systems have Einstein radii $\theta_\mathrm{E} \ge 1.04$ arcsec (from $\theta_\mathrm{E} \ge 0.879$ arcsec), source radii $R_S \ge 0.194$ arcsec (from $R_S \ge 0.178$ arcsec) and source Sérsic indices $n_{\mathrm{Sc}}^{\mathrm{S}} \ge 2.62$ (from $n_{\mathrm{Sc}}^{\mathrm{S}} \ge 2.55$). The model trained to find lensed quasars shows a stronger preference for higher lens ellipticities than those trained to find lensed galaxies. The selection function is independent of the slope of the power law of the mass profiles, so measurements of this quantity will be unaffected. The lens-finder selection function reinforces that of the lensing cross-section, and thus we expect our findings to be a general result for all galaxy-galaxy and galaxy-quasar lens-finding neural networks.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- South America > Argentina (0.04)
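One common way to characterize such a selection function, sketched here with invented numbers: inject systems with known properties, score them with a (here, toy) finder, and compare the property distributions of selected versus injected systems as the detection threshold rises.

```python
import numpy as np

rng = np.random.default_rng(2)
theta_E = rng.lognormal(mean=0.0, sigma=0.4, size=10_000)  # injected Einstein radii

def finder_score(theta):
    """Stand-in for the CNN's detection significance; not the paper's model."""
    return theta * 8 + rng.normal(0, 1, theta.shape)

for threshold in (8, 12):
    sel = finder_score(theta_E) >= threshold
    print(f"{threshold} sigma cut: median theta_E of selected = "
          f"{np.median(theta_E[sel]):.3f} vs {np.median(theta_E):.3f} injected")
```

The same injection-recovery comparison extends to source radius, Sérsic index, and lens ellipticity, which is how threshold-dependent biases like those reported above are measured.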
When Spectral Modeling Meets Convolutional Networks: A Method for Discovering Reionization-era Lensed Quasars in Multi-band Imaging Data
Andika, Irham Taufik, Jahnke, Knud, van der Wel, Arjen, Bañados, Eduardo, Bosman, Sarah E. I., Davies, Frederick B., Eilers, Anna-Christina, Jaelani, Anton Timur, Mazzucchelli, Chiara, Onoue, Masafusa, Schindler, Jan-Torge
Over the last two decades, around 300 quasars have been discovered at $z\gtrsim6$, yet only one has been identified as strongly gravitationally lensed. We explore a new approach -- enlarging the permitted spectral parameter space while introducing a new spatial-geometry veto criterion -- implemented via image-based deep learning. We first apply this approach to a systematic search for reionization-era lensed quasars, using data from the Dark Energy Survey, the Visible and Infrared Survey Telescope for Astronomy Hemisphere Survey, and the Wide-field Infrared Survey Explorer. Our search method consists of two main parts: (i) preselection of candidates based on their spectral energy distributions (SEDs) using catalog-level photometry and (ii) calculation of the relative probability that each candidate is a lens or a contaminant, using a convolutional neural network (CNN) classifier. The training datasets are constructed by painting deflected point-source light over actual galaxy images to generate realistic galaxy-quasar lens models, optimized to find systems with small image separations, i.e., Einstein radii of $\theta_\mathrm{E} \leq 1$ arcsec. Visual inspection is then performed for sources with CNN scores of $P_\mathrm{lens} > 0.1$, yielding 36 newly selected lens candidates awaiting spectroscopic confirmation. These findings show that automated SED modeling and deep learning pipelines, supported by modest human input, are a promising route for detecting strong lenses in large catalogs, one that can overcome the limitations of primarily dropout-based SED selection approaches.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Netherlands > South Holland > Leiden (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (14 more...)
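Finally, the two-stage selection described in the lensed-quasar entry above reduces to a simple pipeline shape: an SED-based preselection feeds a CNN whose lens probability is thresholded at P_lens > 0.1 before visual inspection. Both stage functions below are placeholders for the real models, and the field names are hypothetical.

```python
# Hedged sketch of the two-stage candidate selection; all names are illustrative.
def sed_preselect(catalog_rows):
    """Stage (i): keep sources whose catalog photometry fits a quasar-like SED."""
    return [r for r in catalog_rows if r.get("sed_chi2", 1e9) < 10.0]

def cnn_lens_probability(cutout):
    """Stage (ii): placeholder for the trained CNN's P(lens | image)."""
    raise NotImplementedError("load the trained classifier here")

def select_candidates(catalog_rows, cutouts, p_min=0.1):
    """Sources surviving both stages go on to human visual inspection."""
    kept = sed_preselect(catalog_rows)
    return [r for r in kept if cnn_lens_probability(cutouts[r["id"]]) > p_min]
```

Keeping the CNN threshold deliberately low (0.1) trades precision for recall, which is sensible when a final visual-inspection stage filters the survivors.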