 data evaluation


Bayesian Optimization of Process Parameters of a Sensor-Based Sorting System using Gaussian Processes as Surrogate Models

Kronenwett, Felix, Maier, Georg, Längle, Thomas

arXiv.org Artificial Intelligence

Sensor-based sorting systems enable the physical separation of a material stream into two fractions. The sorting decision is based on the image data evaluation of the sensors used and is carried out using actuators. Various process parameters must be set depending on the properties of the material stream, the dimensioning of the system, and the required sorting accuracy. However, continuous verification and re-adjustment are necessary due to changing requirements and material stream compositions. In this paper, we introduce an approach for optimizing, recurrently monitoring, and adjusting the process parameters of a sensor-based sorting system. Based on Bayesian Optimization, Gaussian process regression models are used as surrogate models to achieve specific requirements for system behavior with the uncertainties contained therein. This method minimizes the number of necessary experiments while simultaneously considering two possible optimization targets based on the requirements for both material output streams. In addition, uncertainties in determining the sorting accuracies are considered in the model calculation. We evaluated the method with three example process parameters.
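
The general idea of Bayesian Optimization with a Gaussian process surrogate can be illustrated with a minimal sketch. This is not the authors' implementation: it tunes a single parameter for a single target, whereas the paper handles three parameters and two targets. The objective `sorting_error`, the RBF kernel length, the noise level, and the lower-confidence-bound acquisition rule are all assumptions chosen for illustration.

```python
import math

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)   # K^-1 y
    v = solve(K, k)        # K^-1 k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mean, math.sqrt(var)

def sorting_error(p):
    """Hypothetical stand-in for one sorting experiment; lowest error at p = 0.6."""
    return (p - 0.6) ** 2

def bayes_opt(objective, n_iter=8, kappa=2.0):
    """Minimize objective on [0, 1] using a GP surrogate and a lower confidence bound."""
    xs = [0.0, 1.0]                       # two initial experiments
    ys = [objective(x) for x in xs]
    grid = [i / 100 for i in range(101)]  # candidate parameter values
    for _ in range(n_iter):
        def lcb(x):
            # Favor points with low predicted error or high uncertainty.
            mean, std = gp_posterior(xs, ys, x)
            return mean - kappa * std
        candidates = [g for g in grid if g not in xs]
        x_next = min(candidates, key=lcb)
        xs.append(x_next)
        ys.append(objective(x_next))      # run the (simulated) experiment
    best = min(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]
```

Calling `bayes_opt(sorting_error)` runs ten simulated experiments in total and returns the best parameter value found together with its error. The surrogate's uncertainty estimate is what lets the loop decide where the next experiment is most informative, which is how the number of experiments stays small.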


scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction

Wang, Qing, Pan, Yining, Zhou, Minghao, Tang, Zijia, Wang, Yanfei, Wang, Guangyu, Song, Qianqian

arXiv.org Artificial Intelligence

Drug resistance presents a major challenge in cancer therapy. Single cell profiling offers insights into cellular heterogeneity, yet the application of large-scale foundation models for predicting drug response in single cell data remains underexplored. To address this, we developed scDrugMap, an integrated framework featuring both a Python command-line interface and a web server for drug response prediction. scDrugMap evaluates a wide range of foundation models, including eight single-cell models and two large language models, using a curated dataset of over 326,000 cells in the primary collection and 18,800 cells in the validation set, spanning 36 datasets and diverse tissue and cancer types. We benchmarked model performance under pooled-data and cross-data evaluation settings, employing both layer freezing and Low-Rank Adaptation (LoRA) fine-tuning strategies. In the pooled-data scenario, scFoundation achieved the best performance, with mean F1 scores of 0.971 (layer freezing) and 0.947 (fine-tuning), outperforming the lowest-performing model by over 50%. In the cross-data setting, UCE excelled post fine-tuning (mean F1: 0.774), while scGPT led in zero-shot learning (mean F1: 0.858). Overall, scDrugMap provides the first large-scale benchmark of foundation models for drug response prediction in single-cell data and serves as a user-friendly, flexible platform for advancing drug discovery and translational research.
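
For readers unfamiliar with the metric, the F1 score reported above is the harmonic mean of precision and recall. The sketch below computes it for binary labels and averages it over several datasets. How scDrugMap aggregates its mean F1 is not stated in the abstract, so the plain per-dataset average in `mean_f1` is an assumption, and both function names are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mean_f1(datasets):
    """Unweighted average of F1 over (y_true, y_pred) pairs, one pair per dataset."""
    scores = [f1_score(y_true, y_pred) for y_true, y_pred in datasets]
    return sum(scores) / len(scores)
```

An unweighted mean treats every dataset equally regardless of its cell count; a benchmark could just as reasonably weight by dataset size.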


Invisible Analytics and Embedded AI Are the Future of Data Evaluation

#artificialintelligence

Entrepreneurs can make informed mission-critical decisions on the fly using invisible analytics and embedded artificial intelligence (AI). Businesses leverage data to capture remarkable insights about consumers. However, actionable data alone is no longer what matters most to forward-thinking business leaders. Today's cutting-edge executives want convenient access to information at the point of decision. Now, you can access invaluable insights when it matters the most.


5 ways to improve AI/ML deployments

#artificialintelligence

In January 2019, Gartner released a survey in which 37% of respondents said they were already using artificial intelligence (AI) in some capacity, but 54% of respondents reported skills shortages in their organisations that prevented them from moving forward with AI more aggressively. This does not refer to data scientists, who continue to be in short supply and are aggressively being hired, but rather to the fact that many organisations do not operationalize their AI efforts with IT project methodologies to ensure that projects meet their business goals. "What we are seeing is a lot of data science teams that are working on many concurrent ML and AI initiatives, but fewer that have deployed the models into actual production applications," said Nathaniel Gates, CEO of Alegion, which specializes in training machine learning (ML) data. Gates added that highly skilled data scientists may lack practical business experience in data preparation and project management. "These people are skilled at conceptualizing, building out, and testing AI and ML algorithms," he continued.


Data Version Control: iterative machine learning

#artificialintelligence

It is hardly possible in real life to develop a good machine learning model in a single pass. ML modeling is an iterative process, and it is extremely important to keep track of your steps, the dependencies between the steps, the dependencies between your code and data files, and all code running arguments. This becomes even more important and complicated in a team environment, where collaboration between data scientists takes a serious amount of the team's effort. Today, we are pleased to announce the beta version release of a new open source tool: data version control, or DVC. DVC is designed to help data scientists keep track of their ML processes and file dependencies in the simple form of git-like commands: "dvc run python train_model.py". Your existing ML processes can be easily transformed into reproducible DVC pipelines regardless of which programming language or tool was used.


Data Version Control: iterative machine learning

@machinelearnbot

ML modeling is an iterative process, and it is extremely important to keep track of your steps, the dependencies between the steps, the dependencies between your code and data files, and all code running arguments. DVC is designed to help data scientists keep track of their ML processes and file dependencies in the simple form of git-like commands: "dvc run python train_model.py". Your existing ML processes can be easily transformed into reproducible DVC pipelines regardless of which programming language or tool was used. This blog post walks you through an iterative process of building a machine learning model with DVC using a Stack Overflow posts dataset. The model can thus be improved step by step, with DVC simplifying the iterative ML process and aiding collaboration between data scientists.
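
The underlying idea of tracking a step together with its code and data dependencies can be sketched in a few lines of Python. This is not DVC's implementation or API: the functions `fingerprint` and `needs_rerun` and the in-memory `cache` dictionary are hypothetical, and the sketch only shows how hashing a command plus its input files reveals when a step must be re-executed.

```python
import hashlib

def fingerprint(command, dep_paths):
    """Hash a step's command line together with the contents of its input files."""
    digest = hashlib.sha256(command.encode("utf-8"))
    for path in sorted(dep_paths):         # sorted so the order of paths is irrelevant
        digest.update(path.encode("utf-8"))
        with open(path, "rb") as handle:
            digest.update(handle.read())
    return digest.hexdigest()

def needs_rerun(command, dep_paths, cache):
    """Return True if the command or any dependency changed since the last call.

    `cache` maps each command string to the fingerprint recorded last time
    (a hypothetical stand-in for a persistent metadata store).
    """
    current = fingerprint(command, dep_paths)
    changed = cache.get(command) != current
    cache[command] = current
    return changed
```

With this scheme, a step whose code, arguments, and data are all unchanged can be skipped, while editing any input file forces the step, and by extension everything downstream of it, to run again.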


Data Evaluation in Smart Sensor Networks Using Inverse Methods and Artificial Intelligence (AI): Towards Real-Time Capability and Enhanced Flexibility

#artificialintelligence

Data evaluation is crucial for gaining information from sensor networks. The main challenges include processing speed and adaptivity to system change, both of which are prerequisites for weight reduction via relaxed safety factors based on structural health monitoring (SHM). Our study looks at soft real-time solutions that provide feedback within defined but flexible, application-controlled intervals. These can rely on minimizing computation and communication latencies, e.g., by parallel computation. Strategies towards this aim can be model-based, including inverse FEM, or model-free, including machine learning. In practice, however, machine learning also bases its training on a defined system state and therefore faces the same challenges when the state changes.