Basel-Landschaft
TabINR: An Implicit Neural Representation Framework for Tabular Data Imputation
Ochs, Vincent, Bieder, Florentin, Hadramy, Sidaty el, Friedrich, Paul, Taha-Mehlitz, Stephanie, Taha, Anas, Cattin, Philippe C.
Tabular data forms the basis for a wide range of applications, yet real-world datasets are frequently incomplete due to collection errors, privacy restrictions, or sensor failures. Missing values degrade the performance or hinder the applicability of downstream models, and simple imputation strategies tend to introduce bias or distort the underlying data distribution, so we require imputers that provide high-quality imputations, are robust across dataset sizes, and yield fast inference. We introduce TabINR, an auto-decoder-based Implicit Neural Representation (INR) framework that models tables as neural functions. Building on recent advances in generalizable INRs, we introduce learnable row and feature embeddings that effectively deal with the discrete structure of tabular data and can be inferred from partial observations, enabling instance-adaptive imputations without modifying the trained model. We evaluate our framework across a diverse range of twelve real-world datasets and multiple missingness mechanisms, demonstrating consistently strong imputation accuracy, mostly matching or outperforming classical (KNN, MICE, MissForest) and deep-learning-based models (GAIN, ReMasker), with the clearest gains on high-dimensional datasets. Tabular data remains one of the most common data formats across domains such as healthcare, finance, and the social sciences (Shwartz-Ziv & Armon, 2022). In these fields, missing values are ubiquitous and can severely degrade the performance of downstream machine learning models. Poor handling of missingness not only reduces predictive accuracy but may also lead to biased decisions, with real-world consequences for applications such as medical diagnostics or financial risk assessment. These challenges make robust imputation a critical step for trustworthy tabular learning and data-driven decision making (Rubin, 1976).
- Europe > Switzerland > Basel-City > Basel (0.04)
- North America > United States > North Carolina > Pitt County > Greenville (0.04)
- North America > United States > California (0.04)
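The TabINR abstract describes inferring a row embedding from partial observations while the trained model stays fixed. The sketch below illustrates only that inference step, under strong simplifying assumptions: the decoder is a plain dot product between a row latent and per-feature embeddings (a linear stand-in, not the paper's neural network), the table is synthetic and exactly low-rank, and "training" is replaced by an SVD. The names `feature_emb` and `impute_row` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic rank-2 table: 50 rows, 6 features (hypothetical data).
true_rows = rng.normal(size=(50, 2))
true_feats = rng.normal(size=(6, 2))
table = true_rows @ true_feats.T

# "Training": recover feature embeddings from the complete table via SVD.
# This stands in for gradient-based training of a decoder and embeddings.
_, _, vt = np.linalg.svd(table, full_matrices=False)
feature_emb = vt[:2].T  # shape (6, 2); kept frozen from here on


def impute_row(row, observed, feature_emb):
    """Fit a row latent on the observed entries only, then fill in the rest."""
    latent, *_ = np.linalg.lstsq(feature_emb[observed], row[observed], rcond=None)
    filled = row.copy()
    filled[~observed] = feature_emb[~observed] @ latent
    return filled


# A new, unseen row with features 1 and 4 missing.
new_row = true_feats @ rng.normal(size=2)
observed = np.array([True, False, True, True, False, True])
masked = np.where(observed, new_row, np.nan)
imputed = impute_row(masked, observed, feature_emb)
```

Because the synthetic table is exactly rank 2, the latent fitted on four observed entries reproduces the two missing ones up to floating-point error. Real tables are not exactly low-rank, which is one motivation for a nonlinear decoder and an iterative latent fit instead of a closed-form solve.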
Towards AI-Based Precision Oncology: A Machine Learning Framework for Personalized Counterfactual Treatment Suggestions based on Multi-Omics Data
Schürch, Manuel, Boos, Laura, Heinzelmann-Schwarz, Viola, Gut, Gabriele, Krauthammer, Michael, Wicki, Andreas, Tumor Profiler Consortium
AI-driven precision oncology has the potential to reshape cancer treatment by using AI models to analyze the interaction between complex patient characteristics and their corresponding treatment outcomes. New technological platforms have facilitated the timely acquisition of multimodal data on tumor biology at an unprecedented resolution, such as single-cell multi-omics data, making data of this quality and quantity available for improved data-driven clinical decision-making. In this work, we propose a modular machine learning framework designed for personalized counterfactual cancer treatment suggestions based on an ensemble of machine learning experts trained on diverse multi-omics technologies. These technology-specific counterfactual experts are consistently aggregated into a more powerful expert with superior performance that can provide both confidence and an explanation of its decision. The framework is tailored to address critical challenges inherent in data-driven cancer research, including the high-dimensional nature of the data and the presence of treatment assignment bias in retrospective observational data. The framework is showcased through comprehensive demonstrations using data from in-vitro and in-vivo treatment responses from a cohort of patients with ovarian cancer. Our method aims to empower clinicians with a reality-centric decision-support tool including probabilistic treatment suggestions with calibrated confidence and personalized explanations for tailoring treatment strategies to multi-omics characteristics of individual cancer patients.
- Europe > Switzerland > Zürich > Zürich (0.18)
- Europe > Switzerland > Basel-City > Basel (0.06)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
CODET: A Benchmark for Contrastive Dialectal Evaluation of Machine Translation
Alam, Md Mahfuz Ibn, Ahmadi, Sina, Anastasopoulos, Antonios
Neural machine translation (NMT) systems exhibit limited robustness in handling source-side linguistic variations. Their performance tends to degrade when faced with even slight deviations in language usage, such as different domains or variations introduced by second-language speakers. It is intuitive to extend this observation to encompass dialectal variations as well, but the work allowing the community to evaluate MT systems on this dimension is limited. To alleviate this issue, we compile and release CODET, a contrastive dialectal benchmark encompassing 882 different variations from nine different languages. We also quantitatively demonstrate the challenges large MT models face in effectively translating dialectal variants. We are releasing all code and data.
- Europe > Germany (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Italy > Veneto (0.04)