Zug
DiffInfinite: Large Mask-Image Synthesis via Parallel Random Patch Diffusion in Histopathology
We present DiffInfinite, a hierarchical diffusion model that generates arbitrarily large histological images while preserving long-range structural correlations. Our approach first generates synthetic segmentation masks, which are subsequently used as conditions for the high-fidelity generative diffusion process.
In-silico biological discovery with large perturbation models
Miladinovic, Djordje, Höppe, Tobias, Chevalley, Mathieu, Georgiou, Andreas, Stuart, Lachlan, Mehrjou, Arash, Bantscheff, Marcus, Schölkopf, Bernhard, Schwab, Patrick
Data generated in perturbation experiments link perturbations to the changes they elicit and therefore contain information relevant to numerous biological discovery tasks -- from understanding the relationships between biological entities to developing therapeutics. However, these data encompass diverse perturbations and readouts, and the complex dependence of experimental outcomes on their biological context makes it challenging to integrate insights across experiments. Here, we present the Large Perturbation Model (LPM), a deep-learning model that integrates multiple, heterogeneous perturbation experiments by representing perturbation, readout, and context as disentangled dimensions. LPM outperforms existing methods across multiple biological discovery tasks, including in predicting post-perturbation transcriptomes of unseen experiments, identifying shared molecular mechanisms of action between chemical and genetic perturbations, and facilitating the inference of gene-gene interaction networks.
Multi-megabase scale genome interpretation with genetic language models
Träuble, Frederik, Stuart, Lachlan, Georgiou, Andreas, Notin, Pascal, Mehrjou, Arash, Schwessinger, Ron, Chevalley, Mathieu, Branson, Kim, Schölkopf, Bernhard, van Duijn, Cornelia, Marks, Debora, Schwab, Patrick
Understanding how molecular changes caused by genetic variation drive disease risk is crucial for deciphering disease mechanisms. However, interpreting genome sequences is challenging because of the vast size of the human genome, and because its consequences manifest across a wide range of cells, tissues and scales -- spanning from molecular to whole organism level. Here, we present Phenformer, a multi-scale genetic language model that learns to generate mechanistic hypotheses as to how differences in genome sequence lead to disease-relevant changes in expression across cell types and tissues directly from DNA sequences of up to 88 million base pairs. Using whole genome sequencing data from more than 150 000 individuals, we show that Phenformer generates mechanistic hypotheses about disease-relevant cell and tissue types that match literature better than existing state-of-the-art methods, while using only sequence data. Furthermore, disease risk predictors enriched by Phenformer show improved prediction performance and generalisation to diverse populations. Accurate multi-megabase scale interpretation of whole genomes without additional experimental data enables both a deeper understanding of molecular mechanisms involved in disease and improved disease risk prediction at the level of individuals.
Measuring the Groundedness of Legal Question-Answering Systems
Trautmann, Dietrich, Ostapuk, Natalia, Grail, Quentin, Pol, Adrian Alan, Bonifazi, Guglielmo, Gao, Shang, Gajek, Martin
In high-stakes domains like legal question-answering, the accuracy and trustworthiness of generative AI systems are of paramount importance. This work presents a comprehensive benchmark of various methods to assess the groundedness of AI-generated responses, aiming to significantly enhance their reliability. Our experiments include similarity-based metrics and natural language inference models to evaluate whether responses are well-founded in the given contexts. We also explore different prompting strategies for large language models to improve the detection of ungrounded responses. We validated the effectiveness of these methods using a newly created grounding classification corpus, designed specifically for legal queries and corresponding responses from retrieval-augmented prompting, focusing on their alignment with source material. Our results indicate potential in groundedness classification of generated responses, with the best method achieving a macro-F1 score of 0.8. Additionally, we evaluated the methods in terms of their latency to determine their suitability for real-world applications, as this step typically follows the generation process. This capability is essential for processes that may trigger additional manual verification or automated response regeneration. In summary, this study demonstrates the potential of various detection methods to improve the trustworthiness of generative AI in legal settings.
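The macro-F1 score reported above is the unweighted mean of per-class F1 scores. As a minimal sketch of how such a score could be computed for a binary groundedness classifier, assuming string labels such as "grounded" and "ungrounded" (the label names and the function `macro_f1` are illustrative and not taken from the paper):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical gold labels and classifier predictions for four responses.
gold = ["grounded", "grounded", "ungrounded", "ungrounded"]
pred = ["grounded", "ungrounded", "ungrounded", "ungrounded"]
score = macro_f1(gold, pred)
```

Because every class contributes equally, macro-F1 penalizes a detector that performs well only on the majority class, which matters when ungrounded responses are rare.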
Unlabeled Debiasing in Downstream Tasks via Class-wise Low Variance Regularization
Masoudian, Shahed, Frohmann, Markus, Rekabsaz, Navid, Schedl, Markus
Language models frequently inherit societal biases from their training data. Numerous techniques have been proposed to mitigate these biases during both the pre-training and fine-tuning stages. However, fine-tuning a pre-trained debiased language model on a downstream task can reintroduce biases into the model. Additionally, existing debiasing methods for downstream tasks either (i) require labels of protected attributes (e.g., age, race, or political views) that are often not available or (ii) rely on indicators of bias, which restricts their applicability to gender debiasing since they rely on gender-specific words. To address this, we introduce a novel debiasing regularization technique based on the class-wise variance of embeddings. Crucially, our method does not require attribute labels and targets any attribute, thus addressing the shortcomings of existing debiasing methods. Our experiments on encoder language models and three datasets demonstrate that our method outperforms existing strong debiasing baselines that rely on target attribute labels while maintaining performance on the target task.
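The abstract does not state the exact form of the regularizer, so the following is only a rough sketch of one way a penalty on the class-wise variance of embeddings could look, under the assumption that embeddings are grouped by task label and the penalty is the mean squared deviation from each class centroid. The function name `classwise_variance_penalty` and the weighting factor `lam` mentioned below are hypothetical.

```python
def classwise_variance_penalty(embeddings, labels):
    """Average, over task classes, of the embedding variance within each class.

    embeddings: list of equal-length lists of floats (one vector per example)
    labels:     list of task labels (not protected-attribute labels)
    """
    groups = {}
    for vec, label in zip(embeddings, labels):
        groups.setdefault(label, []).append(vec)

    penalties = []
    for vecs in groups.values():
        n, dim = len(vecs), len(vecs[0])
        # Centroid of the class in embedding space.
        means = [sum(v[d] for v in vecs) / n for d in range(dim)]
        # Mean squared deviation from the centroid over examples and dimensions.
        var = sum((v[d] - means[d]) ** 2 for v in vecs for d in range(dim)) / (n * dim)
        penalties.append(var)
    return sum(penalties) / len(penalties)


# Toy batch: class 0 is spread out, class 1 is collapsed to a single point.
batch = [[0.0, 0.0], [2.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
task_labels = [0, 0, 1, 1]
penalty = classwise_variance_penalty(batch, task_labels)
```

In training, such a term would typically be added to the task loss with a weight, for example `total_loss = task_loss + lam * penalty`. Note that it needs only the task labels, which is consistent with the claim that no protected-attribute labels are required.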
Towards Hypermedia Environments for Adaptive Coordination in Industrial Automation
Ramanathan, Ganesh, Mayer, Simon, Ciortea, Andrei
Electromechanical systems manage physical processes through a network of inter-connected components. Today, programming the interactions required for coordinating these components is largely a manual process. This process is time-consuming and requires manual adaptation when system features change. To overcome this issue, we use autonomous software agents that process semantic descriptions of the system to determine coordination requirements and constraints; on this basis, they then interact with one another to control the system in a decentralized and coordinated manner. Our core insight is that coordination requirements between individual components are, ultimately, largely due to underlying physical interdependencies between the components, which can be (and, in many cases, already are) semantically modeled in automation projects. Agents then use hypermedia to discover, at run time, the plans and protocols required for enacting the coordination. A key novelty of our approach is the use of hypermedia-driven interaction: it reduces coupling in the system and enables its run-time adaptation as features change.
Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models
Tragakis, Athanasios, Aversa, Marco, Kaul, Chaitanya, Murray-Smith, Roderick, Faccio, Daniele
In this work, we introduce Pixelsmith, a zero-shot text-to-image generative framework to sample images at higher resolutions with a single GPU. We are the first to show that it is possible to scale the output of a pre-trained diffusion model by a factor of 1000, opening the road for gigapixel image generation at no additional cost. Our cascading method uses the image generated at the lowest resolution as a baseline to sample at higher resolutions. For the guidance, we introduce the Slider, a tunable mechanism that fuses the overall structure contained in the first-generated image with enhanced fine details. At each inference step, we denoise patches rather than the entire latent space, minimizing memory demands such that a single GPU can handle the process, regardless of the image's resolution. Our experimental results show that Pixelsmith not only achieves higher quality and diversity compared to existing techniques, but also reduces sampling time and artifacts.