structural variant
GQVis: A Dataset of Genomics Data Questions and Visualizations for Generative AI
Walters, Skylar Sargent, Valderrama, Arthea, Smits, Thomas C., Kouřil, David, Nguyen, Huyen N., L'Yi, Sehi, Lange, Devin, Gehlenborg, Nils
Data visualization is a fundamental tool in genomics research, enabling the exploration, interpretation, and communication of complex genomic features. While machine learning models show promise for transforming data into insightful visualizations, current models lack the training foundation for domain-specific tasks. In an effort to provide a foundational resource for genomics-focused model training, we present a framework for generating a dataset that pairs abstract, low-level questions about genomics data with corresponding visualizations. Building on prior work with statistical plots, our approach adapts to the complexity of genomics data and the specialized representations used to depict them. We further incorporate multiple linked queries and visualizations, along with justifications for design choices, figure captions, and image alt-texts for each item in the dataset. We use genomics data retrieved from three distinct genomics data repositories (4DN, ENCODE, Chromoscope) to produce GQVis: a dataset consisting of 1.14 million single-query data points, 628k query pairs, and 589k query chains. The GQVis dataset and generation code are available at https://huggingface.co/datasets/HIDIVE/GQVis and https://github.com/hms-dbmi/GQVis-Generation.
Revisit Choice Network for Synthesis and Technology Mapping
Chen, Chen, Yin, Jiaqi, Yu, Cunxi
Choice network construction is a critical technique for alleviating structural bias issues in Boolean optimization, equivalence checking, and technology mapping. Previous works on lossless synthesis utilize independent optimization to generate multiple snapshots, and use simulation and SAT solvers to identify functionally equivalent nodes. These nodes are then merged into a subject graph with choice nodes. However, such methods often neglect the quality of these choices, raising the question of whether they truly contribute to effective technology mapping. This paper introduces CRISTAL, a novel methodology and framework for constructing Boolean choice networks. Specifically, CRISTAL introduces a novel flow of choice network-based synthesis and mapping that includes representative logic cone search, structural mutation for generating diverse choice structures via equality saturation, and priority-ranking choice selection, along with choice network construction and validation. Our experimental results demonstrate that CRISTAL outperforms the state-of-the-art Boolean choice network construction implemented in ABC in the post-mapping stage, achieving average reductions of 3.85%/8.35% [...] The concept of the choice network was pioneered to address optimization limitations in Electronic Design Automation (EDA).
Daily Digest
The human brain forms functional networks of correlated activity, which have been linked with both cognitive and clinical outcomes. However, the genetic variants affecting brain function are largely unknown. Here, researchers used resting-state functional magnetic resonance images from 47,276 individuals to discover and validate common genetic variants influencing intrinsic brain activity. They identified 45 new genetic regions associated with brain functional signatures (P < 2.8 × 10^-11). Although technological advances improved the identification of structural variants (SVs) in the human genome, their interpretation remains challenging.
Daily Digest
Recent development of spatial transcriptomic technologies has made it possible to characterize cellular heterogeneity with spatial information. Here, researchers present spatialDWLS, a method to quantitatively estimate the cell-type composition at each spatial location. They benchmark the performance of spatialDWLS by comparing it with a number of existing deconvolution methods and find that spatialDWLS outperforms the other methods in terms of accuracy and speed. Long-read sequencing (LRS) promises to improve the characterization of structural variants (SVs). Researchers generated LRS data from 3,622 Icelanders and identified a median of 22,636 SVs per individual (a median of 13,353 insertions and 9,474 deletions).