A Additional experimental details

Neural Information Processing Systems

We use an RBF kernel to increase pretraining data diversity.

Architectural details. In all experiments, we use the same ExPT architecture.

This section details how we constructed new objectives from the original D'Kitty and Ant tasks. In Ant-Energy, the reward at each time step is

R = 1 - Control cost - Contact cost,   (6)

where the constant 1 is the survival reward. That is, we incentivize the robot to conserve energy instead of running fast.

D'Kitty tasks. In D'Kitty, the goal is to design a morphology that allows the D'Kitty robot to reach a target location. We found the approximate oracle provided by Design-Bench not accurate enough to provide a reliable comparison of optimization methods on this task.

C.1 Effects of GP hyperparameters. We empirically examine the impact of two GP hyperparameters: the variance σ² and the length scale ℓ. Specifically, we evaluate the performance of ExPT on D'Kitty, averaging performance across 3 seeds.
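
The hyperparameter study above concerns a GP's RBF-kernel variance and length scale. As a minimal sketch of how synthetic objectives can be drawn from such a prior (not the paper's code; the parameter values and grid are hypothetical):

```python
import numpy as np

def rbf_kernel(x1, x2, variance=1.0, length_scale=0.5):
    """RBF (squared-exponential) kernel: variance * exp(-(x - x')^2 / (2 l^2))."""
    sq_dists = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-sq_dists / (2 * length_scale ** 2))

def sample_gp_functions(xs, n_functions=3, variance=1.0, length_scale=0.5, seed=0):
    """Draw synthetic objective functions f ~ GP(0, k) evaluated at the points xs."""
    rng = np.random.default_rng(seed)
    K = rbf_kernel(xs, xs, variance, length_scale)
    # Small jitter keeps the Cholesky factorization numerically stable.
    L = np.linalg.cholesky(K + 1e-8 * np.eye(len(xs)))
    return L @ rng.standard_normal((len(xs), n_functions))

xs = np.linspace(0.0, 1.0, 50)
fs = sample_gp_functions(xs)  # shape (50, 3): three synthetic objectives
```

A smaller length scale ℓ yields wigglier sampled functions, while a larger variance σ² scales their amplitude, which is why these two hyperparameters directly control the diversity of pretraining data.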



Murmur2Vec: A Hashing Based Solution For Embedding Generation Of COVID-19 Spike Sequences

Ali, Sarwan, Murad, Taslim

arXiv.org Artificial Intelligence

Early detection and characterization of coronavirus disease (COVID-19), caused by SARS-CoV-2, remain critical for effective clinical response and public-health planning. The global availability of large-scale viral sequence data presents significant opportunities for computational analysis; however, existing approaches face notable limitations. Phylogenetic tree-based methods are computationally intensive and do not scale efficiently to today's multi-million-sequence datasets. Similarly, current embedding-based techniques often rely on aligned sequences or exhibit suboptimal predictive performance and high runtime costs, creating barriers to practical large-scale analysis. In this study, we focus on the most prevalent SARS-CoV-2 lineages associated with the spike protein region and introduce a scalable embedding method that leverages hashing to generate compact, low-dimensional representations of spike sequences. These embeddings are subsequently used to train a variety of machine learning models for supervised lineage classification. We conduct an extensive evaluation comparing our approach with multiple baseline and state-of-the-art biological sequence embedding methods across diverse metrics. Our results demonstrate that the proposed embeddings offer substantial improvements in efficiency, achieving up to 86.4% classification accuracy while reducing embedding generation time by as much as 99.81%. This highlights the method's potential as a fast, effective, and scalable solution for large-scale viral sequence analysis.
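
The core idea, hashing a sequence's k-mers into a fixed-size vector, can be sketched as follows. This is not the paper's implementation: it substitutes the stdlib's zlib.crc32 for MurmurHash to stay dependency-free, and the k-mer size, dimension, and normalization are illustrative choices:

```python
import zlib

def kmers(seq, k=3):
    """Yield all overlapping k-mers of a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def hash_embed(seq, k=3, dim=64):
    """Feature-hashing embedding: each k-mer is hashed into one of `dim`
    buckets and counted. The paper uses MurmurHash; zlib.crc32 stands in
    here so the sketch needs no third-party dependency."""
    vec = [0.0] * dim
    for kmer in kmers(seq, k):
        h = zlib.crc32(kmer.encode("ascii"))
        vec[h % dim] += 1.0
    # L2-normalize so embeddings are comparable across sequence lengths.
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

# A short spike-protein prefix as a toy input.
emb = hash_embed("MFVFLVLLPLVSSQCVNLT", k=3, dim=64)
```

Because hashing is a single pass over the sequence with no alignment step, embedding generation scales linearly in sequence length, which is the source of the runtime savings reported in the abstract.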


A Datasets / A.1 Dataset format

Neural Information Processing Systems

For each dataset, all unprocessed raw files are represented in .json format. The datasets are subject to the MIT license. In this subsection, we further analyze the link prediction results from the various models applied in the study. Tables 6 and 7 show the effect of link prediction on the different datasets across the various models. In this subsection, we further analyze the node classification results from the various models.


SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks - Supplementary Material

Neural Information Processing Systems

In the main text, we described how some edges may receive no supervision during latent graph learning. Specifically, we tested SLAPS without self-supervision. The accuracies for 2-, 4-, and 6-layer models are 66.2%, 67.1%, and 55.8%, respectively. Increasing the number of layers from 2 to 4 provides an improvement, while going to 6 layers hurts accuracy. Work was done while the authors were at Borealis AI. 35th Conference on Neural Information Processing Systems (NeurIPS 2021).
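
The self-supervision SLAPS ablates here is a denoising objective over the learned graph. A much-simplified NumPy sketch of that idea (SLAPS's actual denoiser is a learned GNN; the one-hop propagation, mask rate, and guard below are stand-ins):

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of an adjacency matrix."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def denoising_loss(A, X, mask_rate=0.3, seed=0):
    """SLAPS-style self-supervision sketch: mask node features, propagate
    over the learned adjacency, and score reconstruction of the masked
    entries. One propagation step stands in for the denoising GNN."""
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape) < mask_rate
    mask[0, 0] = True  # guarantee at least one masked entry
    X_noisy = np.where(mask, 0.0, X)
    X_rec = normalize_adj(A) @ X_noisy  # 1-hop propagation as the "decoder"
    return float(((X_rec - X)[mask] ** 2).mean())

# Toy usage: a 5-node complete graph with 4-dimensional node features.
A = np.ones((5, 5)) - np.eye(5)
X = np.arange(20.0).reshape(5, 4)
loss = denoising_loss(A, X)
```

Because the loss touches every node's features rather than only labeled nodes, it supplies gradient signal to edges that the classification loss never reaches, which is the motivation stated in the excerpt.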



Dual-Pathway Fusion of EHRs and Knowledge Graphs for Predicting Unseen Drug-Drug Interactions

Lee, Franklin, Ma, Tengfei

arXiv.org Artificial Intelligence

Drug-drug interactions (DDIs) remain a major source of preventable harm, and many clinically important mechanisms are still unknown. Existing models either rely on pharmacologic knowledge graphs (KGs), which fail on unseen drugs, or on electronic health records (EHRs), which are noisy, temporal, and site-dependent. We introduce, to our knowledge, the first system that conditions KG relation scoring on patient-level EHR context and distills that reasoning into an EHR-only model for zero-shot inference. A fusion "Teacher" learns mechanism-specific relations for drug pairs represented in both sources, while a distilled "Student" generalizes to new or rarely used drugs without KG access at inference. Both operate under a shared ontology (set) of pharmacologic mechanisms (drug relations) to produce interpretable, auditable alerts rather than opaque risk scores. Trained on a multi-institution EHR corpus paired with a curated DrugBank DDI graph, and evaluated using a clinically aligned, decision-focused protocol with leakage-safe negatives that avoid artificially easy pairs, the system maintains precision across multi-institution test data, produces mechanism-specific, clinically consistent predictions, reduces false alerts (higher precision) at comparable overall detection performance (F1), and misses fewer true interactions compared to prior methods. Case studies further show zero-shot identification of clinically recognized CYP-mediated and pharmacodynamic mechanisms for drugs absent from the KG, supporting real-world use in clinical decision support and pharmacovigilance.
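
The Teacher-to-Student transfer described above is a distillation setup: the student matches the teacher's distribution over the shared mechanism ontology. A minimal sketch of the standard temperature-softened KL distillation loss (the ontology size, temperature, and logits are hypothetical; the paper's encoders are far richer):

```python
import numpy as np

N_MECHANISMS = 10  # hypothetical size of the shared mechanism ontology

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on temperature-softened mechanism
    distributions -- the standard distillation objective, used here as a
    stand-in for the paper's Teacher-to-Student transfer. The T^2 factor
    keeps gradient magnitudes comparable across temperatures."""
    p = softmax(teacher_logits / T)
    log_p = np.log(p + 1e-12)
    log_q = np.log(softmax(student_logits / T) + 1e-12)
    return float((p * (log_p - log_q)).sum(axis=-1).mean() * T * T)

# Identical logits give zero loss; disagreement gives a positive loss.
t = np.arange(float(N_MECHANISMS)).reshape(1, N_MECHANISMS)
```

At inference the student consumes only EHR features, so a drug absent from the KG can still be scored against every mechanism class, which is what enables the zero-shot behavior reported in the abstract.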


Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

Bae, Sangmin, Kim, Yujin, Bayat, Reza, Kim, Sungnyun, Ha, Jiyoun, Schuster, Tal, Fisch, Adam, Harutyunyan, Hrayr, Ji, Ziwei, Courville, Aaron, Yun, Se-Young

arXiv.org Artificial Intelligence

Scaling language models unlocks impressive capabilities, but the accompanying computational and memory demands make both training and deployment expensive. Existing efficiency efforts typically target either parameter sharing or adaptive computation, leaving open the question of how to attain both simultaneously. We introduce Mixture-of-Recursions (MoR), a unified framework that combines the two axes of efficiency inside a single Recursive Transformer. MoR reuses a shared stack of layers across recursion steps to achieve parameter efficiency, while lightweight routers enable adaptive token-level thinking by dynamically assigning different recursion depths to individual tokens. This allows MoR to focus quadratic attention computation only among tokens still active at a given recursion depth, further improving memory access efficiency by selectively caching only their key-value pairs. Beyond these core mechanisms, we also propose a KV sharing variant that reuses KV pairs from the first recursion, specifically designed to further decrease memory footprint. Across model scales ranging from 135M to 1.7B parameters, MoR forms a new Pareto frontier: at equal training FLOPs and smaller model sizes, it significantly lowers validation perplexity and improves few-shot accuracy, while delivering higher throughput compared with vanilla and existing recursive baselines. These gains demonstrate that MoR is an effective path towards large-model quality without incurring large-model cost.
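
The mechanism described above, one shared block reused across recursion steps with a router deciding each token's depth, can be sketched in a few lines. This is an illustrative toy, not MoR itself: the shared block is a linear residual map, and the thresholding rule stands in for the paper's learned routers:

```python
import numpy as np

def mor_forward(x, shared_block, route_scores, max_depth=3):
    """Mixture-of-Recursions sketch: one shared block reused up to
    `max_depth` times; each token keeps recursing only while its router
    score clears the depth threshold, so deeper steps compute (and would
    cache KV pairs) only for the still-active subset of tokens."""
    h = x.copy()
    # Map router scores in [0, 1) to integer depths in {1, ..., max_depth}.
    depths = np.minimum((route_scores * max_depth).astype(int) + 1, max_depth)
    for d in range(1, max_depth + 1):
        active = depths >= d                 # tokens still recursing at depth d
        if not active.any():
            break
        h[active] = shared_block(h[active])  # compute only for active tokens
    return h, depths

# Toy usage: 6 "tokens" of width 4, a residual-style shared layer, random scores.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4)) * 0.1
block = lambda t: t + t @ W
x = rng.standard_normal((6, 4))
scores = rng.random(6)                       # router outputs in [0, 1)
out, depths = mor_forward(x, block, scores)
```

Note that every depth reuses the same `block` parameters (the parameter-sharing axis), while `depths` varies per token (the adaptive-computation axis); restricting attention and KV caching to the `active` subset is what the abstract credits for the throughput and memory gains.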


TimeRecipe: A Time-Series Forecasting Recipe via Benchmarking Module Level Effectiveness

Zhao, Zhiyuan, Ni, Juntong, Xu, Shangqing, Liu, Haoxin, Jin, Wei, Prakash, B. Aditya

arXiv.org Artificial Intelligence

Time-series forecasting is an essential task with wide real-world applications across domains. While recent advances in deep learning have enabled time-series forecasting models with accurate predictions, there remains considerable debate over which architectures and design components, such as series decomposition or normalization, are most effective under varying conditions. Existing benchmarks primarily evaluate models at a high level, offering limited insight into why certain designs work better. To mitigate this gap, we propose TimeRecipe, a unified benchmarking framework that systematically evaluates time-series forecasting methods at the module level. TimeRecipe conducts over 10,000 experiments to assess the effectiveness of individual components across a diverse range of datasets, forecasting horizons, and task settings. Our results reveal that exhaustive exploration of the design space can yield models that outperform existing state-of-the-art methods and uncover meaningful intuitions linking specific design choices to forecasting scenarios. Furthermore, we release a practical toolkit within TimeRecipe that recommends suitable model architectures based on these empirical insights. The benchmark is available at: https://github.com/AdityaLab/TimeRecipe.