6754e06e46dfa419d5afe3c9781cecad-AuthorFeedback.pdf
–Neural Information Processing Systems
So, the fact that our training data comes solely from infectious virus, which would be highly probable (or "grammatical") sequences under our language model (LM), is a key feature of our approach. Importantly, however, we note that, fundamentally, CSCS is presented in generality here, so these methods are not strictly "competitor methods" in the sense that, if one were to work better, it would still be incorporable within the CSCS framework. "ℓ1 rather than Euclidean": We used ℓ1 since it has nicer properties than, e.g., ℓ2 in high-dimensional spaces (Aggarwal et al., ICDT, 2001), but other distance metrics could be empirically quantified. "theoretical detail"/"how the method works": We apologize for sparsity of detail. "choice of beta": We find good robustness to β values reasonably close to 1 (e.g., 0.5-2).
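To make the ℓ1 and β points concrete, here is a minimal sketch, not the authors' implementation. It assumes that semantic change is the ℓ1 distance between a wild-type embedding and each mutant embedding, and that the two signals are combined as rank(semantic change) + β · rank(grammaticality); this excerpt does not state that combination, so treat it as an assumption. The function names, the toy embeddings, and the toy probabilities are all hypothetical.

```python
import numpy as np

def l1_semantic_change(z_wildtype, z_mutants):
    """L1 distance between the wild-type embedding and each mutant embedding."""
    return np.abs(z_mutants - z_wildtype).sum(axis=1)

def rank_ascending(values):
    """Rank values so the smallest gets 1 and the largest gets n.

    Ties are broken by input order (stable sort).
    """
    values = np.asarray(values)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks

def combined_score(semantic_change, grammaticality, beta=1.0):
    """Assumed combination: rank(semantic change) + beta * rank(grammaticality)."""
    return rank_ascending(semantic_change) + beta * rank_ascending(grammaticality)

# Hypothetical toy data: three candidate mutants in a 3-dimensional embedding space.
z_wt = np.array([0.0, 0.0, 0.0])
z_mut = np.array([
    [1.0, 0.0, 0.0],
    [1.0, -2.0, 0.5],
    [0.1, 0.1, 0.0],
])
probs = np.array([0.30, 0.20, 0.05])  # hypothetical LM probabilities per mutant

change = l1_semantic_change(z_wt, z_mut)

# Sweep beta over the range mentioned in the text (0.5 to 2).
for beta in (0.5, 1.0, 2.0):
    print(beta, combined_score(change, probs, beta))
```

Working on ranks rather than raw values keeps the two terms on a comparable scale, so β only sets their relative weight. Swapping `l1_semantic_change` for another distance is a one-function change, which is one way the alternative metrics mentioned above could be compared.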
Feb-8-2026, 17:35:50 GMT