Collaborating Authors

 Ding, Zhimin


Online Cascade Learning for Efficient Inference over Streams

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have a natural role in answering complex queries about data streams, but the high computational cost of LLM inference makes them infeasible for many such tasks. We propose online cascade learning, the first approach to addressing this challenge. The objective is to learn a "cascade" of models, starting with lower-capacity models (such as logistic regressors) and ending with a powerful LLM, along with a deferral policy that determines which model is used on a given input. We formulate the task of learning cascades online as an imitation-learning problem and give a no-regret algorithm for it. Experimental results across four benchmarks show that our method matches LLM accuracy while cutting inference costs by as much as 90%, underscoring its efficacy and adaptability in stream processing.
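A minimal Python sketch of the cascade idea: a cheap online logistic regressor handles inputs it is confident about and defers the rest to the LLM, learning from the LLM's labels. The class name, the fixed confidence threshold, and the llm_predict callable are illustrative assumptions; the paper learns the deferral policy itself online via imitation learning rather than using a hand-set cutoff.

import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

class OnlineCascade:
    """Two-level cascade: a cheap online logistic model backed by an LLM.

    Simplified sketch; the paper supports longer cascades and a learned
    deferral policy instead of the fixed threshold used here.
    """

    def __init__(self, classes, llm_predict, threshold=0.9):
        self.vectorizer = HashingVectorizer(n_features=2**12)
        # SGD with log loss is a logistic regressor trained online.
        self.small = SGDClassifier(loss="log_loss")
        self.classes = list(classes)
        self.llm_predict = llm_predict  # assumed callable: text -> label
        self.threshold = threshold      # fixed deferral cutoff (assumption)
        self.seen_any = False

    def predict(self, text):
        if self.seen_any:
            x = self.vectorizer.transform([text])
            proba = self.small.predict_proba(x)[0]
            if proba.max() >= self.threshold:
                # Confident enough: answer with the cheap model.
                return self.small.classes_[int(np.argmax(proba))]
        # Defer to the LLM, then imitate its label to improve the cheap model.
        label = self.llm_predict(text)
        x = self.vectorizer.transform([text])
        self.small.partial_fit(x, [label], classes=self.classes)
        self.seen_any = True
        return label

# Usage on a stream (my_llm is a hypothetical LLM classifier):
#   cascade = OnlineCascade(classes=[0, 1], llm_predict=my_llm)
#   for text in stream:
#       label = cascade.predict(text)

As the small model improves, more of the stream is answered on the cheap path, which is the source of the inference savings the abstract reports.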


Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning

arXiv.org Artificial Intelligence

We consider the problem of how to differentiate computations expressed relationally. We show experimentally that a relational engine running an auto-differentiated relational algorithm can easily scale to very large datasets, and is competitive with state-of-the-art, special-purpose systems for large-scale machine learning.

In addition to scalability, executing such code on a relational engine has the advantage that the database query optimizer will automatically distribute the computation, taking into account the sizes of the two matrices. If A and B are both large matrices, a database optimizer will consider the hardware constraints on each compute node (e.g. ...
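As an illustration of expressing a computation relationally, the sketch below stores matrices as (row, col, val) relations and computes a matrix product as a join-and-aggregate query; a distributed relational engine would plan and parallelize this same query automatically, which is the advantage described above. It uses sqlite3 only for concreteness, and the table and column names are illustrative assumptions, not the paper's schema.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (i INTEGER, k INTEGER, val REAL);
    CREATE TABLE B (k INTEGER, j INTEGER, val REAL);
""")

# A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]] stored as sparse triples.
conn.executemany("INSERT INTO A VALUES (?, ?, ?)",
                 [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)])
conn.executemany("INSERT INTO B VALUES (?, ?, ?)",
                 [(0, 0, 5), (0, 1, 6), (1, 0, 7), (1, 1, 8)])

# C = A @ B as a join on the shared index k, aggregated per (i, j).
rows = conn.execute("""
    SELECT A.i, B.j, SUM(A.val * B.val) AS val
    FROM A JOIN B ON A.k = B.k
    GROUP BY A.i, B.j
    ORDER BY A.i, B.j
""").fetchall()

print(rows)  # [(0, 0, 19.0), (0, 1, 22.0), (1, 0, 43.0), (1, 1, 50.0)]

Because the multiply is just a query, the optimizer is free to pick join order, partitioning, and placement based on the relations' sizes, with no changes to the code.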