MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Huiqiang Jiang, Yucheng Li
Neural Information Processing Systems
Existing methods for speeding up pre-filling often fail to maintain acceptable accuracy or efficiency when applied to long-context LLMs.
Feb-14-2026, 19:37:17 GMT
- Country:
- Africa > Ethiopia
- Addis Ababa > Addis Ababa (0.04)
- Asia
- Europe > United Kingdom
- Scotland (0.04)
- North America
- Dominican Republic (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- United States > Florida
- Miami-Dade County > Miami (0.04)
- South America > Chile
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Education (0.46)
- Information Technology (0.46)
- Technology: