Papers

KV Cache Recycling to Expand Usable Context Capacity in Low Parameter LLMs
Accepted for publication in IJRSI, 2026

This paper investigates whether attention key-value (KV) states computed for one prompt on a small LLM can be reused to accelerate inference on a new, similar prompt, effectively expanding its usable context capacity through token recycling.
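
To illustrate the general idea, here is a minimal sketch of reusing a KV cache across prompts that share a common prefix, written with Hugging Face transformers. This is not the paper's method; the model name, prompts, and single-step decoding below are placeholder assumptions.

```python
# Minimal sketch of KV-cache reuse across prompts sharing a prefix.
# Illustrative only, not the paper's method; model name and prompts are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder small model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# Encode a shared prefix once and cache its KV states.
prefix_ids = tokenizer("You are a concise assistant.\n", return_tensors="pt").input_ids
with torch.no_grad():
    cached_kv = model(prefix_ids, use_cache=True).past_key_values

# A new prompt that starts with the same prefix: feed only the new tokens and
# recycle the cached KV states instead of recomputing the prefix.
suffix_ids = tokenizer("Q: what is KV cache recycling?\nA:", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(suffix_ids, past_key_values=cached_kv, use_cache=True).logits

# Greedy next-token prediction using the recycled cache.
next_token = logits[:, -1].argmax(dim=-1)
print(tokenizer.decode(next_token))
```

In this prefix-sharing setting, only the suffix tokens are processed at inference time; the paper's question is how far such reuse can be pushed when the new prompt is merely similar rather than an exact continuation.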