Institution

Virginia Tech

A US public research university active in generative video, diffusion models, and efficient deep learning.

VideoMLA: A Low-Rank Latent KV Cache for Minute-Scale Video Diffusion

VideoMLA ports Multi-Head Latent Attention into causal video diffusion. It cuts per-token KV cache memory by 92.7% (224 vs. 3,072 scalars per token), posts the top VBench result at 60-second video length, and raises throughput on B200 GPUs by 1.23x.
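
The 92.7% figure follows directly from the two per-token scalar counts quoted above. A minimal arithmetic check, assuming only those two numbers; the function name `kv_reduction` is hypothetical and not taken from the paper:

```python
def kv_reduction(baseline_scalars: int, latent_scalars: int) -> float:
    """Fraction of per-token KV memory saved by the smaller cache."""
    return 1.0 - latent_scalars / baseline_scalars


BASELINE = 3072  # per-token KV scalars for the baseline, as quoted in the summary
LATENT = 224     # per-token KV scalars for VideoMLA, as quoted in the summary

reduction = kv_reduction(BASELINE, LATENT)
ratio = BASELINE / LATENT  # how many times smaller the latent cache is

print(f"reduction: {reduction:.1%}, compression ratio: {ratio:.1f}x")
# prints "reduction: 92.7%, compression ratio: 13.7x"
```

Put another way, the latent cache stores roughly one scalar for every 13.7 that the baseline stores per token.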